Connecting Tech Pros Worldwide Help | Site Map

Database Normalization and Table Structures

 
Old January 8th, 2007, 03:43 AM
msquared's Avatar
Administrator
 
Join Date: Aug 2006
Location: Dublin, Ireland
Posts: 10,780
Normalisation is the term used to describe how you break a file down into tables to create a database. There are 3 or 4 major steps involved known as 1NF (First Normal Form), 2NF (Second Normal Form), 3NF (Third Normal Form) and BCNF (Boyce-Codd Normal Form). There are others but they are rarely if ever used. A database is said to be Normalised if it is in 3NF (or ideally in BCNF). These steps are descibed as follows:


Note: When attribute is used we are speaking of a field in the table

1NF

To put a database in 1N
  • ensure that all attributes (columns) are atomic (which means that any single field should only have a value for ONE thing).
Examples:

In a database a table on Customers would have an address attribute. The address is made up of Company Name, Address Line1, Address Line2, Address Line3, City, Postcode. There are 6 values to this address and as such each should have it's own field (column).


If your company sold furniture a table on products could have a description attribute. If for example that attribute was 'Beech Desk 120w x 75h x 50d'. Ideally this would be broken down into a number attributes like 'Colour', 'Type', 'Width', 'Height' and 'Depth'. The reason for this is it would allow you to seach the database for all Desks, for all pieces of Beech furniture, for all desks with a width of 120 etc.
  • Create a separate table for each set of related data and Identify each set of related data with a primary key
Example:


In a general Invoicing table you would have a separate table for Customers, Orders, Products, Invoices and you would probably need tables for OrderDetails and InvoiceDetails as well. Each of these tables must have their own primary key. Each of these tables except for customers would have a foreign key reference to the primary key of another table. (See Relationships below)
  • Do not use multiple fields in a single table to store similar data
Example:
(Underlined fields are Primary Keys and Italicised fields are Foreign Keys)

In a customer order you could have more than one product. That is the customer has ordered more than one item. If you tried to put all of this in one table as {OrderID, CustomerID, OrderDate, Product1, Product2, Product3} what would happen if the customer ordered more than 3 products. There would also be implications for querying the kind or quantiy of products ordered by a customer. Therefore these product fields don't belong in the order table which is why we would have an OrderDetails table which would have a foreign key refernce to the Orders table {OrderDetailsID, OrderID, ProductID, Quantity}. Using productID as a foreign key to the product table means you don't have to identify the product attributes here. This also allows you to enter a quantity figure for the product ordered.

Relationships:

All tables should have a 1 to 1 or 1 to many relationship. This means for example that 1 customer can have 1 or many orders and 1 order can have 1 or many details.

Therefore Orders table would have a foreign key reference to the Customer table primary key {OrderID, CustomerID, OrderDate} and the OrderDetails table would have a foreign key reference to the Order table primary key {OrderDetailsID, OrderID, ProductID, Quantity}. This table also contains a foreign key reference to the Products table. As a product is likely to be ordered more than once there is a many to 1 relationship between the OrderDetails and the Products table.

If any tables have a many to many relationship this must be broken out using a JOIN table. For example, Customers can have many Suppliers and Suppliers can supply to many Customers. This is known as a many to many relationship. You would need to create a JOIN table that would have a primary key made up of a foreign key reference to the Customers table and a foreign key reference to the suppliers table. Therefore the SuppliersPerCustomer table would be {SupplierID, CustomerID}. Now the Suppliers table will have a 1 to many relationship with the SuppliersPerCustomer table and the Customers table will also have a 1 to many relationship with the SuppliersPerCustomer table.

2NF

The database must meet all the requirements of the 1NF.

In addition, records should not depend on anything other than a table's primary key (a primary key can be made up of more than one field, only if absolutely necessary like in a JOIN table).

Example:

A customers address is needed by the Customers table, but also by the Orders, and Invoices tables. Instead of storing the customer's address as a separate entry in each of these tables, store it in one place, either in the Customers table or in a separate Addresses table.

3NF

The database must meet all the requirements of the 1NF and 2NF.

The third normal form requires that all columns in a relational table are dependent only upon the primary key. A more formal definition is:
  • A relational table is in third normal form (3NF) if it is already in 2NF and every non-key column is non transitively dependent upon its primary key.
In other words, all nonkey attributes are functionally dependent only upon the primary key. All 3NF really means is that all fields (attributes) should be dependent on the tables primary key. If they are not they should be put in their own table. This means that every attribute unless it is a primary or foreign key must be DIRECTLY dependent on the Primary Key of this table and not on some other column.

Example:

The Customer table contains information such as address, city, postcode imagine it also contained a column called shipping cost. The value of shipping cost changes in relation to which city the products are being delivered to, and therefore is not directly dependent on the customer even though the cost might not change per customer, but it is dependent on the city that the customer is in. Therefore we would need to create another separate table to hold the information about cities and shipping costs.

BCNF

A relation is in Boyce-Codd Normal Form (BCNF) if every determinant is a candidate key. BCNF is very similar to 3NF but deals with dependencies within the primary keys. BCNF in it's simplist terms just says don't have a primary key made up of more than one field unless it is a join table to disperse a many to many relationship and only contains the two primary keys of the tables it is joining.

Most relations that are in 3NF are also in BCNF. It only happens that a relation which is in 3NF is not in BCNF when the primary key in a table is made up of more than one field and the other columns are not dependent on both fields but only on one or the other.

A database is said to be normalised if it is in 3NF and/or BCNF



  #51  
Old June 3rd, 2008, 12:55 AM
msquared's Avatar
Administrator
 
Join Date: Aug 2006
Location: Dublin, Ireland
Posts: 10,780

re: Database Normalization and Table Structures


Quote:
Originally Posted by DirtNap
The stuff I posted has been the way it is since the mid 70's, and continues to be that way, although there are some new forms like DKNF and 6NF that are newer, and which I didn't cover (because they are not normally included when talking about "normalizing" a database.) However, if you've learned about this before, say, 1998 and learned about it today, you would get a whole different flavor, because before 1998 you would not have a very large practical component in the training.
It was pre '98 originally. (LOL) Now I'm exposing my age.

In truth I normalise nowadays as a matter of habit and would have to think about what I do and why. I think based on the results I'm doing it right maybe just not explaining it right :)
  #52  
Old June 4th, 2008, 04:04 PM
Newbie
 
Join Date: Jun 2008
Location: Boston, MA
Posts: 3

re: Database Normalization and Table Structures


Quote:
Originally Posted by msquared
It was pre '98 originally. (LOL) Now I'm exposing my age.

In truth I normalise nowadays as a matter of habit and would have to think about what I do and why. I think based on the results I'm doing it right maybe just not explaining it right :)

There are three real reasons to do this:
  1. Integrity constraints are only guaranteed to work on normalized data
  2. When you are writing a query, normalization gives you a mathematical certainty that your data will behave in a predictable way
  3. The database itself can learn from normaliztion.

The first two are because SQL and integrity constraints are based on set theory, which is what normalization is based on. The third doesn't apply to access, but for bigger databases like Oracle or DB2 that have a sophisticated query engine, the database can imply things about some columns based on other columns or tables because the data is normalized.

Of course, there are many advantages to selective denormalization, but that's a different article.
  #53  
Old June 4th, 2008, 05:30 PM
msquared's Avatar
Administrator
 
Join Date: Aug 2006
Location: Dublin, Ireland
Posts: 10,780

re: Database Normalization and Table Structures


Quote:
Originally Posted by DirtNap
Of course, there are many advantages to selective denormalization, but that's a different article.


Agreed! I also concur with the three points you made in the previous post. The amount of database designers who use denormalization as a "way around" badly structured tables is something that annoys me. I try to explain to them that before you can denormalize first you have to normalize. Most don't get it.
  #54  
Old June 5th, 2008, 10:53 PM
Newbie
 
Join Date: Jun 2008
Posts: 3

re: Database Normalization and Table Structures


Has anyone ever used the Pick Database? I was involved in programming for that database and found it to be so intuitive.
All data relative to the record was stored in the record, ie. multiple phone #'s all in one field.
File structure, wow, so different from anything currently in use. Basically you create a file to hold the records. Then the records are retrieved via a Hash #. Fantastically quick. B-trees also used to access records where the data is 'inside' the record; no need to search entire db to get records. SQL has the same thing, BUT, all the data is scattered because all the pertinent data isnt contained in the record.
Just my 2cents.
Marshall Neill
  #55  
Old June 5th, 2008, 11:16 PM
msquared's Avatar
Administrator
 
Join Date: Aug 2006
Location: Dublin, Ireland
Posts: 10,780

re: Database Normalization and Table Structures


Quote:
Originally Posted by ramien
Has anyone ever used the Pick Database? I was involved in programming for that database and found it to be so intuitive.
All data relative to the record was stored in the record, ie. multiple phone #'s all in one field.
File structure, wow, so different from anything currently in use. Basically you create a file to hold the records. Then the records are retrieved via a Hash #. Fantastically quick. B-trees also used to access records where the data is 'inside' the record; no need to search entire db to get records. SQL has the same thing, BUT, all the data is scattered because all the pertinent data isnt contained in the record.
Just my 2cents.
Marshall Neill
I haven't encountered it but sounds like something interesting to "play around" with :)
  #56  
Old June 5th, 2008, 11:59 PM
Newbie
 
Join Date: Jun 2008
Posts: 3

re: Database Normalization and Table Structures


Quote:
Originally Posted by msquared
I haven't encountered it but sounds like something interesting to "play around" with :)
I think they have a free version for testing at rainingdata.com
odd name i know. used to be Pick Systems. It was developed by Richard Pick and i think Mike Nelso back in 1967 for the Army and some helicopter database.
  #57  
Old June 6th, 2008, 12:02 AM
msquared's Avatar
Administrator
 
Join Date: Aug 2006
Location: Dublin, Ireland
Posts: 10,780

re: Database Normalization and Table Structures


Quote:
Originally Posted by ramien
I think they have a free version for testing at rainingdata.com
odd name i know. used to be Pick Systems. It was developed by Richard Pick and i think Mike Nelso back in 1967 for the Army and some helicopter database.
I'll have to check it out. Thanks for the recommendation.

Mary
  #58  
Old June 6th, 2008, 02:20 AM
Newbie
 
Join Date: Jun 2008
Posts: 3

re: Database Normalization and Table Structures


Quote:
Originally Posted by msquared
I'll have to check it out. Thanks for the recommendation.

Mary
My bad, Mike Nelson. n
You're quite welcome Mary
  #59  
Old June 11th, 2008, 02:28 PM
mshmyob's Avatar
Expert
 
Join Date: Jan 2008
Location: witness protection
Posts: 610

re: Database Normalization and Table Structures


I finally got around to reading your article Mary (lol). Nice. I explain it a little different but the concept is there.

As for the people explaining anything above 3NF you are just confusing things. Anyone out there who is a bigginner you can stop at 3NF and design any database you want.

As for why you normalize it is simple - It helps to reduce data redundancy. Notice I said help not eliminate because that is impossible.

cheers,
  #60  
Old June 12th, 2008, 02:37 AM
msquared's Avatar
Administrator
 
Join Date: Aug 2006
Location: Dublin, Ireland
Posts: 10,780

re: Database Normalization and Table Structures


Quote:
Originally Posted by mshmyob
I finally got around to reading your article Mary (lol). Nice. I explain it a little different but the concept is there.

As for the people explaining anything above 3NF you are just confusing things. Anyone out there who is a bigginner you can stop at 3NF and design any database you want.

As for why you normalize it is simple - It helps to reduce data redundancy. Notice I said help not eliminate because that is impossible.

cheers,
Thanks :D
  #61  
Old June 16th, 2008, 01:53 PM
Newbie
 
Join Date: Jun 2008
Posts: 25

re: Database Normalization and Table Structures


Nice post.

Thanks msquared!
  #62  
Old June 16th, 2008, 04:21 PM
msquared's Avatar
Administrator
 
Join Date: Aug 2006
Location: Dublin, Ireland
Posts: 10,780

re: Database Normalization and Table Structures


Quote:
Originally Posted by b0010100
Nice post.

Thanks msquared!
You're welcome :)
  #63  
Old June 26th, 2008, 04:49 PM
Newbie
 
Join Date: Jun 2008
Posts: 3

re: Database Normalization and Table Structures


Great post. Now I have a question.

Say for example I have a three tables:

user -- userid, username, email
post -- postid, userid, posttext
comment -- commentid, postid, userid, commenttext

Now, for the sake of easing database load when displaying a post with comments, or displaying a list of posts would it be permissible to have say:

post -- postid, userid, username, posttext, commentcount
comment -- commentid, postid, userid, commenttext, username

And have the username and commentcount be 'duplicate information' in the post table, and for example have the username as duplicate info in the comment table, so that I don't need the extra join with the user table to display a long list of comments on a post?

What is the consensus on doing things like this for performance reasons after having a normalized database?
  #64  
Old June 26th, 2008, 04:53 PM
msquared's Avatar
Administrator
 
Join Date: Aug 2006
Location: Dublin, Ireland
Posts: 10,780

re: Database Normalization and Table Structures


Quote:
Originally Posted by eDavid
Great post. Now I have a question.

Say for example I have a three tables:

user -- userid, username, email
post -- postid, userid, posttext
comment -- commentid, postid, userid, commenttext

Now, for the sake of easing database load when displaying a post with comments, or displaying a list of posts would it be permissible to have say:

post -- postid, userid, username, posttext, commentcount
comment -- commentid, postid, userid, commenttext, username

And have the username and commentcount be 'duplicate information' in the post table, and for example have the username as duplicate info in the comment table, so that I don't need the extra join with the user table to display a long list of comments on a post?

What is the consensus on doing things like this for performance reasons after having a normalized database?
First thing I would say is storing a calculated field like commentcount is not recommended. The reality of denormalising a database is really a matter of balance. If you do something like this for preformance reasons then you have the added headache of handling duplicate storage of information. My advice is not to do it unless the preformance issue is something you just can't handle any other way. Saving time in one place usually means you end up paying for it elsewhere.
  #65  
Old June 26th, 2008, 05:02 PM
Newbie
 
Join Date: Jun 2008
Posts: 3

re: Database Normalization and Table Structures


Quote:
Originally Posted by msquared
First thing I would say is storing a calculated field like commentcount is not recommended. The reality of denormalising a database is really a matter of balance. If you do something like this for preformance reasons then you have the added headache of handling duplicate storage of information. My advice is not to do it unless the preformance issue is something you just can't handle any other way. Saving time in one place usually means you end up paying for it elsewhere.
Thanks for the response.

So given my example above, how would you display a list of posts along with their comment count? How would you set up the query, and don't you think that would kill the db server to do a
Expand|Select|Wrap|Line Numbers
  1. SELECT COUNT(*) FROM comment WHERE postid = x
for every row of the post table?
  #66  
Old June 26th, 2008, 05:12 PM
msquared's Avatar
Administrator
 
Join Date: Aug 2006
Location: Dublin, Ireland
Posts: 10,780

re: Database Normalization and Table Structures


Quote:
Originally Posted by eDavid
Thanks for the response.

So given my example above, how would you display a list of posts along with their comment count? How would you set up the query, and don't you think that would kill the db server to do a
Expand|Select|Wrap|Line Numbers
  1. SELECT COUNT(*) FROM comment WHERE postid = x
for every row of the post table?
What language are you developing the database in?
  #67  
Old June 26th, 2008, 05:18 PM
Newbie
 
Join Date: Jun 2008
Posts: 3

re: Database Normalization and Table Structures


Quote:
Originally Posted by msquared
What language are you developing the database in?
In MySQL and PHP. This obviously isn't the right forum for the specifics, but the normalization topic is applicable regardless of the RDBMS you use.
  #68  
Old June 26th, 2008, 05:21 PM
msquared's Avatar
Administrator
 
Join Date: Aug 2006
Location: Dublin, Ireland
Posts: 10,780

re: Database Normalization and Table Structures


BTW there is no way around the fact that you have to count the comments for every post somewhere. Even if you were storing the value you still have to calculate it.

The syntax of the following will change depending on the language but ...

Expand|Select|Wrap|Line Numbers
  1. SELECT Post.PostID, Count(Comments.CommentID) As TotalComments
  2. FROM Post LEFT JOIN Comments
  3. ON Post.PostID = Comments.PostID
  #69  
Old June 26th, 2008, 05:26 PM
msquared's Avatar
Administrator
 
Join Date: Aug 2006
Location: Dublin, Ireland
Posts: 10,780

re: Database Normalization and Table Structures


Quote:
Originally Posted by msquared
BTW there is no way around the fact that you have to count the comments for every post somewhere. Even if you were storing the value you still have to calculate it.

The syntax of the following will change depending on the language but ...

Expand|Select|Wrap|Line Numbers
  1. SELECT Post.PostID, Count(Comments.CommentID) As TotalComments
  2. FROM Post LEFT JOIN Comments
  3. ON Post.PostID = Comments.PostID
OOPS! Sorry forgot the Group by ...

Expand|Select|Wrap|Line Numbers
  1. SELECT Post.PostID, Count(Comments.CommentID) As TotalComments
  2. FROM Post LEFT JOIN Comments
  3. ON Post.PostID = Comments.PostID
  4. GROUP BY Post.PostID
  #70  
Old September 11th, 2008, 03:56 PM
Newbie
 
Join Date: Sep 2008
Location: Nyack,NY
Posts: 8

re: Database Normalization and Table Structures


Great article on normalization. Refreshed my memory from college.
  #71  
Old October 27th, 2008, 10:54 AM
Newbie
 
Join Date: Oct 2008
Posts: 1

re: Database Normalization and Table Structures


Many Thanks to "BYTES" for posting such a Usefull description on Database Normalization.
Kunal
Reply

Tags
database, db2, ms access, mysql, normalisation, normalization, oracle, postgresql, relationships, sql server, table structure, tables


Similar Threads
Thread Thread Starter Forum Replies Last Post
DVD Collection Database MNNovice answers 4 April 15th, 2009 11:39 AM
Normalization and AVG across columns brianc@rstenstrom.com answers 3 March 1st, 2006 07:15 PM
Database normalization Shwetabh answers 2 January 31st, 2006 03:25 PM