473,799 Members | 3,255 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Should I Denormalize?

I have a database design question and wanted to see if folks could lend some
advice. For simplicity sake, I'll illustrate using a book (but the
hierarchy is the same in the real application - just with extra fields).

A book has a title has chapters.
A chapter has a title and has pages.
A page has a title has pictures.
A picture has a title.

My application will show a book, a chapter, a page, or a picture at any
given time. However, regardless of what I'm showing, I need to show my
name, and the names of all my parents (e.g. a page must show it's name, and
the names of it's parent chapter and book).

The question - in database terms, does the second, denormalized table design
give any performance advantage? Here are my two options:

-- Normalized --
book: book_id, title
chapter: chapter_id, book_id, title
page: page_id, chapter_id, title
picture: picture_id, page_id, title

-- Denormalized --
book: book_id, title
chapter: chapter_id, book_id, title
page: page_id, chapter_id, book_id, title
picture: picture_id, page_id, chapter_id, book_id, title

I'll use the most extreme case (the lowest table) as an example. When
viewing a picture, I'll need to get the picture title, and also the titles
of each of my parents. Below are the resulting queries and explain routes:

NORMALIZED QUERY

select book.title, chapter.title, page.title, picture.title
from book, chapter, page, picture
where book.book_id = chapter.book_id
and chapter.chapter _id = page.chapter_id
and page.page_id = picture.page_id
and picture.picture _id = 3601

NORMALIZED EXPLAIN

PRIMARY,fk_pict ure_2_page
PRIMARY,fk_page _2_chapter
PRIMARY,fk_chap ter_2_book
PRIMARY

If I chose the denormalized version, they are:

DENORMALIZED QUERY

select book.title, chapter.title, page.title, picture.title
from book, chapter, page, picture
where book.book_id = picture.book_id
and chapter.chapter _id = picture.chapter _id
and page.page_id = picture.page_id
and picture.picture _id = 3601

DENORMALIZED EXPLAIN

PRIMARY,fk_pict ure_2_page,fk_p icture_2_chapte r,fk_picture_2_ book
PRIMARY
PRIMARY
PRIMARY

I've run some basic tests, and haven't noticed any real difference in
performance. I'm pretty new to database design, so if anyone has any
pointers, please let me know.

Thanks!

--
Josh
Jul 23 '05 #1
4 2842
"Josh" <memphis75 - gmail> wrote in message
news:11******** *****@corp.supe rnews.com...
<snip>
The question - in database terms, does the second, denormalized table design give any performance advantage?

<snip>

The entire (MySQL) system is designed to perform it's all around best using
textbook
classic normalized tables. The kind every intro to db design preaches. This
seems to be the most difficult thing to "preach" to newcomers to database
design. It just sounds so unconvincing that one *should* spend a bit more
effort writing table joins instead of cramming redundant information into
flat tables. My advice to someone serious about database design would be to
simply take it on faith because the "Ahah!" experience will come to you.
For most of us who begin by hot rodding flat tables at the get go the
revelation experience is more like "Oh sh**! - So that's why you're supposed
to do it that way! ..."

Let me just say that you (might!) get some performance advantage using flat
"denormaliz ed" tables. MySQL would certainly be the *wrong* tool for this.
I would go for a simple Excel table or one of the several flat file database
systems that are out there. As long as the data set is *small* and the
entire system is *simple* you will enjoy a performance advantage and may
never encounter the numerous and onerous booby traps this will set for a
system you expect to grow over time. MySQL, on the other hand, is meant to
scale large. It may not be as fast out of the gate, but neither will it slow
or bog down as the system expands. It's more like a marathon runner. As the
miles wear on the sprinter always falls behind never to catch up again.

But hey! If you are only going to cover 100 or 200 yards - by all means
sprint. Just don't waste your efforts by using a marathon tool like MySQL.

Okay - enough. Now what does your book say about normalizing a database
that you don't quite believe? Or the advantage that you think might be worth
sacrificing for few microseconds faster response time?

Thomas Bartkus

Oh how I wish production MySQL had views or virtual tables. Then when the
next PHP hotshot demands a flat table to simplify his coding I could just
write a nice proper query, store it, and tell him I'm giving him his table -
muttering "jerk!" under my breath.
Jul 23 '05 #2
Thomas - Thank you for taking the time to explain this!
The entire (MySQL) system is designed to perform it's all around best
using textbook classic normalized tables.
This is exactly the information I was hoping to get.
take it on faith because the "Ahah!" experience will come to you.


I loaded a few million records across the tables and ran some performance
tests (multi-threaded). You're absolutely right about the minimal
performance gains. As long as I was joining on primary keys, there was no
significant boost by denormalizing, only extra columns to keep up with!

--
Josh

Jul 23 '05 #3
"Josh" <memphis75 - gmail> wrote in message
news:11******** *****@corp.supe rnews.com...
Thomas - Thank you for taking the time to explain this!
The entire (MySQL) system is designed to perform it's all around best
using textbook classic normalized tables.
This is exactly the information I was hoping to get.
take it on faith because the "Ahah!" experience will come to you.


I loaded a few million records across the tables and ran some performance
tests (multi-threaded). You're absolutely right about the minimal
performance gains. As long as I was joining on primary keys, there was no
significant boost by denormalizing, ....

.... only extra columns to keep up with! * * *


Yup - that's one of the ahahs!
Maintaining data integrity is going to be *so much easier and reliable* when
you follow the rules.

There will be more!
Thomas Bartkus
Jul 23 '05 #4
Thomas Bartkus wrote:
"Josh" <memphis75 - gmail> wrote in message
news:11******** *****@corp.supe rnews.com...
Thomas - Thank you for taking the time to explain this!

The entire (MySQL) system is designed to perform it's all around best
using textbook classic normalized tables.


This is exactly the information I was hoping to get.

take it on faith because the "Ahah!" experience will come to you.


I loaded a few million records across the tables and ran some performance
tests (multi-threaded). You're absolutely right about the minimal
performance gains. As long as I was joining on primary keys, there was no
significant boost by denormalizing, ....


> .... only extra columns to keep up with! * * *


Yup - that's one of the ahahs!
Maintaining data integrity is going to be *so much easier and reliable* when
you follow the rules.

There will be more!
Thomas Bartkus

Another factor to consider when looking at performance is the
cost of "throwing hardware at the problem". As RAM and hard
disks both become less expensive and faster to access trying for
performance improvements by denormalizing your database (aside
from probably not yielding any significant gains) is short
sighted and will result in additional maintenance costs that
will exceed the cost of additional hardware.
Jul 23 '05 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

303
17794
by: mike420 | last post by:
In the context of LATEX, some Pythonista asked what the big successes of Lisp were. I think there were at least three *big* successes. a. orbitz.com web site uses Lisp for algorithms, etc. b. Yahoo store was originally written in Lisp. c. Emacs The issues with these will probably come up, so I might as well mention them myself (which will also make this a more balanced
3
2205
by: Andy Visniewski | last post by:
Should be easy, but I've been trying to figure this out for about half an hour with no luck. There is a table 'Cybex' which has all the Cybex products we sell, and a table 'Datasheets' which contains a list of the datasheets available for products. Both tables have a PartNumber column. On with the problem. If i do SELECT * FROM Cybex, it lists the 1000 or so products. and then SELECT * FROM DataSheets, it lists the 800 datasheets we...
36
4113
by: Jack Hughes | last post by:
This argument has come up two or three times lately. We have a web standards document at our company that say "Use valid HTML 4; We don't recommend switching to DIV-based CSS-P for layout just yet though, tables are a lot easier." That's because we have a lot of developers who aren't ready to move to CSS-P and DIVs just yet. But I'm being told our policies were contradictory. If we're going to
12
2389
by: Sanjay | last post by:
hi, We are currently porting our project from VB6 to VB .NET. Earlier we used to make scale transformations on objects like pictureBox , forms etc.Now Such transformations are made on the graphics object of the form/pictureBox. Should It be better if make a graphics object from my pictureBox in load event handler of the form and store it as member variable of the form , make
8
1126
by: Chris Mayers | last post by:
I am trying to track a suspected memory leak in an application I'm writing and through repeated simplification of my project, I have found the following which is quite easily replicated should you be prepared to give me 5 minutes of your time... Create a new C# (Probably the same in VB dotNET but I've not tried it) Windows application with 2 forms, the default form -Form1 and one other one. that has a bunch of controls on it (say 20...
5
1070
by: davidw | last post by:
Last time, I asked a question about if a XPathDocument could be serialized. Actually what I really want to know is how should I do in this case: I have a class that will be serialized. In the class, I need call another class, which will read setting from database. I want to load the xml once and use it in the following calls. So it need to be serialized. But XPathDocument doesn't support it. What should I do here? make the class...
6
3673
by: Terry Bell | last post by:
We've had a very large A97 app running fine for the last seven years. I've just converted to SQL Server backend, which is being tested, but meanwhile the JET based version, running under terminal server, has suddenly started running very slowly. The network engineer has thrown up his hands and said "It's Access 97". I've checked out lots of things including the Oplocks setting and other stuff from this NG, and I think I've done everything...
8
2224
by: FAQ server | last post by:
----------------------------------------------------------------------- FAQ Topic - When should I use eval? ----------------------------------------------------------------------- The ` eval() ` function should _only_ be used when it is necessary to evaluate a string supplied or composed at run-time; the string can be anything from a simple (but unpredictable) expression such as 12*2.54 to a substantial piece of javascript code. The `...
1
2333
by: baburk | last post by:
Hi all, Is there is any Specific Reasons To Denormalize. Any example?. Thanks
0
9541
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10485
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
1
10231
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
10027
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
7565
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6805
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5463
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5585
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
4141
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.