I have a database design question and wanted to see if folks could lend some
advice. For simplicity sake, I'll illustrate using a book (but the
hierarchy is the same in the real application - just with extra fields).
A book has a title has chapters.
A chapter has a title and has pages.
A page has a title has pictures.
A picture has a title.
My application will show a book, a chapter, a page, or a picture at any
given time. However, regardless of what I'm showing, I need to show my
name, and the names of all my parents (e.g. a page must show it's name, and
the names of it's parent chapter and book).
The question - in database terms, does the second, denormalized table design
give any performance advantage? Here are my two options:
-- Normalized --
book: book_id, title
chapter: chapter_id, book_id, title
page: page_id, chapter_id, title
picture: picture_id, page_id, title
-- Denormalized --
book: book_id, title
chapter: chapter_id, book_id, title
page: page_id, chapter_id, book_id, title
picture: picture_id, page_id, chapter_id, book_id, title
I'll use the most extreme case (the lowest table) as an example. When
viewing a picture, I'll need to get the picture title, and also the titles
of each of my parents. Below are the resulting queries and explain routes:
NORMALIZED QUERY
select book.title, chapter.title, page.title, picture.title
from book, chapter, page, picture
where book.book_id = chapter.book_id
and chapter.chapter_id = page.chapter_id
and page.page_id = picture.page_id
and picture.picture_id = 3601
NORMALIZED EXPLAIN
PRIMARY,fk_picture_2_page
PRIMARY,fk_page_2_chapter
PRIMARY,fk_chapter_2_book
PRIMARY
If I chose the denormalized version, they are:
DENORMALIZED QUERY
select book.title, chapter.title, page.title, picture.title
from book, chapter, page, picture
where book.book_id = picture.book_id
and chapter.chapter_id = picture.chapter_id
and page.page_id = picture.page_id
and picture.picture_id = 3601
DENORMALIZED EXPLAIN
PRIMARY,fk_picture_2_page,fk_picture_2_chapter,fk_ picture_2_book
PRIMARY
PRIMARY
PRIMARY
I've run some basic tests, and haven't noticed any real difference in
performance. I'm pretty new to database design, so if anyone has any
pointers, please let me know.
Thanks!
--
Josh