473,383 Members | 1,813 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,383 software developers and data experts.

Normalization connundrum

I've come up with this issue in several apps now. There are things that, from
one perspective, are all handled the same, so it would be desirable that they
all be handled in the same table with some field as a type specification.
From other perspective of foreign key relationships, however, they are
different things and can't be stored in the same table.

For example, I have a scheme for indicating mappings between dimension records
at one time period to new dimension records at another time period. I could
use one set of tables for all mappings since they all work exactly the same
way, but then I can't set up DRI between the mapping tables and the dimension
tables. If I just make separate mapping tables for each dimension table, then
I'm creating 4 new tables per dimension table, all identical with respect to
what fields they contain, what kinds of unique constraints they have, and what
relationships they have to each other with the sole distinction that they each
map to the integer-type key of a different dimension table. I would not look
forward to doing maintenance on this schema!

Is there any strategy for having the cake and eating it, too?
Jul 20 '05 #1
7 1632
Steve Jorgensen wrote:
I've come up with this issue in several apps now. There are things that, from
one perspective, are all handled the same, so it would be desirable that they
all be handled in the same table with some field as a type specification.
From other perspective of foreign key relationships, however, they are
different things and can't be stored in the same table.

For example, I have a scheme for indicating mappings between dimension records
at one time period to new dimension records at another time period. I could
use one set of tables for all mappings since they all work exactly the same
way, but then I can't set up DRI between the mapping tables and the dimension
tables. If I just make separate mapping tables for each dimension table, then
I'm creating 4 new tables per dimension table, all identical with respect to
what fields they contain, what kinds of unique constraints they have, and what
relationships they have to each other with the sole distinction that they each
map to the integer-type key of a different dimension table. I would not look
forward to doing maintenance on this schema!

Is there any strategy for having the cake and eating it, too?


That things are handled the same way is not a criterion for putting them
into a single table. Let me refer you to Date and Cobb.

A table is a set. If the data constitutes a set as defined by set theory
it belongs in one table. If not ... it doesn't. The sole exceptions
being non-relational models such as star and object schemas.

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Jul 20 '05 #2
On Mon, 05 Jan 2004 17:42:22 -0800, Daniel Morgan <da******@x.washington.edu>
wrote:
Steve Jorgensen wrote:
I've come up with this issue in several apps now. There are things that, from
one perspective, are all handled the same, so it would be desirable that they
all be handled in the same table with some field as a type specification.
From other perspective of foreign key relationships, however, they are
different things and can't be stored in the same table.

For example, I have a scheme for indicating mappings between dimension records
at one time period to new dimension records at another time period. I could
use one set of tables for all mappings since they all work exactly the same
way, but then I can't set up DRI between the mapping tables and the dimension
tables. If I just make separate mapping tables for each dimension table, then
I'm creating 4 new tables per dimension table, all identical with respect to
what fields they contain, what kinds of unique constraints they have, and what
relationships they have to each other with the sole distinction that they each
map to the integer-type key of a different dimension table. I would not look
forward to doing maintenance on this schema!

Is there any strategy for having the cake and eating it, too?


That things are handled the same way is not a criterion for putting them
into a single table. Let me refer you to Date and Cobb.

A table is a set. If the data constitutes a set as defined by set theory
it belongs in one table. If not ... it doesn't. The sole exceptions
being non-relational models such as star and object schemas.


Then, the correct wording of the question might refer to avoiding schema
duplication (leading to difficulty of maintenance) rather than normalization.
Perhaps, the best answer in this case would not be the best normalized answer.
Jul 20 '05 #3
Steve Jorgensen wrote:
Then, the correct wording of the question might refer to avoiding schema
duplication (leading to difficulty of maintenance) rather than normalization.
Perhaps, the best answer in this case would not be the best normalized answer.


Not unless there is some compelling case for denormalization. And I
don't see one in what was posted.

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Jul 20 '05 #4
On Mon, 05 Jan 2004 19:38:41 -0800, Daniel Morgan <da******@x.washington.edu>
wrote:
Steve Jorgensen wrote:
Then, the correct wording of the question might refer to avoiding schema
duplication (leading to difficulty of maintenance) rather than normalization.
Perhaps, the best answer in this case would not be the best normalized answer.


Not unless there is some compelling case for denormalization. And I
don't see one in what was posted.


The compelling aspects as I see them are ...

1. It is simply a good practice to eliminate duplication in code or in a data
model. Any intentional duplication is a source of potential trouble, since
it's possible to make subtle mistakes when attempting to change that shared,
repeated code.

2. Having the number of supporting tables multiplied by the number of
dimension tables makes for a vastly more complex ER diagram that someone will
have to decipher in the future.

3. I wouldn't have to resort to dynamic table names in code or writing a code
generator to avoid duplicating code as well. This is especially an issue if I
am I use stored procedures to update the data in these tables.

Perhaps, you don't think these are compelling, and that's fine. I find them
compelling, though, and I'm still hoping for some advice on how to accomplish
it.

I had one idea in which I can use an intervening table to translate between
the general and specific cases. Basically, I would have a generic dimension
table with a type code, then a specific dimension table (say category) that
also has a dimension ID and a type code and a 1-1 relationship to the generic
dimension on both fields. The specific table, then has a constraint that only
allows the applicable dimension type code as a valid value for the type code
field.

Any opinions on this? Other ideas?
Jul 20 '05 #5
Steve Jorgensen wrote:
Perhaps, you don't think these are compelling, and that's fine. I find them
compelling, though, and I'm still hoping for some advice on how to accomplish
it.
You are correct. I'd suggest you ask Mr. Celko for his opinion. ;-)
I had one idea in which I can use an intervening table to translate between
the general and specific cases. Basically, I would have a generic dimension
table with a type code, then a specific dimension table (say category) that
also has a dimension ID and a type code and a 1-1 relationship to the generic
dimension on both fields. The specific table, then has a constraint that only
allows the applicable dimension type code as a valid value for the type code
field.

Any opinions on this? Other ideas?


The fact that you don't know how to implement this and are talking about
a translation table, at least to me, would be warning enough you are
heading down the wrong road.

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
da******@x.washington.edu
(replace 'x' with a 'u' to reply)

Jul 20 '05 #6
On Mon, 05 Jan 2004 22:06:55 -0800, Daniel Morgan <da******@x.washington.edu>
wrote:
Steve Jorgensen wrote:
Perhaps, you don't think these are compelling, and that's fine. I find them
compelling, though, and I'm still hoping for some advice on how to accomplish
it.


You are correct. I'd suggest you ask Mr. Celko for his opinion. ;-)
I had one idea in which I can use an intervening table to translate between
the general and specific cases. Basically, I would have a generic dimension
table with a type code, then a specific dimension table (say category) that
also has a dimension ID and a type code and a 1-1 relationship to the generic
dimension on both fields. The specific table, then has a constraint that only
allows the applicable dimension type code as a valid value for the type code
field.

Any opinions on this? Other ideas?


The fact that you don't know how to implement this and are talking about
a translation table, at least to me, would be warning enough you are
heading down the wrong road.


I'm not sure what you're saying. I know how to implement what I just
described. As you say, perhaps it's not the "right" approach, and certainly
more complicated than what I'd prefer, but in the absence of any other
suggestions, it's the only one I've come up with besides either implementing
the massive duplication or forgoing the use of DRI.

Also, I'm not sure what seems so obviously wrong to you about the translation
table idea (though I would welcome an explanation). It's defintitely a
kludge, but on the surface, it doesn't look too terrible to me as kludges go.
Jul 20 '05 #7
Steve Jorgensen (no****@nospam.nospam) writes:
I've come up with this issue in several apps now. There are things
that, from one perspective, are all handled the same, so it would be
desirable that they all be handled in the same table with some field as
a type specification. From other perspective of foreign key
relationships, however, they are different things and can't be stored in
the same table.

For example, I have a scheme for indicating mappings between dimension
records at one time period to new dimension records at another time
period. I could use one set of tables for all mappings since they all
work exactly the same way, but then I can't set up DRI between the
mapping tables and the dimension tables. If I just make separate
mapping tables for each dimension table, then I'm creating 4 new tables
per dimension table, all identical with respect to what fields they
contain, what kinds of unique constraints they have, and what
relationships they have to each other with the sole distinction that
they each map to the integer-type key of a different dimension table. I
would not look forward to doing maintenance on this schema!


I'm not at all into data warehousing, so I don't know what dimension tables
are, but it sounds to me that you have a bunch of table which are like:

CREATE TABLE entity(entityid int NOT NULL PRIMARY KEY,
entityname varchar(30) NOT NULL)

and you want to collapse them into one. Yes, I have seen this. And I
killed it when I took over that data model. You get less tables, but
you also get less flexibility. In the system I work with, we may have
over fifty such tables. But not all are equal. Suddenly there is a need
for an extra column in such a table, one is specific for the entity in
question. Not very fun if you have all in one table.

Two other drawbacks with this arrangement is when you look at a graphical
data model, you can only see that a table refers to the general entity
table, but not for which entity. And each reference requires two columns
in the referring table, whereof one is a constant column. I rather have
extra tables than superfluous tables.

Now, judging the previous thread, you seem to have a special problem
with mappings that are in the same domain, but different at different
points in time. These mappings might be worthwhile to lump into one
table, four by four so to speak. Actually, we have such an arrangement
somewhere in our data model for one specific entity. But it would be
dangerous to lump everything into this basket.

--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

7
by: Adrian | last post by:
Normalization Question - Please bear with me, hopefully things will be clear at the end of the question. Given a treaty table containg treaties; Treaty1, Treaty2 etc and a benefit table;...
3
by: JoeB | last post by:
I found an interesting thread on this from Nov., 2000, but it didn't fully answer my question. I understand normalization, but am trying to find the line between good database design and personal...
4
by: Catherine Jo Morgan | last post by:
I'm wondering if I'm trying to normalize data way beyond what will prove practical. For example, if I put People, Addreses, and Comms in 3 different tables, with junction tables for...
2
by: Kostas | last post by:
Hi all, I strived understanding the concepts of all normal forms up to the domain-key normal form but there are some issues I still am unsure of, and I will describe the first one in this post....
7
by: John Welch | last post by:
I have three tables and the way they are currently set up violates good normalization, but I'm having trouble figuring out a better way. Can someone suggest a more elegant solution? My tables...
20
by: hippomedon | last post by:
Hello everyone, I'm looking for some advice on whether I should break the normalization rule. Normally, I would not consider it, but this seems to be a special case. I have created an...
36
by: Shraddha | last post by:
What is mean by normalization of pointers...Where we use that?
1
by: weird0 | last post by:
I know that is not the appropriate group for asking this question, but I do not know of any better forum. Can anyone plz explain clearly what normalization is? The first three normal forms upto...
22
by: sophia | last post by:
Dear all, can any one explain what is meant by pointer normalization given here:- http://c-faq.com/ansi/norml.html
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.