By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
424,982 Members | 1,955 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 424,982 IT Pros & Developers. It's quick & easy.

Separate foreign keys with shared ID space

P: n/a
Let's say I have a type hierarchy: (just an example)

the general entity customer:

CREATE TABLE customer(customer_id int, customer_name varchar(250),
customer_type int)

three specific customer subtypes: 1 - business, 2 - home, 3 -
university

represented by the following three tables (being subtypes, they share
ID space for customer_id)

CREATE TABLE business_customer(customer_id int, business_sector int,
.... )
CREATE TABLE home_customer(customer_id int, household_income_bracket
int, ...)
CREATE TABLE university_customer(customer_id int, number_students int,
....)

Is it possible to make a foreing key constraint that says:

the customer_id in busineness_customer table must reference only those
customer_id in the customer table where customer_type == 1?

the customer_id in home_customer table must reference only those
customer_id in the customer table where customer_type == 2?
Thanks!

- Robert
Jul 20 '05 #1
Share this Question
Share on Google+
31 Replies


P: n/a
**** Post for FREE via your newsreader at post.usenet.com ****

Hi Robert
Is it possible to make a foreing key constraint that says:

the customer_id in busineness_customer table must reference only those
customer_id in the customer table where customer_type == 1?

the customer_id in home_customer table must reference only those
customer_id in the customer table where customer_type == 2?


With a foreign key you can only reference a primary key. Since customer_type
is not part of it, you cannot use it on the subtypes.

On the other side, if you add customer_type to the customer's primary key,
you should add it on the subtype as well. But the foreign key alone will not
be enough, i.e. you should add a check constraint on the subtype to ensure
such a rule.

==> There is no good (from a design point of view) foreign key to enforce
such a constraint.
Chris

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
*** Usenet.com - The #1 Usenet Newsgroup Service on The Planet! ***
http://www.usenet.com
Unlimited Download - 19 Seperate Servers - 90,000 groups - Uncensored
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Jul 20 '05 #2

P: n/a
ro*************@yahoo.com (Robert Brown) wrote in message news:<24**************************@posting.google. com>...
Let's say I have a type hierarchy: (just an example)

the general entity customer:

CREATE TABLE customer(customer_id int, customer_name varchar(250),
customer_type int)

three specific customer subtypes: 1 - business, 2 - home, 3 -
university

represented by the following three tables (being subtypes, they share
ID space for customer_id)

CREATE TABLE business_customer(customer_id int, business_sector int,
... )
CREATE TABLE home_customer(customer_id int, household_income_bracket
int, ...)
CREATE TABLE university_customer(customer_id int, number_students int,
...)

Is it possible to make a foreing key constraint that says:

the customer_id in busineness_customer table must reference only those
customer_id in the customer table where customer_type == 1?

the customer_id in home_customer table must reference only those
customer_id in the customer table where customer_type == 2?
Thanks!

- Robert


Something like:

CREATE TABLE customer (
customer_id int not null primary key,
customer_name varchar(250) not null,
customer_type int not null check (customer_type in (1,2,3)),
unique (customer_id, customer_type )
);

CREATE TABLE business_customer (
customer_id int not null primary key,
customer_type int not null default 1 check (customer_type = 1),
...
foreign key (customer_id, customer_type)
references customer (customer_id, customer_type)
);

etc.

HTH
/Lennart
Jul 20 '05 #3

P: n/a
It seems like type hierarchies are a common technique in relation
design. What approaches do people use to provide relational integrity
in cases like this? Should I add the customer_type column to the
subtypes and then use composite foreign key? I guess that could work
but it has the downside of using up tablespace with a column of
customer_type that will always have the same value for the subtype. Is
that considered just the cost of doing business in this situation?
"Christian Antognini" <ch*****************@trivadis.com> wrote in message news:<41********@post.usenet.com>...
**** Post for FREE via your newsreader at post.usenet.com ****

Hi Robert
Is it possible to make a foreing key constraint that says:

the customer_id in busineness_customer table must reference only those
customer_id in the customer table where customer_type == 1?

the customer_id in home_customer table must reference only those
customer_id in the customer table where customer_type == 2?


With a foreign key you can only reference a primary key. Since customer_type
is not part of it, you cannot use it on the subtypes.

On the other side, if you add customer_type to the customer's primary key,
you should add it on the subtype as well. But the foreign key alone will not
be enough, i.e. you should add a check constraint on the subtype to ensure
such a rule.

==> There is no good (from a design point of view) foreign key to enforce
such a constraint.
Chris

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
*** Usenet.com - The #1 Usenet Newsgroup Service on The Planet! ***
http://www.usenet.com
Unlimited Download - 19 Seperate Servers - 90,000 groups - Uncensored
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Jul 20 '05 #4

P: n/a
Your design may need some more consideration.

From a theoreticaly relational design standpoint, when you have a
entity type that has subtypes, at the ERD stage, you have to figure
out the attributes that belong to the (supertype), attributes that
belong to each subtype, and optionality of each of those attributes.

When you are ready to consider creating tables from those entities ...
you have a choice.

You can either create one table the supertype and in that table it
will have all of the columns that belong to the supertype plus all of
the columns that belong TO EACH of the subtypes.

OR

You do not create the supertype table and then you create a separate
set of tables for each subtype. In each subtype go all of the columns
from the supertype along with the specific set of columns that belong
to the subtype.

Then there are programming tradeoff's that depend on which way the
design tradeoff was implemented.

It appears to me if I understand what you presented, you have both the
supertype table along with the subtype tables. That looks wrong to
me.
Jul 20 '05 #5

P: n/a
We have a similar design issue in our DB.

We implemented this kind of super-class/sub-class design using views.

We created views with joins with the parent & child tables and
'instead of' triggers. The users can only see the views though they
see them as if they were tables (by using synonyms).

In your example you could create three views: business_customer_vw,
home_customer_vw, university_customer_vw. They share the common
attributes from
the customer table and the respective specific attributes from the
business_customer, home_customer, university_customer tables.
Optionally, you can create synonyms business_customer, home_customer,
university_customer for the views.

The underlying tables data are managed by the instead of triggers.

I can only say It works for us...

Hope this helps.

Carlos.

le*****@kommunicera.umea.se (Lennart Jonsson) wrote in message news:<6d**************************@posting.google. com>...
ro*************@yahoo.com (Robert Brown) wrote in message news:<24**************************@posting.google. com>...
Let's say I have a type hierarchy: (just an example)

the general entity customer:

CREATE TABLE customer(customer_id int, customer_name varchar(250),
customer_type int)

three specific customer subtypes: 1 - business, 2 - home, 3 -
university

represented by the following three tables (being subtypes, they share
ID space for customer_id)

CREATE TABLE business_customer(customer_id int, business_sector int,
... )
CREATE TABLE home_customer(customer_id int, household_income_bracket
int, ...)
CREATE TABLE university_customer(customer_id int, number_students int,
...)

Is it possible to make a foreing key constraint that says:

the customer_id in busineness_customer table must reference only those
customer_id in the customer table where customer_type == 1?

the customer_id in home_customer table must reference only those
customer_id in the customer table where customer_type == 2?
Thanks!

- Robert


Something like:

CREATE TABLE customer (
customer_id int not null primary key,
customer_name varchar(250) not null,
customer_type int not null check (customer_type in (1,2,3)),
unique (customer_id, customer_type )
);

CREATE TABLE business_customer (
customer_id int not null primary key,
customer_type int not null default 1 check (customer_type = 1),
...
foreign key (customer_id, customer_type)
references customer (customer_id, customer_type)
);

etc.

HTH
/Lennart

Jul 20 '05 #6

P: n/a
**** Post for FREE via your newsreader at post.usenet.com ****

Hi Robert
It seems like type hierarchies are a common technique in relation
design. What approaches do people use to provide relational integrity
in cases like this? Should I add the customer_type column to the
subtypes and then use composite foreign key?


A PK should have no business meaning. Therefore, I'll not add such a column
to the primary key.

Once I solved such a problem with as many parent tables as child tables and
then by putting a views that aggregated the information over the different
types of customer. Of course it is only a good solution if you rarely query
all customers at the same time. Another drawback is that the primary key
cannot be guaranteed for the whole customers. But, if you generate the
customer id with a sequence, this should not be a problem.

Chris

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
*** Usenet.com - The #1 Usenet Newsgroup Service on The Planet! ***
http://www.usenet.com
Unlimited Download - 19 Seperate Servers - 90,000 groups - Uncensored
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Jul 20 '05 #7

P: n/a
"Christian Antognini" <ch*****************@trivadis.com> wrote in message news:41********@post.usenet.com...

A PK should have no business meaning.


Says who? Can you justify this statement?
Marshall
Jul 20 '05 #8

P: n/a
After a long battle with technology, "Marshall Spight" <ms*****@dnai.com>, an earthling, wrote:
"Christian Antognini" <ch*****************@trivadis.com> wrote in message news:41********@post.usenet.com...

A PK should have no business meaning.


Says who? Can you justify this statement?


A good reason for this is that business meanings can change, but
primary keys can't.

A typical example of this is the use of the government "social
insurance/security" as a PK. It's not _supposed_ to change, but it
can.

Supposing somebody does a "steal my identity" thing using my SIN/SSN
number, and things go so gravely badly that the government actually
decides that it is a better thing to give me a new number, that
_breaks_ the use of SSN/SIN as a primary key.

If we fabricate a number of our own as an "employee ID," that's well
and good, until such time as there is a corporate merger that has
conflicting ID spaces so that peoples' IDs have to change.
--
(format nil "~S@~S" "cbbrowne" "acm.org")
http://cbbrowne.com/info/sgml.html
"That's convenience, not cracker-proofing. Security is an emergent
property, not a feature." -- void <fl***@interport.net>
Jul 20 '05 #9

P: n/a
Marshall Spight wrote:
"Christian Antognini" <ch*****************@trivadis.com> wrote in message
news:41********@post.usenet.com...

A PK should have no business meaning.


Says who? Can you justify this statement?


A PK should be selected to uniquely identify an entity. Ideally, and by
formal definition, the PK is invariant.

All to often a unique attribute of the entity, such as empno (or SSN/SIN or
name or email address,) is used as the PK. Attributes generally have a
business meaning. Such attributes can change, although some change very
infrequently.

Selecting an attribute as the PK can cause DBAs (and/or developers and/or
businesses) headaches when attempting to change the PK. This is especially
true when taking the offline archives into account.

All of the examples are not invariant, although they tend not to change
often. In some cases using the suggested PK is against the law (eg: SIN in
Canada, has privacy implications).

Hans
Jul 20 '05 #10

P: n/a

"Marshall Spight" <ms*****@dnai.com> wrote in message
news:LPQOc.63922$eM2.25470@attbi_s51...
"Christian Antognini" <ch*****************@trivadis.com> wrote in message news:41********@post.usenet.com...

A PK should have no business meaning.


Says who? Can you justify this statement?


Hmm, I want to see what Joe Celko says about this one....



Marshall

Jul 20 '05 #11

P: n/a
Dan

"Christopher Browne" <cb******@acm.org> wrote in message
news:2n************@uni-berlin.de...
After a long battle with technology, "Marshall Spight" <ms*****@dnai.com>, an earthling, wrote:
"Christian Antognini" <ch*****************@trivadis.com> wrote in message news:41********@post.usenet.com...

A PK should have no business meaning.


Says who? Can you justify this statement?


A good reason for this is that business meanings can change, but
primary keys can't.


On the contrary, primary keys, as with any key, can change as long as they
don't run afoul of another primary key value.

A typical example of this is the use of the government "social
insurance/security" as a PK. It's not _supposed_ to change, but it
can.
Supposing somebody does a "steal my identity" thing using my SIN/SSN
number, and things go so gravely badly that the government actually
decides that it is a better thing to give me a new number, that
_breaks_ the use of SSN/SIN as a primary key.
Why?

CREATE TABLE old_ssns
(
old_ssn CHAR(9),
current_ssn CHAR(9),
PRIMARY KEY (old_ssn)
);

UPDATE people
SET SSN = <new SSN>,
WHERE SSN = <old SSN>

INSERT INTO old_ssns
(old_ssn, current_ssn)
VALUES (<old SSN>, <new SSN>);

COMMIT;


If we fabricate a number of our own as an "employee ID," that's well
and good, until such time as there is a corporate merger that has
conflicting ID spaces so that peoples' IDs have to change.
I agree that this is a problem but, it is a problem of logic and reflecting
how the world works and changes. I think that convoluting the notion of
keys (which are by definition a subset of properties of some entity)
reflecting semantics of certain properties to somthing different is not
always the proper approach.

Codd actually went into this in depth with his RM/2 proposal, which I really
need to reread. I think his proposal and ideas haven't been given the
attention they deserve.

- Dan --
(format nil "~S@~S" "cbbrowne" "acm.org")
http://cbbrowne.com/info/sgml.html
"That's convenience, not cracker-proofing. Security is an emergent
property, not a feature." -- void <fl***@interport.net>

Jul 20 '05 #12

P: n/a
Marshall Spight (ms*****@dnai.com) writes:
"Christian Antognini" <ch*****************@trivadis.com> wrote in message

news:41********@post.usenet.com...

A PK should have no business meaning.


Says who? Can you justify this statement?


Because most real-world look-like keys do not live up to the strict
requirements of the primary keys in a database. They are too often
nullable or non-unique. Person identification numbers such as SSN
are a good example of both.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #13

P: n/a

"Erland Sommarskog" <es****@sommarskog.se> wrote in message
news:Xn*********************@127.0.0.1...
Marshall Spight (ms*****@dnai.com) writes:
"Christian Antognini" <ch*****************@trivadis.com> wrote in message
news:41********@post.usenet.com...
A PK should have no business meaning.


Says who? Can you justify this statement?


Because most real-world look-like keys do not live up to the strict
requirements of the primary keys in a database. They are too often
nullable or non-unique. Person identification numbers such as SSN
are a good example of both.


The problem I have with synthetic primary keys is precisely that they
abstract themselves so far away from business logic and hence become
'meaningless' that they cease to do the job they were intended to do. If I
insert a new record into the PERSONS table, using only a sequence number
generator to supply a new, unique ID for the row, I am pretty well
guaranteed to be able to insert the new record, even for a criminal clone of
Al Capone. If I use the SSN as a primary key field, however, it is highly
likely that my insert will (correctly) fail whenever someone is trying to do
a bit of identity theft. Sure, I could use a synthetic primary key, and then
add a unique constraint to a field such as SSN... but then I am merely doing
to the SSN manually what the declaration of it as a primary key would have
done in the first place. Meanwhile, there's a second index that I now have
to bear the costs of maintaining instead of just one.

For me, a primary key should absolutely embody a business rule ("no two
people can share a SSN", for example), since a primary key is a *logical*
way of uniquely referring to every row in the table, and logic and business
rules are close companions. If I want a non-nullable, meaningless and
guaranteed unique *physical* reference to my rows, I have the ROWID for
that. Synthetic keys seem to me to merely replicating the same sort of
functionality that the ROWID gives one in any case.

There are arguments both ways, of course. And some of the 'big names' are
definitely fans of synthetic keys (Steve Adams springs to mind), which
should give me pause. But I still don't like the extra unique checks needed
when synthetics are employed. And it seems to me that, in any case, there is
no way the original bald statement should be accepted without qualification.

Regards
HJR

Jul 20 '05 #14

P: n/a
Howard J. Rogers (hj*@dizwell.com) writes:
The problem I have with synthetic primary keys is precisely that they
abstract themselves so far away from business logic and hence become
'meaningless' that they cease to do the job they were intended to do. If
I insert a new record into the PERSONS table, using only a sequence
number generator to supply a new, unique ID for the row, I am pretty
well guaranteed to be able to insert the new record, even for a criminal
clone of Al Capone. If I use the SSN as a primary key field, however, it
is highly likely that my insert will (correctly) fail whenever someone
is trying to do a bit of identity theft.
So you use SSN. Now you want to enter me in your database. What is my
SSN?
Sure, I could use a synthetic primary key, and then
add a unique constraint to a field such as SSN...
At least then you have less problem when someone's SSN changes. And
you can handle persons without SSNs. (That is, your constraint should
be unique-when-NULL.)

You can also apply a partial validation rule. If the person is a Swedish
tax payer, I may require that is person identifcation number is unique
(because our customers reports to Swedosh tax authorities), but if he
pays tax in Upper Fragonia I could care less what his identification
numbers they may use there.
For me, a primary key should absolutely embody a business rule ("no two
people can share a SSN", for example),


The problem is that the real-world business rule is different. Two
persons *can* have the same person identification number. It can be
because of systems that have goofed, or it can be because they live
in different countries - but still lives in the same system.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #15

P: n/a

"Erland Sommarskog" <es****@sommarskog.se> wrote in message
news:Xn*********************@127.0.0.1...
Howard J. Rogers (hj*@dizwell.com) writes:
The problem I have with synthetic primary keys is precisely that they
abstract themselves so far away from business logic and hence become
'meaningless' that they cease to do the job they were intended to do. If
I insert a new record into the PERSONS table, using only a sequence
number generator to supply a new, unique ID for the row, I am pretty
well guaranteed to be able to insert the new record, even for a criminal
clone of Al Capone. If I use the SSN as a primary key field, however, it
is highly likely that my insert will (correctly) fail whenever someone
is trying to do a bit of identity theft.
So you use SSN. Now you want to enter me in your database. What is my
SSN?


I'm not entirely sure what point you're making. Either you have one, in
which case I must know it, and then I can enter your details. Or you don't
have one, in which case, I can't enter your details. (And I can't tell you
the number of times that is exactly what has happened to me: "I need to see
your driver's licence". "I don't have one". "Well, your passport then".
"I've left it in the hotel". "Well, tough. I cannot process your application
further without that information".) Unless the business rules are not as
simple as that, and in fact I can ask for some other form of identification
from you, and can use that/those as the primary key.

In other words, if obtaining and entering your SSN is not a problem, SSN is
a good primary key candidate. If there's a problem with obtaining your SSN,
or if there is some other potentital complexity associated with it, my
primary key design needs to reflect that reality.
Sure, I could use a synthetic primary key, and then
add a unique constraint to a field such as SSN...


At least then you have less problem when someone's SSN changes. And
you can handle persons without SSNs. (That is, your constraint should
be unique-when-NULL.)


But you are assuming that I would *want* to handle people without SSNs. I'll
say up-front, I live in Australia, and I have no idea what the rules on US
SSNs are. So let's try to keep it generic. The point is that, either my
business rule states that two people *cannot* share SSNs (in which case, SSN
is a good candidate for a primary key), or the business rules state some
other point of uniqueness, and hence use that (or a combination of factors)
as the primary key. But what you shouldn't be doing is claiming some
profound weakness in how SSNs work, and from that claim that synthetic keys
are the only way to go.

Again, suppose my business rules state: you can open an account for someone
who cannot supply a US SSN. If that's my business rule, then SSN is a poor
choice for a primary key, obviously. But if my business rule states "no SSN,
no account", then your statement that "at least you can handle persons
without SSNs" is meaningless... my company has chosen not to contemplate
that possibilty, and the use of SSN as our primary key is therefore not a
problem.
You can also apply a partial validation rule. If the person is a Swedish
tax payer, I may require that is person identifcation number is unique
(because our customers reports to Swedosh tax authorities), but if he
pays tax in Upper Fragonia I could care less what his identification
numbers they may use there.
A multi-column primary key might be in order, then. Such as passport number,
surname, date of birth, SSN if available etc etc etc. Point is, I could
still construct a primary key from the available possibilities, dependent
only on what my business requirements are, and still not resort to synthetic
sequence numbers, which mean nothing and cannot prevent duplication on their
own.
For me, a primary key should absolutely embody a business rule ("no two
people can share a SSN", for example),


The problem is that the real-world business rule is different. Two
persons *can* have the same person identification number.


Well, I think you're getting too specific and missing the actual point. If
in the real world two people can share an identity number X, then X is not a
candidate for a primary key. So find me something else that they *cannot*
share. But do that extra thinking, please, before giving up and declaring
only that a meaningless sequence number is the only answer.

What you are saying here is merely that the business rule is not described
correctly or completely. In which case, fine: describe it further and in
more refined detail. But doing so doesn't invalidate the whole idea of
'natural' primary keys.
It can be
because of systems that have goofed, or it can be because they live
in different countries - but still lives in the same system.


It simply means the business rule gets more complicated. If I was designing
a system that was to store US and British and Aussie citizens, I would have
an 'ID' field that could accept the US SSN, the British National Insurance
Number (assuming it still exists!!) and the Aussie Tax File number. As an
example. I can call that field anything I want, but provided I make it a
VARCHAR2, and make it suitably long, it can do duty for storing the
requisite information from any of the three countries. Now you tell me that
in the US, the SSN is not unique... fine, I believe you: so add something
else to the primary key that would qualify it. When you finally report to me
that, to guarantee uniqueness in the US I would have to include 32 other
fields as part of the primary key, I might then think about letting you use
a synthetic sequence number!

In short, it is not cut-and-dried one way or the other, nor without merit to
discuss these things. It certainly isn't the brainless decision that the
original quotation that started this part of the thread suggested it was.

Regards
HJR
Jul 20 '05 #16

P: n/a
Howard J. Rogers (hj*@dizwell.com) writes:
"Erland Sommarskog" <es****@sommarskog.se> wrote in message
news:Xn*********************@127.0.0.1...
So you use SSN. Now you want to enter me in your database. What is my
SSN?
I'm not entirely sure what point you're making. Either you have one, in
which case I must know it, and then I can enter your details. Or you don't
have one, in which case, I can't enter your details.


Two cases:

1) You're a stock broker, and you need to enter person-numbers for
natural persons that pays taxes in your own country, since the tax
authorities requires you to report their transactions and positions.
But for customers that pays tax elsewhere, or are juridical persons
you have no requirement. And there might be some shady customers which
you should report, but which prefer that you don't.

2) You are entering data about patients at a hospital. And here am I,
an illegal alien, or just an unfortunate tourist, and you cannot
enter me.

Well, if the system was designed by someone who religiously believed in
natural keys you have a problem.

Of course, there are systems in which it's perfectly reasonable to use
person identifaction numbers as the key. The Swedish population registry
for example. But in this case it makes sense, because this is the database
that assigns the number - so in essence, here you have a sequence number.
(Although it's not only 1-2-3, there is YYMMDD in the beginning and a
check digit at the end.)
But you are assuming that I would *want* to handle people without SSNs.
Yes, in most systems where you handle persons you want to that. You set
up a web shop. You only accept customers that have an Australian SSN?
The point is that, either my business rule states that two people
*cannot* share SSNs (in which case, SSN is a good candidate for a
primary key), or the business rules state some other point of
uniqueness,
Or the business rules does not state any sort of 100% uniqueness. Business
rules comes from people acting, and people are not computers. They don't
need any 100% unique key. If a way of identification, like SSN is 99%
unique, that's good enough for them, and they few duplicates can be
handled informally. But an RDBMS is not equally flexible.

Sure, you can bend over backwards and find something which makes each
instance unique, but how much extra value does that give the organization
who is paying for the work, and if you did not bend long enough, you
may still have left them in a dead end.

Understanding business rules and business requirements is essential
when you design a database. It's easy to say "use SSN and not surrogate
keys", as long you are nave to believe SSN:s to be unqiue and non
nullable.
It simply means the business rule gets more complicated. If I was
designing a system that was to store US and British and Aussie citizens,
I would have an 'ID' field that could accept the US SSN, the British
National Insurance Number (assuming it still exists!!) and the Aussie
Tax File number.


Not all countries have any person identification numbers at all. Not all
countries even have central registers of all their citizens. Not all
countries even know how many people who lives there.

Yes, sure, there may be special cases where you can use the person
identification number as a primary key, but they are rather the exception.

Now, the point with surrogate keys is that they give you a back-door.
You can expose to the user a non-unique nullable key and give them
possibilities to work around duplicates, and spend the rest of the time
of developing the system at a reasonable cost. That may seem lax to you,
but most projects do have a budget.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #17

P: n/a

"Erland Sommarskog" <es****@sommarskog.se> wrote in message
news:Xn**********************@127.0.0.1...
Howard J. Rogers (hj*@dizwell.com) writes:
"Erland Sommarskog" <es****@sommarskog.se> wrote in message
news:Xn*********************@127.0.0.1...
So you use SSN. Now you want to enter me in your database. What is my
SSN?
I'm not entirely sure what point you're making. Either you have one, in
which case I must know it, and then I can enter your details. Or you don't have one, in which case, I can't enter your details.


Two cases:

1) You're a stock broker, and you need to enter person-numbers for
natural persons that pays taxes in your own country, since the tax
authorities requires you to report their transactions and positions.
But for customers that pays tax elsewhere, or are juridical persons
you have no requirement. And there might be some shady customers which
you should report, but which prefer that you don't.

2) You are entering data about patients at a hospital. And here am I,
an illegal alien, or just an unfortunate tourist, and you cannot
enter me.

Well, if the system was designed by someone who religiously believed in
natural keys you have a problem.


No, if the system is designed by someone who religiously believed in natural
keys *and who didn't understand the uses to which it would be put*, then we
have a problem. But that's what a design stage is for: to thrash out
precisely what the uses are for the proposed system and hence what the
business rules are.

Again, your point 2 is not a problem *if that is what the rules state*. I'm
sure we've all heard the stories of patients without insurance being whisked
out of the emergency department because the hospital administrator finds
they cannot pay. Maybe illegal aliens face the same dilemma. Point is,
that's a design decision: what will your system allow or not allow. And
design your primary keys accordingly.
Of course, there are systems in which it's perfectly reasonable to use
person identifaction numbers as the key. The Swedish population registry
for example. But in this case it makes sense, because this is the database
that assigns the number - so in essence, here you have a sequence number.
(Although it's not only 1-2-3, there is YYMMDD in the beginning and a
check digit at the end.)
But you are assuming that I would *want* to handle people without SSNs.


Yes, in most systems where you handle persons you want to that. You set
up a web shop. You only accept customers that have an Australian SSN?


You've missed my point. You are assuming I would want to, and that's a valid
assumption for many systems, as you state. In which case, design your keys
accordingly. But there will be times when that assumption is wrong. In which
case, SSNs would be a valid primary key. Point is, the key design depends
upon what your system should and should not allow. What it *doesn't* depend
on is sweeping statements that keys should never have business meaning.
The point is that, either my business rule states that two people
*cannot* share SSNs (in which case, SSN is a good candidate for a
primary key), or the business rules state some other point of
uniqueness,


Or the business rules does not state any sort of 100% uniqueness. Business
rules comes from people acting, and people are not computers. They don't
need any 100% unique key. If a way of identification, like SSN is 99%
unique, that's good enough for them, and they few duplicates can be
handled informally. But an RDBMS is not equally flexible.

Sure, you can bend over backwards and find something which makes each
instance unique, but how much extra value does that give the organization
who is paying for the work, and if you did not bend long enough, you
may still have left them in a dead end.

Understanding business rules and business requirements is essential
when you design a database. It's easy to say "use SSN and not surrogate
keys", as long you are nave to believe SSN:s to be unqiue and non
nullable.


I said no such thing. Quite the opposite: I said that I have no idea how
SSNs work, and the details of how they work are irrelevant to the discussion
in any case. What I actually said was, use SSNs if they are appropriate.
Your first sentence here simply repeats what I posted earlier: it depends on
the business rules (and inevitably on one's ability to understand them).
It simply means the business rule gets more complicated. If I was
designing a system that was to store US and British and Aussie citizens,
I would have an 'ID' field that could accept the US SSN, the British
National Insurance Number (assuming it still exists!!) and the Aussie
Tax File number.


Not all countries have any person identification numbers at all. Not all
countries even have central registers of all their citizens. Not all
countries even know how many people who lives there.

Yes, sure, there may be special cases where you can use the person
identification number as a primary key, but they are rather the exception.

Now, the point with surrogate keys is that they give you a back-door.
You can expose to the user a non-unique nullable key and give them
possibilities to work around duplicates, and spend the rest of the time
of developing the system at a reasonable cost. That may seem lax to you,
but most projects do have a budget.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

You are merely confirming my point. That the decision on whether natural or
synthetic keys should be used depends upon a proper understanding of the
business rules to be used. Which is a far cry from whoever it was who posted
that 'primary keys should NEVER have a business meaning'. It doesn't sound
like that was you, based on the content of this reply of yours... therefore,
I think it safe to say we agree.

You are thinking of an application where duplicate or null identities should
be allowed. I am thinking of an application where either of those approaches
would be a big no-no. Both sorts of application exist in the real world.
Both approaches are therefore valid.

HJR

Jul 20 '05 #18

P: n/a
Hans Forbrich <fo******@yahoo.net> wrote in message news:<jZROc.2236$T_6.1453@edtnps89>...
Marshall Spight wrote:
"Christian Antognini" <ch*****************@trivadis.com> wrote in message
news:41********@post.usenet.com...

A PK should have no business meaning.


Says who? Can you justify this statement?


A PK should be selected to uniquely identify an entity. Ideally, and by
formal definition, the PK is invariant.

All to often a unique attribute of the entity, such as empno (or SSN/SIN or
name or email address,) is used as the PK. Attributes generally have a
business meaning. Such attributes can change, although some change very
infrequently.

Selecting an attribute as the PK can cause DBAs (and/or developers and/or
businesses) headaches when attempting to change the PK. This is especially
true when taking the offline archives into account.

All of the examples are not invariant, although they tend not to change
often. In some cases using the suggested PK is against the law (eg: SIN in
Canada, has privacy implications).

Hans


exactly.
how are you going to use SSN number as a primary key if it is properly
encrypted?
use a function-based-index to decrypt it? no. that leaves the
decrypted values in the index. security violation.
use a surrogate key (e.g. sequence-generated value) for the pk_id and
be done with it. it also leaves you free to change the encryption
method used and not have to update all of the foreign keys.

now, this does not discuss the general point at all.
it merely singles out the case of storing a SSN in clear text in the
database.
bad design, in my opinion.

btw, the enhancements in oracle 10g for encryption look good.
56 bit DES is going away soon.

-bdbafh
Jul 20 '05 #19

P: n/a
Paul Drake wrote:
exactly.
how are you going to use SSN number as a primary key if it is properly
encrypted?
use a function-based-index to decrypt it? no. that leaves the
decrypted values in the index. security violation.
use a surrogate key (e.g. sequence-generated value) for the pk_id and
be done with it. it also leaves you free to change the encryption
method used and not have to update all of the foreign keys.

now, this does not discuss the general point at all.
it merely singles out the case of storing a SSN in clear text in the
database.
bad design, in my opinion.

btw, the enhancements in oracle 10g for encryption look good.
56 bit DES is going away soon.

-bdbafh


Paul,

I'm not quite sure what you are saying.

I advocate using PK that is defined as an invariant. I also point out that
many default choices for PK are bad as they are not invariant. Quite
honestly, I do not care whether a surrogate is chosen, just whether it is
invariant.

I also mention that some seemingly obvious choices are bad due to [local]
legal constraints. For exmple, in Canada it does not matter whether you
encrypt the SIN or not. The SIN can only be used for tax reporting
purposes and may not be stored for any other purpose, encrypted or
otherwise.

/Hans
Jul 20 '05 #20

P: n/a
Howard J. Rogers (hj*@dizwell.com) writes:
You are merely confirming my point. That the decision on whether natural
or synthetic keys should be used depends upon a proper understanding of
the business rules to be used. Which is a far cry from whoever it was
who posted that 'primary keys should NEVER have a business meaning'. It
doesn't sound like that was you, based on the content of this reply of
yours... therefore, I think it safe to say we agree.


That is true, that our positions does not seem to on separate ends of
the universe like the people who are making these two kinds of incompatible
statements:

1) Primary keys should ALWAYS be natural keys (and verifiable, whatever
that means, according to Celko).
2) Primary keys should NEVER have any business meaning.

I agree that neither of these positions are tenable, but my position is a
lot more closer to 2, and if you replace NEVER with NORMALLY. However,
the interpretation of 2 may need some modification.

Basically we can identitfy four kinds of keys:

A) Natural keys defined by some other system, like a person identification
number, an ISIN code for a financial instrument, or a currency code.

B) Natural keys defined by our own system. A customer opens an account with
a financial institution, and this account gets an account number
which is defined by our system. The account number is exposed, used
by the users, and can be fed to or from external systems.

C) Completely synthetic keys, which are not exposed to the user, and
which used internally. Often they go in parallel with some alternate
key, (which may be a semi-unique key, and thus not a real key in the
RDBMS sense.)

D) A composite which is composed by keys from other tables, which may of
any of the other sort. This can in fact say be the same as A, even if
all keys are surrogates in the original table.

Keys of type B are not really a problem. Neither are keys of type D
really, although you can see developers who put IDENTITY keys in all
their tables, even those where you have a good composite key. But these
experience often have poor knowledge of database design. And for the
statement that started this thread "no business meaning", and composite
key which consists solely of the C type of keys does not violate this
statement.

So when you use keys of type A? Well, above I had three examples:

1) Persons. In some cases the person identifcation number can be used,
but in many cases not. Many systems do B - that is they assign you a
customer number, and may not even care about your person identification
number. (Which you as a customer may appreciate, if you care about your
privacy.) The system I work with do C - that is we don't expose the
customer number, but the user finds a customer through the name or
the person identification number, and the user will have to sort out
the ambiguities. Most often they probably use the account number,
though, and this is a B type of key. (A customer may have many
accounts, and an account may have many onwers.)

2) Financial instruments. This area is a complete mess, and anyone who
would try to use a natural key would be in for a disaster. There are
several competing global identification schemes, of which none is
whole-covering. So we use an internal ID of type C, but requires the
user to enter a unique name. (Names can change, and there are over
50 table referring to instruments, some of them biggies.) User can
also use ISIN-code to find an instrument.

3) Currency codes. Here it would be a folly to use a key of type C. There
is a commonly used code for currency codes, and even if there are some
people that don't obey them complete (The Brits seems think that there
is something called UKP. There is not.), this is not a big issue. A
country may change its currency, but then usually the value changes
too. For instance Poland went from PLZ to PLN about 10 years ago.
1 PLN = 10000 PLZ, so this affects outstanding values.

And when in doubt - use C.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #21

P: n/a
Erland Sommarskog wrote:
Marshall Spight (ms*****@dnai.com) writes:
"Christian Antognini" <ch*****************@trivadis.com> wrote in message


news:41********@post.usenet.com...
A PK should have no business meaning.


Says who? Can you justify this statement?

Because most real-world look-like keys do not live up to the strict
requirements of the primary keys in a database. They are too often
nullable or non-unique. Person identification numbers such as SSN
are a good example of both.


Joe Celko has explained this one to you numerous times and he is
correct. Little more needs to be said.

--
Daniel A. Morgan
University of Washington
da******@x.washington.edu
(replace 'x' with 'u' to respond)

Jul 20 '05 #22

P: n/a
>> Let's say I have a type hierarchy: .. <<

The classic scenario calls for a root class with all the common
attributes and then specialized sub-classes under it. As an example,
let's take the class of Vehicles and find an industry standard
identifier (VIN), and add two mutually exclusive sub-classes, Sport
utility vehicles and sedans ('SUV', 'SED').

CREATE TABLE Vehicles
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) NOT NULL
CHECK(vehicle_type IN ('SUV', 'SED')),
UNIQUE (vin, vehicle_type),
..);

Notice the overlapping candidate keys. I then use a compound candidate
key (vin, vehicle_type) and a constraint in each sub-class table to
assure that the vehicle_type is locked and agrees with the Vehicles
table. Add some DRI actions and you are done:

CREATE TABLE SUV
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SUV' NOT NULL
CHECK(vehicle_type = 'SUV'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE Sedans
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
CHECK(vehicle_type = 'SED'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

I can continue to build a hierarchy like this. For example, if I had
a Sedans table that broke down into two-door and four-door sedans, I
could a schema like this:

CREATE TABLE Sedans
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
CHECK(vehicle_type IN ('2DR', '4DR', 'SED')),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE TwoDoor
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT '2DR' NOT NULL
CHECK(vehicle_type = '2DR'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Sedans(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE FourDoor
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT '4DR' NOT NULL
CHECK(vehicle_type = '4DR'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Sedans (vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

The idea is to build a chain of identifiers and types in a UNIQUE()
constraint that go up the tree when you use a REFERENCES constraint.
Obviously, you can do variants of this trick to get different class
structures.

If an entity doesn't have to be exclusively one subtype, you play with
the root of the class hierarchy:

CREATE TABLE Vehicles
(vin CHAR(17) NOT NULL,
vehicle_type CHAR(3) NOT NULL
CHECK(vehicle_type IN ('SUV', 'SED')),
PRIMARY KEY (vin, vehicle_type),
..);

Now start hiding all this stuff in VIEWs immediately and add an
INSTEAD OF trigger to those VIEWs.
The classic scenario calls for a root class with all the common
attributes and then specialized sub-classes under it. As an example,
let's take the class of Vehicles and find an industry standard
identifier (VIN), and add two mutually exclusive sub-classes, Sport
utility vehicles and sedans ('SUV', 'SED').

CREATE TABLE Vehicles
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) NOT NULL
CHECK(vehicle_type IN ('SUV', 'SED')),
UNIQUE (vin, vehicle_type),
..);

Notice the overlapping candidate keys. I then use a compound candidate
key (vin, vehicle_type) and a constraint in each sub-class table to
assure that the vehicle_type is locked and agrees with the Vehicles
table. Add some DRI actions and you are done:

CREATE TABLE SUV
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SUV' NOT NULL
CHECK(vehicle_type = 'SUV'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE Sedans
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
CHECK(vehicle_type = 'SED'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

I can continue to build a hierarchy like this. For example, if I had
a Sedans table that broke down into two-door and four-door sedans, I
could a schema like this:

CREATE TABLE Sedans
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
CHECK(vehicle_type IN ('2DR', '4DR', 'SED')),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE TwoDoor
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT '2DR' NOT NULL
CHECK(vehicle_type = '2DR'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Sedans(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE FourDoor
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT '4DR' NOT NULL
CHECK(vehicle_type = '4DR'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Sedans (vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

The idea is to build a chain of identifiers and types in a UNIQUE()
constraint that go up the tree when you use a REFERENCES constraint.
Obviously, you can do variants of this trick to get different class
structures.

If an entity doesn't have to be exclusively one subtype, you play with
the root of the class hierarchy:

CREATE TABLE Vehicles
(vin CHAR(17) NOT NULL,
vehicle_type CHAR(3) NOT NULL
CHECK(vehicle_type IN ('SUV', 'SED')),
PRIMARY KEY (vin, vehicle_type),
..);

Now start hiding all this stuff in VIEWs immediately and add an
INSTEAD OF trigger to those VIEWs.
Jul 20 '05 #23

P: n/a
Daniel Morgan (da******@x.washington.edu) writes:
Erland Sommarskog wrote:
Because most real-world look-like keys do not live up to the strict
requirements of the primary keys in a database. They are too often
nullable or non-unique. Person identification numbers such as SSN
are a good example of both.


Joe Celko has explained this one to you numerous times and he is
correct. Little more needs to be said.


Joe Celko has indeed several times explained how SSN are derived, and I
assume that he has been talking about SSNs in the US. Whether he is correct
on that tidbit is irrelevant, because person identification numbers are
formed from other rules elsewhere. Or not formed at all.

But it also true that both you and Celko likes to ride high horse and
pretend you know how should model my database, although you don't know
my business rules.

So, yes, there is very little more you need to say.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #24

P: n/a
>> [A PK should have no business meaning] Hmm, I want to see what Joe
Celko says about this one.... <<

Time to do one of my standard rants :)

The main property you want in a key is that it is verifiable in the
reality of the data model. Invariant is very nice, but not actually
required. Right now, the United States is facing a massive change in
retail:

Bar Code Detente: U.S. Finally Adds One More Digit (2004 July 12, NYT
By STEVE LOHR): http://www.nytimes.com/2004/07/12/bu...2cb9baba72e846

For American retailers, whose checkout-line bar-code scanners will be
expected to read the global bar-code standard by 2005 Jan 01. Europe
won this one. The global bar code standard will be the European
Article Numbering Code. It turns out that the American Universal
Product Code - which turned 30 years old
last month - was never so universal after all.

Are we doomed? No. We just do some ALTER DOMAIN and ALTER TABLE
staetments, get a mag tape from my suppliers and keep going. Why do
all these standard codes work? Because there is a trusted outside
verification of the codes with internal validation.

That validation is possible **only** because the codes carry business
meaning in them -- the first set of digits is the country of origin
and the manufacturer. Within those areas, people familar with the
particulars for each company can guess the codes (United States, P&G,
cleaning products).

An exposed surogate is usually a physical locator like IDENTITY, some
kind of auto-increment, or an actual disk address. They do not port,
even from release to release of the same product. They are a pathetic
attempt to mimic pointer chains from the old IMS and IDMS databases,
but when they get out of synch, there are no routines to re-build the
pointer chains.

If there is a natural key, they are redundant and dangerous -- update
one but not the other and see what a mess you have.
Jul 20 '05 #25

P: n/a

"Christian Antognini" <ch*****************@trivadis.com> wrote in message
news:41********@post.usenet.com...
A PK should have no business meaning. Therefore, I'll not add such a column to the primary key.


This is too strongly stated.

The following sounds better to me:

A PK should have no business meaning other than to identify (an instance of)
an entity or a (reified) relationship.

Jul 20 '05 #26

P: n/a
--CELKO-- wrote:

The main property you want in a key is that it is verifiable in the
reality of the data model. Invariant is very nice, but not actually
required. Right now, the United States is facing a massive change in
retail:


I basically concur with your assessment. However, the following scenario
may provide additional rationale for invariant PK (as well as supporting
your verifiable statement):

In some legal jurisdictions it is necessary to keep data for a minimum
amount of time (eg: 7 years) in case that data could be used in a law suit.
In such instances, the data may be (preferrably is) archived, possibly to a
read-only medium, to permit future investigation by authorities.

If the PK is not invariant, a full trace of the changes to the PK is a
necessary submission to support the validity of the PK. Without such a
trace, or a confirmation that the PK is invariant, the relevance or even
validity of the data can be suspect.

Thus I conclude that a business reason for invariant PK can be established
if the data is to be, or potentially could be, used in a legal situation.

I am not sure whether the above is under consideration in the United States,
but [in some circles] it is quite an issue in Canada and potentially other
jurisdictions.

Hans
Jul 20 '05 #27

P: n/a

"--CELKO--" <jc*******@earthlink.net> wrote in message
news:18**************************@posting.google.c om...
For American retailers, whose checkout-line bar-code scanners will be
expected to read the global bar-code standard by 2005 Jan 01. Europe
won this one.


When is the last one that America won? World War II?
Jul 20 '05 #28

P: n/a

"Laconic2" <la******@comcast.net> wrote in message
news:1Z********************@comcast.com...

"--CELKO--" <jc*******@earthlink.net> wrote in message
news:18**************************@posting.google.c om...
For American retailers, whose checkout-line bar-code scanners will be
expected to read the global bar-code standard by 2005 Jan 01. Europe
won this one.


When is the last one that America won? World War II?


Nah, Watergate. Any other country on the face of the Earth (with the
possible exception of the UK) would have had a civil war over it.

Regards
HJR
Jul 20 '05 #29

P: n/a

"Howard J. Rogers" <hj*@dizwell.com> wrote in message
news:41***********************@news.optusnet.com.a u...
When is the last one that America won? World War II?


Nah, Watergate. Any other country on the face of the Earth (with the
possible exception of the UK) would have had a civil war over it.


You make a good point. But Watergate was not an international matter. If
the WTO had had ruling authority over Watergate, America would have lost,
somehow. I can't figure out how, but the people who run the WTO would
have.


Jul 20 '05 #30

P: n/a
**** Post for FREE via your newsreader at post.usenet.com ****

Hi Marshall

"Marshall Spight" <ms*****@dnai.com> wrote in message
news:LPQOc.63922$eM2.25470@attbi_s51...
"Christian Antognini" <ch*****************@trivadis.com> wrote in message

news:41********@post.usenet.com...

A PK should have no business meaning.


Says who? Can you justify this statement?


"Unfortunately" I'm some days in holiday... therefore I can reply only
today...

Other people already mentioned the advantages of having surrogate keys... in
my opinion the most important points are the following:

- Business change to often, therefore a good DB design should be able to
assimilate business changes. Of course such a schema is not good according
to Codd's rules, but "only" in practice ;-) Some people wrote that changing
the primary key is just matter of executing few DDL and DML statements! Such
people forget that usually applications that access the DB exist...
therefore if a primary key is changed, the foreign keys are changed as well
and therefore ALL statements that join the tables which are touched by such
a change, should be changed as well!!!!!

- From a performance point of view it is better to have a single-column
numeric primary key. This makes look-ups faster and the life of the
cost-based optimizer is easier (at least with Oracle... sorry I have to
in-depth knowledge of other optimizers...).

- During development a simple primary key makes things easier for
developers. In fact if a simple naming convention is used, they don't have
to remember or give a look to a DB design each time that they wrote a join.
Chris

PS: there was lot of discussion about SSN... therefore, FYI, in
Switzerland, where I live, the SSN is NOT unique.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
*** Usenet.com - The #1 Usenet Newsgroup Service on The Planet! ***
http://www.usenet.com
Unlimited Download - 19 Seperate Servers - 90,000 groups - Uncensored
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Jul 20 '05 #31

P: n/a
**** Post for FREE via your newsreader at post.usenet.com ****
"Laconic2" <la******@comcast.net> wrote in message
news:dq********************@comcast.com...

"Christian Antognini" <ch*****************@trivadis.com> wrote in message
news:41********@post.usenet.com...
A PK should have no business meaning. Therefore, I'll not add such a column
to the primary key.


This is too strongly stated.

The following sounds better to me:

A PK should have no business meaning other than to identify (an instance

of) an entity or a (reified) relationship.


I completely agree with you!

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
*** Usenet.com - The #1 Usenet Newsgroup Service on The Planet! ***
http://www.usenet.com
Unlimited Download - 19 Seperate Servers - 90,000 groups - Uncensored
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Jul 20 '05 #32

This discussion thread is closed

Replies have been disabled for this discussion.