"David Portas" <RE************ *************** *@acm.org> wrote in message
news:e-*************** *****@giganews. com...
living people than VendorID values. I don't think an identity VendorID
value is any more artificial than any other unique way of identifying a
vendor, so long as it is assigned the first time a vendor enters the
system and is never changed.
But the point is that a "natural" key is verifiable outside of the system.
When I see a NG post from "skass[at]drew.edu" I don't care whether that's
based on your "real" name or even whether S.Kass is the same name as on
your passport or driver's licence. What's important to me is that it's
determined by a consistent method outside of the system which gives me some
acceptable degree of confidence that you're the same person who posted here as
"skass[at]drew.edu" yesterday. Of course that validity is destroyed if you
change your email address or if someone spoofs your address. But it's
still intrinsically better than an arbitrary ID allocated by the server.
I am a firm believer that natural keys should only be used to logically
design/normalize the data. When it comes to the real reason for keys, data
integrity, more often than not I have seen that natural keys are
intrinsically not good physical primary keys.
1. Natural keys are, being natural and therefore user entered [i.e.,
provided to the database by external means], fungible. If a user enters
data, they must also be able to modify it. If data can be modified, then its
value as a systemic primary key is gone. Yes, you can cascade updates to
these, but why do it when it can be avoided to start with.
2. Natural keys are typically a composite of atomic attributes. If using a
composite, these must be propagated to referencing tables as foreign keys.
Your normalization drops below par, by having these [potentially] massively
duplicated columns.
Attributes that are single, [supposedly] unique attributes (e.g., SSN),
usually represent some official, governmentally recognized ID, and therefore
have legal issues with being propagated throughout a system.
Also, for amateurs and many professionals, natural keys are very often
chosen incorrectly. e.g., I believe some combination of Name and other info
has been used by my ISP as their primary key. My last name was entered into
their system incorrectly, but they cannot fix it because their system will
not allow it. Preposterous and poor design.
Surrogate keys generated by using the identity property are ideal for data
integrity, because
1. They are static values [i.e., once entered, it does not change] and the
DBA has control over allowing values in identity columns to be modified.
2. They are singleton row ids. The fact that they are sequential is
irrelevant. That is simply the most efficient means of generating new
numeric values.
Identity integers can be problematic in two-way replication, but proper
management of key ranges can alleviate these issues. GUIDs are the MS
recommended way to deal with distributed data and two-way replication, but
not as easy to deal with in unreplicated databases.