By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
440,230 Members | 2,471 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 440,230 IT Pros & Developers. It's quick & easy.

Using an Identity Field as Primary Key

P: n/a
I want to use the Identity field (increment 1,1) as a primary key and
have a unique constraint on my other field which is of type char.
I am worried that related data in other tables may lose referntial
integrity if records in the ID table get messed up and need to be
re-entered.
Can you please advice on best way to do this. I definitely need a
numeric id field because it makes the joins and queries so much
faster.
Jul 20 '05 #1
Share this Question
Share on Google+
9 Replies


P: n/a
A foreign key can reference unique constraint columns; it doesn't have
to be the primary key.

--
Hope this helps.

Dan Guzman
SQL Server MVP

-----------------------
SQL FAQ links (courtesy Neil Pike):

http://www.ntfaq.com/Articles/Index....partmentID=800
http://www.sqlserverfaq.com
http://www.mssqlserver.com/faq
-----------------------

"Rathtap" <am****@yahoo.com> wrote in message
news:b2**************************@posting.google.c om...
I want to use the Identity field (increment 1,1) as a primary key and
have a unique constraint on my other field which is of type char.
I am worried that related data in other tables may lose referntial
integrity if records in the ID table get messed up and need to be
re-entered.
Can you please advice on best way to do this. I definitely need a
numeric id field because it makes the joins and queries so much
faster.

Jul 20 '05 #2

P: n/a
Rathtap (am****@yahoo.com) writes:
I want to use the Identity field (increment 1,1) as a primary key and
have a unique constraint on my other field which is of type char.
I am worried that related data in other tables may lose referntial
integrity if records in the ID table get messed up and need to be
re-entered.
Can you please advice on best way to do this. I definitely need a
numeric id field because it makes the joins and queries so much
faster.


Have you actually benchmarked this?

If you have a natural key, use it, and don't make your system more
complex than necessary.
--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #3

P: n/a
Erland Sommarskog <so****@algonet.se> wrote in message news:<Xn*********************@127.0.0.1>...
Rathtap (am****@yahoo.com) writes:
I want to use the Identity field (increment 1,1) as a primary key and
have a unique constraint on my other field which is of type char.
I am worried that related data in other tables may lose referntial
integrity if records in the ID table get messed up and need to be
re-entered.
Can you please advice on best way to do this. I definitely need a
numeric id field because it makes the joins and queries so much
faster.


Have you actually benchmarked this?

If you have a natural key, use it, and don't make your system more
complex than necessary.


The issue here is that these tables are being used to populate cubes.
We have found that there is a tremendous performance gain when the
Primary and Foreign key tables are joined using int fields as opposed
to using chars.
I therefore cannot make my foreign key reference a unique constraint.
Let me explain with an example.
Table Procedures has fields ProcID(int, PK, Identity(1,1)),ProcCode
(char) and ProcDescription (char).
Table Patient has amongst others ProcID which is a foreign key.
My dilemma is, to keep the database and the programming simple - I
want ProcID to be an identity column, however, I am worried that
should something happen to the Procedures table then the relationship
between the two tables will be lost.
What are the best practices in this situation? I simply cannot have a
char field for FK/PK.
One solution is to store both the ProcedureID and ProcedureCode fields
in the Patient table so that even if the Procedures table has to be
recreated, I can run a procedure to update the FKs in the Patient
table.
Jul 20 '05 #4

P: n/a
Rathtap (am****@yahoo.com) writes:
The issue here is that these tables are being used to populate cubes.
We have found that there is a tremendous performance gain when the
Primary and Foreign key tables are joined using int fields as opposed
to using chars.
How long are the char columns?

True, that character comparison is more complex, because of collation
rules (unless you use a binary collation). But the overhead is not likely
to be more than say 20%. And that is not a tremendous difference in the
database world. :-)

Maybe there were other problems, for instance joining a char and nchar
column leads to auto-conversion that precludes use of indexes.
I therefore cannot make my foreign key reference a unique constraint.
Let me explain with an example.
Table Procedures has fields ProcID(int, PK, Identity(1,1)),ProcCode
(char) and ProcDescription (char).
Table Patient has amongst others ProcID which is a foreign key.
My dilemma is, to keep the database and the programming simple - I
want ProcID to be an identity column, however, I am worried that
should something happen to the Procedures table then the relationship
between the two tables will be lost.
What are the best practices in this situation? I simply cannot have a
char field for FK/PK.
One solution is to store both the ProcedureID and ProcedureCode fields
in the Patient table so that even if the Procedures table has to be
recreated, I can run a procedure to update the FKs in the Patient
table.


I'm not sure that I follow your example, but if you insist of having the
extra column, this is how you could do it:

CREATE TABLE parent
(parentid int IDENTITY PRIMARY KEY, -- artificial key
parentcode char(n) NOT NULL UNIQUE, -- natural key
other_columns...,
UNIQUE (parentid, parentcode))

CREATE TABLE child
(childid some_type NOT NULL PRIMARY KEY,
...
parentid int NOT NULL REFERENCES parent(parentid),
parentcode char(n) NOT NULL REFERENCES parent(parentcode),
...
FOREIGN KEY (parentid, parentcode)
REFERENCES parent (parentid, parentcode)

This ensures that id:s and codes are unique in the parent table,
you don't have orphan id:s and codes in the child table, and that
id:s and codes are in sync in both tables. The price you pay is
an extra redundant UNIQUE constraint on the parent table.

Note: for brevity I have not named the constraints in this outline.
However, in actual code I recommend that you always name your
constraints (save for temp tables and table variables).

Then again, if you don't have parentcode at all in the child table,
you can always retrieve the parentcode by joining to the parent
table.

--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #5

P: n/a
Erland Sommarskog <so****@algonet.se> wrote in message news:<Xn**********************@127.0.0.1>...
Rathtap (am****@yahoo.com) writes:
The issue here is that these tables are being used to populate cubes.
We have found that there is a tremendous performance gain when the
Primary and Foreign key tables are joined using int fields as opposed
to using chars.
How long are the char columns?


The columns are not long -- 15 to 20. The problem is that there are a
few million records and that is why it is slow.
True, that character comparison is more complex, because of collation
rules (unless you use a binary collation). But the overhead is not likely
to be more than say 20%. And that is not a tremendous difference in the
database world. :-)

Maybe there were other problems, for instance joining a char and nchar
column leads to auto-conversion that precludes use of indexes.
I therefore cannot make my foreign key reference a unique constraint.
Let me explain with an example.
Table Procedures has fields ProcID(int, PK, Identity(1,1)),ProcCode
(char) and ProcDescription (char).
Table Patient has amongst others ProcID which is a foreign key.
My dilemma is, to keep the database and the programming simple - I
want ProcID to be an identity column, however, I am worried that
should something happen to the Procedures table then the relationship
between the two tables will be lost.
What are the best practices in this situation? I simply cannot have a
char field for FK/PK.
One solution is to store both the ProcedureID and ProcedureCode fields
in the Patient table so that even if the Procedures table has to be
recreated, I can run a procedure to update the FKs in the Patient
table.


I'm not sure that I follow your example, but if you insist of having the
extra column, this is how you could do it:

CREATE TABLE parent
(parentid int IDENTITY PRIMARY KEY, -- artificial key
parentcode char(n) NOT NULL UNIQUE, -- natural key
other_columns...,
UNIQUE (parentid, parentcode))

CREATE TABLE child
(childid some_type NOT NULL PRIMARY KEY,
...
parentid int NOT NULL REFERENCES parent(parentid),
parentcode char(n) NOT NULL REFERENCES parent(parentcode),
...
FOREIGN KEY (parentid, parentcode)
REFERENCES parent (parentid, parentcode)

This ensures that id:s and codes are unique in the parent table,
you don't have orphan id:s and codes in the child table, and that
id:s and codes are in sync in both tables. The price you pay is
an extra redundant UNIQUE constraint on the parent table.

Note: for brevity I have not named the constraints in this outline.
However, in actual code I recommend that you always name your
constraints (save for temp tables and table variables).

Then again, if you don't have parentcode at all in the child table,
you can always retrieve the parentcode by joining to the parent
table.

This is what I want to do but am not sure is the best way to do it.
Reason being if my parentcode in the parent table is lost, how do I
match it up with the parentid in the child. If I re-insert values in
the
parent, the id's generated will be out of synch with those in the
child.
Jul 20 '05 #6

P: n/a
Rathtap (am****@yahoo.com) writes:
The columns are not long -- 15 to 20. The problem is that there are a
few million records and that is why it is slow.


Even with a case-insensitive collation, I doubt that the overhead for
a varcar(20) key is more than 30% of an integer key. And that is for
join operations. If you go for the scheme I suggested, you instead
get an overhead for updates.
I'm not sure that I follow your example, but if you insist of having the
extra column, this is how you could do it:

CREATE TABLE parent
(parentid int IDENTITY PRIMARY KEY, -- artificial key
parentcode char(n) NOT NULL UNIQUE, -- natural key
other_columns...,
UNIQUE (parentid, parentcode))

CREATE TABLE child
(childid some_type NOT NULL PRIMARY KEY,
...
parentid int NOT NULL REFERENCES parent(parentid),
parentcode char(n) NOT NULL REFERENCES parent(parentcode),
...
FOREIGN KEY (parentid, parentcode)
REFERENCES parent (parentid, parentcode)

This is what I want to do but am not sure is the best way to do it.
Reason being if my parentcode in the parent table is lost, how do I
match it up with the parentid in the child. If I re-insert values in
the parent, the id's generated will be out of synch with those in the
child.


I may be missing something, but the idea with the outline above is
that you cannot lose a parentcode this way.

But if you really losing sleep over losing partentcodes, then ditch
the artificial key. You cannot both have the cake and eat it.

--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #7

P: n/a
Rathtap wrote:
I want to use the Identity field (increment 1,1) as a primary key and
have a unique constraint on my other field which is of type char.
I am worried that related data in other tables may lose referntial
integrity if records in the ID table get messed up and need to be
re-entered.


That would be the case even if your foreign keys are on char
fields. You can't delete the old row when there are rows in
other tables referring to it, and you can't insert another one,
because it has to be unique.

Really, though, this is what UPDATE is for.

Bill

Jul 20 '05 #8

P: n/a
I'm making a boatload of assumptions here but based on your statement of
using this data for cubes, here we go:

There has always been a debate on using surrogate vs or in conjunction
with natural keys in regards to warehouse data. Most material I've read
recently recommends using a surrogate key and a natural key when
populating a warehouse to build cubes. (what happens when you pull in
application data that has the same natural keys as your current
app?)Unless you are allowing write-backs to the cube data, you probably
don't need to worry about updates. You only need to worry about the
performance of your ETL processing (unless you are doing real-time
stuff) That being the case, I would use that ident key if you are doing
any type of SCD, it makes any changes to the natural key easier to
implement. Your DW should be around longer than most of your
applications (if properly used and designed). Here are some pages that
might help. Remember fact and dim table design is usually a star schema
and not as normalized as your relational tables.

http://dw.ittoolbox.com/documents/document.asp?i=1501

A good discussion on this is here:
http://www.dmreview.com/editorial/dm....cfm?EdID=4892

BTW: The guru of DW (Ralph Kimball) recommends the use of natural and
surrogate keys, if you don't know, Kimball is the Celko of data
warehousing (IMO)

HTH

Ray Higdon MCSE, MCDBA, CCNA

*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!
Jul 20 '05 #9

P: n/a
>> BTW: The guru of DW (Ralph Kimball) recommends the use of natural
and surrogate keys, if you don't know, Kimball is the Celko of data
warehousing (IMO) <<

Actually, Ralph is smarter and nicer than I am -- and he is now
winning all the Reader's Choice Awards in INTELLIGENT ENTERPRISE!!
Jul 20 '05 #10

This discussion thread is closed

Replies have been disabled for this discussion.