Auto-increment PK - Use MS or Grow your own ?

csomberg

SQL 2000

I thought I would throw this out there for some feedback from others.

I'd like to know if you feel using MS auto-increment field is a good
solution these days or should one grow their own ?

Thanks,

Me.

Jul 23 '05 #1

Subscribe Post Reply

2758

cwwilly

Definitely use SQL Servers to auto-increment the primary key field.
Then there is no chance of duplicates.

Jul 23 '05 #2

David Rawheiser

I can give you a definite Maybe.

I've used both depending on:

* Do you care that numbers may be missing in the sequence if insert
transactions get rolled back?
* Are you making the Primary Key the table's clustered index?
* Do you need to populate the value of that key into other tables within
the same procedure?
<cs******@dwr.com> wrote in message
news:11*********************@z14g2000cwz.googlegro ups.com...

SQL 2000

I thought I would throw this out there for some feedback from others.

I'd like to know if you feel using MS auto-increment field is a good
solution these days or should one grow their own ?

Thanks,

Me.

Jul 23 '05 #3

Ryan

This question is bound to start a few of the regulars off on one of
their favorite arguements. Still to add my tuppence worth, I'd say it's
OK to use it under some circumstances. When and where depends entirely
on the solution you are providing and the data you are working with.

Surrogate keys in my opinion are valid and often much more simple than
a primary key of several columns. Having said that, you do get
instances with 'gaps' in the number sequence so this may not be
something you want. You could go down the route of generating a
surrogate key yourself (your original question), but I would hesitate
if IDENTITY offers you the same advantages as you may introduce more
problems depending on your approach. If possible and where logical, use
a properly defined primary key, but use your head and decide what is
most appropriate.

I know full well that some of the regulars will not agree with me and
that some will. However, it is my opinion and they are entitled to
theirs. I would suggest looking up 'Occams Razor' for anyone who
disagrees ( http://pespmc1.vub.ac.be/ASC/OCCAM'_RAZOR.html )

Going back to your question, if you 'grow your own' then this seems
reasonable if you are doing something that using IDENTITY doesn't allow
you. If it's the same, then why bother ?

Ryan

cs******@dwr.com wrote:

SQL 2000

I thought I would throw this out there for some feedback from others.

I'd like to know if you feel using MS auto-increment field is a good
solution these days or should one grow their own ?

Thanks,

Me.

Jul 23 '05 #4

David Portas

If you want an artificial key then use the feature provided - it's the
most efficient method. The harder question is, do you really want an
artificial key?

--
David Portas
SQL Server MVP
--

Jul 23 '05 #5

Erland Sommarskog

(cs******@dwr.com) writes:

I thought I would throw this out there for some feedback from others.

I'd like to know if you feel using MS auto-increment field is a good
solution these days or should one grow their own ?

Depends. If you need consecutive numbers, forget about IDENTITY. If you
want high scalability and don't want to have a hot spot on the current
key value, use IDENTITY.

Another situation where IDENTITY is difficult is when you insert many rows
in one table, and then need to know the values for inserts into a child
table.

If none of this applies, it's a toss-up. IDENTITY is somewhat simpler to
use, but there are some gotchas in odd situations. Rolling your own
is simple as well.

A general remark is that whatever method you use, don't use it for
every table. For "top" concepts like customers, orders, products it
may be inevitable. But for derived concepts that refers to other
concepts, there is rarely any need for artificial keys.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 23 '05 #6

Steve Jorgensen

On 23 Feb 2005 01:16:27 -0800, "Ryan" <ry********@hotmail.com> wrote:

This question is bound to start a few of the regulars off on one of
their favorite arguements. Still to add my tuppence worth, I'd say it's

Don't you mean "tupple's worth"? <g>

Jul 23 '05 #7

Steve Jorgensen

On 22 Feb 2005 15:39:18 -0800, cs******@dwr.com wrote:

SQL 2000

I thought I would throw this out there for some feedback from others.

I'd like to know if you feel using MS auto-increment field is a good
solution these days or should one grow their own ?

Thanks,

Me.

I often use a mix.

As has been pointed out here, if you need to ensure there no gaps, IDENTITY is
not your friend. To me, however, the kind of numbering that needs no gaps
should be a logical key, not a physical/surrogate key, so that's no issue.

Where I have found IDENTITY to be limiting are as follows.

1. I want to be able to merge data sets from 2 or more separate databases
without using a proprietary replication system. In this case, I like to use
some kind of variation on the GUID. Using the global key as the primary key
keeps merge processes simpler than if the global key is used in addition to a
local primary key

2. I need to generate sequences of master-detail sets in stored procedures.

In case #2, the problem is that we want to try to use set operations, not
cursors, and there's no good way to figure out the IDs of the master records
created in one query, so you can use them to create matching details in a
subsequent query. On the other hand, if you have a shared counter, you obtain
a count of the master records to be added, get the current counter value, and
update the counter, adding the master-count to its value. Reading and
updating the counter can be in its own short transaction to reduce blocking
overhead, since we should not care if we add a gap, but don't end up adding
the records with those keys.

Jul 23 '05 #8

David Rawheiser

"Steve Jorgensen" <no****@nospam.nospam> wrote in message
news:1f********************************@4ax.com...

On 23 Feb 2005 01:16:27 -0800, "Ryan" <ry********@hotmail.com> wrote:
This question is bound to start a few of the regulars off on one of
their favorite arguements. Still to add my tuppence worth, I'd say it's

Don't you mean "tupple's worth"? <g>

From the definition of the DIF file format:
It uses the terms vector and TUPLE.
You may generally interpret vector as column and tuple as row.

From the Mary Poppins Sound Track:
With TUPPENCE for paper and strings
You can have your own set of wings
With your feet on the ground
You're a bird in a flight
With your fist holding tight
To the string of your kite

I hope I've cleared that up,

Oh my ... It seems I'm out of Pinot.

Jul 23 '05 #9

Steve Jorgensen

On Thu, 24 Feb 2005 05:02:55 GMT, "David Rawheiser" <ra*******@hotmail.com>
wrote:

"Steve Jorgensen" <no****@nospam.nospam> wrote in message
news:1f********************************@4ax.com.. .
On 23 Feb 2005 01:16:27 -0800, "Ryan" <ry********@hotmail.com> wrote:
This question is bound to start a few of the regulars off on one of
their favorite arguements. Still to add my tuppence worth, I'd say it's

Don't you mean "tupple's worth"? <g>

From the definition of the DIF file format:
It uses the terms vector and TUPLE.
You may generally interpret vector as column and tuple as row.

From the Mary Poppins Sound Track:
With TUPPENCE for paper and strings
You can have your own set of wings
With your feet on the ground
You're a bird in a flight
With your fist holding tight
To the string of your kite

I hope I've cleared that up,

Oh my ... It seems I'm out of Pinot.

LOL

Jul 23 '05 #10

Ryan

Oh no ! I'm going to be humming that all day now ! :-)

From the Mary Poppins Sound Track:
With TUPPENCE for paper and strings
You can have your own set of wings
With your feet on the ground
You're a bird in a flight
With your fist holding tight
To the string of your kite

Jul 23 '05 #11

--CELKO--

>> I'd like to know if you feel using MS auto-increment field is a good
solution these days or should one grow their own ? <<

What did you want to use it for?

It is fine for adding a reference number to a cursor, which is a
sequential file structure. But you would never use it inside the
schema for anything. That would that your data model is all screwed up
and has the PHYSICAL state of the machine mixed with the LOGICAL data
model.

Newbies often use IDENTITY and the like to substitute for the pointer
chains they had in IDMS, IMS, TOTAL and other pre-RDBMS databases.

if they are really screwed up, they use them for keys and do not have
natural keys. There is no way to verifiy or validate the data and the
schema loses data integrity.

If I told you that the best key is the 17 digit Hebrew number which God
assigned to all things in creation, you would think I was nuts. There
is no magic universal key; you actually have to do some work when you
design a schema.
We have a lot of problems with terminology on this one, so let me get
that out of the way.

There is no such thing as a "universal, one-size-fits-all" key. Just
as no two sets of entities are the same, the attributes that make them
unique have to be found in the reality of the data. Here is my
classification of types of keys:

natural artificial exposed surrogate
================================================== ================
Constructed from reality |
of the data model | Y N N Y
|
verifiable in reality | Y N N N
|
verifiable in itself | Y Y N N
|
visible to the user | Y Y Y N

1) A natural key is a subset of attributes which occur in a table and
act as a unique identifier. They are seen by the user. You can go to
the external reality and verify them. you would also like to have some
validation rule. Example: UPC codes on consumer goods (read the
package barcode) and validate them with a check digit or a
manufacturer's website, geographical co-ordinates (get a GPS).

2) An artificial key is an extra attribute added to the table which is
seen by the user. It does not exist in the external reality, but can
be verified for syntax or check digits inside itself.

Example: the open codes in the UPC scheme which a user can assign to
his own stuff. The check digits still work, but you have to verify
them inside your own enterprise.

If you have to construct a key yourself, it takes time to deisgn them,
to invetn a validation rule, etc.

3) An "exposed physical locator" is not based on attributes in the data
model and is exposed to user. There is no way to predict it or verify
it. The system obtains a value thru some physical process in the
storage hardware totally unrelated to the logical data model. Example:
IDENTITY columns, other proprietary, non-relaitonal auto-numbering
devices.

Technically, these are not really keys at all, sinc they are attributes
of the PHYSICAL storage and are not even part of the LOGICAL data
model. But they are handy for lazry, non-RDBMS programmers who don't
want to research or think! This is the worst way to program in SQL.

4) A surrogate key is system generated to replace the actual key behind
the covers where the user never sees it. It is based on attributes in
the table. Example: Teradata hashing algorithms, pointer chains.

The fact that you can never see it or use it for DELETE and UPDATE or
create it for INSERT is vital. When users can get to them, they will
screw up the data integrity by getting the real keys and these physical
locators out of synch. The system must maintain them.

** Notice that people get "exposed physical locator" and surrogate
mixed up; they are totally different concepts. **

An appeal to authority, with a quote from Dr. Codd: "..Database users
may cause the system to generate or delete a surrogate, but they have
no control over its value, nor is its value ever displayed to them
...."(Dr. Codd in ACM TODS, pp 409-410) and Codd, E. (1979), Extending
the database relational model to capture more meaning. ACM
Transactions on Database Systems, 4(4). pp. 397-434.

This means that a surrogate ought to act like an index; created by the
user, managed by the system and NEVER seen by a user. That means never
used in queries, DRI or anything else that a user does.

Codd also wrote the following:

"There are three difficulties in employing user-controlled keys as
permanent surrogates for entities.

(1) The actual values of user-controlled keys are determined by users
and must therefore be subject to change by them (e.g. if two companies
merge, the two employee databases might be combined with the result
that some or all of the serial numbers might be changed.).

(2) Two relations may have user-controlled keys defined on distinct
domains (e.g. one uses social security, while the other uses employee
serial numbers) and yet the entities denoted are the same.

(3) It may be necessary to carry information about an entity either
before it has been assigned a user-controlled key value or after it has
ceased to have one (e.g. and applicant for a job and a retiree).

These difficulties have the important consequence that an equi-join on
common key values may not yield the same result as a join on common
entities. A solution - proposed in part [4] and more fully in [14] -
is to introduce entity domains which contain system-assigned
surrogates. Database users may cause the system to generate or delete
a surrogate, but they have no control over its value, nor is its value
ever displayed to them....." (Codd in ACM TODS, pp 409-410).

References

Codd, E. (1979), Extending the database relational model to capture
more meaning. ACM Transactions on Database Systems, 4(4). pp. 397-434

The steps for finding a key are

1) Look for an industry standard and the trusted source that maintains
it.

2) Look for a natural key in the attributes. Example: (longitude,
latitude) makes a good key for a geographical location.

3) If you must design a new identifier, plan it carefully -- especially
if people will see and use it. You have to be able to validate it in
application programs, so you need a regular expression, other syntax
rule and/or check digits. You have to be able to be verify in the
reality of the model or with a trusted source.

Jul 23 '05 #12

by: Manlio Perillo | last post by:

Hi. This post follows "does python have useless destructors". I'm not an expert, so I hope what I will write is meaningfull and clear. Actually in Python there is no possibility to write code...

Python

Page auto-refresh and javascript

by: Lew | last post by:

Hi all, I'm trying to create a page that has a user-selectable page auto-refresh option (IE 5.5). Essentially, it's a page that contains a checkbox, when the user checks the checkbox, I'd like...

Javascript

Problem with the width of DIV in IE6 using CSS and width:auto

by: Glabbeek | last post by:

I'm changing the layout of my site. Instead of using tables, I will use DIVs. It's working fine, except for 1 thing: In IE6 some DIVs are not the correct width. Mozilla and Opera are showing the...

HTML / CSS

width: auto seems to be ignoring me in Mozilla

by: Robert Downes | last post by:

I'm using the following in a page that I'm testing in Mozilla: p.actionLinkBlock {border: 1px #000000 dashed; padding: 0.2cm; width: auto} But the dashed border is extending to the right-edge...

HTML / CSS

Why not auto?

by: Vijay Kumar R. Zanvar | last post by:

Hello, Unlike register, auto keyword can not be used to declare formal parameter(s). Is there any specific reason for this? Kind regards, Vijay Kumar R. Zanvar

C / C++

Localization: user selectable UICulture w/ UICulture = auto

by: Samuel | last post by:

Hi, I am running into a problem of mixing UICulture = auto and allowing users to select culture using a dropdown list. I am detecting a querystring, "setlang", and when found, setting the...

ASP.NET

'auto' value question..

by: maya | last post by:

at work they decided to center divs thus: body {text-align:center} #content {width: 612px; text-align:left; margin: 0 auto 0 auto; } this works fine in IE & FF, EXCEPT in FF it doesn't work if...

HTML / CSS

Usage of "auto". What for?

by: nospam_news | last post by:

I currently get asked about my usage of "auto". What is it for? The keyword is clearly superflous here. In contrast to the huge majority of C/C++ developers I write definitions very explicitly...

C / C++

Firefox and retrieving height set to "auto"

by: Piotr K | last post by:

Hi, I've encountered a strange problem with Firefox which I don't have any idea how to resolve. To the point: I've <divelement with a style "height: auto" and I want to retrieve this value...

Javascript

What's the differences between REGISTER and AUTO?

by: JOYCE | last post by:

Look the subject,that's my problem! I hope someone can help me, thanks

C / C++

Batch import of multiple excel files into the database

by: ryjfgjl | last post by:

If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...

Data Management

Merging data from multiple Excel files

by: ryjfgjl | last post by:

In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...

Data Management

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

Windows Server

Auto-increment PK - Use MS or Grow your own ?

Similar topics