"Louis Frolio" :
Greetings All, I have read many upon many articles here regarding GUID
data types and uniqueness. There have been many opinions regarding
the effectiveness of GUID's and when they should/should not be used.
However, every article strongly implies, if it does not state it
outright, that GUID's are always unique. My question is this, what
happens if you have a database that uses GUID's and the NIC is changed
out on the box? From what I understand the MAC address of the NIC is
used as part of the algorithm to generate a GUID. If you change out
the NIC after generating 1 billion GUID's do you run the chance of
generating a duplicate GUID?
I look forward to your insightfulness on this issue.
Regards, Louis.
Louis,
Assuming the RPC and COM implementation for GUID creation on Windows (and I
assume that's what SQL Server uses as well, but I'm not sure), then a GUID
is virtually guaranteed to be unique even if you switch MAC addresses (by
switching cards or changing it in software if the network card supports that
feature). Computers without network cards are guaranteed to generate unique
GUID on that computer but in practice are still pretty good at generating
completely unique GUID's.
I think you'll find that most articles (especially those written by people
with training in Computer Science or Software Engineering) will skirt the
issue because I don't think there's an algorithm that can formally guarantee
universal unqiueness. (If there is someone *please* correct me: I'd love to
know what it is!)
But... keep in mind that a GUID is 128 bits. If you assume that 64 bits
will be used as a system identifier (like a MAC address), you would still
need more than a quadrillion GUID's before you started approaching the realm
of likely duplication. For some really crazy numbers: assume 1,000,000
computers generating a 1,000,000 GUID's a day. In a thousand years, you
would have used up around 2% of the "number space" in 64 bits. (Of course,
this is a specious example since we conveniently ignored the algorithmic
difficulty of "divying up" the range of numbers possible in 64 bits, but
it's still a cool number :)
That's why GUID's really shine in distributed environments: any computer can
generate as many unique identifiers as required. We use them for a system
we have where a client must create an "entity" that when mapped to our
relational schema could be hundreds of rows across multiple tables. And for
reasons I won't go into, the client must be able to reliably create the
entire entity before ever connecting to a server. Without GUID's (or
something similar) we'd need to something much more painful which would
probably end emulating the MS GUID algorithm anyway.
The point of the above? After millions of these "entities" and hundreds
(possibly thousands) of GUID's for COM components, we've never had a
collision. And that includes computers that have changed Ethernet cards 3
or 4 times, computers that were operated with no network adapter installed
for a week at a time, and computers with multiple adapters. And in our
initial testing for the system 3 or so years ago, we generated 500 million
GUIDS from a single computer and never had a collision.
The bottom line and my $0.02? If you decide to go with GUID's, of course
use constraints in the database like you would for any other column that is
supposed to be unique or a foreign key or whatever, but if an error is ever
detected, look for software bugs waayyyyyy before suspecting duplicate GUID
creation.
Craig