473,791 Members | 3,154 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Preventing Duplicate Rows on Insert

I have a table using an identity column as its Primary Key and two
columns (table reduced for simplicity) EmployeeNumber and ArrivalTime.

CREATE TABLE [tblRecords] (
[ID] [bigint] IDENTITY (1, 1) NOT NULL ,
[EmployeeNumber] [varchar] (10) COLLATE SQL_Latin1_Gene ral_CP1_CI_AS
NOT NULL ,
[ArrivalTime] [datetime] NOT NULL ,
CONSTRAINT [PK_tblRecords] PRIMARY KEY CLUSTERED
(
[ID]
) ON [PRIMARY]
) ON [PRIMARY]
GO

I have an insert procedure that checks for duplicates before inserting
a new record:

IF (SELECT TOP 1 [ID] FROM tblRecords WHERE EmployeeNumber =
@SocialSecurity ) IS NULL
BEGIN
INSERT INTO tblRecords(Empl oyeeNumber,Arri valTime)
VALUES (@EmployeeNumbe r, @ArrivalTime)
SELECT SCOPE_IDENTITY( )
END
ELSE
SELECT 0 AS DuplicateRecord

In 99.9% of the cases, this works well. However, in the event that the
insert attempts are literally "ticks" apart, the "SELECT TOP 1..."
command completes on both attempts before the first attempt completes.
So I end up with duplicate entries if the procedure is called multiple
times vey quickly. The system needs to prevent duplicate
EmployeeNumbers within the past 45 days so setting the EmployeeNumber
to UNIQUE would not work. I can check for older entries (45 days or
newer) very easily, but I do not know how to handle the times when the
procedure is called multiple times within milliseconds. Would a
TRANSACTION with a duplicate check after the INSERT with a ROLLBACK
work in this case? Any help is greatly appreciated!

-E

Jul 23 '05
18 27721
David Portas (RE************ *************** *@acm.org) writes:
In this case the IDENTITY _is_ a subset
of the attributes
That's dangerous. IDENTITY is generally suitable only as an artificial
key. It should not have any business meaning, in fact it should be
completely hidden from users of the system. If you assign any business
meaning to an IDENTITY key you set yourself up for a whole lot of
potential problems to do with migrating and deploying data in a live
environment.


Another one that competes with Joe Celko to know everything about everyone
else's system, I see.

Well, it's often the case that what you call "natural keys" are in fact
someone else's arbitrary sequential number, assigned with IDENTITY or
something else.

There is a whole of real-worl data out there that does not have any
natural key that obeys the laws of the relational model.
Also, your design is fundamental flawed if you lack a natural candidate
key - this is a frequent cause of incorrect results and invalid data in
poorly designed databases.


Personally, I would be very careful with anyone else's design "fundamenta lly
flawed" when I have only seen glimpses of that design. Particularly when
it's clear from Elroyskimms that he has quite some experience of system
design. Just because he has not made the design that you or I(*) would
have done, does not mean that the design is wrong. When I think of it,
you should learn Perl. If nothing else, to learn the motto of Perl:
"There's more than one way to do it".
--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #11
> Another one that competes with Joe Celko to know everything about everyone
else's system, I see.
I was giving my opinion and suggestions based on the information given.
I can't do more than that and I don't claim omniscience.

There is a whole of real-worl data out there that does not have any
natural key that obeys the laws of the relational model.


No-one says the world has to obey the laws of the relational model
(whatever that means). This is purely a question of how to model the
world in a table. We can represent ANY facts as relations (i.e. with
keys). Whether we choose to use IDENTITY or not is a practical issue in
SQL Server that has absolutely zero to do with the logical data model.
That's why I said "fundamenta l flaw".

--
David Portas
SQL Server MVP
--

Jul 23 '05 #12
AK
>> There is a whole of real-worl data out there that does not have any
natural key that obeys the laws of the relational model.
<<

I could not agree more.

Jul 23 '05 #13
Actually I may have mis-read Erland's statement. It could be read more
than one way and I'm not sure what interpretation you are agreeing
with.
"There is a whole of real-worl data out there that does not have any
natural key that obeys the laws of the relational model."


If this means [1] "there is a lot of data in database systems that
doesn't have a natural key" then I agree. If it means [2] "there is a
lot of information that cannot be modelled in an RDBMS without a
surrogate key" then I disagree and I think Erland is confusing logical
and physical models.

--
David Portas
SQL Server MVP
--

Jul 23 '05 #14
David Portas (RE************ *************** *@acm.org) writes:
"There is a whole of real-worl data out there that does not have any
natural key that obeys the laws of the relational model."


If this means [1] "there is a lot of data in database systems that
doesn't have a natural key" then I agree. If it means [2] "there is a
lot of information that cannot be modelled in an RDBMS without a
surrogate key" then I disagree and I think Erland is confusing logical
and physical models.


I simply mean that there is a lot of data for which you cannot find
any useful natural primary key. Whether there exists one in theory, or none
at all, is completely uninterestering for us who gets paid to implement such
systems. (Because our customers could not care less.)

Typical examples of entities that do not have any useful natural key:

Persons (For certain subsets a natural key can be found)
Addresses (The entire address is a key in itself, but not useful as one.
And a lot of it can be NULL.)
Financial instruments
(Several competing schemes of natural keys exist, but none of
them are whole-covering.)


--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #15
> Whether there exists one in theory, or none
at all, is completely uninterestering for us who gets paid to implement such
systems. (Because our customers could not care less.)


If customers aren't interested it's surely because they pay database
professionals to worry about such things. That's what keeps you and I
gainfully employed.

--
David Portas
SQL Server MVP
--

Jul 23 '05 #16
David Portas (RE************ *************** *@acm.org) writes:
Whether there exists one in theory, or none at all, is completely
uninterestering for us who gets paid to implement such systems.
(Because our customers could not care less.)


If customers aren't interested it's surely because they pay database
professionals to worry about such things. That's what keeps you and I
gainfully employed.


No, my customers pay me to get a system where they can store transactions
about financial instruments. If I tell them "no, you can't store this
instrument in the database, because there is no identification on it
than I can use for a primary key", they will give me very sinister glances.
Likewise if I tell them "for the system to be able to identify a customer,
you will need to enter, this, this and this". Or, "it will take 50 hours
to find out whether this entity has a potential primary key that we are
likely to be able to use anyway".

Simply put, our customers pays us to get solutions. They don't pay us
for do theoretical investigations. It's part of our professional role
to quickly identify that here is an entity, we are not likely to find
a useful primary key for.
--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #17
> If I tell them "no, you can't store this
instrument in the database, because there is no identification on it
than I can use for a primary key", they will give me very sinister glances.


Keys don't prevent you storing information. They prevent you storing
redundant data. You lose nothing by eliminating that redundancy and you
may gain a lot because redundancy can cause incorrect results.

Of course it is possible to produce correct results from redundant data
and you may still be able to meet the customer's immediate needs.
However, it's a golden rule that enterprise data outlives and extends
beyond the applications that manage it. That's true without exception
in my experience. Storing redundant data today means that not only your
application but also future developers and other applications have to
cope with the potential anomalies that result. Coping with data quality
issues costs money, plus it may prove impossible to resolve those
issues to a satisfactory level of confidence in the context of some
future set of requirements. Anyone who works on data integration
projects knows those experiences too well.

Customers who understand that data quality is valuable to the business
should also understand that the time taken to design a consistent
logical data model is necessary to guarantee that data quality. That's
the philosophy I work to anyway. There are other imperatives too of
course and it's always interesting and sometimes instructive to hear
how others approach problems.

--
David Portas
SQL Server MVP
--

Jul 23 '05 #18
David Portas (RE************ *************** *@acm.org) writes:
If I tell them "no, you can't store this instrument in the database,
because there is no identification on it than I can use for a primary
key", they will give me very sinister glances.


Keys don't prevent you storing information. They prevent you storing
redundant data. You lose nothing by eliminating that redundancy and you
may gain a lot because redundancy can cause incorrect results.


If I require that when users registers instruments that they must
specify an ISIN code, users have two choices when they have an instrument
that do not have an ISIN code (or the users don't know the code):

1) don't register until they have found an ISIN code for the instrument.
2) make one up.

1) is not not a serious option, since an OTC instrument agreed on between
two parties never will have an ISIN code.

And what with 2)? Well, what if users invents a new code, instead of
using an existing instrument?

So much did that key help from preventing redudant data.

Another example: for customers you get the idea to use the national
registration number. Never mind the fine detail that potential customers
without a national registration number cannot get into the database.
Anyway, at the end of the year we need to produce reporting to the
tax authorities. A customer has exactly one tax country, but oops, here
is this guy who changed his taxation country in the middle of the
year. Of course, rather than making the tax country an attribute of
a customer, we could have a customertaxcoun tryhistory table, but, eh,
we can't fix that this year. Anyway, the workaround is simple: register
the customer anew, with a new accounts, so his transactions for the
part of the year we need to report him. Since we do not require uniqueness
on national registration number (and even less are foolish to make
this a key), this was possible.

Building a system too much on natural keys means that when you have
goofed on the model and missed the 0.1% exceptional case, mean that
users will have to fund ugly workarounds - which in worst cases can
mean entirely manual procedures. And don't tell me goofs in the design
don't happen. They do.

Bad data can also get into the system in ways you cannot protect the
system from. You can prevent a user from entering that a person is
living in Flance, by having countries as entity, and only permit a
set of defined countries. But you cannot prevent a user from entering
Switzerland for a customer who is living in Sweden.

So while you by means of constraints, keys etc can prevent some crap data
from getting into the database, you can not prevent all sorts of crap.
But if you are too restrictive, you put users in a straightjacket
where your system hinders their job.

In this context is a fallacy to turn every stone to find a natural primary
key. Sometimes it is simply not there. In an order registration system,
the key is the order number generated by the application. That key can
serve as a natural key in other system, but in the order registration
system we should not try to find some other key. If you want to detect
double-registration of orders, it's probably better to develop some sort
of fuzzy-logic report, where the users manually have to consider the
suspects.
--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #19

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
577
by: g_chime | last post by:
How can I make MySQL reject duplicate rows when inserting? The table is quite large: 100+ rows and millions of rows. MySQL doesn't allow me to create a UNIQUE INDEX for all the rows (16 rows is the max.) The brute force method (selecting the new row and inserting only when not found) is way too slow.
10
5129
by: florian | last post by:
Hi, we have a contention problem because of an application which tries to insert a duplicate row in a table with primary key. This insert fails of course but the locks are not released within this transcation. Why DB2 does not release the X lock after a failed insert??? We use DB2 UDB EEE Version 7.2 Fixpak 9, but we also can reproduce the Problem on DB2 UDB ESE 8.1 Linux Fixpak 4.
4
14403
by: Vijay Pote | last post by:
Hi All, i have a datatable in which iam creating 2 datacolumns , one is SampleID and second is TestType. iam creating a datarow depending on a for loop, if the for loop goes 5 times it will insert 5 records. every thing works well, what i want is to delete the Duplicate Rows
1
1787
by: Asha | last post by:
greetings, does anyone have any idea how to delete duplicate rows in a dataset gracefully (the fast and easy way)
6
13764
by: Bob Stearns | last post by:
I am getting duplicate rows back from a select distinct statement of the form: SELECT DISTINCT 'jhough', '000111', t0.bhid FROM (SELECT lots of good stuff) t0 LEFT OUTER JOIN another_table t1 ON relevant_stuff WHERE (lots of conditions) After re-reading the relevant pat ofVol 1 of the SQL Reference I am unablee to see how this is possible.
7
9783
by: Jon Maz | last post by:
Hi, I have a MySql problem I hope someone can help me with. I'm trying to run an update on a linking table, the update is running into a Primary Key constraint violation, and in my workaround I've got stuck trying to write a DELETE statement. Here's the table I'm working on: CREATE TABLE `articles_categories` (
2
2799
by: prateekbtbl | last post by:
Hello Sir, I have a table with userID and articleID, now if a user access an article it is stored in this table, how do i write a query where no duplicate entry for article viewed by a user is store. after that i need total count of a user to fetch and store in another table. Can anyone help me writing a query. Thanks in advance Prateek Batabyal
5
8430
jamesd0142
by: jamesd0142 | last post by:
My manager and I where looking at some complex code to eliminate duplicate records in a database table. then it hit me how its done easily... so i thought i'd share it... In English: -------------------------------- -- Eliminating Duplicate rows -- -------------------------------- -- select all into a temp table. -- truncate origional table.
4
5467
by: ravir81 | last post by:
Hi, I am currently working on excel validation using Perl. I am new to Excel validation but not for Perl. I have a question regarding one of the validation. Could anyone please tell me how to get the number of duplicate rows based on a particular cell value of each these duplicate rows. I mean all the cell values of a row will not be duplicated but a individual columns cell value will be duplicated and I need to create a separate excel with...
0
9669
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
10428
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
1
10156
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
9997
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
9030
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
6776
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5435
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
1
4110
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
3718
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.