473,782 Members | 2,513 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Preventing Duplicate Rows on Insert

I have a table using an identity column as its Primary Key and two
columns (table reduced for simplicity) EmployeeNumber and ArrivalTime.

CREATE TABLE [tblRecords] (
[ID] [bigint] IDENTITY (1, 1) NOT NULL ,
[EmployeeNumber] [varchar] (10) COLLATE SQL_Latin1_Gene ral_CP1_CI_AS
NOT NULL ,
[ArrivalTime] [datetime] NOT NULL ,
CONSTRAINT [PK_tblRecords] PRIMARY KEY CLUSTERED
(
[ID]
) ON [PRIMARY]
) ON [PRIMARY]
GO

I have an insert procedure that checks for duplicates before inserting
a new record:

IF (SELECT TOP 1 [ID] FROM tblRecords WHERE EmployeeNumber =
@SocialSecurity ) IS NULL
BEGIN
INSERT INTO tblRecords(Empl oyeeNumber,Arri valTime)
VALUES (@EmployeeNumbe r, @ArrivalTime)
SELECT SCOPE_IDENTITY( )
END
ELSE
SELECT 0 AS DuplicateRecord

In 99.9% of the cases, this works well. However, in the event that the
insert attempts are literally "ticks" apart, the "SELECT TOP 1..."
command completes on both attempts before the first attempt completes.
So I end up with duplicate entries if the procedure is called multiple
times vey quickly. The system needs to prevent duplicate
EmployeeNumbers within the past 45 days so setting the EmployeeNumber
to UNIQUE would not work. I can check for older entries (45 days or
newer) very easily, but I do not know how to handle the times when the
procedure is called multiple times within milliseconds. Would a
TRANSACTION with a duplicate check after the INSERT with a ROLLBACK
work in this case? Any help is greatly appreciated!

-E

Jul 23 '05 #1
18 27720
Try this:

INSERT INTO tblRecords (employeenumber ,arrivaltime)
SELECT @employeenumber , @arrivaltime
WHERE NOT EXISTS
(SELECT *
FROM tblRecords
WHERE employeenumber = @employeenumber
AND arrivaltime >= DATEADD(DAY,-45,@arrivaltime )) ;

As posted your table design is flawed because it has no alternate keys.
IDENTITY should never be the only key of a table.

--
David Portas
SQL Server MVP
--
Jul 23 '05 #2
A better solution is probably to use a trigger.

First, add the all-important UNIQUE constraint:

ALTER TABLE tblRecords ADD CONSTRAINT ak_employees UNIQUE
(employeenumber ,arrivaltime)

Now here's a trigger which will enforce your business rule for UPDATEs as
well as INSERTs:

CREATE TRIGGER trg_employee ON tblRecords FOR UPDATE, INSERT
AS
IF EXISTS
(SELECT T1.employeenumb er
FROM inserted AS T1
JOIN tblrecords AS T2
ON T1.employeenumb er = T2.employeenumb er
AND T1.arrivaltime <> T2.arrivaltime
AND T2.arrivaltime <= DATEADD(DAY,45, T1.arrivaltime)
AND T2.arrivaltime >= DATEADD(DAY,-45,T1.arrivalti me))
BEGIN
ROLLBACK TRAN
RAISERROR('Empl oyee number repeated within 45 days',16,1)
END

GO

--
David Portas
SQL Server MVP
--
Jul 23 '05 #3
David Portas (RE************ *************** *@acm.org) writes:
Try this:

INSERT INTO tblRecords (employeenumber ,arrivaltime)
SELECT @employeenumber , @arrivaltime
WHERE NOT EXISTS
(SELECT *
FROM tblRecords
WHERE employeenumber = @employeenumber
AND arrivaltime >= DATEADD(DAY,-45,@arrivaltime )) ;


You would need serializable isolation level here. I think the UPDLOCK
in the SELECT statement is the best. (With plain HOLDLOCK you could
get a deadlock.)

As for whether a trigger is best... You don't have to lose sleep over
concurrent inserts, but in case of an error you do more work (INSERT +
ROLLBACK). Whether an error message is desired we don't know. Then again,
you can have both. The trigger to enforce the business rule, and the
application code to give the nice behaviour.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #4
As mentioned by Erland, you could place your batch in a transaction, and
serialize the action either by settings the TRANSACTION ISOLATION LEVEL
to SERIALIZABLE or to use locking hints, as shown below:
BEGIN TRANSACTION

If NOT EXISTS (
SELECT *
FROM tblRecords (UPDLOCK HOLDLOCK)
WHERE EmployeeNumber = @SocialSecurity
AND ArrivalTime >= DATEADD(day,-45,@ArrivalTime )
)
Begin
INSERT INTO tblRecords(Empl oyeeNumber,Arri valTime)
VALUES (@EmployeeNumbe r, @ArrivalTime)
SELECT SCOPE_IDENTITY( )
End
Else
SELECT 0 AS DuplicateRecord

COMMIT TRANSACTION
HTH,
Gert-Jan
Elroyskimms wrote:

I have a table using an identity column as its Primary Key and two
columns (table reduced for simplicity) EmployeeNumber and ArrivalTime.

CREATE TABLE [tblRecords] (
[ID] [bigint] IDENTITY (1, 1) NOT NULL ,
[EmployeeNumber] [varchar] (10) COLLATE SQL_Latin1_Gene ral_CP1_CI_AS
NOT NULL ,
[ArrivalTime] [datetime] NOT NULL ,
CONSTRAINT [PK_tblRecords] PRIMARY KEY CLUSTERED
(
[ID]
) ON [PRIMARY]
) ON [PRIMARY]
GO

I have an insert procedure that checks for duplicates before inserting
a new record:

IF (SELECT TOP 1 [ID] FROM tblRecords WHERE EmployeeNumber =
@SocialSecurity ) IS NULL
BEGIN
INSERT INTO tblRecords(Empl oyeeNumber,Arri valTime)
VALUES (@EmployeeNumbe r, @ArrivalTime)
SELECT SCOPE_IDENTITY( )
END
ELSE
SELECT 0 AS DuplicateRecord

In 99.9% of the cases, this works well. However, in the event that the
insert attempts are literally "ticks" apart, the "SELECT TOP 1..."
command completes on both attempts before the first attempt completes.
So I end up with duplicate entries if the procedure is called multiple
times vey quickly. The system needs to prevent duplicate
EmployeeNumbers within the past 45 days so setting the EmployeeNumber
to UNIQUE would not work. I can check for older entries (45 days or
newer) very easily, but I do not know how to handle the times when the
procedure is called multiple times within milliseconds. Would a
TRANSACTION with a duplicate check after the INSERT with a ROLLBACK
work in this case? Any help is greatly appreciated!

-E

Jul 23 '05 #5
>> I have a table using an identity column [sic] Primary Key and two columns (table reduced for simplicity) EmployeeNumber and ArrivalTime. <<

Have you ever read the definition of a key??? An IDENTITY can never.
never be a PRIMARY KEY by defintion because it is not a subset of the
entities attributes. It is an exposed physical storage locator derived
from the state of the hardware and not the data model. Why did you
give it the vague, useless name "ID"?

Why did you put ":tbl-" on your table names? Read ISO-11179 for the
right way to name things. Rows are not records. Columns are not
fields. Why do you unneedlessly use proprietary datatypes and the most
procedural T-SQL you could write? Do you see destruction of portable
code and nonrelational programming as a good thing?

How do you have an arrival without a departure? Is the ISO model of
time wrong?

CREATE TABLE Events -- needs a better name
(emp_nbr CHAR (10) NOT NULL
REFERENCES Personnel (emp_nbr)
ON UPDATE CASCADE,
arrival_time DATETIME DEFAULT CURRRENT_TIMEST AMP NOT NULL,
depart_time DATETIME, -- null means current
PRIMARY KEY (emp_nbr, arrival_time));

You have missing the major points of RDBMS and nee to get some help,
more than the kludges you will get a newsgroup.

Jul 23 '05 #6
AK
>> An IDENTITY can never.
never be a PRIMARY KEY by defintion because it is not a subset of the
entities attributes. It is an exposed physical storage locator derived

from the state of the hardware and not the data model.
<<

with all due respect, Oracle's rowid is what are speking about -
"an exposed physical storage locator". Identity is an entirely
different beast...

Jul 23 '05 #7
>> Oracle's rowid is what are speaking about - "an exposed physical storage locator". Identity is an entirely different beast... <<

No, it is not significantly different. Both of them are numbers that
have nothing to do with the data model or the reality of the data
model. Both are created inside the hardware used to store the data.
Both use a bit of software to move the read-write head on a disk drive
directly to a physical location.

The rowid is merely easier programming and faster than IDENTITY, which
is using the indexing code. Both are exposed physical locators, not
surrogates as per Dr. Codd. That is not a logical difference.

The IDENTITY property is a hold-over from the days when the only way we
knew to storage data involved physically contigous records, made of
physically contigous fields, with tree-structured indexing. This is
the worse way to store relational data.

Jul 23 '05 #8
For those of you who offered suggestions to my question, thanks for
your help. It is greatly appreciated.

Celko,
I don't know what to make of your response, but I will try and address
your concerns individually:
Have you ever read the definition of a key??? Yes, I have. Thanks for asking.
An IDENTITY can never never be a PRIMARY KEY by defintion
because it is not a subset of the entities attributes. I appreciate your concern, but you are assuming things about these
entities that are incorrect. In this case the IDENTITY _is_ a subset
of the attributes. I only displayed a simplified version of the tables
because my question was not really specific to the tables in my
database.
Why did you give it the vague, useless name "ID"? Because it is neither vague nor useless to everyone who works with the
data in these tables. Again, you are assuming that you know everything
about this data and how it is being used. I will remind you that I
posted a simplified set of tables.
Why did you put ":tbl-" on your table names? Read ISO-11179 for the
right way to name things. I prefixed my table names with tbl because there is a lot more code
that I work with besides SQL. So to keep all of my objects straight,
their names also reflect what type of object they are. So, my textboxes
are named txt_____ and my buttons are named btn___... I'm sure you get
the idea.
Rows are not records. Columns are not fields. You are right. I actually knew that, but I prefer to use lamens terms
because I hate sounding like a prick no-it-all that is out to teach
everyone something. No one likes those types of people, don't you
agree?
Why do you unneedlessly use proprietary datatypes and the most
procedural T-SQL you could write? Do you see destruction of portable
code and nonrelational programming as a good thing? I'm not sure I understand your concern here. Again, you are assuming
that by seeing these 2 sample tables you have an omniscient view of my
database. My code is portable and my data relational. I know some
people are better than me, I'm OK with that. However, I choose not to
bash them simply because they know less than me. I hate those types of
people, don't you?
How do you have an arrival without a departure? Again, you assume that because you see a couple of tables you know all
about my data. I know you are the smartest person in these groups, but
here is something you didn't know. I did not include a departure time
because each record can have one arrival time and many departure times.
I think that is called a one-to-many relationship. If I'm wrong, I'm
sure you'll point it out. Because of this "relationsh ip" between my
tables, departures are stored in a separate table. I think that is
called relational data, but I could be wrong... no one likes being
right all of the time. I hate those types of people, don't you?
CREATE TABLE Events -- needs a better name Events is a better name? What kind of vague and useless name is
"Event"? For your information, my actual table name is not Record I
just used it to irritate you. Did it work?
(emp_nbr CHAR (10) NOT NULL I prefer to spell out the table name so I would use number instead of
nbr. Sometimes, the people I work with don't read English very well and
they don't know what to do with nbr. But 'number' is in their
vocabulary.
depart_time DATETIME, -- null means current Only one departure per arrival? This doesn't look very relational to
me.
You have missing the major points of RDBMS and
nee to get some help, more than the kludges you
will get a newsgroup.

My data is relational, my code is portable, and my table names are
readable to anyone with a 3rd grade English vocabulary. I'm sorry if
that is not to your liking. I'll try and do better next time.

Jul 23 '05 #9
> In this case the IDENTITY _is_ a subset
of the attributes


That's dangerous. IDENTITY is generally suitable only as an artificial
key. It should not have any business meaning, in fact it should be
completely hidden from users of the system. If you assign any business
meaning to an IDENTITY key you set yourself up for a whole lot of
potential problems to do with migrating and deploying data in a live
environment. Also, your design is fundamental flawed if you lack a
natural candidate key - this is a frequent cause of incorrect results
and invalid data in poorly designed databases.

--
David Portas
SQL Server MVP
--

Jul 23 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
577
by: g_chime | last post by:
How can I make MySQL reject duplicate rows when inserting? The table is quite large: 100+ rows and millions of rows. MySQL doesn't allow me to create a UNIQUE INDEX for all the rows (16 rows is the max.) The brute force method (selecting the new row and inserting only when not found) is way too slow.
10
5127
by: florian | last post by:
Hi, we have a contention problem because of an application which tries to insert a duplicate row in a table with primary key. This insert fails of course but the locks are not released within this transcation. Why DB2 does not release the X lock after a failed insert??? We use DB2 UDB EEE Version 7.2 Fixpak 9, but we also can reproduce the Problem on DB2 UDB ESE 8.1 Linux Fixpak 4.
4
14403
by: Vijay Pote | last post by:
Hi All, i have a datatable in which iam creating 2 datacolumns , one is SampleID and second is TestType. iam creating a datarow depending on a for loop, if the for loop goes 5 times it will insert 5 records. every thing works well, what i want is to delete the Duplicate Rows
1
1787
by: Asha | last post by:
greetings, does anyone have any idea how to delete duplicate rows in a dataset gracefully (the fast and easy way)
6
13764
by: Bob Stearns | last post by:
I am getting duplicate rows back from a select distinct statement of the form: SELECT DISTINCT 'jhough', '000111', t0.bhid FROM (SELECT lots of good stuff) t0 LEFT OUTER JOIN another_table t1 ON relevant_stuff WHERE (lots of conditions) After re-reading the relevant pat ofVol 1 of the SQL Reference I am unablee to see how this is possible.
7
9782
by: Jon Maz | last post by:
Hi, I have a MySql problem I hope someone can help me with. I'm trying to run an update on a linking table, the update is running into a Primary Key constraint violation, and in my workaround I've got stuck trying to write a DELETE statement. Here's the table I'm working on: CREATE TABLE `articles_categories` (
2
2798
by: prateekbtbl | last post by:
Hello Sir, I have a table with userID and articleID, now if a user access an article it is stored in this table, how do i write a query where no duplicate entry for article viewed by a user is store. after that i need total count of a user to fetch and store in another table. Can anyone help me writing a query. Thanks in advance Prateek Batabyal
5
8430
jamesd0142
by: jamesd0142 | last post by:
My manager and I where looking at some complex code to eliminate duplicate records in a database table. then it hit me how its done easily... so i thought i'd share it... In English: -------------------------------- -- Eliminating Duplicate rows -- -------------------------------- -- select all into a temp table. -- truncate origional table.
4
5459
by: ravir81 | last post by:
Hi, I am currently working on excel validation using Perl. I am new to Excel validation but not for Perl. I have a question regarding one of the validation. Could anyone please tell me how to get the number of duplicate rows based on a particular cell value of each these duplicate rows. I mean all the cell values of a row will not be duplicated but a individual columns cell value will be duplicated and I need to create a separate excel with...
0
10311
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10146
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
9942
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
8967
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
6733
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5509
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
4043
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
3639
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2874
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.