Preventing Duplicate Rows on Insert

Elroyskimms

I have a table using an identity column as its Primary Key and two
columns (table reduced for simplicity) EmployeeNumber and ArrivalTime.

CREATE TABLE [tblRecords] (
[ID] [bigint] IDENTITY (1, 1) NOT NULL ,
[EmployeeNumber] [varchar] (10) COLLATE SQL_Latin1_Gene ral_CP1_CI_AS
NOT NULL ,
[ArrivalTime] [datetime] NOT NULL ,
CONSTRAINT [PK_tblRecords] PRIMARY KEY CLUSTERED
(
[ID]
) ON [PRIMARY]
) ON [PRIMARY]
GO

I have an insert procedure that checks for duplicates before inserting
a new record:

IF (SELECT TOP 1 [ID] FROM tblRecords WHERE EmployeeNumber =
@SocialSecurity ) IS NULL
BEGIN
INSERT INTO tblRecords(Empl oyeeNumber,Arri valTime)
VALUES (@EmployeeNumbe r, @ArrivalTime)
SELECT SCOPE_IDENTITY( )
END
ELSE
SELECT 0 AS DuplicateRecord

In 99.9% of the cases, this works well. However, in the event that the
insert attempts are literally "ticks" apart, the "SELECT TOP 1..."
command completes on both attempts before the first attempt completes.
So I end up with duplicate entries if the procedure is called multiple
times vey quickly. The system needs to prevent duplicate
EmployeeNumbers within the past 45 days so setting the EmployeeNumber
to UNIQUE would not work. I can check for older entries (45 days or
newer) very easily, but I do not know how to handle the times when the
procedure is called multiple times within milliseconds. Would a
TRANSACTION with a duplicate check after the INSERT with a ROLLBACK
work in this case? Any help is greatly appreciated!

-E

Jul 23 '05 #1

Subscribe Reply

27720

David Portas

Try this:

INSERT INTO tblRecords (employeenumber ,arrivaltime)
SELECT @employeenumber , @arrivaltime
WHERE NOT EXISTS
(SELECT *
FROM tblRecords
WHERE employeenumber = @employeenumber
AND arrivaltime >= DATEADD(DAY,-45,@arrivaltime )) ;

As posted your table design is flawed because it has no alternate keys.
IDENTITY should never be the only key of a table.

--
David Portas
SQL Server MVP
--

Jul 23 '05 #2

David Portas

A better solution is probably to use a trigger.

First, add the all-important UNIQUE constraint:

ALTER TABLE tblRecords ADD CONSTRAINT ak_employees UNIQUE
(employeenumber ,arrivaltime)

Now here's a trigger which will enforce your business rule for UPDATEs as
well as INSERTs:

CREATE TRIGGER trg_employee ON tblRecords FOR UPDATE, INSERT
AS
IF EXISTS
(SELECT T1.employeenumb er
FROM inserted AS T1
JOIN tblrecords AS T2
ON T1.employeenumb er = T2.employeenumb er
AND T1.arrivaltime <> T2.arrivaltime
AND T2.arrivaltime <= DATEADD(DAY,45, T1.arrivaltime)
AND T2.arrivaltime >= DATEADD(DAY,-45,T1.arrivalti me))
BEGIN
ROLLBACK TRAN
RAISERROR('Empl oyee number repeated within 45 days',16,1)
END

GO

--
David Portas
SQL Server MVP
--

Jul 23 '05 #3

Erland Sommarskog

David Portas (RE************ *************** *@acm.org) writes:

Try this:

INSERT INTO tblRecords (employeenumber ,arrivaltime)
SELECT @employeenumber , @arrivaltime
WHERE NOT EXISTS
(SELECT *
FROM tblRecords
WHERE employeenumber = @employeenumber
AND arrivaltime >= DATEADD(DAY,-45,@arrivaltime )) ;

You would need serializable isolation level here. I think the UPDLOCK
in the SELECT statement is the best. (With plain HOLDLOCK you could
get a deadlock.)

As for whether a trigger is best... You don't have to lose sleep over
concurrent inserts, but in case of an error you do more work (INSERT +
ROLLBACK). Whether an error message is desired we don't know. Then again,
you can have both. The trigger to enforce the business rule, and the
application code to give the nice behaviour.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 23 '05 #4

Gert-Jan Strik

As mentioned by Erland, you could place your batch in a transaction, and
serialize the action either by settings the TRANSACTION ISOLATION LEVEL
to SERIALIZABLE or to use locking hints, as shown below:
BEGIN TRANSACTION

If NOT EXISTS (
SELECT *
FROM tblRecords (UPDLOCK HOLDLOCK)
WHERE EmployeeNumber = @SocialSecurity
AND ArrivalTime >= DATEADD(day,-45,@ArrivalTime )
)
Begin
INSERT INTO tblRecords(Empl oyeeNumber,Arri valTime)
VALUES (@EmployeeNumbe r, @ArrivalTime)
SELECT SCOPE_IDENTITY( )
End
Else
SELECT 0 AS DuplicateRecord

COMMIT TRANSACTION
HTH,
Gert-Jan
Elroyskimms wrote:

I have a table using an identity column as its Primary Key and two
columns (table reduced for simplicity) EmployeeNumber and ArrivalTime.

CREATE TABLE [tblRecords] (
[ID] [bigint] IDENTITY (1, 1) NOT NULL ,
[EmployeeNumber] [varchar] (10) COLLATE SQL_Latin1_Gene ral_CP1_CI_AS
NOT NULL ,
[ArrivalTime] [datetime] NOT NULL ,
CONSTRAINT [PK_tblRecords] PRIMARY KEY CLUSTERED
(
[ID]
) ON [PRIMARY]
) ON [PRIMARY]
GO

I have an insert procedure that checks for duplicates before inserting
a new record:

IF (SELECT TOP 1 [ID] FROM tblRecords WHERE EmployeeNumber =
@SocialSecurity ) IS NULL
BEGIN
INSERT INTO tblRecords(Empl oyeeNumber,Arri valTime)
VALUES (@EmployeeNumbe r, @ArrivalTime)
SELECT SCOPE_IDENTITY( )
END
ELSE
SELECT 0 AS DuplicateRecord

In 99.9% of the cases, this works well. However, in the event that the
insert attempts are literally "ticks" apart, the "SELECT TOP 1..."
command completes on both attempts before the first attempt completes.
So I end up with duplicate entries if the procedure is called multiple
times vey quickly. The system needs to prevent duplicate
EmployeeNumbers within the past 45 days so setting the EmployeeNumber
to UNIQUE would not work. I can check for older entries (45 days or
newer) very easily, but I do not know how to handle the times when the
procedure is called multiple times within milliseconds. Would a
TRANSACTION with a duplicate check after the INSERT with a ROLLBACK
work in this case? Any help is greatly appreciated!

-E

Jul 23 '05 #5

--CELKO--

>> I have a table using an identity column [sic] Primary Key and two columns (table reduced for simplicity) EmployeeNumber and ArrivalTime. <<

Have you ever read the definition of a key??? An IDENTITY can never.
never be a PRIMARY KEY by defintion because it is not a subset of the
entities attributes. It is an exposed physical storage locator derived
from the state of the hardware and not the data model. Why did you
give it the vague, useless name "ID"?

Why did you put ":tbl-" on your table names? Read ISO-11179 for the
right way to name things. Rows are not records. Columns are not
fields. Why do you unneedlessly use proprietary datatypes and the most
procedural T-SQL you could write? Do you see destruction of portable
code and nonrelational programming as a good thing?

How do you have an arrival without a departure? Is the ISO model of
time wrong?

CREATE TABLE Events -- needs a better name
(emp_nbr CHAR (10) NOT NULL
REFERENCES Personnel (emp_nbr)
ON UPDATE CASCADE,
arrival_time DATETIME DEFAULT CURRRENT_TIMEST AMP NOT NULL,
depart_time DATETIME, -- null means current
PRIMARY KEY (emp_nbr, arrival_time));

You have missing the major points of RDBMS and nee to get some help,
more than the kludges you will get a newsgroup.

Jul 23 '05 #6

>> An IDENTITY can never.
never be a PRIMARY KEY by defintion because it is not a subset of the
entities attributes. It is an exposed physical storage locator derived

from the state of the hardware and not the data model.
<<

with all due respect, Oracle's rowid is what are speking about -
"an exposed physical storage locator". Identity is an entirely
different beast...

Jul 23 '05 #7

--CELKO--

>> Oracle's rowid is what are speaking about - "an exposed physical storage locator". Identity is an entirely different beast... <<

No, it is not significantly different. Both of them are numbers that
have nothing to do with the data model or the reality of the data
model. Both are created inside the hardware used to store the data.
Both use a bit of software to move the read-write head on a disk drive
directly to a physical location.

The rowid is merely easier programming and faster than IDENTITY, which
is using the indexing code. Both are exposed physical locators, not
surrogates as per Dr. Codd. That is not a logical difference.

The IDENTITY property is a hold-over from the days when the only way we
knew to storage data involved physically contigous records, made of
physically contigous fields, with tree-structured indexing. This is
the worse way to store relational data.

Jul 23 '05 #8

Elroyskimms

For those of you who offered suggestions to my question, thanks for
your help. It is greatly appreciated.

Celko,
I don't know what to make of your response, but I will try and address
your concerns individually:

Have you ever read the definition of a key??? Yes, I have. Thanks for asking.
An IDENTITY can never never be a PRIMARY KEY by defintion
because it is not a subset of the entities attributes. I appreciate your concern, but you are assuming things about these
entities that are incorrect. In this case the IDENTITY _is_ a subset
of the attributes. I only displayed a simplified version of the tables
because my question was not really specific to the tables in my
database.
Why did you give it the vague, useless name "ID"? Because it is neither vague nor useless to everyone who works with the
data in these tables. Again, you are assuming that you know everything
about this data and how it is being used. I will remind you that I
posted a simplified set of tables.
Why did you put ":tbl-" on your table names? Read ISO-11179 for the
right way to name things. I prefixed my table names with tbl because there is a lot more code
that I work with besides SQL. So to keep all of my objects straight,
their names also reflect what type of object they are. So, my textboxes
are named txt_____ and my buttons are named btn___... I'm sure you get
the idea.
Rows are not records. Columns are not fields. You are right. I actually knew that, but I prefer to use lamens terms
because I hate sounding like a prick no-it-all that is out to teach
everyone something. No one likes those types of people, don't you
agree?
Why do you unneedlessly use proprietary datatypes and the most
procedural T-SQL you could write? Do you see destruction of portable
code and nonrelational programming as a good thing? I'm not sure I understand your concern here. Again, you are assuming
that by seeing these 2 sample tables you have an omniscient view of my
database. My code is portable and my data relational. I know some
people are better than me, I'm OK with that. However, I choose not to
bash them simply because they know less than me. I hate those types of
people, don't you?
How do you have an arrival without a departure? Again, you assume that because you see a couple of tables you know all
about my data. I know you are the smartest person in these groups, but
here is something you didn't know. I did not include a departure time
because each record can have one arrival time and many departure times.
I think that is called a one-to-many relationship. If I'm wrong, I'm
sure you'll point it out. Because of this "relationsh ip" between my
tables, departures are stored in a separate table. I think that is
called relational data, but I could be wrong... no one likes being
right all of the time. I hate those types of people, don't you?
CREATE TABLE Events -- needs a better name Events is a better name? What kind of vague and useless name is
"Event"? For your information, my actual table name is not Record I
just used it to irritate you. Did it work?
(emp_nbr CHAR (10) NOT NULL I prefer to spell out the table name so I would use number instead of
nbr. Sometimes, the people I work with don't read English very well and
they don't know what to do with nbr. But 'number' is in their
vocabulary.
depart_time DATETIME, -- null means current Only one departure per arrival? This doesn't look very relational to
me.
You have missing the major points of RDBMS and
nee to get some help, more than the kludges you
will get a newsgroup.

My data is relational, my code is portable, and my table names are
readable to anyone with a 3rd grade English vocabulary. I'm sorry if
that is not to your liking. I'll try and do better next time.

Jul 23 '05 #9

David Portas

> In this case the IDENTITY _is_ a subset

of the attributes

That's dangerous. IDENTITY is generally suitable only as an artificial
key. It should not have any business meaning, in fact it should be
completely hidden from users of the system. If you assign any business
meaning to an IDENTITY key you set yourself up for a whole lot of
potential problems to do with migrating and deploying data in a live
environment. Also, your design is fundamental flawed if you lack a
natural candidate key - this is a frequent cause of incorrect results
and invalid data in poorly designed databases.

--
David Portas
SQL Server MVP
--

Jul 23 '05 #10

Similar topics

577

Help: rejecting duplicate rows

by: g_chime | last post by:

How can I make MySQL reject duplicate rows when inserting? The table is quite large: 100+ rows and millions of rows. MySQL doesn't allow me to create a UNIQUE INDEX for all the rows (16 rows is the max.) The brute force method (selecting the new row and inserting only when not found) is way too slow.

MySQL Database

5127

failed insert because of duplicate rows and still locks held by application

by: florian | last post by:

Hi, we have a contention problem because of an application which tries to insert a duplicate row in a table with primary key. This insert fails of course but the locks are not released within this transcation. Why DB2 does not release the X lock after a failed insert??? We use DB2 UDB EEE Version 7.2 Fixpak 9, but we also can reproduce the Problem on DB2 UDB ESE 8.1 Linux Fixpak 4.

DB2 Database

14403

DataTable DataRow Delete Duplicate Rows

by: Vijay Pote | last post by:

Hi All, i have a datatable in which iam creating 2 datacolumns , one is SampleID and second is TestType. iam creating a datarow depending on a for loop, if the for loop goes 5 times it will insert 5 records. every thing works well, what i want is to delete the Duplicate Rows

C# / C Sharp

1787

deleting duplicate rows in a dataset

by: Asha | last post by:

greetings, does anyone have any idea how to delete duplicate rows in a dataset gracefully (the fast and easy way)

ASP.NET

13764

Duplicate rows from a SELECT DISTINCT DB2 UDB v8.1.9 Linux

by: Bob Stearns | last post by:

I am getting duplicate rows back from a select distinct statement of the form: SELECT DISTINCT 'jhough', '000111', t0.bhid FROM (SELECT lots of good stuff) t0 LEFT OUTER JOIN another_table t1 ON relevant_stuff WHERE (lots of conditions) After re-reading the relevant pat ofVol 1 of the SQL Reference I am unablee to see how this is possible.

DB2 Database

9782

MySql UPDATE problem with duplicate rows

by: Jon Maz | last post by:

Hi, I have a MySql problem I hope someone can help me with. I'm trying to run an update on a linking table, the update is running into a Primary Key constraint violation, and in my workaround I've got stuck trying to write a DELETE statement. Here's the table I'm working on: CREATE TABLE `articles_categories` (

MySQL Database

2798

Preventing duplicate rows in a MySQL table

by: prateekbtbl | last post by:

Hello Sir, I have a table with userID and articleID, now if a user access an article it is stored in this table, how do i write a query where no duplicate entry for article viewed by a user is store. after that i need total count of a user to fetch and store in another table. Can anyone help me writing a query. Thanks in advance Prateek Batabyal

MySQL Database

8430

Eliminating Duplicate Rows In SQL SERVER 2000

by: jamesd0142 | last post by:

My manager and I where looking at some complex code to eliminate duplicate records in a database table. then it hit me how its done easily... so i thought i'd share it... In English: -------------------------------- -- Eliminating Duplicate rows -- -------------------------------- -- select all into a temp table. -- truncate origional table.

Microsoft SQL Server

5459

How to get a duplicate rows based on the particular cell value in excel using Perl

by: ravir81 | last post by:

Hi, I am currently working on excel validation using Perl. I am new to Excel validation but not for Perl. I have a question regarding one of the validation. Could anyone please tell me how to get the number of duplicate rows based on a particular cell value of each these duplicate rows. I mean all the cell values of a row will not be duplicated but a individual columns cell value will be duplicated and I need to create a separate excel with...

Perl

10311

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...

C / C++

10146

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...

Online Marketing

9942

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...

General

8967

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...

Career Advice

6733

Couldn’t get equations in html when convert word .docx file to html file in C#.

by: conductexam | last post by:

I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...

C# / C Sharp

5509

Windows Forms - .Net 8.0

by: adsilva | last post by:

A Windows Forms form does not have the event Unload, like VB6. What one acts like?

Visual Basic .NET

4043

transfer the data from one system to another through ip address

by: 6302768590 | last post by:

Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system

C# / C Sharp

3639

How to add payments to a PHP MySQL app.

by: muto222 | last post by:

How can i add a mobile payment intergratation into php mysql website.

PHP

2874

Comprehensive Guide to Website Development in Toronto: Expert Insights from BSMN Consultancy

by: bsmnconsultancy | last post by:

In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

General