delete otherwise duplicate records based on differing values in one column

6 New Member
I have a table with what I consider duplicate records. Data in all columns are duplicate except for the date column, meaning that duplicate data was entered on different dates and those dates were stored along with the data.

I want to delete all of the records except the most recent one. I can select the client ID field and the max(date field) to determine the ones I want to keep, but how do I determine the ones to delete? There are often more than two duplicates, so the min(date field) doesn't do it.

Any suggestions or guidance will be appreciated!

I use:
Microsoft SQL Enterprise Manager 8.0 running on Windows XP SP2
Feb 14 '08 #1
Delerna
1,134 Recognized Expert Top Contributor
Paste this code into Query Analyzer and see if it does what you need:

    -- Set up a table and add some data for the example to work with
    create table tblDulicateDates([Num1] [tinyint],[Num2] [tinyint],[dte] [datetime])

    delete from tblDulicateDates
    insert into tblDulicateDates select 1,1,'2007-01-01'
    insert into tblDulicateDates select 1,1,'2007-01-02'
    insert into tblDulicateDates select 1,1,'2007-01-03'
    insert into tblDulicateDates select 1,2,'2007-01-01'
    insert into tblDulicateDates select 1,2,'2007-01-02'
    insert into tblDulicateDates select 1,2,'2007-01-03'
    insert into tblDulicateDates select 1,3,'2007-01-01'

    -- Show the table contents: the records are duplicates except for the date
    select * from tblDulicateDates

    -- Declare the necessary variables
    Declare @ThereAreDuplicates int,@Num1 int,@Num2 int, @Dte datetime

    -- See if there are any duplicate records,
    -- i.e. groups whose oldest and newest dates differ
    set @ThereAreDuplicates=(select count(a.num1) from
        (select num1,num2,min(Dte) as Dte from tblDulicateDates group by num1,num2)a
        join
        (select num1,num2,max(Dte) as Dte from tblDulicateDates group by num1,num2)b on a.num1=b.num1 and a.num2=b.num2
        where a.dte<>b.dte)

    -- If there are duplicates then enter the loop
    while @ThereAreDuplicates > 0
    BEGIN
        -- Select the oldest record of each duplicated group into a cursor
        DECLARE DuplicatesCursor CURSOR FOR
        select a.num1,a.num2,a.dte from
        (select num1,num2,min(Dte) as Dte from tblDulicateDates group by num1,num2)a
        join
        (select num1,num2,max(Dte) as Dte from tblDulicateDates group by num1,num2)b on a.num1=b.num1 and a.num2=b.num2
        where a.dte<>b.dte

        OPEN DuplicatesCursor
        FETCH NEXT FROM DuplicatesCursor
        INTO @Num1,@Num2,@Dte

        -- Enter a loop that deletes each of the records in the cursor
        WHILE @@FETCH_STATUS = 0
        BEGIN
            DELETE FROM tblDulicateDates where Num1=@Num1 and Num2=@Num2 and Dte=@Dte

            FETCH NEXT FROM DuplicatesCursor
            INTO @Num1,@Num2,@Dte
        END
        CLOSE DuplicatesCursor
        DEALLOCATE DuplicatesCursor

        -- Check to see if there are any more duplicates still in the table
        -- This is to handle the case where there are 3 or more duplicate records
        set @ThereAreDuplicates=(select count(a.num1) from
        (select num1,num2,min(Dte) as Dte from tblDulicateDates group by num1,num2)a
        join
        (select num1,num2,max(Dte) as Dte from tblDulicateDates group by num1,num2)b on a.num1=b.num1 and a.num2=b.num2
        where a.dte<>b.dte)

    END

    -- Now show the table contents:
    -- no duplicates, and only the records that had the max date are left
    select * from tblDulicateDates
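
The cursor loop above can also be written as a single set-based statement. The following is a minimal sketch against the same sample table, not a tested replacement: it assumes Num1 and Num2 are never NULL (NULL keys never compare equal, so such rows would be left alone), and it sticks to syntax available in SQL Server 2000, which is why it uses a correlated subquery rather than ROW_NUMBER(). The aliases t and m are made up for illustration.

```sql
-- Sketch: delete every row that is older than the newest row
-- sharing the same (Num1, Num2) values.
DELETE t
FROM tblDulicateDates AS t
WHERE t.Dte < (SELECT MAX(m.Dte)
               FROM tblDulicateDates AS m
               WHERE m.Num1 = t.Num1
                 AND m.Num2 = t.Num2)

-- Show what is left
SELECT * FROM tblDulicateDates
```

On the sample data this should leave three rows: (1,1,'2007-01-03'), (1,2,'2007-01-03') and (1,3,'2007-01-01'). Rows that tie on the latest date within a group are all kept, the same as with the cursor version.
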
regards
Feb 14 '08 #2
jmstur2
6 New Member
Thank you. I will give this a try!

-Jeanne
Feb 15 '08 #3
jmstur2
6 New Member
The sample code worked well. It didn't quite do what I need, though. I tried to fit it into my situation, but had no luck.

I have been given a table (TABLEDUPS) of known duplicate records. I need to delete records from a different table (TABLEEVENTS) based on this duplicate records table.

So what I really need to do is look at TABLEDUPS to determine the duplicate records to keep (most recent) and delete records in TABLEEVENTS that match the remaining record(s) in TABLEDUPS.

Any ideas...? Thanks.
Feb 19 '08 #4
ck9663
2,878 Recognized Expert Specialist
"delete records in TABLEEVENTS that match the remaining record(s) in TABLEDUPS."

Could you define "match"? Which columns will you use to determine whether a row matches? It would help if you posted the structure of those two tables.

-- CK
Feb 19 '08 #5
jmstur2
6 New Member
They need to match on key fields. The structure for these key fields is the same for both tables:
Column      Type      Length
Region      varchar   2
ID          varchar   9
Claim       varchar   6
SVCDate     datetime  8
Modifier    varchar   2
Provider    varchar   15
Profess     varchar   15
Place       varchar   2
ReportDate  datetime  8

Once the match is made, I need to look at ReportDate in TABLEDUPS, then delete from TABLEEVENTS all records except the one with the most recent date.

Thanks.
Feb 21 '08 #6
ck9663
2,878 Recognized Expert Specialist
Run this first:

    -- inner join rather than left join: a left join would also return every
    -- tableevents row that has no newer duplicate, and those rows must be kept
    select tableevents.*
    from tableevents
    inner join
    (select Region, ID, Claim, SVCDate, Modifier, Provider, Profess, Place, max(ReportDate) as latest from TABLEDUPS
    group by Region, ID, Claim, SVCDate, Modifier, Provider, Profess, Place) as dups
    on dups.Region = tableevents.Region and dups.ID = tableevents.ID
     and dups.Claim = tableevents.Claim and dups.SVCDate = tableevents.SVCDate
     and dups.Modifier = tableevents.Modifier and dups.Provider = tableevents.Provider
     and dups.Profess = tableevents.Profess and dups.Place = tableevents.Place
     and dups.latest > tableevents.ReportDate

If it returns exactly the records you want to remove, you can turn it into a DELETE query. I'm asking you to run the SELECT first so that you can check you're deleting the right rows before anything is actually removed. It's also always good to have a backup.
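
As a rough sketch of what the DELETE form might look like, assuming the table and column names posted earlier in this thread: the version below uses an inner join so that only rows with a newer duplicate are targeted, and wraps the delete in a transaction so the row count can be compared with the SELECT before anything is made permanent. The alias e and the column name rows_deleted are made up for illustration.

```sql
BEGIN TRANSACTION

-- Delete only the tableevents rows that have a newer ReportDate
-- recorded in TABLEDUPS for the same key fields.
DELETE e
FROM tableevents AS e
INNER JOIN
    (SELECT Region, ID, Claim, SVCDate, Modifier, Provider, Profess, Place,
            MAX(ReportDate) AS latest
     FROM TABLEDUPS
     GROUP BY Region, ID, Claim, SVCDate, Modifier, Provider, Profess, Place) AS dups
    ON  dups.Region   = e.Region
    AND dups.ID       = e.ID
    AND dups.Claim    = e.Claim
    AND dups.SVCDate  = e.SVCDate
    AND dups.Modifier = e.Modifier
    AND dups.Provider = e.Provider
    AND dups.Profess  = e.Profess
    AND dups.Place    = e.Place
    AND dups.latest   > e.ReportDate

-- Must be read immediately after the DELETE; compare it with the
-- number of rows the SELECT returned.
SELECT @@ROWCOUNT AS rows_deleted

-- Nothing is kept as written; swap this for COMMIT TRANSACTION
-- once the count matches what the SELECT showed.
ROLLBACK TRANSACTION
```

Because the key columns are compared with =, rows where any key field is NULL will not match and will be left untouched.
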

Happy coding

-- CK
Feb 21 '08 #7
jmstur2
6 New Member
Thanks. I'll give this a try.
Feb 22 '08 #8
