I have a table with what I consider duplicate records. Data in all columns are duplicate except for the date column, meaning that duplicate data was entered on different dates and those dates were stored along with the data.
I want to delete all of the records except for the most recent one. I can select the client ID field and the max(date field) to determine the ones I want to keep, but how do I determine the ones to delete? There are often more than two duplicates, so the min(date field) alone doesn't do it.
Any suggestions or guidance will be appreciated!
I use:
Microsoft SQL Enterprise Manager 8.0 running on Windows XP SP2
Paste this code into Query Analyzer and see if it does what you need:

--Set up a table and add some data for the example to work with
create table tblDulicateDates([Num1] [tinyint],[Num2] [tinyint],[dte] [datetime])

delete from tblDulicateDates
insert into tblDulicateDates select 1,1,'2007-01-01'
insert into tblDulicateDates select 1,1,'2007-01-02'
insert into tblDulicateDates select 1,1,'2007-01-03'
insert into tblDulicateDates select 1,2,'2007-01-01'
insert into tblDulicateDates select 1,2,'2007-01-02'
insert into tblDulicateDates select 1,2,'2007-01-03'
insert into tblDulicateDates select 1,3,'2007-01-01'

--show the table contents with the duplicate records except for date
select * from tblDulicateDates

--Declare the necessary variables
Declare @ThereAreDuplicates int,@Num1 int,@Num2 int, @Dte datetime

--see if there are any duplicate records
set @ThereAreDuplicates=(select count(a.num1) from
    (select num1,num2,min(Dte) as Dte from tblDulicateDates group by num1,num2)a
    join
    (select num1,num2,max(Dte) as Dte from tblDulicateDates group by num1,num2)b on a.num1=b.num1 and a.num2=b.num2
    where a.dte<>b.dte)

--if there are duplicates then enter the loop
while @ThereAreDuplicates > 0
BEGIN
    --select the duplicates that need to be deleted into a cursor
    --(the oldest row of every group whose oldest and newest dates differ)
    DECLARE DuplicatesCursor CURSOR FOR
    select a.num1,a.num2,a.dte from
    (select num1,num2,min(Dte) as Dte from tblDulicateDates group by num1,num2)a
    join
    (select num1,num2,max(Dte) as Dte from tblDulicateDates group by num1,num2)b on a.num1=b.num1 and a.num2=b.num2
    where a.dte<>b.dte

    OPEN DuplicatesCursor
    FETCH NEXT FROM DuplicatesCursor
    INTO @Num1,@Num2,@Dte

    --enter a loop that deletes each of the records in the cursor
    WHILE @@FETCH_STATUS = 0
    BEGIN
        DELETE FROM tblDulicateDates where Num1=@Num1 and Num2=@Num2 and Dte=@Dte

        FETCH NEXT FROM DuplicatesCursor
        INTO @Num1,@Num2,@Dte
    END

    CLOSE DuplicatesCursor
    DEALLOCATE DuplicatesCursor

    --Check to see if there are any more duplicates still in the table
    --This is to handle the case where there are 3 or more duplicate records
    set @ThereAreDuplicates=(select count(a.num1) from
        (select num1,num2,min(Dte) as Dte from tblDulicateDates group by num1,num2)a
        join
        (select num1,num2,max(Dte) as Dte from tblDulicateDates group by num1,num2)b on a.num1=b.num1 and a.num2=b.num2
        where a.dte<>b.dte)
END

--now show the table contents
-- no duplicates and only the ones that had the max date are left
select * from tblDulicateDates

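As a possible alternative to the cursor loop above, the same cleanup can probably be done with a single set-based DELETE: remove every row whose date is earlier than the latest date for its Num1/Num2 pair. The sketch below is only an illustration, not the method posted above. It assumes Python's sqlite3 module and an in-memory SQLite database so that it is self-contained, stores the dates as ISO text (which sorts chronologically), and uses made-up helper names (make_sample_db, delete_older_duplicates). The DELETE relies only on a correlated subquery, which SQL Server also supports, but it has not been tested there.

```python
import sqlite3

# Same sample rows as the example table above.
SAMPLE_ROWS = [
    (1, 1, "2007-01-01"),
    (1, 1, "2007-01-02"),
    (1, 1, "2007-01-03"),
    (1, 2, "2007-01-01"),
    (1, 2, "2007-01-02"),
    (1, 2, "2007-01-03"),
    (1, 3, "2007-01-01"),
]


def make_sample_db():
    """Build an in-memory SQLite copy of the example table (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    # Assumption: dates are kept as ISO text, so '<' compares them chronologically.
    conn.execute("CREATE TABLE tblDulicateDates (Num1 INTEGER, Num2 INTEGER, dte TEXT)")
    conn.executemany("INSERT INTO tblDulicateDates VALUES (?, ?, ?)", SAMPLE_ROWS)
    return conn


def delete_older_duplicates(conn):
    """Delete every row older than the newest row sharing its Num1 and Num2."""
    cur = conn.execute(
        """
        DELETE FROM tblDulicateDates
        WHERE dte < (SELECT MAX(d.dte)
                     FROM tblDulicateDates AS d
                     WHERE d.Num1 = tblDulicateDates.Num1
                       AND d.Num2 = tblDulicateDates.Num2)
        """
    )
    return cur.rowcount  # number of rows removed


conn = make_sample_db()
removed = delete_older_duplicates(conn)
remaining = conn.execute(
    "SELECT Num1, Num2, dte FROM tblDulicateDates ORDER BY Num1, Num2"
).fetchall()
print(removed, remaining)
```

On the sample data this removes four rows and leaves one row per Num1/Num2 pair, with no loop needed for groups of three or more. Rows that tie on the latest date are all kept, as they are with the cursor version.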
regards
Thank you. I will give this a try!
-Jeanne
The sample code worked well. It didn't quite do what I need, though. I tried to fit it into my situation, but had no luck.
I have been given a table (TABLEDUPS) of known duplicate records. I need to delete records from a different table (TABLEEVENTS) based on this duplicate records table.
So what I really need to do is look at TABLEDUPS to determine the duplicate records to keep (most recent) and delete records in TABLEEVENTS that match the remaining record(s) in TABLEDUPS.
Any ideas...? Thanks.
"delete records in TABLEEVENTS that match the remaining record(s) in TABLEDUPS."
Could you define "match"? What columns will you use to determine whether a record matches? It would help if you posted the structure of those two tables.
-- CK
They need to match on key fields. The structure for these key fields is the same for both tables:
Field       Type      Length
Region      varchar   2
ID          varchar   9
Claim       varchar   6
SVCDate     datetime  8
Modifier    varchar   2
Provider    varchar   15
Profess     varchar   15
Place       varchar   2
ReportDate  datetime  8
Once the match is made, I need to look at ReportDate in TABLEDUPS, then delete from TABLEEVENTS all records except the one with the most recent date.
Thanks.
Run this first:

-- inner join, not left join: a left join would return every row in tableevents,
-- including the ones with no older duplicate
select tableevents.*
from tableevents
inner join
    (select Region, ID, Claim, SVCDate, Modifier, Provider, Profess, Place, max(ReportDate) as latest from TABLEDUPS
     group by Region, ID, Claim, SVCDate, Modifier, Provider, Profess, Place) as dups
on dups.Region = tableevents.Region and dups.ID = tableevents.ID
and dups.Claim = tableevents.Claim and dups.SVCDate = tableevents.SVCDate
and dups.Modifier = tableevents.Modifier and dups.Provider = tableevents.Provider
and dups.Profess = tableevents.Profess and dups.Place = tableevents.Place
and dups.latest > tableevents.ReportDate

If it returns the records you want to remove, you can delete them: just turn it into a DELETE query. I'm asking you to run the SELECT first so that you can check that you're deleting the right rows and nothing else. Even so, it's always good to have a backup.
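To illustrate the "SELECT first, then DELETE" idea, here is a rough sketch that builds one predicate and reuses it for both statements, so the preview and the delete are guaranteed to touch the same rows. It is an assumption-laden example rather than a tested SQL Server script: it uses Python's sqlite3 module with an in-memory database, stores every field as text, invents a few sample rows, and the helper names (make_sample_db, preview_rows_to_delete, delete_older_events) are hypothetical. Only the table names and key fields come from this thread.

```python
import sqlite3

# Key fields shared by both tables, as listed earlier in the thread.
KEY_COLUMNS = ["Region", "ID", "Claim", "SVCDate", "Modifier", "Provider", "Profess", "Place"]

# One predicate reused by the SELECT and the DELETE so both touch the same rows:
# an event is "older" if TABLEDUPS holds a later ReportDate for the same key.
MATCH = " AND ".join(f"d.{col} = TABLEEVENTS.{col}" for col in KEY_COLUMNS)
OLDER_THAN_LATEST = (
    "TABLEEVENTS.ReportDate < "
    f"(SELECT MAX(d.ReportDate) FROM TABLEDUPS AS d WHERE {MATCH})"
)


def make_sample_db():
    """Create both tables in memory with made-up rows (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f"{col} TEXT" for col in KEY_COLUMNS + ["ReportDate"])
    for table in ("TABLEDUPS", "TABLEEVENTS"):
        conn.execute(f"CREATE TABLE {table} ({columns})")
    placeholders = ", ".join("?" * (len(KEY_COLUMNS) + 1))
    key_a = ("01", "C001", "CL0001", "2007-01-01", "AA", "PROV1", "PROF1", "11")
    key_b = ("02", "C002", "CL0002", "2007-01-01", "AA", "PROV1", "PROF1", "11")
    # key_a is a known duplicate: three report dates in both tables.
    for report_date in ("2007-02-01", "2007-02-02", "2007-02-03"):
        row = key_a + (report_date,)
        conn.execute(f"INSERT INTO TABLEDUPS VALUES ({placeholders})", row)
        conn.execute(f"INSERT INTO TABLEEVENTS VALUES ({placeholders})", row)
    # key_b exists only in TABLEEVENTS, so it must never be deleted.
    conn.execute(f"INSERT INTO TABLEEVENTS VALUES ({placeholders})", key_b + ("2007-01-15",))
    return conn


def preview_rows_to_delete(conn):
    """Step 1: look at the rows the DELETE would remove."""
    sql = f"SELECT * FROM TABLEEVENTS WHERE {OLDER_THAN_LATEST} ORDER BY ReportDate"
    return conn.execute(sql).fetchall()


def delete_older_events(conn):
    """Step 2: delete exactly those rows and return how many were removed."""
    return conn.execute(f"DELETE FROM TABLEEVENTS WHERE {OLDER_THAN_LATEST}").rowcount


conn = make_sample_db()
for row in preview_rows_to_delete(conn):
    print("would delete:", row)
print("deleted:", delete_older_events(conn))
```

Events with no matching key in TABLEDUPS are left alone, because the subquery returns NULL for them and the comparison is then not true.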
Happy coding
-- CK
Thanks. I'll give this a try.