473,385 Members | 2,028 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

Delete Duplicates

I'm having trouble figuring out how to delete some _almost_ duplicate
records in a look-up table. Here's the table:
CREATE TABLE [user_fields] (
[fKEY] [char] (16) NOT NULL ,
[SEQUENCE] [char] (2) NOT NULL ,
[FIELD_LABEL] [varchar] (20) NULL ,
[FIELD_VALUE] [varchar] (50) NULL ,
[EXPORT_DATE] [datetime] NULL ,
CONSTRAINT [PK_user_fields] PRIMARY KEY CLUSTERED
(
[fKEY],
[SEQUENCE]
)
CONSTRAINT [FK_USRFLD_INV_DOCID] FOREIGN KEY
(
[fKEY]
) REFERENCES [OTHER_TABLE] (
[PKEY]
)
)

Some values:
fKEY SEQUENCE FIELD_LABEL FIELD_VALUE
----------------------------------------------------------
8525645200692B89 19 Co. ID #
8525645200692B89 20 Co. ID #
8525645200692B89 21 Co. ID #
8525645200692B89 13 Co/Div/Dept
8525645200692B89 14 Co/Div/Dept
8525645200692B89 15 Co/Div/Dept
8525645200692B89 16 Division
8525645200692B89 17 Division
8525645200692B89 18 Division
8525645200692B89 10 Group
8525645200692B89 11 Group
8525645200692B89 12 Group
8525645200692B89 1 HR Contact JOHN NOVAK
8525645200692B89 2 HR Contact JOHN NOVAK
8525645200692B89 3 HR Contact JOHN NOVAK
8525645200692B89 24 Job Location
8525645200692B89 22 Job Location
8525645200692B89 23 Job Location
8525645200692B89 4 Manager
8525645200692B89 5 Manager
8525645200692B89 6 Manager
8525645200692B89 7 Recruiter
8525645200692B89 8 Recruiter
8525645200692B89 9 Recruiter
85256D740081C3A4 13 Co. ID #
85256D740081C3A4 14 Co. ID #
85256D740081C3A4 10 Co/Div/Dept
85256D740081C3A4 9 Co/Div/Dept
85256D740081C3A4 11 Division
85256D740081C3A4 12 Division
85256D740081C3A4 8 Group
85256D740081C3A4 7 Group
85256D740081C3A4 2 HR Contact Diana Tarry
85256D740081C3A4 1 HR Contact Diana Tarry
85256D740081C3A4 15 Job Location
85256D740081C3A4 16 Job Location
85256D740081C3A4 3 Manager
85256D740081C3A4 4 Manager
85256D740081C3A4 5 Recruiter
85256D740081C3A4 6 Recruiter

Note that fKEY 8525645200692B89 has three of every FIELD_LABEL, and
fKEY 85256D740081C3A4 has two. Both, however, should have only one.
Unfortunately, when I do a slect ... having count(*) > 1, I have nearly
900 different fKEYs with some variation of this problem.
It's just not coming to me how to delete the duplicates (except for
sequence). I don't care which of the sequence values I keep but as a
matter of preference I tried to do something using max(sequence) but,
so far, everything I've tried deletes all records for any given fKEY.
Help?
Thanks.
Randy

Jul 23 '05 #1
3 2319
My project manager is a smart cookie:

delete User_Fields
FROM
user_fields a
WHERE exists (
select null
from user_fields b
where a.fKEY= b. fKEY
and a.serverrepid = b.serverrepid
and a.FIELD_LABEL = b.FIELD_LABEL
group by FKEY, serverrepid, FIELD_LABEL
having count(*) > 1 and min(b.Sequence) <> a.sequence
)

Jul 23 '05 #2

EoRaptor013 wrote:
Note that fKEY 8525645200692B89 has three of every FIELD_LABEL, and
fKEY 85256D740081C3A4 has two. Both, however, should have only one.
Unfortunately, when I do a slect ... having count(*) > 1, I have nearly 900 different fKEYs with some variation of this problem.
It's just not coming to me how to delete the duplicates (except for
sequence). I don't care which of the sequence values I keep but as a
matter of preference I tried to do something using max(sequence) but,
The easiest would be the MAX(sequence) that you suggested. I'm just
typing this DML off the top of my head;

DELETE FROM user_fields
WHERE PK_user_fields
NOT IN (SELECT PK_user_fields
FROM user_fields
LEFT JOIN (SELECT PK_user_fields, MAX(SEQUENCE) AS MAXSeq
FROM PK_user_fields
GROUP BY PK_user_fields) AS MAXKEY
ON MAXKEY.PK_user_fields = user_fields.PK_user_fields
GROUP BY user_fields.PK_user_fields
HAVING MAXKEY.PK_user_fields IS NULL)

I'm actually more of an MSAccess guy, so my syntax may be off, but bear
with me :) This has an inner join collecting the max sequence; A
mid-join taking the parent-table to the MAX-value, and the outer DELETE
operating on the criteria of the RHS of the left-join being null.

so far, everything I've tried deletes all records for any given fKEY.
Help?


Make sure you back up if you try the above qry- :-)

HTH-

kDot

Jul 23 '05 #3
Have you thought about using a relational design for the table intead
of this? It looks like an EAV design flaw, with mixed data and metadata
in the same table. First you need a real key. Then you need to
understand that rows are not records and columns are not fields --
totally different things. You will never have data integrity with this
thing.

Jul 23 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

44
by: Xah Lee | last post by:
here's a large exercise that uses what we built before. suppose you have tens of thousands of files in various directories. Some of these files are identical, but you don't know which ones are...
1
by: Patrizio | last post by:
Hi All, I've the following table with a PK defined on an IDENTITY column (INSERT_SEQ): CREATE TABLE MYDATA ( MID NUMERIC(19,0) NOT NULL, MYVALUE FLOAT NOT NULL, TIMEKEY ...
3
by: Alexander Anderson | last post by:
I have a DELETE statement that deletes duplicate data from a table. It takes a long time to execute, so I thought I'd seek advice here. The structure of the table is little funny. The following is...
1
by: Smythe32 | last post by:
If anyone could help, I would appreciate it. I have a table as listed below. I need to check for duplicates by the OrderItem field and if there are duplicates, it then needs to keep the...
2
by: ms | last post by:
Access 2000: I am trying to delete duplicate records imported to a staging table leaving one of the duplicates to be imported into the live table. A unique record is based on a composite key of 3...
1
by: JC | last post by:
Hello, I am trying to change a select, find duplicates, query into a delete query. I want to get rid of the records in the main table "tblHurnsHistory." I changed the Find Duplicates query to...
3
by: rn5a | last post by:
I am trying to delete records from a MS-Access DB table using the following query: strSQL=DELETE FROM MyTable WHERE =CInt(Request.QueryString("delete")) con.Execute strSQL When I run the app...
2
by: Radu | last post by:
Hi. I have a "union" table which results of a union of two tables. Occasionally I could have duplicates, when the same PIN has been added to both tables, albeit at different Datees/Times, such...
7
by: AccessHunter | last post by:
I have a access query that has more than 400000 records including duplicates.First I need help in finding a way to filter the whole 400000 records and list all duplicates. Lets say ID, Job &...
4
by: moon24 | last post by:
Hi im working with linked list and i have to implement a function that deletes the duplicates of a number. for example if given 2 7 1 7 12 7 then the result should be 2 7 1 12 here is what I have:...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.