473,411 Members | 1,975 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,411 software developers and data experts.

Conditional Delete of Duplicate Records from Access

I have a access query that has more than 400000 records including duplicates.First I need help in finding a way to filter the whole 400000 records and list all duplicates.
Lets say ID, Job & Description are the columns that make duplicates. For example, see the 2 rows below,

12345678, Cashier, Emergency Money
12345678, Cashier

I want to delete the second row because the its third column is blank.

Sometimes the rows could be like this too,

89898989, , Financial Aid
89898989, Accountant, Financial Aid

In this example, I want to delete the first row because its second column is blank.

Any help will be really appreciated.
Nov 3 '07 #1
7 5525
Jim Doherty
897 Expert 512MB
I have a access query that has more than 400000 records including duplicates.First I need help in finding a way to filter the whole 400000 records and list all duplicates.
Lets say ID, Job & Description are the columns that make duplicates. For example, see the 2 rows below,

12345678, Cashier, Emergency Money
12345678, Cashier

I want to delete the second row because the its third column is blank.

Sometimes the rows could be like this too,

89898989, , Financial Aid
89898989, Accountant, Financial Aid

In this example, I want to delete the first row because its second column is blank.

Any help will be really appreciated.
Hi AccessHunter,

Post your table name and fieldnames please for the table on which you wish to perfom the deletions that way any contributors can properly reference them in any coded responses given

Thanks

Jim :)
Nov 3 '07 #2
Hi AccessHunter,

Post your table name and fieldnames please for the table on which you wish to perfom the deletions that way any contributors can properly reference them in any coded responses given

Thanks

Jim :)
Sorry abotu not mentioning the correct table and column names.
Table Name - Loan Officer Details
Column names - Clerk ID, Position, Description

My requirement is to find any duplicate rows for (Clerk ID + Position + Description) and then delete the rows as per my previous email.

Thanks for the help.
Nov 3 '07 #3
Jim Doherty
897 Expert 512MB
Sorry abotu not mentioning the correct table and column names.
Table Name - Loan Officer Details
Column names - Clerk ID, Position, Description

My requirement is to find any duplicate rows for (Clerk ID + Position + Description) and then delete the rows as per my previous email.

Thanks for the help.

Given your datasize it might be wise to run this SQL over your table first to determine how many duplications you have where either one or the other columns ie Position or Description is null or zero length string.

In order for you to do this you must add an extra field to the table called RowID and set its datatype to autonumber (which will automatically populate it with data when you switch to table view) or 'Number' datatype in which case you must populate it yourself with sequential numbers in order to identify each individual row. The identification of each row is important because you would need to identify where you are within the recordset during any subsequent traversing of the data within the SQL.

The assumption I have made here is that IF ALL THREE fields are complete then duplications in that regard are OK and you would not want to delete those records, as they are complete and form records on the many side of a one to many relationship with another table? I don't know your database schema.

Have a look at this query first see what it shows you and then give me an idea on the number of rows it returns, highest and lowest counts etc.

Expand|Select|Wrap|Line Numbers
  1. SELECT [Loan Officer Details].[Clerk ID], [Loan Officer Details].Position, [Loan Officer Details].Description, Count([Loan Officer Details].RowID) AS Duplications, Min([Loan Officer Details].RowID) AS MinimumRow
  2. FROM [Loan Officer Details]
  3. GROUP BY [Loan Officer Details].[Clerk ID], [Loan Officer Details].Position, [Loan Officer Details].Description
  4. HAVING ((([Loan Officer Details].Position)="" Or ([Loan Officer Details].Position) Is Null) AND (([Loan Officer Details].Description)<>"" Or ([Loan Officer Details].Description) Is Not Null)) OR ((([Loan Officer Details].Position)<>"" Or ([Loan Officer Details].Position) Is Not Null) AND (([Loan Officer Details].Description)="" Or ([Loan Officer Details].Description) Is Null))
  5. ORDER BY [Loan Officer Details].[Clerk ID];
The fourth column of this query is designed to to give you a count of the number of duplications that exist for each Clerk ID shown in column 1 for each row where either one or the other columns have data in them or not.

The fifth column is designed to reflect the minimum ROWID number for that duplication set. Logic can ultimately be developed on that, much as to say something like this:

"Loop through the Loan Officer Details table and delete all records for the given Clerk ID displayed in this query row but only where the Position and Description fields are equal to the Position and Description fields shown in this query AND only where the ROWID is greater than the minimum RowID number that this query is displaying".

If this logic agrees with you and your stored data (the content of which you will see aggregated by ClerkID when you run this query) then we can advance to writing something that pulls out data, traverses it and deletes records according to that logic.

Theres no point in doing anything if my assumptions are wrong or if your data shows you something that you didnt expect? and which might get inadavertently deleted if any program flow didnt take it into account.

Regards

Jim :)
Nov 4 '07 #4
Given your datasize it might be wise to run this SQL over your table first to determine how many duplications you have where either one or the other columns ie Position or Description is null or zero length string.

In order for you to do this you must add an extra field to the table called RowID and set its datatype to autonumber (which will automatically populate it with data when you switch to table view) or 'Number' datatype in which case you must populate it yourself with sequential numbers in order to identify each individual row. The identification of each row is important because you would need to identify where you are within the recordset during any subsequent traversing of the data within the SQL.

The assumption I have made here is that IF ALL THREE fields are complete then duplications in that regard are OK and you would not want to delete those records, as they are complete and form records on the many side of a one to many relationship with another table? I don't know your database schema.

Have a look at this query first see what it shows you and then give me an idea on the number of rows it returns, highest and lowest counts etc.

Expand|Select|Wrap|Line Numbers
  1. SELECT [Loan Officer Details].[Clerk ID], [Loan Officer Details].Position, [Loan Officer Details].Description, Count([Loan Officer Details].RowID) AS Duplications, Min([Loan Officer Details].RowID) AS MinimumRow
  2. FROM [Loan Officer Details]
  3. GROUP BY [Loan Officer Details].[Clerk ID], [Loan Officer Details].Position, [Loan Officer Details].Description
  4. HAVING ((([Loan Officer Details].Position)="" Or ([Loan Officer Details].Position) Is Null) AND (([Loan Officer Details].Description)<>"" Or ([Loan Officer Details].Description) Is Not Null)) OR ((([Loan Officer Details].Position)<>"" Or ([Loan Officer Details].Position) Is Not Null) AND (([Loan Officer Details].Description)="" Or ([Loan Officer Details].Description) Is Null))
  5. ORDER BY [Loan Officer Details].[Clerk ID];
The fourth column of this query is designed to to give you a count of the number of duplications that exist for each Clerk ID shown in column 1 for each row where either one or the other columns have data in them or not.

The fifth column is designed to reflect the minimum ROWID number for that duplication set. Logic can ultimately be developed on that, much as to say something like this:

"Loop through the Loan Officer Details table and delete all records for the given Clerk ID displayed in this query row but only where the Position and Description fields are equal to the Position and Description fields shown in this query AND only where the ROWID is greater than the minimum RowID number that this query is displaying".

If this logic agrees with you and your stored data (the content of which you will see aggregated by ClerkID when you run this query) then we can advance to writing something that pulls out data, traverses it and deletes records according to that logic.

Theres no point in doing anything if my assumptions are wrong or if your data shows you something that you didnt expect? and which might get inadavertently deleted if any program flow didnt take it into account.

Regards

Jim :)

Hi Jim,

I haven't yet tried your query yet.I want to add one more condition to the delete. Once I find the duplicates if the Description has the text "Emergency" I want to delete that record.

Example,

12345678, Loan Officer, Emergency Loan Person
12345678, , Special Loan

In this case even though the second column is blank for the 2nd record, I want to delete the 1st record which has the text "Emengency" in the Description field.

Sorry about not mentioning this before.

Thanks.
Nov 4 '07 #5
Jim Doherty
897 Expert 512MB
Hi Jim,

I haven't yet tried your query yet.I want to add one more condition to the delete. Once I find the duplicates if the Description has the text "Emergency" I want to delete that record.

Example,

12345678, Loan Officer, Emergency Loan Person
12345678, , Special Loan

In this case even though the second column is blank for the 2nd record, I want to delete the 1st record which has the text "Emengency" in the Description field.

Sorry about not mentioning this before.

Thanks.

Thats ok, I got that . Just try that query when you have time and get back to me. No point in coding anything until we can satisfy ourselves that we are happy with everything. Wouldn't want anything to bite us in the rear.

Jim :)
Nov 4 '07 #6
Using your idea I was able to group my table more efficiently, Now my real table looks like this,

Fields :- Job Num, Seq Num, Vendor Last Name, Vendor First Name, Vendor Middle Initial, Agency, TranDate

Data will be as follows,

27809, 1, MILLER , , , Young Men's Association , 12/21/2002
56634, 1, PEOPLES, , , People Inc , 12/02/2003
22345, 2, COX , , , Family Business Corporation, 11/19/2003
22345, 2, COX , , , Family Links , 11/19/2003

My idea to seperate the duplicates is as follows,

First look at TranDate to see if there are any duplicates
IF found,
Look to see if Job Num+Seq Num gives any duplicates
If Yes,
Those rows are duplicates for that TranDate .

Basically, what I am trying to do is eliminate duplicate Job Num/Seq Num for a given TranDate. Multiple data entry people has entered the same information 2 or more times. I am trying to first find those duplicates now. My next step will be to identify the correct row to be deleted.


Thanks
Nov 5 '07 #7
Jim Doherty
897 Expert 512MB
Using your idea I was able to group my table more efficiently, Now my real table looks like this,

Fields :- Job Num, Seq Num, Vendor Last Name, Vendor First Name, Vendor Middle Initial, Agency, TranDate

Data will be as follows,

27809, 1, MILLER , , , Young Men's Association , 12/21/2002
56634, 1, PEOPLES, , , People Inc , 12/02/2003
22345, 2, COX , , , Family Business Corporation, 11/19/2003
22345, 2, COX , , , Family Links , 11/19/2003

My idea to seperate the duplicates is as follows,

First look at TranDate to see if there are any duplicates
IF found,
Look to see if Job Num+Seq Num gives any duplicates
If Yes,
Those rows are duplicates for that TranDate .

Basically, what I am trying to do is eliminate duplicate Job Num/Seq Num for a given TranDate. Multiple data entry people has entered the same information 2 or more times. I am trying to first find those duplicates now. My next step will be to identify the correct row to be deleted.


Thanks
On the basis of that description I'll leave it with you. You can get a simple Access duplicates query to do that for you

Jim
Nov 5 '07 #8

Sign in to post your reply or Sign up for a free account.

Similar topics

3
by: Jouke Langhout | last post by:
Hello all! For quite some time now, I've got the following problem: Access won't close properly when a user closes the application. An ACCESS process stays active and that process can only be...
5
by: hpi | last post by:
Hello, I have a table : Batch It contains fields batchnummer : Number (Long Integer) datum : Date/Time status : Number (Long Integer) nr_records : Number (Long Integer)
2
by: ms | last post by:
Access 2000: I am trying to delete duplicate records imported to a staging table leaving one of the duplicates to be imported into the live table. A unique record is based on a composite key of 3...
4
by: KT | last post by:
Is there any one click solution that would do the trick? I would like to create a button, so the person who maintains the database can perform clean up work to delete duplicate records which...
16
by: Theodore70 | last post by:
I am trying to delete duplicate records in an access table using VB. Not a programmer here, but this was one of the codes that I found on the web. I made some modifications to it, but it still did...
3
by: itoxic07 | last post by:
how to delete the records ? i am having problem deleting the records including the duplicate records from access database. I inserted a textbox on a form ,the name written in a textbox must be...
3
by: rajeshkrsingh | last post by:
Hi friends, Step1- create table duplicate ( intId int, varName varchar(50) ) insert into duplicate(intId,varName) values(1,'rajesh') insert into duplicate(intId,varName) values(2,'raj12')...
7
by: jmstur2 | last post by:
I have a table with what I consider duplicate records. Data in all columns are duplicate except for the date column, meaning that duplicate data was entered on different dates and those dates were...
6
by: Dilip1983 | last post by:
Hi All, I want to delete duplicate records from a large table. There is one index(INDEX_U1) on 4 columns(col1,col2,col3,col4) which is in unusable state. First of all when i tried to rebuild...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
0
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.