473,395 Members | 1,516 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,395 software developers and data experts.

Removing Duplicates based on most recent

bard777
23
I fully expect to have a duh moment when I read replies to this....BUT....

I have a table with duplicates in the [LN] field (could be 2 records with same value, might be 7 records). I need to delete all but the the most recent based on the [eDate] field.

I hope this is enough info.

Thanks.
May 12 '10 #1

✓ answered by NeoPa

This should handle the nesting of the subqueries and the duplications when [eDate]s match :
Expand|Select|Wrap|Line Numbers
  1. DELETE tDo.*
  2. FROM   [tblDuplicates] AS tDo
  3. WHERE  tDo.ID Not In(
  4.     SELECT   TOP 1
  5.              tDi.ID
  6.     FROM     [tblDuplicates] AS tDi
  7.     WHERE    tDi.LN=tDo.LN
  8.     ORDER BY tDi.[eDate] DESC
  9.            , tDi.ID DESC
  10.     )

12 9686
gershwyn
122 100+
@bard777
Do you have a unique key set up in your table?

If so, you could do something like this:
Expand|Select|Wrap|Line Numbers
  1. DELETE Table.* FROM Table
  2. WHERE [ID] NOT IN (SELECT Last(ID) FROM
  3. (SELECT ID, LN FROM Table ORDER BY eDate)
  4. GROUP BY LN);
That's a lot more sub-queries then I like to put into a criteria expression, but it gives you the idea. Also - do you ever have two records with the same LN and the same eDate? Would you want to keep them all, or does it matter which is deleted?

Things are a little trickier if there's no ID field. Another thing to consider is you could select the data you want with a GROUP BY clause and make a new table from it. I wouldn't recommend it if this is something you will be repeating, but it might be an option depending on your setup.
May 12 '10 #2
bard777
23
@gershwyn
Yes, I have an ID field, so this is a good lead, thanks.

It is possible that [eDate] could be the same as well, but in this case either will do. The real issue is there is no data validation on the front end.
May 12 '10 #3
Jim Doherty
897 Expert 512MB
@bard777
The following function will open a recordset examining two columns LN and edate in a table named Table1 and stack records in Ascending order of LN and descending descending order of edate. The procedure loops records skipping over the first LN number (ie the latest date for that LN value) and deletes any other records having the same LN value in the stack. It repeats this process for each individual LN number found until the end of the table. The end result is an individual LN value in the first column and in the second column the latest date for that LN number. Obviously test this out on a copy table to see if it fits your needs

Expand|Select|Wrap|Line Numbers
  1. Function Get_Rid_Of_LNRecords_Except_Latest_EDate()
  2. On Error GoTo Err_Get_Rid_Of_LNRecords_Except_Latest_EDate
  3. Dim db As DAO.Database
  4. Dim rst As DAO.Recordset
  5. Set db = CurrentDb
  6. mysql = "SELECT Table1.LN, Table1.eDate "
  7. mysql = mysql & "FROM Table1 "
  8. mysql = mysql & "ORDER BY Table1.LN, Table1.eDate DESC;"
  9.  
  10. ' open a recordset of two columns
  11. ' sorting records by LN first then the edate descending
  12. Set rst = db.OpenRecordset(mysql, dbOpenDynaset)
  13. ' if a dataset is available loop through records
  14. ' skipping the first LN record encountered and deleting the rest
  15. ' the first LN record encountered will be first LN record for any given date
  16.  
  17. mypointer = rst!LN
  18.  
  19.         Do While Not rst.BOF And Not rst.EOF
  20.             'skip the first one
  21.             rst.MoveNext
  22.                     If mypointer = rst!LN Then
  23.                        rst.Delete
  24.                     Else
  25.                     'ressign the pointer because we have encountered
  26.                     ' a new LN
  27.                         mypointer = rst!LN
  28.                     End If
  29.          Loop
  30.  
  31. rst.Close
  32. mysql = ""
  33. Set rst = Nothing
  34. Set db = Nothing
  35. MsgBox "Process Complete", vbInformation, "System Message"
  36. Exit_Get_Rid_Of_LNRecords_Except_Latest_EDate:
  37.     Exit Function
  38.  
  39. Err_Get_Rid_Of_LNRecords_Except_Latest_EDate:
  40.     If Err = 3021 Then
  41.         'just ignore because encountered last row
  42.     MsgBox "Process Complete", vbInformation, "System Message"
  43.     Else
  44.         MsgBox "Error " & Err.Number & " " & Err.Description
  45.     End If
  46.     Resume Exit_Get_Rid_Of_LNRecords_Except_Latest_EDate
  47.  
  48. End Function
You can run the above if you paste it as is into a new module and then place a command button on a form (without using the wizard) then in the property box for the command button at the 'On click' event just type this =Get_Rid_Of_LNRecords_Except_Latest_EDate()
May 12 '10 #4
bard777
23
@Jim Doherty
Thanks Jim...I had already started donig something similar so I went with that (for now anyways).

Here is what I have, but it is not working:

Expand|Select|Wrap|Line Numbers
  1. Set db = CurrentDb
  2. Dim rsDEDUPE As Recordset
  3. Dim varLN As String
  4. Dim varTic As Long
  5. Dim varToc As Long
  6.  
  7. varTic = 0 ' count total deleted duplicate records
  8. varToc = 0 ' count total for duplicate recordset
  9.  
  10. Set rsDEDUPE = db.OpenRecordset("_1duplicates for tbl_exceptions_CSV") ' rename query
  11. rsDEDUPE.MoveFirst
  12.  
  13. Do While Not rsDEDUPE.EOF
  14.  
  15.     If Not rsDEDUPE.EOF Then
  16.         varLN = rsDEDUPE(0)
  17.         rsDEDUPE.MoveNext
  18.         varToc = varToc + 1
  19.     End If
  20.  
  21. If Not rsDEDUPE.EOF Then
  22.    If Trim(varLN) = Trim(rsDEDUPE(0)) Then
  23.  
  24.         DoCmd.RunSQL ("DELETE * FROM tbl_exceptions_CSV WHERE [AutoKey] = " & rsDEDUPE(2))
  25.  
  26.         varTic = varTic + 1
  27.     End If
  28. End If
  29.  
  30. Loop
It is giving me a "Record Deleted" error on the live that is:

Expand|Select|Wrap|Line Numbers
  1. varLN = rsDEDUPE(0)
I have gotten this error in the past when I have a query that has a linked table that is linked to a linked table. The table that the query uses to create the recordset is local, so I don't know why this would be happening.
May 13 '10 #5
Jim Doherty
897 Expert 512MB
Have you looked at the logic of the one I sent you. I did take the time out to present you with something that aligns itself to your original post!

I am not so sure I understand what you are experiencing with the #deletion# thing. If you have an openrecordset and delete records in SQL (Delete From X Where etc etc) at the same time then obviously the recordset will reflect that
May 13 '10 #6
NeoPa
32,556 Expert Mod 16PB
This should handle the nesting of the subqueries and the duplications when [eDate]s match :
Expand|Select|Wrap|Line Numbers
  1. DELETE tDo.*
  2. FROM   [tblDuplicates] AS tDo
  3. WHERE  tDo.ID Not In(
  4.     SELECT   TOP 1
  5.              tDi.ID
  6.     FROM     [tblDuplicates] AS tDi
  7.     WHERE    tDi.LN=tDo.LN
  8.     ORDER BY tDi.[eDate] DESC
  9.            , tDi.ID DESC
  10.     )
May 14 '10 #7
Jim Doherty
897 Expert 512MB
@NeoPa
Just a small point Neopa' the correlated inner loop needs descending order edate sorting otherwise the poster will be left with the 'earliest date' as opposed to the latest date having the benefit of two insights SQL and VBA. What more can one need?
May 14 '10 #8
NeoPa
32,556 Expert Mod 16PB
You're absolutely right of course Jim :)

Strangely I've done a couple of similar ones today and thought I'd included that in this one, as I did in the other. Thanks for catching.
May 14 '10 #9
bard777
23
Thanks NeoPa, that did the trick! I thought it could all be done in a query, I just couldn't wrap my mind around the logic. I also found out something interesting about using TOP 1....it will return multiple records if they meet the criteria (I didn't include the ID field in the ORDER first time around), didn't know it would do that.

Jim, thanks for the time you spent on your method. I had already started working on some VBA along the same lines before I saw your code. I didn't ignore your effort, I was trying to work with what I had instead of re-writing at that point. I did mean to seem ungratelful for your help.
May 18 '10 #10
NeoPa
32,556 Expert Mod 16PB
No worries Bard. The TOP predicate catches many of us out believe me ;)
May 19 '10 #11
Jim Doherty
897 Expert 512MB
@bard777
Don't worry about it Bard I am not the 'sensitive soul' :) I much rather prefer that multiple methods came piling into threads the browsing individual not just yourself then has choice according to their particular flavour of working.

SQL,s TOP predicate will return more than 'one' record as you experienced because it returns TIES by default in Access (whereas in SQL Server it does not you explicitly request it) You simply code that out by using the DISTINCT predicate
May 19 '10 #12
NeoPa
32,556 Expert Mod 16PB
Interesting Jim (DISTINCT predicate of the SELECT clause for handling ties). I normally include a unique reference field in the sorting to avoid them (as in current thread).
May 19 '10 #13

Sign in to post your reply or Sign up for a free account.

Similar topics

7
by: Nova's Taylor | last post by:
Hi folks, I am a newbie to Python and am hoping that someone can get me started on a log parser that I am trying to write. The log is an ASCII file that contains a process identifier (PID),...
20
by: Rubinho | last post by:
I've a list with duplicate members and I need to make each entry unique. I've come up with two ways of doing it and I'd like some input on what would be considered more pythonic (or at least...
2
by: Shaiguy | last post by:
I have a table containing the following fields: ProjectUpdateID (PrimaryKey) ProjectID UpdateDate I would like to create a Query in Ms Access 2000 which will return me the most recent 2...
3
by: William Wisnieski | last post by:
Hello Everyone, I'm helping out a non-profit school with their database. They would like to know the last gift made by each donor, the donor name, and the gift amount. I built a query based...
4
by: Lee | last post by:
I have an application where I need to find the most recent file placed in a folder. For example, my users periodically place an image file with the format "ImageXXXX.jpg" (where XXXX is some...
9
by: psuaudi | last post by:
i have a table with dates that events occured. it looks something like this: 11/18/2006 1:00PM Open 11/18/2006 1:25PM Close I created a select query that selects the last (most recent)...
8
by: Jason H | last post by:
Hi, I am sure I am just overlooking the obvious, but I am having a little trouble with this one... I am setting up an inventory database that tracks company tools and their location. This...
4
by: Sector 7G | last post by:
I'm working with a SQL query for a Human Resources database. Its intended purpose is to find all the paycheck records with a check date (prckhist.chkdate ) more recent than eleven days past the...
2
by: robert.waters | last post by:
I need to perform the following: - select the most recent X number of records in a table (there is a timestamp field) - select the Nth occurrence of X number of records ex: - most recent 10...
4
by: zacks | last post by:
Most applications whose purpose is to work with various types of files implement a "Most Recent Files" list, where the last, say, four files accessed by the application can quickly be re-opened by...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.