
REQ HELP: Problem eliminating duplicates

Hi,

I have a table with duplicate records.
Some of the duplicates need to be eliminated from the table and some
need not.
A duplicate does not need to be eliminated if the same record appears
multiple times on a single employee's report.
However, if the same record appears on any other employee's report,
it means the employees collaborated on the item, and the copy from the
second employee's report should be eliminated from the table.

Each record includes the following fields:

ID
Period
Report
AgentID
Date
Invoice
JobID
Description
Country
Gross
Comm

A duplicate record is identified if the following fields are NOT the
same:

ID
AgentID

while the following fields ARE identical:

Date
Invoice
JobID
Description
Country
Gross

I have created an append query to update a copy of the original table
(structure only) with the fields that must be identical designated as
primary key fields.

Unfortunately, when this table is populated by the append query, some
duplicates are not eliminated (this seems to happen when more than two
employees are involved, so there are three or more duplicates of a
record), and (I think) some records that should not be eliminated
(duplicates from the same employee) are eliminated.

Will someone please help me figure this out?

Thanks,
MHenry

Some sample data (Description column omitted)

ID    Period  Report  AgentID  Date     Invoice   JobID           Country  Gross      Comm
1913  403     S       DA       3/30/04  I0019013  LEP HFL DA 21   England  $2,775.00  $555.00
1914  403     S       HFL      3/30/04  I0019013  LEP HFL DA 21   England  $2,775.00  $555.00
1915  403     S       LEP      3/30/04  I0019013  LEP HFL DA 21   England  $2,775.00  $555.00
1916  403     S       DA       3/30/04  I0019014  LEP HFL DA 21   USA      $3,000.00  $600.00
1917  403     S       HFL      3/30/04  I0019014  LEP HFL DA 21   USA      $3,000.00  $600.00
1918  403     S       LEP      3/30/04  I0019014  LEP HFL DA 21   USA      $3,000.00  $600.00
1919  403     S       HFL      3/30/04  I0019016  LEP HFL 200304  USA      $1,000.00  $300.00
1920  403     S       LEP      3/30/04  I0019016  LEP HFL 200304  USA      $1,000.00  $300.00
1922  403     S       LEP      3/30/04  I0019034  LEP HFL 200304  USA      $500.00    $150.00
1921  403     S       HFL      3/30/04  I0019034  LEP HFL 200304  USA      $500.00    $150.00
1923  403     S       LEP      3/31/04  I0019050  LEP 090104 F    England  $370.00    $111.00
1924  403     S       JFO      3/31/04  I0019050  LEP JFO 090104  England  $370.00    $111.00
Nov 13 '05 #1
Hi

I have copied and converted your data, and am re-posting it for you as CSV
text in case anyone else wants to attempt to help you, too:

"ID","Period","Report","AgentID","Date","Invoice", "JobID","Country","Gross",
"Comm"
1913.00,403.00,"S","DA",3/30/04 0:00:00,"I0019013","LEP HFL DA
21","England",$2775.00,$555.00
1914.00,403.00,"S","HFL",3/30/04 0:00:00,"I0019013","LEP HFL DA
21","England",$2775.00,$555.00
1915.00,403.00,"S","LEP",3/30/04 0:00:00,"I0019013","LEP HFL DA
21","England",$2775.00,$555.00
1916.00,403.00,"S","DA",3/30/04 0:00:00,"I0019014","LEP HFL DA
21","USA",$3000.00,$600.00
1917.00,403.00,"S","HFL",3/30/04 0:00:00,"I0019014","LEP HFL DA
21","USA",$3000.00,$600.00
1918.00,403.00,"S","LEP",3/30/04 0:00:00,"I0019014","LEP HFL DA
21","USA",$3000.00,$600.00
1919.00,403.00,"S","HFL",3/30/04 0:00:00,"I0019016","LEP HFL
200304","USA",$1000.00,$300.00
1920.00,403.00,"S","LEP",3/30/04 0:00:00,"I0019016","LEP HFL
200304","USA",$1000.00,$300.00
1922.00,403.00,"S","LEP",3/30/04 0:00:00,"I0019034","LEP HFL
200304","USA",$500.00,$150.00
1921.00,403.00,"S","HFL",3/30/04 0:00:00,"I0019034","LEP HFL
200304","USA",$500.00,$150.00
1923.00,403.00,"S","LEP",3/31/04 0:00:00,"I0019050","LEP 090104
F","England",$370.00,$111.00
1924.00,403.00,"S","JFO",3/31/04 0:00:00,"I0019050","LEP JFO
090104","England",$370.00,$111.00

Observations:
--------------

1.) All of the data appears to be a classic case of "flat-file" database
thinking, where you seem to think that all data needs to be in a single
table. Access is a relational database, where different elements of data are
stored in different, but related tables. This methodology allows you to
enter items ONCE, and then link them to related items in another table in
order to re-combine the elements and *display* them as though they are all
in one record.

I'd "study up" a bit on relational table design, then attempt to seperate
this data into the following tables if I were you:

tblJobs
tblInvoices
tblCountries
tblAgents
tblCommissions (if each Agent gets paid a separate commission)
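
To make that concrete, here is a rough Jet SQL sketch of the kind of tables
I mean. All of the names and field sizes are my guesses, not gospel --
adjust them to your actual data. (Access runs one DDL statement per query
and its SQL view doesn't accept comments, so execute each CREATE TABLE
separately and strip anything after the "--" marks before running.)

CREATE TABLE tblAgents (
    AgentID TEXT(10) CONSTRAINT PK_Agents PRIMARY KEY
);

CREATE TABLE tblJobs (
    JobID TEXT(50) CONSTRAINT PK_Jobs PRIMARY KEY,
    JobDescription TEXT(255)
);

CREATE TABLE tblInvoices (
    Invoice TEXT(20) CONSTRAINT PK_Invoices PRIMARY KEY,
    InvoiceDate DATETIME,    -- renamed from "Date"; see the P.S. below
    Country TEXT(50),
    Gross CURRENCY,
    Comm CURRENCY
);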

2.) A lot of the data appears to be just concatenated or formatted data from
other fields.
e.g. (from the first record) JobID = LEP HFL DA 21
I'm assuming that to mean that 3 AgentIDs (LEP, HFL, and DA) all
contributed to the completion of some task or sale... What does "21"
represent?
Period = 403... does that represent the 3rd month of 2004?
Gross = The total profit on the job?
Comm = Commission paid. Is that amount paid to each Agent, or is it split
evenly amongst them? Divided up in differently-sized portions?

3.) These appear to be the relationships, although I am not sure where some
fields (e.g., Period and Report) "fit", because the data is identical in all
records:
1 Job ==> Many Invoices
1 Job ==> Many Countries
1 Invoice ==> Many Agents
1 Invoice ==> 1 Gross, 1 Comm? (This implies that all agents share
equally?)

Oh waitaminute ...
1 Invoice ==> Many Jobs also? (Inv# I0019050 -- JobID's LEP 090104 F,
and LEP JFO 090104)
If so, then we're talking about a "Many-to-Many" relationship, which
requires a join table (sketched just below).
1 Job ==> Many - [tblJoinInvoiceJobs] - Many <== 1 Invoice
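
A minimal sketch of that join table (again, the names are only a
suggestion):

CREATE TABLE tblJoinInvoiceJobs (
    Invoice TEXT(20),
    JobID TEXT(50),
    CONSTRAINT PK_JoinInvoiceJobs PRIMARY KEY (Invoice, JobID)
);

The two-column primary key is what stops the same Invoice/Job pair from
being entered twice.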

So, there... that should get you started :-)
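
Oh, and since the mess is already sitting in one table, here is one way to
attack the immediate duplicate problem before any redesign. This is only a
sketch: tblReports is a placeholder for your real table name, and the
matching fields follow your own rule. A row is deleted only when a
*different* agent already reported the identical item under a lower ID, so
an agent's own repeats survive -- which should also cover the
three-or-more-agents case your append query chokes on. Back up the table,
and run the WHERE clause as a SELECT first to eyeball what would go.

DELETE T.*
FROM tblReports AS T
WHERE EXISTS (
    SELECT 1
    FROM tblReports AS D
    WHERE D.[Date] = T.[Date]
      AND D.Invoice = T.Invoice
      AND D.JobID = T.JobID
      AND D.Description = T.Description
      AND D.Country = T.Country
      AND D.Gross = T.Gross
      AND D.AgentID <> T.AgentID
      AND D.ID < T.ID
);

The last two conditions carry the logic: D.AgentID <> T.AgentID means some
other employee reported the same item, and D.ID < T.ID keeps exactly the
earliest copy.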

P.S.
I'd also advise you not to use *Reserved Words* like "Date" for your
field names... it *will* come back to haunt you eventually.
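
For example (tblReports is still my made-up name), once a field is called
Date you have to bracket it everywhere, or Access can mistake it for the
built-in Date() function:

SELECT T.AgentID, T.[Date], T.Invoice
FROM tblReports AS T
WHERE T.[Date] = #3/30/2004#;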
"MHenry" <MH****@NoSpam.net> wrote in message
news:22********************************@4ax.com...
<snip>

Nov 13 '05 #2

