Hello to all,
I have a small or big problem with a customer data base, where during a change of
system we might have created duplicate records. This should be easy to find, you
might think, but, we are talking about roughly 10000 records or less in a total
volume of 1 MIO records or more.
I have considered a strategy: The station ID and a field with something like a
sequence number are supposed to be unique during that period. The sequence number
will repeat roughly every two - four years, so in the period of three months,
which is my problematic time, the sequence number (SeqNo) must be unique for each
of the 40 stations (StationID) that I record in this data base table.
I did run a simple 'create index' on these two fields, and, as to be expected, the
routine fails, blahblah, meaning, there are duplicates. Fine. So, how can I write
a query which lists to me all pairs (StationID, SeqNo) of data, where two or more
rows are identically?
If it makes things easier, I have a copy of that data base on a separate server,
not on the live system I mean. I could strip off all data which is not related to
that period of time.
Can someone give to me a hint? Or a select statement?
TS 4 4199
On Aug 1, 2:07*pm, Thomas Arthur Seidel <thomas.arthur. sei...@gmx.de>
wrote:
Hello to all,
I have a small or big problem with a customer data base, where during a change of
system we might have created duplicate records. This should be easy to find, you
might think, but, we are talking about roughly 10000 records or less in atotal
volume of 1 MIO records or more.
I have considered a strategy: The station ID and a field with something like a
sequence number are supposed to be unique during that period. The sequence number
will repeat roughly every two - four years, so in the period of three months,
which is my problematic time, the sequence number (SeqNo) must be unique for each
of the 40 stations (StationID) that I record in this data base table.
I did run a simple 'create index' on these two fields, and, as to be expected, the
routine fails, blahblah, meaning, there are duplicates. Fine. So, how canI write
a query which lists to me all pairs (StationID, SeqNo) of data, where twoor more
rows are identically?
If it makes things easier, I have a copy of that data base on a separate server,
not on the live system I mean. I could strip off all data which is not related to
that period of time.
Can someone give to me a hint? Or a select statement?
TS
You need to first figure out what you classify as a duplicate record?
If only one field in a record is different from the next record is it
a therefor not duplicate? If the only problem is a particular field
over a given period, then you simply run a procedure to count the
number of occurances for every ID over a given time. Or better yet
run a procedure that only returns fields that have a count greater
than 1.
Try this:
SELECT SeqNo, StationID, COUNT(*)
FROM Table
GROUP BY SeqNo, StationID
HAVING COUNT(*) 1;
Plamen Ratchev http://www.SQLStudio.com
To classify what is a duplicate record: If (StationID,SeqN o) is repeating. Mainly,
if there is more then one record of the same SeqNo for one StationId. Each of the
two values individually may repeat: Many records do come from the same station,
and every station starts with the same sequence Number (SeqNo), it starts with
"1", increments with every record, continues for years, until it is reset to "1"
again some nice and warm day, when the accountant does decide this. For the period
of three months, where I want to find my problem, for sure there is no regular way
to have two or more transactions from the same station with a repeated SeqNo. The
SeqNo must be unique for each station during this time.
TS
rhaazy wrote:
On Aug 1, 2:07 pm, Thomas Arthur Seidel <thomas.arthur. sei...@gmx.de>
wrote:
>Hello to all,
I have a small or big problem with a customer data base, where during a change of system we might have created duplicate records. This should be easy to find, you might think, but, we are talking about roughly 10000 records or less in a total volume of 1 MIO records or more.
I have considered a strategy: The station ID and a field with something like a sequence number are supposed to be unique during that period. The sequence number will repeat roughly every two - four years, so in the period of three months, which is my problematic time, the sequence number (SeqNo) must be unique for each of the 40 stations (StationID) that I record in this data base table.
I did run a simple 'create index' on these two fields, and, as to be expected, the routine fails, blahblah, meaning, there are duplicates. Fine. So, how can I write a query which lists to me all pairs (StationID, SeqNo) of data, where two or more rows are identically?
If it makes things easier, I have a copy of that data base on a separate server, not on the live system I mean. I could strip off all data which is not related to that period of time.
Can someone give to me a hint? Or a select statement?
TS
You need to first figure out what you classify as a duplicate record?
If only one field in a record is different from the next record is it
a therefor not duplicate? If the only problem is a particular field
over a given period, then you simply run a procedure to count the
number of occurances for every ID over a given time. Or better yet
run a procedure that only returns fields that have a count greater
than 1.
Works !
Unfortunately it proved also, that my idea of duplicate records was false.
TS
Plamen Ratchev wrote:
Try this:
SELECT SeqNo, StationID, COUNT(*)
FROM Table
GROUP BY SeqNo, StationID
HAVING COUNT(*) 1;
Plamen Ratchev http://www.SQLStudio.com This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics |
by: atse |
last post by:
Hi,
My table in the database may contain duplicate records, which means except
the primary key (auto_increment) field is unique, all or almost of all the
fields are with the same content. How can I select them to display and
delete them?
Thanks for any idea.
Atse
|
by: ms |
last post by:
Access 2000:
I am trying to delete duplicate records imported to a staging table leaving one
of the duplicates to be imported into the live table. A unique record is based
on a composite key of 3 fields (vehicleID, BattID, and ChgHrs). VehicleID and
BattID are a TEXT datatype and ChrHrs are a number(long int.) datatype. Since
records to be imported can have duplicate records of the composite key I need to
clean all but one of the...
|
by: KT |
last post by:
Is there any one click solution that would do the trick? I would like
to create a button, so the person who maintains the database can
perform clean up work to delete duplicate records which contain same
information in the ID field and the account number field once a week.
Thanks in advance!
KT
|
by: Carroll |
last post by:
I'm looking for a way in SQL to find duplicate records in a single
table, that are the same based on 3 columns, regardless of what is in
the other columns in the duplicate records. I would like to keep both
records (or it could be more than 2 as well) where duplicate records
are found. Also, I am interested in selecting all columns from the
duplicate records.
Thanks,
Carroll Rinehart
|
by: rgurganus |
last post by:
I'm trying to setup a way to find possible duplicate or near-duplicate
records of contact people. I have a table of 50k records doing an
inner join with itself, something like this:
SELECT p1.fields, p2.fields
FROM table AS p1 INNER JOIN table AS p2 ON p1.lastname = p2.lastname
WHERE p1.id <p2.id AND
(other criteria)
The id and name and other criteria fields are indexed as well as they
| |
by: nethravathy |
last post by:
Hi,
The following table namely elcbtripselect contains 5147 records.I want to know wether this table contains duplicate records or not.
I tried with following query
1)SELECT elcbtripselect.ELCBTRIP_voltsMIN, elcbtripselect.ELCBTRIP_voltsMAX, elcbtripselect.ELCBTrip_is_partwinding, elcbtripselect.ELCBTrip_is_ydelta, elcbtripselect.ELCBTRIP_starter_size, elcbtripselect.ELCBTRIP_UnitFunction, elcbtripselect.ELCBTRIP_strcb_speedi_frame_ty,...
|
by: sierra7 |
last post by:
I have found that I have duplicates in my Products table. I am using the DLookup() function to find a ProductID by a product 'criteria' and only create a new one if can not find a match e.g.
varX=dlookup("","ProductsTableName", strCriteria)
if isnull(varX) then
'Proceed to add new record and create new ProductID (autonumber)
else
'Use ProductID currently held in varX
endif
There are about 14,000 rows in the Products...
|
by: nomvula |
last post by:
hi guys
i need some help to duplicate records on my form datasheet:
here's the example of my form results:
ClientLookup DateCaptured ForecastDate Description ForecastQuantity Forecast Actual
UJ 18-Apr-08 01-Mar-08 Fees: Asset 1 R 31,200.00 R 31,200.00
NMBM 22-Apr-08 23-Mar-08 P-MI (E) 07/2006 3 R 47,485 R 38,849
i have 200 records deplayed in the form i'm using access2007 and i have a command button which is the built in command button to...
|
by: Dilip1983 |
last post by:
Hi All,
I want to delete duplicate records from a large table.
There is one index(INDEX_U1) on 4 columns(col1,col2,col3,col4) which is in unusable state.
First of all when i tried to rebuild index it showed error as unique key violation.
So i want to delete duplicate records for col1,col2,col3,col4 combination.
How can i delete the duplicate records from this large table?
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look !
Part I. Meaning of...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it.
First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
| |
by: Oralloy |
last post by:
Hello folks,
I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>".
The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed.
This is as boiled down as I can make it.
Here is my compilation command:
g++-12 -std=c++20 -Wnarrowing bit_field.cpp
Here is the code in...
|
by: conductexam |
last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one.
At the time of converting from word file to html my equations which are in the word document file was convert into image.
Globals.ThisAddIn.Application.ActiveDocument.Select();...
|
by: TSSRALBI |
last post by:
Hello
I'm a network technician in training and I need your help.
I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs.
The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols.
I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
|
by: adsilva |
last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
|
by: 6302768590 |
last post by:
Hai team
i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
|
by: muto222 |
last post by:
How can i add a mobile payment intergratation into php mysql website.
| |
by: bsmnconsultancy |
last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...
| |