Ok, here is a asample table representing the problem more clearly
A | B | C | D
-----------------
a1 b1 c1 d1
a1 b2 c2 d2
a3 b3 c1 d3
a4 b4 c4 d3
a5 b5 c5 d5
a6 b6 c6 d3
Tha duplications are:
row 1+2 in param A
row 1+3 in param C
row 3+4+6 in param D
only row 5 is unique in all parameters.
conclusion: row 1+2+3+4+6 are the same user
goal: to find all duplicated rows & to delete them all accept one
instance to leave.
Note:
Finding that row 1similar to 2 in A & deleting it will loose data
because we won't know that row 1 is ALSO similar to 3 on C & later on
finding that 3 is similar to 4 & 6 on D & so on
The simple time consuming (about 2 weaks) query to acomplish the task
is:
SELECT count(*),A.B,C,D
FROM tbl
GROUP BY A,B,C,D
HAVING count(*)>1
I THANK YOU ALL 3 1364
> SELECT count(*),A.B,C,D FROM tbl GROUP BY A,B,C,D HAVING count(*)>1
If that query takes 2 weeks to run then you've got serious and I mean
serious problems with your hardware, that type of query should take minutes
if not seconds most desktops let alone servers.
To aid performance for that specific query create your clustered index on A,
B, C and D.
--
Tony Rogerson
SQL Server MVP http://sqlblogcasts.com/blogs/tonyrogerson - technical commentary from a SQL
Server Consultant http://sqlserverfaq.com - free video tutorials
"groupy" <li*******@gmail.com> wrote in message
news:11**********************@i39g2000cwa.googlegr oups.com... Ok, here is a asample table representing the problem more clearly
A | B | C | D ----------------- a1 b1 c1 d1 a1 b2 c2 d2 a3 b3 c1 d3 a4 b4 c4 d3 a5 b5 c5 d5 a6 b6 c6 d3
Tha duplications are: row 1+2 in param A row 1+3 in param C row 3+4+6 in param D only row 5 is unique in all parameters. conclusion: row 1+2+3+4+6 are the same user goal: to find all duplicated rows & to delete them all accept one instance to leave.
Note: Finding that row 1similar to 2 in A & deleting it will loose data because we won't know that row 1 is ALSO similar to 3 on C & later on finding that 3 is similar to 4 & 6 on D & so on
The simple time consuming (about 2 weaks) query to acomplish the task is: SELECT count(*),A.B,C,D FROM tbl GROUP BY A,B,C,D HAVING count(*)>1
I THANK YOU ALL
Thanks Tony i've just managed something..
Tony Rogerson כתב: SELECT count(*),A.B,C,D FROM tbl GROUP BY A,B,C,D HAVING count(*)>1
If that query takes 2 weeks to run then you've got serious and I mean serious problems with your hardware, that type of query should take minutes if not seconds most desktops let alone servers.
To aid performance for that specific query create your clustered index onA, B, C and D.
-- Tony Rogerson SQL Server MVP http://sqlblogcasts.com/blogs/tonyrogerson - technical commentary from a SQL Server Consultant http://sqlserverfaq.com - free video tutorials
"groupy" <li*******@gmail.com> wrote in message news:11**********************@i39g2000cwa.googlegr oups.com... Ok, here is a asample table representing the problem more clearly
A | B | C | D ----------------- a1 b1 c1 d1 a1 b2 c2 d2 a3 b3 c1 d3 a4 b4 c4 d3 a5 b5 c5 d5 a6 b6 c6 d3
Tha duplications are: row 1+2 in param A row 1+3 in param C row 3+4+6 in param D only row 5 is unique in all parameters. conclusion: row 1+2+3+4+6 are the same user goal: to find all duplicated rows & to delete them all accept one instance to leave.
Note: Finding that row 1similar to 2 in A & deleting it will loose data because we won't know that row 1 is ALSO similar to 3 on C & later on finding that 3 is similar to 4 & 6 on D & so on
The simple time consuming (about 2 weaks) query to acomplish the task is: SELECT count(*),A.B,C,D FROM tbl GROUP BY A,B,C,D HAVING count(*)>1
I THANK YOU ALL
A | B | C | D
-----------------
a1 b1 c1 d1
a1 b2 c2 d2
a3 b3 c1 d3
a4 b4 c4 d3
a5 b5 c5 d5
a6 b6 c6 d3
I am going to guess at the DDL and add another column
CREATE TABLE Foobar
(a CHAR(2) NOT NULL,
b CHAR(2) NOT NULL,
c CHAR(2) NOT NULL,
d CHAR(2) NOT NULL,
dups INTEGER DEFAULT 0 NOT NULL
CHECK(dups >= 0),
PRIMARY KEY(a, b, c, d));
INSERT INTO Foobar(a, b, c, d) VALUES ('a1', 'b1', 'c1', 'd1');
INSERT INTO Foobar(a, b, c, d) VALUES ('a1', 'b2', 'c2', 'd2');
INSERT INTO Foobar(a, b, c, d) VALUES ('a3', 'b3', 'c1', 'd3');
INSERT INTO Foobar(a, b, c, d) VALUES ('a4', 'b4', 'c4', 'd3');
INSERT INTO Foobar(a, b, c, d) VALUES ('a5', 'b5', 'c5', 'd5');
INSERT INTO Foobar(a, b, c, d) VALUES ('a6', 'b6', 'c6', 'd3');
Since you seem to want to preserve some of the information about
duplications, you can keep a tally
UPDATE Foobar
SET dups
= dups
+ (SELECT 1
FROM Foobar AS F1
WHERE F1.a = Foobar.a
HAVING COUNT(*) > 1)
+ (SELECT 1
FROM Foobar AS F1
WHERE F1.b = Foobar.b
HAVING COUNT(*) > 1)
+ (SELECT 1
FROM Foobar AS F1
WHERE F1.c = Foobar.c
HAVING COUNT(*) > 1)
+ (SELECT 1
FROM Foobar AS F1
WHERE F1.d = Foobar.d
HAVING COUNT(*) > 1);
The duplications are:
row 1+2 in param A
row 1+3 in param C
row 3+4+6 in param D
only row 5 is unique in all parameters.
conclusion: row 1+2+3+4+6 are the same user This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics
by: Diego |
last post by:
Hello. I need help.
I'm writing an application that requires audio recording and I have decided
to use waveform audio interface.
I'm able to process WIM_DATA message but when i call waveInStop the...
|
by: Lars Netzel |
last post by:
A little background:
I use three Datagrids that are in a child parent relation.
I Use Negative Autoincrement on the the DataTables and that's workning nice.
My problem is when I Update these...
|
by: Rob Dob |
last post by:
Hi,
I'm amazed!!!
I am using VS2005, I create a new web project, c# and then within the
default.aspx form right mouse click and select "View Compnent Designer" , I
then select and drag a...
|
by: Knut Olsen-Solberg |
last post by:
I try to change the text in a <p> using getElementById(). I wonder what properties exists, and which one to use here. (The following does not work.)
Regards Knut
______________________
...
|
by: groupy |
last post by:
input: 1.5 million records table consisting users with 4 nvchar
fields:A,B,C,D
the problem: there are many records with dublicates A's or duplicates
B's or duplicates A+B's or duplicates B+C+D's &...
|
by: Evan |
last post by:
So I have two versions of a program (these are complete programs, not
excerpts):
Version 1:
template <class T >
void foo() {
return bar( T() );
}
|
by: shivapadma |
last post by:
I want explanation for CLOB datafield
1.I created clobtable with the query
create table clobexample(id number,text CLOB);
2.I tried to insert very large text by the following query
insert...
|
by: Prisoner at War |
last post by:
Friends, your opinions and advice, please:
I have a very simple JavaScript image-swap which works on my end but
when uploaded to my host at http://buildit.sitesell.com/sunnyside.html
does not...
|
by: Terry Carroll |
last post by:
I am trying to do something with a very large tarfile from within
Python, and am running into memory constraints. The tarfile in
question is a 4-gigabyte datafile from freedb.org,...
|
by: jacob navia |
last post by:
Buffer overflows are a fact of life, and, more specifically, a fact of
C.
All is not lost however. In the book
"Value Range Analysis of C programs" Axel Simon tries to establish a...
|
by: emmanuelkatto |
last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud.
Please let me know.
Thanks!
Emmanuel
|
by: nemocccc |
last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
|
by: Sonnysonu |
last post by:
This is the data of csv file
1 2 3
1 2 3
1 2 3
1 2 3
2 3
2 3
3
the lengths should be different i have to store the data by column-wise with in the specific length.
suppose the i have to...
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
|
by: Oralloy |
last post by:
Hello folks,
I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>".
The problem is that using the GNU compilers,...
|
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
|
by: Hystou |
last post by:
Overview:
Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
|
by: agi2029 |
last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
| |