By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
438,575 Members | 1,968 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 438,575 IT Pros & Developers. It's quick & easy.

Finding duplicates rows in a table

P: 9
I have to concatenate 2 colimns in a table and find duplicates in them.
I already used the method
select * from tableA a
where (select count(*) from TableA b
where acol1+ +col2 = b.col1+ +col2)>1
But the performance is very bad.
Data is also huge
Can you help me
Thanks in advance
Nov 5 '08 #1
Share this Question
Share on Google+
17 Replies


ck9663
Expert 2.5K+
P: 2,878
I have to concatenate 2 colimns in a table and find duplicates in them.
I already used the method
select * from tableA a
where (select count(*) from TableA b
where acol1+ +col2 = b.col1+ +col2)>1
But the performance is very bad.
Data is also huge
Can you help me
Thanks in advance
try this to show the duplicates.

Expand|Select|Wrap|Line Numbers
  1. select a.*
  2. from tableA a
  3. inner join (select col1, col2 from tableA b group by col1, col2 having count(*) > 1) s on a.col1 = s.col1 and s.col2 = a.col2
  4.  

Happy coding!

-- CK
Nov 5 '08 #2

P: 9
Hi ck9663,
Thank u for u r immediate reply.
But when I run the below query

select Col1 + + Col2
from TableA
group by Col1 + + Col2
having(count(*))>1

I should get the duplicates .
But the the duplicate which I got from the query which u suggested is
different from the above qurey.
Is the above query wrong?
can u please help me
Thanks in advance
Nov 5 '08 #3

ck9663
Expert 2.5K+
P: 2,878
try:

Expand|Select|Wrap|Line Numbers
  1. select Col1, Col2
  2. from TableA
  3. group by Col, Col2
  4. having count(*) >1
By the way, use the tags when posting.

-- CK
Nov 5 '08 #4

P: 9
Hi CK
Thank u for u r reply.
The 2 query's which u have suggested r working well.
But The count which both of them return should be the same.
But I am getting 1000 more records with the first one.

Next time when i post a query I will use tags

Can u please help me
If I am not clear with the question me please let me know
Thank u in advance
Nov 5 '08 #5

ck9663
Expert 2.5K+
P: 2,878
This query will give the duplicate:

Expand|Select|Wrap|Line Numbers
  1. select Col1, Col2, count(*) as cnt
  2. from TableA
  3. group by Col, Col2
  4. having count(*) >1
  5.  
Look at cnt column. If it says "2" it means you have two records on your table with the same col1 and col2. This means it will return the unique combination of the two columns and the number of rows that are duplicate. Not the actual duplicate rows. Make sense?

-- CK
Nov 5 '08 #6

P: 9
YUP ,GOT U

Thank u So much.
Nov 5 '08 #7

P: 9
Sorry for troubling you again.
I am loading the data using SSIS.
If there r 2 duplicates only one should be loaded into the
target table ,the remaining should go into another table.
Is this possible in SSIS.

Thank you for helping me and giving me all the above query's
Thank you
Nov 6 '08 #8

ck9663
Expert 2.5K+
P: 2,878
Am not sure if it's possible in SSIS.

What you could do is to create a trigger on your target table. If the value you're trying to insert on your table is already existing, insert it to another table and rollback the transaction.

On T-SQL, it would've been different. Depending on your case, a DISTINCT might be sufficient enough. If not, you might need to further analyze your table and establish an algorithm on which record to keep and which record to throw.

-- CK
Nov 6 '08 #9

P: 9
Thank you for u r immedaite answer.
I will try creating trigger.

In SSMS,when we right click on the table
we will get options as script table as create to ...ect
But for me alter to is disabled,what should I do to enable that.
why will that be disabled when I have permissions to devlop in that box
and even execute is disabled.

Can u please help me

Thank you in advance
Nov 6 '08 #10

ck9663
Expert 2.5K+
P: 2,878
Permission to see the table and do a SELECT on it, yes. Other rights and permissions, no. Call your admin to raise your permission rights.

-- CK
Nov 6 '08 #11

P: 9
Hi Ck,
Sorry for asking u the same question again.
the query above which u suggested will return all the duplicates in a table
But I want a select statement that will return only one duplicate.
Can u please help me
Thanks in advance
Nov 12 '08 #12

ck9663
Expert 2.5K+
P: 2,878
Which query?

This:

Expand|Select|Wrap|Line Numbers
  1.  
  2. select Col1, Col2
  3. from TableA
  4. group by Col, Col2
  5. having count(*) >1
  6.  
  7.  
will only return one row for each pair.

-- CK
Nov 12 '08 #13

P: 9
Thank you for u r immediate reply.

But i need to select all the fileds in the table.
For that I got the query like

Expand|Select|Wrap|Line Numbers
  1. CREATE TABLE [dbo].[Employee] (
  2.             [id] [int] NULL ,
  3.             [name] [Varchar] (50) COLLATE Latin1_General_CI_AS NULL ,
  4.             [salary] [Numeric](18, 2) NULL 
  5.  ) ON [PRIMARY]
  6.  
  7. Insert into employee values (1,'Ram', 1000.00)
  8. Insert into employee values (1,'Ram', 1000.00)
  9. Insert into employee values (2,'Joe', 2000.00)
  10. Insert into employee values (2,'Joe', 1000.00)
  11. Insert into employee values (3,'Mary', 1000.00)
  12. Insert into employee values (4,'Julie', 5000.00)
  13. Insert into employee values (2,'Joe', 1000.00)
  14. Insert into employee values (1,'Ram', 1000.00)
  15.  
  16. --The above is the table and I am trying to find out duplicates
  17.  
  18. --My query is
  19. SELECT *  FROM employee
  20. WHERE ID IN  (SELECT ID  FROM employee   
  21. WHERE EXIST(SELECT NULL FROM employee AS tmptable     
  22.  WHERE employee.id = tmptable.id
  23.  group by id          
  24. HAVING (employee.id > MIN(tmptable.id))))
But it throwing me errors
Sorry for troubling u again

Thanks in advance
Nov 12 '08 #14

ck9663
Expert 2.5K+
P: 2,878
Joe (ID = 2) has three records with 2 salary values (1000.00 and 2000.00). Which one do you want to return? Is there a pattern?

-- CK
Nov 13 '08 #15

P: 9
Hello CK
thank you for u r reply.

I am looking for duplicate Id
Is that possible?

What I thought is, same person cannot have 2 different salaries
so I wanted to check for dupliate id.
Can u please help me.

Thank you
Nov 13 '08 #16

ck9663
Expert 2.5K+
P: 2,878
To see the complete records of duplicate in employee table, try this:

Expand|Select|Wrap|Line Numbers
  1. select * from employee e1
  2. inner join (select e2.id, count(*) cnt from employee e2 group by id having count(*) > 1) dups on dups.id = e1.id
  3. order by id
  4.  
Happy coding!


-- CK
Nov 13 '08 #17

P: 1
do u want to completely remove the whole records or leave one and remove the other record.................
Nov 14 '08 #18

Post your reply

Sign in to post your reply or Sign up for a free account.