473,387 Members | 1,749 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,387 software developers and data experts.

Finding almost duplicate rows in mysql

I'm using MySQL 4.1.1

I've inherited a database which has some (almost) duplicate rows.

The databse is like this.

userID
userPosition
userDepartment

No user should be in more than one department but some are. I need to
find all the users who are listed in more than one department for manual
fixing.

I can do something like

SELECT *, COUNT(userID)
FROM users
GROUP BY COUNT(userID)
ORDER BY COUNT(userID) DESC

But that still gives me all 10,000 rows. All I want is where
COUNT(userID) > 1

If I add a WHERE clause to that effect I just get error messages.

Any suggestions?

Jul 17 '05 #1
4 12049
The query should be like this, I think:

SELECT userID, COUNT(*)
FROM users
GROUP BY userID
ORDER BY userID
HAVING COUNT(*) > 1;
Uzytkownik "Russell" <nu**@null.noemail> napisal w wiadomosci
news:DY*******************@news-binary.blueyonder.co.uk...
I'm using MySQL 4.1.1

I've inherited a database which has some (almost) duplicate rows.

The databse is like this.

userID
userPosition
userDepartment

No user should be in more than one department but some are. I need to
find all the users who are listed in more than one department for manual
fixing.

I can do something like

SELECT *, COUNT(userID)
FROM users
GROUP BY COUNT(userID)
ORDER BY COUNT(userID) DESC

But that still gives me all 10,000 rows. All I want is where
COUNT(userID) > 1

If I add a WHERE clause to that effect I just get error messages.

Any suggestions?

Jul 17 '05 #2
On 2004-01-10, Russell <nu**@null.noemail> wrote:
I'm using MySQL 4.1.1

I've inherited a database which has some (almost) duplicate rows.

The databse is like this.

userID
userPosition
userDepartment

No user should be in more than one department but some are. I need to
find all the users who are listed in more than one department for manual
fixing.

I can do something like

SELECT *, COUNT(userID)
FROM users
GROUP BY COUNT(userID)
ORDER BY COUNT(userID) DESC


That needs to be:
SELECT *
FROM users
GROUP BY userID HAVING COUNT(*) > 1
ORDER BY userID DESC
Oh, and a good database design requires you have a primary key for each
table. I suggest you use userID. This way, next time when another record
is inserted with a userID that already exists, mysql will complain about
that ;)
--
http://home.mysth.be/~timvw
Jul 17 '05 #3
Chung Leong wrote:
The query should be like this, I think:

SELECT userID, COUNT(*)
FROM users
GROUP BY userID
ORDER BY userID
HAVING COUNT(*) > 1;


You, Sir, are a star :) Thanks!

The HAVING clause has to come before ORDER BY but other than that it's
perfect - and yes, once the data is fixed, the Primary Key will be too.

Thanks,

Russell.

Jul 17 '05 #4

Hi,

try something like (not tested but it should be something like this) :

select
*
from
users u
where
(
(select count(*) from users where userID = u.userID) > 1
)

It might not be entirely correct but it should get you started.

Good luck !


On Sat, 10 Jan 2004 16:40:42 +0000, Russell <nu**@null.noemail> wrote:
I'm using MySQL 4.1.1

I've inherited a database which has some (almost) duplicate rows.

The databse is like this.

userID
userPosition
userDepartment

No user should be in more than one department but some are. I need to
find all the users who are listed in more than one department for manual
fixing.

I can do something like

SELECT *, COUNT(userID)
FROM users
GROUP BY COUNT(userID)
ORDER BY COUNT(userID) DESC

But that still gives me all 10,000 rows. All I want is where
COUNT(userID) > 1

If I add a WHERE clause to that effect I just get error messages.

Any suggestions?


--
Jeeke
Jul 17 '05 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
by: g_chime | last post by:
How can I make MySQL reject duplicate rows when inserting? The table is quite large: 100+ rows and millions of rows. MySQL doesn't allow me to create a UNIQUE INDEX for all the rows (16 rows...
3
by: dan graziano | last post by:
Hi, How do you suggest is the best way to check for duplicate rows in an access table. And once one knows if there are duplicates, to remove all but one. In my access table, there are 5...
1
by: TaeHo Yoo | last post by:
I have a table that has more than 1 milion rows so practically it is impossible to remove all duplicate rows by hand. Could you help me to remove those duplicate rows at all? This table doesn't...
1
by: Asha | last post by:
greetings, does anyone have any idea how to delete duplicate rows in a dataset gracefully (the fast and easy way)
6
by: Bob Stearns | last post by:
I am getting duplicate rows back from a select distinct statement of the form: SELECT DISTINCT 'jhough', '000111', t0.bhid FROM (SELECT lots of good stuff) t0 LEFT OUTER JOIN another_table ...
7
by: Jon Maz | last post by:
Hi, I have a MySql problem I hope someone can help me with. I'm trying to run an update on a linking table, the update is running into a Primary Key constraint violation, and in my workaround...
1
by: hung tran | last post by:
Hi, Here is the code to eliminate duplicate rows, but what if I want to keep them and elimiante all other the unique ? public void RemoveDuplicateRows(DataTable dTable,string colName) {...
5
jamesd0142
by: jamesd0142 | last post by:
My manager and I where looking at some complex code to eliminate duplicate records in a database table. then it hit me how its done easily... so i thought i'd share it... In English:...
4
by: ravir81 | last post by:
Hi, I am currently working on excel validation using Perl. I am new to Excel validation but not for Perl. I have a question regarding one of the validation. Could anyone please tell me how to get...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.