473,487 Members | 2,680 Online
Bytes | Software Development & Data Engineering Community
Create Post

Home Posts Topics Members FAQ

Indexes to speed up a duplicates query.

OK, say you're trying to find all the non-exact duplicates in a table, and especially
do it with a single query, so that the check can be part of a user interface.
Then naturally you're going to have a slow query, because it has to compare
every row with every other row.

You ought to be able to cut out exactly half of the records in the second table,
by just saying:

SELECT * FROM TABLE
LEFT JOIN users dups ON users.created<dups.created

But I can't get MySQL to use the index I have on 'created' (same results for
integer-primary key, & a string field).

My best guess why not is that (according to the manual) "MySQL will not use
an index when ... the use of the index would require MySQL to access more than
30% of the rows in the table" - is this the reason?
Another option: say that I thought that my duplicates will always have the same
letter in their surname, then I could make an index on the first letter of the
surname, and make the join just on *that*. This should make the number of rows
needed up to 26 times less, right?

But exactly how is this done? MySQL won't use my index on the following query:
SELECT * FROM TABLE
LEFT JOIN users dups
ON SUBSTRING(users.name,0,1)=SUBSTRING(dups.name,0,1)

Thank you all in advance.

Tom Cunningham.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/my***********...ie.nctu.edu.tw

Jul 19 '05 #1
0 1924

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
1786
by: Jack A | last post by:
OK Guys. I'm fed up of the query below taking too much time. I CANT change the query since it is generated by a 3rd party product. I can change indexes and add new indexes though. The schema of...
9
1876
by: pheonix1t | last post by:
hello, I've been assigned to do performance tuning on an SQL2000 database (around 10GB in size, several instances). So far, I see a single RAID5 array, 4CPU (xeon 700MHZ), 4GB RAM. I see the...
3
4560
by: Tom Mitchell | last post by:
All: I'm stumped on a query. How do I find duplicates in a table where one of the duplicates has values is a certain field and the other doesn't. For example, I have the following table: ...
2
1595
by: Kozzen | last post by:
Hi, can somebody briefly explain me how and where to build the expression to run a query that does the following: I have a " find duplicates query " on a table build on email adresses. In...
1
2661
by: Erik Schweitzer | last post by:
I work for a tool & die shop, and we have an Access DB that captures work performed on each job we do. Each job could have 5 or more entries for work performed on it. I am trying to create a...
3
1740
by: Parasyke | last post by:
I have a complicated query that is giving me duplicate records for some of the returned records. Underwriter name is what I don't want to duplicate. Any clues? Thanks for any help! SELECT...
16
3460
by: ARC | last post by:
Hello all, So I'm knee deep in this import utility program, and am coming up with all sorts of "gotcha's!". 1st off. On a "Find Duplicates Query", does anyone have a good solution for...
7
1577
rajiv07
by: rajiv07 | last post by:
select trackinfo.title, trackinfo.artist, trackinfo.trackid, trackinfo.duration, trackinfo.price, trackinfo.lyrisist, albuminfo.title from trackinfo, albuminfo where...
9
2769
by: iheartvba | last post by:
To put the below code simply, I have created a find duplicates query for a table using the Access query wizard. I have then tried to plug the Sql String (strSqlDupColl) copied from the Query Builder...
0
7106
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
7181
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
1
6846
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
5442
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
0
4565
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...
0
3076
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The...
0
3071
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
600
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
0
267
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.