Finding Duplicates

I have a company table and I would like to write a query that will return to
me any duplicate companies. However, it is a little more complicated than
just matching on exact company names. I would like it to give me duplicates
where x number of letters at the beginning of the company name match AND x
number of letters of the address match AND x number of letters of the city
match. I will be doing this in batches based on the first letter of the
company name. So for example I will first process all companies that start
with the letter "A".

So for all "A" companies I want to find companies where the first 5 letters
in the company name match and the first 5 characters of the address field
match and the first 5 characters of the city match. THANKS!!!

Jul 20 '05 #1
Can you post simplified DDL, some sample data, and the expected results? For
details, refer to: www.aspfaq.com/5006

--
- Anith
( Please reply to newsgroups only )
Jul 20 '05 #2
Something like this may work:

select t.*
from dbo.MyTable t
join
    (
        select
            left(CompanyName, 5) as 'CompName',
            left(Address, 5) as 'Addr',
            left(City, 5) as 'City',
            count(*) as 'Dupes'
        from
            dbo.MyTable
        where
            left(CompanyName, 1) = 'A'
        group by
            left(CompanyName, 5),
            left(Address, 5),
            left(City, 5)
        having
            count(*) > 1
    ) dt
    on dt.CompName = left(t.CompanyName, 5)
    and dt.Addr = left(t.Address, 5)
    and dt.City = left(t.City, 5)

If this doesn't work as you expect, then please consider posting your
table DDL, as well as some sample data.
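
Since no DDL or sample data was posted, here is a minimal sketch for trying the grouped-prefix idea on made-up rows. It is only an illustration under stated assumptions: the Company table layout, the CompanyID column, the sample rows, and the helper names make_demo_db and find_prefix_duplicates are all hypothetical, and it runs against SQLite from Python, where substr(col, 1, n) stands in for SQL Server's LEFT(col, n).

```python
import sqlite3

# Hypothetical sample rows; the thread never posted real DDL or data.
ROWS = [
    (1, "Acme Tools Inc", "12 Main Street", "Springfield"),
    (2, "Acme Tooling", "12 Main St.", "Springfield"),
    (3, "Acme Tools Inc", "99 Oak Avenue", "Springfield"),
    (4, "Apex Systems", "5 Elm Road", "Shelbyville"),
    (5, "Beta Corp", "12 Main Street", "Springfield"),
]

# SQLite has no LEFT(); substr(col, 1, n) returns the same leading characters.
GROUPED_SQL = """
SELECT t.CompanyID, t.CompanyName
FROM Company AS t
JOIN (
    SELECT substr(CompanyName, 1, :n) AS CompName,
           substr(Address, 1, :n)     AS Addr,
           substr(City, 1, :n)        AS CityPrefix
    FROM Company
    WHERE substr(CompanyName, 1, 1) = :letter
    GROUP BY substr(CompanyName, 1, :n),
             substr(Address, 1, :n),
             substr(City, 1, :n)
    HAVING count(*) > 1
) AS dt
  ON dt.CompName = substr(t.CompanyName, 1, :n)
 AND dt.Addr = substr(t.Address, 1, :n)
 AND dt.CityPrefix = substr(t.City, 1, :n)
ORDER BY t.CompanyID
"""


def make_demo_db():
    """Build an in-memory database holding the hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Company ("
        "CompanyID INTEGER PRIMARY KEY, CompanyName TEXT, Address TEXT, City TEXT)"
    )
    conn.executemany("INSERT INTO Company VALUES (?, ?, ?, ?)", ROWS)
    return conn


def find_prefix_duplicates(conn, letter, n=5):
    """Return rows whose name, address, and city prefixes are shared by 2+ rows."""
    return conn.execute(GROUPED_SQL, {"letter": letter, "n": n}).fetchall()


if __name__ == "__main__":
    for row in find_prefix_duplicates(make_demo_db(), "A"):
        print(row)
```

With these rows, companies 1 and 2 come back for the letter "A": they share "Acme ", "12 Ma", and "Sprin", while company 3 differs on the address prefix.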

Simon
Jul 20 '05 #3
This may work as well, assuming a parameter is passed into the stored proc
or function for the first letter:

SELECT C1.CompanyID, C2.CompanyID,
       C1.CompanyName, C2.CompanyName,
       C1.Address, C2.Address,
       C1.City, C2.City
FROM Company C1
JOIN Company C2
    ON LEFT(C1.CompanyName, 5) = LEFT(C2.CompanyName, 5)
    AND LEFT(C1.Address, 5) = LEFT(C2.Address, 5)
    AND LEFT(C1.City, 5) = LEFT(C2.City, 5)
    AND C1.CompanyID != C2.CompanyID
WHERE LEFT(C1.CompanyName, 1) = @FirstLetter

You could also use a variable parameter instead of hard-coding "5" to allow
for more specific or more general matches.
i.e., ... LEFT(C1.CompanyName, @MatchLength) = LEFT(C2.CompanyName, @MatchLength)
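
As a rough illustration of the variable match length, the sketch below runs the self-join against SQLite from Python. The table layout, sample rows, and the helper names make_demo_db and duplicate_pairs are hypothetical, substr(col, 1, n) stands in for LEFT(col, n), and the :n and :letter parameters play the part of @MatchLength and @FirstLetter. One deliberate change from the query above: it compares CompanyID with < rather than !=, so each pair is listed once instead of twice in mirrored order.

```python
import sqlite3

# Hypothetical sample rows, invented for this sketch.
ROWS = [
    (1, "Acme Tools Inc", "12 Main Street", "Springfield"),
    (2, "Acme Tooling", "12 Main St.", "Springfield"),
    (3, "Acme Supplies", "12 Main Street", "Springfield"),
    (4, "Apex Systems", "5 Elm Road", "Shelbyville"),
]

# :n is the match length and :letter the batch letter.
# c1.CompanyID < c2.CompanyID keeps one row per pair (a swap from != above).
PAIR_SQL = """
SELECT c1.CompanyID, c2.CompanyID
FROM Company AS c1
JOIN Company AS c2
  ON substr(c1.CompanyName, 1, :n) = substr(c2.CompanyName, 1, :n)
 AND substr(c1.Address, 1, :n) = substr(c2.Address, 1, :n)
 AND substr(c1.City, 1, :n) = substr(c2.City, 1, :n)
 AND c1.CompanyID < c2.CompanyID
WHERE substr(c1.CompanyName, 1, 1) = :letter
ORDER BY c1.CompanyID, c2.CompanyID
"""


def make_demo_db():
    """Build an in-memory database holding the hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Company ("
        "CompanyID INTEGER PRIMARY KEY, CompanyName TEXT, Address TEXT, City TEXT)"
    )
    conn.executemany("INSERT INTO Company VALUES (?, ?, ?, ?)", ROWS)
    return conn


def duplicate_pairs(conn, letter, match_length):
    """Return (id, id) pairs whose three prefixes of the given length all match."""
    params = {"letter": letter, "n": match_length}
    return conn.execute(PAIR_SQL, params).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    for length in (5, 8, 10):
        print(length, duplicate_pairs(conn, "A", length))
```

On this data, a longer match length narrows the result: 5 characters pairs up companies 1, 2, and 3; 8 characters leaves only the pair 1 and 2; 10 characters leaves none.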


Jul 20 '05 #4
