473,722 Members | 2,295 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Finding Duplicates

I have a company table and I would like to write a query that will return to
me any duplicate companies. However, it is a little more complicated then
just matching on exact company names. I would like it to give me duplicates
where x number of letters at the beginning of the company name match AND x
number of letters of the address match AND x number of letters of the city
match. I will be doing this in batches based on the first letter of the
company name. So for example I will first process all companies that start
with the letter "A".

So for all "A" companies I want to find companies where the first 5 letters
in the company name match and the first 5 characters of the address field
match and the first 5 characters of the city match. THANKS!!!

Jul 20 '05 #1
3 4209
Can you post simplified DDLs, some sample data & expected results? For
detail refer to : www.aspfaq.com/5006

--
- Anith
( Please reply to newsgroups only )
Jul 20 '05 #2
"Erich" <er********@hot mail.com> wrote in message news:<10******* ******@corp.sup ernews.com>...
I have a company table and I would like to write a query that will return to
me any duplicate companies. However, it is a little more complicated then
just matching on exact company names. I would like it to give me duplicates
where x number of letters at the beginning of the company name match AND x
number of letters of the address match AND x number of letters of the city
match. I will be doing this in batches based on the first letter of the
company name. So for example I will first process all companies that start
with the letter "A".

So for all "A" companies I want to find companies where the first 5 letters
in the company name match and the first 5 characters of the address field
match and the first 5 characters of the city match. THANKS!!!


Something like this may work:

select t.*
from dbo.MyTable t
join
(
select
left(CompanyNam e, 5) as 'CompName',
left(Address, 5) as 'Addr',
left(City, 5) as 'City',
count(*) as 'Dupes'
from
dbo.MyTable
where
left(CompanyNam e, 1) = 'A'
group by
left(CompanyNam e, 5),
left(Address, 5),
left(City, 5)
having
count(*) > 1
) dt
on dt.CompName = left(t.CompanyN ame, 5)
and dt.Addr = left(t.Address, 5)
and dt.City = left(t.City, 5)
If this doesn't work as you expect, then please consider posting your
table DDL, as well as some sample data.

Simon
Jul 20 '05 #3
This may work as well, assuming a parameter is passed into the stored proc
or function for the first letter:

SELECT C1.CompanyID, C2.CompanyID,
C1.CompanyName, C2.CompanyName,
C1.Address, C2.Address, C1.City, C2.City
FROM Company C1 JOIN Company C2 ON
LEFT(C1.Company Name, 5) = LEFT(C2.Company Name, 5) AND
LEFT(C1.Address , 5) = LEFT(C2.Address , 5) AND
LEFT(C1.City, 5) = LEFT(C2.City, 5) AND
C1.CompanyID != C2.CompanyID
WHERE LEFT(C1.Company Name, 1) = @FirstLetter

You could also use a variable parameter instead of hard-coding "5" to allow
for more specific or more general matches.
ie, ... LEFT(C1.Company , @MatchLength) = LEFT(C2.Company Name, @MatchLength)

"Erich" <er********@hot mail.com> wrote in message
news:10******** *****@corp.supe rnews.com...
I have a company table and I would like to write a query that will return to me any duplicate companies. However, it is a little more complicated then
just matching on exact company names. I would like it to give me duplicates where x number of letters at the beginning of the company name match AND x
number of letters of the address match AND x number of letters of the city
match. I will be doing this in batches based on the first letter of the
company name. So for example I will first process all companies that start
with the letter "A".

So for all "A" companies I want to find companies where the first 5 letters in the company name match and the first 5 characters of the address field
match and the first 5 characters of the city match. THANKS!!!

Jul 20 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
1485
by: Timo Nentwig | last post by:
Hi! <node> <name>foo</name> <id>1</id> <age>7</age> </node> <node.....> <node> <name>foo</name>
6
2402
by: Marlene | last post by:
Hi All I have the following scenario, where I have found all the duplicates in a table, based on an order number and a part number (item).I might have something like this: Order PODate Rec Qty Invoice# Item Supplier Status POReceivedDate 570133 03/09/2004 50 0 DMEDIUM L0010 PENDING 03/09/2004 570133 03/09/2004 50 0 DMEDIUM L0010 PENDING 03/09/2004 570133 03/09/2004 50 0 DMEDIUM L0010 PENDING 03/09/2004
6
2892
by: Maxi | last post by:
I have 100 tabes in an Access database, every table has 1 filed with 100 names (records), no primary key assigned. I would like to find duplicates. Here is the criteria: The computer should pick up the first name of Table1 and check that name in that table (Table1) as well as the remaining 99 tables. Continue this till we reach the last name (record) of the 100th table.
11
33536
by: paradox | last post by:
Basically I have an ArrayList of strings; I need a fast way of searching through and comparing the elements of the ArrayList and the return an ArrayList of items that have 3 Duplicates. For example, if the ArrayList has the following elements: elems = "blue" elems = "red" elems = "blue" elems = "green"
3
2171
by: AK | last post by:
Hi Our product uses MS-SQL Server 2000. One of our customer has 10 installations with each installation stroring data in its own database. Now the customer wants to consolidate these databases into one and we already have plan for that by consolidating one DB at a time. But first they want to find how many unique or duplicate entries they have across all the 10 databases Assumptions:
22
6910
by: Simon Forman | last post by:
Is there a more efficient way to do this? def f(L): '''Return a set of the items that occur more than once in L.''' L = list(L) for item in set(L): L.remove(item) return set(L)
5
8283
by: limperger | last post by:
Hello everyone! Is out there any way to search for duplicate entries without using the "Find duplicates" option? In the Access 97 installed in my workplace, the Find duplicates option is disabled (don't ask me why) and I think that there are little chances of having it installed. Any wonder of how to overcome this situation without the aforementioned wizard? Thank you very much in advance Best regards from Barcelona
17
2999
by: Gayatree | last post by:
I have to concatenate 2 colimns in a table and find duplicates in them. I already used the method select * from tableA a where (select count(*) from TableA b where acol1+ +col2 = b.col1+ +col2)>1 But the performance is very bad. Data is also huge Can you help me Thanks in advance
3
25075
Thekid
by: Thekid | last post by:
I'm trying to figure out a way to find if there are duplicates in an array. My idea was to take the array as 'a' and make a second array as 'b' and remove the duplicates from 'b' using 'set' and then compare a to b. If they're different then it will print out 'duplicates found'. The problem is that even after trying different arrays, some with duplicates some without, that 'b' rearranges the numbers. Here's an example: a='1934, 2311,...
0
8863
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9384
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
9238
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
8052
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
6681
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5995
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
4502
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4762
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
2602
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.