Resolving duplicate entries in a table among 10 databases

AK
Hi

Our product uses MS-SQL Server 2000. One of our customers has 10
installations, with each installation storing data in its own database.
Now the customer wants to consolidate these databases into one, and we
already have a plan for that: consolidating one DB at a time. But first
they want to find out how many unique or duplicate entries they have
across all 10 databases.

Assumptions:
1. All the databases reside on the same server. (This is just an
assumption, not the real environment at the customer site.)
2. The databases cannot be merged before it is known how many unique or
duplicate rows exist.

Table under consideration:
Message
(
HashID PK,
....
)

Number of rows in the Message table in each database: 1 million

Here is my question: How can I find how many unique or duplicate
entries they have across all 10 databases? I can easily find the rows
in one database that are missing from another with a query like this:

SELECT COUNT(A.HashID)
FROM db1.dbo.Message A
LEFT OUTER JOIN db2.dbo.Message B ON A.HashID = B.HashID
WHERE B.HashID IS NULL

How can I do this for 10 databases? It seems this would require a
factorial of 10 queries to solve the problem.

I would appreciate it if someone could provide a hint on this.

Regards
AK

Feb 11 '06 #1
> Here is my question: How can I find how many unique or duplicate
> entries they have across all the 10 databases.
The following will list the count of unique values (Duplicates = 0) as well
as the non-unique values grouped by the number of duplicates (1-9).

SELECT
    Duplicates,
    (Duplicates + 1) * COUNT(*) AS TotalHashIDCount
FROM (
    SELECT HashID, COUNT(*) - 1 AS Duplicates
    FROM (
        SELECT HashID FROM db1.dbo.Message
        UNION ALL SELECT HashID FROM db2.dbo.Message
        UNION ALL SELECT HashID FROM db3.dbo.Message
        UNION ALL SELECT HashID FROM db4.dbo.Message
        UNION ALL SELECT HashID FROM db5.dbo.Message
        UNION ALL SELECT HashID FROM db6.dbo.Message
        UNION ALL SELECT HashID FROM db7.dbo.Message
        UNION ALL SELECT HashID FROM db8.dbo.Message
        UNION ALL SELECT HashID FROM db9.dbo.Message
        UNION ALL SELECT HashID FROM db10.dbo.Message
    ) AS Messages
    GROUP BY HashID) AS HashIDCounts
GROUP BY Duplicates
ORDER BY Duplicates
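
If only the overall totals are needed rather than the breakdown by number of duplicates, the same UNION ALL derived table can be aggregated directly. The following is a minimal, untested sketch. It assumes the same db1 through db10 database names as the query above and that HashID is never NULL (it is the primary key). The column aliases TotalRows, DistinctHashIDs, and RedundantRows are made up for illustration.

```sql
-- Sketch only: overall totals across the ten Message tables.
-- Assumes all ten databases are reachable from one connection.
SELECT
    COUNT(*) AS TotalRows,                              -- every row in every database
    COUNT(DISTINCT HashID) AS DistinctHashIDs,          -- each HashID counted once
    COUNT(*) - COUNT(DISTINCT HashID) AS RedundantRows  -- extra copies beyond the first
FROM (
    SELECT HashID FROM db1.dbo.Message
    UNION ALL SELECT HashID FROM db2.dbo.Message
    UNION ALL SELECT HashID FROM db3.dbo.Message
    UNION ALL SELECT HashID FROM db4.dbo.Message
    UNION ALL SELECT HashID FROM db5.dbo.Message
    UNION ALL SELECT HashID FROM db6.dbo.Message
    UNION ALL SELECT HashID FROM db7.dbo.Message
    UNION ALL SELECT HashID FROM db8.dbo.Message
    UNION ALL SELECT HashID FROM db9.dbo.Message
    UNION ALL SELECT HashID FROM db10.dbo.Message
) AS Messages
```

RedundantRows is the number of rows that would be dropped if the ten tables were merged keeping one row per HashID. As a cross-check, TotalRows should equal the sum of TotalHashIDCount from the breakdown query.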

--
Hope this helps.

Dan Guzman
SQL Server MVP

Feb 11 '06 #2
AK
Thank you Dan. This is exactly what I needed (in fact more than what I
needed :)

Regards
AK

Feb 14 '06 #3
Better too much than too little :-)

--
Hope this helps.

Dan Guzman
SQL Server MVP

Feb 15 '06 #4