Resolving duplicate entries in a table among 10 databases

AK
Hi,

Our product uses MS-SQL Server 2000. One of our customers has 10
installations, each storing data in its own database.
Now the customer wants to consolidate these databases into one, and we
already have a plan for that: consolidating one DB at a time. But first
they want to find out how many unique and duplicate entries they have
across all 10 databases.

Assumptions:
1. All the databases reside on the same server. (This is just an
assumption, not the real environment at the customer site.)
2. The databases cannot be merged until it is known how many unique or
duplicate rows exist.

Table under consideration:
Message
(
HashID PK,
....
)

Number of rows in the Message table in each database: 1 million

Here is my question: How can I find how many unique or duplicate
entries they have across all 10 databases? I can easily find the rows
that exist in one database but not in another with a query like this:

SELECT COUNT(A.HashID) FROM db1.dbo.Message A LEFT OUTER JOIN
db2.dbo.Message B ON A.HashID = B.HashID WHERE B.HashID IS NULL

How can I do this for 10 databases? Comparing the databases pair by pair
looks like it would require a factorial of 10 queries.

I would appreciate it if someone could provide a hint on this.

Regards
AK

Feb 11 '06 #1
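> Here is my question: How can I find how many unique or duplicate
> entries they have across all the 10 databases.

The following will list the count of unique values (Duplicates = 0) as well
as the non-unique values grouped by the number of duplicates (1-9). For each
group, TotalHashIDCount is the total number of rows, i.e. the number of
distinct HashIDs in the group multiplied by (Duplicates + 1).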

SELECT
    Duplicates,
    (Duplicates + 1) * COUNT(*) AS TotalHashIDCount
FROM (
    SELECT HashID, COUNT(*) - 1 AS Duplicates
    FROM (
        SELECT HashID FROM db1.dbo.Message
        UNION ALL SELECT HashID FROM db2.dbo.Message
        UNION ALL SELECT HashID FROM db3.dbo.Message
        UNION ALL SELECT HashID FROM db4.dbo.Message
        UNION ALL SELECT HashID FROM db5.dbo.Message
        UNION ALL SELECT HashID FROM db6.dbo.Message
        UNION ALL SELECT HashID FROM db7.dbo.Message
        UNION ALL SELECT HashID FROM db8.dbo.Message
        UNION ALL SELECT HashID FROM db9.dbo.Message
        UNION ALL SELECT HashID FROM db10.dbo.Message
    ) AS Messages
    GROUP BY HashID
) AS HashIDCounts
GROUP BY Duplicates
ORDER BY Duplicates
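
If only the overall totals are needed, the same GROUP BY HashID step can be collapsed into a single summary row. The following is a minimal sketch, not part of the query above: the table variable @Messages and its sample values are hypothetical stand-ins for the UNION ALL of the ten Message tables, and the int type for HashID is an assumption, since the real column type is not given in the thread.

```sql
-- Hypothetical stand-in for the UNION ALL of db1..db10 Message tables.
-- The int type is an assumption; the thread does not state HashID's type.
DECLARE @Messages TABLE (HashID int NOT NULL)

INSERT INTO @Messages (HashID)
SELECT 1            -- appears once (unique)
UNION ALL SELECT 2  -- appears twice
UNION ALL SELECT 2
UNION ALL SELECT 3  -- appears three times
UNION ALL SELECT 3
UNION ALL SELECT 3

SELECT
    COUNT(*) AS DistinctHashIDs,                                       -- rows left after a merge
    SUM(CASE WHEN Occurrences = 1 THEN 1 ELSE 0 END) AS UniqueHashIDs, -- found exactly once
    SUM(CASE WHEN Occurrences > 1 THEN 1 ELSE 0 END) AS DuplicatedHashIDs,
    SUM(Occurrences - 1) AS RedundantRows                              -- extra copies beyond the first
FROM (
    SELECT HashID, COUNT(*) AS Occurrences
    FROM @Messages
    GROUP BY HashID
) AS HashIDCounts
```

With the sample rows above, this should return 3 distinct HashIDs, 1 unique, 2 duplicated, and 3 redundant rows. To run it against the real data, the derived table would select from the ten-way UNION ALL instead of @Messages.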

--
Hope this helps.

Dan Guzman
SQL Server MVP

Feb 11 '06 #2
AK
Thank you, Dan. This is exactly what I needed (in fact, more than what I
needed :)

Regards
AK

Feb 14 '06 #3
Better too much than too little :-)

--
Hope this helps.

Dan Guzman
SQL Server MVP

Feb 15 '06 #4
