MSSQL - DTS Package - Find distinct rows - Output to TXT file - ActiveX?

jb1

Hello All,

I am trying to create a DTS package.
I have two tables tbl_A and tbl_B with similar data/rows but no
primary keys.
tbl_A is master.

I would like this package to query tbl_A and tbl_B and find
1)all rows in tbl_A that are different in tbl_B, 2)all rows in tbl_A
that are not present in tbl_B and
3)all rows in tbl_B that are not present in tbl_A, and then just show
those rows.

Can this be done with a simple UNION?

Perhaps this could produce a temp Table that can be dropped once the
DTS package exists successfully.

The 2nd part after all the above rows are retrieved is that I would
like to add an addional Column to the retrieved data called STATUS
which has 3 possible values(letters) at the end of each row...

M (modified) means that row exists in tbl_B but has 1 or more
different columns
A (add) means this row exists in tbl_A but not in tbl_B
D (delete) means this row exists in tbl_B but not in tbl_A

I'm hopping this DTS package would output a nice comma seperated TXT
file with only...
1) rows from tbl_A that are different in tbl_B (STATUS M)
2) rows from tbl_A that are not present in tbl_B (STATUS A)
3) rows from tbl_B that are not present in tbl_A (STATUS D)
Can a DTS package in MS SQL be used to perfom all of the above tasks?
I would very much appreciate any help or any advise.

Thanks in advance :-)

Jun 11 '07 #1

Subscribe Reply

4534

Ed Murphy

jb1 wrote:

I am trying to create a DTS package.
I have two tables tbl_A and tbl_B with similar data/rows but no
primary keys.
tbl_A is master.

I would like this package to query tbl_A and tbl_B and find
1)all rows in tbl_A that are different in tbl_B, 2)all rows in tbl_A
that are not present in tbl_B and
3)all rows in tbl_B that are not present in tbl_A, and then just show
those rows.

In the absence of primary keys, what conceptual rule do you use to
associate rows in tbl_A with rows in tbl_B? For instance, consider
this hypothetical data:

[tbl_A]

col1 | col2
-----+-----
1 | 2
3 | 4

[tbl_B]

col1 | col2
-----+-----
1 | 4
3 | 2

What associations would you draw here, and why?

Can this be done with a simple UNION?

Perhaps this could produce a temp Table that can be dropped once the
DTS package exists successfully.

Why not a view?

Jun 11 '07 #2

undercups

The answer is to write 3 queries, 1 for each situation you describe
and link the output using the UNION ALL statement

Something like

SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a JOIN table2
b on a.id = b.id
WHERE a.col1 <b.col1
UNION ALL
SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a LEFT JOIN
table2 b on a.id = b.id
WHERE b.col4 IS NULL
UNION ALL
SELECT b.id, b.col1, b.col2, b.col3, a.col4 FROM table2 b LEFT JOIN
table1 a on b.id = a.id
WHERE a.col4 IS NULL

Note that the col1, col2, col3 and col4 in each of the queries must be
of the same datatypes or can be converted to the same datatypes e.g.
VARCHAR(20) is the same as CHAR(5). The where clauses can be more
specific and varied than I have shown but the important part is the
test for IS NULL in the last 2 queries as this is the way to find rows
in one table and that are not in another.

Hope this helps

Jun 12 '07 #3

jb1

On 12 Jun, 16:24, undercups <d...@woodace.co.ukwrote:

The answer is to write 3 queries, 1 for each situation you describe
and link the output using the UNION ALL statement

Something like

SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a JOIN table2
b on a.id = b.id
WHERE a.col1 <b.col1
UNION ALL
SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a LEFT JOIN
table2 b on a.id = b.id
WHERE b.col4 IS NULL
UNION ALL
SELECT b.id, b.col1, b.col2, b.col3, a.col4 FROM table2 b LEFT JOIN
table1 a on b.id = a.id
WHERE a.col4 IS NULL

That's great! Thank you. I'll have a look into this.
Do you have any ideas how I can add the above mentioned STATUS column
after each returned row? eg. If row is different in tbl_B from tbl_A
then add 'M' at the end of the row, as has Master data. something
like.. ("col1.value","col2.value","col3.value","M")

Thanks again :-)

Note that the col1, col2, col3 and col4 in each of the queries must be
of the same datatypes or can be converted to the same datatypes e.g.
VARCHAR(20) is the same as CHAR(5). The where clauses can be more
specific and varied than I have shown but the important part is the
test for IS NULL in the last 2 queries as this is the way to find rows
in one table and that are not in another.

Hope this helps

Jun 12 '07 #4

Erland Sommarskog

jb1 (ja***********@googlemail.com) writes:

On 12 Jun, 16:24, undercups <d...@woodace.co.ukwrote:
>The answer is to write 3 queries, 1 for each situation you describe
and link the output using the UNION ALL statement

Something like

SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a JOIN table2
b on a.id = b.id
WHERE a.col1 <b.col1

Note here that you should probably compare all columns. If columns can
be nullable, this can be quite hairy.

That's great! Thank you. I'll have a look into this.
Do you have any ideas how I can add the above mentioned STATUS column
after each returned row? eg. If row is different in tbl_B from tbl_A
then add 'M' at the end of the row, as has Master data. something
like.. ("col1.value","col2.value","col3.value","M")

Just att the desired status letter to the column list of the queries,
for instance:

SELECT a.id, a.col1, a.col2, a.col3, b.col4, 'M' AS status

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx

Jun 12 '07 #5

Ed Murphy

undercups wrote:

The answer is to write 3 queries, 1 for each situation you describe
and link the output using the UNION ALL statement

Something like

SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a JOIN table2
b on a.id = b.id
WHERE a.col1 <b.col1
UNION ALL

(etc.)

The problem is that, if the tables don't have primary keys, then they
may not have a column like 'id'. (Or maybe they do, and it just isn't
defined as a primary key - in which case, why not?)

Jun 13 '07 #6

jb1

On 13 Jun, 06:25, Ed Murphy <emurph...@socal.rr.comwrote:

undercups wrote:
The answer is to write 3 queries, 1 for each situation you describe
and link the output using the UNION ALL statement

Something like

SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a JOIN table2
b on a.id = b.id
WHERE a.col1 <b.col1
UNION ALL

(etc.)

The problem is that, if the tables don't have primary keys, then they
may not have a column like 'id'. (Or maybe they do, and it just isn't
defined as a primary key - in which case, why not?)

Well there are about 5 or 6 cols in each table and I need to check
each one to see if anything has changed. There is 1 column in each
that is kind of like the ID you are talking about though it is not set
to Primary Key.

Anyway, this is what I have for the 1st query to find Modified rows in
TableMaster...

SELECT a.id,
a.col2, a.col3, a.col4, a.col5,
b.id, b.col2, b.col3, b.col4, b.col5
'M' AS status

FROM TableMaster a JOIN TableSlave b on a.id = b.id

WHERE (a.col2 <b.col2) OR (a.col3 <b.col3) OR (a.col4 <b.col4)
OR (a.col5 <b.col5)
UNION ALL

Now does the UNION ALL command at the end imply that another query
will follow and that the results from this query and the other query
be joined?
Will the above view query return what I'm looking for? - rows in
TableMaster that are different in TableSlave? so tha later I can
update TableSlave with these new modified row.
How could I continue and query Rows that are present in TableMaster
but not in TableSlave? - Status 'A' (add)
Finally, I would like to put a 3rd query in for TableMaster and
TableGrandMaster, which tells me which rows are present in
TableGrandMaster but missing in TableMaster - Status 'D' (delete)

Thank you again everyone for all your help and advise! :-)

JB

Jun 25 '07 #7

Ed Murphy

jb1 wrote:

On 13 Jun, 06:25, Ed Murphy <emurph...@socal.rr.comwrote:
>undercups wrote:
>>The answer is to write 3 queries, 1 for each situation you describe
and link the output using the UNION ALL statement
Something like
SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a JOIN table2
b on a.id = b.id
WHERE a.col1 <b.col1
UNION ALL
(etc.)

The problem is that, if the tables don't have primary keys, then they
may not have a column like 'id'. (Or maybe they do, and it just isn't
defined as a primary key - in which case, why not?)

Well there are about 5 or 6 cols in each table and I need to check
each one to see if anything has changed. There is 1 column in each
that is kind of like the ID you are talking about though it is not set
to Primary Key.

Like I said, why isn't it? Is there a good reason not to set it
to Primary Key now?

Anyway, this is what I have for the 1st query to find Modified rows in
TableMaster...

SELECT a.id,
a.col2, a.col3, a.col4, a.col5,
b.id, b.col2, b.col3, b.col4, b.col5
'M' AS status

FROM TableMaster a JOIN TableSlave b on a.id = b.id

WHERE (a.col2 <b.col2) OR (a.col3 <b.col3) OR (a.col4 <b.col4)
OR (a.col5 <b.col5)
UNION ALL

Now does the UNION ALL command at the end imply that another query
will follow and that the results from this query and the other query
be joined?

Yes. ALL means to include duplicates, which is unnecessary in this
case (the first section of the query will contribute rows with
status = M, the second will contribute rows with status = A, etc.).

Will the above view query return what I'm looking for? - rows in
TableMaster that are different in TableSlave? so tha later I can
update TableSlave with these new modified row.

Yes, assuming that 'id' values are not changed or duplicated.

How could I continue and query Rows that are present in TableMaster
but not in TableSlave? - Status 'A' (add)

For parallelism with the part before UNION ALL:

select a.id, a.col2, a.col3, a.col4, a.col5,
b.id, b.col2, b.col3, b.col4, b.col5
'A' AS status
from TableMaster a
left join TableSlave b on a.id = b.id
where b.id is null

Finally, I would like to put a 3rd query in for TableMaster and
TableGrandMaster, which tells me which rows are present in
TableGrandMaster but missing in TableMaster - Status 'D' (delete)

select a.id, a.col2, a.col3, a.col4, a.col5,
b.id, b.col2, b.col3, b.col4, b.col5
'D' AS status
from TableSlave b
left join TableMaster a on a.id = b.id
where a.id is null

There's also RIGHT JOIN, but I avoid it because it's confusing, and
one of the main packages I work on doesn't allow it anyway.

Jun 26 '07 #8

Similar topics

2268

mssql server

by: udo polder | last post by:

hello can anyone tell me if it is possible to use activeX components (com/dcom) in MS-sql stored procedures? tia

Microsoft SQL Server

12653

Executing a DTS Package using an ASP (VBScript)

by: chris.dunigan | last post by:

I'm looking for an example of how to execute an existing DTS package from an ASP (VB)script and would appreciate any and all response. I don't even know if it's possible Thanks - Chuck...

Microsoft SQL Server

9216

Delete rows in Excel From DTS package

by: smonczka | last post by:

I have a DTS package that needs to refresh data in 3 separate Excel spreadsheets on a daily basis. The problem is that unless I manually delete the previous day's data, it appends rather than...

Microsoft SQL Server

53332

Distinct vs DistinctRow?

by: Fred Zuckerman | last post by:

Can someone explain the difference between these 2 queries? "Select Distinct id, account, lastname, firstname from table1" and "Select DistinctRow id, account, lastname, firstname from table1" ...

Microsoft Access / VBA

13740

Duplicate rows from a SELECT DISTINCT DB2 UDB v8.1.9 Linux

by: Bob Stearns | last post by:

I am getting duplicate rows back from a select distinct statement of the form: SELECT DISTINCT 'jhough', '000111', t0.bhid FROM (SELECT lots of good stuff) t0 LEFT OUTER JOIN another_table ...