By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
438,575 Members | 1,948 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 438,575 IT Pros & Developers. It's quick & easy.

Problem with a join due to multiple Nulls

P: n/a
I want to join 2 tables by a unique ID field, but the ID field also has
multiple NULLS which I do not want to ignore and I fear they will cause
duplication.

Using TableA and TableB below i will demonstrate the problem.

TableA
TableA.ID Field1 Field2
1 Paul 1
Null John 1
2 John 1
TableB
TableB.ID Field3 Field4
1 25 1
Null 32 1
Null 23 1
2 26 1

The Table I want is

TableA.ID TableB.ID Field1 Field2 Field3 Field4
1 1 Paul 1 25 1
2 2 John 1 26 1
Null Null John 1 Null Null
Null Null Null Null 32 1
Null Null Null Null 26 1

I think a select distcinct statement with a full outer join may do what
I want, but I'm not certain so want to check.

Regards,

Ciarán

Mar 29 '06 #1
Share this Question
Share on Google+
10 Replies


P: n/a
chudson...@hotmail.com wrote:
I want to join 2 tables by a unique ID field, but the ID field also has
multiple NULLS which I do not want to ignore and I fear they will cause
duplication.

Using TableA and TableB below i will demonstrate the problem.

TableA
TableA.ID Field1 Field2
1 Paul 1
Null John 1
2 John 1
TableB
TableB.ID Field3 Field4
1 25 1
Null 32 1
Null 23 1
2 26 1

The Table I want is

TableA.ID TableB.ID Field1 Field2 Field3 Field4
1 1 Paul 1 25 1
2 2 John 1 26 1
Null Null John 1 Null Null
Null Null Null Null 32 1
Null Null Null Null 26 1

I think a select distcinct statement with a full outer join may do what
I want, but I'm not certain so want to check.

Regards,

Ciarán


It appears that TableA doesn't have a key. Not clear what the key is in
TableB either. Please post DDL rather than sketches of tables otherwise
we just have to guess.

Based on what you've posted I'd say you need to fix some data model
issues (missing keys) before you attempt your query. Maybe that's what
you are trying to do but it isn't obvious how your requested output
will help you.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx

--

Mar 29 '06 #2

P: n/a
TableA.ID and TableB.ID are the respective keys, but the problem is
that they contain multiple Nulls

Regards,
Ciarán

Mar 29 '06 #3

P: n/a
(ch********@hotmail.com) writes:
I want to join 2 tables by a unique ID field, but the ID field also has
multiple NULLS which I do not want to ignore and I fear they will cause
duplication.

Using TableA and TableB below i will demonstrate the problem.

TableA
TableA.ID Field1 Field2
1 Paul 1
Null John 1
2 John 1
TableB
TableB.ID Field3 Field4
1 25 1
Null 32 1
Null 23 1
2 26 1

The Table I want is

TableA.ID TableB.ID Field1 Field2 Field3 Field4
1 1 Paul 1 25 1
2 2 John 1 26 1
Null Null John 1 Null Null
Null Null Null Null 32 1
Null Null Null Null 26 1


The IDs cannot really be keys if there are NULL values, even less if there
are multiple NULL.

If I'm taking a guess of what you are looking for, this might be it:

SELECT a.ID, b.ID, a.Field1, a.Field2, b.Field3, b.Field4
FROM TableA a
JOIN TableB b ON a.ID = b.ID
UNION ALL
SELECT NULL, NULL, a.Field1, a.Field2, NULL, NULL
FROM TableA a
WHERE a.ID IS NULL
UNION ALL
SELECT NULL, NULL, NULL, NULL, b.Field3, b.Field4
FROM TableB b
WHERE b.ID IS NULL
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Mar 29 '06 #4

P: n/a
ch********@hotmail.com wrote:
TableA.ID and TableB.ID are the respective keys, but the problem is
that they contain multiple Nulls

Regards,
Ciarán


If they contain nulls then they aren't keys. Every table should have a
key. Fix the design first.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx

--

Mar 29 '06 #5

P: n/a
I like your logic Erland.
That'll work perfectly thanks.

Mar 29 '06 #6

P: n/a
>> want to join 2 tables by a unique ID field [sic], but the ID field [sic] also has multiple NULLS which I do not want to ignore and I fear they will cause duplication. <<

Unh? "Unique identifier" is redundant and columns are not anything
like fields. Looking at your personal narrative, I see the magical,
vague "id" column violating the basic principle of data modeling that
data elements name distinct things and are not magical uiversal vague
creatures.

Please post some DDL with a key in each table. Without a key, these
things are not tables by definition.

You might also wantto take the time to learn RDBMS, so you so you do
not embarass yourself in future postings.

Mar 30 '06 #7

P: n/a
David,

Unfortunately I'm working with data exttracts as opposed to a well
designed system.
The code below demonstrates what i am trying to do.
In TableA My_ID is is either Null or a unique number.
In TableB My_ID is either Null or a number which may not be unique.

I tried using a full outer join the other day, but after 15 hours it
still had not worked so I think I am doing somethiing wrong and need
help.

In my real data TableA contains just under 2 million records 1.2million
of which My_ID is null and TableB conatins 5million records of which
almost 3 million of which My_ID is null.

CREATE TABLE TableA ( My_ID nvarchar(4000),Field1
nvarchar(4000),CounterA nvarchar(4000))
GO

INSERT INTO TableA
SELECT
'1', 'Paul','1'
UNION all SELECT
'2', 'John','1'
UNION all SELECT
'3', 'Mark','1'
UNION all SELECT
Null, 'Simon','1'
UNION all SELECT
Null, 'Peter','1'
CREATE TABLE TableB ( My_ID nvarchar(4000),Field2
nvarchar(4000),CounterB nvarchar(4000))
GO

INSERT INTO TableB
SELECT
'1', '23','1'
UNION all SELECT
'1', '24','1'
UNION all SELECT
'4', '26','1'
UNION all SELECT
Null, '27','1'
UNION all SELECT
Null, '28','1'

SELECT *
FROM TableA FULL OUTER JOIN
TableB ON TableA.My_ID = TableB.My_ID
Regards,
Ciarán

Mar 31 '06 #8

P: n/a
ch********@hotmail.com wrote:
David,

Unfortunately I'm working with data exttracts as opposed to a well
designed system.
The code below demonstrates what i am trying to do.
In TableA My_ID is is either Null or a unique number.
In TableB My_ID is either Null or a number which may not be unique.

I tried using a full outer join the other day, but after 15 hours it
still had not worked so I think I am doing somethiing wrong and need
help.

In my real data TableA contains just under 2 million records 1.2million
of which My_ID is null and TableB conatins 5million records of which
almost 3 million of which My_ID is null.


I can see that the system isn't well designed. What I'm suggesting is
that you fix it. I assume what you mean by "data extracts" is that you
are importing some data into a database from an outside source over
which you don't have any control. Can't you create your own tables and
import the data into them? That's what I recommend: convert your source
data into a normalized data model BEFORE you attempt any further
processing. You obviously haven't done that yet and I don't understand
what you are trying to achieve with this FULL JOIN. Why would you want
to return a join consisting of 7 million rows?

If you aren't permitted to implement a better design then maybe you'll
have to live with sub-optimal performance. I can't redesign your tables
for you because I don't know what your data means or what the result
you've asked for means.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx
--

Mar 31 '06 #9

P: n/a
Unfortunately I am not able to implement a better design.
Is there any more optimal way to ensure I have all transactions from
each table and a match wherever an inner join exists?

Regards,
Ciarán

Mar 31 '06 #10

P: n/a
don't worry, celko says this to everyone. he can't actually help
anyone, he just finds excuses as to why he can't.

Apr 6 '06 #11

This discussion thread is closed

Replies have been disabled for this discussion.