Connecting Tech Pros Worldwide Forums | Help | Site Map

help with data transformations with outer join?

Preston Landers
Guest
 
Posts: n/a
#1: Jul 20 '05
Hello all.

I am trying to write a query that "just" switches some data around so
it is shown in a slightly different format. I am already able to do
what I want in Oracle 8i, but I am having trouble making it work in
SQL Server 2000. I am not a database newbie, but I can't seem to
figure this one out so I am turning to the newsgroup. I am thinking
that some of the SQL Gurus out there have done this very thing a
thousand times before and the answer will be obvious to them.

This message is pretty long but hopefully gives you enough information
to replicate the issue.

There are 3 tables involved in my scenario. Potentially a lot more in
the real application, but I'm trying to keep this example as simple as
possible.

In my database I have many "things". Let's call them "User Records"
(table: users) for this example. My app allows the customer to create
any number of custom "Extra Fields" (XF's) for a given User Record.
The Extra Field definitions are stored in a table which we can call
attribs. The actual XF values for a given user record are stored in a
third table, let's call it users_attribs.

users_attribs will look something like this (actual DDL below.)

UserID | ExtraFieldID | Value
--------------------------------------
User_1 | XF_1 | ham
User_1 | XF_2 | eggs
User_2 | XF_1 | bacon
User_2 | XF_2 | cheese
User_3 | XF_2 | onions

The end result is that I want a SQL query that returns something like
this:

UserID | XF_1 | XF_2
-------------------------------------
User_1 | ham | eggs
User_2 | bacon | cheese
User_3 | NULL | onions

Potentially there would be one column for each extra field definition.
One interesting question is how to get a dynamic number of columns to
show up in results, (so new XF's show up automatically) but I'm not
worried about that for now. Assume I will hard-code a specific set of
extra fields into my query.

The key here is that all users must show up in the final result EVEN
IF they don't have some extra field value defined. Since User_3 in
the example above doesn't have an XF_1 record, we see a NULL in that
column in the final result.

With Oracle I am able to accomplish this via an Outer Join, and I know
SQL Server supports Outer Joins, but I can't seem to make it work. In
ever version I have tried so far, if any user is missing any extra
field value, the entire row for the user goes "missing", and that is
my problem.

It seems like one possible solution would be to just go ahead and
populate the users_attribs table with a NULL value for that
combination of user ID and extra field ID, basically adding a new row
like this:

UserID | ExtraFieldID | Value
--------------------------------------
User_3 | XF_1 | NULL

I would like to avoid that if possible, for a number of reasons,
particularly the question of *when* that NULL would be added. I don't
want my report to touch the database and add stuff at reporting time
if at all possible. In Oracle, I seemingly don't have to, and I want
to get to that point on SQL Server.

So, here is some specific DDL to recreate this scenario:


CREATE TABLE users (user_id varchar(60), username varchar(60));

-- Extra Field (attribs) definitions
CREATE TABLE attribs (xf_id varchar(60), xf_name varchar(60));

-- Extra Field values for Users
CREATE TABLE users_attribs (user_id varchar(60), xf_id varchar(60),
val varchar(60));

-- populate the sample tables

-- sample User recs
INSERT INTO users VALUES ('U_1', 'John Smith');
INSERT INTO users VALUES ('U_2', 'Mary Rogers');

-- sample extra field definitions
INSERT INTO attribs VALUES ('XF_1', 'Extra Field 1');
INSERT INTO attribs VALUES ('XF_2', 'Extra Field 2');
INSERT INTO attribs VALUES ('XF_3', 'Extra Field 3');

-- sample values for User Extra Fields (XF's)
-- U_1 ("John Smith") has complete values for each XF
INSERT INTO users_attribs VALUES ('U_1', 'XF_1', 'XF_1 value for
U_1');
INSERT INTO users_attribs VALUES ('U_1', 'XF_2', 'XF_2 value for
U_1');
INSERT INTO users_attribs VALUES ('U_1', 'XF_3', 'XF_3 value for
U_1');

-- U_2 ("Mary Rogers") only has one value, missing the other two..
INSERT INTO users_attribs VALUES ('U_2', 'XF_2', 'XF_2 value for
U_2');


Now, I can get what I want on Oracle, provided that I define an new
view that joins the three tables together, then do a separate query on
that view that does an outer join. I could dispense with the view,
but I don't want to hard-code the XF ID's into the query. I am fine
with hardcoding the XF names, though. (Long story.)

-- Create a User Extra Field view that joins Users
-- extra field definitons (attribs)
-- and values (users_attribs.)

CREATE VIEW u_xf_view AS
SELECT u.user_id, at.xf_name, uxf.val
FROM
users u,
attribs at,
users_attribs uxf
WHERE
uxf.user_id = u.user_id AND
uxf.xf_id = at.xf_id


-- Oracle-only outer join syntax works if you use the view:

SELECT
u.username as "User Name",
uxf1.val as "Extra Field 1 Value",
uxf2.val as "Extra Field 2 Value",
uxf3.val as "Extra Field 3 Value"

FROM
users t,
u_xf_view uxf1,
u_xf_view uxf2,
u_xf_view uxf3

WHERE
uxf1.user_id(+) = t.user_id AND
uxf1.xf_name(+) = 'Extra Field 1' AND

uxf2.user_id(+) = t.user_id AND
uxf2.xf_name(+) = 'Extra Field 2' AND

uxf3.user_id(+) = t.user_id AND
uxf3.xf_name(+) = 'Extra Field 3'
;

-- RESULTS (correct):

User Name Extra Field 1 Value Extra Field 2 Value Extra
Field 3 Value
------------- ------------------------ ------------------------
------------------------
John Smith XF_1 value for U_1 XF_2 value for U_1 XF_3
value for U_1
Mary Rogers NULL XF_2 value for U_2 NULL

2 Row(s)

So far I have not been able to get the equivalent result in SQL
Server. Like I said, I am really hoping to avoid populating those
NULL values. Can anything think of a way to replicate Oracle's
behavior here? I have tried a number of variations on the ANSI join
syntax instead of Oracle's (+) operator, but everything I tried so far
has only yielded a row when ALL extra fields are populated (or even
worse behavior.)

I greatly appreciate any assitance you may be able to give. I would be
happy to provide any additional information if I forgot to mention
something important. I apologize in advance for any broken / wrapped
lines. Thank you for taking the time to read this.

I'm going to be out of town for the next week or so, so I won't check
for a response until then, but as soon as I get back home I will check
back in the newsgroup.

Thank you!!

Preston Landers

pibble (at) yahoo (dot) com

Erland Sommarskog
Guest
 
Posts: n/a
#2: Jul 20 '05

re: help with data transformations with outer join?


[posted and mailed, please reply in news]

Preston Landers (pibble@yahoo.com) writes:[color=blue]
> Potentially there would be one column for each extra field definition.
> One interesting question is how to get a dynamic number of columns to
> show up in results, (so new XF's show up automatically) but I'm not
> worried about that for now. Assume I will hard-code a specific set of
> extra fields into my query.[/color]

For that you would have to build the query dynamically. Since a SELECT
statement produces a table, and a table has a fixed set of columns,
a SELECT could not produce a variable set of columns. In SQL Server,
this is possible since you can build statemetns and then execute them
with sp_executesql or EXEC(). See http://www.sommarskog.se/dynamic_sql.html
if you want to learn more.
[color=blue]
> -- Oracle-only outer join syntax works if you use the view:
>
> SELECT
> u.username as "User Name",
> uxf1.val as "Extra Field 1 Value",
> uxf2.val as "Extra Field 2 Value",
> uxf3.val as "Extra Field 3 Value"
>
> FROM
> users t,
> u_xf_view uxf1,
> u_xf_view uxf2,
> u_xf_view uxf3
>
> WHERE
> uxf1.user_id(+) = t.user_id AND
> uxf1.xf_name(+) = 'Extra Field 1' AND
>
> uxf2.user_id(+) = t.user_id AND
> uxf2.xf_name(+) = 'Extra Field 2' AND
>
> uxf3.user_id(+) = t.user_id AND
> uxf3.xf_name(+) = 'Extra Field 3'
> ;[/color]

And with ANSI JOINS (which I would expect to work on Oracle too):

SELECT t.username as "User Name",
uxf1.val as "Extra Field 1 Value",
uxf2.val as "Extra Field 2 Value",
uxf3.val as "Extra Field 3 Value"
FROM users t
LEFT JOIN u_xf_view uxf1 ON uxf1.user_id = t.user_id AND
uxf1.xf_name = 'Extra Field 1'
LEFT JOIN u_xf_view uxf2 ON uxf2.user_id = t.user_id AND
uxf2.xf_name = 'Extra Field 2'
LEFT JOIN u_xf_view uxf3 ON uxf3.user_id = t.user_id AND
uxf3.xf_name = 'Extra Field 3'

The key to use LEFT JOIN correctly is to understand that the entire
FROM clause, which includes the JOIN operators, are built first,
and then the WHERE clause filters that result. So if you move
uxf1.xf_name = 'Extra Field 1' to the WHERE clause, you filter way
the rows where uxf1.xf_name are NULL. When it is in the JOIN operator
you are saying "join the two tables [users] and [ux_f_view WHERE
xf_name = 'Extra Field 1'], and keep all rows in [users] even if
there is no match in the other table.

Again, I don't know about Oracle, but in SQL Server the old operators
*= and =* are highly deprecated, and LEFT|RIGHT JOIN are much preferred.
It is possible that the (+) stuff works better, but I would suggest that
you should use the ANSI-style join in Oracle as well.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
-P-
Guest
 
Posts: n/a
#3: Jul 20 '05

re: help with data transformations with outer join?


If you know how many columns "across" you'll have, this is a fairly simple query.

SELECT
u.user_id,
MAX( (CASE at.xf_name WHEN 'XF_1' then '' else NULL end) + uxf.val ) as XF_1,
MAX( (CASE at.xf_name WHEN 'XF_2' then '' else NULL end) + uxf.val ) as XF_2,
...
MAX( (CASE at.xf_name WHEN 'XF_n' then '' else NULL end) + uxf.val ) as XF_n

FROM
users u,
attribs at,
users_attribs uxf
WHERE
uxf.user_id = u.user_id AND
uxf.xf_id = at.xf_id
GROUP BY
u.user_id ;

This assumes that (NULL + 'string') evaluates to NULL, whereas (NULL || 'string') evaluates to 'string'.
If you have a dynamic number of columns across, then this solution doesn't apply.

--
Paul Horan
Sr. Architect
VCI Springfield, Mass
www.vcisolutions.com

"Preston Landers" <pibble@yahoo.com> wrote in message news:1136f745.0409011656.2c791d98@posting.google.c om...[color=blue]
> Hello all.
>
> I am trying to write a query that "just" switches some data around so
> it is shown in a slightly different format. I am already able to do
> what I want in Oracle 8i, but I am having trouble making it work in
> SQL Server 2000. I am not a database newbie, but I can't seem to
> figure this one out so I am turning to the newsgroup. I am thinking
> that some of the SQL Gurus out there have done this very thing a
> thousand times before and the answer will be obvious to them.
>
> This message is pretty long but hopefully gives you enough information
> to replicate the issue.
>
> There are 3 tables involved in my scenario. Potentially a lot more in
> the real application, but I'm trying to keep this example as simple as
> possible.
>
> In my database I have many "things". Let's call them "User Records"
> (table: users) for this example. My app allows the customer to create
> any number of custom "Extra Fields" (XF's) for a given User Record.
> The Extra Field definitions are stored in a table which we can call
> attribs. The actual XF values for a given user record are stored in a
> third table, let's call it users_attribs.
>
> users_attribs will look something like this (actual DDL below.)
>
> UserID | ExtraFieldID | Value
> --------------------------------------
> User_1 | XF_1 | ham
> User_1 | XF_2 | eggs
> User_2 | XF_1 | bacon
> User_2 | XF_2 | cheese
> User_3 | XF_2 | onions
>
> The end result is that I want a SQL query that returns something like
> this:
>
> UserID | XF_1 | XF_2
> -------------------------------------
> User_1 | ham | eggs
> User_2 | bacon | cheese
> User_3 | NULL | onions
>
> Potentially there would be one column for each extra field definition.
> One interesting question is how to get a dynamic number of columns to
> show up in results, (so new XF's show up automatically) but I'm not
> worried about that for now. Assume I will hard-code a specific set of
> extra fields into my query.
>
> The key here is that all users must show up in the final result EVEN
> IF they don't have some extra field value defined. Since User_3 in
> the example above doesn't have an XF_1 record, we see a NULL in that
> column in the final result.
>
> With Oracle I am able to accomplish this via an Outer Join, and I know
> SQL Server supports Outer Joins, but I can't seem to make it work. In
> ever version I have tried so far, if any user is missing any extra
> field value, the entire row for the user goes "missing", and that is
> my problem.
>
> It seems like one possible solution would be to just go ahead and
> populate the users_attribs table with a NULL value for that
> combination of user ID and extra field ID, basically adding a new row
> like this:
>
> UserID | ExtraFieldID | Value
> --------------------------------------
> User_3 | XF_1 | NULL
>
> I would like to avoid that if possible, for a number of reasons,
> particularly the question of *when* that NULL would be added. I don't
> want my report to touch the database and add stuff at reporting time
> if at all possible. In Oracle, I seemingly don't have to, and I want
> to get to that point on SQL Server.
>
> So, here is some specific DDL to recreate this scenario:
>
>
> CREATE TABLE users (user_id varchar(60), username varchar(60));
>
> -- Extra Field (attribs) definitions
> CREATE TABLE attribs (xf_id varchar(60), xf_name varchar(60));
>
> -- Extra Field values for Users
> CREATE TABLE users_attribs (user_id varchar(60), xf_id varchar(60),
> val varchar(60));
>
> -- populate the sample tables
>
> -- sample User recs
> INSERT INTO users VALUES ('U_1', 'John Smith');
> INSERT INTO users VALUES ('U_2', 'Mary Rogers');
>
> -- sample extra field definitions
> INSERT INTO attribs VALUES ('XF_1', 'Extra Field 1');
> INSERT INTO attribs VALUES ('XF_2', 'Extra Field 2');
> INSERT INTO attribs VALUES ('XF_3', 'Extra Field 3');
>
> -- sample values for User Extra Fields (XF's)
> -- U_1 ("John Smith") has complete values for each XF
> INSERT INTO users_attribs VALUES ('U_1', 'XF_1', 'XF_1 value for
> U_1');
> INSERT INTO users_attribs VALUES ('U_1', 'XF_2', 'XF_2 value for
> U_1');
> INSERT INTO users_attribs VALUES ('U_1', 'XF_3', 'XF_3 value for
> U_1');
>
> -- U_2 ("Mary Rogers") only has one value, missing the other two..
> INSERT INTO users_attribs VALUES ('U_2', 'XF_2', 'XF_2 value for
> U_2');
>
>
> Now, I can get what I want on Oracle, provided that I define an new
> view that joins the three tables together, then do a separate query on
> that view that does an outer join. I could dispense with the view,
> but I don't want to hard-code the XF ID's into the query. I am fine
> with hardcoding the XF names, though. (Long story.)
>
> -- Create a User Extra Field view that joins Users
> -- extra field definitons (attribs)
> -- and values (users_attribs.)
>
> CREATE VIEW u_xf_view AS
> SELECT u.user_id, at.xf_name, uxf.val
> FROM
> users u,
> attribs at,
> users_attribs uxf
> WHERE
> uxf.user_id = u.user_id AND
> uxf.xf_id = at.xf_id
>
>
> -- Oracle-only outer join syntax works if you use the view:
>
> SELECT
> u.username as "User Name",
> uxf1.val as "Extra Field 1 Value",
> uxf2.val as "Extra Field 2 Value",
> uxf3.val as "Extra Field 3 Value"
>
> FROM
> users t,
> u_xf_view uxf1,
> u_xf_view uxf2,
> u_xf_view uxf3
>
> WHERE
> uxf1.user_id(+) = t.user_id AND
> uxf1.xf_name(+) = 'Extra Field 1' AND
>
> uxf2.user_id(+) = t.user_id AND
> uxf2.xf_name(+) = 'Extra Field 2' AND
>
> uxf3.user_id(+) = t.user_id AND
> uxf3.xf_name(+) = 'Extra Field 3'
> ;
>
> -- RESULTS (correct):
>
> User Name Extra Field 1 Value Extra Field 2 Value Extra
> Field 3 Value
> ------------- ------------------------ ------------------------
> ------------------------
> John Smith XF_1 value for U_1 XF_2 value for U_1 XF_3
> value for U_1
> Mary Rogers NULL XF_2 value for U_2 NULL
>
> 2 Row(s)
>
> So far I have not been able to get the equivalent result in SQL
> Server. Like I said, I am really hoping to avoid populating those
> NULL values. Can anything think of a way to replicate Oracle's
> behavior here? I have tried a number of variations on the ANSI join
> syntax instead of Oracle's (+) operator, but everything I tried so far
> has only yielded a row when ALL extra fields are populated (or even
> worse behavior.)
>
> I greatly appreciate any assitance you may be able to give. I would be
> happy to provide any additional information if I forgot to mention
> something important. I apologize in advance for any broken / wrapped
> lines. Thank you for taking the time to read this.
>
> I'm going to be out of town for the next week or so, so I won't check
> for a response until then, but as soon as I get back home I will check
> back in the newsgroup.
>
> Thank you!!
>
> Preston Landers
>
> pibble (at) yahoo (dot) com[/color]


Closed Thread


Similar Microsoft SQL Server bytes