471,073 Members | 1,379 Online
Bytes | Software Development & Data Engineering Community
Post +

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 471,073 software developers and data experts.

RIGHT OUTER JOIN?

In my (admittedly brief) sojurn as an SQL programmer I've often admired
"outer joins" in textbooks but never really understood their use. I've
finally come across a problem that I think is served by an outer join.

-- This table stores the answer to each test question
CREATE TABLE TestResults (
studentId varchar (15) NOT NULL,
testId int NOT NULL REFERENCES Tests(testId),
qId int NOT NULL REFERENCES TestQuestions(qId),
response int NOT NULL REFERENCES TestDistractors(dId),
CONSTRAINT PK_TestResults
PRIMARY KEY NONCLUSTERED (testId, studentId, qId),
)

-- This table defines which questions are on which tests
CREATE TABLE TestQuestions_Tests (
testId int NOT NULL REFERENCES Tests(testId),
qId int NOT NULL REFERENCES TestQuestions(qId),
)

(Table Tests contains housekeeping information about a particular test,
TestQuestions defines individual questions, TestDistractors lists the
possible responses.)

In schematic form, the simplest form of my problem is to find all the
questions that haven't been answered. That would be:

SELECT tqt.qId
FROM TestResults AS tr
RIGHT OUTER JOIN TestQuestions_Tests AS tqt
ON tr.testId = tqt.testId AND tr.qId = tqt.qId
WHERE tr.qId is NULL

So far I think this is pretty straightforward and an efficient solution.
Agreed?

But my real problem is a little bit more complex. What I really want to
know is "for a given student, on a given test, which questions haven't
been answered?"

So now I have:

SELECT tqt.qId
FROM TestResults AS tr
RIGHT OUTER JOIN TestQuestions_Tests AS tqt
ON tr.testId = tqt.testId AND tr.qId = tqt.qId
WHERE tqt.testId = '1' AND tr.studentId = '7' AND tr.qId IS NULL

Is this the canonical form of the solution to my problem? It seems to
me like it is generating a whole slew of rows and then filtering them.
Is there a more elegant or efficient way to do it?

-- Rick

Jul 20 '05 #1
6 6165
Guinness Mann (GM***@dublin.com) writes:
But my real problem is a little bit more complex. What I really want to
know is "for a given student, on a given test, which questions haven't
been answered?"


This is how I would write it:

SELECT tqt.qId
FROM TestQuestions_Tests tqt
WHERE NOT EXISTS (SELECT *
FROM TestResults tr
WHERE tr.studentId = @studentid
AND tr.testId = @testid
AND tqt.testId = tr.testId)
AND tqt.testId = @testid

While outer joins can be used for NOT EXISTS logic, I prefer NOT EXISTS,
as it more clearly states what the query is all about.

Performancewise, outer join can sometimes be better, but it might just
as well be slower.

--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #2
In article <MP************************@news.newsguy.com>,
GM***@dublin.com says...
So now I have:

SELECT tqt.qId
FROM TestResults AS tr
RIGHT OUTER JOIN TestQuestions_Tests AS tqt
ON tr.testId = tqt.testId AND tr.qId = tqt.qId
WHERE tqt.testId = '1' AND tr.studentId = '7' AND tr.qId IS NULL

Is this the canonical form of the solution to my problem? It seems to
me like it is generating a whole slew of rows and then filtering them.


(Yes, I'm talking to myself)

Whether it is or is not the canonical form, it's not a solution. <sigh>
If tr.qId IS NULL, then so is tr.studentId.... If I remove the
tr.studentId = '7' clause then I get the unanswered questions for *all*
students.

I just want to get a list of the questions on a particular test and emit
a list of the ones that haven't been answered by a particular student.

Don't tell me I'm going to have to resort to old-fashioned 3-gen
procedural programming to solve this problem. Any ideas?

-- Rick
Jul 20 '05 #3
In article <Xn**********************@127.0.0.1>, so****@algonet.se
says...
This is how I would write it:

SELECT tqt.qId
FROM TestQuestions_Tests tqt
WHERE NOT EXISTS (SELECT *
FROM TestResults tr
WHERE tr.studentId = @studentid
AND tr.testId = @testid
AND tqt.testId = tr.testId)
AND tqt.testId = @testid


Being a translation of mine, that had the same problem mine had -- it
returned no rows. (I know, I know, I didn't supply any create
statements. I have some but I didn't want to clutter up what I thought
was a conceptually simple problem.)

Here's what I made work:

SELECT tqt.qid
FROM
( SELECT testId, qId
FROM testResults
WHERE studentId = @studentId and testid = @testId
)
AS tr
RIGHT OUTER JOIN TestQuestions_Tests AS tqt
ON tr.qId = tqt.qId
WHERE tr.qId IS NULL AND tqt.testId = @testId

By prefiltering the data I avoid retrieving *everyone's* unanswered
questions. How would I convert that one to "NOT EXISTS?"

-- Rick
Jul 20 '05 #4
Guinness Mann wrote:
In my (admittedly brief) sojurn as an SQL programmer I've often admired
"outer joins" in textbooks but never really understood their use. I've
finally come across a problem that I think is served by an outer join.

-- This table stores the answer to each test question
CREATE TABLE TestResults (
studentId varchar (15) NOT NULL,
testId int NOT NULL REFERENCES Tests(testId),
qId int NOT NULL REFERENCES TestQuestions(qId),
response int NOT NULL REFERENCES TestDistractors(dId),
CONSTRAINT PK_TestResults
PRIMARY KEY NONCLUSTERED (testId, studentId, qId),
)

-- This table defines which questions are on which tests
CREATE TABLE TestQuestions_Tests (
testId int NOT NULL REFERENCES Tests(testId),
qId int NOT NULL REFERENCES TestQuestions(qId),
)


Why is TestQuestions not just Questions (or Question -- I find
singular table names usually work out better, if only because I can
always make the primary key <table name>ID). If it was just Questions,
then TestQuestions_Tests would just be TestQuestions. And why
TestDistractors, rather than just Answers?

Anyway, TestQuestions_Tests should have its own key field.
That would simplify TestResults, and eliminate the possibility
that the student is recorded as answering a question that wasn't
on the test.

If you ever delete from any of these tables, you'll want to make
sure all the foreign key fields are indexed. Otherwise, for
instance, any time you delete a row from TestQuestions_Tests,
a table scan of TestResults will be done to make sure the delete
can be allowed.

Bill

Jul 20 '05 #5
Guinness Mann wrote:

Being a translation of mine, that had the same problem mine had -- it
returned no rows. (I know, I know, I didn't supply any create
statements. I have some but I didn't want to clutter up what I thought
was a conceptually simple problem.)

Here's what I made work:

SELECT tqt.qid
FROM
( SELECT testId, qId
FROM testResults
WHERE studentId = @studentId and testid = @testId
)
AS tr
RIGHT OUTER JOIN TestQuestions_Tests AS tqt
ON tr.qId = tqt.qId
WHERE tr.qId IS NULL AND tqt.testId = @testId

By prefiltering the data I avoid retrieving *everyone's* unanswered
questions. How would I convert that one to "NOT EXISTS?"


SELECT TestQuestions_Tests.qid
FROM TestQuestions_Tests
WHERE
TestQuestions_Tests = @TestId
AND NOT EXISTS
(
SELECT *
FROM testResults
WHERE
studentId = @studentId
AND testid = @testId
AND TestQuestions_Tests.qid = TestResults.qID
)
BTW, I'm not sure I've ever seen a right join in production code,
only left. Conceptually, it's the same thing, but I always think
of queries left-to-right, so it would throw me off when reading it.

Bill

Jul 20 '05 #6
In article <3F**************@Ameritech.Net>, WC********@Ameritech.Net
says...
-- This table stores the answer to each test question
CREATE TABLE TestResults (
studentId varchar (15) NOT NULL,
testId int NOT NULL REFERENCES Tests(testId),
qId int NOT NULL REFERENCES TestQuestions(qId),
response int NOT NULL REFERENCES TestDistractors(dId),
CONSTRAINT PK_TestResults
PRIMARY KEY NONCLUSTERED (testId, studentId, qId),
)

-- This table defines which questions are on which tests
CREATE TABLE TestQuestions_Tests (
testId int NOT NULL REFERENCES Tests(testId),
qId int NOT NULL REFERENCES TestQuestions(qId),
)

Why is TestQuestions not just Questions (or Question -- I find
singular table names usually work out better, if only because I can
always make the primary key <table name>ID). If it was just Questions,
then TestQuestions_Tests would just be TestQuestions. And why
TestDistractors, rather than just Answers?


I didn't even mention TestQuestions_Distractors, TextResources,
TestQuestions_AudioResources, TestQuestions_TextResources or
AudioResources. :-)

Technically (hey, I'm not a content-provider, I'm just the implementer)
the "answer" is the distractor that represents the correct response.
Therefore we have a table of distractors and elsewhere the distractorId
of the correct response is noted.
If you ever delete from any of these tables, you'll want to make
sure all the foreign key fields are indexed. Otherwise, for
instance, any time you delete a row from TestQuestions_Tests,
a table scan of TestResults will be done to make sure the delete
can be allowed.


Actually, I go the other way on that. I delete from the Tests or
TestQuestions table, and it cascades into the intersection tables.
Thanks for the help with the "NOT EXISTS" part.

-- Rick

Jul 20 '05 #7

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

2 posts views Thread by Martin | last post: by
4 posts views Thread by Omavlana | last post: by
1 post views Thread by Omavlana | last post: by
3 posts views Thread by Martin | last post: by
reply views Thread by K k | last post: by
1 post views Thread by Andreas Bauer | last post: by
9 posts views Thread by shapper | last post: by
reply views Thread by leo001 | last post: by

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.