By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
440,828 Members | 801 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 440,828 IT Pros & Developers. It's quick & easy.

SQL - JOIN or nested query?

P: n/a
I need to create a Recordset of all records that appear in one table but do
NOT appear in another.

To get the records that appear in BOTH tables, I can do this:

[Query1]
SELECT tblEntity.Entity_ID
FROM tblEntity INNER JOIN tblDocuments ON tblEntity.Entity_ID =
tblDocuments.Entity_ID;

Then, to get the records that appear in tblDocuments that do NOT have a
corresponding Entity_ID in tblEntity:

[Query2]
SELECT tblDocuments.Document
FROM tblDocuments
WHERE (tblDocuments.Entity_ID Not In (SELECT Query1.Entity_ID FROM Query1))

The problem with this is it's extremely slow - about 3 seconds.

Is there a way to consolidate this into one query or use a different JOIN
statement to improve performance?

Thanks in advance.
Nov 13 '05 #1
Share this Question
Share on Google+
9 Replies


P: n/a
SELECT tblDocuments.Entity_ID
FROM tblDocuments LEFT JOIN tblEntity
ON tblEntity.Entity_ID = tblDocuments.Entity_ID
WHERE tblEntity.Entity_ID IS NULL
--
Doug Steele, Microsoft Access MVP
http://I.Am/DougSteele
(no e-mails, please!)

"deko" <de**@nospam.com> wrote in message
news:U8*****************************************@c omcast.com...
I need to create a Recordset of all records that appear in one table but do
NOT appear in another.

To get the records that appear in BOTH tables, I can do this:

[Query1]
SELECT tblEntity.Entity_ID
FROM tblEntity INNER JOIN tblDocuments ON tblEntity.Entity_ID =
tblDocuments.Entity_ID;

Then, to get the records that appear in tblDocuments that do NOT have a
corresponding Entity_ID in tblEntity:

[Query2]
SELECT tblDocuments.Document
FROM tblDocuments
WHERE (tblDocuments.Entity_ID Not In (SELECT Query1.Entity_ID FROM
Query1))

The problem with this is it's extremely slow - about 3 seconds.

Is there a way to consolidate this into one query or use a different JOIN
statement to improve performance?

Thanks in advance.

Nov 13 '05 #2

P: n/a
> SELECT tblDocuments.Entity_ID
FROM tblDocuments LEFT JOIN tblEntity
ON tblEntity.Entity_ID = tblDocuments.Entity_ID
WHERE tblEntity.Entity_ID IS NULL


But there are no Nulls in tblEntity.

Actually, it's a bit more complicated.

There are documents associated with entities. And there are categories of
entities. Documents can be associated with more than one entity, including
entities in multiple categories.

When the user wants to delete a category of entities, the documents
associated with those entities should be deleted - but only if the document
is not associated with another entity in another category. This is because
the document in tblDocuments is actually a hyperlink to a document stored on
disk. So we want a Recordset of all documents that are associated with
entities in the selected category, but are not associated with entities in
other categories.

This will get me documents that are NOT in the selected category and
associated with more than one entity:

[qryA]
SELECT tblDocuments.Doc_ID
FROM tblDocuments INNER JOIN tblEntity ON tblDocuments.Entity_ID =
tblEntity.Entity_ID
WHERE (tblEntity.Cat_ID <> prmCatID And (tblDocuments.Document In (SELECT
tblDocuments.Document FROM tblDocuments GROUP BY tblDocuments.Document
HAVING Count(*)>1));

(I have to join to tblEntity because that's where the Cat_ID is)

This will get me documents that ARE in the selected category and associated
with one or more entity (in that category)

[qryB]
SELECT tblDocuments.Doc_ID
FROM tblDocuments INNER JOIN tblEntity ON tblDocuments.Entity_ID =
tblEntity.Entity_ID
WHERE (tblEntity.Cat_ID = prmCatID) And (tblDocuments.Document In (SELECT
tblDocuments.Document FROM tblDocuments GROUP BY tblDocuments.Document
HAVING Count(*)>=1));

This will get me documents appearing qryA and also in qryB (what we DON'T
want to delete)

[qryC]
SELECT DISTINCT qryB.Doc_ID
FROM qryB INNER JOIN qryA ON qryB.Document = qryA.Document;

So the Recordset I need is all records in qryB that do not appear in qryC.

But this is pathetically slow:

SELECT qryB.Doc_ID FROM qryB WHERE qryB.Doc_ID Not In (SELECT qryC.Doc_ID
FROM qryC)

The Not operator appears to really slows things down.

Anyway, I'm kind of stumped on this one so I appreciate any suggestions you
care to offer.
Nov 13 '05 #3

P: n/a
This seems to be working...

[qryDocsCat]
SELECT tblDocuments.Document
FROM tblDocuments INNER JOIN tblEntity ON tblDocuments.Entity_ID =
tblEntity.Entity_ID
WHERE (tblEntity.Cat_ID = prmCid);

[qryDocsCatOtherEntities]
SELECT tblDocuments.Document
FROM tblDocuments INNER JOIN tblEntity ON tblDocuments.Entity_ID =
tblEntity.Entity_ID
WHERE (tblEntity.Cat_ID <> prmCid) And (HyperlinkPart(tblDocuments.Document,
2) = prmDoc);

Set fso = New FileSystemObject
Set qdfs = db.QueryDefs
'rst of documents linked to entities in category
Set qdfRst = qdfs("qryDocsCat")
qdfRst.Parameters("prmCid") = lngCid
Set rst = qdfRst.OpenRecordset
'count of other entities linked to document
Set qdfCt = qdfs("qryDocsCatOtherEntities")
qdfCt.Parameters("prmCid") = lngCid
Do While Not rst.EOF
strDoc = rst!Document
strDoc = HyperlinkPart(strDoc, 2)
qdfCt.Parameters("prmDoc") = strDoc
Set rstCt = qdfCt.OpenRecordset
'rstCt.MoveLast ' - not sure if I need this...
If rstCt.RecordCount = 0 Then
If Left$(strDoc, 5) = "file:" Then
strDoc = Right$(strDoc, Len(strDoc) - 5)
End If
fso.DeleteFile strDoc, True
End If
rst.MoveNext
Loop
Nov 13 '05 #4

P: n/a
deko wrote:
SELECT tblDocuments.Entity_ID
FROM tblDocuments LEFT JOIN tblEntity
ON tblEntity.Entity_ID = tblDocuments.Entity_ID
WHERE tblEntity.Entity_ID IS NULL


But there are no Nulls in tblEntity.


You're missing a fundamental concept here.

In:

http://groups.google.com/group/comp....1472dc3?hl=en&

I discussed a way of using the LEFT JOIN ... ID IS NULL as a faster
alternative to using NOT IN (...). Douglas' SQL is nearly identical to
what I suggested.

If you create an unmatched query using the wizard on, say, a tblOrders
and tblOrderDetails using the OrderID you will see that the 'IS NULL'
part of the SQL is used to determine the unmatched part since an
OrderID that shows up in tblOrders but does not show up in
tblOrderDetails will have tblOrders.OrderID with a number and
tblOrderDetails.OrderID with a Null because the LEFT JOIN shows all the
OrderID's in tblOrders and no OrderID in tblOrderDetails exists that
matches it. So this kind of LEFT JOIN is used to get the list of
OrderID's that satisfy the 'NOT' part. For a large table the
'positive' list (using IN) is searched much more quickly than the
'negative' list (using NOT IN).

VBA can certainly perform better than using NOT IN with SQL but
adjusting the SQL to run faster by using LEFT JOIN instead of NOT IN
can often obviate the need for VBA or .NET code entirely.

James A. Fortune

Nov 13 '05 #5

P: n/a
> You're missing a fundamental concept here.

In:

http://groups.google.com/group/comp....1472dc3?hl=en&

I discussed a way of using the LEFT JOIN ... ID IS NULL as a faster
alternative to using NOT IN (...). Douglas' SQL is nearly identical to
what I suggested.

If you create an unmatched query using the wizard on, say, a tblOrders
and tblOrderDetails using the OrderID you will see that the 'IS NULL'
part of the SQL is used to determine the unmatched part since an
OrderID that shows up in tblOrders but does not show up in
tblOrderDetails will have tblOrders.OrderID with a number and
tblOrderDetails.OrderID with a Null because the LEFT JOIN shows all the
OrderID's in tblOrders and no OrderID in tblOrderDetails exists that
matches it. So this kind of LEFT JOIN is used to get the list of
OrderID's that satisfy the 'NOT' part. For a large table the
'positive' list (using IN) is searched much more quickly than the
'negative' list (using NOT IN).

VBA can certainly perform better than using NOT IN with SQL but
adjusting the SQL to run faster by using LEFT JOIN instead of NOT IN
can often obviate the need for VBA or .NET code entirely.

James A. Fortune


Thanks for the tip. I'm putting out other fires right now, but will give it
a shot and post back...
Nov 13 '05 #6

P: n/a
deko wrote:
You're missing a fundamental concept here.

In:

http://groups.google.com/group/comp....1472dc3?hl=en&

I discussed a way of using the LEFT JOIN ... ID IS NULL as a faster
alternative to using NOT IN (...). Douglas' SQL is nearly identical to
what I suggested.

If you create an unmatched query using the wizard on, say, a tblOrders
and tblOrderDetails using the OrderID you will see that the 'IS NULL'
part of the SQL is used to determine the unmatched part since an
OrderID that shows up in tblOrders but does not show up in
tblOrderDetails will have tblOrders.OrderID with a number and
tblOrderDetails.OrderID with a Null because the LEFT JOIN shows all the
OrderID's in tblOrders and no OrderID in tblOrderDetails exists that
matches it. So this kind of LEFT JOIN is used to get the list of
OrderID's that satisfy the 'NOT' part. For a large table the
'positive' list (using IN) is searched much more quickly than the
'negative' list (using NOT IN).

VBA can certainly perform better than using NOT IN with SQL but
adjusting the SQL to run faster by using LEFT JOIN instead of NOT IN
can often obviate the need for VBA or .NET code entirely.

James A. Fortune


Thanks for the tip. I'm putting out other fires right now, but will give it
a shot and post back...


For Cat_ID = 4 the NOT IN version came out something like:

SELECT DocID, Entity_ID, Document, IIf(DeleteMe(Right([Document],
Len([Document]) - 5)) = -1, 'True', 'False') AS DeletionSuccessful FROM
tblDocuments WHERE Entity_ID IN (SELECT tblEntity.Entity_ID FROM
tblEntity WHERE Cat_ID = 4) AND Document NOT IN (SELECT Document FROM
tblDocuments INNER JOIN tblEntity ON tblEntity.Entity_ID =
tblDocuments.Entity_ID WHERE Cat_ID <> 4) AND Left([Document],5) =
'file:';

The declaration for DeleteMe looks like:

Public Function DeleteMe(strDoc As String) As Boolean

I don't see immediately (or even not so immediately) how to convert the
NOT IN part to a LEFT JOIN that returns the set of documents that are
not in any other categories. The

Entity_ID IN (SELECT tblEntity.Entity_ID FROM tblEntity WHERE Cat_ID =
4)

part gets the the documents in the given category nicely. Part of the
problem is having to get the Cat_ID indirectly (via a join) from
tblEntity rather than from tblDocuments. The Unmatched Query works
best when the left join field is in both tables (i.e., Cat_ID). But
you probably don't want to do any denormalizing to achieve that. All
roads look mired with some kind of problem so in this case the VBA
solution is looking pretty nice. Perhaps you had more success than I
did. Maybe what I did will point to a SQL solution. Sigh.

James A. Fortune

Nov 13 '05 #7

P: n/a
> For Cat_ID = 4 the NOT IN version came out something like:

SELECT DocID, Entity_ID, Document, IIf(DeleteMe(Right([Document],
Len([Document]) - 5)) = -1, 'True', 'False') AS DeletionSuccessful FROM
tblDocuments WHERE Entity_ID IN (SELECT tblEntity.Entity_ID FROM
tblEntity WHERE Cat_ID = 4) AND Document NOT IN (SELECT Document FROM
tblDocuments INNER JOIN tblEntity ON tblEntity.Entity_ID =
tblDocuments.Entity_ID WHERE Cat_ID <> 4) AND Left([Document],5) =
'file:';

The declaration for DeleteMe looks like:

Public Function DeleteMe(strDoc As String) As Boolean

I don't see immediately (or even not so immediately) how to convert the
NOT IN part to a LEFT JOIN that returns the set of documents that are
not in any other categories. The

Entity_ID IN (SELECT tblEntity.Entity_ID FROM tblEntity WHERE Cat_ID =
4)

part gets the the documents in the given category nicely. Part of the
problem is having to get the Cat_ID indirectly (via a join) from
tblEntity rather than from tblDocuments. The Unmatched Query works
best when the left join field is in both tables (i.e., Cat_ID). But
you probably don't want to do any denormalizing to achieve that. All
roads look mired with some kind of problem so in this case the VBA
solution is looking pretty nice. Perhaps you had more success than I
did. Maybe what I did will point to a SQL solution. Sigh.

James A. Fortune


Thanks again for the detailed reply. The DLookup is working for now, but I
have another similar situation where a DLookup will not work for me. I've
got to deliver this tomorrow (today), and it's 2:00 am, so I'm still using
the Not In SQL statement in that one situation. I'll post back with results
of the Left Join idea as soon as I get a chance.
Nov 13 '05 #8

P: n/a
Got it working.

Here's the problem I was faced with:

Users perform an advanced search which selects records form the database
based on a number of different user-defined criteria. The query used to
select the records is built with dynamic SQL and is called qry5000. Another
query, qry5000Eid, selects unique Entity IDs from qry5000.

Each record in the advanced search set may or may not contain hyperlinks to
documents stored on disk, and there may or may not be other records in the
database (not in the advanced search set) that also contain hyperlinks to
these same documents.

When a user issues a command to delete all records in the advanced search
set, all hyperlinked documents associated with those records should be
deleted from disk - unless there are other records in the database (not in
the advanced search set) that contain a hyperlink to the document. Deleting
incorrect documents is a very bad thing.

Here's the text of the 3 queries required to build the recordset:

qryDocsAdvSearch [all documents linked to entities in the advanced search
set]
SELECT tblDocuments.Document, qry5000Eid.Entity_ID FROM tblDocuments INNER
JOIN qry5000Eid ON tblDocuments.Entity_ID = qry5000Eid.Entity_ID;

qryDocsAdvSearchEidNotIn [all Entity_IDs NOT in the advanced search set]
SELECT tblEntity.Entity_ID FROM tblEntity LEFT JOIN qry5000Eid ON
tblEntity.Entity_ID = qry5000Eid.Entity_ID WHERE (qry5000Eid.Entity_ID Is
Null);

QryDocsAdvSearchDelete [all documents linked to entities NOT in the advanced
search set that are also linked to entities IN the advanced search set]
SELECT HyperlinkPart(tblDocuments.Document, 2) As Doc,
tblDocuments.Entity_ID FROM (tblDocuments INNER JOIN qryDocsAdvSearch ON
HyperlinkPart(tblDocuments.Document, 2) =
HyperlinkPart(qryDocsAdvSearch.Document, 2)) INNER JOIN
qryDocsAdvSearchEidNotIn ON tblDocuments.Entity_ID =
qryDocsAdvSearchEidNotIn.Entity_ID;

The challenge was getting Qry2 correct - I had to try a few variations
before I got it right. It's *much* faster than SELECT ... WHERE ... Not In
....

I build the recordset on the last query and loop through it with the
scripting runtime to delete the documents from disk.
Nov 13 '05 #9

P: n/a
A few corrections...
qryDocsAdvSearch [all documents linked to entities in the advanced search
set]
SELECT tblDocuments.Document, qry5000Eid.Entity_ID FROM tblDocuments INNER
JOIN qry5000Eid ON tblDocuments.Entity_ID = qry5000Eid.Entity_ID;
[qryDocsAdvSearch corrected]
SELECT HyperlinkPart(tblDocuments.Document, 2) AS Doc,
tblDocuments.Entity_ID FROM tblDocuments INNER JOIN qry5000Eid ON
tblDocuments.Entity_ID = qry5000Eid.Entity_ID;
qryDocsAdvSearchEidNotIn [all Entity_IDs NOT in the advanced search set]
SELECT tblEntity.Entity_ID FROM tblEntity LEFT JOIN qry5000Eid ON
tblEntity.Entity_ID = qry5000Eid.Entity_ID WHERE (qry5000Eid.Entity_ID Is
Null);
[no correction for qryDocsAdvSearchEidNotIn]
QryDocsAdvSearchDelete [all documents linked to entities NOT in the
advanced search set that are also linked to entities IN the advanced
search set]
SELECT HyperlinkPart(tblDocuments.Document, 2) As Doc,
tblDocuments.Entity_ID FROM (tblDocuments INNER JOIN qryDocsAdvSearch ON
HyperlinkPart(tblDocuments.Document, 2) =
HyperlinkPart(qryDocsAdvSearch.Document, 2)) INNER JOIN
qryDocsAdvSearchEidNotIn ON tblDocuments.Entity_ID =
qryDocsAdvSearchEidNotIn.Entity_ID;
[QryDocsAdvSearchDelete corrected]
SELECT HyperlinkPart(tblDocuments.Document, 2) AS Doc,
tblDocuments.Entity_ID FROM (tblDocuments INNER JOIN qryDocsAdvSearchDocs ON
HyperlinkPart(tblDocuments.Document, 2) = qryDocsAdvSearchDocs.Doc) INNER
JOIN qryDocsAdvSearchEidNotIn ON tblDocuments.Entity_ID =
qryDocsAdvSearchEidNotIn.Entity_ID;
I build the recordset on the last query and loop through it with the
scripting runtime to delete the documents from disk.


[corrected]
I build the recordset on qryDocsAdvSearch, loop through it and look for each
Doc in QryDocsAdvSearchDelete - if no match is found, then delete.

Private Const QT As String = """"

Set qdf = qdfs("qryDocsAdvSearch")
Set rst = qdf.OpenRecordset
Do While Not rst.EOF
strDoc = rst!Doc
lngEid = Nz(DLookup("Entity_ID", "QryDocsAdvSearchDelete", _
"Doc = " & QT & strDoc & QT), 0)
If Len(strDoc) <> 0 And lngEid = 0 Then
If Left$(strDoc, 5) = "file:" Then
strDoc = Right$(strDoc, Len(strDoc) - 5)
End If
fso.DeleteFile strDoc, True
End If
rst.MoveNext
Loop
Nov 13 '05 #10

This discussion thread is closed

Replies have been disabled for this discussion.