473,385 Members | 1,331 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

SQL - JOIN or nested query?

I need to create a Recordset of all records that appear in one table but do
NOT appear in another.

To get the records that appear in BOTH tables, I can do this:

[Query1]
SELECT tblEntity.Entity_ID
FROM tblEntity INNER JOIN tblDocuments ON tblEntity.Entity_ID =
tblDocuments.Entity_ID;

Then, to get the records that appear in tblDocuments that do NOT have a
corresponding Entity_ID in tblEntity:

[Query2]
SELECT tblDocuments.Document
FROM tblDocuments
WHERE (tblDocuments.Entity_ID Not In (SELECT Query1.Entity_ID FROM Query1))

The problem with this is it's extremely slow - about 3 seconds.

Is there a way to consolidate this into one query or use a different JOIN
statement to improve performance?

Thanks in advance.
Nov 13 '05 #1
9 20080
SELECT tblDocuments.Entity_ID
FROM tblDocuments LEFT JOIN tblEntity
ON tblEntity.Entity_ID = tblDocuments.Entity_ID
WHERE tblEntity.Entity_ID IS NULL
--
Doug Steele, Microsoft Access MVP
http://I.Am/DougSteele
(no e-mails, please!)

"deko" <de**@nospam.com> wrote in message
news:U8*****************************************@c omcast.com...
I need to create a Recordset of all records that appear in one table but do
NOT appear in another.

To get the records that appear in BOTH tables, I can do this:

[Query1]
SELECT tblEntity.Entity_ID
FROM tblEntity INNER JOIN tblDocuments ON tblEntity.Entity_ID =
tblDocuments.Entity_ID;

Then, to get the records that appear in tblDocuments that do NOT have a
corresponding Entity_ID in tblEntity:

[Query2]
SELECT tblDocuments.Document
FROM tblDocuments
WHERE (tblDocuments.Entity_ID Not In (SELECT Query1.Entity_ID FROM
Query1))

The problem with this is it's extremely slow - about 3 seconds.

Is there a way to consolidate this into one query or use a different JOIN
statement to improve performance?

Thanks in advance.

Nov 13 '05 #2
> SELECT tblDocuments.Entity_ID
FROM tblDocuments LEFT JOIN tblEntity
ON tblEntity.Entity_ID = tblDocuments.Entity_ID
WHERE tblEntity.Entity_ID IS NULL


But there are no Nulls in tblEntity.

Actually, it's a bit more complicated.

There are documents associated with entities. And there are categories of
entities. Documents can be associated with more than one entity, including
entities in multiple categories.

When the user wants to delete a category of entities, the documents
associated with those entities should be deleted - but only if the document
is not associated with another entity in another category. This is because
the document in tblDocuments is actually a hyperlink to a document stored on
disk. So we want a Recordset of all documents that are associated with
entities in the selected category, but are not associated with entities in
other categories.

This will get me documents that are NOT in the selected category and
associated with more than one entity:

[qryA]
SELECT tblDocuments.Doc_ID
FROM tblDocuments INNER JOIN tblEntity ON tblDocuments.Entity_ID =
tblEntity.Entity_ID
WHERE (tblEntity.Cat_ID <> prmCatID And (tblDocuments.Document In (SELECT
tblDocuments.Document FROM tblDocuments GROUP BY tblDocuments.Document
HAVING Count(*)>1));

(I have to join to tblEntity because that's where the Cat_ID is)

This will get me documents that ARE in the selected category and associated
with one or more entity (in that category)

[qryB]
SELECT tblDocuments.Doc_ID
FROM tblDocuments INNER JOIN tblEntity ON tblDocuments.Entity_ID =
tblEntity.Entity_ID
WHERE (tblEntity.Cat_ID = prmCatID) And (tblDocuments.Document In (SELECT
tblDocuments.Document FROM tblDocuments GROUP BY tblDocuments.Document
HAVING Count(*)>=1));

This will get me documents appearing qryA and also in qryB (what we DON'T
want to delete)

[qryC]
SELECT DISTINCT qryB.Doc_ID
FROM qryB INNER JOIN qryA ON qryB.Document = qryA.Document;

So the Recordset I need is all records in qryB that do not appear in qryC.

But this is pathetically slow:

SELECT qryB.Doc_ID FROM qryB WHERE qryB.Doc_ID Not In (SELECT qryC.Doc_ID
FROM qryC)

The Not operator appears to really slows things down.

Anyway, I'm kind of stumped on this one so I appreciate any suggestions you
care to offer.
Nov 13 '05 #3
This seems to be working...

[qryDocsCat]
SELECT tblDocuments.Document
FROM tblDocuments INNER JOIN tblEntity ON tblDocuments.Entity_ID =
tblEntity.Entity_ID
WHERE (tblEntity.Cat_ID = prmCid);

[qryDocsCatOtherEntities]
SELECT tblDocuments.Document
FROM tblDocuments INNER JOIN tblEntity ON tblDocuments.Entity_ID =
tblEntity.Entity_ID
WHERE (tblEntity.Cat_ID <> prmCid) And (HyperlinkPart(tblDocuments.Document,
2) = prmDoc);

Set fso = New FileSystemObject
Set qdfs = db.QueryDefs
'rst of documents linked to entities in category
Set qdfRst = qdfs("qryDocsCat")
qdfRst.Parameters("prmCid") = lngCid
Set rst = qdfRst.OpenRecordset
'count of other entities linked to document
Set qdfCt = qdfs("qryDocsCatOtherEntities")
qdfCt.Parameters("prmCid") = lngCid
Do While Not rst.EOF
strDoc = rst!Document
strDoc = HyperlinkPart(strDoc, 2)
qdfCt.Parameters("prmDoc") = strDoc
Set rstCt = qdfCt.OpenRecordset
'rstCt.MoveLast ' - not sure if I need this...
If rstCt.RecordCount = 0 Then
If Left$(strDoc, 5) = "file:" Then
strDoc = Right$(strDoc, Len(strDoc) - 5)
End If
fso.DeleteFile strDoc, True
End If
rst.MoveNext
Loop
Nov 13 '05 #4
deko wrote:
SELECT tblDocuments.Entity_ID
FROM tblDocuments LEFT JOIN tblEntity
ON tblEntity.Entity_ID = tblDocuments.Entity_ID
WHERE tblEntity.Entity_ID IS NULL


But there are no Nulls in tblEntity.


You're missing a fundamental concept here.

In:

http://groups.google.com/group/comp....1472dc3?hl=en&

I discussed a way of using the LEFT JOIN ... ID IS NULL as a faster
alternative to using NOT IN (...). Douglas' SQL is nearly identical to
what I suggested.

If you create an unmatched query using the wizard on, say, a tblOrders
and tblOrderDetails using the OrderID you will see that the 'IS NULL'
part of the SQL is used to determine the unmatched part since an
OrderID that shows up in tblOrders but does not show up in
tblOrderDetails will have tblOrders.OrderID with a number and
tblOrderDetails.OrderID with a Null because the LEFT JOIN shows all the
OrderID's in tblOrders and no OrderID in tblOrderDetails exists that
matches it. So this kind of LEFT JOIN is used to get the list of
OrderID's that satisfy the 'NOT' part. For a large table the
'positive' list (using IN) is searched much more quickly than the
'negative' list (using NOT IN).

VBA can certainly perform better than using NOT IN with SQL but
adjusting the SQL to run faster by using LEFT JOIN instead of NOT IN
can often obviate the need for VBA or .NET code entirely.

James A. Fortune

Nov 13 '05 #5
> You're missing a fundamental concept here.

In:

http://groups.google.com/group/comp....1472dc3?hl=en&

I discussed a way of using the LEFT JOIN ... ID IS NULL as a faster
alternative to using NOT IN (...). Douglas' SQL is nearly identical to
what I suggested.

If you create an unmatched query using the wizard on, say, a tblOrders
and tblOrderDetails using the OrderID you will see that the 'IS NULL'
part of the SQL is used to determine the unmatched part since an
OrderID that shows up in tblOrders but does not show up in
tblOrderDetails will have tblOrders.OrderID with a number and
tblOrderDetails.OrderID with a Null because the LEFT JOIN shows all the
OrderID's in tblOrders and no OrderID in tblOrderDetails exists that
matches it. So this kind of LEFT JOIN is used to get the list of
OrderID's that satisfy the 'NOT' part. For a large table the
'positive' list (using IN) is searched much more quickly than the
'negative' list (using NOT IN).

VBA can certainly perform better than using NOT IN with SQL but
adjusting the SQL to run faster by using LEFT JOIN instead of NOT IN
can often obviate the need for VBA or .NET code entirely.

James A. Fortune


Thanks for the tip. I'm putting out other fires right now, but will give it
a shot and post back...
Nov 13 '05 #6
deko wrote:
You're missing a fundamental concept here.

In:

http://groups.google.com/group/comp....1472dc3?hl=en&

I discussed a way of using the LEFT JOIN ... ID IS NULL as a faster
alternative to using NOT IN (...). Douglas' SQL is nearly identical to
what I suggested.

If you create an unmatched query using the wizard on, say, a tblOrders
and tblOrderDetails using the OrderID you will see that the 'IS NULL'
part of the SQL is used to determine the unmatched part since an
OrderID that shows up in tblOrders but does not show up in
tblOrderDetails will have tblOrders.OrderID with a number and
tblOrderDetails.OrderID with a Null because the LEFT JOIN shows all the
OrderID's in tblOrders and no OrderID in tblOrderDetails exists that
matches it. So this kind of LEFT JOIN is used to get the list of
OrderID's that satisfy the 'NOT' part. For a large table the
'positive' list (using IN) is searched much more quickly than the
'negative' list (using NOT IN).

VBA can certainly perform better than using NOT IN with SQL but
adjusting the SQL to run faster by using LEFT JOIN instead of NOT IN
can often obviate the need for VBA or .NET code entirely.

James A. Fortune


Thanks for the tip. I'm putting out other fires right now, but will give it
a shot and post back...


For Cat_ID = 4 the NOT IN version came out something like:

SELECT DocID, Entity_ID, Document, IIf(DeleteMe(Right([Document],
Len([Document]) - 5)) = -1, 'True', 'False') AS DeletionSuccessful FROM
tblDocuments WHERE Entity_ID IN (SELECT tblEntity.Entity_ID FROM
tblEntity WHERE Cat_ID = 4) AND Document NOT IN (SELECT Document FROM
tblDocuments INNER JOIN tblEntity ON tblEntity.Entity_ID =
tblDocuments.Entity_ID WHERE Cat_ID <> 4) AND Left([Document],5) =
'file:';

The declaration for DeleteMe looks like:

Public Function DeleteMe(strDoc As String) As Boolean

I don't see immediately (or even not so immediately) how to convert the
NOT IN part to a LEFT JOIN that returns the set of documents that are
not in any other categories. The

Entity_ID IN (SELECT tblEntity.Entity_ID FROM tblEntity WHERE Cat_ID =
4)

part gets the the documents in the given category nicely. Part of the
problem is having to get the Cat_ID indirectly (via a join) from
tblEntity rather than from tblDocuments. The Unmatched Query works
best when the left join field is in both tables (i.e., Cat_ID). But
you probably don't want to do any denormalizing to achieve that. All
roads look mired with some kind of problem so in this case the VBA
solution is looking pretty nice. Perhaps you had more success than I
did. Maybe what I did will point to a SQL solution. Sigh.

James A. Fortune

Nov 13 '05 #7
> For Cat_ID = 4 the NOT IN version came out something like:

SELECT DocID, Entity_ID, Document, IIf(DeleteMe(Right([Document],
Len([Document]) - 5)) = -1, 'True', 'False') AS DeletionSuccessful FROM
tblDocuments WHERE Entity_ID IN (SELECT tblEntity.Entity_ID FROM
tblEntity WHERE Cat_ID = 4) AND Document NOT IN (SELECT Document FROM
tblDocuments INNER JOIN tblEntity ON tblEntity.Entity_ID =
tblDocuments.Entity_ID WHERE Cat_ID <> 4) AND Left([Document],5) =
'file:';

The declaration for DeleteMe looks like:

Public Function DeleteMe(strDoc As String) As Boolean

I don't see immediately (or even not so immediately) how to convert the
NOT IN part to a LEFT JOIN that returns the set of documents that are
not in any other categories. The

Entity_ID IN (SELECT tblEntity.Entity_ID FROM tblEntity WHERE Cat_ID =
4)

part gets the the documents in the given category nicely. Part of the
problem is having to get the Cat_ID indirectly (via a join) from
tblEntity rather than from tblDocuments. The Unmatched Query works
best when the left join field is in both tables (i.e., Cat_ID). But
you probably don't want to do any denormalizing to achieve that. All
roads look mired with some kind of problem so in this case the VBA
solution is looking pretty nice. Perhaps you had more success than I
did. Maybe what I did will point to a SQL solution. Sigh.

James A. Fortune


Thanks again for the detailed reply. The DLookup is working for now, but I
have another similar situation where a DLookup will not work for me. I've
got to deliver this tomorrow (today), and it's 2:00 am, so I'm still using
the Not In SQL statement in that one situation. I'll post back with results
of the Left Join idea as soon as I get a chance.
Nov 13 '05 #8
Got it working.

Here's the problem I was faced with:

Users perform an advanced search which selects records form the database
based on a number of different user-defined criteria. The query used to
select the records is built with dynamic SQL and is called qry5000. Another
query, qry5000Eid, selects unique Entity IDs from qry5000.

Each record in the advanced search set may or may not contain hyperlinks to
documents stored on disk, and there may or may not be other records in the
database (not in the advanced search set) that also contain hyperlinks to
these same documents.

When a user issues a command to delete all records in the advanced search
set, all hyperlinked documents associated with those records should be
deleted from disk - unless there are other records in the database (not in
the advanced search set) that contain a hyperlink to the document. Deleting
incorrect documents is a very bad thing.

Here's the text of the 3 queries required to build the recordset:

qryDocsAdvSearch [all documents linked to entities in the advanced search
set]
SELECT tblDocuments.Document, qry5000Eid.Entity_ID FROM tblDocuments INNER
JOIN qry5000Eid ON tblDocuments.Entity_ID = qry5000Eid.Entity_ID;

qryDocsAdvSearchEidNotIn [all Entity_IDs NOT in the advanced search set]
SELECT tblEntity.Entity_ID FROM tblEntity LEFT JOIN qry5000Eid ON
tblEntity.Entity_ID = qry5000Eid.Entity_ID WHERE (qry5000Eid.Entity_ID Is
Null);

QryDocsAdvSearchDelete [all documents linked to entities NOT in the advanced
search set that are also linked to entities IN the advanced search set]
SELECT HyperlinkPart(tblDocuments.Document, 2) As Doc,
tblDocuments.Entity_ID FROM (tblDocuments INNER JOIN qryDocsAdvSearch ON
HyperlinkPart(tblDocuments.Document, 2) =
HyperlinkPart(qryDocsAdvSearch.Document, 2)) INNER JOIN
qryDocsAdvSearchEidNotIn ON tblDocuments.Entity_ID =
qryDocsAdvSearchEidNotIn.Entity_ID;

The challenge was getting Qry2 correct - I had to try a few variations
before I got it right. It's *much* faster than SELECT ... WHERE ... Not In
....

I build the recordset on the last query and loop through it with the
scripting runtime to delete the documents from disk.
Nov 13 '05 #9
A few corrections...
qryDocsAdvSearch [all documents linked to entities in the advanced search
set]
SELECT tblDocuments.Document, qry5000Eid.Entity_ID FROM tblDocuments INNER
JOIN qry5000Eid ON tblDocuments.Entity_ID = qry5000Eid.Entity_ID;
[qryDocsAdvSearch corrected]
SELECT HyperlinkPart(tblDocuments.Document, 2) AS Doc,
tblDocuments.Entity_ID FROM tblDocuments INNER JOIN qry5000Eid ON
tblDocuments.Entity_ID = qry5000Eid.Entity_ID;
qryDocsAdvSearchEidNotIn [all Entity_IDs NOT in the advanced search set]
SELECT tblEntity.Entity_ID FROM tblEntity LEFT JOIN qry5000Eid ON
tblEntity.Entity_ID = qry5000Eid.Entity_ID WHERE (qry5000Eid.Entity_ID Is
Null);
[no correction for qryDocsAdvSearchEidNotIn]
QryDocsAdvSearchDelete [all documents linked to entities NOT in the
advanced search set that are also linked to entities IN the advanced
search set]
SELECT HyperlinkPart(tblDocuments.Document, 2) As Doc,
tblDocuments.Entity_ID FROM (tblDocuments INNER JOIN qryDocsAdvSearch ON
HyperlinkPart(tblDocuments.Document, 2) =
HyperlinkPart(qryDocsAdvSearch.Document, 2)) INNER JOIN
qryDocsAdvSearchEidNotIn ON tblDocuments.Entity_ID =
qryDocsAdvSearchEidNotIn.Entity_ID;
[QryDocsAdvSearchDelete corrected]
SELECT HyperlinkPart(tblDocuments.Document, 2) AS Doc,
tblDocuments.Entity_ID FROM (tblDocuments INNER JOIN qryDocsAdvSearchDocs ON
HyperlinkPart(tblDocuments.Document, 2) = qryDocsAdvSearchDocs.Doc) INNER
JOIN qryDocsAdvSearchEidNotIn ON tblDocuments.Entity_ID =
qryDocsAdvSearchEidNotIn.Entity_ID;
I build the recordset on the last query and loop through it with the
scripting runtime to delete the documents from disk.


[corrected]
I build the recordset on qryDocsAdvSearch, loop through it and look for each
Doc in QryDocsAdvSearchDelete - if no match is found, then delete.

Private Const QT As String = """"

Set qdf = qdfs("qryDocsAdvSearch")
Set rst = qdf.OpenRecordset
Do While Not rst.EOF
strDoc = rst!Doc
lngEid = Nz(DLookup("Entity_ID", "QryDocsAdvSearchDelete", _
"Doc = " & QT & strDoc & QT), 0)
If Len(strDoc) <> 0 And lngEid = 0 Then
If Left$(strDoc, 5) = "file:" Then
strDoc = Right$(strDoc, Len(strDoc) - 5)
End If
fso.DeleteFile strDoc, True
End If
rst.MoveNext
Loop
Nov 13 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: eXavier | last post by:
Hello, I have query joining several tables, the last table is joined with LEFT JOIN. The last table has more then million rows and execution plan shows table scan on it. I have indexed columns...
7
by: alexcn | last post by:
I have the following query: SELECT dbo.tSymExch.exCode, dbo.tSymGrp.sgCode, dbo.tSymMain.smCode FROM dbo.tSymExch FULL OUTER JOIN dbo.tSymGrp ON dbo.tSymExch.exID =...
6
by: Thomas Beutin | last post by:
Hi, i've a speed problem withe the following statement: SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id FROM ot_adresse AS a, ot_produkt AS p LEFT OUTER JOIN ot_kat_prod AS pz ON (...
2
by: smauldin | last post by:
Why does the execution plan have a nested loop join for a simple select with an UDF in the where clause? Here is the query: select * from test_plan where vCol = my_udf('test') Here is the...
23
by: Brian | last post by:
Hello All - I am wondering if anyone has any thoughts on which is better from a performance perspective: a nested Select statement or an Inner Join. For example, I could do either of the...
21
by: CSN | last post by:
I have a pretty simple select query that joins a table (p) with 125K rows with another table (pc) with almost one million rows: select p.* from product_categories pc inner join products p on...
4
by: Michael Fuhr | last post by:
I have a query that works in 7.3.6 but not in 7.4.2 unless I turn off enable_hashjoin. I'm joining a table of network interfaces and a table of networks so I can find additional info about a...
6
by: dmonroe | last post by:
hi group -- Im having a nested inner join problem with an Access SQl statement/Query design. Im running the query from ASP and not usng the access interface at all. Here's the tables: ...
52
by: MP | last post by:
Hi trying to begin to learn database using vb6, ado/adox, mdb format, sql (not using access...just mdb format via ado) i need to group the values of multiple fields - get their possible...
1
by: Preeti.s83 | last post by:
is it possible to replace join type ( for eg. nested loop with hash join and so on) in xml plan... we will fst take plan in xml format ( show xml plan ) and then we will replace join with other...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.