
SQL Syntax

I have an Access 97 database on WinXP with a linked Excel table of about
100,000 rows.
I'm trying to construct a query that will give me an editable datasheet
showing records where data in a pair of fields is duplicated.

As an example, I constructed this query on the Employees table in the
Northwind sample database. I appended four rows from the table back to the
table, creating duplicates, then ran this SQL to locate rows where FirstName
and LastName together were duplicated.

SELECT Employees.*
FROM Employees
WHERE [LastName] & [FirstName] In (
SELECT [LastName] & [FirstName] AS FullName
FROM Employees
GROUP BY [LastName] & [FirstName]
HAVING Count([LastName] & [FirstName])>1)
AND Employees.TitleOfCourtesy='Mr.';

The final AND condition is actually a Date in my "real" project. LastName
and FirstName are Product Labels and ID Numbers. A query that works as
above will work on my real data.

This gives me exactly what I want, and I can add and change fields as
required (and it will be required.)

However, it takes (on my machine) about 10 seconds to run on the real data.
By creating and saving two separate queries, then constructing a third
query JOINing on FullName I can see the same result in 4 seconds.
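
For readers who want to compare the two shapes side by side, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than Access/Jet, purely so it can be run anywhere; that substitution is an assumption, as is the small sample data. SQLite concatenates with `||` where Access uses `&`, and a view stands in for the saved duplicates query. It only shows that both forms return the same rows and says nothing about their relative speed in Jet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        LastName TEXT, FirstName TEXT, TitleOfCourtesy TEXT
    );
    INSERT INTO Employees (LastName, FirstName, TitleOfCourtesy) VALUES
        ('Davolio', 'Nancy', 'Ms.'),
        ('Fuller', 'Andrew', 'Dr.'),
        ('Buchanan', 'Steven', 'Mr.'),
        ('Suyama', 'Michael', 'Mr.'),
        ('Buchanan', 'Steven', 'Mr.'),
        ('Davolio', 'Nancy', 'Ms.');

    -- Stand-in for the separately saved duplicates query.
    CREATE VIEW qryDuplicates AS
        SELECT LastName || FirstName AS FullName
        FROM Employees
        GROUP BY LastName || FirstName
        HAVING COUNT(*) > 1;
""")

# Form 1: everything in one statement, filtering with IN (subquery).
in_rows = conn.execute("""
    SELECT EmployeeID FROM Employees
    WHERE LastName || FirstName IN (
        SELECT LastName || FirstName FROM Employees
        GROUP BY LastName || FirstName
        HAVING COUNT(*) > 1)
      AND TitleOfCourtesy = 'Mr.'
    ORDER BY EmployeeID
""").fetchall()

# Form 2: join the base table to the saved duplicates query on FullName.
join_rows = conn.execute("""
    SELECT E.EmployeeID
    FROM Employees AS E
    INNER JOIN qryDuplicates AS D
        ON E.LastName || E.FirstName = D.FullName
    WHERE E.TitleOfCourtesy = 'Mr.'
    ORDER BY E.EmployeeID
""").fetchall()

print(in_rows)
print(join_rows)
```

Both queries pick out the two duplicated 'Mr.' rows in this sample.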

So, my question.
Is there a different syntax that will give me the same (or better!) speed
in ONE query?
I'm aware that I can join on a Select statement instead of a Table or query
by enclosing the select statement in square brackets followed by a dot,
even in 97. My gut feeling is that there's a way I can use that fact, but
just at the moment my gut isn't talking to my brain.

I'd be grateful for any help.

Cheers,
DBDriver
May 9 '07 #1
6 Replies


> I'm aware that I can join on a Select statement instead of a Table or query
> by enclosing the select statement in square brackets followed by a dot,
> even in 97. My gut feeling is that there's a way I can use that fact,

You can add a derived table to your query in the form of a subquery
in SQL. Access messes this up once you view it in design mode: it
replaces the proper parentheses around the subquery with square brackets
and a dot, e.g. (air code):

Select
Employees.name,
qryDuplicates.FullName
From
Employees Inner Join
(SELECT
FirstName, LastName,
[LastName] & [FirstName] AS FullName
FROM Employees
GROUP BY [LastName] & [FirstName]
HAVING Count([LastName] & [FirstName])>1
AND Employees.TitleOfCourtesy='Mr.';
) as qryDuplicates
On ((tblEmpoyees.FirstName = qryDuplicates.FirstName
AND (tblEmployees.LastName = qryDuplication.LastName))

Once you view this in design mode, Access will change it to:

Select
Employees.name,
qryDuplicates.FullName
From
Employees Inner Join
[SELECT
FirstName, LastName,
[LastName] & [FirstName] AS FullName
FROM Employees
GROUP BY [LastName] & [FirstName]
HAVING Count([LastName] & [FirstName])>1
AND Employees.TitleOfCourtesy='Mr.';
]. as qryDuplicates
On ((tblEmpoyees.FirstName = qryDuplicates.FirstName
AND (tblEmployees.LastName = qryDuplication.LastName))

...and then it won't recompile! Maybe I'm missing something here,
but I always found that odd.

You could do that here, but it wouldn't be any faster. In fact, it may
actually be slower than joining a separately compiled query. And
it wouldn't be updatable either, so no joy there.

You may want to try compiling the duplicate query separately, but
still using the "Where Field In" filter. I'm not sure how the Jet SQL
engine works, but I'd worry that it was re-evaluating that SQL
expression for each record of the Employees table. Try:

qryEmployees:=
SELECT Employees.*
FROM Employees
WHERE [LastName] & [FirstName] In (qryDuplicates)
AND Employees.TitleOfCourtesy='Mr.';

qryDuplicates:=
SELECT [LastName] & [FirstName] AS FullName
FROM Employees
GROUP BY [LastName] & [FirstName]
HAVING Count([LastName] & [FirstName])>1

...may not make a difference, but worth a shot :)

Also, for speed, never use "select tbl.*"; naming the fields is faster.

> just at the moment my gut isn't talking to my brain.

Ha! Story of my life, but mostly it's my liver not talking to
anyone... whoever said Guinness was good for you!

May 9 '07 #2

Just noticed I messed up my copy & pasting. That _should_ have been:

qryEmployees:=
SELECT Employees.*
FROM Employees
WHERE [LastName] & [FirstName] In (qryDuplicates);

qryDuplicates:=
SELECT [LastName] & [FirstName] AS FullName
FROM Employees
GROUP BY [LastName] & [FirstName]
HAVING Count([LastName] & [FirstName])>1
AND Employees.TitleOfCourtesy='Mr.';

...much better!

May 9 '07 #3

Thanks for looking at this, BillCo.

I found that Access complained about the AND clause in qryDuplicates,
telling me it should have been part of an Aggregate function.

I found that qryEmployees considered qryDuplicates to be a Parameter.
(Testing in Win98 and Access 97)

It's been an interesting exercise.

Cheers,
DBDriver

BillCo <co**********@gmail.com> wrote in
news:11**********************@l77g2000hsb.googlegroups.com:

> just noticed i messed up my copy & pasting - that _should_ have been:
>
> qryEmployees:=
> SELECT Employees.*
> FROM Employees
> WHERE [LastName] & [FirstName] In (qryDuplicates);
>
> qryDuplicates:=
> SELECT [LastName] & [FirstName] AS FullName
> FROM Employees
> GROUP BY [LastName] & [FirstName]
> HAVING Count([LastName] & [FirstName])>1
> AND Employees.TitleOfCourtesy='Mr.';
>
> ...much better!

May 9 '07 #4

> you can add a derived table to your query in the form of a sub query
> in SQL - Access messes this up once you view it in design mode and
> converts the proper bracketing of the sub query with square brackets
> and dot. eg (air code)

I hadn't noticed that. Thanks!

> you could do that here, but it wouldn't be any faster. In fact it may
> actually be slower than having a separately compiled query joined. And
> it wouldn't be updatable either... so no joy there

Update is essential. I've created some quite interesting constructions
which weren't updatable (and therefore no use to me).

> You may want to try compiling the duplicate query separately, but
> using the same technique of using the "Where Field In" filter - I'm
> not sure how the jet sql engine works, but I'd worry it was
> re-evaluating that SQL expression for each record of the Employees
> table; try:
>
> qryEmployees:=
> SELECT Employees.*
> FROM Employees
> WHERE [LastName] & [FirstName] In (qryDuplicates)
> AND Employees.TitleOfCourtesy='Mr.';
>
> qryDuplicates:=
> SELECT [LastName] & [FirstName] AS FullName
> FROM Employees
> GROUP BY [LastName] & [FirstName]
> HAVING Count([LastName] & [FirstName])>1
>
> ...may not make a difference, but worth a shot :)

(Comment in next message.)

> Also for speed, never use "select tbl.*" - naming the fields is faster

You're right, but in this case I'm making a semi-template to get various
duplicated entries in multiple columns from a 90-column Excel sheet.

I was really just curious about the []. method. My method worked quite well
for me, but there's always someone out there with a skillful tweak.

Cheers,
DBDriver
May 9 '07 #5

On May 9, 1:24 pm, DBDriver <DBDri...@no.co> wrote:

> Thanks for looking at this, BillCo
>
> I found that Access complained about the AND clause in qryDuplicates,
> telling me it should have been part of an Aggregate function.
>
> I found that qryEmployees considered qryDuplicates to be a Parameter.
> (Testing in Win98 and Access 97)
>
> It's been an interesting exercise.
>
> Cheers,
> DBDriver

Here's where it gets tricky: it's so hard to advise without seeing
it first hand. Try changing qryDuplicates to:

SELECT [LastName] & [FirstName] AS FullName
FROM Employees
WHERE Employees.TitleOfCourtesy='Mr.'
GROUP BY [LastName] & [FirstName]
HAVING Count([LastName] & [FirstName])>1
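
The reason for moving the condition is that WHERE filters individual rows before grouping, while HAVING filters whole groups afterwards, so a test on a plain column such as TitleOfCourtesy belongs in WHERE. A minimal sketch of that ordering follows. It runs on Python's sqlite3 rather than Jet (an assumption made for runnability, with `||` in place of `&`), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (LastName TEXT, FirstName TEXT, TitleOfCourtesy TEXT);
    INSERT INTO Employees VALUES
        ('Buchanan', 'Steven', 'Mr.'),
        ('Buchanan', 'Steven', 'Mr.'),
        ('Davolio', 'Nancy', 'Ms.'),
        ('Davolio', 'Nancy', 'Ms.'),
        ('King', 'Robert', 'Mr.'),
        ('King', 'Robert', 'Dr.');
""")

# WHERE removes non-'Mr.' rows first; only then are the survivors
# grouped and the groups with more than one row kept by HAVING.
dupes = conn.execute("""
    SELECT LastName || FirstName AS FullName, COUNT(*) AS n
    FROM Employees
    WHERE TitleOfCourtesy = 'Mr.'
    GROUP BY LastName || FirstName
    HAVING COUNT(*) > 1
    ORDER BY FullName
""").fetchall()

print(dupes)
```

Here 'King Robert' appears twice in the table but only once as 'Mr.', so it is not reported; only duplicates within the filtered set count.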

Actually, try to make this one work as well (I'm curious after writing
the thing!):

Select
Employees.name,
qryDuplicates.FullName
From
Employees Inner Join
(SELECT
FirstName, LastName,
[LastName] & [FirstName] AS FullName
FROM Employees
GROUP BY [LastName] & [FirstName]
WHERE Employees.TitleOfCourtesy='Mr.'
HAVING Count([LastName] & [FirstName])>1
) as qryDuplicates
On ((tblEmpoyees.FirstName = qryDuplicates.FirstName
AND (tblEmployees.LastName = qryDuplication.LastName))

Some final tips:
- You may have to go back to basics and analyse what you are trying to
achieve here. Is there a better way to get there? Maybe stop
duplicates getting in to begin with.
- You may wish to revisit your use of first and last names as a
combined unique identifier. If there is no naturally unique
entity to use, then consider a synthetic primary key.
- If you absolutely need a fast updatable query (and my experience
tells me that exposing users to data tables is not a good plan, even
when masked by a bound form), consider filling a temporary table on the fly
and joining it to your main table to reduce the results returned...
speedville!
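
The temporary-table idea in the last tip can be sketched roughly as below. This is an illustration only: it uses Python's sqlite3 instead of Access, and the table and column names (Products, Label, IDNumber, Yr, Note, tmpDupKeys) are hypothetical, chosen to echo the "Product Labels, ID Numbers and Year" description of the real data. Whether the joined result is editable in an Access datasheet depends on Jet's own updatability rules, which this sketch does not model; here the edit goes to the base table by primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (
        ID INTEGER PRIMARY KEY,
        Label TEXT, IDNumber INTEGER, Yr INTEGER, Note TEXT
    );
    INSERT INTO Products (Label, IDNumber, Yr) VALUES
        ('Widget', 100, 2006),
        ('Widget', 100, 2006),
        ('Widget', 100, 2007),
        ('Gadget', 200, 2006);

    -- Step 1: fill a temporary table with the key combinations
    -- that occur more than once within a year.
    CREATE TEMP TABLE tmpDupKeys AS
        SELECT Label, IDNumber, Yr
        FROM Products
        GROUP BY Label, IDNumber, Yr
        HAVING COUNT(*) > 1;
""")

# Step 2: join the small key table back to the main table.
rows = conn.execute("""
    SELECT P.ID, P.Label, P.IDNumber, P.Yr
    FROM Products AS P
    INNER JOIN tmpDupKeys AS T
        ON P.Label = T.Label AND P.IDNumber = T.IDNumber AND P.Yr = T.Yr
    ORDER BY P.ID
""").fetchall()
print(rows)

# Step 3: apply an edit to one of the flagged rows via its primary key.
conn.execute("UPDATE Products SET Note = 'check me' WHERE ID = ?", (rows[0][0],))
note = conn.execute("SELECT Note FROM Products WHERE ID = ?", (rows[0][0],)).fetchone()[0]
print(note)
```

The grouping work is done once when the temporary table is filled, so the query that follows is a plain join on a small key set.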

May 9 '07 #6

Comments inline.

BillCo <co**********@gmail.com> wrote in
news:11**********************@w5g2000hsg.googlegroups.com:

> On May 9, 1:24 pm, DBDriver <DBDri...@no.co> wrote:
>> Thanks for looking at this, BillCo
>>
>> I found that Access complained about the AND clause in qryDuplicates,
>> telling me it should have been part of an Aggregate function.
>>
>> I found that qryEmployees considered qryDuplicates to be a Parameter.
>> (Testing in Win98 and Access 97)
>>
>> It's been an interesting exercise.
>>
>> Cheers,
>> DBDriver
>
> here's where it gets tricky - it's so hard to advise without seeing
> first hand.

Very true. As I said:
"The final AND condition is actually a Date in my "real" project. LastName
and FirstName are Product Labels and ID Numbers. A query that works as
above will work on my real data."

In this case, the data didn't matter. My interest was in different ways of
structuring the queries and subQueries.
I've always managed to get exactly what I want using multiple queries. In a
few cases I've found that a MakeTable was needed if I wanted same-day
results. Often the single query has been slower.

> Try changing qryDuplicates to:
> SELECT [LastName] & [FirstName] AS FullName
> FROM Employees
> WHERE Employees.TitleOfCourtesy='Mr.'
> GROUP BY [LastName] & [FirstName]
> HAVING Count([LastName] & [FirstName])>1

My analogy (TitleOfCourtesy here = a unique Year in my "real" data) is not
good. The most common criterion will be that field, as duplications in
separate Years are OK while duplications within a Year are not.

The query I used in my original message is fine.
I was just curious about a join using the []. method, since I would use an
actual query instead.

> Actually, try to make this one work aswell (i'm curious after writing
> the thing!):

Sorry, still won't work for me. :-)
Office 97 on Win98 and Office 2003 on XP.

Error message:
Syntax Error in FROM Clause.

I noticed and changed qryDuplication, which of course made no difference.

> Select
> Employees.name,
> qryDuplicates.FullName
> From
> Employees Inner Join
> (SELECT
> FirstName, LastName,
> [LastName] & [FirstName] AS FullName
> FROM Employees
> GROUP BY [LastName] & [FirstName]
> WHERE Employees.TitleOfCourtesy='Mr.'
> HAVING Count([LastName] & [FirstName])>1
> ) as qryDuplicates
> On ((tblEmpoyees.FirstName = qryDuplicates.FirstName
> AND (tblEmployees.LastName = qryDuplication.LastName))
>
> some final tips:
> - You may have to go back to basics and analyse what you are trying to
> achieve here... is there a better way to get there? Maybe stop
> duplicates getting in to begin with.

That's the stage I'm trying to reach. The current query is to help me
prepare the Excel sheet for its new and exciting career as an Access
Database.

> - you may wish to revisit your use of first and last names as a
> combined unique identifier. if there is no naturally absolutely unique
> entity to use, then consider a synthetic primary key

That's just my analogy. My "real" data is more relevant.

> - If you absolutely need a fast updatable query (and my experience
> tells me that exposing users to data tables is not a good plan - even
> masked by a bound form), consider filling a temporary table on the fly
> and joining it on your main table to reduce the results returned...
> speedville!

You are absolutely right - the Temp table method is a good one.

Slight shift of approach - here's the *structure* of the kind of thing I
was attempting to explore;

A table of File information, obvious fields, DirID for my use.

tblDirInfo:
ID, AutoNumber, primary key
DirID, Long
FileName, Text
FileSize, Long
FileDate, Date/Time

This Query on that table:

SELECT DX.ID, DX.DirID, DX.FileName, DX.FileSize, DX.FileDate
FROM tblDirInfo AS DX INNER JOIN
[SELECT FileName, FileSize, FileDate
FROM tblDirInfo
GROUP BY FileName, FileSize, FileDate
HAVING (COUNT(*) = 1)].
AS Virt ON DX.FileName = Virt.FileName
ORDER BY DX.FileDate;

Look at that in Design View as an example of the kind of "virtual tables
through subqueries" picture I had in my head.
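
For anyone without Access to hand, the same "virtual table through a subquery" structure can be tried with the parenthesised derived-table form. The sketch below uses Python's sqlite3, not Jet, so the `[...]. AS Virt` bracket syntax is replaced by `(...) AS Virt`; the sample rows are invented. One deliberate difference, and an assumption about intent: it joins on all three grouped columns rather than on FileName alone, so that a file name shared by two different files cannot match more than one row of the virtual table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblDirInfo (
        ID INTEGER PRIMARY KEY,
        DirID INTEGER, FileName TEXT, FileSize INTEGER, FileDate TEXT
    );
    INSERT INTO tblDirInfo (DirID, FileName, FileSize, FileDate) VALUES
        (1, 'a.txt', 10, '2007-05-01'),
        (2, 'a.txt', 10, '2007-05-01'),
        (1, 'b.txt', 20, '2007-05-02'),
        (2, 'c.txt', 30, '2007-05-03');
""")

# The inner SELECT acts as a virtual table (Virt) holding the
# name/size/date combinations that occur exactly once.
rows = conn.execute("""
    SELECT DX.ID, DX.DirID, DX.FileName
    FROM tblDirInfo AS DX
    INNER JOIN (
        SELECT FileName, FileSize, FileDate
        FROM tblDirInfo
        GROUP BY FileName, FileSize, FileDate
        HAVING COUNT(*) = 1
    ) AS Virt
        ON DX.FileName = Virt.FileName
       AND DX.FileSize = Virt.FileSize
       AND DX.FileDate = Virt.FileDate
    ORDER BY DX.FileDate
""").fetchall()

print(rows)
```

Changing `= 1` to `> 1` in the HAVING clause flips the result from the unique files to the duplicated ones.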

I shouldn't have used the Employees Table to illustrate what I wanted,
because although it provides some handy data for testing, there's a shift
of emphasis when looking at false data.

I mapped my Year info onto the TitleOfCourtesy field.
When you look at a field which can contain values such as Mr, Mrs, Miss,
Master, MZ, Hey You and Plenipotetatissimuss you tend to consider it
unimportant in terms of Grouping. How often, after collecting 20 years of
data would you need to group People information by TitleOfCourtesy, rather
than by Year?

Thanks for your interest.

Cheers,
DBDriver.
May 10 '07 #7
