SQL Syntax

I have an Access 97 database on WinXP with a linked Excel table of about
100,000 rows.
I'm trying to construct a query that will give me an editable datasheet
showing records where data in a pair of fields is duplicated.

As an example, I constructed this query on the Employees table in the
Northwind Sample database. I appended four rows from the table back to the
table, creating duplicates, then ran this SQL to locate rows where FirstName
and LastName together were duplicated.

SELECT Employees.*
FROM Employees
WHERE [LastName] & [FirstName] In (
SELECT [LastName] & [FirstName] AS FullName
FROM Employees
GROUP BY [LastName] & [FirstName]
HAVING Count([LastName] & [FirstName])>1)
AND Employees.TitleOfCourtesy='Mr.';

The final AND condition is actually a Date in my "real" project. LastName
and FirstName are Product Labels and ID Numbers. A query that works as
above will work on my real data.

This gives me exactly what I want, and I can add and change fields as
required (and it will be required.)

However, it takes (on my machine) about 10 seconds to run on the real data.
By creating and saving two separate queries, then constructing a third
query JOINing on FullName I can see the same result in 4 seconds.

So, my question.
Is there a different syntax that will give me the same (or better!) speed
in ONE query?
I'm aware that I can join on a Select statement instead of a Table or query
by enclosing the select statement in square brackets followed by a dot,
even in 97. My gut feeling is that there's a way I can use that fact, but
just at the moment my gut isn't talking to my brain.

I'd be grateful for any help.

Cheers,
DBDriver
May 9 '07 #1
> I'm aware that I can join on a Select statement instead of a Table or query
> by enclosing the select statement in square brackets followed by a dot,
> even in 97. My gut feeling is that there's a way I can use that fact,

you can add a derived table to your query in the form of a subquery
in SQL - Access messes this up once you view it in design mode and
replaces the proper parentheses around the subquery with square brackets
and a dot. e.g. (air code)

Select
Employees.name,
qryDuplicates.FullName
From
Employees Inner Join
(SELECT
FirstName, LastName,
[LastName] & [FirstName] AS FullName
FROM Employees
GROUP BY [LastName] & [FirstName]
HAVING Count([LastName] & [FirstName])>1
AND Employees.TitleOfCourtesy='Mr.';
) as qryDuplicates
On ((tblEmpoyees.FirstName = qryDuplicates.FirstName
AND (tblEmployees.LastName = qryDuplication.LastName))
once you view this in the design mode it will change it to:

Select
Employees.name,
qryDuplicates.FullName
From
Employees Inner Join
[SELECT
FirstName, LastName,
[LastName] & [FirstName] AS FullName
FROM Employees
GROUP BY [LastName] & [FirstName]
HAVING Count([LastName] & [FirstName])>1
AND Employees.TitleOfCourtesy='Mr.';
]. as qryDuplicates
On ((tblEmpoyees.FirstName = qryDuplicates.FirstName
AND (tblEmployees.LastName = qryDuplication.LastName))

....and then it won't recompile!!! maybe i'm missing something here -
but I always found that odd.

you could do that here, but it wouldn't be any faster. In fact it may
actually be slower than having a separately compiled query joined. And
it wouldn't be updatable either... so no joy there

You may want to try compiling the duplicate query separately, but
using the same technique of using the "Where Field In" filter - I'm
not sure how the Jet SQL engine works, but I'd worry it was
re-evaluating that SQL expression for each record of the Employees
table; try:

qryEmployees:=
SELECT Employees.*
FROM Employees
WHERE [LastName] & [FirstName] In (qryDuplicates)
AND Employees.TitleOfCourtesy='Mr.';

qryDuplicates:=
SELECT [LastName] & [FirstName] AS FullName
FROM Employees
GROUP BY [LastName] & [FirstName]
HAVING Count([LastName] & [FirstName])>1

....may not make a difference, but worth a shot :)

Also for speed, never use "select tbl.*" - naming the fields is faster

> just at the moment my gut isn't talking to my brain.

ha! story of my life - but mostly it's my liver not talking to
anyone... whoever said Guinness was good for you!

May 9 '07 #2

just noticed i messed up my copy & pasting - that _should_ have been:
qryEmployees:=
SELECT Employees.*
FROM Employees
WHERE [LastName] & [FirstName] In (qryDuplicates);

qryDuplicates:=
SELECT [LastName] & [FirstName] AS FullName
FROM Employees
GROUP BY [LastName] & [FirstName]
HAVING Count([LastName] & [FirstName])>1
AND Employees.TitleOfCourtesy='Mr.';
....much better!

May 9 '07 #3
Thanks for looking at this, BillCo

I found that Access complained about the AND clause in qryDuplicates,
telling me it should have been part of an Aggregate function.

I found that qryEmployees considered qryDuplicates to be a Parameter.
(Testing in Win98 and Access 97)
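
Both complaints have a plain cause. `In (...)` expects either a list of values or a complete SELECT statement, so a bare saved-query name inside the parentheses is read as an unknown identifier, which Access then asks for as a parameter. And HAVING may only test grouped or aggregated expressions, so a test on an ungrouped column belongs in WHERE. Below is a minimal sketch of the corrected shape. It is only an illustration under assumptions: it runs on SQLite through Python's `sqlite3` module rather than Jet, a view stands in for the saved query, `||` replaces `&`, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        LastName TEXT, FirstName TEXT, TitleOfCourtesy TEXT
    );
    INSERT INTO Employees (LastName, FirstName, TitleOfCourtesy) VALUES
        ('Davolio',  'Nancy',   'Ms.'),
        ('Buchanan', 'Steven',  'Mr.'),
        ('Suyama',   'Michael', 'Mr.'),
        ('Buchanan', 'Steven',  'Mr.'),   -- appended duplicate
        ('Davolio',  'Nancy',   'Ms.');   -- appended duplicate

    -- Stand-in for the saved query qryDuplicates: the row filter sits in
    -- WHERE (applied before grouping); HAVING only tests the aggregate.
    CREATE VIEW qryDuplicates AS
        SELECT LastName || FirstName AS FullName
        FROM Employees
        WHERE TitleOfCourtesy = 'Mr.'
        GROUP BY LastName || FirstName
        HAVING COUNT(*) > 1;
""")

# IN is given a SELECT over the saved query, not the bare query name.
rows = conn.execute("""
    SELECT EmployeeID, LastName, FirstName
    FROM Employees
    WHERE LastName || FirstName IN (SELECT FullName FROM qryDuplicates)
      AND TitleOfCourtesy = 'Mr.'
    ORDER BY EmployeeID
""").fetchall()

print(rows)
```

In Jet the same outer query would presumably read `In (SELECT FullName FROM qryDuplicates)`; whether it beats the single nested query on a linked Excel table would have to be timed on the real data.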

It's been an interesting exercise.

Cheers,
DBDriver

BillCo <co**********@gmail.com> wrote in
news:11**********************@l77g2000hsb.googlegroups.com:

> just noticed i messed up my copy & pasting - that _should_ have been:
>
> qryEmployees:=
> SELECT Employees.*
> FROM Employees
> WHERE [LastName] & [FirstName] In (qryDuplicates);
>
> qryDuplicates:=
> SELECT [LastName] & [FirstName] AS FullName
> FROM Employees
> GROUP BY [LastName] & [FirstName]
> HAVING Count([LastName] & [FirstName])>1
> AND Employees.TitleOfCourtesy='Mr.';
>
> ...much better!

May 9 '07 #4
> you can add a derived table to your query in the form of a sub query
> in SQL - Access messes this up once you view it in design mode and
> converts the proper bracketing of the sub query with square brackets
> and dot. eg (air code)

I hadn't noticed that. Thanks!

> you could do that here, but it wouldn't be any faster. In fact it may
> actually be slower than having a separately compiled query joined. And
> it wouldn't be updatable either... so no joy there

Update is essential. I've created some quite interesting constructions
which weren't updatable (and therefore no use to me.)

> You may want to try compiling the duplicate query separately, but
> using the same technique of using the "Where Field In" filter - I'm
> not sure how the Jet SQL engine works, but I'd worry it was
> re-evaluating that SQL expression for each record of the Employees
> table; try:
>
> qryEmployees:=
> SELECT Employees.*
> FROM Employees
> WHERE [LastName] & [FirstName] In (qryDuplicates)
> AND Employees.TitleOfCourtesy='Mr.';
>
> qryDuplicates:=
> SELECT [LastName] & [FirstName] AS FullName
> FROM Employees
> GROUP BY [LastName] & [FirstName]
> HAVING Count([LastName] & [FirstName])>1
>
> ...may not make a difference, but worth a shot :)

(Comment in next message.)

> Also for speed, never use "select tbl.*" - naming the fields is faster

You're right, but in this case I'm making a semi-template to get various
duplicated entries in multiple columns from a 90 column Excel sheet.

I was really just curious about the []. method. My method worked quite well
for me, but there's always someone out there with a skillful tweak.

Cheers,
DBDriver
May 9 '07 #5
On May 9, 1:24 pm, DBDriver <DBDri...@no.co> wrote:
> Thanks for looking at this, BillCo
>
> I found that Access complained about the AND clause in qryDuplicates,
> telling me it should have been part of an Aggregate function.
>
> I found that qryEmployees considered qryDuplicates to be a Parameter.
> (Testing in Win98 and Access 97)
>
> It's been an interesting exercise.
>
> Cheers,
> DBDriver

here's where it gets tricky - it's so hard to advise without seeing
first hand. Try changing qryDuplicates to:

SELECT [LastName] & [FirstName] AS FullName
FROM Employees
WHERE Employees.TitleOfCourtesy='Mr.'
GROUP BY [LastName] & [FirstName]
HAVING Count([LastName] & [FirstName])>1

Actually, try to make this one work as well (i'm curious after writing
the thing!):

Select
Employees.name,
qryDuplicates.FullName
From
Employees Inner Join
(SELECT
FirstName, LastName,
[LastName] & [FirstName] AS FullName
FROM Employees
GROUP BY [LastName] & [FirstName]
WHERE Employees.TitleOfCourtesy='Mr.'
HAVING Count([LastName] & [FirstName])>1
) as qryDuplicates
On ((tblEmpoyees.FirstName = qryDuplicates.FirstName
AND (tblEmployees.LastName = qryDuplication.LastName))

some final tips:
- You may have to go back to basics and analyse what you are trying to
achieve here... is there a better way to get there? Maybe stop
duplicates getting in to begin with.
- you may wish to revisit your use of first and last names as a
combined unique identifier. if there is no naturally, absolutely unique
entity to use, then consider a synthetic primary key
- If you absolutely need a fast updatable query (and my experience
tells me that exposing users to data tables is not a good plan - even
masked by a bound form), consider filling a temporary table on the fly
and joining it on your main table to reduce the results returned...
speedville!
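
The temporary-table tip could look roughly like the sketch below. It is an illustration under stated assumptions, not a tested Access recipe: it runs on SQLite through Python's `sqlite3` module instead of Jet, the table name `tmpDupKeys` and the sample rows are made up, and `||` stands in for `&`. The idea is to work out the duplicated keys once, store them in a small keyed table, and join the main table to that.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        LastName TEXT, FirstName TEXT, TitleOfCourtesy TEXT
    );
    INSERT INTO Employees (LastName, FirstName, TitleOfCourtesy) VALUES
        ('Fuller', 'Andrew',  'Dr.'),
        ('King',   'Robert',  'Mr.'),
        ('Suyama', 'Michael', 'Mr.'),
        ('King',   'Robert',  'Mr.'),   -- appended duplicate
        ('Suyama', 'Michael', 'Mr.'),   -- appended duplicate
        ('Fuller', 'Andrew',  'Dr.');   -- duplicate, but not 'Mr.'

    -- Step 1: compute the duplicated keys once and store them.
    -- tmpDupKeys is a hypothetical name; the primary key gives the join an index.
    CREATE TEMP TABLE tmpDupKeys (FullName TEXT PRIMARY KEY);
    INSERT INTO tmpDupKeys (FullName)
        SELECT LastName || FirstName
        FROM Employees
        WHERE TitleOfCourtesy = 'Mr.'
        GROUP BY LastName || FirstName
        HAVING COUNT(*) > 1;
""")

# Step 2: join the main table to the small keyed table.
dup_ids = [row[0] for row in conn.execute("""
    SELECT e.EmployeeID
    FROM Employees AS e
    INNER JOIN tmpDupKeys AS k
        ON e.LastName || e.FirstName = k.FullName
    WHERE e.TitleOfCourtesy = 'Mr.'
    ORDER BY e.EmployeeID
""")]

print(dup_ids)
```

In Access the first step would be a make-table or append query run before the datasheet is opened; whether the resulting join stays editable there depends on the keys involved and needs checking against the real tables.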

May 9 '07 #6
Comments inline.

BillCo <co**********@gmail.com> wrote in
news:11**********************@w5g2000hsg.googlegroups.com:

> On May 9, 1:24 pm, DBDriver <DBDri...@no.co> wrote:
>> Thanks for looking at this, BillCo
>>
>> I found that Access complained about the AND clause in qryDuplicates,
>> telling me it should have been part of an Aggregate function.
>>
>> I found that qryEmployees considered qryDuplicates to be a Parameter.
>> (Testing in Win98 and Access 97)
>>
>> It's been an interesting exercise.
>>
>> Cheers,
>> DBDriver
>
> here's where it gets tricky - it's so hard to advise without seeing
> first hand.

Very true. As I said:
"The final AND condition is actually a Date in my "real" project. LastName
and FirstName are Product Labels and ID Numbers. A query that works as
above will work on my real data."

In this case, the data didn't matter. My interest was in different ways of
structuring the queries and subQueries.
I've always managed to get exactly what I want using multiple queries. In a
few cases I've found that a MakeTable was needed if I wanted same-day
results. Often the single query has been slower.

> Try changing qryDuplicates to:
> SELECT [LastName] & [FirstName] AS FullName
> FROM Employees
> WHERE Employees.TitleOfCourtesy='Mr.'
> GROUP BY [LastName] & [FirstName]
> HAVING Count([LastName] & [FirstName])>1

My analogy (TitleOfCourtesy here = a unique Year in my "real" data) is not
good. The most common criterion will be that field, as duplications in
separate Years are OK while duplications within a Year are not.

The query I used in my original message is fine.
I was just curious about a join using the []. method, since I would use an
actual query instead.

> Actually, try to make this one work as well (i'm curious after writing
> the thing!):

Sorry, still won't work for me. :-)
Office 97 on Win98 and Office 2003 on XP.

Error message:
Syntax Error in FROM Clause.

I noticed and changed qryDuplication, which of course made no difference.

> Select
> Employees.name,
> qryDuplicates.FullName
> From
> Employees Inner Join
> (SELECT
> FirstName, LastName,
> [LastName] & [FirstName] AS FullName
> FROM Employees
> GROUP BY [LastName] & [FirstName]
> WHERE Employees.TitleOfCourtesy='Mr.'
> HAVING Count([LastName] & [FirstName])>1
> ) as qryDuplicates
> On ((tblEmpoyees.FirstName = qryDuplicates.FirstName
> AND (tblEmployees.LastName = qryDuplication.LastName))
>
> some final tips:
> - You may have to go back to basics and analyse what you are trying to
> achieve here... is there a better way to get there? Maybe stop
> duplicates getting in to begin with.

That's the stage I'm trying to reach. The current query is to help me
prepare the Excel sheet for its new and exciting career as an Access
Database.

> - you may wish to revisit your use of first and last names as a
> combined unique identifier. if there is no naturally, absolutely unique
> entity to use, then consider a synthetic primary key

That's just my analogy. My "real" data is more relevant.

> - If you absolutely need a fast updatable query (and my experience
> tells me that exposing users to data tables is not a good plan - even
> masked by a bound form), consider filling a temporary table on the fly
> and joining it on your main table to reduce the results returned...
> speedville!

You are absolutely right - the Temp table method is a good one.

Slight shift of approach - here's the *structure* of the kind of thing I
was attempting to explore;

A table of File information, obvious fields, DirID for my use.

tblDirInfo:
ID, AutoNumber, primary key
DirID, Long
FileName, Text
FileSize, Long
FileDate, Date/Time

This Query on that table:

SELECT DX.ID, DX.DirID, DX.FileName, DX.FileSize, DX.FileDate
FROM tblDirInfo AS DX INNER JOIN
[SELECT FileName, FileSize, FileDate
FROM tblDirInfo
GROUP BY FileName, FileSize, FileDate
HAVING (COUNT(*) = 1)].
AS Virt ON DX.FileName = Virt.FileName
ORDER BY DX.FileDate;

Look at that in Design View as an example of the kind of "virtual tables
through subqueries" picture I had in my head.
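
For anyone without Access to hand, here is a rough, runnable equivalent of that query. It is a sketch under assumptions: SQLite through Python's `sqlite3` module instead of Jet, ordinary parentheses in place of the `[ ... ].` form, dates stored as text, and invented file rows. Note that with `COUNT(*) = 1` the subquery keeps the name/size/date combinations that occur exactly once, and the join matches on `FileName` alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblDirInfo (
        ID INTEGER PRIMARY KEY,   -- stands in for the AutoNumber key
        DirID INTEGER, FileName TEXT, FileSize INTEGER, FileDate TEXT
    );
    INSERT INTO tblDirInfo (DirID, FileName, FileSize, FileDate) VALUES
        (1, 'a.txt', 100, '2007-05-01'),
        (2, 'a.txt', 100, '2007-05-01'),   -- exact copy in another directory
        (1, 'b.txt', 250, '2007-05-03'),
        (2, 'c.txt', 300, '2007-05-02');
""")

# The parenthesised SELECT is the "virtual table" (derived table) Virt.
rows = conn.execute("""
    SELECT DX.ID, DX.DirID, DX.FileName, DX.FileSize, DX.FileDate
    FROM tblDirInfo AS DX
    INNER JOIN (
        SELECT FileName, FileSize, FileDate
        FROM tblDirInfo
        GROUP BY FileName, FileSize, FileDate
        HAVING COUNT(*) = 1
    ) AS Virt ON DX.FileName = Virt.FileName
    ORDER BY DX.FileDate
""").fetchall()

for row in rows:
    print(row)
```

Changing the HAVING test to `COUNT(*) > 1` turns the same structure into a duplicate finder, which is the shape the Employees example was after.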

I shouldn't have used the Employees Table to illustrate what I wanted,
because although it provides some handy data for testing, there's a shift
of emphasis when looking at false data.

I mapped my Year info onto the TitleOfCourtesy field.
When you look at a field which can contain values such as Mr, Mrs, Miss,
Master, MZ, Hey You and Plenipotetatissimuss you tend to consider it
unimportant in terms of Grouping. How often, after collecting 20 years of
data would you need to group People information by TitleOfCourtesy, rather
than by Year?

Thanks for your interest.

Cheers,
DBDriver.
May 10 '07 #7
