473,738 Members | 11,146 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Settle a disagreement with my boss re SELECT DISTINCT

My boss has been adamant that SELECT DISTINCT is a faster query than
SELECT all other factors being equal. I disagree. We are linking an
Access front end to a SQL Server back end and normally are only
returning one record. She states that with disctinct the query stops
as soon as it finds a matching record. Both of us are relative novices
in database theory (obviously).

Can someone help settle this?

Nov 13 '05 #1
18 3056
mathilda wrote:
My boss has been adamant that SELECT DISTINCT is a faster query than
SELECT all other factors being equal. I disagree. We are linking an
Access front end to a SQL Server back end and normally are only
returning one record. She states that with disctinct the query stops
as soon as it finds a matching record. Both of us are relative
novices in database theory (obviously).

Can someone help settle this?


It would vary from query to query, but I would expect DISTINCT to either be
slower or a tie, never faster. If your query is "unintellig ent" enough to
force a table scan then it is going to do that regardless of whether you
have the DISTINCT clause or not. After all, the engine wouldn't "know" that
you are only looking for one record and it won't know if other rows are
duplicates until AFTER they have been looked at.

Perhaps if a TOP 1 clause were added?
--
I don't check the Email account attached
to this message. Send instead to...
RBrandt at Hunter dot com
Nov 13 '05 #2
Thanks, looks like I win. But want do you mean by "forcing a table
scan"

Nov 13 '05 #3
mathilda wrote:
My boss has been adamant that SELECT DISTINCT is a faster query than
SELECT all other factors being equal. I disagree. We are linking an
Access front end to a SQL Server back end and normally are only
returning one record. She states that with disctinct the query stops
as soon as it finds a matching record. Both of us are relative novices
in database theory (obviously).


This seems entirely illogical.

From BOL: "The DISTINCT keyword eliminates duplicate rows from the
results of a SELECT statement. If DISTINCT is not specified, all rows
are returned, including duplicates."

So if there is only one row in the result set then there is none to
eliminate. If there are more then they will have to be tested for
duplicity (!). In the first case any difference would be unnoticeable;
in the second case the difference might be noticeable to an atomic clock.

Of course if you are using ODBC someone else would know.

--
Lyle
--
From ADO28.chm

Deprecated Components
Each of the following components is considered obsolete. While these
components are still supported in this release of the Microsoft® Data
Access Components (MDAC), they may be removed in the future. When
writing new applications, you should avoid using these deprecated
components. When modifying existing applications, you are strongly
encouraged to remove any dependency on these components.

ODBC Provider (MSDASQL)
You are strongly encouraged to use one of the native OLE DB Providers
instead of the Microsoft Open Database Connectivity (ODBC) Provider.
Native OLE DB Providers provide better application stability and
performance. Furthermore, native OLE DB Providers will be supported in
the future, whereas MSDASQL will not have any new features added to it,
will not be available on 64-bit, and will not be accessible from the OLE
DB NET Data Provider.

.....
Nov 13 '05 #4
mathilda wrote:
Thanks, looks like I win. But want do you mean by "forcing a table
scan"


If your query's WHERE clause is against an indexed field then only the index
is scanned and then the matching rows retrieved. If it is against a
non-indexed field then the entire table is scanned to look for matches.

--
I don't check the Email account attached
to this message. Send instead to...
RBrandt at Hunter dot com
Nov 13 '05 #5

mathilda wrote:
My boss has been adamant that SELECT DISTINCT is a faster query than
SELECT all other factors being equal. I disagree. We are linking an
Access front end to a SQL Server back end and normally are only
returning one record. She states that with disctinct the query stops
as soon as it finds a matching record. Both of us are relative novices in database theory (obviously).

Can someone help settle this?


You are correct: the DISTINCT command will actually slow the query.
First, the SELECT command is performed, and then the result set is
further processed to return the DISTINCT result only. SQL Server
processes DISTINCT like an aggregate function, it doesn't just stop
when it gets to one result.

The proof in in the pudding: open Query Analyzer, tell it to show the
Execution plan and then run the query with and without DISTINCT.
Probably the Execution Time will read "0:00:00" for both queries, but
you can show your boss the extra processing step.

Nov 13 '05 #6
"mathilda" <sm***********@ yahoo.com> wrote in message
news:11******** **************@ o13g2000cwo.goo glegroups.com.. .
My boss has been adamant that SELECT DISTINCT is a faster query than
SELECT all other factors being equal. I disagree. We are linking an
Access front end to a SQL Server back end and normally are only
returning one record.


There is some important things to be aware of here:

* since you are using sql server, then when you return ONLY one record
(either example....via a condition, or by the distinct) only ONE record is
transfer down the wire. Thus, assuming that both approaches can readably
return the ONE record in a short time, you as a end user likely will not
experience ANY difference. In other words, the LARGE time issue here is
transferring the one record...not that time of finding the one record. It is
possible that one approach might be faster then the other (say 1 / 10,000 of
a second), however, the time taken to transfer the data is a few 100's of s
second, so that 10,000 times faster to FIND and retrieve the record will NOT
be noticed by you. The MOST important issue here is that ONE record is being
returned down the wire.

* So, we have two issues:
Time to transfer the record (AFTER it has been found)
Time to find the ONE record, either via Distinct, or by some keyID
etc.

I have little doubt that using a condition to return one record is
considerably faster then telling the database to figure things out, and
NEVER return duplicates. Telling the database system to not return
duplicates
takes a lot of work. If you can return the record via:

select * from tblCustomer where custid = 123

The above is most certainly better then:

select distinct * from tblCustomer where custid = 123

However, adding the distinct keyword in the above is NOT going to improve
speed if ONLY one record exists. The time difference would be minimal..but
adding distinct would increase things by some amount...but likely not even a
amount that you can measure).

However, what about when there is going to be multiple records returned, and
we ONLY want one record?

Again:

select top 1 from tblCustomer where custid = 123

This I think again would be faster then distinct, with distinct all possible
matches
of custID = 123 have to be tested. There could be 50, or even 2000 records
with a custid = 123....and making distinct work on that can really cost.
With
a top 1, then only the first match need be returned.

However, you question was about one record, and all things being
equal. So, if only one record is to be normally retuned, and the distinct is
NOT needed, then I would leave it out. I am CERTAIN that putting
in distinct is NOT faster. The amount that distinct would be slower
by is likely not measure in this case, .but it would be a tiny tiny bit
slower.
--
Albert D. Kallal (Access MVP)
Edmonton, Alberta Canada
pl************* ****@msn.com
http://www.members.shaw.ca/AlbertKallal

Nov 13 '05 #7
On Feb 09 2005, 10:36 am, Lyle Fairfield <ly******@yahoo .ca> wrote in
news:M5******** ***********@rea d2.cgocable.net :
So if there is only one row in the result set then there is none to
eliminate. If there are more then they will have to be tested for
duplicity (!). In the first case any difference would be unnoticeable;
in the second case the difference might be noticeable to an atomic clock.


Actually the difference may be quite noticeable in the second case,
depending on the size of data set. This is similar to the difference (!)
between UNION and UNION ALL.

Paste the following into QA, hit Ctrl+L, and compare the plans:

use pubs

select state
from dbo.authors
go

select distinct state
from dbo.authors
go

--
remove a 9 to reply by email
Nov 13 '05 #8
Dimitri Furman wrote:
On Feb 09 2005, 10:36 am, Lyle Fairfield <ly******@yahoo .ca> wrote in
news:M5******** ***********@rea d2.cgocable.net :

So if there is only one row in the result set then there is none to
eliminate. If there are more then they will have to be tested for
duplicity (!). In the first case any difference would be unnoticeable;
in the second case the difference might be noticeable to an atomic clock.

Actually the difference may be quite noticeable in the second case,
depending on the size of data set. This is similar to the difference (!)
between UNION and UNION ALL.

Paste the following into QA, hit Ctrl+L, and compare the plans:

use pubs

select state
from dbo.authors
go

select distinct state
from dbo.authors
go


I feel
"We are linking an
Access front end to a SQL Server back end and normally are only
returning one record."
implies a very small number of records.

--
--
Lyle
--
Nov 13 '05 #9
Dimitri Furman wrote:
On Feb 09 2005, 10:36 am, Lyle Fairfield <ly******@yahoo .ca> wrote in
news:M5******** ***********@rea d2.cgocable.net :

So if there is only one row in the result set then there is none to
eliminate. If there are more then they will have to be tested for
duplicity (!). In the first case any difference would be unnoticeable;
in the second case the difference might be noticeable to an atomic clock.

Actually the difference may be quite noticeable in the second case,
depending on the size of data set. This is similar to the difference (!)
between UNION and UNION ALL.

Paste the following into QA, hit Ctrl+L, and compare the plans:

use pubs

select state
from dbo.authors
go

select distinct state
from dbo.authors
go


BTW the time difference is not so noticeable here.
--
--
Lyle
--
From ADO28.chm

Deprecated Components
Each of the following components is considered obsolete. While these
components are still supported in this release of the Microsoft® Data
Access Components (MDAC), they may be removed in the future. When
writing new applications, you should avoid using these deprecated
components. When modifying existing applications, you are strongly
encouraged to remove any dependency on these components.

ODBC Provider (MSDASQL)
You are strongly encouraged to use one of the native OLE DB Providers
instead of the Microsoft Open Database Connectivity (ODBC) Provider.
Native OLE DB Providers provide better application stability and
performance. Furthermore, native OLE DB Providers will be supported in
the future, whereas MSDASQL will not have any new features added to it,
will not be available on 64-bit, and will not be accessible from the OLE
DB NET Data Provider.

Remote Data Services (RDS)
Remote Data Services (RDS) is a proprietary Microsoft mechanism for
accessing remote data across the Internet or intranet. Microsoft is now
shipping the Microsoft Simple Object Access Protocol (SOAP) Toolkit 2.0
that enables you to access remote data using an open, XML-based
standard. Given the availability of the SOAP Toolkit 2.0, you should
migrate from RDS to SOAP. The SOAP 2.0 Toolkit 2.0 also includes sample
code for remotely accessing Microsoft ActiveX® Data Objects (ADO)
Recordsets.

Jet and Replication Objects (JRO)
The Microsoft Jet OLE DB Provider and other related components were
removed from MDAC 2.6. Microsoft has deprecated the Microsoft Jet
Engine, and plans no new releases or service packs for this component.
As a result, the Jet and Replication Objects (JRO) is being deprecated
in this release and will not be available in any future MDAC releases.

.....
Nov 13 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

5
11689
by: Martin Feuersteiner | last post by:
Dear Group I'm having trouble with the clause below. I would like to select only records with a distinct TransactionDate but somehow it still lists duplicates. I need to select the TransactionDate once as smalldatetime and once as varchar as I'm populating a drop-down with Text/Value pairs. So I can't just use 'SELECT DISTINCT TransactionDate FROM...' I'm grateful for any hints.
3
10559
by: blue | last post by:
I'm trying to order a varchar column first numerically, and second alphanumerically using the following SQL: SELECT distinct doc_number FROM doc_line WHERE product_id = 'WD' AND doc_type = 'O' ORDER BY CASE WHEN IsNumeric(doc_number) = 1 THEN CONVERT(FLOAT, doc_number) ELSE 999999999 END,
2
12546
by: mfyahya | last post by:
I have two tables, both containing an 'authors' column. Is there a way to get a unique list of authors from the two tables? I tried SELECT DISTINCT `authors` from `table1`, `table2`; but I got an "Column 'authors' in field list is ambiguous" error. Is there also a query to return only the count of distinct authors from the two tables? Thanks for any help.
3
6470
by: Tcs | last post by:
My backend is DB2 on our AS/400. While I do HAVE DB2 PE for my PC, I haven't loaded it yet. I'm still using MS Access. And no, I don't believe this is an Access question. (But who knows? I COULD be wrong... :) I've tried the access group...twice...and all I get is "Access doesn't like ".", which I know, or that my query names are too long, as there's a limit to the length of the SQL statement(s). But this works when I don't try to...
9
10900
by: Kelvin | last post by:
Okay so this is baking my noodle. I want to select all the attritbutes/fields from a table but then to excluded any row in which a single attributes data has been duplicated. I.E. Here's my table:- ID Ref Name DATE 1 AAA Joe 1/2 2 BBB Ken 1/2 3 AAA Len 6/3
6
5553
by: John M | last post by:
Hi, The line below is used to feed a combobox. (It is from a database which is used to log pupil behaviour!) The 'incidents' table contains a list of students who have been involved in incidents. Some may appear several times, hence the Distinct. However, thelist generated should still be sorted by Surname. When I add Order by I'm told it conflict with the 'Distinct'. Surely it does not? SELECT DISTINCT ., . & " " & . FROM...
3
5655
by: orekinbck | last post by:
Hi There Our test database has duplicate data: COMPANYID COMPANYNAME 1 Grupple Group 2 Grupple Group 5 Grupple Group 3 Yada Inc 4 Yada Inc
6
13569
by: Bob Stearns | last post by:
I am getting unwanted duplicate rows in my result set, so I added the DISTINCT keyword to my outermost SELECT. My working query then returned the following message: DB2 SQL error: SQLCODE: -214, SQLSTATE: 42822, SQLERRMC: CASE...;ORDER BY;2 Message: An expression in the ORDER BY clause in the following position, or starting with "CASE..." in the "ORDER BY" clause is not valid. Reason code = "2".  More exceptions ... DB2 SQL error:...
5
6896
by: Daniel Wetzler | last post by:
Dear MSSQL experts, I use MSSQL 2000 and encountered a strange problem wqhile I tried to use a select into statement . If I perform the command command below I get only one dataset which has the described properties. If I use the same statement in a select into statement (see the second select) I get several datasets with the described properties like I didn't use distinct
0
8969
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
8788
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
1
9263
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
9208
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
6053
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
4570
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4825
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
2745
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2193
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.