470,596 Members | 1,659 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 470,596 developers. It's quick & easy.

DISTINCT and ORDER BY

Colleagues:

Take a query like

SELECT
DISTINCT
X.COLA,
X.COLB
FROM
X
WHERE
X.COLN = 'CONSTANT'
ORDER BY
1,2

Can I get rid of the ORDER BY, as DB2 will sort the results in the
order I'm specifying to eliminate duplicates? Does DB2 do this anyway
as an optimization during re-write?

Env.: DB2 UDB LUW 8.2.3, AIX 5.x

Thanks,

--Jeff

Aug 1 '06 #1
7 15393
jefftyzzer wrote:
Colleagues:

Take a query like

SELECT
DISTINCT
X.COLA,
X.COLB
FROM
X
WHERE
X.COLN = 'CONSTANT'
ORDER BY
1,2

Can I get rid of the ORDER BY, as DB2 will sort the results in the
order I'm specifying to eliminate duplicates? Does DB2 do this anyway
as an optimization during re-write?

Env.: DB2 UDB LUW 8.2.3, AIX 5.x
You cannot get rid of the order by.
Not to worry. DB2 will not re-order if the distinct already caused a sort.

Here are three reasons why you can't remove the order by:
1. If you ran this query on a DPF system you would not get an ordered
result back because the DISTINCT would likely be done per node.
2. Future versions may use different technology to enforce DISTINCTness
than sort.
One such example would be hashing.
3. You could have an index on (COLB, COLA). DB2 may use that to enforce
the distinct and then resort the remaining (smaller) result set for the
ORDER BY.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/
Aug 1 '06 #2
Serge, thanks as always for your help and your absurdly-fast replies
:-).

Serge Rielau wrote:
jefftyzzer wrote:
Colleagues:

Take a query like

SELECT
DISTINCT
X.COLA,
X.COLB
FROM
X
WHERE
X.COLN = 'CONSTANT'
ORDER BY
1,2

Can I get rid of the ORDER BY, as DB2 will sort the results in the
order I'm specifying to eliminate duplicates? Does DB2 do this anyway
as an optimization during re-write?

Env.: DB2 UDB LUW 8.2.3, AIX 5.x
You cannot get rid of the order by.
Not to worry. DB2 will not re-order if the distinct already caused a sort.

Here are three reasons why you can't remove the order by:
1. If you ran this query on a DPF system you would not get an ordered
result back because the DISTINCT would likely be done per node.
2. Future versions may use different technology to enforce DISTINCTness
than sort.
One such example would be hashing.
3. You could have an index on (COLB, COLA). DB2 may use that to enforce
the distinct and then resort the remaining (smaller) result set for the
ORDER BY.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/
Aug 1 '06 #3
jefftyzzer wrote:
Serge, thanks as always for your help and your absurdly-fast replies
Don't tell my boss. He'll think I have nothing better to do ;-)
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/
Aug 1 '06 #4
jefftyzzer wrote:
Colleagues:

Take a query like

SELECT
DISTINCT
X.COLA,
X.COLB
FROM
X
WHERE
X.COLN = 'CONSTANT'
ORDER BY
1,2

Can I get rid of the ORDER BY, as DB2 will sort the results in the
order I'm specifying to eliminate duplicates?
The _only_ thing that guarantees the ordering of the results is the ORDER
BY. Every other behavior is just happenstance and you cannot rely on it.
So the order by is mandatory.

--
Knut Stolze
DB2 Information Integration Development
IBM Germany
Aug 2 '06 #5
In addition to what Serge said, you should also stop writing positional
numbers in the ORDER BY -- use column names in case the table is
altered or you have to port the code.

Aug 2 '06 #6
"--CELKO--" <jc*******@earthlink.netwrote in message
news:11**********************@s13g2000cwa.googlegr oups.com...
In addition to what Serge said, you should also stop writing positional
numbers in the ORDER BY -- use column names in case the table is
altered or you have to port the code.
The column position only refers to the position within the select list of a
particular query, and is not affected by the columns in the table.
Aug 2 '06 #7
>The column position only refers to the position within the select list of a particular query, and is not affected by the columns in the table.<<

I was on ANSI X3H2 when we deprecated positional numbers. I also hope
you know that ORDER BY is never part of a query; it is a clause in a
CURSOR. When the query changes, so do the positions in the cursor.

But more than that, it is simply baaaad programming not to explicitly
tell the poor bastard who maintains the code what the sort order really
is.

Aug 5 '06 #8

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

4 posts views Thread by Johnson, Shaunn | last post: by
9 posts views Thread by Kelvin | last post: by
5 posts views Thread by Fred Zuckerman | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.