By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
425,647 Members | 1,178 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 425,647 IT Pros & Developers. It's quick & easy.

Order by in a INSERT INTO..SELECT

P: n/a
I have the following basic statements being executed:

Create a temp table, #TempPaging

Insert Into #TempPaging (Col1, Col2)
Select Col1, Col2 From SomeOtherTable Order By Col2, Col1

Select * from #TempPaging

I can't provide a reproduceable scenario right now without making this
into a 200K post, so I'm hoping someone will know what the issue is.. I
can't find anything regarding this in BOL.

Basically the order is off a little bit every now and then when there
are large amounts of data, like more than a couple hundred rows. Is
there something I need to do to guarantee the originally selected
order?

This is very important that it be in order from the original select
statement as this is for paging. Adding an order by in the second
select from the temp table will not fix the problem. In this particular
instance that I have reproduced we are using SQL 2005 but have also
seen this on SQL 2000 servers. I had previously asked this question as
I was using a SELECT INTO statement, but now we are manually creating
the temp table (Pain in the ass) and still having the same issue. Best
case scenario is for it to work for a SELECT INTO.

Any ideas?

Mar 27 '06 #1
Share this Question
Share on Google+
29 Replies


P: n/a
pb648174 wrote:
I have the following basic statements being executed:

Create a temp table, #TempPaging

Insert Into #TempPaging (Col1, Col2)
Select Col1, Col2 From SomeOtherTable Order By Col2, Col1

Select * from #TempPaging

I can't provide a reproduceable scenario right now without making this
into a 200K post, so I'm hoping someone will know what the issue is.. I
can't find anything regarding this in BOL.

Basically the order is off a little bit every now and then when there
are large amounts of data, like more than a couple hundred rows. Is
there something I need to do to guarantee the originally selected
order?

This is very important that it be in order from the original select
statement as this is for paging. Adding an order by in the second
select from the temp table will not fix the problem. In this particular
instance that I have reproduced we are using SQL 2005 but have also
seen this on SQL 2000 servers. I had previously asked this question as
I was using a SELECT INTO statement, but now we are manually creating
the temp table (Pain in the ass) and still having the same issue. Best
case scenario is for it to work for a SELECT INTO.

Any ideas?


Tables are NOT logically ordered under any circumstances. You need to
specify ORDER BY here:

Select * from #TempPaging
ORDER BY ...

That is, when you QUERY the table, not when you INSERT.

See the following article for some reliable paging techniques (one
example uses the same method you have so avoid that one):
http://www.aspfaq.com/show.asp?id=2120

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx
--

Mar 27 '06 #2

P: n/a
pb648174 (go****@webpaul.net) writes:
I have the following basic statements being executed:

Create a temp table, #TempPaging

Insert Into #TempPaging (Col1, Col2)
Select Col1, Col2 From SomeOtherTable Order By Col2, Col1

Select * from #TempPaging

I can't provide a reproduceable scenario right now without making this
into a 200K post, so I'm hoping someone will know what the issue is.. I
can't find anything regarding this in BOL.

Basically the order is off a little bit every now and then when there
are large amounts of data, like more than a couple hundred rows. Is
there something I need to do to guarantee the originally selected
order?

This is very important that it be in order from the original select
statement as this is for paging. Adding an order by in the second
select from the temp table will not fix the problem.
Once the data is in #TempPaging an ORDER BY will result in that
page being ordered. But that does not help if #TempPaging was not
loaded correctly.
In this particular instance that I have reproduced we are using SQL 2005
but have also seen this on SQL 2000 servers.


If you are on SQL 2005, the best is to use row_number():

SELECT OrderID, CustomerID, OrderDate, rowno
FROM (SELECT OrderID, CustomerID, OrderDate,
rowno = row_number() OVER
(PARTITION BY 1 ORDER BY CustomerID, OrderID)
FROM Northwind..Orders) AS x
WHERE rowno BETWEEN 100 AND 200
ORDER BY rowno

On SQL 2000 you can use a temp table table with an IDENTITY column,
and insert to that table with ORDER BY. I am told that this is
guaranteed to work, although I seem to recall that David claimed
to have seen conflicting testimony.

Note that this applies to INSERT only - it does *not* apply to SELECT INTO.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Mar 27 '06 #3

P: n/a
Yes; shot you and replace you with a programmer who has read the first
10 pages of **any** RDBMS book. You are posting realllllllly baaaaad
questions. Please take some time to catch up over the weekend before
you ask us to do your job for you again.

SQL is a set-orient language. Tables -- by definition -- have no
ordering. That is the nature of a set. Have you ever read Dr. Codd's
12 rules for RDBMS/ Look up the Information Principle: all
relationships are shown as values in columns. Ordering is a
relationship, so you need a column(s) for it.

If you do not know who Dr. Codd is or his rules, then you are like a
Geometry student who never heard of Euclid.

Mar 27 '06 #4

P: n/a
Celko,

Give the guy a break. Sheesh.

I'm beginning to think you are just an automated bot that jumps into
every conversation to complain and stir up the fire.

Mar 28 '06 #5

P: n/a
Celko is right.. Big dumb mistake on my part. It does have an identity
column but for some reason I thought it was ordering by that by
default. It's not a misunderstanding of the way the world works, just a
mistake.

Mar 29 '06 #6

P: n/a

pb648174 wrote:
I have the following basic statements being executed:

Create a temp table, #TempPaging

Insert Into #TempPaging (Col1, Col2)
Select Col1, Col2 From SomeOtherTable Order By Col2, Col1

Select * from #TempPaging

I can't provide a reproduceable scenario right now without making this
into a 200K post, so I'm hoping someone will know what the issue is.. I
can't find anything regarding this in BOL.

Basically the order is off a little bit every now and then when there
are large amounts of data, like more than a couple hundred rows. Is
there something I need to do to guarantee the originally selected
order?

This is very important that it be in order from the original select
statement as this is for paging. Adding an order by in the second
select from the temp table will not fix the problem. In this particular
instance that I have reproduced we are using SQL 2005 but have also
seen this on SQL 2000 servers. I had previously asked this question as
I was using a SELECT INTO statement, but now we are manually creating
the temp table (Pain in the ass) and still having the same issue. Best
case scenario is for it to work for a SELECT INTO.
In SQL 2005, use the RowNumber function in a nested sub query.

In SQL 2000, create enough index columns to handle your sort orders.

eg...
Select Mycol1, MyCol2, Mycol3,
MyIDX1 = Convert (Int, Null) ,
MyIDX2 = Convert (Int, Null) ,
MyIDX3 = Convert (Int, Null)
into #myResults
Select blah from whatever order by whatever

contrary to David Portas' assertion this does actually work most of the
time, however, I sure it can't be relied upon, though I've never seen
it fail...

Then, to have multiple order bys, and very very cheap and easy
pagination you then do.

Create Index #IDX1 on #myResults (Mycol1, Mycol2)
Create Index #IDX2 on #myResults (Mycol3, Mycol1)
Create Index #IDX3 on #myResults (Mycol2, Mycol3)
Followed by...

Declare @Fred int
set @Fred = 0
Update M
Set @Fred = MyIDX1 = @Fred + 1From #myResults M With (Index = #IDX1)

set @Fred = 0
Update M
Set @Fred = MyIDX2 = @Fred + 1From #myResults M With (Index = #IDX2)

set @Fred = 0
Update M
Set @Fred = MyIDX3 = @Fred + 1From #myResults M With (Index = #IDX3)

Select *From #MyResults Where IDX Between @X1 And @X2

Any ideas?


Apr 19 '06 #7

P: n/a

Erland Sommarskog wrote:
pb648174 (go****@webpaul.net) writes:
I have the following basic statements being executed:

Create a temp table, #TempPaging

Insert Into #TempPaging (Col1, Col2)
Select Col1, Col2 From SomeOtherTable Order By Col2, Col1

Select * from #TempPaging

I can't provide a reproduceable scenario right now without making this
into a 200K post, so I'm hoping someone will know what the issue is.. I
can't find anything regarding this in BOL.

Basically the order is off a little bit every now and then when there
are large amounts of data, like more than a couple hundred rows. Is
there something I need to do to guarantee the originally selected
order?

This is very important that it be in order from the original select
statement as this is for paging. Adding an order by in the second
select from the temp table will not fix the problem.
Once the data is in #TempPaging an ORDER BY will result in that
page being ordered. But that does not help if #TempPaging was not
loaded correctly.
In this particular instance that I have reproduced we are using SQL 2005
but have also seen this on SQL 2000 servers.


If you are on SQL 2005, the best is to use row_number():

SELECT OrderID, CustomerID, OrderDate, rowno
FROM (SELECT OrderID, CustomerID, OrderDate,
rowno = row_number() OVER
(PARTITION BY 1 ORDER BY CustomerID, OrderID)
FROM Northwind..Orders) AS x
WHERE rowno BETWEEN 100 AND 200
ORDER BY rowno

On SQL 2000 you can use a temp table table with an IDENTITY column,
and insert to that table with ORDER BY. I am told that this is
guaranteed to work, although I seem to recall that David claimed
to have seen conflicting testimony.


This isn't, though it is guaranteed to work in SQL 2005. afaik.

Creating an index on the chosen columns and then forcing the sliding
update statement
Update M Set @Fred = Col = @Fred + 1 From tabl M with (index=idxname)
is the only way I've ever seen it never fail for multi columns.

Note that this applies to INSERT only - it does *not* apply to SELECT INTO.
That's interesting, I've found the exact opposite, so long as a table
scan is forced on the created table, I can't create temporary indexes
on a temporary table in 2000 if it's created with insert into select,
only with select into.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx


Apr 19 '06 #8

P: n/a
(dr*****@hotmail.com) writes:
Creating an index on the chosen columns and then forcing the sliding
update statement
Update M Set @Fred = Col = @Fred + 1 From tabl M with (index=idxname)
is the only way I've ever seen it never fail for multi columns.


This is one more "works most of the time, but don't cry foul if it
doesn't". I see no point of using methods of which the result is
not defined.
Note that this applies to INSERT only - it does *not* apply to SELECT
INTO.


That's interesting, I've found the exact opposite, so long as a table
scan is forced on the created table, I can't create temporary indexes
on a temporary table in 2000 if it's created with insert into select,
only with select into.


Here is a quick example:

select IDENTITY(int, 1, 1) AS ident, OrderID = OrderID + 0,
CustomerID, OrderDate, ShipVia
into Orders
from Northwind..Orders Order by CustomerID, OrderID
go
SELECT * FROM Orders ORDER BY CustomerID, OrderID
go
TRUNCATE TABLE Orders
go
INSERT Orders (OrderID, CustomerID, OrderDate, ShipVia)
select OrderID = OrderID + 0, CustomerID, OrderDate, ShipVia
from Northwind..Orders
Order by CustomerID, OrderID
go
SELECT * FROM Orders ORDER BY CustomerID, OrderID
go
DROP TABLE Orders

It may not be the indexes you were talking about, but the result of a
query should never be dependent of the indexes on the table.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Apr 19 '06 #9

P: n/a
dr*****@hotmail.com wrote:

In SQL 2005, use the RowNumber function in a nested sub query.

In SQL 2000, create enough index columns to handle your sort orders.

eg...
Select Mycol1, MyCol2, Mycol3,
MyIDX1 = Convert (Int, Null) ,
MyIDX2 = Convert (Int, Null) ,
MyIDX3 = Convert (Int, Null)
into #myResults
Select blah from whatever order by whatever

contrary to David Portas' assertion this does actually work most of the
time, however, I sure it can't be relied upon, though I've never seen
it fail...


What assertion do you mean? All I said was tables are not ordered. They
are not. The example you posted above does not contradict me because
you specified ORDER BY, unlike the OP who had no ORDER BY in his SELECT
statement (only in the INSERT).

Your second example is more suspect. The behaviour of an assignment in
an UPDATE that references multiple rows is undefined.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx
--

Apr 20 '06 #10

P: n/a
I'm sorry David, I'd just drank a couple of bottles of particularly
nice Castello Banfi, Brunello di Montalcino. Any provocative statements
I make during such inebriated times is purely to have a poke. I value
your opinions I assure you.

As for your comments about updates that affect multiple rows being
undefined, I'm afraid you're wrong.

Update table set @fred = column = @fred + 1 is defined in SQL server's
own help, please look it up if you don't believe me.

Not only that, but SQL Server evaluates columns in order (otherwise you
can't do order by 1, order by 2 etc.)

Thus, it's perfectly legal to do this...
declare @rows int
Select a.Column1,
a.Column2,
b.Column3,
b.Column4.
c.Column5.
d.Column6,
sort1idx = convert (smallint, null)
sort2idx = convert (smallint, null)
sort3idx = convert (smallint, null)
into #results
From A Inner Join B on A.join = B.join
inner join C on C.join = b.join
Select @rows = @@rowcount,
@rc = @@error
create index #re1 on #results (column6, column3)
create index #re2 on #results (column2, column4)
create index #re1 on #results (column1, column5)

declare @fred = int
set @fred = 0
update R set @fred = @fred + 1, sort1idx = case when @fred <= 1000 then
convert (smallint, @fred) when @rows - @fred < 1000 then convert
(smallint, 2000 + @rows - @fred) else null end From #results R With
(index= #re1)
set @fred = 0
update R set @fred = @fred + 1, sort2idx = case when @fred <= 1000 then
convert (smallint, @fred) when @rows - @fred < 1000 then convert
(smallint, 2000 + @rows - @fred) else null end From #results R With
(index= #re2)
set @fred = 0
update R set @fred = @fred + 1, sort3idx = case when @fred <= 1000 then
convert (smallint, @fred) when @rows - @fred < 1000 then convert
(smallint, 2000 + @rows - @fred) else null end From #results R With
(index= #re3)

Select *From #results where coalesce (sort1idx, sort2idx, sort3idx) Is not Null

-- This line returns the top and bottom 1000 by three dimensions (more
or less, I've been on Sauvignon Blanc in All bar one in Leicester
Square all night) and is the equivalent of SQL 2005's

Select t.Column1, t.column2, t.column3, t.column4, t.column5,
t.column6,
sort1idx = case when sort1idxasc + sort1idxdesc < 2002 then
sort1idxasc when sort1idxasc < 1001 then sort1idx when sort1idxasc >
1000 and sort1idxdesc > 1000 then null else 1000 + 1001 - sort1idxdesc
end, -- or something, I'm a bit pissed
sort2idx = etc...
sort3idx =From ( Select a.Column1,
a.Column2,
b.Column3,
b.Column4.
c.Column5.
d.Column6,
sort1idxasc = Row_number () (over column6, column3)
sort2idxasc = Row_number () (over column2, column4)
sort3idxasc = Row_number () (over column1, column5)
sort1idxasc = Row_number () (over column6 desc, column3 desc)
sort2idxasc = Row_number () (over column2 desc, column4 desc)
sort3idxasc = Row_number () (over column1 desc, column5 desc)
into #resultsFrom A

Inner Join B on A.join = B.join
inner join C on C.join = b.join) T Where sort1idxasc <= 1000 or
sort2idxasc <= 1000 or
sort3idxasc <= 1000 or
sort1idxdesc <= 1000 or
sort2idxdesc <= 1000 or
sort3idxdesc <= 1000)

It implements these with the same execution plan as it would a cursor.
I'm sure M. Ben Gan can confirm this.
The only difference is that SQL 2005 doesn't seem to need tempdb, and
it performs the work about 27% faster.

Apr 22 '06 #11

P: n/a
Identity (1, 1) doesn't guarantee an ascending number by order, until
SQL 2005.

Apr 22 '06 #12

P: n/a

dr*****@hotmail.com wrote:
I'm sorry David, I'd just drank a couple of bottles of particularly
nice Castello Banfi, Brunello di Montalcino. Any provocative statements
I make during such inebriated times is purely to have a poke. I value
your opinions I assure you.

As for your comments about updates that affect multiple rows being
undefined, I'm afraid you're wrong.

Update table set @fred = column = @fred + 1 is defined in SQL server's
own help, please look it up if you don't believe me.

Not only that, but SQL Server evaluates columns in order (otherwise you
can't do order by 1, order by 2 etc.)

Thus, it's perfectly legal to do this...
declare @rows int
Select a.Column1,
a.Column2,
b.Column3,
b.Column4.
c.Column5.
d.Column6,
sort1idx = convert (smallint, null)
sort2idx = convert (smallint, null)
sort3idx = convert (smallint, null)
into #results
From A Inner Join B on A.join = B.join
inner join C on C.join = b.join
Select @rows = @@rowcount,
@rc = @@error
create index #re1 on #results (column6, column3)
create index #re2 on #results (column2, column4)
create index #re1 on #results (column1, column5)

declare @fred = int
set @fred = 0
update R set @fred = @fred + 1, sort1idx = case when @fred <= 1000 then
convert (smallint, @fred) when @rows - @fred < 1000 then convert
(smallint, 2000 + @rows - @fred) else null end From #results R With
(index= #re1)
set @fred = 0
update R set @fred = @fred + 1, sort2idx = case when @fred <= 1000 then
convert (smallint, @fred) when @rows - @fred < 1000 then convert
(smallint, 2000 + @rows - @fred) else null end From #results R With
(index= #re2)
set @fred = 0
update R set @fred = @fred + 1, sort3idx = case when @fred <= 1000 then
convert (smallint, @fred) when @rows - @fred < 1000 then convert
(smallint, 2000 + @rows - @fred) else null end From #results R With
(index= #re3)

Select *
From #results

where coalesce (sort1idx, sort2idx, sort3idx) Is not Null

-- This line returns the top and bottom 1000 by three dimensions (more
or less, I've been on Sauvignon Blanc in All bar one in Leicester
Square all night) and is the equivalent of SQL 2005's

Select t.Column1, t.column2, t.column3, t.column4, t.column5,
t.column6,
sort1idx = case when sort1idxasc + sort1idxdesc < 2002 then
sort1idxasc when sort1idxasc < 1001 then sort1idx when sort1idxasc >
1000 and sort1idxdesc > 1000 then null else 1000 + 1001 - sort1idxdesc
end, -- or something, I'm a bit pissed
sort2idx = etc...
sort3idx =
From (

Select a.Column1,
a.Column2,
b.Column3,
b.Column4.
c.Column5.
d.Column6,
sort1idxasc = Row_number () (over column6, column3)
sort2idxasc = Row_number () (over column2, column4)
sort3idxasc = Row_number () (over column1, column5)
sort1idxasc = Row_number () (over column6 desc, column3 desc)
sort2idxasc = Row_number () (over column2 desc, column4 desc)
sort3idxasc = Row_number () (over column1 desc, column5 desc)

-- God I am a bit smashed. Went straight out onto the piss after
leaving the ofiice vandag.
-- into #results -- -dednae mean that.
From A

Inner Join B on A.join = B.join
inner join C on C.join = b.join) T Where sort1idxasc <= 1000 or
sort2idxasc <= 1000 or
sort3idxasc <= 1000 or
sort1idxdesc <= 1000 or
sort2idxdesc <= 1000 or
sort3idxdesc <= 1000)

It implements these with the same execution plan as it would a cursor.
I'm sure M. Ben Gan can confirm this.
The only difference is that SQL 2005 doesn't seem to need tempdb


(or at least its transaction log, half so much)
, and it performs the work about 27% faster.


Apr 22 '06 #13

P: n/a
dr*****@hotmail.com wrote:

As for your comments about updates that affect multiple rows being
undefined, I'm afraid you're wrong.

Update table set @fred = column = @fred + 1 is defined in SQL server's
own help, please look it up if you don't believe me.


There is a difference between valid syntax and defined behaviour. BOL
does indeed say that your syntax is valid but nowhere does it define
what the result is supposed to be. The closest the documentation gets
is where it describes the equivalent multiple row assignment in a
SELECT statement. It says:

"SELECT @local_variable is typically used to return a single value into
the variable. However, when expression is the name of a column, it can
return multiple values. If the SELECT statement returns more than one
value, the variable is assigned the last value that is returned."

Note: "last value returned". That means the assignment only has to
happen once. The expression isn't necessarily evaluated for each row.
You cannot rely on the expression being evaluated for every row because
it doesn't always work.

In your example you try to force a particular execution plan onto your
code using an INDEX hint. But there is absolutely no reason why SQL
Server should always be required to implement a hint. Indexes are
intended as an optimization tool - they are not supposed to affect
logical behaviour of code.

So all I'm saying is that you should be very cautious with this UPDATE.
In the past we have seen too many undocumented features and smart
little tricks that fail or change in each new version. Unfortunately,
SQL Server is still full of "features" that give undefined results. If
you are doing a one-off update that doesn't matter much - you can
easily verify the results afterwards - but if you put this sort of
thing into production code you run the risk of it breaking under a
future version or service pack.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx
--

Apr 23 '06 #14

P: n/a
David Portas (RE****************************@acm.org) writes:
In your example you try to force a particular execution plan onto your
code using an INDEX hint. But there is absolutely no reason why SQL
Server should always be required to implement a hint. Indexes are
intended as an optimization tool - they are not supposed to affect
logical behaviour of code.


I only like to enforce this: if the result of a query is dependent on
the presense of an index, the result of the query is undefined (or there
is a plain bug). The task of indexes is to decrease execution times, but
they should not affect the outcome of a query.

Of course, by adding an index hint you at least ensure that they query
will fail would the index be dropped. Then again, if someone changes
the index, you lose anyway.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Apr 23 '06 #15

P: n/a

Erland Sommarskog wrote:
David Portas (RE****************************@acm.org) writes:
In your example you try to force a particular execution plan onto your
code using an INDEX hint. But there is absolutely no reason why SQL
Server should always be required to implement a hint. Indexes are
intended as an optimization tool - they are not supposed to affect
logical behaviour of code.
I only like to enforce this: if the result of a query is dependent on
the presense of an index, the result of the query is undefined (or there
is a plain bug). The task of indexes is to decrease execution times, but
they should not affect the outcome of a query.


I'm prepared to put up with checking a view queries once every new
generation
of SQL server, for an e-commerce site that will run on a laptop, rather
than
pay half a million quid for hardware licence fees, just to do it the
hard way.

Of course, by adding an index hint you at least ensure that they query
will fail would the index be dropped. Then again, if someone changes
the index, you lose anyway.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx


Apr 24 '06 #16

P: n/a

David Portas wrote:
In your example you try to force a particular execution plan onto your
code using an INDEX hint. But there is absolutely no reason why SQL
Server should always be required to implement a hint. Indexes are
intended as an optimization tool - they are not supposed to affect
logical behaviour of code.


I _am_ using it as an optimisation tool. I'm getting the results I
want, by forcing the
plan I want. If at some point, index hints stop working (like that's
going to happen,) I'll find some other way of breaking relational
theory.

What would be nice, (if you're listening Mr Gates,) is for you to
modify this mechanic so,
1. You can do it with selects as well.
2. The where clause checks @variable conditions on every row, instead
of just at the beginning.

You really are taking this too seriously. The code works, and it's a
hundred times faster than the next nearest Oracle technique, and the
code has to be checked every now and then anyway, but so what? The cost
savings to market that this achieves more than outweighs the later
potential corrections.

If run forever reliability was the issue, I'm sure we can both agree
that we wouldn't
be using a product or suite of products written in c++ anyway.

Apr 24 '06 #17

P: n/a

David Portas wrote:
dr*****@hotmail.com wrote:

As for your comments about updates that affect multiple rows being
undefined, I'm afraid you're wrong.

Update table set @fred = column = @fred + 1 is defined in SQL server's
own help, please look it up if you don't believe me.


There is a difference between valid syntax and defined behaviour. BOL


Defined behaviour is as defined behaviour does. Outperforming an
equivalent
query ten to one is worth a rewrite 5 years from now, in the unlikely
event
that microsoft, remove the ability. (This goes against all precedents
as MS
have done very little but improve ability rather than remove it.)

I understand your purist position though, I used to be a software
engineering
purist, but after 30 years of writing code, I now hold the cost benefit
analysis
position. If I can generate, orders, multidimensional rolling averages,
and all
kinds of crap with a single pass of a table, rather than generating a
gig of
transaction log, and 5 Gig of tempdb allocation, then I do it.

Similarly, if I can implement the kind of parametric query, like
dabs.com's have
done, and multiorder search facility, at basically no cost, then I do
it.

By the time MS remove this facility, they'll replace it with something
better,
so there's no worries.

Apr 24 '06 #18

P: n/a
dr*****@hotmail.com wrote:
David Portas wrote:
dr*****@hotmail.com wrote:
>
> As for your comments about updates that affect multiple rows being
> undefined, I'm afraid you're wrong.
>
> Update table set @fred = column = @fred + 1 is defined in SQL server's
> own help, please look it up if you don't believe me.
>


There is a difference between valid syntax and defined behaviour. BOL


Defined behaviour is as defined behaviour does. Outperforming an
equivalent
query ten to one is worth a rewrite 5 years from now, in the unlikely
event
that microsoft, remove the ability. (This goes against all precedents
as MS
have done very little but improve ability rather than remove it.)

I understand your purist position though, I used to be a software
engineering
purist, but after 30 years of writing code, I now hold the cost benefit
analysis
position. If I can generate, orders, multidimensional rolling averages,
and all
kinds of crap with a single pass of a table, rather than generating a
gig of
transaction log, and 5 Gig of tempdb allocation, then I do it.

Similarly, if I can implement the kind of parametric query, like
dabs.com's have
done, and multiorder search facility, at basically no cost, then I do
it.

By the time MS remove this facility, they'll replace it with something
better,
so there's no worries.


I don't consider myself a purist. In the spirit of Martin Fowler I
guess my ethic is something like "Any fool can write something that
works. Good developers write stuff that is verifiable and supportable."
That's not dogmatic. It's entirely practical because it reduces TCO for
the customer.

In this case the problem is not just that it may break in some distant
future. It is broken now. That is, even today there are situations
where multiple row variable assignments in queries just do not happen.
Since you can't predict whether those situations will arise at runtime
you have to take a calculated risk before you implement those
undocumented tricks. Microsoft's history of breaking changes to
undocumented behaviour is against you. I can think of multiple
precedents where undefined features have changed or failed in SQL
Server service packs, hotfixes and versions. The customer then has to
pay the price for development before he can patch his server.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx
--

Apr 24 '06 #19

P: n/a
(dr*****@hotmail.com) writes:
Defined behaviour is as defined behaviour does. Outperforming an
equivalent query ten to one is worth a rewrite 5 years from now, in the
unlikely event that microsoft, remove the ability.


But the ability isn't there. It only looks like it is. That's why it's
undefined.

In SQL Server 6.5 GROUP BY implied an ORDER BY. Not so any more. In SQL 2000
people would put TOP 100 PERCENT ORDER BY in a view, and they found that
they did seem to need an ORDER BY when selecting from the view. In SQL 2005
more than one have found that they don't get away with it.

You get the result you get by happenstance. There is nothing that Microsoft
can remove, because they never added it. One day the optimizer decides to
do a different plan, and you don't get the result you wanted.

Of course, you may be prepared to take the gamble, but the day it breaks,
it's going to break hard.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Apr 24 '06 #20

P: n/a

Erland Sommarskog wrote:
(dr*****@hotmail.com) writes:
Defined behaviour is as defined behaviour does. Outperforming an
equivalent query ten to one is worth a rewrite 5 years from now, in the
unlikely event that microsoft, remove the ability.
But the ability isn't there. It only looks like it is. That's why it's
undefined.

In SQL Server 6.5 GROUP BY implied an ORDER BY. Not so any more. In SQL 2000
people would put TOP 100 PERCENT ORDER BY in a view, and they found that
they did seem to need an ORDER BY when selecting from the view. In SQL 2005
more than one have found that they don't get away with it.

You get the result you get by happenstance. There is nothing that Microsoft
can remove, because they never added it. One day the optimizer decides to
do a different plan, and you don't get the result you wanted.

Of course, you may be prepared to take the gamble, but the day it breaks,
it's going to break hard.


Everything you say is true. I don't disagree with any of it. However,
from experience,
an ordered update against a single table is worth the risk, that I
might have to (one day) rewrite it. I'm not writing space targeted
software (anymore.)

I find this particularly the case, because I've never seen an explicit
index hint against
a single table, to be ignored. Similarly, microsoft actually advertise
the update set
@fred = col = @fred + 1.

I feel justified in my assertion that "by the time they don't support
it, they'll have introduced something faster," because on the whole,
(and very definitely in this case)
they have done.

Most tech authorities hold that 2 second is an acceptable response time
for a webpage. I maintain that anything over 40 mS is failure.

As for "It may not work at sometime in the future." Well that's the
most ridiculous strawman I've heard in a long time.

I judge that keeping an eye on code is an ongoing requirement, and I
constantly watch for changes that invalidate something I depend on.
From .net 1.1, to .net 2.0 invoking the Sleep method, meant a change from a class member, to a static member, meaning only a thread can
sleep itself. I don't see my watching for changes to unlisted features
to be any different to changes to listed features.


--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx


Apr 24 '06 #21

P: n/a

David Portas wrote:
dr*****@hotmail.com wrote:
David Portas wrote:
dr*****@hotmail.com wrote:
> >
> > As for your comments about updates that affect multiple rows being
> > undefined, I'm afraid you're wrong.
> >
> > Update table set @fred = column = @fred + 1 is defined in SQL server's
> > own help, please look it up if you don't believe me.
> >

There is a difference between valid syntax and defined behaviour. BOL
Defined behaviour is as defined behaviour does. Outperforming an
equivalent
query ten to one is worth a rewrite 5 years from now, in the unlikely
event
that microsoft, remove the ability. (This goes against all precedents
as MS
have done very little but improve ability rather than remove it.)

I understand your purist position though, I used to be a software
engineering
purist, but after 30 years of writing code, I now hold the cost benefit
analysis
position. If I can generate, orders, multidimensional rolling averages,
and all
kinds of crap with a single pass of a table, rather than generating a
gig of
transaction log, and 5 Gig of tempdb allocation, then I do it.

Similarly, if I can implement the kind of parametric query, like
dabs.com's have
done, and multiorder search facility, at basically no cost, then I do
it.

By the time MS remove this facility, they'll replace it with something
better,
so there's no worries.


I don't consider myself a purist. In the spirit of Martin Fowler I
guess my ethic is something like "Any fool can write something that
works. Good developers write stuff that is verifiable and supportable."
That's not dogmatic. It's entirely practical because it reduces TCO for
the customer.

In this case the problem is not just that it may break in some distant
future. It is broken now. That is, even today there are situations
where multiple row variable assignments in queries just do not happen.


Are you saying my query doesn't work? No? I didn't think so.
Since you can't predict whether those situations will arise at runtime
you have to take a calculated risk before you implement those
Not true. I actually run them to find out.
undocumented tricks. Microsoft's history of breaking changes to
Not so far. Nothing I've written in the last ten years has been broken
by a change anywhere near so much as those broken by changes to
advertised features.
undocumented behaviour is against you. I can think of multiple
precedents where undefined features have changed or failed in SQL
Server service packs, hotfixes and versions. The customer then has to
pay the price for development before he can patch his server.
We agree then. Like I said, it's a pure cost benefit analysis. You
contend that
this approach has risks, but imply that I'm not aware of that. This is
simply
not true. I'm well aware of the risks, which is why I have a list of
features I
use that are periodically checked. This includes documented features.

In fact in my experience, documented features change more than
undocumented
features.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx
--


Apr 24 '06 #22

P: n/a
dr*****@hotmail.com wrote:

Are you saying my query doesn't work? No? I didn't think so.
I'm saying you can't demonstrate that it does.
Since you can't predict whether those situations will arise at runtime
you have to take a calculated risk before you implement those


Not true. I actually run them to find out.


Which proves nothing because execution plans can change at runtime.
I'm well aware of the risks, which is why I have a list of
features I
use that are periodically checked. This includes documented features.


Proves nothing. See above.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx
--

Apr 24 '06 #23

P: n/a

David Portas wrote:
dr*****@hotmail.com wrote:

Are you saying my query doesn't work? No? I didn't think so.
I'm saying you can't demonstrate that it does.
Since you can't predict whether those situations will arise at runtime
you have to take a calculated risk before you implement those


Not true. I actually run them to find out.


Which proves nothing because execution plans can change at runtime.
I'm well aware of the risks, which is why I have a list of
features I
use that are periodically checked. This includes documented features.


Proves nothing. See above.


Neither does it if it's written down.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx
--


Apr 24 '06 #24

P: n/a
what happens if the table is big and this UPDATE is executed by several
processors in parallel?

Apr 24 '06 #25

P: n/a
I'm well aware of Codd, thank you very much. I'm also aware that SQL is
a relational language.

Ordering is however, a requirement for the real world, and I take
advantage of all the features of an application that are published,
and also the features that I deem are safe, that aren't published.

It's interesting you say that I could be replaced, because in 14 years
as a contractor, I've never (that's Never) failed to deliver working
project, as advertised. I've until very recently, deliberately gone
into each project as junior developer, taken over, and then fixed it,
and am usually the last contractor standing. I have prototypes I've
written, that became airborne software, and are flying, in space, as
well as in the air, and I replaced the whole of site server and 150
pages of ASP, with a single stored procedure for a very busy (and
successful) website.
This single stored procedure contained ALL the business logic. It was
done, because it needed to work, and the company couldn't get the
people in time to write it. This ran for two years before I was
replaced (correctly I believe) with a net datatable multi-tier based
system, once there was sufficient resource to continue with
development.

Books are useful to be sure, and I do have an extensive library on
software engineering, but a book is only as good as its authors are
clever. I'm happy to stand by my record, of having never failed to
deliver.

Apr 25 '06 #26

P: n/a
When the day comes that Microsoft runs parallel query against a
sequential update, then it will be spotted in UAT (if I haven't already
seen it,) and replaced with a cursor or some other cunning tactic.

I don't imagine I'll ever see that however, because I don't suppose
that Microsoft will rollback the row_number() (Over col1, col2, col3)
function in SQL 2010, just to allow it to introduce its new "parallel
sequential (tm)" technology - a process that allocates an unknown in
advance amount sequential numbers of in parallel.

Out of interest, how does one allocate an index in sequence in parallel
against a table whose size is unknown at the start of the update
without wasting huge resources calculating the tree sizes? Does one
mark the entire index as "untouched", then excute massive parallel jobs
against parts of it, ensuring never to touch the same record twice?
Perhaps, it asks god how big each table leaf root is, and then assigns
ranges of numbers of exact size, (taking into account any inserts or
deletes that are going to happen, after it starts and before it ends)
to ensure that when it distributes the job to it's SMP array, that each
row gets a number that is exactly unique, and contiguous. Perhaps you
could start at one end of the table, with the length of it (including
any intermediate insert/deletes as extrapolated from a small piece of
fairy cake), and start at the other end of the same, and then work
inwards.

My solution would be to simply use magic. This is guaranteed to work.

Apr 25 '06 #27

P: n/a
dr*****@hotmail.com wrote:
When the day comes that Microsoft runs parallel query against a
sequential update, then it will be spotted in UAT (if I haven't already
seen it,) and replaced with a cursor or some other cunning tactic.

I don't imagine I'll ever see that however, because I don't suppose
that Microsoft will rollback the row_number() (Over col1, col2, col3)
function in SQL 2010, just to allow it to introduce its new "parallel
sequential (tm)" technology - a process that allocates an unknown in
advance amount sequential numbers of in parallel.

Out of interest, how does one allocate an index in sequence in parallel
against a table whose size is unknown at the start of the update
without wasting huge resources calculating the tree sizes? Does one
mark the entire index as "untouched", then excute massive parallel jobs
against parts of it, ensuring never to touch the same record twice?
Perhaps, it asks god how big each table leaf root is, and then assigns
ranges of numbers of exact size, (taking into account any inserts or
deletes that are going to happen, after it starts and before it ends)
to ensure that when it distributes the job to it's SMP array, that each
row gets a number that is exactly unique, and contiguous. Perhaps you
could start at one end of the table, with the length of it (including
any intermediate insert/deletes as extrapolated from a small piece of
fairy cake), and start at the other end of the same, and then work
inwards.

My solution would be to simply use magic. This is guaranteed to work.


Even sequential scans don't always start at the same place in the index
due to enhanced read-ahead (AKA merry-go-round scan). AFAIK that
doesn't apply to index updates today but it could perhaps do so in
future. That's a good example of something you may be unlikely to spot
in a unit test (unless you simulate load). It will likely show up if
you test to peak production workload but do you always regression test
on that scale for every service pack? Engine changes have gone in SPs
before.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx
--

Apr 25 '06 #28

P: n/a

David Portas wrote:
dr*****@hotmail.com wrote:
When the day comes that Microsoft runs parallel query against a
sequential update, then it will be spotted in UAT (if I haven't already
seen it,) and replaced with a cursor or some other cunning tactic.

I don't imagine I'll ever see that however, because I don't suppose
that Microsoft will rollback the row_number() (Over col1, col2, col3)
function in SQL 2010, just to allow it to introduce its new "parallel
sequential (tm)" technology - a process that allocates an unknown in
advance amount sequential numbers of in parallel.

Out of interest, how does one allocate an index in sequence in parallel
against a table whose size is unknown at the start of the update
without wasting huge resources calculating the tree sizes? Does one
mark the entire index as "untouched", then excute massive parallel jobs
against parts of it, ensuring never to touch the same record twice?
Perhaps, it asks god how big each table leaf root is, and then assigns
ranges of numbers of exact size, (taking into account any inserts or
deletes that are going to happen, after it starts and before it ends)
to ensure that when it distributes the job to it's SMP array, that each
row gets a number that is exactly unique, and contiguous. Perhaps you
could start at one end of the table, with the length of it (including
any intermediate insert/deletes as extrapolated from a small piece of
fairy cake), and start at the other end of the same, and then work
inwards.

My solution would be to simply use magic. This is guaranteed to work.
Even sequential scans don't always start at the same place in the index
due to enhanced read-ahead (AKA merry-go-round scan). AFAIK that
doesn't apply to index updates today but it could perhaps do so in
future. That's a good example of something you may be unlikely to spot
in a unit test (unless you simulate load). It will likely show up if
you test to peak production workload but do you always regression test
on that scale for every service pack? Engine changes have gone in SPs
before.


And will no doubt go further in future, due to the new CTP approach to
delivery.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx
--


Apr 25 '06 #29

P: n/a

--CELKO-- wrote:
Yes; shot you and replace you with a programmer who has read the first
10 pages of **any** RDBMS book. You are posting realllllllly baaaaad
questions. Please take some time to catch up over the weekend before
you ask us to do your job for you again.

SQL is a set-orient language. Tables -- by definition -- have no
ordering. That is the nature of a set. Have you ever read Dr. Codd's
12 rules for RDBMS/ Look up the Information Principle: all
relationships are shown as values in columns. Ordering is a
relationship, so you need a column(s) for it.

If you do not know who Dr. Codd is or his rules, then you are like a
Geometry student who never heard of Euclid.


When faced with crass pointless comments like this, it's unsurprising
that
software engineering doesn't advance as fast as it could. Relational
theory is
no cleverer than Codd, (and maybe Date) was. They weren't gods. They
put forward
software engineering maybe only twice as far as Straustrup put it back
with
the abomination of c++.

I'm happy to decline to Bruce Lee. No matter how good a technique, be
it a
punch or a throw or a kick, (or relational theory, or xml, or
hibernate, or a
standards document) it becomes a weakness when one becomes obsessed
with it.

Apr 25 '06 #30

This discussion thread is closed

Replies have been disabled for this discussion.