SELECT works but UPDATE fails. ?

kalamos

This statement fails
update ded_temp a
set a.balance = (select sum(b.ln_amt)
from ded_temp b
where a.cust_no = b.cust_no
and a.ded_type_cd = b.ded_type_cd
and a.chk_no = b.chk_no
group by cust_no, ded_type_cd, chk_no)
With this error:
Server: Msg 170, Level 15, State 1, Line 1
Line 1: Incorrect syntax near 'a'.
But this statement:
select * from ded_temp a
where a.balance = (select sum(b.ln_amt)
from ded_temp b
where a.cust_no = b.cust_no
and a.ded_type_cd = b.ded_type_cd
and a.chk_no = b.chk_no
group by cust_no, ded_type_cd, chk_no)
Runs without error:
Why? and How should I change the first statement to run my update. This
statement of course works fine in Oracle. :)
tks
ken.

Jul 23 '05 #1

Subscribe Post Reply

4961

Chandra

Hi
Just try it out this way:

update ded_temp
set balance = (select sum(b.ln_amt)
from ded_temp b
where a.cust_no = b.cust_no
and a.ded_type_cd = b.ded_type_cd
and a.chk_no = b.chk_no
group by cust_no, ded_type_cd, chk_no)

best Regards,
Chandra
http://groups.msn.com/SQLResource/
http://chanduas.blogspot.com/
---------------------------------------

*** Sent via Developersdex http://www.developersdex.com ***

Jul 23 '05 #2

Erland Sommarskog

kalamos (km******@hotmail.com) writes:

This statement fails
update ded_temp a
set a.balance = (select sum(b.ln_amt)
from ded_temp b
where a.cust_no = b.cust_no
and a.ded_type_cd = b.ded_type_cd
and a.chk_no = b.chk_no
group by cust_no, ded_type_cd, chk_no)

update ded_temp
set a.balance = (select sum(b.ln_amt)
from ded_temp b
where a.cust_no = b.cust_no
and a.ded_type_cd = b.ded_type_cd
and a.chk_no = b.chk_no
group by cust_no, ded_type_cd, chk_no)
from ded_temp a

None of the syntaxes are in line with standard SQL, so different enginge
have added different place where you can put in the alias.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 23 '05 #3

Hugo Kornelis

On Fri, 3 Jun 2005 00:06:03 -0400, kalamos wrote:

This statement fails
update ded_temp a
set a.balance = (select sum(b.ln_amt)
from ded_temp b
where a.cust_no = b.cust_no
and a.ded_type_cd = b.ded_type_cd
and a.chk_no = b.chk_no
group by cust_no, ded_type_cd, chk_no)
With this error:
Server: Msg 170, Level 15, State 1, Line 1
Line 1: Incorrect syntax near 'a'.

(snip)

Hi ken,

Erland already pointed out that the proprietary UPDATE FROM syntax
differs between products. However, why use proprietary code when you can
use ANSI-standard code that will work on almost all databases:

update ded_temp
set balance = (select sum(b.ln_amt)
from ded_temp b
where ded_temp.cust_no = b.cust_no
and ded_temp.ded_type_cd = b.ded_type_cd
and ded_temp.chk_no = b.chk_no
group by cust_no, ded_type_cd, chk_no)

BTW, you can also omit the GROUP BY clause, since the subquery will only
match rows for one set of (cust_no, ded_type_cd, chk_no) anyway - this
might even give you some performance gain!

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)

Jul 23 '05 #4

Erland Sommarskog

Hugo Kornelis (hugo@pe_NO_rFact.in_SPAM_fo) writes:

Erland already pointed out that the proprietary UPDATE FROM syntax
differs between products. However, why use proprietary code when you can
use ANSI-standard code that will work on almost all databases:

update ded_temp
set balance = (select sum(b.ln_amt)
from ded_temp b
where ded_temp.cust_no = b.cust_no
and ded_temp.ded_type_cd = b.ded_type_cd
and ded_temp.chk_no = b.chk_no
group by cust_no, ded_type_cd, chk_no)
Could you please explain what this miserable piece of code means? You
have two ded_temp in the query, so which ded_temp does ded_temp.cust_no
refer to?

You will have to excuse, but I think it's poor advice to suggest that
people should use code that is ambiguous. I don't really care if there
is wording in ANSI that disamguiates the query, it's still bad practice,
because for a human the query is about incomprehensible. (And I would
not be surprised if more than one engine gets lost on the query above,
so I would not even trust the elusive compatibility.)

The ANSI standard does not provide a place to put in an alias, and
that is a serious shortcoming. In this case, it leads to the nonsese
above. In other cases, imagine that you have a table by the of
instrumentclearingmarketplaces - writing that over and over again
is completely out of the question.
BTW, you can also omit the GROUP BY clause, since the subquery will only
match rows for one set of (cust_no, ded_type_cd, chk_no) anyway - this
might even give you some performance gain!

For some real serious peformance gain, this is likely to be a true
winner:

UPDATE ded_temp
SET balance = b.ln_amt
FROM ded_temp a
JOIN (SELECT cust_no, ded_type_cd, chk_no, ln_amt = sum(b.ln_amt)
FROM ded_temp
GROUP BY cust_no, ded_type_cd, chk_no) AS b
ON a.cust_no = b.cust_no
AND a.ded_type_cd = b.ded_type_cd
AND a.ded_temp.chk_no = b.chk_no

I can give no guarantees, but my experience is that a join with a derived
table results in a lot more effecient plan, than a correlated subquery.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 23 '05 #5

kalamos

The above solutions were great. I only had the "a" in the wrong place. I did
not understand that sql server was "different" from Oracle in this way.

Thanks for all your help.

I'm not really sure what the code means, I'm not the programmer just the
administrator, I got this code second hand. :)

thanks again,
ken.
"Erland Sommarskog" <es****@sommarskog.se> wrote in message
news:Xn*********************@127.0.0.1...

Hugo Kornelis (hugo@pe_NO_rFact.in_SPAM_fo) writes:
Erland already pointed out that the proprietary UPDATE FROM syntax
differs between products. However, why use proprietary code when you can
use ANSI-standard code that will work on almost all databases:

update ded_temp
set balance = (select sum(b.ln_amt)
from ded_temp b
where ded_temp.cust_no = b.cust_no
and ded_temp.ded_type_cd = b.ded_type_cd
and ded_temp.chk_no = b.chk_no
group by cust_no, ded_type_cd, chk_no)

Could you please explain what this miserable piece of code means? You
have two ded_temp in the query, so which ded_temp does ded_temp.cust_no
refer to?

You will have to excuse, but I think it's poor advice to suggest that
people should use code that is ambiguous. I don't really care if there
is wording in ANSI that disamguiates the query, it's still bad practice,
because for a human the query is about incomprehensible. (And I would
not be surprised if more than one engine gets lost on the query above,
so I would not even trust the elusive compatibility.)

The ANSI standard does not provide a place to put in an alias, and
that is a serious shortcoming. In this case, it leads to the nonsese
above. In other cases, imagine that you have a table by the of
instrumentclearingmarketplaces - writing that over and over again
is completely out of the question.
BTW, you can also omit the GROUP BY clause, since the subquery will only
match rows for one set of (cust_no, ded_type_cd, chk_no) anyway - this
might even give you some performance gain!

For some real serious peformance gain, this is likely to be a true
winner:

UPDATE ded_temp
SET balance = b.ln_amt
FROM ded_temp a
JOIN (SELECT cust_no, ded_type_cd, chk_no, ln_amt = sum(b.ln_amt)
FROM ded_temp
GROUP BY cust_no, ded_type_cd, chk_no) AS b
ON a.cust_no = b.cust_no
AND a.ded_type_cd = b.ded_type_cd
AND a.ded_temp.chk_no = b.chk_no

I can give no guarantees, but my experience is that a join with a derived
table results in a lot more effecient plan, than a correlated subquery.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 23 '05 #6

kalamos

This is exactly the answer I was looking for, so simple yet so difficult to
find.

thanks for your help
:)
"Erland Sommarskog" <es****@sommarskog.se> wrote in message
news:Xn**********************@127.0.0.1...

kalamos (km******@hotmail.com) writes:
This statement fails
update ded_temp a
set a.balance = (select sum(b.ln_amt)
from ded_temp b
where a.cust_no = b.cust_no
and a.ded_type_cd = b.ded_type_cd
and a.chk_no = b.chk_no
group by cust_no, ded_type_cd, chk_no)

update ded_temp
set a.balance = (select sum(b.ln_amt)
from ded_temp b
where a.cust_no = b.cust_no
and a.ded_type_cd = b.ded_type_cd
and a.chk_no = b.chk_no
group by cust_no, ded_type_cd, chk_no)
from ded_temp a

None of the syntaxes are in line with standard SQL, so different enginge
have added different place where you can put in the alias.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 23 '05 #7

--CELKO--

>> The ANSI standard does not provide a place to put in an alias, and that is a serious shortcoming. <<

The reason for that is a consistent model of alias in Standard SQL. A
correlation names acts as if it makes a new working table, whcih wil
disappear at the end of the statement. You would never update the base
table if you allowed an alias in the UPDATE statement.

Jul 23 '05 #8

Erland Sommarskog

kalamos (km******@hotmail.com) writes:

The above solutions were great. I only had the "a" in the wrong place. I
did not understand that sql server was "different" from Oracle in this
way.

Just a word of warning, while most, if nor all, RDBMSs today provides
SQL, that does not mean that SQL from one engine runs on another. You
cannot even rely on some sort of standard, because few engines implement
all of the standard.

....and even if the statement runs and give the same result on two
engines, performance may differ considerably.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 23 '05 #9

Serge Rielau

--CELKO-- wrote:

The ANSI standard does not provide a place to put in an alias, and that is a serious shortcoming. <<

The reason for that is a consistent model of alias in Standard SQL. A
correlation names acts as if it makes a new working table, whcih wil
disappear at the end of the statement. You would never update the base
table if you allowed an alias in the UPDATE statement.

Are you sure about that? I thought the alias is just the "exposed name".
Just like CREATE ALIAS/SYNONYM does not make a copy of the table
labeling a table with a different exposed name does not (semantically)
make a copy.
What makes the copy is the SELECT (or UNION , ...) because it produces a
new "derived table". Whether it has a name or not conceptually has no
effect on its existance.
Interestingly this all falls apart on UPDATE/DELETE/INSERT target when
one looks at updatable views which clearly are derived tables but do
what one can reasonably expect: modify the "underlying" table.
FWIW some other DBMS support aliasing of UPDATE and DELETE targets.
(Makes obviously no sense on INSERT)

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab

Jul 23 '05 #10

--CELKO--

I should have said corelation name and not alias, since a lot of
products have a CREATE ALIAS construct, which the standards do not.

Interestingly this all falls apart on UPDATE/DELETE/INSERT target when one looks at updatable views which clearly are derived tables but do what one can reasonably expect: modify the "underlying" table. <<

One of the reasons for the weakness of UPDATE-able views in Standard
SQL was so that they could be defined as a base table with restrictions
on unexposed columns -- must have a DEFAULT, cannot be in the SET
clause, etc. The WITH CHECK OPTION is a real piece of work when you
have nested views and the CASCADE option.

Jul 23 '05 #11

Hugo Kornelis

On Fri, 3 Jun 2005 22:41:23 +0000 (UTC), Erland Sommarskog wrote:

Hugo Kornelis (hugo@pe_NO_rFact.in_SPAM_fo) writes:
Erland already pointed out that the proprietary UPDATE FROM syntax
differs between products. However, why use proprietary code when you can
use ANSI-standard code that will work on almost all databases:

update ded_temp
set balance = (select sum(b.ln_amt)
from ded_temp b
where ded_temp.cust_no = b.cust_no
and ded_temp.ded_type_cd = b.ded_type_cd
and ded_temp.chk_no = b.chk_no
group by cust_no, ded_type_cd, chk_no)
Could you please explain what this miserable piece of code means? You
have two ded_temp in the query, so which ded_temp does ded_temp.cust_no
refer to?

You will have to excuse, but I think it's poor advice to suggest that
people should use code that is ambiguous. I don't really care if there
is wording in ANSI that disamguiates the query, it's still bad practice,

Hi Erland,

Thanks for the feedback. As you know, I always welcome comments to the
posts I make in these groups - I'm here to learn as well as to help.

In this case, though, you're missing the point. The update statement
above is completely non-ambiguous, and that has nothing to do with any
wording in ANSI. It has to do with how table aliases work in SQL Server.

If you supply a table with an alias, you can no longer refer to that
table by it's tablename; you can ONLY use the alias. This is easily
demonstrated:

use pubs
select authors.au_fname
from authors a

Server: Msg 107, Level 16, State 3, Line 1
The column prefix 'authors' does not match with a table name or alias
name used in the query.

As you see, adding the alias 'a' to the table 'authors' doesn't give a
second name to the set, it changes it's only name. BTW, this behaviour
is (briefly) documented in BOL, index item "FROM clause (descibed)":

(quote from Books Online) [AS] table_alias

Is an alias for table_name, view_name, or rowset_function, used either
for convenience or to distinguish a table or view in a self-join or
subquery. An alias is often a shortened table name used to refer to
specific columns of the tables in a join. If the same column name exists
in more than one table in the join, SQL Server requires that the column
name must be qualified by a table name or alias. (The table name cannot
be used if an alias is defined).
Note the parenthesed remark at the end of this paragraph
Back to my update statement. The ded_temp table is used twice. The one
in the subquery is aliased as 'b' - column from this occurence of
ded_temp can never be refered to in the form ded_temp.foo - they MUST be
refered to as b.foo (or just foo - but that _would_ be ambiguous!!). The
one in the UPDATE (outside of the subquery) doesn't have an alias, hence
each column ded_temp.bar can ONLY refer to this occurence of the
ded_temp table.
because for a human the query is about incomprehensible.
I disagree with that as well. Of course, it IS necessary to always
follow the same conventions. These are my favorites:

1. In a one-table SELECT, UPDATE or DELETE, use no alias and no prefixes
for the column names.
2. In a two-or-more-table SELECT, use aliases for ALL tables and prefix
ALL column names with the appropriatie alias.
3. In a proprietary UPDATE FROM or DELETE FROM, use aliases for ALL
tables and prefix ALL column names with the appropriatie alias (except
the columns on the left of the = symbol in the SET clause). Use the
alias after the UPDATE or DELETE keyword, not the table name. Do not
omit the table to be updated or deleted from the FROM list, even though
SQL Server allows this. And don't use this syntax for single-table
UPDATE or DELETE operations.
4. In an ANSI UPDATE or DELETE with subquery (or subqueries), don't use
an alias for the table to be updated or deleted (not allowed!), but do
use aliases for all tables in the subqueries. Prefix ALL column names
with either the apppropriate prefix, or the tablename of the table to be
updated or deleted.
5. Use ANSI standard constructions, unless there is a good reason to
choose proprietary SQL. Always document the reason for the use of
proprietary SQL.

With the guidelines above, queries such as the one above are far from
incomprehensible. (Though I'd normally use UPPERCASE for all keywords,
CamelCase for table and column names, lowercase for aliases and
different indentation - but I was in a lazy mood when I made my first
post in this thread - too lazy to change the existing capitalization and
layout).

(And I would
not be surprised if more than one engine gets lost on the query above,
so I would not even trust the elusive compatibility.)
Since I know only SQL Server, I'll have to take your word for it.

OTOH, any RDBMS that claims entry-level SQL-92 or above *should* be able
to understand this statement.

The ANSI standard does not provide a place to put in an alias, and
that is a serious shortcoming. In this case, it leads to the nonsese
above. In other cases, imagine that you have a table by the of
instrumentclearingmarketplaces - writing that over and over again
is completely out of the question.

The same "problem" occurs with long column names. Using drag and drop
from object editor, copy/paste, programmable macro keys and/or mnemonics
while typing and search&replace when done are easy solutions.

BTW, you can also omit the GROUP BY clause, since the subquery will only
match rows for one set of (cust_no, ded_type_cd, chk_no) anyway - this
might even give you some performance gain!

For some real serious peformance gain, this is likely to be a true
winner:

UPDATE ded_temp
SET balance = b.ln_amt
FROM ded_temp a
JOIN (SELECT cust_no, ded_type_cd, chk_no, ln_amt = sum(b.ln_amt)
FROM ded_temp
GROUP BY cust_no, ded_type_cd, chk_no) AS b
ON a.cust_no = b.cust_no
AND a.ded_type_cd = b.ded_type_cd
AND a.ded_temp.chk_no = b.chk_no

I can give no guarantees, but my experience is that a join with a derived
table results in a lot more effecient plan, than a correlated subquery.

You are very probably right - I have seen numerous occasions where the
proprietary UPDATE FROM far outperformed the ANSI equivalent. If
performance is the main objective, then UPDATE FROM might be the better
choice. But performance is not always the main objective. If the tables
are small, then nobody will care if it takes 3 ms or 6 ms. If the app
will have to run on multiple platforms (or even if the app is written
with the idea that one day, it *might* be ported), ANSI standard code
should be used unless it'd become a real performance killer.

And finally - if the table to be updated happens to be a view with an
INSTEAD OF trigger, then the proprietary UPDATE FROM syntax can not be
used, so you'll HAVE TO use the ANSI standard equivalent, whether you
like it or not.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)

Jul 23 '05 #12

Erland Sommarskog

Hugo Kornelis (hugo@pe_NO_rFact.in_SPAM_fo) writes:

In this case, though, you're missing the point. The update statement
above is completely non-ambiguous, and that has nothing to do with any
wording in ANSI. It has to do with how table aliases work in SQL Server.
Yes, I know that there are such rules. But my main point is that I as
a human have a difficult to understand what the query means. You and
I may know the alias rules by heart, but not everyone may feel fully
confident in them. Something is not working properly, and the support
person freaks out on this query, thinking this must be where the error
is, and wastes time, because some purist put ANSI compliance ahead of
clarity.

By the way, I think these rules were introduced in 6.5, but I don't know
what would have happened with this statement in 6.0 or 4.x.
Use the alias after the UPDATE or DELETE keyword, not the table name.
I much prefer to use the table name, since at this point the reader may
not know what "a" stands for. There is one exception, and that is if
you self-join against the table being updated/deleted to. If you are on
SQL 7 or 6.5 it's a different matter: here you need to use the alias,
as soon as you use the newer ANSI-join syntax.
4. In an ANSI UPDATE or DELETE with subquery (or subqueries), don't use
an alias for the table to be updated or deleted (not allowed!), but do
use aliases for all tables in the subqueries. Prefix ALL column names
with either the apppropriate prefix, or the tablename of the table to be
updated or deleted.
And I say: always use an alias, and add a FROM clause if only to
introduce an alias.
5. Use ANSI standard constructions, unless there is a good reason to
choose proprietary SQL. Always document the reason for the use of
proprietary SQL.
That would be a lot of comments. Do you really write a comment everytime
you use + for string concatenation? Every time you create a temp table?

Of course, if there is a clearly outspoken desire to be portable, then
there is a point with it. But in many most situations, portability is
not on the agenda. Being ANSI-compliant in such case only means that
you work with your RDBMS with one hand tied behind your back. And
if the day comes when you actually will have to target another engine,
you find that that nice ANSI-compliant syntax will have to be rewritten
anyway, because the new engine did not support that syntax anyway.

And even if the engine supports the syntax, you may find that the
performance did not port. You know, customers do not that make much
distinction between a syntax error and a query that runs longer than
their patience permits them.

(And I would
not be surprised if more than one engine gets lost on the query above,
so I would not even trust the elusive compatibility.)

Since I know only SQL Server, I'll have to take your word for it.

Note that I didn't say that there are such engines. I only speculated
that there could be, since this is a case which really invites to bugs.

The ANSI standard does not provide a place to put in an alias, and
that is a serious shortcoming. In this case, it leads to the nonsese
above. In other cases, imagine that you have a table by the of
instrumentclearingmarketplaces - writing that over and over again
is completely out of the question.

The same "problem" occurs with long column names. Using drag and drop
from object editor, copy/paste, programmable macro keys and/or mnemonics
while typing and search&replace when done are easy solutions.

No, the problem is not the same with long column names, because they
are not repeated over and over again over the query. And the problem
is not about typing the code. Again, it is about reading the code.
You are very probably right - I have seen numerous occasions where the
proprietary UPDATE FROM far outperformed the ANSI equivalent. If
performance is the main objective, then UPDATE FROM might be the better
choice. But performance is not always the main objective. If the tables
are small, then nobody will care if it takes 3 ms or 6 ms.

So we should use one syntax for small tables, and one for big tables?
What if that small table becomes big one day? Frankly, that is far more
likely that the code will be ported.

Most production databases have considerable amounts of data, but
development environments are offer lesser-scale, so developers are
not always directly made aware of performance issues. Better to learn
them to use something that provides best performance in most cases.

I remember one such stored procedure where we slashed execition time
from 4 minutes to a couple of seconds by replacing correlated subqueries
with derived tables in two update statments.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 23 '05 #13

Hugo Kornelis

On Mon, 6 Jun 2005 22:31:07 +0000 (UTC), Erland Sommarskog wrote:

Hugo Kornelis (hugo@pe_NO_rFact.in_SPAM_fo) writes:
In this case, though, you're missing the point. The update statement
above is completely non-ambiguous, and that has nothing to do with any
wording in ANSI. It has to do with how table aliases work in SQL Server.
Yes, I know that there are such rules. But my main point is that I as
a human have a difficult to understand what the query means.

Hi Erland,

I don't have those difficulties, but if you do, then I can see that
you'd rather avoid this syntax. But I do not agree that posting this
code amounts to "poor advice".

You could even defend the opposite: it's better that people get their
first glimpse of that syntax here, where they can ask questions about
it, then while debugging some poorly documented code in a
mission-critical database.
You and
I may know the alias rules by heart, but not everyone may feel fully
confident in them.
Yes, I agree that the ANSI compliant UPDATE statement could be confusing
for inexperienced SQL Server users. But the T-SQL UPDATE statement comes
with a much bigger risk in the hands of less experienced people: a small
mistake in the subquery could easily result in inpredictabe results,
that might be correct in test, but wrong in production - without any
warning from SQL Server!! The ANSI compliant version with the same
mistake would throw an error, forcing the programmer or DBA to review
the code.
(snip) because some purist put ANSI compliance ahead of
clarity.

I am not a purist. In performance-critical parts of my DB's, I use what
performs best. In other parts, I use ANSI, unless I feel that there is a
huge gain in readability and maintainability (e.g. when more columns
have to be set, I generally prefer a joined UPDATE over repeating the
same subquery over and over again).

Let's agree to disagree on the readability of ANSI-standard UPDATE
statements, but let's also agree to not call each other names over this,
okay?
Another difference of opinion:

Use the alias after the UPDATE or DELETE keyword, not the table name.

I much prefer to use the table name, since at this point the reader may
not know what "a" stands for. There is one exception, and that is if
you self-join against the table being updated/deleted to.

This exception is exactly the reason why I always use the alias after
the UPDATE or DELETE keyword - if you have to do it in some cases,
better be consequent and do it in all cases. Besides, it's a good
reminder that the T-SQL version of the UPDATE / DELETE syntax is used!

5. Use ANSI standard constructions, unless there is a good reason to
choose proprietary SQL. Always document the reason for the use of
proprietary SQL.

That would be a lot of comments. Do you really write a comment everytime
you use + for string concatenation? Every time you create a temp table?

You've got me there! :)

What I meant to write was: document it when you CHOOSE to use T-SQL
instead of ANSI SQL. String concatenation, date and time handling, temp
tables and lots of other things can only be done with proprietary code,
so there's no need to document them. UPDATE FROM and DELETE FROM can
always be replaced by ANSI-compliant code, just as code that uses the
TOP keyword - those are the cases that I always document.

(snip parts I agree with)

(And I would
not be surprised if more than one engine gets lost on the query above,
so I would not even trust the elusive compatibility.)

Since I know only SQL Server, I'll have to take your word for it.

Note that I didn't say that there are such engines. I only speculated
that there could be, since this is a case which really invites to bugs.

I'll counter-speculate that machines would have far less trouble with
this than humans. If an RDBMS has trouble with this statement, then my
guess is that it either has trouble with all correlated subqueries in
update statements (whether with or without alias), or that is has
trouble with all queries that mix aliased and non-aliased tables. In
both cases, the product would disqualify as serious DB in my book.

(snip parts I agree with)

If
performance is the main objective, then UPDATE FROM might be the better
choice. But performance is not always the main objective. If the tables
are small, then nobody will care if it takes 3 ms or 6 ms.

So we should use one syntax for small tables, and one for big tables?

No, that's not what I meant. A good DB uses standardized code in most
cases (and in our case, your standard for UPDATE statements is different
from mine - but as long as each of us sticks to his own standard, that's
fine <g>). But for the long-running performance-hurting queries and
statements, we pull every trick that we know, in order to squeeze the
last bit of performance out of it - if necessary even at the expense of
standard constructions, readability and maintainability.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)

Jul 23 '05 #14

Erland Sommarskog

Hugo Kornelis (hugo@pe_NO_rFact.in_SPAM_fo) writes:

You could even defend the opposite: it's better that people get their
first glimpse of that syntax here, where they can ask questions about
it, then while debugging some poorly documented code in a
mission-critical database.
Not all people read the newsgroups or other SQL forums. In facr, I would
suspect most people don't.
Yes, I agree that the ANSI compliant UPDATE statement could be confusing
for inexperienced SQL Server users. But the T-SQL UPDATE statement comes
with a much bigger risk in the hands of less experienced people: a small
mistake in the subquery could easily result in inpredictabe results,
that might be correct in test, but wrong in production - without any
warning from SQL Server!!
That's another red herring. With the same argument you should use
JOIN clauses in your SELECT either, but always get your values with
subqueries in the column list.
The ANSI compliant version with the same mistake would throw an error,
forcing the programmer or DBA to review the code.
Considet:

UPDATE tbl
SET col = (SELECT SUM(o.somecol)
FROM othertbl o
JOIN tbl t WHERE t.keycol = o.keycol)

Exactly which error will be raised here?

Sure, the result is not predictable - we know that the entire table
will be thrashed.

And, yes, I've seen people who have posted queries like this. and who
other just messed up, or have not grasped the correlated subqueries.

And this brings to another great advantage with the FROM syntax: you
use the same idiom for UPDATE/DELETE as you do for SELECT. If you are
uncertain whether your conditions are correct, you can easily transform
your UPDATE to a SELECT.

Again, I'm putting emphasis on the cognitive side of things.

Note that I didn't say that there are such engines. I only speculated
that there could be, since this is a case which really invites to bugs.

I'll counter-speculate that machines would have far less trouble with
this than humans. If an RDBMS has trouble with this statement, then my
guess is that it either has trouble with all correlated subqueries in
update statements (whether with or without alias), or that is has
trouble with all queries that mix aliased and non-aliased tables. In
both cases, the product would disqualify as serious DB in my book.

Maybe it would. But SQL 6.0 was such a database. There you could say:

SELECT a.col, tbl.col FROM tbl a
Let's agree to disagree on the readability of ANSI-standard UPDATE
statements, but let's also agree to not call each other names over this,
okay?

Any time I see someone touting the ANSI syntax for UPDATE and DELETE,
I will reserve the right to bump in. Unless portability is a true and
serious concern, and just theoretical "it could happen", the ANSI
syntax has so serious problems - both in itself, and specific to
SQL Server - that it is in my opinion poor advice to suggest it
instead of the FROM syntax which is clearer, easier to understand,
and usually also more performant on SQL Server.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 23 '05 #15

Serge Rielau

Erland Sommarskog wrote:

Any time I see someone touting the ANSI syntax for UPDATE and DELETE,
I will reserve the right to bump in. Unless portability is a true and
serious concern, and just theoretical "it could happen", the ANSI
syntax has so serious problems - both in itself, and specific to
SQL Server - that it is in my opinion poor advice to suggest it
instead of the FROM syntax which is clearer, easier to understand,
and usually also more performant on SQL Server.

Just an FYI the ANSI syntax to solve the problem is

MERGE INTO <target>
USING <source>
ON <join codition>
WHEN MATCHED THEN UPDATE SET ...

IMHO The UPDATE with the double query is tough for many users to
understand, before MERGE the UPDATE FROM problem was a FAQ in
c.d.ibm-db2. UPDATE FROM is also supported by Informix XPS, btw.

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab

Jul 23 '05 #16

Hugo Kornelis

Hi Erland,

Sorry for picking up an old discussion. I've been planning to respond to
this the whole week, but other things just kept interfering. And yet, I
do want to respond to some of your points.
On Tue, 7 Jun 2005 23:11:31 +0000 (UTC), Erland Sommarskog wrote:

Hugo Kornelis (hugo@pe_NO_rFact.in_SPAM_fo) writes:
You could even defend the opposite: it's better that people get their
first glimpse of that syntax here, where they can ask questions about
it, then while debugging some poorly documented code in a
mission-critical database.
Not all people read the newsgroups or other SQL forums. In facr, I would
suspect most people don't.

But since this started over what advice I should or should not give in
the newsgroups, it's safe to assume that the people reading my advice do
read the groups.

Yes, I agree that the ANSI compliant UPDATE statement could be confusing
for inexperienced SQL Server users. But the T-SQL UPDATE statement comes
with a much bigger risk in the hands of less experienced people: a small
mistake in the subquery could easily result in inpredictabe results,
that might be correct in test, but wrong in production - without any
warning from SQL Server!!

That's another red herring. With the same argument you should use
JOIN clauses in your SELECT either, but always get your values with
subqueries in the column list.

No. JOIN clauses might result in undesired results (but so could
subqueries, or any other code), but not in unpredictable code. The
UPDATE FROM syntax and TOP without ORDER BY are the only constructions
where the result on the same starting data might change if the optimizer
picks a different plan.

The ANSI compliant version with the same mistake would throw an error,
forcing the programmer or DBA to review the code.

Considet:

UPDATE tbl
SET col = (SELECT SUM(o.somecol)
FROM othertbl o
JOIN tbl t WHERE t.keycol = o.keycol)

Exactly which error will be raised here?

None, as this is not the kind of mistake I refered to.
Sure, the result is not predictable - we know that the entire table
will be thrashed.
The result IS predictable. Wrong, perhaps (probably) - but predictable
and repeatable. If I know the data in the tables, I can predict the
result of this query - and you can add, remove or change indexes, add or
remove processors or memory or even use hints to force funny execution
plans, but none of these changes will change the execution plan.

The kind of mistake I was talking about is illustrated in the query
below:

ALTER TABLE titles
ADD sold_in char(4)
go
-- ANSI standard - will consistently throw an error
UPDATE titles
SET sold_in = (SELECT s.stor_id
FROM sales AS s
WHERE s.title_id = titles.title_id)
go
-- T-SQL proprietary - no error, no warning
UPDATE t
SET sold_in = s.stor_id
FROM titles AS t
JOIN sales AS s
ON s.title_id = t.title_id
go
SELECT title_id, sold_in
FROM titles
go
-- Same query, but force another execution plan
-- Now, the same query suddenly yields different results
UPDATE t
SET sold_in = s.stor_id
FROM titles AS t
INNER HASH JOIN sales AS s
ON s.title_id = t.title_id
go
SELECT title_id, sold_in
FROM titles
go
(snip)And this brings to another great advantage with the FROM syntax: you
use the same idiom for UPDATE/DELETE as you do for SELECT. If you are
uncertain whether your conditions are correct, you can easily transform
your UPDATE to a SELECT.

Again, I'm putting emphasis on the cognitive side of things.

You're right there - transforming a working SELECT to a working T-SQL
UPDATE is lots easier than creating the equivalent ANSI UPDATE. (But I
can tell you from experience that the latter also becomes a routine job
after the first 50 or so).
I personally believe that a good programmer should master both versions:
the ANSI version becuas he might be required to write portable code, or
might get himself in a situation where even SQL Server doesn't like the
T-SQL version, and the T-SQL syntax because he might need it to optimize
poor performing parts of the code.

Let's agree to disagree on the readability of ANSI-standard UPDATE
statements, but let's also agree to not call each other names over this,
okay?

Any time I see someone touting the ANSI syntax for UPDATE and DELETE,
I will reserve the right to bump in.

By all means, do. People *should* know both variants!

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)

Jul 23 '05 #17

Erland Sommarskog

Hugo Kornelis (hugo@pe_NO_rFact.in_SPAM_fo) writes:

Not all people read the newsgroups or other SQL forums. In facr, I would
suspect most people don't.
But since this started over what advice I should or should not give in
the newsgroups, it's safe to assume that the people reading my advice do
read the groups.

Person A reads your post on the net and follows your advice. A quits
his job to move elsewhere, and B comes in and looks at it says "What
the ****?"
No. JOIN clauses might result in undesired results (but so could
subqueries, or any other code), but not in unpredictable code. The
UPDATE FROM syntax and TOP without ORDER BY are the only constructions
where the result on the same starting data might change if the optimizer
picks a different plan.

Any SELECT without an ORDER BY is unpredicatble. It may not be unpredicatble
in the relational realm, since it deals with unordered sets. However, in
real life order matter in many cases. For instance a programmer might
program his SELECT statement that is supposed to return one row incorrectly.
His client program reads that only row that is supposed to be there.
Sometimes that happens to be the row he is actually looking for. And
one day it is not.

Sure, the result is not predictable - we know that the entire table
will be thrashed.

The result IS predictable. Wrong, perhaps (probably) - but predictable
and repeatable.

Sorry, I did of course mean to say "is predictable". For once I had one
"not" too many.

And my point is that is irrelevant whether incorrect code is unpredictable
or not. An incorrect result is incorrect, no matter it is predictable
or not.
Now, since last it has occurred to me that I actually have access to
MySQL. So I ran this on MySQL 4.1:

CREATE TABLE tbl (a int NOT NULL,
b int NOT NULL,
total int NULL);
CREATE TABLE tbl2 (a int NOT NULL,
total int NULL);
INSERT tbl2(a) SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
INSERT tbl (a, b)
SELECT 1, 1 UNION ALL SELECT 1, 2 UNION ALL SELECT 1, 3 UNION ALL
SELECT 2, 11 UNION ALL SELECT 2, 21 UNION ALL SELECT 2, 23 UNION ALL
SELECT 3, 101 UNION ALL SELECT 3, 102 UNION ALL SELECT 3, 103;
UPDATE tbl
SET total = (SELECT SUM(b.b)
FROM tbl b
WHERE b.a = tbl.a);
UPDATE tbl2
SET total = (SELECT SUM(b.b)
FROM tbl b
WHERE b.a = tbl2.a);
SELECT * FROM tbl;
SELECT * FROM tbl2;
DROP TABLE tbl;
DROP TABLE tbl2;

(Each statement was a batch of its own it seems.)

The UPDATE of tbl2 worked fine. But the UPDATE of tbl, with the unreadable
syntax that started this thread, failed:

ERROR 1093 (HY000): You can't specify target table 'tbl' for update in
FROM clause

Now, on http://sql-info.de/mysql/gotchas.html you can find a quite a few
issues with MySQL that quite hilarious. So of course, you could disqualify
it as "unserious DBMS". Nveertheless, MySQL has gained quite some
popularity, and is definitely a competitor to SQL Server in the low-end
market, and to some extent in the middle-range as well.

How you actually would write that UPDATE in MySQL, I don't know. But
I have not tried to find out either.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 23 '05 #18

Similar topics