Hi,
The bigger my C# web application gets, the more places I need to put the same
tedious retry block of code, to make sure operations that can run into
database deadlocks are retried 3-4 times and give up if they are still
deadlocking after that. I'm sure that many experienced people out there
already deal with this issue somehow. Is there an alternative?
Thanks for your comments and suggestions.
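For reference, this is roughly the kind of retry block I mean -- a minimal sketch assuming ADO.NET against SQL Server, where the deadlock victim gets a SqlException with error number 1205 (the SqlRetry name, attempt count, and back-off are only illustrative):

using System;
using System.Data.SqlClient;
using System.Threading;

public static class SqlRetry
{
    const int DeadlockVictim = 1205;   // SQL Server error number raised on the deadlock victim
    const int MaxAttempts = 4;

    // Runs 'work' and retries it (up to MaxAttempts in total) if it was chosen as a
    // deadlock victim; any other error is rethrown immediately.
    public static void Execute(Action work)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                work();
                return;
            }
            catch (SqlException ex)
            {
                if (ex.Number != DeadlockVictim || attempt >= MaxAttempts)
                    throw;
                Thread.Sleep(50 * attempt);   // brief back-off before the next attempt
            }
        }
    }
}

// Usage:  SqlRetry.Execute(delegate { SaveOrder(order); });   // SaveOrder is whatever operation hits the db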
Most deadlocks are due to poor indexing and to tables being updated in
inconsistent order. You may want to concentrate more on the database
schema and code than on the C# code.
--
Andrew J. Kelly SQL MVP
"Zeng" <Ze******@hotmail.com> wrote in message
news:%2****************@TK2MSFTNGP14.phx.gbl... Hi,
The bigger my C# web-application gets, the more places I need to put in the tedious retrying block of code to make sure operations that can run into database deadlocks are re-run (retried) 3-4 times and give up if after that it's still in deadlock. I'm very sure that many experienced people out there already deal with this issue somehow. Is there an alternative to it? Thanks for your comments and suggestions.
Thanks for the advice. I understand that indexing and the update order of the
tables contribute to deadlocks. My question is: is it possible to make a
large application deadlock-free? If yes, please share the tips with me on how
to ensure that; what type of guidelines/disciplines to follow to ensure
tables are always updated in the correct order when stored procedures,
triggers, and direct queries can hit the db at the same time.
If deadlock-free cannot be guaranteed, then it sounds to me that we would need
to put retry blocks into the code - no other way around.
Hope to hear back from you, thanks again.
"Andrew J. Kelly" <sq************@shadhawk.com> wrote in message
news:uT**************@TK2MSFTNGP12.phx.gbl... Most deadlocks are due to poor indexing and inconsistent updating in the order of the tables. You may want to concentrate more on the database schema and code vs. the C# code.
-- Andrew J. Kelly SQL MVP
"Zeng" <Ze******@hotmail.com> wrote in message news:%2****************@TK2MSFTNGP14.phx.gbl... Hi,
The bigger my C# web-application gets, the more places I need to put in the tedious retrying block of code to make sure operations that can run into database deadlocks are re-run (retried) 3-4 times and give up if after that it's still in deadlock. I'm very sure that many experienced people out there already deal with this issue somehow. Is there an alternative to
it? Thanks for your comments and suggestions.
I don't think there is ever a way to make an app deadlock free but you can
certainly do some things to prevent most occurrences of them. Maybe these
will help:
http://www.sql-server-performance.com/deadlocks.asp
http://www.sql-server-performance.co...ql_locking.asp
http://support.microsoft.com/kb/q169960/
http://www.codeproject.com/cs/database/sqldodont.asp
--
Andrew J. Kelly SQL MVP
"Zeng" <Ze******@hotmail.com> wrote in message
news:eG**************@TK2MSFTNGP10.phx.gbl... Thanks for the advice. I understand that indexing and update order of the tables contribute to deadlocks. My question is is it possible to make a large application deadlock free? If yes, please share the tips with me how to ensure that; what type of guidelines/disciplines to follow to ensure tables are always updated in correct order when there are storeprocedures, triggers, and direct queries can hit the db at the same time.
If deadlock free is not guaranteed, then it sounds to me that we would need to put retrying blocks into the code - no other way around.
Hope to hear back from you, thanks again.
"Andrew J. Kelly" <sq************@shadhawk.com> wrote in message news:uT**************@TK2MSFTNGP12.phx.gbl... Most deadlocks are due to poor indexing and inconsistent updating in the order of the tables. You may want to concentrate more on the database schema and code vs. the C# code.
-- Andrew J. Kelly SQL MVP
"Zeng" <Ze******@hotmail.com> wrote in message news:%2****************@TK2MSFTNGP14.phx.gbl... > Hi, > > The bigger my C# web-application gets, the more places I need to put in > the > tedious retrying block of code to make sure operations that can run > into > database deadlocks are re-run (retried) 3-4 times and give up if after > that > it's still in deadlock. I'm very sure that many experienced people out > there already deal with this issue somehow. Is there an alternative to it? > Thanks for your comments and suggestions. > > >
"Andrew J. Kelly" <sq************@shadhawk.com> wrote in message
news:u0**************@TK2MSFTNGP14.phx.gbl... I don't think there is ever a way to make an app deadlock free but you can certainly do some things to prevent most occurrences of them. ...
Just a note: there is help on the way for deadlocks. Running on SQL Server
2005 with Read Committed Snapshot Isolation should automatically eliminate
the vast majority of deadlocks.
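For reference, turning the option on is a one-time database setting. A sketch of doing it from C# (the MyAppDb database name is made up, and the statement needs exclusive access to the database before it completes):

using System.Data.SqlClient;

class EnableRcsi
{
    static void Main()
    {
        using (SqlConnection con = new SqlConnection("Server=.;Database=master;Integrated Security=SSPI"))
        {
            con.Open();
            // SQL Server 2005 or later: reads under READ COMMITTED then use row versions
            // instead of taking shared locks.
            new SqlCommand("ALTER DATABASE MyAppDb SET READ_COMMITTED_SNAPSHOT ON", con).ExecuteNonQuery();
        }
    }
}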
David
Why do you say that? It basically stops readers from blocking writers and
vice versa. It does nothing to prevent writers from blocking writers in the
reverse order.
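For illustration, a sketch of the kind of writer-writer deadlock that snapshot isolation does not prevent: two transactions update two tables in opposite order and neither takes any shared locks, so one of them still ends up the deadlock victim (error 1205). The table names and connection string are made up:

using System;
using System.Data.SqlClient;
using System.Threading;

class DeadlockDemo
{
    const string ConnStr = "Server=.;Database=TestDb;Integrated Security=SSPI";

    static void UpdateInOrder(string first, string second)
    {
        try
        {
            using (SqlConnection con = new SqlConnection(ConnStr))
            {
                con.Open();
                using (SqlTransaction tx = con.BeginTransaction())
                {
                    new SqlCommand("UPDATE " + first + " SET Val = Val + 1 WHERE Id = 1", con, tx).ExecuteNonQuery();
                    Thread.Sleep(500);   // give the other transaction time to take its first lock
                    new SqlCommand("UPDATE " + second + " SET Val = Val + 1 WHERE Id = 1", con, tx).ExecuteNonQuery();
                    tx.Commit();
                }
            }
            Console.WriteLine(first + " -> " + second + " committed");
        }
        catch (SqlException ex)
        {
            // Error 1205: this transaction was chosen as the deadlock victim and rolled back.
            Console.WriteLine(first + " -> " + second + " failed: " + ex.Number + " " + ex.Message);
        }
    }

    static void Main()
    {
        Thread a = new Thread(delegate() { UpdateInOrder("TableA", "TableB"); });
        Thread b = new Thread(delegate() { UpdateInOrder("TableB", "TableA"); });
        a.Start(); b.Start();
        a.Join(); b.Join();
    }
}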
--
Andrew J. Kelly SQL MVP
"David Browne" <davidbaxterbrowne no potted me**@hotmail.com> wrote in
message news:Om**************@TK2MSFTNGP09.phx.gbl... "Andrew J. Kelly" <sq************@shadhawk.com> wrote in message news:u0**************@TK2MSFTNGP14.phx.gbl...I don't think there is ever a way to make an app deadlock free but you can certainly do some things to prevent most occurrences of them. ...
Just a note: there is help on the way for deadlocks. Running on SQL Server 2005 with Read Commited Snapshot Isolation should automatically eliminate the vast majority of deadlocks.
David
If you have more than one connection to the database, then deadlocks can
happen. Deadlocks can occur on resources other than just locks. Threads
and memory are some examples. Sometimes parallelism can cause deadlocks.
These other kinds of deadlocks are rare, and usually occur on servers with
very heavy loads, but they can occur. Therefore, unless you take no pride
in your work, you should take into account that they can occur in your code.
Retry blocks are one possible solution, and can also be used to recover from
optimistic concurrency collisions. Another would be to return an error
message to the user, and let them resubmit the transaction.
Most deadlocks occur because resources aren't updated in the same order.
This problem is amplified by frequent scans due to a lack of indexes. In my
experience, the best way to mitigate the effect of deadlocks is to execute
transactions within stored procedures. If the transaction is enclosed in a
stored procedure, then most often it's a simple matter to fix the problem:
usually rearranging statements or adding a simple select statement can be
used to alter the order in which locks are obtained. That's a lot more
difficult to do in client code that needs to be redeployed to tens or
hundreds of workstations.

There are several other reasons to issue updates within stored procedures.
It's easier to issue set-based operations within a procedure--just send the
changes in temp tables or (frown) block the changes in a large varchar
parameter that can be parsed within the procedure, and then issue set-based
statements to commit the changes. Using stored procedures also makes it
easier to wait until the last possible instant to start the transaction.
Set-based operations are much faster and more scalable than row-based
operations because triggers fire only once, updates to indexes can be
optimized, and transaction logging is minimized--all of which can serve to
minimize the duration of transactions, which in turn reduces the probability
of deadlocks. If you must use a cursor, then it's best to cache the results
in temp tables or table variables and flush them using set-based operations.

Another technique is to use optimistic concurrency with rowversioning, which
is again simpler to accomplish in a stored procedure. All of the work
required to calculate the results to be committed is done under the READ
COMMITTED isolation level, with the max rowversion of each row source cached
in a local variable and the results cached in table variables, so that all
that is left to do after the transaction is started is to lock the source
rows with REPEATABLE READ (verifying at the same time that the max rowversion
hasn't changed), to apply update locks on all rows to be modified or deleted,
and finally to issue the statements that commit the changes.
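A much-simplified, client-side illustration of the rowversion idea (not the full REPEATABLE READ pattern described above; the Customers table, its RowVer rowversion column, and the method are hypothetical):

using System;
using System.Data.SqlClient;

class OptimisticUpdate
{
    // Returns false if another transaction changed the row since we read it (a concurrency collision).
    public static bool TryUpdatePhone(string connStr, int customerId, string newPhone)
    {
        using (SqlConnection con = new SqlConnection(connStr))
        {
            con.Open();

            // 1. Read the current rowversion outside of any transaction (plain READ COMMITTED).
            byte[] originalVersion;
            using (SqlCommand read = new SqlCommand(
                "SELECT RowVer FROM Customers WHERE CustomerId = @id", con))
            {
                read.Parameters.AddWithValue("@id", customerId);
                originalVersion = (byte[])read.ExecuteScalar();
            }

            // ... calculate whatever needs to change ...

            // 2. Apply the change only if the row still carries the rowversion we read.
            using (SqlCommand update = new SqlCommand(
                "UPDATE Customers SET Phone = @phone WHERE CustomerId = @id AND RowVer = @rv", con))
            {
                update.Parameters.AddWithValue("@phone", newPhone);
                update.Parameters.AddWithValue("@id", customerId);
                update.Parameters.AddWithValue("@rv", originalVersion);
                return update.ExecuteNonQuery() == 1;   // 0 rows means someone else got there first
            }
        }
    }
}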
To summarize: (1) Enclose transaction processing within stored procedures.
(2) Make sure that you obtain locks in the same order in every procedure,
trigger, function, or process. (3) Wait until the last possible instant to
start a transaction. And (4) keep transaction duration as short as possible
by using set-based operations, by caching and flushing, and/or by using
optimistic concurrency.
One other thing: redundancy in a database can increase the probability of
deadlocks. Make sure your database schema conforms at a minimum to
Boyce-Codd normal form. And (this will probably cause Celko to get out his
flame thrower!) use surrogate keys instead of natural keys in DRI
relationships to eliminate the redundancy inherent in natural foreign keys.
A database that is in 5th normal form and that uses surrogate keys correctly
has the additional property that each extrinsic atomic value exists in
exactly one place in the database. Redundancy can thus be completely
eliminated from the database.
"Zeng" <Ze******@hotmail.com> wrote in message
news:eG**************@TK2MSFTNGP10.phx.gbl... Thanks for the advice. I understand that indexing and update order of the tables contribute to deadlocks. My question is is it possible to make a large application deadlock free? If yes, please share the tips with me how to ensure that; what type of guidelines/disciplines to follow to ensure tables are always updated in correct order when there are storeprocedures, triggers, and direct queries can hit the db at the same time.
If deadlock free is not guaranteed, then it sounds to me that we would need to put retrying blocks into the code - no other way around.
Hope to hear back from you, thanks again.
"Andrew J. Kelly" <sq************@shadhawk.com> wrote in message news:uT**************@TK2MSFTNGP12.phx.gbl... Most deadlocks are due to poor indexing and inconsistent updating in the order of the tables. You may want to concentrate more on the database schema and code vs. the C# code.
-- Andrew J. Kelly SQL MVP
"Zeng" <Ze******@hotmail.com> wrote in message news:%2****************@TK2MSFTNGP14.phx.gbl... > Hi, > > The bigger my C# web-application gets, the more places I need to put in > the > tedious retrying block of code to make sure operations that can run > into > database deadlocks are re-run (retried) 3-4 times and give up if after > that > it's still in deadlock. I'm very sure that many experienced people out > there already deal with this issue somehow. Is there an alternative to it? > Thanks for your comments and suggestions. > > >
"Andrew J. Kelly" <sq************@shadhawk.com> wrote in message
news:OD**************@TK2MSFTNGP10.phx.gbl... Why do you say that? It basically stops readers from blocking writers and visa versa. It does nothing to prevent writers from blocking writers in the reverse order.
Here are my reasons for expecting that to eliminate most deadlocks:
1. My feeling is that the majority of deadlocks involve a shared lock, and
those will be directly eliminated by not issuing shared locks. For the
purposes of deadlock elimination, READ COMMITTED SNAPSHOT isolation is like
putting NOLOCK on every query, which is a common remedial measure for
deadlocks.
2. Across-the-board improvements in concurrency reduce deadlocks
automatically. This is for the same reason that only busy databases
deadlock. Eliminating S locks will cause fewer lock waits for writing
transactions, which will cause them to hold their X locks for less time,
reducing the window in which they would be vulnerable to a deadlock.
3. Most of the remaining deadlock scenarios are pretty simple and can be
considered coding errors or table design errors.
4. READ COMMITTED SNAPSHOT is very similar to Oracle's multi-version read
consistency, and deadlocks are extremely rare in Oracle.
David
I just hate the thought that people will flock to snapshot isolation
because it is perceived to be easier and better than read committed. Most
developers I meet who use Oracle do not realize what that (or this new
isolation level) means with regard to the data they are returning. Too many
make decisions based on the data returned without regard to the fact that the
data may be changing underneath them. I am not saying there won't be
proper times to use this, as everything has its place. But I already see
too many people thinking this will solve all their problems
automatically. Anyway, enough of this, as it is one of those religious-war
kinds of topics <g>.
--
Andrew J. Kelly SQL MVP
"David Browne" <davidbaxterbrowne no potted me**@hotmail.com> wrote in
message news:Ow**************@TK2MSFTNGP15.phx.gbl... "Andrew J. Kelly" <sq************@shadhawk.com> wrote in message news:OD**************@TK2MSFTNGP10.phx.gbl... Why do you say that? It basically stops readers from blocking writers and visa versa. It does nothing to prevent writers from blocking writers in the reverse order.
Here are my reasons:
1 My feeling is that a majority of deadlocks involve a shared lock will be directly eliminated by not issuing shared locks. For the purposes of deadlock elimination, READ CONCURRENT SNAPSHOT isolation is like putting NOLOCK on every query, which is a common remedial measure for deadlocks.
2 Across the board improvements in concurrency reduce deadlocks automatically. This is for the same reason that only busy databases deadlock. Eliminating S locks will cause fewer lock waits for writing transactions, which will cause them to hold their X locks for less time, reducing the time in which they would be vulnerable to a deadlock.
3 Most of the remaining deadlock scenarios are pretty simple and can be considered coding errors or table design errors.
4 READ CONCURRENT SNAPSHOT is very similar to Oracle's Multi-Version Read Concurrency, and Deadlocks are extremely rare in Oracle.
David
Would you be able to give me a pointer to where I can find more information
about set-based operations, such as how to implement them for SQL Server 2000,
and pro/con comparisons with row-based operations?
Thank you very much for the detailed guidance.
"Brian Selzer" <br***@selzer-software.com> wrote in message
news:un**************@TK2MSFTNGP10.phx.gbl... If you have more than one connection to the database, then deadlocks can happen. Deadlocks can occur on resources other than just locks. Threads and memory are some examples. Sometimes parallelism can cause deadlocks. These other kinds of deadlocks are rare, and usually occur on servers with very heavy loads, but they can occur. Therefore, unless you take no pride in your work, you should take into account that they can occur in your
code. Retry blocks are one possible solution, and can also be used to recover
from optimistic concurrency collisions. Another would be to return an error message to the user, and let them resubmit the transaction.
Most deadlocks occur because resources aren't updated in the same order. This problem is amplified by frequent scans due to a lack of indexes. In
my experience, the best way to mitigate the effect of deadlocks is to execute transactions within stored procedures. If the transaction is enclosed in
a stored procedure, then most often it's a simple matter to fix the problem. Usually rearranging statements or adding a simple select statement can be used to alter the order in which locks are obtained. That's a lot more difficult to do in client code that needs to be redeployed to tens or hundreds of workstations. There are several other reasons to issue
updates within stored procedures. It's easier to issue set-based operations
within a procedure--just send the changes in temp tables or (frown) block the changes in a large varchar parameter that can be parsed within the
procedure and then issue set-based statements to commit the changes. Using stored procedures makes it easier to wait until the last possible instant to
start the transaction. Set-based operations are much faster and more scalable than row-based operations because triggers fire only once, updates to indexes can be optimized, and transaction logging is minimized--all of
which can serve to minimize the duration of transactions which will consequently reduce the probability of deadlocks. If you must use a cursor, then it's best to cache the results in temp tables or table variables and flush them using set-based operations. Another technique is to use optimistic concurrency with rowversioning which is again simpler to accoplish in a stored procedure. All of the work required to calculate the results to be committed is done under a READ COMMITTED isolation level with the max rowversion of each row source cached in local variable and the results cached in table variables so that all that is left to do after the transaction is started is to lock the source rows with REPEATABLE READ verifying at the same time that the max rowversion hasn't changed, to
apply update locks on all rows to be modified or deleted, and finally to issue
the statements that commit the changes.
To summarize: (1) Enclose transaction processing within stored procedures. (2) Make sure that you obtain locks in the same order in every procedure, trigger, function, or process. (3) Wait until the last possible instant
to start a transaction. And (4) keep transaction duration as short as
possible by using set-based operations, by caching and flushing, and/or by using optimistic concurrency.
One other thing: redundancy in a database can increase the probability of deadlocks. Make sure your database schema conforms at a minimum to Boyce-Codd normal form. And (This will probably cause Celko to get out
his flame thrower!) use surrogate keys instead of natural keys in DRI relationships to eliminate the redundancy inherent in natural foreign
keys. A database that is in 5th normal form and that uses surrogate keys
correctly has the additional property that each extrinsic atomic value exist in exactly one place in the database. Redundancy can thus be completely eliminated from the database.
"Zeng" <Ze******@hotmail.com> wrote in message news:eG**************@TK2MSFTNGP10.phx.gbl... Thanks for the advice. I understand that indexing and update order of
the tables contribute to deadlocks. My question is is it possible to make a large application deadlock free? If yes, please share the tips with me how to ensure that; what type of guidelines/disciplines to follow to ensure tables are always updated in correct order when there are
storeprocedures, triggers, and direct queries can hit the db at the same time.
If deadlock free is not guaranteed, then it sounds to me that we would need to put retrying blocks into the code - no other way around.
Hope to hear back from you, thanks again.
"Andrew J. Kelly" <sq************@shadhawk.com> wrote in message news:uT**************@TK2MSFTNGP12.phx.gbl... Most deadlocks are due to poor indexing and inconsistent updating in
the order of the tables. You may want to concentrate more on the database schema and code vs. the C# code.
-- Andrew J. Kelly SQL MVP
"Zeng" <Ze******@hotmail.com> wrote in message news:%2****************@TK2MSFTNGP14.phx.gbl... > Hi, > > The bigger my C# web-application gets, the more places I need to put
in > the > tedious retrying block of code to make sure operations that can run > into > database deadlocks are re-run (retried) 3-4 times and give up if
after > that > it's still in deadlock. I'm very sure that many experienced people
out > there already deal with this issue somehow. Is there an alternative
to it? > Thanks for your comments and suggestions. > > >
How to implement set-based operations? They are inherent in every SQL
database. Perhaps you misunderstand what I mean by set-based and row-based
operations: a set-based INSERT, UPDATE or DELETE statement may affect
multiple rows, whereas with row-based processing a separate INSERT, UPDATE
or DELETE statement is issued for each affected row.
There are a plethora of articles, books, and courses available on SQL.
Since cursors can be used and misused in many different ways (not to mention
that there are several different types of cursors), there isn't a simple
all-encompassing pro/con comparison that can be made. There are times
(albeit very rare) when a cursor will outperform a similar set-based
query--sometimes even by several orders of magnitude. Most of the time,
however, the reverse is true.
These facts are always true:
(1) triggers fire once per statement. If you send 100 INSERT...VALUES
statements, all INSERT triggers on the table will fire 100 times, whereas if
you issue the 100 INSERT...VALUES against a temporary table, and then issue
a single INSERT...SELECT, then the INSERT triggers will only fire once.
(2) updates to indexes can be optimized. If you send 100 INSERT...VALUES
statements, then index maintenance is performed 100 times, which may mean 99
extra logical writes per index. A single INSERT...SELECT will cause each
index to be updated only once, and if several changes are made on the same
index page, then instead of several individual updates to the same index
page, you get a single write to that index page.
(3) transaction logging is minimized. Every statement that executes has a
certain amount of transaction log overhead associated with it, so if you
send 100 INSERT...VALUES statements, then there is 100 times the overhead
than with a single INSERT...SELECT. In addition, since index updates are
optimized to minimize writes to each index, it follows that the number of
writes to the transaction log to record the old and new index entries is
similarly reduced.
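As a rough client-side illustration of the difference (the Orders table and the #NewOrders staging table are hypothetical): the row-based approach would issue one INSERT...VALUES per row directly against Orders, while the sketch below stages the rows in a temp table and commits them with a single INSERT...SELECT, so any trigger on Orders fires once and the index and log work is batched:

using System.Data;
using System.Data.SqlClient;

class SetBasedInsert
{
    // Stages the new rows in a temp table on the same open connection, then commits them
    // with a single INSERT...SELECT.
    public static void InsertOrders(SqlConnection con, int[] customerIds)
    {
        new SqlCommand("CREATE TABLE #NewOrders (CustomerId INT NOT NULL)", con).ExecuteNonQuery();

        using (SqlCommand stage = new SqlCommand(
            "INSERT INTO #NewOrders (CustomerId) VALUES (@id)", con))
        {
            SqlParameter p = stage.Parameters.Add("@id", SqlDbType.Int);
            foreach (int id in customerIds)
            {
                p.Value = id;              // row-based work happens against the trigger-free temp table
                stage.ExecuteNonQuery();
            }
        }

        new SqlCommand(
            "INSERT INTO Orders (CustomerId) SELECT CustomerId FROM #NewOrders; DROP TABLE #NewOrders",
            con).ExecuteNonQuery();        // one statement against the real table
    }
}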
"Zeng" <Ze******@hotmail.com> wrote in message
news:Oi**************@TK2MSFTNGP12.phx.gbl... Would you be able to give me a pointer to where I can find more information about set-based operations such as how to implement them for sqlserver 2000 and pro/con comparisons with the row based operations?
Thank you very much for the detailed guidance.
"Brian Selzer" <br***@selzer-software.com> wrote in message news:un**************@TK2MSFTNGP10.phx.gbl... If you have more than one connection to the database, then deadlocks can happen. Deadlocks can occur on resources other than just locks. Threads and memory are some examples. Sometimes parallelism can cause deadlocks. These other kinds of deadlocks are rare, and usually occur on servers with very heavy loads, but they can occur. Therefore, unless you take no pride in your work, you should take into account that they can occur in your code. Retry blocks are one possible solution, and can also be used to recover
from optimistic concurrency collisions. Another would be to return an error message to the user, and let them resubmit the transaction.
Most deadlocks occur because resources aren't updated in the same order. This problem is amplified by frequent scans due to a lack of indexes. In my experience, the best way to mitigate the effect of deadlocks is to execute transactions within stored procedures. If the transaction is enclosed in a stored procedure, then most often it's a simple matter to fix the problem. Usually rearranging statements or adding a simple select statement can be used to alter the order in which locks are obtained. That's a lot more difficult to do in client code that needs to be redeployed to tens or hundreds of workstations. There are several other reasons to issue updates within stored procedures. It's easier to issue set-based operations within a procedure--just send the changes in temp tables or (frown) block the changes in a large varchar parameter that can be parsed within the procedure and then issue set-based statements to commit the changes. Using stored procedures makes it easier to wait until the last possible instant to start the transaction. Set-based operations are much faster and more scalable than row-based operations because triggers fire only once, updates to indexes can be optimized, and transaction logging is minimized--all of which can serve to minimize the duration of transactions which will consequently reduce the probability of deadlocks. If you must use a cursor, then it's best to cache the results in temp tables or table variables and flush them using set-based operations. Another technique is to use optimistic concurrency with rowversioning which is again simpler to accoplish in a stored procedure. All of the work required to calculate the results to be committed is done under a READ COMMITTED isolation level with the max rowversion of each row source cached in local variable and the results cached in table variables so that all that is left to do after the transaction is started is to lock the source rows with REPEATABLE READ verifying at the same time that the max rowversion hasn't changed, to apply update locks on all rows to be modified or deleted, and finally to issue the statements that commit the changes.
To summarize: (1) Enclose transaction processing within stored procedures. (2) Make sure that you obtain locks in the same order in every procedure, trigger, function, or process. (3) Wait until the last possible instant to start a transaction. And (4) keep transaction duration as short as possible by using set-based operations, by caching and flushing, and/or by using optimistic concurrency.
One other thing: redundancy in a database can increase the probability of deadlocks. Make sure your database schema conforms at a minimum to Boyce-Codd normal form. And (This will probably cause Celko to get out his flame thrower!) use surrogate keys instead of natural keys in DRI relationships to eliminate the redundancy inherent in natural foreign keys. A database that is in 5th normal form and that uses surrogate keys correctly has the additional property that each extrinsic atomic value exist in exactly one place in the database. Redundancy can thus be completely eliminated from the database.
"Zeng" <Ze******@hotmail.com> wrote in message news:eG**************@TK2MSFTNGP10.phx.gbl... > Thanks for the advice. I understand that indexing and update order of the > tables contribute to deadlocks. My question is is it possible to make > a > large application deadlock free? If yes, please share the tips with me > how > to ensure that; what type of guidelines/disciplines to follow to ensure > tables are always updated in correct order when there are storeprocedures, > triggers, and direct queries can hit the db at the same time. > > If deadlock free is not guaranteed, then it sounds to me that we would > need > to put retrying blocks into the code - no other way around. > > Hope to hear back from you, thanks again. > > "Andrew J. Kelly" <sq************@shadhawk.com> wrote in message > news:uT**************@TK2MSFTNGP12.phx.gbl... >> Most deadlocks are due to poor indexing and inconsistent updating in the >> order of the tables. You may want to concentrate more on the database >> schema and code vs. the C# code. >> >> -- >> Andrew J. Kelly SQL MVP >> >> >> "Zeng" <Ze******@hotmail.com> wrote in message >> news:%2****************@TK2MSFTNGP14.phx.gbl... >> > Hi, >> > >> > The bigger my C# web-application gets, the more places I need to put in >> > the >> > tedious retrying block of code to make sure operations that can run >> > into >> > database deadlocks are re-run (retried) 3-4 times and give up if after >> > that >> > it's still in deadlock. I'm very sure that many experienced people out >> > there already deal with this issue somehow. Is there an alternative to > it? >> > Thanks for your comments and suggestions. >> > >> > >> > >> >> > >
Thanks. By the way, in the earlier post you mentioned that if the
transaction is enclosed in a stored proc, you would just re-arrange
statements or add a simple select statement to alter the order in which
locks are obtained. Would you be able to elaborate on that? How do I know
which order is the right order, or do you mean that when we have a deadlock
we should then try to figure out the order...? That won't work well as far
as I can see. In other words, is there
1) a way/tool to figure out what the table order of all statements in a
transaction should be?
2) a discipline that helps us maintain the correct order from the beginning,
such as "always try your best to hit db tables in alphabetical order within a
transaction"?
3) a way, in C#, to retrieve deadlock info such as which tables and
operations were involved in the deadlock (via the Exception etc.)? Tracing
those is not something I know how to do right now.
Thanks again!
"Brian Selzer" <br***@selzer-software.com> wrote in message
news:ub**************@TK2MSFTNGP15.phx.gbl... How to implement set-based operations? They are inherent in every SQL database. Perhaps you misunderstand what I mean by set-based and
row-based operations. A set-based INSERT, UPDATE or DELETE statement may affect multple rows, whereas with row-based processing, a separate INSERT, UPDATE or DELETE statement is issued for each affected row.
There are a plethora of articles, books, and courses available on SQL. Since cursors can be used and misused in many different ways (not to
mention that there are several different types of cursors), there isn't a simple all-encompassing pro/con comparison that can be made. There are times (albeit very rare) when a cursor will outperform a similar set-based query--sometimes even by several orders of magnitude. Most of the time, however, the reverse is true.
These facts are always true:
(1) triggers fire once per statement. If you send 100 INSERT...VALUES statements, all INSERT triggers on the table will fire 100 times, whereas
if you issue the 100 INSERT...VALUES against a temporary table, and then
issue a single INSERT...SELECT, then the INSERT triggers will only fire once.
(2) updates to indexes can be optimized. If you send 100 INSERT...VALUES statements, then index maintenance is performed 100 times, which may mean
99 extra logical writes per index. A single INSERT...SELECT will cause each index to be updated only once, and if several changes are made on the same index page, then instead of several individual updates to the same index page, you get a single write to that index page.
(3) transaction logging is minimized. Every statement that executes has a certain amount of transaction log overhead associated with it, so if you send 100 INSERT...VALUES statements, then there is 100 times the overhead than with a single INSERT...SELECT. In addition, since index updates are optimized to minimize writes to each index, it follows that the number of writes to the transaction log to record the old and new index entries is similarly reduced.
"Zeng" <Ze******@hotmail.com> wrote in message news:Oi**************@TK2MSFTNGP12.phx.gbl... Would you be able to give me a pointer to where I can find more information about set-based operations such as how to implement them for sqlserver 2000 and pro/con comparisons with the row based operations?
Thank you very much for the detailed guidance.
"Brian Selzer" <br***@selzer-software.com> wrote in message news:un**************@TK2MSFTNGP10.phx.gbl... If you have more than one connection to the database, then deadlocks
can happen. Deadlocks can occur on resources other than just locks.
Threads and memory are some examples. Sometimes parallelism can cause
deadlocks. These other kinds of deadlocks are rare, and usually occur on servers with very heavy loads, but they can occur. Therefore, unless you take no pride in your work, you should take into account that they can occur in your code. Retry blocks are one possible solution, and can also be used to recover
from optimistic concurrency collisions. Another would be to return an error message to the user, and let them resubmit the transaction.
Most deadlocks occur because resources aren't updated in the same
order. This problem is amplified by frequent scans due to a lack of indexes.
In my experience, the best way to mitigate the effect of deadlocks is to execute transactions within stored procedures. If the transaction is enclosed
in a stored procedure, then most often it's a simple matter to fix the problem. Usually rearranging statements or adding a simple select statement can
be used to alter the order in which locks are obtained. That's a lot more difficult to do in client code that needs to be redeployed to tens or hundreds of workstations. There are several other reasons to issue updates within stored procedures. It's easier to issue set-based operations within a procedure--just send the changes in temp tables or (frown) block the changes in a large varchar parameter that can be parsed within the procedure and then issue set-based statements to commit the changes. Using
stored procedures makes it easier to wait until the last possible instant to start the transaction. Set-based operations are much faster and more
scalable than row-based operations because triggers fire only once, updates to indexes can be optimized, and transaction logging is minimized--all of which can serve to minimize the duration of transactions which will consequently reduce the probability of deadlocks. If you must use a cursor, then
it's best to cache the results in temp tables or table variables and flush them using set-based operations. Another technique is to use optimistic concurrency with rowversioning which is again simpler to accoplish in a stored procedure. All of the work required to calculate the results to be committed is done under a READ COMMITTED isolation level with the max rowversion of each row source cached in local variable and the results cached in table variables so that all that is left to do after the transaction is started is to lock the source rows with REPEATABLE READ verifying at the same time that the max rowversion hasn't changed, to apply update locks on all rows to be modified or deleted, and finally to
issue the statements that commit the changes.
To summarize: (1) Enclose transaction processing within stored procedures. (2) Make sure that you obtain locks in the same order in every
procedure, trigger, function, or process. (3) Wait until the last possible
instant to start a transaction. And (4) keep transaction duration as short as possible by using set-based operations, by caching and flushing, and/or by using optimistic concurrency.
One other thing: redundancy in a database can increase the probability
of deadlocks. Make sure your database schema conforms at a minimum to Boyce-Codd normal form. And (This will probably cause Celko to get out his flame thrower!) use surrogate keys instead of natural keys in DRI relationships to eliminate the redundancy inherent in natural foreign keys. A database that is in 5th normal form and that uses surrogate keys correctly has the additional property that each extrinsic atomic value exist in exactly one place in the database. Redundancy can thus be completely eliminated from the database.
"Zeng" <Ze******@hotmail.com> wrote in message news:eG**************@TK2MSFTNGP10.phx.gbl... > Thanks for the advice. I understand that indexing and update order of the > tables contribute to deadlocks. My question is is it possible to
make > a > large application deadlock free? If yes, please share the tips with
me > how > to ensure that; what type of guidelines/disciplines to follow to
ensure > tables are always updated in correct order when there are storeprocedures, > triggers, and direct queries can hit the db at the same time. > > If deadlock free is not guaranteed, then it sounds to me that we
would > need > to put retrying blocks into the code - no other way around. > > Hope to hear back from you, thanks again. > > "Andrew J. Kelly" <sq************@shadhawk.com> wrote in message > news:uT**************@TK2MSFTNGP12.phx.gbl... >> Most deadlocks are due to poor indexing and inconsistent updating in the >> order of the tables. You may want to concentrate more on the
database >> schema and code vs. the C# code. >> >> -- >> Andrew J. Kelly SQL MVP >> >> >> "Zeng" <Ze******@hotmail.com> wrote in message >> news:%2****************@TK2MSFTNGP14.phx.gbl... >> > Hi, >> > >> > The bigger my C# web-application gets, the more places I need to
put in >> > the >> > tedious retrying block of code to make sure operations that can
run >> > into >> > database deadlocks are re-run (retried) 3-4 times and give up if after >> > that >> > it's still in deadlock. I'm very sure that many experienced
people out >> > there already deal with this issue somehow. Is there an
alternative to > it? >> > Thanks for your comments and suggestions. >> > >> > >> > >> >> > >
Tibor has a good collection of articles about locking problems on http://www.karaszi.com/SQLServer/msarticles.asp
/Fredrik
"Zeng" <Ze******@hotmail.com> wrote in message
news:%2****************@TK2MSFTNGP12.phx.gbl... Thanks, by the way, in the earlier post, you mentioned that if the transaction is enclosed in a store proc, you would just re-arrange statements or adding a ssimple select statement to alter the order in which locks are obtained. Would you be able to elaborate that? How do I know which order is the right order, or maybe you mean when we have a deadlock, then try to figure out the order...? That won't work well that I could see. In another word, is there 1) a way/tool to figure out the table order of all statements in a transaction should be? 2) a discipline that helps us maintain the correct order from the beginning such as "always try the best to hit db tables in alphabetical order within a transaction"? 3) In C#, is there a way to retrieve deadlock info such as which tables and operations were involved in the deadlock via Exception etc..? Tracing those is not something I know how to do right now.
Thanks again!
"Brian Selzer" <br***@selzer-software.com> wrote in message news:ub**************@TK2MSFTNGP15.phx.gbl... How to implement set-based operations? They are inherent in every SQL database. Perhaps you misunderstand what I mean by set-based and row-based operations. A set-based INSERT, UPDATE or DELETE statement may affect multple rows, whereas with row-based processing, a separate INSERT, UPDATE or DELETE statement is issued for each affected row.
There are a plethora of articles, books, and courses available on SQL. Since cursors can be used and misused in many different ways (not to mention that there are several different types of cursors), there isn't a simple all-encompassing pro/con comparison that can be made. There are times (albeit very rare) when a cursor will outperform a similar set-based query--sometimes even by several orders of magnitude. Most of the time, however, the reverse is true.
These facts are always true:
(1) triggers fire once per statement. If you send 100 INSERT...VALUES statements, all INSERT triggers on the table will fire 100 times, whereas if you issue the 100 INSERT...VALUES against a temporary table, and then issue a single INSERT...SELECT, then the INSERT triggers will only fire once.
(2) updates to indexes can be optimized. If you send 100 INSERT...VALUES statements, then index maintenance is performed 100 times, which may mean 99 extra logical writes per index. A single INSERT...SELECT will cause each index to be updated only once, and if several changes are made on the same index page, then instead of several individual updates to the same index page, you get a single write to that index page.
(3) transaction logging is minimized. Every statement that executes has a certain amount of transaction log overhead associated with it, so if you send 100 INSERT...VALUES statements, then there is 100 times the overhead than with a single INSERT...SELECT. In addition, since index updates are optimized to minimize writes to each index, it follows that the number of writes to the transaction log to record the old and new index entries is similarly reduced.
"Zeng" <Ze******@hotmail.com> wrote in message news:Oi**************@TK2MSFTNGP12.phx.gbl... > Would you be able to give me a pointer to where I can find more > information > about set-based operations such as how to implement them for sqlserver > 2000 > and pro/con comparisons with the row based operations? > > Thank you very much for the detailed guidance. > > > "Brian Selzer" <br***@selzer-software.com> wrote in message > news:un**************@TK2MSFTNGP10.phx.gbl... >> If you have more than one connection to the database, then deadlocks can >> happen. Deadlocks can occur on resources other than just locks. Threads >> and memory are some examples. Sometimes parallelism can cause deadlocks. >> These other kinds of deadlocks are rare, and usually occur on servers >> with >> very heavy loads, but they can occur. Therefore, unless you take no >> pride >> in your work, you should take into account that they can occur in your > code. >> >> Retry blocks are one possible solution, and can also be used to >> recover > from >> optimistic concurrency collisions. Another would be to return an >> error >> message to the user, and let them resubmit the transaction. >> >> Most deadlocks occur because resources aren't updated in the same order. >> This problem is amplified by frequent scans due to a lack of indexes. In > my >> experience, the best way to mitigate the effect of deadlocks is to >> execute >> transactions within stored procedures. If the transaction is enclosed in > a >> stored procedure, then most often it's a simple matter to fix the >> problem. >> Usually rearranging statements or adding a simple select statement can be >> used to alter the order in which locks are obtained. That's a lot >> more >> difficult to do in client code that needs to be redeployed to tens or >> hundreds of workstations. There are several other reasons to issue > updates >> within stored procedures. It's easier to issue set-based operations > within >> a procedure--just send the changes in temp tables or (frown) block the >> changes in a large varchar parameter that can be parsed within the > procedure >> and then issue set-based statements to commit the changes. Using stored >> procedures makes it easier to wait until the last possible instant to > start >> the transaction. Set-based operations are much faster and more scalable >> than row-based operations because triggers fire only once, updates to >> indexes can be optimized, and transaction logging is minimized--all of > which >> can serve to minimize the duration of transactions which will >> consequently >> reduce the probability of deadlocks. If you must use a cursor, then it's >> best to cache the results in temp tables or table variables and flush >> them >> using set-based operations. Another technique is to use optimistic >> concurrency with rowversioning which is again simpler to accoplish in >> a >> stored procedure. All of the work required to calculate the results >> to >> be >> committed is done under a READ COMMITTED isolation level with the max >> rowversion of each row source cached in local variable and the results >> cached in table variables so that all that is left to do after the >> transaction is started is to lock the source rows with REPEATABLE READ >> verifying at the same time that the max rowversion hasn't changed, to > apply >> update locks on all rows to be modified or deleted, and finally to issue > the >> statements that commit the changes. >> >> To summarize: (1) Enclose transaction processing within stored >> procedures. 
>> (2) Make sure that you obtain locks in the same order in every procedure, >> trigger, function, or process. (3) Wait until the last possible instant > to >> start a transaction. And (4) keep transaction duration as short as > possible >> by using set-based operations, by caching and flushing, and/or by >> using >> optimistic concurrency. >> >> One other thing: redundancy in a database can increase the probability of >> deadlocks. Make sure your database schema conforms at a minimum to >> Boyce-Codd normal form. And (This will probably cause Celko to get >> out > his >> flame thrower!) use surrogate keys instead of natural keys in DRI >> relationships to eliminate the redundancy inherent in natural foreign > keys. >> A database that is in 5th normal form and that uses surrogate keys > correctly >> has the additional property that each extrinsic atomic value exist in >> exactly one place in the database. Redundancy can thus be completely >> eliminated from the database. >> >> "Zeng" <Ze******@hotmail.com> wrote in message >> news:eG**************@TK2MSFTNGP10.phx.gbl... >> > Thanks for the advice. I understand that indexing and update order >> > of > the >> > tables contribute to deadlocks. My question is is it possible to make >> > a >> > large application deadlock free? If yes, please share the tips with me >> > how >> > to ensure that; what type of guidelines/disciplines to follow to ensure >> > tables are always updated in correct order when there are > storeprocedures, >> > triggers, and direct queries can hit the db at the same time. >> > >> > If deadlock free is not guaranteed, then it sounds to me that we would >> > need >> > to put retrying blocks into the code - no other way around. >> > >> > Hope to hear back from you, thanks again. >> > >> > "Andrew J. Kelly" <sq************@shadhawk.com> wrote in message >> > news:uT**************@TK2MSFTNGP12.phx.gbl... >> >> Most deadlocks are due to poor indexing and inconsistent updating >> >> in > the >> >> order of the tables. You may want to concentrate more on the database >> >> schema and code vs. the C# code. >> >> >> >> -- >> >> Andrew J. Kelly SQL MVP >> >> >> >> >> >> "Zeng" <Ze******@hotmail.com> wrote in message >> >> news:%2****************@TK2MSFTNGP14.phx.gbl... >> >> > Hi, >> >> > >> >> > The bigger my C# web-application gets, the more places I need to put > in >> >> > the >> >> > tedious retrying block of code to make sure operations that can run >> >> > into >> >> > database deadlocks are re-run (retried) 3-4 times and give up if > after >> >> > that >> >> > it's still in deadlock. I'm very sure that many experienced people > out >> >> > there already deal with this issue somehow. Is there an alternative > to >> > it? >> >> > Thanks for your comments and suggestions. >> >> > >> >> > >> >> > >> >> >> >> >> > >> > >> >> > >
Here's an example of how to use a simple select statement:

BEGIN TRAN

DECLARE @table1Key INT

-- Take an update lock on the table1 row first, so table1 is always locked before table2.
SELECT @table1Key = keyColumn
FROM table1 WITH (UPDLOCK)
WHERE table1RestrictClause

UPDATE table2 SET ... WHERE table2RestrictClause
UPDATE table1 SET ... WHERE table1RestrictClause

COMMIT TRAN
Update locks can only be obtained by one transaction at a time. The select
statement arranges the lock order so that rows in table1 are locked before
rows in table2.
How do you know which order is correct? The criterion you use doesn't really
matter, provided it's applied consistently throughout the system. There is
one thing you should consider, however: you need to walk the declarative
referential integrity hierarchy up from the bottom. For example, if you have
3 tables T1 (PK1), T2 (PK2, FK1), and T3 (PK3, FK2), then you should put T3
before T2 and T2 before T1; in other words, the order should be T3, T2, T1.
This should be obvious, because you have to delete all of the rows in T3 that
are related to a row in T2 before you can delete the row in T2.
This brings up another point. Cascading referential actions increase the
complexity of deadlock minimization. If you update a key value, every row
that contains that key value must be locked, and that complicates the order
in which locks are obtained for tables involved in a declarative referential
integrity constraint that involves natural keys. The order in which the
system obtains the locks for tables affected by a cascading referential
action is not clear since there isn't any documentation to support it. This
can increase the possibility of deadlocks because locks might be obtained
out of order due to a cascading referential action. This bolsters the
argument for using IDENTITY primary key columns because they cannot be
changed.
"Zeng" <Ze******@hotmail.com> wrote in message
news:%2****************@TK2MSFTNGP12.phx.gbl... Thanks, by the way, in the earlier post, you mentioned that if the transaction is enclosed in a store proc, you would just re-arrange statements or adding a ssimple select statement to alter the order in which locks are obtained. Would you be able to elaborate that? How do I know which order is the right order, or maybe you mean when we have a deadlock, then try to figure out the order...? That won't work well that I could see. In another word, is there 1) a way/tool to figure out the table order of all statements in a transaction should be? 2) a discipline that helps us maintain the correct order from the beginning such as "always try the best to hit db tables in alphabetical order within a transaction"? 3) In C#, is there a way to retrieve deadlock info such as which tables and operations were involved in the deadlock via Exception etc..? Tracing those is not something I know how to do right now.
Thanks again!
"Brian Selzer" <br***@selzer-software.com> wrote in message news:ub**************@TK2MSFTNGP15.phx.gbl... How to implement set-based operations? They are inherent in every SQL database. Perhaps you misunderstand what I mean by set-based and row-based operations. A set-based INSERT, UPDATE or DELETE statement may affect multple rows, whereas with row-based processing, a separate INSERT, UPDATE or DELETE statement is issued for each affected row.
There are a plethora of articles, books, and courses available on SQL. Since cursors can be used and misused in many different ways (not to mention that there are several different types of cursors), there isn't a simple all-encompassing pro/con comparison that can be made. There are times (albeit very rare) when a cursor will outperform a similar set-based query--sometimes even by several orders of magnitude. Most of the time, however, the reverse is true.
These facts are always true:
(1) triggers fire once per statement. If you send 100 INSERT...VALUES statements, all INSERT triggers on the table will fire 100 times, whereas if you issue the 100 INSERT...VALUES against a temporary table, and then issue a single INSERT...SELECT, then the INSERT triggers will only fire once.
(2) updates to indexes can be optimized. If you send 100 INSERT...VALUES statements, then index maintenance is performed 100 times, which may mean 99 extra logical writes per index. A single INSERT...SELECT will cause each index to be updated only once, and if several changes are made on the same index page, then instead of several individual updates to the same index page, you get a single write to that index page.
(3) transaction logging is minimized. Every statement that executes has a certain amount of transaction log overhead associated with it, so if you send 100 INSERT...VALUES statements, then there is 100 times the overhead than with a single INSERT...SELECT. In addition, since index updates are optimized to minimize writes to each index, it follows that the number of writes to the transaction log to record the old and new index entries is similarly reduced.
"Zeng" <Ze******@hotmail.com> wrote in message news:Oi**************@TK2MSFTNGP12.phx.gbl... > Would you be able to give me a pointer to where I can find more > information > about set-based operations such as how to implement them for sqlserver > 2000 > and pro/con comparisons with the row based operations? > > Thank you very much for the detailed guidance. > > > "Brian Selzer" <br***@selzer-software.com> wrote in message > news:un**************@TK2MSFTNGP10.phx.gbl... >> If you have more than one connection to the database, then deadlocks can >> happen. Deadlocks can occur on resources other than just locks. Threads >> and memory are some examples. Sometimes parallelism can cause deadlocks. >> These other kinds of deadlocks are rare, and usually occur on servers >> with >> very heavy loads, but they can occur. Therefore, unless you take no >> pride >> in your work, you should take into account that they can occur in your > code. >> >> Retry blocks are one possible solution, and can also be used to >> recover > from >> optimistic concurrency collisions. Another would be to return an >> error >> message to the user, and let them resubmit the transaction. >> >> Most deadlocks occur because resources aren't updated in the same order. >> This problem is amplified by frequent scans due to a lack of indexes. In > my >> experience, the best way to mitigate the effect of deadlocks is to >> execute >> transactions within stored procedures. If the transaction is enclosed in > a >> stored procedure, then most often it's a simple matter to fix the >> problem. >> Usually rearranging statements or adding a simple select statement can be >> used to alter the order in which locks are obtained. That's a lot >> more >> difficult to do in client code that needs to be redeployed to tens or >> hundreds of workstations. There are several other reasons to issue > updates >> within stored procedures. It's easier to issue set-based operations > within >> a procedure--just send the changes in temp tables or (frown) block the >> changes in a large varchar parameter that can be parsed within the > procedure >> and then issue set-based statements to commit the changes. Using stored >> procedures makes it easier to wait until the last possible instant to > start >> the transaction. Set-based operations are much faster and more scalable >> than row-based operations because triggers fire only once, updates to >> indexes can be optimized, and transaction logging is minimized--all of > which >> can serve to minimize the duration of transactions which will >> consequently >> reduce the probability of deadlocks. If you must use a cursor, then it's >> best to cache the results in temp tables or table variables and flush >> them >> using set-based operations. Another technique is to use optimistic >> concurrency with rowversioning which is again simpler to accoplish in >> a >> stored procedure. All of the work required to calculate the results >> to >> be >> committed is done under a READ COMMITTED isolation level with the max >> rowversion of each row source cached in local variable and the results >> cached in table variables so that all that is left to do after the >> transaction is started is to lock the source rows with REPEATABLE READ >> verifying at the same time that the max rowversion hasn't changed, to > apply >> update locks on all rows to be modified or deleted, and finally to issue > the >> statements that commit the changes. >> >> To summarize: (1) Enclose transaction processing within stored >> procedures. 
>> (2) Make sure that you obtain locks in the same order in every procedure, >> trigger, function, or process. (3) Wait until the last possible instant > to >> start a transaction. And (4) keep transaction duration as short as > possible >> by using set-based operations, by caching and flushing, and/or by >> using >> optimistic concurrency. >> >> One other thing: redundancy in a database can increase the probability of >> deadlocks. Make sure your database schema conforms at a minimum to >> Boyce-Codd normal form. And (This will probably cause Celko to get >> out > his >> flame thrower!) use surrogate keys instead of natural keys in DRI >> relationships to eliminate the redundancy inherent in natural foreign > keys. >> A database that is in 5th normal form and that uses surrogate keys > correctly >> has the additional property that each extrinsic atomic value exist in >> exactly one place in the database. Redundancy can thus be completely >> eliminated from the database. >> >> "Zeng" <Ze******@hotmail.com> wrote in message >> news:eG**************@TK2MSFTNGP10.phx.gbl... >> > Thanks for the advice. I understand that indexing and update order >> > of > the >> > tables contribute to deadlocks. My question is is it possible to make >> > a >> > large application deadlock free? If yes, please share the tips with me >> > how >> > to ensure that; what type of guidelines/disciplines to follow to ensure >> > tables are always updated in correct order when there are > storeprocedures, >> > triggers, and direct queries can hit the db at the same time. >> > >> > If deadlock free is not guaranteed, then it sounds to me that we would >> > need >> > to put retrying blocks into the code - no other way around. >> > >> > Hope to hear back from you, thanks again. >> > >> > "Andrew J. Kelly" <sq************@shadhawk.com> wrote in message >> > news:uT**************@TK2MSFTNGP12.phx.gbl... >> >> Most deadlocks are due to poor indexing and inconsistent updating >> >> in > the >> >> order of the tables. You may want to concentrate more on the database >> >> schema and code vs. the C# code. >> >> >> >> -- >> >> Andrew J. Kelly SQL MVP >> >> >> >> >> >> "Zeng" <Ze******@hotmail.com> wrote in message >> >> news:%2****************@TK2MSFTNGP14.phx.gbl... >> >> > Hi, >> >> > >> >> > The bigger my C# web-application gets, the more places I need to put > in >> >> > the >> >> > tedious retrying block of code to make sure operations that can run >> >> > into >> >> > database deadlocks are re-run (retried) 3-4 times and give up if > after >> >> > that >> >> > it's still in deadlock. I'm very sure that many experienced people > out >> >> > there already deal with this issue somehow. Is there an alternative > to >> > it? >> >> > Thanks for your comments and suggestions. >> >> > >> >> > >> >> > >> >> >> >> >> > >> > >> >> > >
Hi Brian,
Thanks for your explanation; although I'm not sure about paying the cost of
the extra select statement every time just to avoid deadlocks that happen
much less frequently...unless you find that, in general, the better locking
order pays for the extra select statement.
Your point about the referential integrity hierarchy made perfect sense to
me.
As for the IDENTITY type for primary keys, is there a way to declare to SQL
Server that a UNIQUEIDENTIFIER primary key won't ever be changed either? In
fact, since a primary key should never be changed, I would have thought the
database system would let us declare that.
"Brian Selzer" <br***@selzer-software.com> wrote in message
news:%2****************@TK2MSFTNGP14.phx.gbl... Here's an example of how to use a simple select statement
BEGIN TRAN DECLARE @table1Key INT SELECT @table1Key = keyColumn FROM tableName WITH(UPDLOCK) WHERE table1RestrictClause UPDATE table2 SET ... WHERE table2RestrictClause UPDATE table1 SET ... WHERE table1RestrictClause COMMIT TRAN
Update locks can only be obtained by one transaction at a time. The
select statement arranges the lock order so that rows in table1 are locked before rows in table2.
How do you know which order is correct? The criteria you use doesn't
really matter, provided it's consistent throughout the system. There is one
thing you should consider, however. You need to walk the declarative
referential integrity heirarchy up from the bottom. For example, if you have 3 tables T1 (PK1), T2 (PK2, FK1), and T3 (PK3, FK2), then you should put T3 before
T2 and T2 before T1, in other words, the order should be T3, T2, T1. This should be obvious because you have to delete all of the rows in T3 that
are related to a row in T2 before you can delete the row in T2.
This brings up another point. Cascading referential actions increase the complexity of deadlock minimization. If you update a key value, every row that contains that key value must be locked, and that complicates the
order in which locks are obtained for tables involved in a declarative
referential integrity constraint that involves natural keys. The order in which the system obtains the locks for tables affected by a cascading referential action is not clear since there isn't any documentation to support it.
This can increase the possibility of deadlocks because locks might be obtained out of order due to a cascading referential action. This bolsters the argument for using IDENTITY primary key columns because they cannot be changed.
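To illustrate the bottom-up ordering, here is a rough sketch using the same
placeholder tables (T1, T2, T3 and their key columns are only the made-up
names from the example above, and the key value is invented): a transaction
that removes a T2 row and its children would touch T3 before T2, so every
transaction that hits these tables acquires locks in the same order.

    -- Placeholder schema from the example: T3.FK2 references T2.PK2,
    -- and T2.FK1 references T1.PK1.  Locks are acquired bottom-up.
    DECLARE @pk2 INT
    SET @pk2 = 42                        -- the T2 row to remove (made up)

    BEGIN TRAN
        DELETE FROM T3 WHERE FK2 = @pk2  -- child rows first
        DELETE FROM T2 WHERE PK2 = @pk2  -- then the parent row
    COMMIT TRAN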
"Zeng" <Ze******@hotmail.com> wrote in message news:%2****************@TK2MSFTNGP12.phx.gbl... Thanks, by the way, in the earlier post, you mentioned that if the transaction is enclosed in a store proc, you would just re-arrange statements or adding a ssimple select statement to alter the order in which locks are obtained. Would you be able to elaborate that? How do I know which order is the right order, or maybe you mean when we have a deadlock,
then try to figure out the order...? That won't work well that I could see.
In another word, is there 1) a way/tool to figure out the table order of all statements in a transaction should be? 2) a discipline that helps us maintain the correct order from the beginning such as "always try the best to hit db tables in alphabetical order
within a transaction"? 3) In C#, is there a way to retrieve deadlock info such as which tables and operations were involved in the deadlock via Exception etc..? Tracing those is not something I know how to do right now.
Thanks again!
"Brian Selzer" <br***@selzer-software.com> wrote in message news:ub**************@TK2MSFTNGP15.phx.gbl... How to implement set-based operations? They are inherent in every SQL database. Perhaps you misunderstand what I mean by set-based and row-based operations. A set-based INSERT, UPDATE or DELETE statement may affect multple rows, whereas with row-based processing, a separate INSERT, UPDATE or DELETE statement is issued for each affected row.
There are a plethora of articles, books, and courses available on SQL. Since cursors can be used and misused in many different ways (not to mention that there are several different types of cursors), there isn't a
simple all-encompassing pro/con comparison that can be made. There are times (albeit very rare) when a cursor will outperform a similar set-based query--sometimes even by several orders of magnitude. Most of the
time, however, the reverse is true.
These facts are always true:
(1) triggers fire once per statement. If you send 100 INSERT...VALUES statements, all INSERT triggers on the table will fire 100 times,
whereas if you issue the 100 INSERT...VALUES against a temporary table, and then issue a single INSERT...SELECT, then the INSERT triggers will only fire once.
(2) updates to indexes can be optimized. If you send 100
INSERT...VALUES statements, then index maintenance is performed 100 times, which may
mean 99 extra logical writes per index. A single INSERT...SELECT will cause
each index to be updated only once, and if several changes are made on the same index page, then instead of several individual updates to the same
index page, you get a single write to that index page.
(3) transaction logging is minimized. Every statement that executes
has a certain amount of transaction log overhead associated with it, so if
you send 100 INSERT...VALUES statements, then there is 100 times the
overhead than with a single INSERT...SELECT. In addition, since index updates
are optimized to minimize writes to each index, it follows that the number
of writes to the transaction log to record the old and new index entries
is similarly reduced.
"Zeng" <Ze******@hotmail.com> wrote in message news:Oi**************@TK2MSFTNGP12.phx.gbl... > Would you be able to give me a pointer to where I can find more > information > about set-based operations such as how to implement them for
sqlserver > 2000 > and pro/con comparisons with the row based operations? > > Thank you very much for the detailed guidance. > > > "Brian Selzer" <br***@selzer-software.com> wrote in message > news:un**************@TK2MSFTNGP10.phx.gbl... >> If you have more than one connection to the database, then deadlocks can >> happen. Deadlocks can occur on resources other than just locks. Threads >> and memory are some examples. Sometimes parallelism can cause deadlocks. >> These other kinds of deadlocks are rare, and usually occur on
servers >> with >> very heavy loads, but they can occur. Therefore, unless you take no >> pride >> in your work, you should take into account that they can occur in
your > code. >> >> Retry blocks are one possible solution, and can also be used to >> recover > from >> optimistic concurrency collisions. Another would be to return an >> error >> message to the user, and let them resubmit the transaction. >> >> Most deadlocks occur because resources aren't updated in the same order. >> This problem is amplified by frequent scans due to a lack of
indexes. In > my >> experience, the best way to mitigate the effect of deadlocks is to >> execute >> transactions within stored procedures. If the transaction is
enclosed in > a >> stored procedure, then most often it's a simple matter to fix the >> problem. >> Usually rearranging statements or adding a simple select statement
can be >> used to alter the order in which locks are obtained. That's a lot >> more >> difficult to do in client code that needs to be redeployed to tens
or >> hundreds of workstations. There are several other reasons to issue > updates >> within stored procedures. It's easier to issue set-based operations > within >> a procedure--just send the changes in temp tables or (frown) block
the >> changes in a large varchar parameter that can be parsed within the > procedure >> and then issue set-based statements to commit the changes. Using stored >> procedures makes it easier to wait until the last possible instant
to > start >> the transaction. Set-based operations are much faster and more scalable >> than row-based operations because triggers fire only once, updates
to >> indexes can be optimized, and transaction logging is minimized--all
of > which >> can serve to minimize the duration of transactions which will >> consequently >> reduce the probability of deadlocks. If you must use a cursor, then it's >> best to cache the results in temp tables or table variables and
flush >> them >> using set-based operations. Another technique is to use optimistic >> concurrency with rowversioning which is again simpler to accoplish
in >> a >> stored procedure. All of the work required to calculate the results >> to >> be >> committed is done under a READ COMMITTED isolation level with the
max >> rowversion of each row source cached in local variable and the
results >> cached in table variables so that all that is left to do after the >> transaction is started is to lock the source rows with REPEATABLE
READ >> verifying at the same time that the max rowversion hasn't changed,
to > apply >> update locks on all rows to be modified or deleted, and finally to issue > the >> statements that commit the changes. >> >> To summarize: (1) Enclose transaction processing within stored >> procedures. >> (2) Make sure that you obtain locks in the same order in every procedure, >> trigger, function, or process. (3) Wait until the last possible instant > to >> start a transaction. And (4) keep transaction duration as short as > possible >> by using set-based operations, by caching and flushing, and/or by >> using >> optimistic concurrency. >> >> One other thing: redundancy in a database can increase the
probability of >> deadlocks. Make sure your database schema conforms at a minimum to >> Boyce-Codd normal form. And (This will probably cause Celko to get >> out > his >> flame thrower!) use surrogate keys instead of natural keys in DRI >> relationships to eliminate the redundancy inherent in natural
foreign > keys. >> A database that is in 5th normal form and that uses surrogate keys > correctly >> has the additional property that each extrinsic atomic value exist
in >> exactly one place in the database. Redundancy can thus be
completely >> eliminated from the database. >> >> "Zeng" <Ze******@hotmail.com> wrote in message >> news:eG**************@TK2MSFTNGP10.phx.gbl... >> > Thanks for the advice. I understand that indexing and update order >> > of > the >> > tables contribute to deadlocks. My question is is it possible to make >> > a >> > large application deadlock free? If yes, please share the tips
with me >> > how >> > to ensure that; what type of guidelines/disciplines to follow to ensure >> > tables are always updated in correct order when there are > storeprocedures, >> > triggers, and direct queries can hit the db at the same time. >> > >> > If deadlock free is not guaranteed, then it sounds to me that we would >> > need >> > to put retrying blocks into the code - no other way around. >> > >> > Hope to hear back from you, thanks again. >> > >> > "Andrew J. Kelly" <sq************@shadhawk.com> wrote in message >> > news:uT**************@TK2MSFTNGP12.phx.gbl... >> >> Most deadlocks are due to poor indexing and inconsistent updating >> >> in > the >> >> order of the tables. You may want to concentrate more on the database >> >> schema and code vs. the C# code. >> >> >> >> -- >> >> Andrew J. Kelly SQL MVP >> >> >> >> >> >> "Zeng" <Ze******@hotmail.com> wrote in message >> >> news:%2****************@TK2MSFTNGP14.phx.gbl... >> >> > Hi, >> >> > >> >> > The bigger my C# web-application gets, the more places I need
to put > in >> >> > the >> >> > tedious retrying block of code to make sure operations that can run >> >> > into >> >> > database deadlocks are re-run (retried) 3-4 times and give up
if > after >> >> > that >> >> > it's still in deadlock. I'm very sure that many experienced people > out >> >> > there already deal with this issue somehow. Is there an alternative > to >> > it? >> >> > Thanks for your comments and suggestions. >> >> > >> >> > >> >> > >> >> >> >> >> > >> > >> >> > >
The cost of the SELECT is moot, since an UPDATE must first read the rows to
be updated in order to lock them; the additional SELECT therefore incurs
little if any extra overhead, because more often than not the rows that were
SELECTed are still in memory when it comes time to UPDATE them--which
eliminates the extra physical read the UPDATE would otherwise need. Of
course, I'm assuming that SQL Server, like most modern applications that use
caching, uses a least-recently-used algorithm to free memory that is no
longer needed so it can be reused by the next query.
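If you want to see whether the priming SELECT really costs anything on your
own data, one rough way to check (the Orders table and its columns here are
just placeholders, not from this thread) is to run the transaction with and
without the extra SELECT and compare the reads reported per table:

    SET STATISTICS IO ON                 -- report reads in the Messages pane
    DECLARE @id INT

    BEGIN TRAN
        -- priming SELECT takes the update lock and pulls the page into cache
        SELECT @id = OrderId FROM Orders WITH (UPDLOCK) WHERE OrderId = 1
        -- the UPDATE then typically finds the same page already in memory
        UPDATE Orders SET ShipDate = GETDATE() WHERE OrderId = 1
    COMMIT TRAN

    SET STATISTICS IO OFF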
If you use a natural primary key--particularly a composite natural primary
key--then changes to the primary key can occur frequently. Think about the
location of a part on an assembly line: if (PartNumber, Location) is the
natural primary key, then it will change every time the part moves on to
another position on the line. I agree that a primary key should never be
changed; that's why I advocate the use of surrogates.
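To make that concrete, here is a hypothetical sketch of the two designs
(only PartNumber and Location come from the example; the table names and the
ArrivedAt column are invented for illustration):

    -- Natural composite key: the key itself changes whenever the part moves.
    CREATE TABLE PartLocationNatural (
        PartNumber INT NOT NULL,
        Location   INT NOT NULL,
        ArrivedAt  DATETIME NOT NULL,
        CONSTRAINT PK_PartLocationNatural PRIMARY KEY (PartNumber, Location)
    )

    -- Surrogate key: the IDENTITY value never changes; PartNumber and
    -- Location become ordinary updatable columns behind a unique constraint.
    CREATE TABLE PartLocationSurrogate (
        PartLocationId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
        PartNumber     INT NOT NULL,
        Location       INT NOT NULL,
        ArrivedAt      DATETIME NOT NULL,
        CONSTRAINT UQ_PartLocationSurrogate UNIQUE (PartNumber, Location)
    )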
There are several arguments against using UNIQUEIDENTIFIERs as primary keys.
Google this. If you must use them for replication, then add them as an
additional candidate key, but don't use them in declarative referential
integrity constraints.
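A hypothetical sketch of that arrangement (the Customer and CustomerOrder
names are invented; ROWGUIDCOL marks the uniqueidentifier column that merge
replication uses, while the INT IDENTITY surrogate remains the key that
other tables reference):

    CREATE TABLE Customer (
        CustomerId   INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
        CustomerGuid UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL DEFAULT NEWID(),
        Name         VARCHAR(100) NOT NULL,
        -- the uniqueidentifier is only an additional candidate key
        CONSTRAINT UQ_Customer_Guid UNIQUE (CustomerGuid)
    )

    -- Foreign keys elsewhere reference CustomerId, not CustomerGuid:
    CREATE TABLE CustomerOrder (
        OrderId    INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
        CustomerId INT NOT NULL REFERENCES Customer (CustomerId)
    )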
"Zeng" <Ze******@hotmail.com> wrote in message
news:eN**************@TK2MSFTNGP09.phx.gbl... Hi Brian,
Thanks for your explanation; although I'm not sure about paying for select statement time cost everytime to avoid deadlock happening must less frequently...unless you see in general the good locking order would pay off the extra select statement.
Your point about the referential integrity hierarchy made perfect sense to me.
As for the IDENTITY type for primary key, is there a way to declare to sqlserver that the UNIQUEIIDENTIFIER primary key won't ever be changed either? In fact, since primary key should never be changed, I thought db system should allow us to declare that.
"Brian Selzer" <br***@selzer-software.com> wrote in message news:%2****************@TK2MSFTNGP14.phx.gbl... Here's an example of how to use a simple select statement
BEGIN TRAN DECLARE @table1Key INT SELECT @table1Key = keyColumn FROM tableName WITH(UPDLOCK) WHERE table1RestrictClause UPDATE table2 SET ... WHERE table2RestrictClause UPDATE table1 SET ... WHERE table1RestrictClause COMMIT TRAN
Update locks can only be obtained by one transaction at a time. The select statement arranges the lock order so that rows in table1 are locked before rows in table2.
How do you know which order is correct? The criteria you use doesn't really matter, provided it's consistent throughout the system. There is one thing you should consider, however. You need to walk the declarative referential integrity heirarchy up from the bottom. For example, if you have 3 tables T1 (PK1), T2 (PK2, FK1), and T3 (PK3, FK2), then you should put T3 before T2 and T2 before T1, in other words, the order should be T3, T2, T1. This should be obvious because you have to delete all of the rows in T3 that are related to a row in T2 before you can delete the row in T2.
This brings up another point. Cascading referential actions increase the complexity of deadlock minimization. If you update a key value, every row that contains that key value must be locked, and that complicates the order in which locks are obtained for tables involved in a declarative referential integrity constraint that involves natural keys. The order in which the system obtains the locks for tables affected by a cascading referential action is not clear since there isn't any documentation to support it. This can increase the possibility of deadlocks because locks might be obtained out of order due to a cascading referential action. This bolsters the argument for using IDENTITY primary key columns because they cannot be changed.
"Zeng" <Ze******@hotmail.com> wrote in message news:%2****************@TK2MSFTNGP12.phx.gbl... > Thanks, by the way, in the earlier post, you mentioned that if the > transaction is enclosed in a store proc, you would just re-arrange > statements or adding a ssimple select statement to alter the order in > which > locks are obtained. Would you be able to elaborate that? How do I know > which > order is the right order, or maybe you mean when we have a deadlock, then > try to figure out the order...? That won't work well that I could see. In > another word, is there > 1) a way/tool to figure out the table order of all statements in a > transaction should be? > 2) a discipline that helps us maintain the correct order from the > beginning > such as "always try the best to hit db tables in alphabetical order within > a > transaction"? > 3) In C#, is there a way to retrieve deadlock info such as which tables > and > operations were involved in the deadlock via Exception etc..? Tracing > those > is not something I know how to do right now. > > Thanks again! > > "Brian Selzer" <br***@selzer-software.com> wrote in message > news:ub**************@TK2MSFTNGP15.phx.gbl... >> How to implement set-based operations? They are inherent in every SQL >> database. Perhaps you misunderstand what I mean by set-based and > row-based >> operations. A set-based INSERT, UPDATE or DELETE statement may affect >> multple rows, whereas with row-based processing, a separate INSERT, >> UPDATE >> or DELETE statement is issued for each affected row. >> >> There are a plethora of articles, books, and courses available on SQL. >> Since cursors can be used and misused in many different ways (not to > mention >> that there are several different types of cursors), there isn't a simple >> all-encompassing pro/con comparison that can be made. There are times >> (albeit very rare) when a cursor will outperform a similar set-based >> query--sometimes even by several orders of magnitude. Most of the time, >> however, the reverse is true. >> >> These facts are always true: >> >> (1) triggers fire once per statement. If you send 100 INSERT...VALUES >> statements, all INSERT triggers on the table will fire 100 times, whereas > if >> you issue the 100 INSERT...VALUES against a temporary table, and then > issue >> a single INSERT...SELECT, then the INSERT triggers will only fire >> once. >> >> (2) updates to indexes can be optimized. If you send 100 INSERT...VALUES >> statements, then index maintenance is performed 100 times, which may mean > 99 >> extra logical writes per index. A single INSERT...SELECT will cause each >> index to be updated only once, and if several changes are made on the >> same >> index page, then instead of several individual updates to the same index >> page, you get a single write to that index page. >> >> (3) transaction logging is minimized. Every statement that executes has >> a >> certain amount of transaction log overhead associated with it, so if you >> send 100 INSERT...VALUES statements, then there is 100 times the overhead >> than with a single INSERT...SELECT. In addition, since index updates are >> optimized to minimize writes to each index, it follows that the number of >> writes to the transaction log to record the old and new index entries is >> similarly reduced. >> >> >> "Zeng" <Ze******@hotmail.com> wrote in message >> news:Oi**************@TK2MSFTNGP12.phx.gbl... 
>> > Would you be able to give me a pointer to where I can find more >> > information >> > about set-based operations such as how to implement them for sqlserver >> > 2000 >> > and pro/con comparisons with the row based operations? >> > >> > Thank you very much for the detailed guidance. >> > >> > >> > "Brian Selzer" <br***@selzer-software.com> wrote in message >> > news:un**************@TK2MSFTNGP10.phx.gbl... >> >> If you have more than one connection to the database, then >> >> deadlocks > can >> >> happen. Deadlocks can occur on resources other than just locks. > Threads >> >> and memory are some examples. Sometimes parallelism can cause > deadlocks. >> >> These other kinds of deadlocks are rare, and usually occur on servers >> >> with >> >> very heavy loads, but they can occur. Therefore, unless you take >> >> no >> >> pride >> >> in your work, you should take into account that they can occur in your >> > code. >> >> >> >> Retry blocks are one possible solution, and can also be used to >> >> recover >> > from >> >> optimistic concurrency collisions. Another would be to return an >> >> error >> >> message to the user, and let them resubmit the transaction. >> >> >> >> Most deadlocks occur because resources aren't updated in the same > order. >> >> This problem is amplified by frequent scans due to a lack of indexes. > In >> > my >> >> experience, the best way to mitigate the effect of deadlocks is to >> >> execute >> >> transactions within stored procedures. If the transaction is enclosed > in >> > a >> >> stored procedure, then most often it's a simple matter to fix the >> >> problem. >> >> Usually rearranging statements or adding a simple select statement can > be >> >> used to alter the order in which locks are obtained. That's a lot >> >> more >> >> difficult to do in client code that needs to be redeployed to tens or >> >> hundreds of workstations. There are several other reasons to issue >> > updates >> >> within stored procedures. It's easier to issue set-based >> >> operations >> > within >> >> a procedure--just send the changes in temp tables or (frown) block the >> >> changes in a large varchar parameter that can be parsed within the >> > procedure >> >> and then issue set-based statements to commit the changes. Using > stored >> >> procedures makes it easier to wait until the last possible instant to >> > start >> >> the transaction. Set-based operations are much faster and more > scalable >> >> than row-based operations because triggers fire only once, updates to >> >> indexes can be optimized, and transaction logging is minimized--all of >> > which >> >> can serve to minimize the duration of transactions which will >> >> consequently >> >> reduce the probability of deadlocks. If you must use a cursor, >> >> then > it's >> >> best to cache the results in temp tables or table variables and flush >> >> them >> >> using set-based operations. Another technique is to use optimistic >> >> concurrency with rowversioning which is again simpler to accoplish in >> >> a >> >> stored procedure. 
All of the work required to calculate the >> >> results >> >> to >> >> be >> >> committed is done under a READ COMMITTED isolation level with the max >> >> rowversion of each row source cached in local variable and the results >> >> cached in table variables so that all that is left to do after the >> >> transaction is started is to lock the source rows with REPEATABLE READ >> >> verifying at the same time that the max rowversion hasn't changed, to >> > apply >> >> update locks on all rows to be modified or deleted, and finally to > issue >> > the >> >> statements that commit the changes. >> >> >> >> To summarize: (1) Enclose transaction processing within stored >> >> procedures. >> >> (2) Make sure that you obtain locks in the same order in every > procedure, >> >> trigger, function, or process. (3) Wait until the last possible > instant >> > to >> >> start a transaction. And (4) keep transaction duration as short as >> > possible >> >> by using set-based operations, by caching and flushing, and/or by >> >> using >> >> optimistic concurrency. >> >> >> >> One other thing: redundancy in a database can increase the probability > of >> >> deadlocks. Make sure your database schema conforms at a minimum to >> >> Boyce-Codd normal form. And (This will probably cause Celko to get >> >> out >> > his >> >> flame thrower!) use surrogate keys instead of natural keys in DRI >> >> relationships to eliminate the redundancy inherent in natural foreign >> > keys. >> >> A database that is in 5th normal form and that uses surrogate keys >> > correctly >> >> has the additional property that each extrinsic atomic value exist in >> >> exactly one place in the database. Redundancy can thus be completely >> >> eliminated from the database. >> >> >> >> "Zeng" <Ze******@hotmail.com> wrote in message >> >> news:eG**************@TK2MSFTNGP10.phx.gbl... >> >> > Thanks for the advice. I understand that indexing and update >> >> > order >> >> > of >> > the >> >> > tables contribute to deadlocks. My question is is it possible to > make >> >> > a >> >> > large application deadlock free? If yes, please share the tips with > me >> >> > how >> >> > to ensure that; what type of guidelines/disciplines to follow to > ensure >> >> > tables are always updated in correct order when there are >> > storeprocedures, >> >> > triggers, and direct queries can hit the db at the same time. >> >> > >> >> > If deadlock free is not guaranteed, then it sounds to me that we > would >> >> > need >> >> > to put retrying blocks into the code - no other way around. >> >> > >> >> > Hope to hear back from you, thanks again. >> >> > >> >> > "Andrew J. Kelly" <sq************@shadhawk.com> wrote in message >> >> > news:uT**************@TK2MSFTNGP12.phx.gbl... >> >> >> Most deadlocks are due to poor indexing and inconsistent >> >> >> updating >> >> >> in >> > the >> >> >> order of the tables. You may want to concentrate more on the > database >> >> >> schema and code vs. the C# code. >> >> >> >> >> >> -- >> >> >> Andrew J. Kelly SQL MVP >> >> >> >> >> >> >> >> >> "Zeng" <Ze******@hotmail.com> wrote in message >> >> >> news:%2****************@TK2MSFTNGP14.phx.gbl... >> >> >> > Hi, >> >> >> > >> >> >> > The bigger my C# web-application gets, the more places I need to > put >> > in >> >> >> > the >> >> >> > tedious retrying block of code to make sure operations that >> >> >> > can > run >> >> >> > into >> >> >> > database deadlocks are re-run (retried) 3-4 times and give up if >> > after >> >> >> > that >> >> >> > it's still in deadlock. 
I'm very sure that many experienced > people >> > out >> >> >> > there already deal with this issue somehow. Is there an > alternative >> > to >> >> > it? >> >> >> > Thanks for your comments and suggestions. >> >> >> > >> >> >> > >> >> >> > >> >> >> >> >> >> >> >> > >> >> > >> >> >> >> >> > >> > >> >> > >