
Stored procedure deadlock on self with subquery

Here's a really weird one for any SQL Server gurus out there...

We have observed (SQL Server 2000) scenarios where a stored procedure which

(a) begins a transaction
(b) inserts some rows into a table
(c) re-queries another table using a subquery which references the
inserted table (correlated or not)

will deadlock itself at point (c), even when it is the only task on
the server.

[I use the term 'deadlock' because I can't think of anything else
appropriate, even though, as stated, this is the ONLY task executing,
so this isn't a classical deadlock situation]

A typical deadlocking scenario would be (assume insert_table is the
table into which some rows are being inserted)

begin transaction

insert insert_table(col....) select (col....) from some_other_table

/* this following query will deadlock and never complete */

select some_other_table.col
from some_other_table
where not exists (select *
from insert_table
where some_other_table.col = insert_table.col )
Whereas if the offending second query in the sequence is rewritten as
a join

e.g.

select some_other_table.col
from some_other_table
left join insert_table
on some_other_table.col = insert_table.col
where insert_table.col is null

the query will not deadlock.

If the subquery is an aggregate function, a deadlock will also not
occur.
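
For example, a variant along these lines (a sketch only, using the same
placeholder tables as above) would complete normally:

select some_other_table.col
from some_other_table
where (select count(*)
       from insert_table
       where insert_table.col = some_other_table.col) = 0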

If the transaction is committed prior to executing the blocking query,
then it will not block (hardly surprising; if it did, there'd be major
problems with SQL Server!).

Note that this is a canonical case of a much more complex SP, and that
simplified test cases often will not deadlock; you need a significant
amount of data, typically 30,000 rows or more to see the problem. The
blocking query is, in real life, used to drive a subsequent table
insert operation, but this is not relevant to the problem.

We conclude that there is some problem, possibly involving catalogue
contention, if a temporary table must be created in a subquery within
a transaction in a stored procedure, and if the subquery involves
references to a table for which locks have been acquired.

Note that the lock timeout will never trigger and a deadlock victim is
never chosen, presumably because the deadlock occurs entirely within
the same SPID.

Locking hints and transaction isolation level settings do not affect
the result. Note also that the exact same queries, executed as a TSQL
batch, do not deadlock; you must place them in a stored procedure.
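
For instance, neither of the following variants (a sketch; same
placeholder tables as above) changed the behaviour:

set transaction isolation level read uncommitted

select some_other_table.col
from some_other_table
where not exists (select *
                  from insert_table with (nolock)
                  where some_other_table.col = insert_table.col)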

Recovery mode for the database is SIMPLE, and the problem is portable
across databases; it can also be exhibited with MSDE/2000, and is
independent of whether the database server is the local machine or a
remote one.

Has anyone else experienced this problem and/or know of a workaround,
other than those mentioned here? It does look awfully like a bug with
SQL Server, since a single task should never be able to deadlock
itself, surely.
Jul 23 '05 #1
7 Replies


Andrew Mayo (aj****@my-deja.com) writes:
A typical deadlocking scenario would be (assume insert_table is the
table into which some rows are being inserted)

begin transaction

insert insert_table(col....) select (col....) from some_other_table

/* this following query will deadlock and never complete */

select some_other_table.col
from some_other_table
where not exists (select *
from insert_table
where some_other_table.col = insert_table.col )

...
If the transaction is committed prior to executing the blocking query,
then it will not block (hardly surprising; if it did, there'd be major
problems with SQL Server!).
...
We conclude that there is some problem, possibly involving catalogue
contention, if a temporary table must be created in a subquery within
a transaction in a stored procedure, and if the subquery involves
references to a table for which locks have been acquired.


Without a reproducible scenario, it is very difficult to comment. Of
course, a real mean person can get the above to deadlock itself: in the
SELECT call a UDF which calls xp_cmdshell that tries to retrieve a
row from insert_table. This will cause a big fat deadlock, that SQL
Server of course cannot detect. I would not really expect this to be
the problem, though.

What I would recommend as a first shot is to use my routine aba_lockinfo,
http://www.sommarskog.se/sqlutil/aba_lockinfo.html. This routine gives
you a good overview of what's going on in the server. In particular,
you can see if there is some blocking, and if so, what you are blocked
by. You can also keep an eye on the cpu, io and memory columns to
determine whether the process is working.
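
If you cannot install it, the built-in tools give a cruder picture; a
quick sketch to run from a second connection while the procedure
appears hung:

-- BlkBy column shows who is blocked by whom
exec sp_who2

-- all locks currently granted or waited for
exec sp_lock

-- if cpu and physical_io keep climbing, the spid is working,
-- not stuck
select spid, blocked, waittype, lastwaittype, cpu, physical_io
from master..sysprocesses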

Another place to look is the SQL Server error log. In particular, I
would be interested to see whether there is any mention of the UMS
scheduler.

Unless any of what I have said leads you to a revelation, I would
recommend that you open a case with Microsoft, as the actual case
appears to be too complex to pursue over the newsgroups.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #2

Erland Sommarskog <es****@sommarskog.se> wrote in message news:<Xn*********************@127.0.0.1>...
Without a reproducible scenario, it is very difficult to comment. Of
course, a real mean person can get the above to deadlock itself: in the
SELECT call a UDF which calls xp_cmdshell that tries to retrieve a
row from insert_table. This will cause a big fat deadlock, that SQL
Server of course cannot detect. I would not really expect this to be
the problem, though.


Thanks very much; that's a useful hint.

We found, after further exploration, that rephrasing the subqueries as
self joins did NOT resolve the problem, although a simple subquery
expressed as a single left outer join DID cause the SP to stop
blocking. However, more complex self-joins caused the problem to
reappear.

We concluded that the problem is likely triggered by the query plan
requiring a temp table to be dynamically created in order to either
satisfy the complex join condition or process the subquery. We have
now worked around the problem by taking the code out of a transaction
entirely and building into a temp table. We then serialise the SP at top
and bottom by means of an exclusive lock on the permanent table we are
rebuilding. This ensures that any caller of the SP will not regain
control until the work table it rebuilds is either verified up-to-date
or has been rebuilt. The downside is that in rare circumstances, two
simultaneous callers will both rebuild temp tables, and then one of
these tasks will discover that the permanent work table was rebuilt
and discard their work. This will occur very rarely and is an
acceptable tradeoff.
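
In outline, the serialisation looks like this (a sketch only;
perm_table and col are placeholders for our real objects):

begin transaction

/* exclusive table lock, held to end of transaction; any other
   caller of the SP queues here */
select top 0 col from perm_table with (tablockx, holdlock)

/* ... verify perm_table is up to date, or rebuild it from the
   temp table that was built outside any transaction ... */

commit transaction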

So, the current hypothesis for this bug is that, within an SP

(a) Begin a transaction
(b) Perform an insert operation on empty table t which populates a
reasonable rowcount - say, 40,000 rows (note that in our tests, table
t had a PK but no other indexes or constraints, e.g. foreign keys -
nor are identity columns involved)
(c) Within the same transaction, perform a second insert operation on
table t which involves a subquery or self join involving table t (plus
at least one other table), sufficiently complex to require the query
optimiser to dynamically create a temp table as part of the execution
plan.

We believe that the SP will 'lock up' at step (c) and that this is
highly likely to be a catalogue serialisation issue, rather than
something like page lock or table lock collision.
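
In code, the shape we believe triggers it is roughly this (a sketch
only; t, source_table and the columns are hypothetical, and the real
second statement is considerably more complex):

create procedure repro_candidate as
begin
    begin transaction

    /* (b) populate the empty table - around 40,000 rows */
    insert t (pk_col, data_col)
    select pk_col, data_col
    from source_table

    /* (c) second insert referencing t via a subquery; in the real
       SP this is complex enough to need a worktable in the plan,
       and it is the statement that hangs */
    insert t (pk_col, data_col)
    select s.pk_col, s.data_col
    from source_table s
    where not exists (select * from t
                      where t.pk_col = s.pk_col)

    commit transaction
end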

If I get time, I will attempt to produce a canonical test case.
Jul 23 '05 #3

Andrew Mayo (aj****@my-deja.com) writes:
(a) Begin a transaction
(b) Perform an insert operation on empty table t which populates a
reasonable rowcount - say, 40,000 rows (note that in our tests, table
t had a PK but no other indexes or constraints, e.g foreign keys etc -
nor are identity columns involved)
(c) Within the same transaction, perform a second insert operation on
table t which involves a subquery or self join involving table t (plus
at least one other table), sufficiently complex to require the query
analyser to dynamically create a temp table as part of the execution
plan.


All I can say is that this sounds highly unlikely.

As for the loopback possibility, it occurred to me that there are a
few more possibilities where this could creep in. You mentioned that the
queries were complex, and if they refer to views or functions the loopback
may be hidden. Here are a couple of ways that you could have a loopback:

* A function that calls xp_cmdshell to do some dynamic SQL. (Very poor
programming practice!)
* A function or a view that uses OPENQUERY to receive the result set
from a stored procedure. (Not very good programming practice either.)
* Your query accesses views on a remote server, which in their turn
refer back to tables on your own server.

I will have to admit that it would be very interesting to see the output
from a procedure like aba_lockinfo when you have this situation.

I would say that my main hypothesis is that the query simply takes a lot
of time. But then again, I have not been at the site.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #4

Erland Sommarskog <es****@sommarskog.se> wrote in message news:<Xn*********************@127.0.0.1>...
All I can say is that this sounds highly unlikely.

As for the loopback possibility, it occurred to me that there are a
few more possibilities where this could creep in. You mentioned that the
queries were complex, and if they refer to views or functions the loopback
may be hidden. Here are a couple of ways that you could have a loopback:

* A function that calls xp_cmdshell to do some dynamic SQL. (Very poor
programming practice!)
* A function or a view that uses OPENQUERY to receive the result set
from a stored procedure. (Not very good programming practice either.)
* Your query accesses views on a remote server, which in their turn
refer back to tables on your own server.


Well, although the data and tables are more complex than my simplified
example, they are not significantly more complex, and they do not do
any of the above. Also, if the queries are done out of a transaction
they complete in about 10 seconds and if done in the transaction, they
never complete.

However, I'm trying to create a canonical test case, so far
unsuccessfully, to pin down what the heck is going on, and will repost
if and when I can do this.
Jul 23 '05 #5

I am probably misunderstanding/oversimplifying your scenario, however it
seems to me that

1) You do these two queries in a transaction
2) The insert table is empty, and you load all rows with the first
insert - this will put write locks on all pages of the table
(making it unable to be read until the transaction doing the insert is
closed)
3) You then try to join to the table which is not possible because it is
locked by the previous insert
4) This appears to be confirmed by the fact that running the query in a 2nd
transaction works fine
5) If this is correct - then testing the transaction allowing dirty reads
(READ UNCOMMITTED) should allow it to work

If this is correct then you have a case of blocking locks, which is
normal/correct behaviour (though not obvious in this case)
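
A quick way to test (5) - a sketch, using the table names from the
original post:

set transaction isolation level read uncommitted

begin transaction

insert insert_table (col)
select col from some_other_table

/* with dirty reads allowed, this should no longer wait on the
   insert's locks - if lock blocking really is the cause */
select some_other_table.col
from some_other_table
where not exists (select *
                  from insert_table
                  where some_other_table.col = insert_table.col)

commit transaction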

Julian 8^)

"Andrew Mayo" <aj****@my-deja.com> wrote in message
news:2b**************************@posting.google.com...
Here's a really weird one for any SQL Server gurus out there...
[snip - original post quoted in full above]

Jul 23 '05 #6

"julian_muir" <ju*********@yahho.com> wrote in message news:<42***********************@ptn-nntp-reader01.plus.net>...
I am probably misunderstanding/oversimplifying your scenario, however it
seems to me that [snip]
5) If this is correct - then testing the transaction allowing dirty reads
(READ UNCOMMITTED) should allow it to work

If this is correct then you have a case of blocking locks, which is
normal/correct behaviour (though not obvious in this case)

I did consider this possibility. However, setting isolation-level
hints on either or both SQL statements did not change things.

I still strongly suspect catalogue contention, I have to say.
Certainly, the first insert statement will acquire a bunch of write
locks. Possibly these will even be escalated to a table-level lock. In
fact, you can explicitly specify such a lock hint without changing the
behaviour of the SP (i.e. it will still hang up).

Now the important point that maybe I didn't stress significantly
enough here is that this behaviour ONLY occurs in a STORED PROCEDURE.

If it were related to deadlocking issues, there is no reason why it
would be different if you executed it as part of a single TSQL batch,
within a transaction, since the exact same locks would be acquired.

Also, we should bear in mind that the locks you are acquiring are YOUR
OWN locks. Because of this, you should never be blocked - and, in
fact, you can do a transactional insert, then, without committing, do
a SELECT on the table you just inserted into quite successfully and
without blocking, as you might expect.
(in fact, within your own transaction the select will see your own
uncommitted changes, regardless of isolation level).

After all, if it didn't work that way, you'd begin a transaction and
then get blocked on your own locks - so, for example, two singleton
updates on the same row within a single transaction would fail because
the first update would acquire a write lock and by this theory, the
second update should be blocked by it.

Of course, this does not happen because, as I said, they are YOUR
locks and YOU are not blocked by them, only other processes. You can
do what you like; it's YOUR transaction, and in this test scenario
there are no other database users to complicate matters.
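
To illustrate, with a hypothetical table t (a sketch):

begin transaction

insert t (pk_col) values (1)

/* neither statement below waits; the exclusive locks from the
   insert belong to this same transaction */
update t set pk_col = 2 where pk_col = 1
select * from t   /* sees our own uncommitted changes */

commit transaction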

Further testing currently confirms my working hypothesis; namely, that
a whole bunch of things need to happen

(a) The base table into which inserts are made needs to be truncated
prior to executing the query
(b) I *think* the query optimiser must then make a decision to
recompile the SP (or at least rebuild the execution plan) based on a
statistics change (need to check this area)
(c) The subquery or self-join in the second statement (that blocks)
must be sufficiently complex to require a temporary table to be
created in order to execute the query plan. A really simple self-join,
for instance, makes the problem go away, but a more complex join will
promptly bring it back.
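
One way to probe (b) is to force recompilation and see whether the
hang tracks it - a sketch, where the_sp is a placeholder for the real
procedure name:

/* mark all cached plans that reference the table for recompilation */
exec sp_recompile 'insert_table'

/* or force a fresh plan on a single execution of the SP */
exec the_sp with recompile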

As soon as I can produce a simple reproducible example I will post it.
Jul 23 '05 #7

Andrew Mayo (aj****@my-deja.com) writes:
I still strongly suspect catalogue contention, I have to say.
And I still think this sounds unlikely.
Now the important point that maybe I didn't stress significantly
enough here is that this behaviour ONLY occurs in a STORED PROCEDURE.

If it were related to deadlocking issues, there is no reason why it
would be different if you executed it as part of a single TSQL batch,
within a transaction, since the exact same locks would be acquired.
For various reasons you can get different query plans when you use
a stored procedure and when you have a loose batch. The two chief
reasons for this:

1) Parameter sniffing. (If your SP has parameters, the values of these
can affect the query plan.)
2) Different settings. (Recall that some settings are saved with the
stored procedure.)
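
As an aside, the classic way to take parameter sniffing out of the
picture is to copy the parameter into a local variable - a sketch with
hypothetical names:

create procedure get_rows @col int as
begin
    declare @local_col int
    /* the optimizer cannot sniff a local variable, so the plan is
       built from general statistics instead of this call's value */
    set @local_col = @col

    select col
    from some_other_table
    where col = @local_col
end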
As soon as I can produce a simple reproducible example I will post it.


Once you have it, I would be very interested in seeing it. However, I
fear that it might be a time-consuming task to construct a repro. And, as
I said before, it would be interesting to see lock and process information
from this situation.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #8
