insert statement blocked

nano2k

Hi

Shortly, I keep invoices in a table.
Occasionally, someone will fire the execution of a stored procedure
(SP) that performs several UPDATEs against (potentially) all invoices
OLDER than a date that is supplied to the SP as a parameter.
The SP is usually a lengthy process (it takes at least 30 mins).
The problem is that SQL server 2000 Dev Edition doesn't allow me to
insert new invoices that are "younger", while the SP is executing.
How should I specify to SQL Server that new invoices are "harmless"?

Thanks.

Sep 25 '06 #1

Subscribe Post Reply

3475

nano2k

Sep 25 '06 #2

Hugo Kornelis

On 25 Sep 2006 03:04:37 -0700, nano2k wrote:

>Hi

Shortly, I keep invoices in a table.
Occasionally, someone will fire the execution of a stored procedure
(SP) that performs several UPDATEs against (potentially) all invoices
OLDER than a date that is supplied to the SP as a parameter.
The SP is usually a lengthy process (it takes at least 30 mins).
The problem is that SQL server 2000 Dev Edition doesn't allow me to
insert new invoices that are "younger", while the SP is executing.
How should I specify to SQL Server that new invoices are "harmless"?

Thanks.

Hi nano2k,

To answer that question, we need to know:

- The design of your tables. Please post in the form of CREATE TABLE
statements. Don't forget to include all constraints and properties and
all indexes for the tables - they are probably a key factor in this
situation.

- The code executed in the stored procedure.

- The code executed to insert new rows.

--
Hugo Kornelis, SQL Server MVP

Sep 25 '06 #3

nano2k

Hi Hugo

Thank you for your quick reply.

It's hard for me to provide all data that you need (and I fully
understand the needing) because all (many) objects involved (SP,
tables, indexes, triggers) are big (as number of lines and also as
complexity).

I reformulate my problem to make it more simple to understand and let's
forget about the complexity of the objects enumerated above.

I have an app that allows me to insert invoices into my invoice table.
Classic.
Today is sept 23rd 2006. I need to run my SP for all invoices older
than current date (including invoices for 22nd of sept 2006). The SP
***may*** modify (that is, UPDATE) ***all*** invoices until 22nd
(including). The SP runs 2h. This means that I have to wait 2h before
the system allows me to insert new invoices for 23rd of sept 2006, even
if new invoices do not affect the result of my SP. This is not accepted
as the SP may be executed 2-3 times a day.
I also have tens of thousands of records in my invoice table.

My questions:
1. What indexes should I set for my invoice table? Currently I have an
uniqueidentifier as primary key. I suppose I should set another
non-unique index for the [invoicedate] column.
2. What types of transactions should I use inside the SP, so that
inserting new invoices will be accepted by db engine?

Thanks.

Sep 26 '06 #4

Dan Guzman

Without the scripts Hugo requested, we really can't provide specific
recommendations. I suggest you start by identifying the longest running
query in your invoicing process and post the relevant DDL for that query.
To identify problem queries, run a Profiler trace with SQL statement
completed and stored procedure statement completed events and include a
filter on high duration (e.g. 1000). Examine the execution plans of those
queries and add indexes or tweak the SQL for efficiency.

Some general observations and guidelines:

I need to run my SP for all invoices older
than current date (including invoices for 22nd of sept 2006).

I assume there is other criteria besides date for these older invoices. If
you have some indicator like InvoiceProcessed, and index on that column
(perhaps with InvoiceDate too) might be a good candidate.

The SP runs 2h.

This seems like an extraordinarily long time for even a very involved daily
process. I would expect that a standard invoicing process would take no
more than a minute for thousands of invoices. I suspect sub-optimal
execution plans (e.g. iterative scans) and/or poor query techniques
(cursors). Suboptimal trigger processing (scans) are notorious for causing
concurrency problems. I don't know the purpose of the triggers but you
might instead consider performing the process in stored procedures if
possible.

1. What indexes should I set for my invoice table? Currently I have an
uniqueidentifier as primary key. I suppose I should set another
non-unique index for the [invoicedate] column.

The best candidates for indexes are columns in JOIN and WHERE clause
predicates. I already mentioned that InvoiceDate and the theoretical
InvoiceProcessed column. In fact, this may be a good candidate for the
clustered index.

In lieu of scripts, you might try running the workload through the Index
Tuning Wizard or Database Tuning Advisor for automated suggestions.

2. What types of transactions should I use inside the SP, so that
inserting new invoices will be accepted by db engine?

The purpose of transactions are to provide data integrity per ACID rules.
It's best to specify the lowest level that satisfies this requirement. READ
COMMITED is most often appropriate. However, keep in mind that performance
and concurrency often go hand-in-hand. Ideally, the daily process should
have minimal effect on inserting new invoices and visa-versa as long as
index are useful.

--
Hope this helps.

Dan Guzman
SQL Server MVP

"nano2k" <ad*@ikonsoft.rowrote in message
news:11**********************@k70g2000cwa.googlegr oups.com...

Hi Hugo

Thank you for your quick reply.

It's hard for me to provide all data that you need (and I fully
understand the needing) because all (many) objects involved (SP,
tables, indexes, triggers) are big (as number of lines and also as
complexity).

I reformulate my problem to make it more simple to understand and let's
forget about the complexity of the objects enumerated above.

I have an app that allows me to insert invoices into my invoice table.
Classic.
Today is sept 23rd 2006. I need to run my SP for all invoices older
than current date (including invoices for 22nd of sept 2006). The SP
***may*** modify (that is, UPDATE) ***all*** invoices until 22nd
(including). The SP runs 2h. This means that I have to wait 2h before
the system allows me to insert new invoices for 23rd of sept 2006, even
if new invoices do not affect the result of my SP. This is not accepted
as the SP may be executed 2-3 times a day.
I also have tens of thousands of records in my invoice table.

My questions:
1. What indexes should I set for my invoice table? Currently I have an
uniqueidentifier as primary key. I suppose I should set another
non-unique index for the [invoicedate] column.
2. What types of transactions should I use inside the SP, so that
inserting new invoices will be accepted by db engine?

Thanks.

Sep 26 '06 #5

Hugo Kornelis

On 26 Sep 2006 03:18:17 -0700, nano2k wrote:

>Hi Hugo

Thank you for your quick reply.

It's hard for me to provide all data that you need (and I fully
understand the needing) because all (many) objects involved (SP,
tables, indexes, triggers) are big (as number of lines and also as
complexity).

Hi nano2k,

I concur with Dan that it's hard to help you without the data we need.

In general terms: you need to ensure that "old" invoices and "new"
invoices can be completely seperated. Any read or search operation on
either the "old" or "new" subset must be able to use indexes in which
potentially locked rows from the other subset can be bypassed, and any
insert, update or delete in one of the subsets should only incur such
locks that the other subset is not affected at all.

The SP
***may*** modify (that is, UPDATE) ***all*** invoices until 22nd
(including). The SP runs 2h.

This is your main problem. I am 99.9% sure that this SP can be optimized
to perform much faster. I'm not talking about saving 10 or 20% on
execution time, but aboout bringing execution time back to a couple of
minutes, at most.

>My questions:
1. What indexes should I set for my invoice table? Currently I have an
uniqueidentifier as primary key. I suppose I should set another
non-unique index for the [invoicedate] column.

That's not a good design, for several reasons.

First, if this uniqueidentifier is your ONLY key, then you have no
protection at all against duplicate rows. Imagine that the data entry
typists accidentally puts a paperweight on the <Enterkey - the program
will just keep on generating uniqueidentifier values and inserting the
same row over and over again.
Using a surrogate key (the official term for key values generated by the
database) can, in some cases, be a good idea, but only IN ADDITION TO
THE BUSINESS KEY. The business key is an attribute (or combination of
attributes) that uniquely identifies an instance of the object/entity
type within your Universe of Discourse.

Second, if you have to use a surrogate key, then uniqueidentifier is in
most cases the worst possible choice. In almost all cases, IDENTITY is
the preferred way to generate surrogate key values.

Since uniqueidentifiers are not monotonically increasing but generated
in a pseudo-random pattern, and since the primary key by default results
in a clustered index, insertions will often result in page splits. That
is in itself already pretty bad for INSERT performance, but in cases
when many rows in the database may be blocked (as in your scenario),
it's an open invitation to major blocking problems. Consider what
happens if an in INSERT needs to store a row in a page that happpens to
be full - the page has to be split and half the rows in it have to be
moved. But what if they are locked by another transaction? And what it
that transaction happens to be running for 2 hoours?

I don't think that this is the only cause for your locking problems, but
it's definitely one of the causes!

>2. What types of transactions should I use inside the SP, so that
inserting new invoices will be accepted by db engine?

Use the transaction isolation level that you need to maintain the level
of integrity that your application requires.
Never use a lower level: if you are willing to sacrifice correctness for
speed, just ditch the database and program your reports to produce
random data - lots faster and cheaper!
But don't set the transaction isolation level higher than you need
either, becuase (as Dan already pointed out) higher isolation means
lower concurrency.

--
Hugo Kornelis, SQL Server MVP

Sep 26 '06 #6

nano2k

Hugo, Dan
Thanks for your patience with this topic.

I understand and agree to all your advices.

My struggle, at least for the moment, is to make the application as
much responsive as possible while strongly looking for data security.
Let's say that for the moment we don't care how much the SP needs to
process - we care only to let other users work while the SP is running.
The SP will be run only a few times a month, but there are 2-3 fixed
days of the month when the SP needs to perform - it's all about
accounting :) I only want to give other users the ability to work while
the SP runs.

I need uniqueidentifiers because my app is a three-tier client-server
app (client/server/db) that needs to generate unique IDs from client
side. The client app may generate several records from different tables
that will be sent all together in a single request to the server. The
records need to be already coupled (that is, foreign keys must be
already known before sending the request to the database) before
sending them to the server to process. Of course, there is an issue
here regarding indexing this column, as Dan mentioned. Anyway, I have
set a task to reindex the database every night, so I think this will
reduce the payback to using uniqueidentifiers - do you agree? Is there
any other action I should schedule?

Dan, my uniqueid column (named [objectid]) is the only key, and at
least at the moment, this column is set as primary key (ok? not ok?).
My protection to duplicate entries is handled by the client
application. 99% of the business logic is concentrated in client app.
The server side only minimally processes the request and deals with the
database (select/update/insert/delete). I fully understand that I have
to logically split my tables using other indexes.

Dan, you got right in the heart of my problem when you said: "In
general terms: you need to ensure that "old" invoices and "new"
invoices can be completely seperated". This is my goal at the moment.
Your inputs along with Hugo's inputs helped me to start my research on
the right path.

But today, I have discovered that the SP heavily uses CURSORS :((
This is the other big issue I have to deal with. Check out this piece
of code (one of 3 CURSORS defined in SP):

DECLARE inputs CURSOR FOR
SELECT i.accountid, il.doctypeid,
il.objectid,
(CASE WHEN ilb.objectid IS NOT NULL THEN ilb.objectid ELSE il.objectid
END),
il.itemid, ilb.batchid, d.updatestock * d.cogssign * (CASE WHEN
ilb.objectid IS NOT NULL THEN ilb.pdocquantity ELSE il.pdocquantity
END),
d.updatestock * d.cogssign * il.price * il.rate * (1 - il.discount /
100) * (1 - i.discount / 100),
il.currencyid, il.rate,
il.warehouseid, dbo.f_date_notime(il.stockdate)
FROM
inventory i
JOIN inventoryline il ON (il.inventoryid = i.objectid)
JOIN item it ON (it.objectid = il.itemid)
JOIN doctype d ON (il.doctypeid = d.objectid)
JOIN status st ON (st.objectid = i.statusid)
JOIN warehouse w ON (w.objectid = il.warehouseid)
LEFT JOIN inventorylinebatch ilb ON (ilb.inventorylineid = il.objectid)
WHERE
d.cogssign = 1
AND (st.final = 1 OR st.cancel = 1)
AND d.doctypekey NOT IN ('25','26')
AND il.stockdate >= dbo.f_date_notime(@last_date) + 1
AND it.stockcontrol = 1
AND ISNULL(w.nocost,0) = 0
ORDER BY il.itemid, il.stockdate, d.doctypekey, i.docnumber,
il.create_date, ilb.create_date

OPEN inputs

....follows fetching results in a loop

I want to change this in the following manner: to create an temporary
table and to insert the results of the above SELECT statement into this
temp table. After that, using a cursor, I intend to fetch the records
from the temp table. This way, the tables involved in the SELECT stm
above will be locked as short as possible. Do you think this is a good
approach?

Note: In the SELECT statement above: ALL tables involved have a primay
key defined as uniqueidentifier and named objectid. Also, ALL foreign
keys (FK) are indexed like this:

CREATE INDEX [FK_inventoryline_inventoryid] ON
[dbo].[inventoryline]([inventoryid]) ON [PRIMARY]
GO
Thanks again.

Sep 26 '06 #7

Dan Guzman

Dan, my uniqueid column (named [objectid]) is the only key, and at

least at the moment, this column is set as primary key (ok? not ok?).
My protection to duplicate entries is handled by the client
application. 99% of the business logic is concentrated in client app.
The server side only minimally processes the request and deals with the
database (select/update/insert/delete). I fully understand that I have
to logically split my tables using other indexes.

You probably intended to direct this question at Hugo rather than me but my
$.02 is that you should also have a unique constraint on you natural key.
Even if integrity is enforced in the application, the database must still be
queried based on the natural key value. The unique constraint creates a
unique index that will improve performance of that query and guarantee data
integrity as well.

ORDER BY il.itemid, il.stockdate, d.doctypekey, i.docnumber,
il.create_date, ilb.create_date

I'm not sure why the ORDER BY here. Does your current processing logic
require that rows be processed in a particular sequence?

I want to change this in the following manner: to create an temporary
table and to insert the results of the above SELECT statement into this
temp table. After that, using a cursor, I intend to fetch the records
from the temp table. This way, the tables involved in the SELECT stm
above will be locked as short as possible. Do you think this is a good
approach?

The temp table will probably help mitigate blocking but you ought to avoid
cursors entirely, if possible. A set-based process usually performs much
better, especially if cursors are nested. If you must use a cursor, specify
the LOCAL FAST_FORWARD READ_ONLY when possible.

--
Hope this helps.

Dan Guzman
SQL Server MVP

"nano2k" <ad*@ikonsoft.rowrote in message
news:11*********************@e3g2000cwe.googlegrou ps.com...

Hugo, Dan
Thanks for your patience with this topic.

I understand and agree to all your advices.

My struggle, at least for the moment, is to make the application as
much responsive as possible while strongly looking for data security.
Let's say that for the moment we don't care how much the SP needs to
process - we care only to let other users work while the SP is running.
The SP will be run only a few times a month, but there are 2-3 fixed
days of the month when the SP needs to perform - it's all about
accounting :) I only want to give other users the ability to work while
the SP runs.

I need uniqueidentifiers because my app is a three-tier client-server
app (client/server/db) that needs to generate unique IDs from client
side. The client app may generate several records from different tables
that will be sent all together in a single request to the server. The
records need to be already coupled (that is, foreign keys must be
already known before sending the request to the database) before
sending them to the server to process. Of course, there is an issue
here regarding indexing this column, as Dan mentioned. Anyway, I have
set a task to reindex the database every night, so I think this will
reduce the payback to using uniqueidentifiers - do you agree? Is there
any other action I should schedule?

Dan, my uniqueid column (named [objectid]) is the only key, and at
least at the moment, this column is set as primary key (ok? not ok?).
My protection to duplicate entries is handled by the client
application. 99% of the business logic is concentrated in client app.
The server side only minimally processes the request and deals with the
database (select/update/insert/delete). I fully understand that I have
to logically split my tables using other indexes.

Dan, you got right in the heart of my problem when you said: "In
general terms: you need to ensure that "old" invoices and "new"
invoices can be completely seperated". This is my goal at the moment.
Your inputs along with Hugo's inputs helped me to start my research on
the right path.

But today, I have discovered that the SP heavily uses CURSORS :((
This is the other big issue I have to deal with. Check out this piece
of code (one of 3 CURSORS defined in SP):

DECLARE inputs CURSOR FOR
SELECT i.accountid, il.doctypeid,
il.objectid,
(CASE WHEN ilb.objectid IS NOT NULL THEN ilb.objectid ELSE il.objectid
END),
il.itemid, ilb.batchid, d.updatestock * d.cogssign * (CASE WHEN
ilb.objectid IS NOT NULL THEN ilb.pdocquantity ELSE il.pdocquantity
END),
d.updatestock * d.cogssign * il.price * il.rate * (1 - il.discount /
100) * (1 - i.discount / 100),
il.currencyid, il.rate,
il.warehouseid, dbo.f_date_notime(il.stockdate)
FROM
inventory i
JOIN inventoryline il ON (il.inventoryid = i.objectid)
JOIN item it ON (it.objectid = il.itemid)
JOIN doctype d ON (il.doctypeid = d.objectid)
JOIN status st ON (st.objectid = i.statusid)
JOIN warehouse w ON (w.objectid = il.warehouseid)
LEFT JOIN inventorylinebatch ilb ON (ilb.inventorylineid = il.objectid)
WHERE
d.cogssign = 1
AND (st.final = 1 OR st.cancel = 1)
AND d.doctypekey NOT IN ('25','26')
AND il.stockdate >= dbo.f_date_notime(@last_date) + 1
AND it.stockcontrol = 1
AND ISNULL(w.nocost,0) = 0
ORDER BY il.itemid, il.stockdate, d.doctypekey, i.docnumber,
il.create_date, ilb.create_date

OPEN inputs

...follows fetching results in a loop

I want to change this in the following manner: to create an temporary
table and to insert the results of the above SELECT statement into this
temp table. After that, using a cursor, I intend to fetch the records
from the temp table. This way, the tables involved in the SELECT stm
above will be locked as short as possible. Do you think this is a good
approach?

Note: In the SELECT statement above: ALL tables involved have a primay
key defined as uniqueidentifier and named objectid. Also, ALL foreign
keys (FK) are indexed like this:

CREATE INDEX [FK_inventoryline_inventoryid] ON
[dbo].[inventoryline]([inventoryid]) ON [PRIMARY]
GO
Thanks again.

Sep 27 '06 #8

Hugo Kornelis

On 26 Sep 2006 14:38:25 -0700, nano2k wrote:

>Hugo, Dan
Thanks for your patience with this topic.

I understand and agree to all your advices.

My struggle, at least for the moment, is to make the application as
much responsive as possible while strongly looking for data security.
Let's say that for the moment we don't care how much the SP needs to
process - we care only to let other users work while the SP is running.
The SP will be run only a few times a month, but there are 2-3 fixed
days of the month when the SP needs to perform - it's all about
accounting :) I only want to give other users the ability to work while
the SP runs.

Hi nano2k,

There are several ways to achieve that:

- Run the SP at a time when no inserts are done.
- If the SP only does reporting - run the SP on a copy of the database,
restored from a recent backup. Since your SP does updates, you can't use
this option.
- Make the SP run as fast as possible. With only tens of thousands of
rows in the database, you should be able to get performance in terms of
minutes, maybe even less. Definitely not hours.
- Make sure that all the right indexes are there to enable the processes
to run in parallel without blocking each other. This can only be done
with access to the complete code - i.e. all tables, constraints, and
indexes, the stored proc, and the code used for inserts. You've already
said that the code is big (long and complex), but not how big. My
definition of long and complex might differ from yours. That being said,
it is definitely possible that your code is to long and complex for help
via usenet groups - in that case, you either have to do it yoursself or
(if you lack the skills) hire a consultant to do it for you.

>I need uniqueidentifiers because my app is a three-tier client-server
app (client/server/db) that needs to generate unique IDs from client
side. The client app may generate several records from different tables
that will be sent all together in a single request to the server. The
records need to be already coupled (that is, foreign keys must be
already known before sending the request to the database) before
sending them to the server to process.

You're freely mingling real end-user requirement and implementation
choices here. uniqueidentifiers are never an end-user requirement. They
can be the best solution. I'm not sure if they are here. The only real
requirements I read here are insertion of new data from several clients,
and sending multiple related rows on a single access to keep network
traffic low.

It's not a given that you need surrogate keys for this. They can be
handy in some cases, but in other cases, using only the business key is
preferable. And even if you do need surrogate keys, then you can still
use identity values (with one of several standard techniques to prevent
duplicates when identity values get generated at different locations, if
you're in a replicated scenario [which I don't think is the case]).

Anyway, I have
set a task to reindex the database every night, so I think this will
reduce the payback to using uniqueidentifiers - do you agree? Is there
any other action I should schedule?

If you choose a low fill factor, the number of page splits will go down.
This may give some relief. OTOH, it will also spread the data over more
pages, increasing the amount of I/O needed for any request.

>Dan, my uniqueid column (named [objectid]) is the only key, and at
least at the moment, this column is set as primary key (ok? not ok?).

Primary key or not is irrelevant for now - having only a generated value
as key is wrong. Google for some of my prevous musings on surrogate key
vs business key, and for similar (and, to be honest, also a few
contradicting) opinions by others.

That being said, the index created to check the primary key defaults to
being clustered. You don't want to cluster on uniqueidentifier. Change
this index to nonclustered, then find a more sensible column or
combination of columns for the clustered key. Even if it's a nonunique
index.

>My protection to duplicate entries is handled by the client
application. 99% of the business logic is concentrated in client app.

Bad. Constraints should always be enforced in the DB level. You can ALSO
enforce them in the client, but never forget the DB - that way, if an
attacker compromissed the security of your client, (s)he still is unable
to bypass your basic integrity checks.

>But today, I have discovered that the SP heavily uses CURSORS :((

I'm not surprised. That is about the only way to get 2 hour performance
on database with less than 100K rows.

>This is the other big issue I have to deal with. Check out this piece
of code (one of 3 CURSORS defined in SP):

(snip)

>I want to change this in the following manner: to create an temporary
table and to insert the results of the above SELECT statement into this
temp table. After that, using a cursor, I intend to fetch the records
from the temp table. This way, the tables involved in the SELECT stm
above will be locked as short as possible. Do you think this is a good
approach?

No. You're standing behind a Formula 1 racing wagon and pushing it, and
you're proposing to apply some grease to the axes to go faster. You
should get in, start the engine and hit the pedal.

There are several ways to improve cursor performance (such as changing
to FAST_FORWARD READ_ONLY, using the temp table you propose, or other
techniques). Some of them will make things worse. Others might give you
some performance gain. But that's just peanuts compared to what you can
probably gain by rewriting the complete stored procedure to operate in a
set-based manner.

>Note: In the SELECT statement above: ALL tables involved have a primay
key defined as uniqueidentifier and named objectid.

I would also encourage you to revise your naming standards. I'll have to
point you to google again, since my post is already quite long, and it's
time for me to get some sleep.

--
Hugo Kornelis, SQL Server MVP

Sep 28 '06 #9

Similar topics