Bytes IT Community

Very small table incredibly slow

Hello.

I have been administering a SQL Server instance (Enterprise Edition on
Windows 2003) for some months now, and I can't understand what has been
going on in the last week (when the DB grew a lot).

The DB is around 250 GB and has one table with 1 billion rows. Overall
it performs decently, but I can't understand why one particular table
has such severe performance problems.

I have a stored procedure that reads rows from table A, inserts them
after processing into table B, and then moves them into other tables
(similar to a star schema) for reporting.

Table B is, given how the SP is written, never more than 3000 lines.
Table B is very simple, has 3 rows, and no index.

What is very strange is that performance on table B is really slow. If
I do a SELECT COUNT(*) FROM table_b, it takes between 30 seconds and 2
minutes to return that it has 0 lines. When the stored procedure
inserts 1000 lines, it takes 20-30 seconds, and another 20-30 seconds
to delete them.

To me it doesn't look like a locking problem, because it is slow even
when the only procedures that access that table are stopped. I ran
UPDATE STATISTICS on this table WITH FULLSCAN, with no improvement.

The DB is on a Storage Area Network that should perform decently. The
LUN I use is carved from a set of 32 disks that are also used by other
applications. I don't have performance data for the SAN. The machine is
an HP DL580 with 4 CPUs (hyperthreading disabled), 8 GB of RAM, AWE and
PAE enabled, and 5 GB reserved for SQL Server.

I don't know what to do to solve this situation. Could it be a
"corruption problem" that slows this table down so much? Is it possible
that the DB growing a lot in the last week created problems even for
this small and simple table?

Do you have any idea or hint on how to manage this situation, or a
pointer to documentation that can help in analyzing it?

For those of you who made it this far, thank you for your time and
patience in reading my bad English...

Best Regards,
Mamo

PS
I can't rewrite the stored procedure, because it is part of a
closed-source product.

Jul 23 '05 #1
16 Replies


First of all: please don't use "lines"; the purists among us will have
a fit.
Table B is, given how the SP is written, never more than 3000 lines.
Table B is very simple, has 3 rows, and no index.
Table B has no more than 3000 rows at a time and has 3 columns with no
index?
If I do a SELECT COUNT(*) FROM table_b, it takes between 30 seconds and
2 minutes to return that it has 0 lines. When the stored procedure
inserts 1000 lines, it takes 20-30 seconds, and another 20-30 seconds
to delete them.
a) I would check the execution plans when you run these queries/ stored
procedures.
b) Check database configuration.
Does the slowdown only occur when a query involves table B? Or is the
entire database slow?
Is table A, B, & C all in the same database or in separate databases?
Does it have single or multiple data and log files?
What are the database options? Do you have something funky like "auto
close" or "auto shrink" enabled?
Is the database set to autogrow? Any max size restriction?
c) Try running sp_who2 while the query is in progress
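For reference, (b) and (c) boil down to a few commands; this is only a
sketch, assuming SQL Server 2000, and the database name below is a
placeholder:

```sql
-- (b) List the options currently set for the database
EXEC sp_dboption 'your_database'

-- Spot-check a few specific properties
SELECT DATABASEPROPERTYEX('your_database', 'IsAutoClose')  AS auto_close,
       DATABASEPROPERTYEX('your_database', 'IsAutoShrink') AS auto_shrink,
       DATABASEPROPERTYEX('your_database', 'Recovery')     AS recovery_model

-- (c) While the slow statement runs, look at the BlkBy column for
-- blocking, and at CPUTime/DiskIO for SPIDs hogging resources
EXEC sp_who2
```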
Could it be a "corruption problem" that slows this table down so much?
Is it possible that the DB growing a lot in the last week created
problems even for this small and simple table?


Corruption is unlikely. Running out of space -- definitely a
possibility. When the disk arrays are nearing capacity, you'll get
mysterious problems: jobs will fail, queries will take a long time, and
you'll see error messages that don't make any sense. Check the size of
the arrays -- for the database, the log, and tempdb.

Jul 23 '05 #2

louis wrote:
First of all: please don't use "lines"; the purists among us will have
a fit.

Sorry... You are right.

Table B has no more than 3000 rows at a time and has 3 columns with no
index?

Yes!

a) I would check the execution plans when you run these queries/stored
procedures.

I checked them. The slowdown is on the insert/delete in table B.

b) Check database configuration.

What would you check in particular?

Does the slowdown only occur when a query involves table B? Or is the
entire database slow?

The DB is not fast. The application sometimes times out. But it is also
big, with a lot of data. I can't say how much better it could perform
with fine tuning.

Is table A, B, & C all in the same database or in separate databases?

Tables A, B, and C are all in the same DB.

Does it have single or multiple data and log files?

They are in the same filesystem. The filesystem is on the SAN. I could
have some local disks on the machine (SCSI 15K), but I am not sure they
would help (the "SAN expert" tells me that the SAN should be a lot
faster than local disk). What do you think about it?

What are the database options? Do you have something funky like "auto
close" or "auto shrink" enabled?

The DB doesn't have options like auto close or auto shrink. I have auto
create/update statistics, torn page detection, and allow cross-database
ownership chaining (it is required by some stored procedures in the
vendor code). The recovery model is Simple.

Is the database set to autogrow? Any max size restriction?

The DB was created as 150 GB. It has autogrow in steps of 1 GB. Now it
is 250 GB, and it can grow to 350 GB; the DB currently has 50 GB of
free space. The filesystem is 0.5 TB and is now 60% used. Is 40% free
enough for decent performance? What is the maximum that can be used and
still get decent performance?

We had the transaction log grow to 80 GB a month ago, but now we back
up the system every day and the backup agent truncates the transaction
log, so it is now around 0.5 GB (we have 75 GB of transaction log free,
because we didn't shrink it).

c) Try running sp_who2 while the query is in progress

I did it but could not find what is wrong. What could I look for?

From the execution plan it looks like it takes 20s to insert 1000 rows
into a simple table and 20s to delete them from that simple table. I
would like to solve this because it looks really strange to me. I am a
DBA newbie, but to me it looks really strange that writing 5-10 KB of
data to fast storage managed by a DBMS takes so much time. Isn't it
"strange"?

Best Regards,
Massimo

Jul 23 '05 #3

ma****@gmail.com (ma****@gmail.com) writes:
I have a stored procedure that reads rows from table A, inserts them
after processing into table B, and then moves them into other tables
(similar to a star schema) for reporting.

Table B is, given how the SP is written, never more than 3000 lines.
Table B is very simple, has 3 rows, and no index.

What is very strange is that performance on table B is really slow. If
I do a SELECT COUNT(*) FROM table_b, it takes between 30 seconds and 2
minutes to return that it has 0 lines. When the stored procedure
inserts 1000 lines, it takes 20-30 seconds, and another 20-30 seconds
to delete them.


OK, so here goes my ESP:

Table B is frequently inserted into and deleted from. It does not
have a clustered index. This results in huge fragmentation, particularly
if there are leftover rows, which cause entire extents to be left behind.

Run DBCC SHOWCONTIG on the table to see some horrifying numbers.

Then create a clustered index on the table, and keep it. An occasional
DBCC DBREINDEX would be a good thing too.
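A minimal sketch of that sequence; the index name and the clustering
column are placeholders, so pick a column that fits the table's access
pattern:

```sql
-- See how fragmented the heap is before doing anything
DBCC SHOWCONTIG ('table_b')

-- Create a clustered index and keep it
-- ('some_column' is a placeholder, not a known column of table_b)
CREATE CLUSTERED INDEX cl_table_b ON table_b (some_column)

-- Rebuild it occasionally to compact the table again
DBCC DBREINDEX ('table_b')
```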
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #4

>>> They are in the same filesystem. The filesystem is on the SAN. I
could have some local disks on the machine (SCSI 15K), but I am not sure
they would help (the "SAN expert" tells me that the SAN should be a lot
faster than local disk). What do you think about it?

I don't have any experience with SANs, so I don't know. I think
SANs/FibreChannel are "supposed" to have the same speed as local disk.

Is 40% free enough for decent performance? What is the maximum that can
be used and still get decent performance?

I would worry when the DB is full and it has to constantly autogrow (if
possible). 40% is plenty of space.

sp_who2 will identify any blocked processes. It also tells you which
SPIDs are monopolizing CPU and disk IO.

Based on everything described -- I think Erland is right -- and that a
clustered index on table B will fix the problem. You can also execute
sp_spaceused, before and after creating the clustered index. You
should see that the table size will shrink dramatically.
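The before/after comparison could look like the following sketch (the
index name and column are invented for illustration):

```sql
EXEC sp_spaceused 'table_b'   -- note the reserved/data/unused figures

CREATE CLUSTERED INDEX cl_table_b ON table_b (some_column)

EXEC sp_spaceused 'table_b'   -- the figures should now be far smaller
```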

Jul 23 '05 #5

The DBCC returns:

/*-----------------------------
DBCC SHOWCONTIG (table_b)
-----------------------------*/
DBCC SHOWCONTIG scanning 'table_b' table...
Table: 'table_b' (1282103608); index ID: 0, database ID: 5
TABLE level scan performed.
- Pages Scanned................................: 30229
- Extents Scanned..............................: 3841
- Extent Switches..............................: 3840
- Avg. Pages per Extent........................: 7.9
- Scan Density [Best Count:Actual Count].......: 98.39% [3779:3841]
- Extent Scan Fragmentation ...................: 99.51%
- Avg. Bytes Free per Page.....................: 7481.5
- Avg. Page Density (full).....................: 7.57%

Do you think I would benefit from creating a clustered index, given
these statistics?

Best Regards,
Mamo

Jul 23 '05 #6

To Erland Sommarskog. Thank you very much!!!!

You are right!!! I dropped and recreated the table and now the process
is INCREDIBLY faster. I will create the clustered index as you say.
Thank you.
Mamo

Jul 23 '05 #7


ma****@gmail.com (ma****@gmail.com) writes:
The DBCC returns:

/*-----------------------------
DBCC SHOWCONTIG (table_b)
-----------------------------*/
DBCC SHOWCONTIG scanning 'table_b' table...
Table: 'table_b' (1282103608); index ID: 0, database ID: 5
TABLE level scan performed.
- Pages Scanned................................: 30229
- Extents Scanned..............................: 3841
- Extent Switches..............................: 3840
- Avg. Pages per Extent........................: 7.9
- Scan Density [Best Count:Actual Count].......: 98.39% [3779:3841]
- Extent Scan Fragmentation ...................: 99.51%
- Avg. Bytes Free per Page.....................: 7481.5
- Avg. Page Density (full).....................: 7.57%


30000 pages for three rows is quite a lot!

I'm glad to hear that you got things working!
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #9

Erland Sommarskog wrote:
30000 pages for three rows is quite a lot!


How is it possible that SQL server can get into this situation?
It seems totally bizarre! Since pages belong to one table exclusively,
why don't empty pages get released?

I tend to avoid using clustered indexes whenever secondary indices are
required, except in very limited situations where the clustered index
has a very small number of bytes in the key value. Is this unnecessarily
conservative?

Is there any way to get SQL server to select pages for new rows based
on locality on a particular index (other than using a clustered index?),
for when frequent index scans will need to fetch a subset of the rows
on that index? I just ask because when I was starting out with SQL
Server some years ago, that's what I thought clustering did, and that
assumption burnt me a bit.
Jul 23 '05 #10

Clifford Heath (no@spam.please) writes:
How is it possible that SQL server can get into this situation?
It seems totally bizarre! Since pages belong to one table exclusively,
why don't empty pages get released?
First of all, when you have a heap - that is, a table without a
clustered index - SQL Server can only insert rows in one place: at the
end of the table. It cannot scan the table from left to right to see if
by chance there is an empty slot. This would make insertions painfully
slow for tables that are normally not deleted from.

When you delete rows from a heap, SQL Server can deallocate pages.
But since SQL Server allocates pages in extents of eight at a time,
an extent can only be deallocated once all rows on it are gone.

Try this repro:

use master
go
drop database klump
create database klump
go
use klump
go
select klumpid = (a.OrderID - 10000) * 1000 + b.OrderID - 10000,
a.* into klump
from Northwind..Orders a
cross join Northwind..Orders b
go
CREATE UNIQUE INDEX nc_ix1 ON klump(klumpid)
CREATE UNIQUE INDEX nc_ix2 ON klump(klumpid, ShipAddress)

go
dbcc showcontig (klump)
go
delete klump where klumpid % 19 <> 0
go
dbcc showcontig (klump)
go
delete klump where klumpid > (SELECT MIN(klumpid) FROM klump) + 20
go
dbcc showcontig (klump)
go
The first DELETE removes 95% of the rows, but the rows that are kept
are spread all over the table. Therefore, the first two SHOWCONTIGs
give us:

DBCC SHOWCONTIG scanning 'klump' table...
Table: 'klump' (1977058079); index ID: 0, database ID: 29
TABLE level scan performed.
- Pages Scanned................................: 16827
- Extents Scanned..............................: 2105
- Extent Switches..............................: 2104
- Avg. Pages per Extent........................: 8.0
- Scan Density [Best Count:Actual Count].......: 99.95% [2104:2105]
- Extent Scan Fragmentation ...................: 0.14%
- Avg. Bytes Free per Page.....................: 90.0
- Avg. Page Density (full).....................: 98.89%
DBCC execution completed. If DBCC printed error messages, contact your
system administrator.

.................

DBCC SHOWCONTIG scanning 'klump' table...
Table: 'klump' (1977058079); index ID: 0, database ID: 29
TABLE level scan performed.
- Pages Scanned................................: 15672
- Extents Scanned..............................: 2105
- Extent Switches..............................: 2104
- Avg. Pages per Extent........................: 7.4
- Scan Density [Best Count:Actual Count].......: 93.06% [1959:2105]
- Extent Scan Fragmentation ...................: 0.14%
- Avg. Bytes Free per Page.....................: 7590.3
- Avg. Page Density (full).....................: 6.22%
DBCC execution completed. If DBCC printed error messages, contact your
system administrator.

Only a few pages were deallocated. And more importantly: the number of
extents is the same. See how the Avg. Page Density drops drastically.

In the next DELETE we only keep a handful of rows, and they are in a
contiguous range.

DBCC SHOWCONTIG scanning 'klump' table...
Table: 'klump' (1977058079); index ID: 0, database ID: 29
TABLE level scan performed.
- Pages Scanned................................: 15
- Extents Scanned..............................: 11
- Extent Switches..............................: 10
- Avg. Pages per Extent........................: 1.4
- Scan Density [Best Count:Actual Count].......: 18.18% [2:11]
- Extent Scan Fragmentation ...................: 54.55%
- Avg. Bytes Free per Page.....................: 7998.1
- Avg. Page Density (full).....................: 1.18%
DBCC execution completed. If DBCC printed error messages, contact your
system administrator.

There are only 11 extents left.
I tend to avoid using clustered indexes whenever secondary indices are
required, except very limited situations where the clustered index has
a very small number of bytes in the key value. is this unnecessarily
conservative?
Yes. My recommendation is that you should always have a clustered index,
unless you have very good reasons for not having one. The example we had
here is a strong reason for this recommendation.
Is there any way to get SQL server to select pages for new rows based
on locality on a particular index (other than using a clustered index?),
for when frequent index scans will need to fetch a subset of the rows
on that index?


The one way to get inserts to happen elsewhere than at the end is by
using a clustered index.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #11

Erland,

Great response, thanks. I hope it's not too much trouble to pursue
this a little further with me?
First of all, when you have a heap - that is, a table without clustered
index - SQL Server can only insert rows in one place: at the end of
the table. It cannot scan the table from left to right to see if by
chance there is an empty slot.
I appreciate you pointing out that this is how SQL Server actually
works, but it's certainly not the only possible solution. I've
implemented efficient solutions to this class of allocation problem
myself.
Only a few pages were deallocated. And more importantly: the number of
extents is the same. See how the Avg. Page Density drops drastically.
This is appalling behaviour from an otherwise quite acceptable product.
In the next DELETE we only keep a handful of rows, and they are in a
contiguous range. .... There are only 11 extents left.
Then how did the original problem arise, where a table with 30000 pages
has only 3 rows? There can be no more than 3 non-empty extents, so
there should be no more than 24 pages...?
Yes. My recommendation is that you should always have a clustered index,
unless you have very good reasons for not having one. The example we had
here is a strong reason for this recommendation.


I have avoided it because it forces two index searches whenever a
secondary index is used. Is it that this isn't a performance problem
because the internal nodes of the B-tree are cached and have high
fan-out?

I know it's not "pure", but because many of our tables have potentially
large natural primary keys, we use synthetic primary keys extensively
(INT IDENTITY fields). Otherwise all the foreign keys would enlarge the
tables unacceptably, and many of the indices would wind up with key
sizes potentially over 900 bytes, which is inadvisable with S2000 and
impossible with S7.

I try to use PRIMARY KEY constraints on the natural primary keys, except
in one awkward case where the natural primary key would be a complex
value calculated from four separate values, all potentially nullable
(though not all at the same time). This is just an aesthetic thing,
because apart from disallowing nullable fields, a primary key constraint
is exactly like a unique index (true?).

So I'm thinking that all our IDENTITY fields should be marked
"clustered" to avoid the allocation problems you describe...?

Clifford Heath.
Jul 23 '05 #12

In article <11**********************@f14g2000cwb.googlegroups.com>,
ma****@gmail.com says...
To Erland Sommarskog. Thank you very much!!!!

You are right!!! I dropped and recreated the table and now the process
is INCREDIBLY faster. I will create the clustered index as you say.
Thank you.


Mamo - don't forget that you should reindex a table (depending on
updates/adds/deletes) on a scheduled basis. I've seen many applications
time out because of a year's worth of changes to a table that was never
maintained.

You can schedule a "job" that will reindex the table(s) when you
determine it's needed.
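Such a job body could be as simple as the following sketch (assuming
the clustered index suggested earlier in the thread already exists):

```sql
-- Rebuild all indexes on the table and refresh its statistics;
-- suitable as the command step of a scheduled SQL Server Agent job
DBCC DBREINDEX ('table_b')
UPDATE STATISTICS table_b WITH FULLSCAN
```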

--
sp*********@rrohio.com
(Remove 999 to reply to me)
Jul 23 '05 #13

Clifford Heath (no@spam.please) writes:
Only a few pages were deallocated. And more importantly: the number of
extents is the same. See how the Avg. Page Density drops drastically.
This is appalling behaviour from an otherwise quite acceptable product.


I see no reason to call it appalling. Consider what actually happened.
Assume for simplicity that the klumpid column is laid out in sequential
order over the pages. (This may or may not be true, but if it's untrue,
it's likely to be in large clusters of sequential numbers.) Now we delete
all but every 19th row. This means that just about every page still has
a row on it.

Of course, SQL Server could compact on the fly, but doing that could be
an expensive operation. Prohibitively expensive.

In fact, this part of the equation is the same when you have a clustered
index as well. The difference is that with a clustered index, it's
easy to run DBCC DBREINDEX every once in a while. Also, if your
INSERT/DELETE activity has a certain pattern that is reflected in the
clustered index, there are better chances of extents being deallocated.
Then how did the original problem arise, where a table with 30000 pages
has only 3 rows? There can be no more than 3 non-empty extents, so
there should be no more than 24 pages...?
I've been on these newsgroups long enough to understand that information
given in posts asking for help is not always accurate. Maybe he looked
at sysindexes.rows which said 3, but it was out of date. Without
access to the database at hand (or some other example of the same) it's
difficult to tell.
I have avoided it because it forces two index searches whenever a
secondary index is used. Is it that this isn't a performance problem
because the internal nodes of the B-tree are cached and have high
fan-out?
It's true that when you have a clustered index on a table, the clustered
index keys are used as row locators, as opposed to when you have a heap
where the row ID fills this function.

It goes without saying that Microsoft would not have settled for this
solution if they didn't find that it performed well. Yes, there are a
few more reads in the top levels of the clustered index, but this cost
is likely to be small, and those pages are likely to be in cache.

On the other hand, with a heap, there are more costs to maintain
non-clustered indexes. The row ID includes the page number, so if a
page is moved because of a page split, or a non-in-place update, all
the non-clustered indexes need an update.
I know it's not "pure", but because many of our tables have potentially
large natural primary keys, we use synthetic primary keys extensively
(INT IDENTITY fields). Otherwise all the foreign keys would enlarge the
tables unacceptably, and many of the indices would wind up with key
sizes potentially over 900 bytes, which is inadvisable with S2000 and
impossible with S7.
Whether to use artificial keys or not is a matter of judgement. I
recently reworked a pair of tables in our application where the upper
table had an artificial key, since I thought the original four-part
natural key was too bulky to bring to the subtable. But I found
the artificial key much more difficult to work with in this case, which
is why I ditched it.
I try to use PRIMARY KEY constraints on the natural primary keys, except
in one awkward case where the natural primary key would be a complex
value calculated from four separate values, all potentially nullable
(though not all at the same time). This is just an aesthetic thing,
because apart from disallowing nullable fields, a primary key constraint
is exactly like a unique index (true?).

So I'm thinking that all our IDENTITY fields should be marked
"clustered" to avoid the allocation problems you describe...?


It's good to base the clustered index on a small column, as this keeps
down the size of the non-clustered indexes. Also, clustering on IDENTITY
columns is good for avoiding fragmentation in tables that mainly see
inserts, since all inserts go at the end. On the other hand, IDENTITY
columns are rarely of interest for range queries, so there might be
better candidates.
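As an illustration of that pattern, a sketch with invented names: the
IDENTITY column doubles as the clustered primary key, so inserts always
go at the end of the table.

```sql
-- Names are placeholders; the point is PRIMARY KEY CLUSTERED on IDENTITY
CREATE TABLE example_t (
    id      INT IDENTITY(1,1) NOT NULL,
    payload VARCHAR(100)      NOT NULL,
    CONSTRAINT pk_example_t PRIMARY KEY CLUSTERED (id)
)
```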
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #14

Erland Sommarskog wrote:
Then how did the original problem arise, where a table with 30000 pages
has only 3 rows? ... information given in posts asking for help is not always accurate.


Ok, so you also don't think that the situation as described can have
actually occurred. In that case you're right, and "appalling" is the
wrong description. OTOH it wouldn't be hard to keep page-loading stats
for the table, or even a "page number in which space for some rows is
known to be available" for the table, so that it could work out when
it is worth searching and/or compacting.
On the other hand, with a heap, there are more costs to maintain
non-clustered indexes. The row ID inclueds the page number, so if a
page is moved because of a page-split, or a non-in-place update, all
the non-clustered index needs an update.
The page won't be split, of course, because it isn't in a tree. But
if it overflows because the newly updated record doesn't fit, another
commonly-used solution (e.g. HP Allbase products) is to leave a
"forwarding address" in the same slot of the original page (the RID
is just page#/slot#). Whenever the record is updated, a decision can
be made as to whether it now fits back into the original page, so the
forwarding can be cancelled. If it doesn't fit in either the original
or forwarded page, it can be forwarded to a new place, so there's still
only one level of forwarding. This solution often yields better
performance than moving by re-indexing.
Whether to use artificial keys or not is a matter of judgemnet.


I've had exactly the same experiences - sometimes the natural keys are
best.
So I'm thinking that all our IDENTITY fields should be marked
"clustered" to avoid the allocation problems you describe...?


Thanks - you've confirmed my understanding on this.
Jul 23 '05 #15

Clifford Heath (no@spam.please) writes:
The page won't be split, of course, because it isn't in a tree. But
if it overflows because the newly updated record doesn't fit, another
commonly-used solution (e.g. HP Allbase products) is to leave a
"forwarding address" in the same slot of the original page (the RID
is just page#/slot#).


But SQL 2000 does not do this. SQL 2005 does permit a row to overflow to
other pages, but I believe this is mainly used to permit rows to exceed
the fixed page size of 8192 bytes.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #16

>> Then how did the original problem arise, where a table with 30000
>> pages has only 3 rows? There can be no more than 3 non-empty extents,
>> so there should be no more than 24 pages...?

I've been on these newsgroups long enough to understand that information
given in posts asking for help is not always accurate. Maybe he looked
at sysindexes.rows, which said 3, but it was out of date. Without
access to the database at hand (or some other example of the same) it's
difficult to tell.


To give the group that information I did a SELECT COUNT on that table,
and it returned 0 (the SELECT COUNT took between 30 seconds and 2
minutes to run before the drop/recreation of the table).
BTW, before posting to the group I ran
"UPDATE STATISTICS table_b WITH FULLSCAN" in the hope it could help.
Best Regards,
Mamo

Jul 23 '05 #17
