UPDATE STATISTICS necessary to improve performance (?)

Dear SQL Server experts:

First off, I am no SQL Server expert :)

A few months ago I put a database into a production environment.
Recently, it was brought to my attention that a particular query that
executed quite quickly in our dev environment was painfully slow in
production. I analyzed the plan on the production server (it
looked good), and then tried quite a few tips that I'd gleaned from
reading newsgroups. Nothing worked. Then on a whim I performed an
UPDATE STATISTICS on a few of the tables that were being queried. The
query immediately went from executing in 61 seconds to under 1 second.
I checked to make sure that statistics were being "auto updated" and
they were.

Why did I need to run UPDATE STATISTICS? Will I need to again?

A little more background info:
The database started empty, and has grown quite rapidly in the last
few months. One particular table grows at a rate of about 300,000
records per month. I get fast query times due to a few well placed
indexes.

A quick question:
If I add an index, do statistics get automatically updated for this
new index immediately?

Thanks in advance for any help,

Felix
Jul 20 '05 #1
17 Replies


In article <80**************************@posting.google.com >,
fe*************************@yahoo.com says...
> A few months ago I put a database into a production environment.
> Recently, it was brought to my attention that a particular query that
> executed quite quickly in our dev environment was painfully slow in
> production. I analyzed the plan on the production server (it
> looked good), and then tried quite a few tips that I'd gleaned from
> reading newsgroups. Nothing worked. Then on a whim I performed an
> UPDATE STATISTICS on a few of the tables that were being queried. The
> query immediately went from executing in 61 seconds to under 1 second.
> I checked to make sure that statistics were being "auto updated" and
> they were.


I've seen this type of thing many times: developers are not exposed to
the production system long enough to build a maintenance plan, and the
DBA doesn't get enough time to monitor the DB to build a maintenance
plan specific to the database.

In most cases, I create generic maintenance plans that will auto select
20% of the tables per night and reindex them; the same goes for marking
stored procedures for recompile.

If you do something like this, or if you just reindex/recompile them all
on a weekend you should be able to maintain your performance.

Just so you know, this is a problem in Oracle too. A team of
developers built an inventory system that worked great for almost a year, but
no one really noticed it getting slower every day until the reports
started failing. They tried for three days to fix it before calling me.
My first clue was that there was no scheduled maintenance plan, the second
was that it had worked for almost a year, and the third was that there were
about 20K inserts in a single table per day with no deletes. Reindexing two
tables returned it to the same level of performance as when it was developed.

Also, when you mark a sproc for recompile, the first time it executes it
will be SLOW. While tables are reindexing they will also be slow or
could even be locked, so you want to do this in the off-hours.

--
--
sp*********@rrohio.com
(Remove 999 to reply to me)
Jul 20 '05 #2

Felix (fe*************************@yahoo.com) writes:
> A few months ago I put a database into a production environment.
> Recently, it was brought to my attention that a particular query that
> executed quite quickly in our dev environment was painfully slow in
> production. I analyzed the plan on the production server (it
> looked good), and then tried quite a few tips that I'd gleaned from
> reading newsgroups. Nothing worked. Then on a whim I performed an
> UPDATE STATISTICS on a few of the tables that were being queried. The
> query immediately went from executing in 61 seconds to under 1 second.
> I checked to make sure that statistics were being "auto updated" and
> they were.
>
> Why did I need to run UPDATE STATISTICS? Will I need to again?

If you only performed UPDATE STATISTICS, it seems a little funny if
autostatistics is on. If you added WITH FULLSCAN, then there
is a possible explanation.

Whether you will need to do it again, I cannot tell, but as Leythos
discusses, the most critical part is when you start empty and volume
grows. Once you are over a certain level a lot of execution plans
become moot.

If the query was parameterized - including auto-parameterized or in
a stored procedure - there is another possible explanation: that the
cached plan was for an atypical parameter. Recall that when SQL Server
builds a query plan for a procedure, it uses the current value of
the parameters to build the plan. If the value used happens to be an
atypical one, then a plan that is bad for common values may linger
in the cache.

If this is the case, an sp_recompile on a table involved in the query
will do.
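
The cached-plan effect described above can be pictured with a toy model. This is only a sketch in Python, not how SQL Server works internally: the 1% cutoff, the PlanCache class, the query text and the row counts are all made up for illustration.

```python
# Toy model only: a real optimizer weighs far more than one selectivity cutoff.
SEEK_CUTOFF = 0.01  # hypothetical: prefer a seek when under 1% of rows match


def choose_plan(matching_rows: int, table_rows: int) -> str:
    """Pick a plan from the row count expected for one parameter value."""
    if matching_rows / table_rows < SEEK_CUTOFF:
        return "index seek"
    return "table scan"


class PlanCache:
    """Hypothetical cache that keeps one plan per query text."""

    def __init__(self) -> None:
        self._plans = {}

    def plan_for(self, query: str, matching_rows: int, table_rows: int) -> str:
        # The plan is built once, from whichever parameter value arrives first,
        # and is then reused for every later value.
        if query not in self._plans:
            self._plans[query] = choose_plan(matching_rows, table_rows)
        return self._plans[query]

    def recompile(self, query: str) -> None:
        # Stand-in for discarding the cached plan so the next call rebuilds it.
        self._plans.pop(query, None)


TABLE_ROWS = 1_000_000                          # made-up table size
QUERY = "SELECT ... WHERE customer_id = @id"    # made-up query text

cache = PlanCache()
# First call: an atypical customer owning 40% of the rows, so a scan is cached.
first = cache.plan_for(QUERY, 400_000, TABLE_ROWS)
# A typical customer with 50 rows now inherits the cached scan.
reused = cache.plan_for(QUERY, 50, TABLE_ROWS)
# After the plan is discarded, the typical customer gets a plan of its own.
cache.recompile(QUERY)
rebuilt = cache.plan_for(QUERY, 50, TABLE_ROWS)

print(first, reused, rebuilt, sep=" | ")
```

The point of the sketch is only the ordering: the typical value gets the unsuitable plan until something forces a rebuild.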

> A quick question:
> If I add an index, do statistics get automatically updated for this
> new index immediately?


Yes, as I recall, SQL Server creates statistics when it builds an
index.
--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #3

The query was not parameterized, nor in a stored procedure.

And I did not run UPDATE STATISTICS WITH FULLSCAN.

What do you mean by "Once you are over a certain level a lot of
execution plans becomes moot."?

Anyway, I remain puzzled.

Thanks,
Felix
Jul 20 '05 #4

Felix (fe*************************@yahoo.com) writes:
> The query was not parameterized, nor in a stored procedure.

SQL Server can do autoparameterization, although I believe this
mainly happens for simple queries.

> What do you mean by "Once you are over a certain level a lot of
> execution plans becomes moot."?

Say that you have a query which involves two tables. One table is a
Products table, and one is an Orders table. When you start the system,
you have a lot of products, but you have no orders. But after running
the system for six months, you have ten times more orders than products,
and this ratio is only going to increase. Thus, all plans that are built
on the assumption that the Orders table is smaller are no longer of
interest.

--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #5

Erland Sommarskog <so****@algonet.se> wrote in message news:<Xn**********************@127.0.0.1>...
> Felix (fe*************************@yahoo.com) writes:
>> The query was not parameterized, nor in a stored procedure.
>
> SQL Server can do autoparameterization, although I believe this
> mainly happens for simple queries.
>
>> What do you mean by "Once you are over a certain level a lot of
>> execution plans becomes moot."?
>
> Say that you have a query which involves two tables. One table is a
> Products table, and one is an Orders table. When you start the system,
> you have a lot of products, but you have no orders. But after running
> the system for six months, you have ten times more orders than products,
> and this ratio is only going to increase. Thus, all plans that are built
> on the assumption that the Orders table is smaller are no longer of
> interest.


Ah, thanks. That makes a lot of sense. But since the stats are being
auto updated (in the default case and in my case), this should not be
a problem, right?
Jul 20 '05 #6

Felix (fe*************************@yahoo.com) writes:
> Ah, thanks. That makes a lot of sense. But since the stats are being
> auto updated (in the default case and in my case), this should not be
> a problem, right?


Right. For this typical case that I outlined, SQL Server handles the case,
and you are not likely to see any problems.

But then there might be more sensitive cases. Say that you have a complex
query with many tables involved. A minor error in the statistics of one of
the innermost tables in the query can lead to gross errors in the
estimates. And, of course, there are cases where even with completely
accurate statistics the optimizer will go astray.
--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #7

In article <80*************************@posting.google.com> ,
felix666007_n**************@yahoo.com says...
> Ah, thanks. That makes a lot of sense. But since the stats are being
> auto updated (in the default case and in my case), this should not be
> a problem, right?


A good example of AUTO for things is the standard MS drive defragmentation:
if you leave it to the OS your drive will be fragmented, and the same goes
for Diskeeper; there will still be some fragmentation. If you take it
offline you can fully defragment the drive.

The same thought holds true for stats - I've seen hundreds of servers
that benefited from having tables manually reindexed, stored procs
recompiled, etc.

While the automation works well, it leaves a lot to be desired. You can
write your own scripts to automate the reindexing, recompiling, etc.,
and then schedule them on a nightly basis (do about
20% of the objects each evening if possible).

In one system, where they insert about 500,000 new records a day, the
system was overly slow; a nightly reindex on the common insert tables
brought performance back to the level of the development days.
--
--
sp*********@rrohio.com
(Remove 999 to reply to me)
Jul 20 '05 #8

Erland Sommarskog (so****@algonet.se) writes:
> Felix (fe*************************@yahoo.com) writes:
>> Ah, thanks. That makes a lot of sense. But since the stats are being
>> auto updated (in the default case and in my case), this should not be
>> a problem, right?
>
> Right. For this typical case that I outlined, SQL Server handles the case,
> and you are not likely to see any problems.
>
> But then there might be more sensitive cases. Say that you have a complex
> query with many tables involved. A minor error in the statistics of one of
> the innermost tables in the query can lead to gross errors in the
> estimates. And, of course, there are cases where even with completely
> accurate statistics the optimizer will go astray.


And, oh, just because the statistics are updated does not mean that the
query plan is. I had a case recently where this happened. It's not really
a usual situation, because I'm running a long job that converts data
from a competitor's system to our system. I'm running eight years' worth
of transactions in one long stored procedure which runs for days. The
job handles each business day there has been over the years,
one at a time. Anyway, there was one procedure in the job which for a
long time ran with nothing to do, as there was no data to convert, but
from a certain day on there is data every day. Over time the procedure went
from a few seconds to half a minute. With an sp_recompile on the table that
this procedure loads, the procedure quickly fell back to a few seconds.
--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #9

Leythos (vo**@nowhere.com) writes:
> The same thought holds true for stats - I've seen hundreds of servers
> that benefited from having tables manually reindexed, stored procs
> recompiled, etc.


But reindexing is another thing. There is no autoreindex, so of course
running reindex manually may have some effect.
--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #10

In article <Xn**********************@127.0.0.1>, so****@algonet.se
says...
> Leythos (vo**@nowhere.com) writes:
>> The same thought holds true for stats - I've seen hundreds of servers
>> that benefited from having tables manually reindexed, stored procs
>> recompiled, etc.
>
> But reindexing is another thing. There is no autoreindex, so of course
> running reindex manually may have some effect.


I can assure you that it won't just have SOME effect; it will have a dramatic
effect if the table has a high insert ratio, or has been used for a long
time (months) without a reindex being done. Also, after the reindex you
need to mark the applicable stored procs for recompile.

If you've managed groups of 3-million-record tables with 900GB of data
in a single database, you will have seen this many times.

Small databases (under 50GB) don't see this as much as larger ones, but
all of them benefit from proper table/index maintenance.

--
--
sp*********@rrohio.com
(Remove 999 to reply to me)
Jul 20 '05 #11

On 25 Dec 2003 01:34:03 -0800, Felix wrote:
> A few months ago I put a database into a production environment.
> Recently, it was brought to my attention that a particular query that
> executed quite quickly in our dev environment was painfully slow in
> production. I analyzed the plan on the production server (it
> looked good), and then tried quite a few tips that I'd gleaned from
> reading newsgroups. Nothing worked. Then on a whim I performed an
> UPDATE STATISTICS on a few of the tables that were being queried. The
> query immediately went from executing in 61 seconds to under 1 second.
> I checked to make sure that statistics were being "auto updated" and
> they were.
>
> Why did I need to run UPDATE STATISTICS? Will I need to again?

Autostatistics does a sample scan, as does UPDATE STATISTICS if you
don't specify FULLSCAN (which I see you didn't). Perhaps the sample
scans are skewed differently.

In fact, can anyone confirm how autostats works? Does it utilise
existing statistics to incrementally calculate new statistics based on
inserted / updated rows, or does it sample scan?

If the former, then perhaps the problem you are seeing is caused by
changes in the distribution of values over time, e.g. new rows are more
similar to each other than older rows were to each other.

It seems that the sample scan done by UPDATE STATISTICS has given you
good results, so it might be prudent to schedule a batch job to update
these periodically - e.g. nightly or weekly.

If new rows are skewing the statistics dramatically, then you might even
benefit from disabling autostats for *selected tables*, but make sure
that you run regular updates (and I'd go for the FULLSCAN option).
Disabling autostats is generally not recommended, so don't go for this
option lightly!

The trick is to determine which tables the autostats are not performing
well for, so that you can keep manual updates to a minimum. Profiler can
tell you when statistics are missing (set SHOWPLAN_ALL on and show
warnings in profile, you will see NO STATS messages), but out of date
statistics are harder.

You need to check the query plans for your slow running queries, and
look at the estimates for returned rows, then check that with actual
returned rows if you break the query up in the same way that the query
plan has. Major differences (e.g. 5,000 vs 100,000) should indicate
where the stats are out wildly.

I picked this up from the book "SQL Server Query Performance Tuning
Distilled" by Sajal Dam. Good book for covering many of the aspects of
performance in SQL Server.
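
The estimate-versus-actual check described above can be sketched as a small comparison. This is a minimal Python illustration, not a tool from the book or from SQL Server: the PlanStep record, the step names and the 10x cutoff are assumptions, and the row counts would have to be read off the plan and measured by hand.

```python
from dataclasses import dataclass


@dataclass
class PlanStep:
    name: str            # hypothetical label for one step of the query plan
    estimated_rows: int  # the optimizer's estimate, read off the plan
    actual_rows: int     # rows counted by running that part of the query alone


def misestimation_factor(step: PlanStep) -> float:
    """How many times larger one count is than the other (always >= 1)."""
    # Clamp to 1 so an empty result or a zero estimate cannot divide by zero.
    est = max(step.estimated_rows, 1)
    act = max(step.actual_rows, 1)
    return max(est, act) / min(est, act)


def suspect_steps(steps, cutoff=10.0):
    """Names of steps whose estimate is off by at least `cutoff` times."""
    return [s.name for s in steps if misestimation_factor(s) >= cutoff]


steps = [
    PlanStep("seek on Orders index", 5_000, 100_000),  # the 5,000 vs 100,000 case
    PlanStep("scan of Products", 1_200, 1_150),        # close enough to ignore
]
print(suspect_steps(steps))
```

Tables feeding the flagged steps are the first candidates for a manual UPDATE STATISTICS.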

> A quick question:
> If I add an index, do statistics get automatically updated for this
> new index immediately?


Indexes store and update statistics automatically. As with non-index
statistics, they can get out of date (i.e. unrepresentative), so they
may need updating also. You can do this either by UPDATE STATISTICS or
by rebuilding the index. I guess which one is the best pick depends on
how fragmented your index is - not much, and it might be cheaper to just
update the stats than rebuild the entire index.

cheers,
Ross.
--
Ross McKay, WebAware Pty Ltd
"Words can only hurt if you try to read them. Don't play their game" - Zoolander
Jul 20 '05 #12

Bas

"Leythos" <vo**@nowhere.com> wrote in message
news:MP************************@news-server.columbus.rr.com...
> In article <Xn**********************@127.0.0.1>, so****@algonet.se
> says...
>> Leythos (vo**@nowhere.com) writes:
>>> This same thought holds true for stats - I've seen hundreds of servers
>>> that benefited from having tables manually reindexed, stored procs
>>> recompiled, etc....
>>
>> But reindexing is another thing. There is no autoreindex, so of course
>> running reindex manually may have some effect.
>
> I can assure you that it won't have SOME effect, it will have a dramatic
> effect if the table has a high insert ratio, or has been used for a long
> time (months) without a reindex being done. Also, after the reindex you
> need to mark the applicable stored procs for recompile.


I regularly do a reindex for every table and then run DBCC FREEPROCCACHE on
the database. After following this thread I'm wondering if this is the best
approach. Would it be wise to schedule this (or more), either with my app or
with SQL Server? What interval would be good: each day, week, or
month?
I understand of course that this depends on how the DB is used and how big it is.

Bas
Jul 20 '05 #13

In article <3f*********************@dreader9.news.xs4all.nl >, "Bas"
<nomailplease> says...
> "Leythos" <vo**@nowhere.com> wrote in message
> news:MP************************@news-server.columbus.rr.com...
>> In article <Xn**********************@127.0.0.1>, so****@algonet.se
>> says...
>>> Leythos (vo**@nowhere.com) writes:
>>>> This same thought holds true for stats - I've seen hundreds of servers
>>>> that benefited from having tables manually reindexed, stored procs
>>>> recompiled, etc....
>>>
>>> But reindexing is another thing. There is no autoreindex, so of course
>>> running reindex manually may have some effect.
>>
>> I can assure you that it won't have SOME effect, it will have a dramatic
>> effect if the table has a high insert ratio, or has been used for a long
>> time (months) without a reindex being done. Also, after the reindex you
>> need to mark the applicable stored procs for recompile.
>
> I regularly do a reindex for every table and then DBCC freeproccache the
> database. After following this thread I'm wondering if this is the best
> approach? Would it be wise to schedule this or more with either my app or
> schedule it with SQL server? What interval would be good, each day, week,
> month?


I have one client that has more than 3 million registration identities -
this tracks financial information. They used my sprocs to perform
maintenance on the tables on a weekly basis. Their system consists of a
large import daily and a large export daily - import being lots of
inserts and updates. If you look at the tables and determine which ones
have significant data additions/deletions/changes each day, you can tweak
your maintenance plan to cover just those tables on a frequent basis and the
less used ones on another schedule.

DBCC freeproccache doesn't really help, except when creating
plans/indexes by hand. I only use it when working with reports and
optimizing indexes.
--
--
sp*********@rrohio.com
(Remove 999 to reply to me)
Jul 20 '05 #14

Hi Ross,

There is a KB article that explains how autostats works.

http://support.microsoft.com/default...b;en-us;195565
INF: How SQL Server 7.0 and SQL Server 2000 Autostats Work

Yih-Yoon Lee
Jul 20 '05 #15

I asked:
> In fact, can anyone confirm how autostats works? Does it utilise
> existing statistics to incrementally calculate new statistics based on
> inserted / updated rows, or does it sample scan?

(Yih-Yoon Lee) wrote:
> There is a KB article that explains how autostats works.
>
> http://support.microsoft.com/default...b;en-us;195565
> INF: How SQL Server 7.0 and SQL Server 2000 Autostats Work


Thanks Yih-Yoon. From this article one can infer that autostats does a
sample scan, since it can affect performance on heavily loaded systems
(and I presume that using existing stats to incrementally calculate new
stats would use bugger all resources).

So I guess the question is, why did Felix's autostats perform a
different scan to his manual scan?
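
One possible factor, not confirmed anywhere in this thread, is simply how rarely autostats fires on a large table. On one reading of the KB article cited above, SQL Server 7.0/2000 treats statistics on a table of more than 500 rows as stale only after about 500 + 20% of the rows have been modified. The rough Python arithmetic below takes that rule as an assumption; the table sizes are made up, and only the 300,000 rows per month figure comes from Felix's post.

```python
MONTHLY_GROWTH = 300_000  # rows added per month, from Felix's post


def autostats_threshold(rows_at_last_update: int) -> int:
    """Modifications needed before statistics count as stale (assumed rule)."""
    if rows_at_last_update <= 500:
        return 500
    # 500 plus 20% of the rows present at the last statistics update.
    return 500 + rows_at_last_update * 20 // 100


def months_until_refresh(rows_at_last_update: int,
                         monthly_growth: int = MONTHLY_GROWTH) -> float:
    """Months of steady inserts needed to reach the threshold."""
    return autostats_threshold(rows_at_last_update) / monthly_growth


# Hypothetical table sizes at the time of the last statistics update.
for rows in (300_000, 1_200_000, 3_000_000):
    print(f"{rows:>9,} rows: threshold {autostats_threshold(rows):>7,} changes, "
          f"about {months_until_refresh(rows):.1f} months of inserts")
```

If the rule holds, the gap between automatic updates grows with the table, so the newest rows can stay unrepresented in the statistics for weeks.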

cheers,
Ross.
--
Ross McKay, WebAware Pty Ltd
"The lawn could stand another mowing; funny, I don't even care"
- Elvis Costello
Jul 20 '05 #16

Bas

"Leythos" <vo**@nowhere.com> wrote in message
news:MP************************@news-server.columbus.rr.com...
> I have one client that has more than 3 million registration identities -
> this tracks financial information. They used my sprocs to perform
> maintenance on the tables on a weekly basis. Their system consists of a
> large import daily and a large export daily - import being lots of
> inserts and updates. If you look at the tables, determine which ones
> have significant data additions/deletions/changes each day you can tweak
> your maintenance plan to just those tables on a frequent basis and the
> less used ones on another schedule.
>
> DBCC freeproccache doesn't really help, except when creating
> plans/indexes by hand. I only use it when working with reports and
> optimizing indexes.


Hi Leythos,

Could you shed a little light on what your sprocs do then? Reindexes?

Thanx,

Bas

Jul 20 '05 #17

In article <3f**********************@dreader10.news.xs4all.nl >, "Bas"
<nomailplease> says...
> "Leythos" <vo**@nowhere.com> wrote in message
> news:MP************************@news-server.columbus.rr.com...
>> I have one client that has more than 3 million registration identities -
>> this tracks financial information. They used my sprocs to perform
>> maintenance on the tables on a weekly basis. Their system consists of a
>> large import daily and a large export daily - import being lots of
>> inserts and updates. If you look at the tables, determine which ones
>> have significant data additions/deletions/changes each day you can tweak
>> your maintenance plan to just those tables on a frequent basis and the
>> less used ones on another schedule.
>>
>> DBCC freeproccache doesn't really help, except when creating
>> plans/indexes by hand. I only use it when working with reports and
>> optimizing indexes.
>
> Hi Leythos,
>
> Could you shed a little light on what your sprocs do then? Reindexes?


They scan all the tables, gather the table names, reindex the tables
(only 20% of them per night), mark the stored procs for recompile, and
run nightly. Sometimes, when I can't run them nightly, I do it on Sunday.
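
How the nightly 20% is chosen is not spelled out here, so the following is only a guess at one way to do it: sort the table names and work through them in fixed-size slices, wrapping around once every table has had a turn. It is a Python sketch of the selection step alone; the function name, the table names and the round-robin rule are assumptions, and the reindex and recompile steps are left out.

```python
def tables_for_night(table_names, night_index, percent=20):
    """Return the slice of tables to reindex on a given night (round-robin)."""
    ordered = sorted(table_names)
    if not ordered:
        return []
    # Integer ceiling of len * percent / 100, and never fewer than one table.
    batch_size = max(1, (len(ordered) * percent + 99) // 100)
    # Number of nights needed to get through every table once.
    batch_count = (len(ordered) + batch_size - 1) // batch_size
    start = (night_index % batch_count) * batch_size
    return ordered[start:start + batch_size]


# Hypothetical table names, for illustration only.
tables = ["Customers", "Invoices", "Orders", "Payments", "Products",
          "Returns", "Shipments", "Stock", "Suppliers", "Users"]

for night in range(6):
    print(night, tables_for_night(tables, night))
```

With ten tables and 20% per night, every table is covered in five nights and night five starts the cycle again.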

--
--
sp*********@rrohio.com
(Remove 999 to reply to me)
Jul 20 '05 #18
