Bytes | Developer Community

Actual and Estimated Rows

I have a stored procedure that executes with less than 1,000 reads with one
set of parameters, then with a different set of parameters the procedure
executes with close to 500,000 reads (according to Profiler).

In comparing the execution plans, they are the same, except for the actual
and estimated number of rows. When the proc runs with parameters that produce
less than 1,000 reads, the actual and estimated number of rows both equal 1.
When the proc runs with parameters that produce close to 500,000 reads, the
actual rows are approximately 85,000 and the estimated rows equal 1.

Then I run:
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

If I then reverse the order of execution by executing the procedure that
initially executes with close to 500,000 reads first, the reads drop to less
than 2,000. The execution plan shows the actual number of rows equal to 1,
and the estimated rows equal to 2.27. Then when I run the procedure that
initially executed with less than 1,000 reads, it continues to run at less
than 1,000 reads, and the actual number of rows is equal to 1 and the
estimated rows equal to 2.27. When run in this order, there is consistency in
the actual and estimated number of rows and the reads for both executions
with differing parameters are within reason.

Do I need to run DBCC DROPCLEANBUFFERS and DBCC FREEPROCCACHE on production
and then ensure that the procedure that ran close to 500,000 reads is run
first to ensure the proper plan, as well as using a KEEP PLAN option? Or,
what other options might you recommend?

I am running SQL 2000 SP4.

--
Message posted via SQLMonster.com
http://www.sqlmonster.com/Uwe/Forums...neral/200609/1

Sep 14 '06 #1
So perhaps a little more information will be helpful.

First of all, this is the query statement that generates the differing actual
versus estimated rows:

(The variables are of type int, sent as arguments to the stored procedure.)

SELECT y.rpFID, x.dCode, x.Desc, x.dcUID
FROM dbo.Table_X x
JOIN dbo.Table_Y y
ON x.dcUID = y.dcFID
AND x.MainID = y.MainID
JOIN dbo.Table_Z z
ON y.rpFID = z.rpUID
AND y.MainID = z.MainID
WHERE z.pFID = @pID
AND z.MainID = @MainID
AND x.MainID = @MainID
AND y.MainID = @MainID

I have run:
UPDATE STATISTICS dbo.Table_X WITH FULLSCAN
UPDATE STATISTICS dbo.Table_Y WITH FULLSCAN
UPDATE STATISTICS dbo.Table_Z WITH FULLSCAN
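
To see what the optimizer has to work with after those updates, the
statistics themselves can be inspected. A sketch (the index names here are
assumptions taken from the showplans posted later in the thread; adjust to
your actual indexes):

```sql
-- Inspect the density vector and histogram the optimizer uses for these
-- seeks. Index names are assumptions taken from the posted showplans.
DBCC SHOW_STATISTICS ('dbo.Table_Y', IX_MainID_rpdcUID)
DBCC SHOW_STATISTICS ('dbo.Table_Z', IX_MainID_rpUID)
DBCC SHOW_STATISTICS ('dbo.Table_Z', IDX1_pFID)
```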

When I execute the stored proc with the following arguments and in this order
I get the following results:
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
go
exec dbo.dem_sel_combined @pID = 440, @MainID = 576021
go
exec dbo.dem_sel_combined @pID = 476, @MainID = 576003
go

What this did was drop the reads from 500,000 to 250,000, while the estimated
versus actual dropped on one clustered index seek (on Table_Z) from 85,000
actual and 1 estimated to 4 actual and 1 estimated. On the other hand, the
clustered index seek (on Table_Y) still reports 85,000 actual and 1
estimated.

If I just pull this query out and run it on its own, there are 4 records
returned.

When I execute the stored proc with the following arguments and in this order
(note the arguments) I get the following results:
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
go
exec dbo.dem_sel_combined @pID = 476, @MainID = 576003
go
exec dbo.dem_sel_combined @pID = 440, @MainID = 576021
go

What this did was drop the reads from 250,000 to 79, while the estimated
versus actual on one clustered index seek (on Table_Z) remained 4 actual and
1 estimated while the clustered index seek (on Table_Y) dropped from 85,000
actual and 1 estimated to 5 actual and 1 estimated.

Any ideas how to always have the plan optimized to the lower read count, or
stated another way, optimized to the correct actual versus estimated row
counts?
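
For reference, per-operator actual versus estimated row counts like the ones
described above can be captured in text form on SQL 2000 as follows (a
sketch; the graphical execution plan in Query Analyzer shows the same
numbers):

```sql
-- Produces one row per plan operator, with Rows (actual), Executes,
-- EstimateRows and StmtText columns.
SET STATISTICS PROFILE ON
GO
exec dbo.dem_sel_combined @pID = 440, @MainID = 576021
GO
SET STATISTICS PROFILE OFF
GO
```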

cbrichards wrote:
>I have a stored procedure that will execute with less than 1,000 reads one
time (with a specified set of parameters), then with a different set of
[quoted text clipped - 27 lines]
>I am running SQL 2000 SP4.

Sep 15 '06 #2
Adding on to my previous reply, when I execute the procedures in the
following order (getting the high reads and large actual rows), I obtain the
following explain plan:

DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
go
exec dbo.dem_sel_combined @pID = 440, @MainID = 576021
go
exec dbo.dem_sel_combined @pID = 476, @MainID = 576003
go
Rows   Executes  StmtText
-----  --------  --------
4      1         Nested Loops(Inner Join, OUTER REFERENCES:([y].[dcFID]) OPTIMIZED)
4      1         |--Nested Loops(Inner Join, OUTER REFERENCES:([y].[rpFID]) OPTIMIZED)
82331  1         |    |--Clustered Index Seek(OBJECT:([MyDB].[dbo].[Table_Y].[IX_MainID_rpdcUID] AS [y]), SEEK:([y].[MainID]=[@MainID]) ORDERED FORWARD)
4      82331     |    |--Clustered Index Seek(OBJECT:([MyDB].[dbo].[Table_Z].[IX_MainID_rpUID] AS [z]), SEEK:([z].[MainID]=[@MainID] AND [z].[rpUID]=[MyDB].[dbo].[Table_Y].[rpFID] as [y].[rpFID]), WHERE:([MyDB].[dbo].[Table_Z].[pFID] as [z].[pFID]=[@pID]) ORDERED FORWARD)
4      4         |--Clustered Index Seek(OBJECT:([MyDB].[dbo].[Table_X].[IX_MainID_dcUID] AS [x]), SEEK:([x].[MainID]=[@MainID] AND [x].[dcUID]=[MyDB].[dbo].[Table_Y].[dcFID] as [y].[dcFID]) ORDERED FORWARD)


when I execute the procedures in the following order (getting the low number
of reads and low number of actual rows), I obtain the following explain plan:

DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
go
exec dbo.dem_sel_combined @pID = 476, @MainID = 576003
go
exec dbo.dem_sel_combined @pID = 440, @MainID = 576021
go
Rows   Executes  StmtText
-----  --------  --------
4      1         Nested Loops(Inner Join, OUTER REFERENCES:([y].[dcFID]) OPTIMIZED)
4      1         |--Nested Loops(Inner Join, OUTER REFERENCES:([z].[rpUID]))
5      1         |    |--Index Seek(OBJECT:([MyDB].[dbo].[Table_Z].[IDX1_pFID] AS [z]), SEEK:([z].[pFID]=[@pID]), WHERE:([MyDB].[dbo].[Table_Z].[MainID] as [z].[MainID]=[@MainID]) ORDERED FORWARD)
4      5         |    |--Clustered Index Seek(OBJECT:([MyDB].[dbo].[Table_Y].[IX_MainID_rpdcUID] AS [y]), SEEK:([y].[MainID]=[@MainID] AND [y].[rpFID]=[MyDB].[dbo].[Table_Z].[rpUID] as [z].[rpUID]) ORDERED FORWARD)
4      4         |--Clustered Index Seek(OBJECT:([MyDB].[dbo].[Table_X].[IX_MainID_dcUID] AS [x]), SEEK:([x].[MainID]=[@MainID] AND [x].[dcUID]=[MyDB].[dbo].[Table_Y].[dcFID] as [y].[dcFID]) ORDERED FORWARD)

Please help explain how I might always obtain the second explain plan, if
possible.

cbrichards wrote:
>So perhaps a little more information will be helpful.
[quoted text clipped]
>I am running SQL 2000 SP4.

Sep 15 '06 #3
cbrichards via SQLMonster.com (u3288@uwe) writes:
>I have a stored procedure that will execute with less than 1,000 reads
>one time (with a specified set of parameters), then with a different set
>of parameters the procedure executes with close to 500,000 reads
>(according to Profiler).
>[quoted text clipped]

So this is what is happening: SQL Server engages in something known as
parameter sniffing. When SQL Server runs a stored procedure for which
there is no plan available in the procedure cache, the optimizer builds
a plan for the procedure. It then uses the actual values of the input
parameters as guidance for the plan.

This can sometimes backfire, for instance if the first invocation is
for an atypical value. It appears that in your case, there is little
difference for one of the input cases, but the balance tips over in
favour of a plan that is detrimental for the other input cases.

The situation is not entirely trivial to handle. There are several
options, but all have their drawbacks.

To start with DBCC FREEPROCCACHE: since this affects the entire server, this
is a really bad idea.

If you always want the same plan, you can use an index hint, say

   dbo.Table_Z z WITH (INDEX = IDX1_pFID)

to force the use of the non-clustered index. This may still not give
the plan you want, though, as the optimizer can choose some other
combination. (On SQL 2005 you can force the exact plan if you wish.) A
more serious problem with this approach is that in two years, when
the data has changed, another plan may be much better. So index hints should
only be used when you are convinced that the index in question is
always the best.
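
Applied to the statement posted earlier, the hint would sit on Table_Z in
the FROM clause, roughly like this (a sketch; IDX1_pFID is the index name
visible in the second showplan):

```sql
SELECT y.rpFID, x.dCode, x.Desc, x.dcUID
FROM dbo.Table_X x
JOIN dbo.Table_Y y
  ON x.dcUID = y.dcFID
 AND x.MainID = y.MainID
JOIN dbo.Table_Z z WITH (INDEX = IDX1_pFID)  -- force the non-clustered index on pFID
  ON y.rpFID = z.rpUID
 AND y.MainID = z.MainID
WHERE z.pFID = @pID
  AND z.MainID = @MainID
  AND x.MainID = @MainID
  AND y.MainID = @MainID
```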

Another alternative is to add WITH RECOMPILE to the procedure definition.
In this case the procedure gets recompiled every time it's called, which
means that you can get a new plan each time. The obvious drawback is the
higher overhead. If the procedure is just this one statement, that's not
that big a deal, but if the procedure is several hundred lines long, there
is reason for concern.
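
In outline, the definition would look something like this (a sketch only;
the real procedure body is not shown in the thread and may contain more
than this one statement):

```sql
CREATE PROCEDURE dbo.dem_sel_combined
    @pID    int,
    @MainID int
WITH RECOMPILE  -- build a fresh plan on every execution
AS
SELECT y.rpFID, x.dCode, x.Desc, x.dcUID
FROM dbo.Table_X x
JOIN dbo.Table_Y y ON x.dcUID = y.dcFID AND x.MainID = y.MainID
JOIN dbo.Table_Z z ON y.rpFID = z.rpUID AND y.MainID = z.MainID
WHERE z.pFID = @pID
  AND z.MainID = @MainID
  AND x.MainID = @MainID
  AND y.MainID = @MainID
```

If recompiling every execution is too heavy-handed, a single problematic
call can also be recompiled on demand with
EXEC dbo.dem_sel_combined @pID = 440, @MainID = 576021 WITH RECOMPILE,
which compiles a fresh plan for that execution without caching it.
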
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Sep 15 '06 #4
cbrichards via SQLMonster.com wrote:
So perhaps a little more information will be helpful.

First of all, this is the query statement that generates the differing actual
versus estimated rows:

(The variables are of type int, sent as arguments to the stored procedure.)

SELECT y.rpFID, x.dCode, x.Desc, x.dcUID
FROM dbo.Table_X x
JOIN dbo.Table_Y y
ON x.dcUID = y.dcFID
AND x.MainID = y.MainID
JOIN dbo.Table_Z z
ON y.rpFID = z.rpUID
AND y.MainID = z.MainID
WHERE z.pFID = @pID
AND z.MainID = @MainID
AND x.MainID = @MainID
AND y.MainID = @MainID

Something likely unrelated to your issue (Erland answered that one already),
since it does not change the semantics of your statement, but I thought I'd
mention it nevertheless: two of the three MainID predicates in the WHERE
clause are superfluous, as all three tables are joined on MainID anyway.
So I'd keep only "AND x.MainID = @MainID", because Table_X seems to be your
major table.
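
With the redundant predicates dropped, the statement would read as follows
(a sketch of the suggestion above; since the posted showplans seek directly
on those MainID predicates, it is worth re-checking the plan after removing
them):

```sql
SELECT y.rpFID, x.dCode, x.Desc, x.dcUID
FROM dbo.Table_X x
JOIN dbo.Table_Y y
  ON x.dcUID = y.dcFID
 AND x.MainID = y.MainID
JOIN dbo.Table_Z z
  ON y.rpFID = z.rpUID
 AND y.MainID = z.MainID
WHERE z.pFID = @pID
  AND x.MainID = @MainID  -- y.MainID and z.MainID follow from the join conditions
```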

Ah, and another remark: I find these table and column names highly
unspecific. Changing them to something more meaningful (identifiers are
not restricted to 8 characters) will probably help maintain the
application.

Kind regards

robert
Sep 18 '06 #5
