469,299 Members | 2,035 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,299 developers. It's quick & easy.

Major query optimiser weirdness with UDFs and SPs on SQL 2000

There is something very strange going on here. Tested with ADO 2.7 and
MSDE/2000. At first, things look quite sensible.

You have a simple SQL query, let's say

select * from mytab where col1 = 1234

Now, let's write a simple VB program to do this query back to an
MSDE/2000 database on our local machine. Effectively, we'll

rs.open sSQL
rs.close

and do that 1,000 times. We wont bother fetching the result set, it
isn't important in this example.

No problem. On my machine this takes around 1.6 seconds and modifying
the code so that the column value in the where clause changes each
time (i.e col1 = nnnn), doesn't make a substantial difference to this
time. Well, that all seems reasonable, so moving right along...

Now we do it with a stored procedure

create procedure proctest(@id int)
as
select * from mytab where col1 = @id

and we now find that executing

proctest nnnn

1,000 times takes around 1.6 seconds whether or not the argument
changes. So far so good. No obvious saving, but then we wouldn't
expect any. The query is very simple, after all.

Well, get to the point!

Now create a table-returning UDF

create function functest(@id int) returns table as

return
(
select * from mytab where col1 = @id
)

try calling that 1,000 times as

select * from functest(nnnn)

and we get around 5.5 seconds on my machine if the argument changes,
otherwise 1.6 seconds if it remains the same for each call.

Hmm, looks like the query plan is discarded if the argument changes.
Well, that's fair enough I guess. UDFs might well be more expensive...
gotta be careful about using them. It's odd that discarding the query
plan seems to be SO expensive, but hey, waddya expect?. (perhaps the
UDF is completely rebuilt, who knows)

last test, then. Create an SP that calls the UDF

create procedure proctest1(@id int)
as
select * from functest(@id)
Ok, here's the $64,000 question. How long will this take if @id
changes each time. The raw UDF took 5.5 seconds, remember, so this
should be slightly slower.

But... IT IS NOT.. It takes 1.6 seconds whether or not @id changes.
Somehow, the UDF becomes FOUR TIMES more efficient when wrapped in an
SP.

My theory, which I stress is not entirely scientific, goes something
like this:-

I deduce that SQL Server decides to reuse the query plan in this
circumstance but does NOT when the UDF is called directly. This is
counter-intuitive but it may be because SQL Server's query parser is
tuned for conventional SQL i.e it can say

well, I've got

select * from mytab WHERE [something or other]

and now I've got

select * from mytab WHERE [something else]

so I can probably re-use the query plan from last time. (I don't know
if it is this clever, but it does seem to know when two
textually-different queries have some degree of commonality)

Whereas with

select * from UDF(arg1)

and

select * from UDF(arg2)

it goes... hmm, mebbe not.... I better not risk it.

But with

sp_something arg1

and

sp_something arg2

it goes... yup, i'll just go call it... and because the SP was already
compiled, the internal call to the UDF already has a query plan.

Anyway, that's the theory. For more complex UDFs, by the way, the
performance increase can be a lot more substantial. On a big complex
UDF with a bunch of joins, I measured a tenfold increase in
performance just by wrapping it in an SP, as above.

Obviously, wrapping a UDF in an SP isn't generally a good thing; the
idea of UDFs is to allow the column list and where clause to filter
the rowset of the UDF, but if you are repeatedly calling the UDF with
the same where clause and column list, this will make it a *lot*
faster.
Jul 20 '05 #1
3 2134
Andrew Mayo (aj****@my-deja.com) writes:
try calling that 1,000 times as

select * from functest(nnnn)

and we get around 5.5 seconds on my machine if the argument changes,
otherwise 1.6 seconds if it remains the same for each call.

Hmm, looks like the query plan is discarded if the argument changes.
Well, that's fair enough I guess. UDFs might well be more expensive...
gotta be careful about using them. It's odd that discarding the query
plan seems to be SO expensive, but hey, waddya expect?. (perhaps the
UDF is completely rebuilt, who knows)


The reason why the call to the UDF takes longer time is simple,
and you would have obtained the same result if you had called the
bare SQL statement with different values for 1234.

To wit, a table-valued UDF is not really a function: it's a macro, so
when you call it on your own, this is the same as invoking the naked
SQL statement.

Here, this costs you performance, but normally this works to your
advantage. You use the table-valued UDF in a query, and the optimizer
can then work with the expanded, potentially creating a better plan
than if had to first had to evaluate the UDF on its own.

A good way to illustrate this, is to write three functions, one of
each kind that all return the value of @@nestlevel.

--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #2
Erland Sommarskog <so****@algonet.se> wrote in message news:<Xn*********************@127.0.0.1>...
[snip]
The reason why the call to the UDF takes longer time is simple,
and you would have obtained the same result if you had called the
bare SQL statement with different values for 1234.

Hmm. Interesting. But I did state in the original post that I had also
tried the bare SQL statement with *different* values for the column
value and that this does *not* affect the execution time. This would
appear to contradict your expected observation.

I have done quite a bit of testing with more complex queries and again
the cost of the raw query does not seem to be affected significantly
by changing the substituted argument (I have only done this for the
one argument case)

The most complex query I have tested involves several joins, has a
CASE statement, and has the argument parametrically substituted in
four places within the query. The execution time for this query is
around 8 seconds per 100 iterations running on the local server. This
figure is almost constant, regardless of the parameter value, which is
numeric.

If I wrap this into a UDF, it also takes 8 seconds for 100 iterations,
UNLESS I keep the argument the same. If I do that, execution time
drops to 0.4 seconds.

If I wrap it into an SP, execution time drops to 0.4 seconds
regardless of the passed-in argument. If I then call the UDF from the
SP, the execution time remains at 0.4 seconds regardless of the
argument.

However, if a UDF is essentially a macro (and unfortunately lacking
the source code for SQL Server I have no idea how they are
implemented), then I can see that effectively each time the UDF is
invoked, if the arguments change, then the UDF is effectively rebuilt
- presumably, this is equivalent to recompiling a stored procedure -
but if the UDF is already inside a precompiled SP, then this process
does not need to occur. That would make sense.
Jul 20 '05 #3
Andrew Mayo (aj****@my-deja.com) writes:
Hmm. Interesting. But I did state in the original post that I had also
tried the bare SQL statement with *different* values for the column
value and that this does *not* affect the execution time. This would
appear to contradict your expected observation.
It could be that the single SQL statement is autoparameterized, but
when it is packaged in a UDF it is not.
However, if a UDF is essentially a macro (and unfortunately lacking
the source code for SQL Server I have no idea how they are
implemented), then I can see that effectively each time the UDF is
invoked, if the arguments change, then the UDF is effectively rebuilt
- presumably, this is equivalent to recompiling a stored procedure -
but if the UDF is already inside a precompiled SP, then this process
does not need to occur. That would make sense.


Not that this only apply to inlined table function. If you were to
put the SELECT in a multi-statement UDF, I predict that the execution
time is the same as for the stored procedure, or possibly somewhat
higher, since there is some overhead for the table variable.

--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #4

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

9 posts views Thread by Ed_No_Spam_Please_Weber | last post: by
6 posts views Thread by Steven D.Arnold | last post: by
5 posts views Thread by David Thielen | last post: by
3 posts views Thread by Martini | last post: by
1 post views Thread by CARIGAR | last post: by
reply views Thread by suresh191 | last post: by
1 post views Thread by Geralt96 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.