By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
440,230 Members | 2,425 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 440,230 IT Pros & Developers. It's quick & easy.

Why are scalar functions soooo sloooow?

P: n/a
Hi all,

I've been using scalar functions as a way to perform some complex data
transformation operations, and I've noticed that scalar functions
reaaaaalllllyyyy sloooowwwwww thiiiiiings dooooooown. I expect slow-down, of
course, and would even not be surprised at slow-downs up to a factor of, say
50:1, but I'm seeing slow-downs more like 1000:1 or 100000:1. I'm sure it
would actually be faster to actually export a table, use VB to process it,
then import it back in.
Jul 20 '05 #1
Share this Question
Share on Google+
7 Replies


P: n/a

"Steve Jorgensen" <no****@nospam.nospam> wrote in message
news:81********************************@4ax.com...
Hi all,

I've been using scalar functions as a way to perform some complex data
transformation operations, and I've noticed that scalar functions
reaaaaalllllyyyy sloooowwwwww thiiiiiings dooooooown. I expect slow-down, of course, and would even not be surprised at slow-downs up to a factor of, say 50:1, but I'm seeing slow-downs more like 1000:1 or 100000:1. I'm sure it
would actually be faster to actually export a table, use VB to process it,
then import it back in.


Scalar functions are not converted into set-based logic, so they act like
cursors when you use them to manipulate rows in a set/based operation. This
can be extremely slow, as you say - perhaps using batches instead of a
single update would be faster? In some cases, it may indeed be better to do
the transformation outside the database - regexes, for example, may be much
faster and easier in a client language than in a UDF in TSQL.

Simon
Jul 20 '05 #2

P: n/a
Steve Jorgensen (no****@nospam.nospam) writes:
I've been using scalar functions as a way to perform some complex data
transformation operations, and I've noticed that scalar functions
reaaaaalllllyyyy sloooowwwwww thiiiiiings dooooooown. I expect
slow-down, of course, and would even not be surprised at slow-downs up
to a factor of, say 50:1, but I'm seeing slow-downs more like 1000:1 or
100000:1. I'm sure it would actually be faster to actually export a
table, use VB to process it, then import it back in.


Yes, scalar UDF:s are something to be careful with when it comes to
performance. Essentially what is happen is that your set-based query
becomes a cursor behind the scenes. That is, SQL Server runs the
UDF for one row at a time.

Theoretically, if you UDF is inlineable, SQL Server could be able to expand
the query, so that SQL Server could handle the UDF like any other
expression, but I have never seen that happen. (An lineable function
would have a body with a single RETURN statement that computes the value
of the function.)
--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #3

P: n/a
On Tue, 10 Feb 2004 23:24:07 +0000 (UTC), Erland Sommarskog
<so****@algonet.se> wrote:
Steve Jorgensen (no****@nospam.nospam) writes:
I've been using scalar functions as a way to perform some complex data
transformation operations, and I've noticed that scalar functions
reaaaaalllllyyyy sloooowwwwww thiiiiiings dooooooown. I expect
slow-down, of course, and would even not be surprised at slow-downs up
to a factor of, say 50:1, but I'm seeing slow-downs more like 1000:1 or
100000:1. I'm sure it would actually be faster to actually export a
table, use VB to process it, then import it back in.


Yes, scalar UDF:s are something to be careful with when it comes to
performance. Essentially what is happen is that your set-based query
becomes a cursor behind the scenes. That is, SQL Server runs the
UDF for one row at a time.

Theoretically, if you UDF is inlineable, SQL Server could be able to expand
the query, so that SQL Server could handle the UDF like any other
expression, but I have never seen that happen. (An lineable function
would have a body with a single RETURN statement that computes the value
of the function.)


Well, I knew that SQL would turn my query into cursor-based behind the scenes,
but that, in itself should not have made much difference because it was an
update on a single table to begin with. Still, my experience suggests, that
even using VBA on an Access dynaset bound to the table over a 10Base-T network
would be faster than what SQL Server is doing.
Jul 20 '05 #4

P: n/a
Steve Jorgensen (no****@nospam.nospam) writes:
Well, I knew that SQL would turn my query into cursor-based behind the
scenes, but that, in itself should not have made much difference because
it was an update on a single table to begin with. Still, my experience
suggests, that even using VBA on an Access dynaset bound to the table
over a 10Base-T network would be faster than what SQL Server is doing.


I don't know what your SQL looks like, but if your UPDATE looks like:

UPDATE tbl
SET col = ...
WHERE dbo.udf(some_col) = 4711

And there are a couple of rows in the table, this certainly is a killer.

--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #5

P: n/a
Erland Sommarskog wrote:
Steve Jorgensen (no****@nospam.nospam) writes:
Well, I knew that SQL would turn my query into cursor-based behind the
scenes, but that, in itself should not have made much difference because
it was an update on a single table to begin with. Still, my experience
suggests, that even using VBA on an Access dynaset bound to the table
over a 10Base-T network would be faster than what SQL Server is doing.

I don't know what your SQL looks like, but if your UPDATE looks like:

UPDATE tbl
SET col = ...
WHERE dbo.udf(some_col) = 4711

And there are a couple of rows in the table, this certainly is a killer.

But that would be the fault of the resulting tablescan.....

Cheers
Serge

--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Jul 20 '05 #6

P: n/a
Serge Rielau (sr*****@ca.eye-be-em.com) writes:
Erland Sommarskog wrote:
I don't know what your SQL looks like, but if your UPDATE looks like:

UPDATE tbl
SET col = ...
WHERE dbo.udf(some_col) = 4711

And there are a couple of rows in the table, this certainly is a killer.

But that would be the fault of the resulting tablescan.....


Not only. There will be a table scan, but there is a huge difference
between a set-based table scan and one which is row by row. Say that
the table has some 30000 rows. A table scan on such a table does not need
to be a big deal. But add the UDF, and it may go over the edge.
--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #7

P: n/a
Erland Sommarskog wrote:
Serge Rielau (sr*****@ca.eye-be-em.com) writes:
Erland Sommarskog wrote:
I don't know what your SQL looks like, but if your UPDATE looks like:

UPDATE tbl
SET col = ...
WHERE dbo.udf(some_col) = 4711

And there are a couple of rows in the table, this certainly is a killer.


But that would be the fault of the resulting tablescan.....

Not only. There will be a table scan, but there is a huge difference
between a set-based table scan and one which is row by row. Say that
the table has some 30000 rows. A table scan on such a table does not need
to be a big deal. But add the UDF, and it may go over the edge.

Hmm, I'm likley loosing you on the set, vs. cursor based scan.
At the end of the day everything is cursor based because the data is
coming from a disk in form of records. Naturally these "cursors" are at
much lower level in the DBMS and hence faster than a cursor in T-SQL.
A predicate which is simple enough may be executed in the lowest level,
somewhere in the neighbourhood of the bufferpool.
Predicates that are too complex must be driven by the regular runtime
engine of the DBMS. Runtime still is record driven.
Orthogonal to that function invocations may be victims of security
measues by the DBMS (own process, I/O validation etc...) which increases
cost significantly. If I get that straight then this later part may be
one of the things that CLR is about in Yukon (?)

So my take is that extra cost of predicates is associated with a need to
be serviced higher up in the food chain.
General functions may suffer from context switches and, depending of the
language used impedance mismatch of the API (value formats).
Now all this was general DBMS talk. I don't know the guts of SQL Server.

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Jul 20 '05 #8

This discussion thread is closed

Replies have been disabled for this discussion.