Why are scalar functions soooo sloooow?

Steve Jorgensen

Hi all,

I've been using scalar functions as a way to perform some complex data
transformation operations, and I've noticed that scalar functions
reaaaaalllllyyyy sloooowwwwww thiiiiiings dooooooown. I expect slow-down, of
course, and would even not be surprised at slow-downs up to a factor of, say
50:1, but I'm seeing slow-downs more like 1000:1 or 100000:1. I'm sure it
would actually be faster to actually export a table, use VB to process it,
then import it back in.

Jul 20 '05 #1

Subscribe Post Reply

6519

Simon Hayes

"Steve Jorgensen" <no****@nospam.nospam> wrote in message
news:81********************************@4ax.com...

Hi all,

I've been using scalar functions as a way to perform some complex data
transformation operations, and I've noticed that scalar functions
reaaaaalllllyyyy sloooowwwwww thiiiiiings dooooooown. I expect slow-down, of course, and would even not be surprised at slow-downs up to a factor of, say 50:1, but I'm seeing slow-downs more like 1000:1 or 100000:1. I'm sure it
would actually be faster to actually export a table, use VB to process it,
then import it back in.

Scalar functions are not converted into set-based logic, so they act like
cursors when you use them to manipulate rows in a set/based operation. This
can be extremely slow, as you say - perhaps using batches instead of a
single update would be faster? In some cases, it may indeed be better to do
the transformation outside the database - regexes, for example, may be much
faster and easier in a client language than in a UDF in TSQL.

Simon

Jul 20 '05 #2

Erland Sommarskog

Steve Jorgensen (no****@nospam.nospam) writes:

I've been using scalar functions as a way to perform some complex data
transformation operations, and I've noticed that scalar functions
reaaaaalllllyyyy sloooowwwwww thiiiiiings dooooooown. I expect
slow-down, of course, and would even not be surprised at slow-downs up
to a factor of, say 50:1, but I'm seeing slow-downs more like 1000:1 or
100000:1. I'm sure it would actually be faster to actually export a
table, use VB to process it, then import it back in.

Yes, scalar UDF:s are something to be careful with when it comes to
performance. Essentially what is happen is that your set-based query
becomes a cursor behind the scenes. That is, SQL Server runs the
UDF for one row at a time.

Theoretically, if you UDF is inlineable, SQL Server could be able to expand
the query, so that SQL Server could handle the UDF like any other
expression, but I have never seen that happen. (An lineable function
would have a body with a single RETURN statement that computes the value
of the function.)
--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 20 '05 #3

Steve Jorgensen

On Tue, 10 Feb 2004 23:24:07 +0000 (UTC), Erland Sommarskog
<so****@algonet.se> wrote:

Steve Jorgensen (no****@nospam.nospam) writes:
I've been using scalar functions as a way to perform some complex data
transformation operations, and I've noticed that scalar functions
reaaaaalllllyyyy sloooowwwwww thiiiiiings dooooooown. I expect
slow-down, of course, and would even not be surprised at slow-downs up
to a factor of, say 50:1, but I'm seeing slow-downs more like 1000:1 or
100000:1. I'm sure it would actually be faster to actually export a
table, use VB to process it, then import it back in.

Yes, scalar UDF:s are something to be careful with when it comes to
performance. Essentially what is happen is that your set-based query
becomes a cursor behind the scenes. That is, SQL Server runs the
UDF for one row at a time.

Theoretically, if you UDF is inlineable, SQL Server could be able to expand
the query, so that SQL Server could handle the UDF like any other
expression, but I have never seen that happen. (An lineable function
would have a body with a single RETURN statement that computes the value
of the function.)

Well, I knew that SQL would turn my query into cursor-based behind the scenes,
but that, in itself should not have made much difference because it was an
update on a single table to begin with. Still, my experience suggests, that
even using VBA on an Access dynaset bound to the table over a 10Base-T network
would be faster than what SQL Server is doing.

Jul 20 '05 #4

Erland Sommarskog

Steve Jorgensen (no****@nospam.nospam) writes:

Well, I knew that SQL would turn my query into cursor-based behind the
scenes, but that, in itself should not have made much difference because
it was an update on a single table to begin with. Still, my experience
suggests, that even using VBA on an Access dynaset bound to the table
over a 10Base-T network would be faster than what SQL Server is doing.

I don't know what your SQL looks like, but if your UPDATE looks like:

UPDATE tbl
SET col = ...
WHERE dbo.udf(some_col) = 4711

And there are a couple of rows in the table, this certainly is a killer.

--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 20 '05 #5

Serge Rielau

Erland Sommarskog wrote:

Steve Jorgensen (no****@nospam.nospam) writes:
Well, I knew that SQL would turn my query into cursor-based behind the
scenes, but that, in itself should not have made much difference because
it was an update on a single table to begin with. Still, my experience
suggests, that even using VBA on an Access dynaset bound to the table
over a 10Base-T network would be faster than what SQL Server is doing.

I don't know what your SQL looks like, but if your UPDATE looks like:

UPDATE tbl
SET col = ...
WHERE dbo.udf(some_col) = 4711

And there are a couple of rows in the table, this certainly is a killer.

But that would be the fault of the resulting tablescan.....

Cheers
Serge

--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab

Jul 20 '05 #6

Erland Sommarskog

Serge Rielau (sr*****@ca.eye-be-em.com) writes:

Erland Sommarskog wrote:
I don't know what your SQL looks like, but if your UPDATE looks like:

UPDATE tbl
SET col = ...
WHERE dbo.udf(some_col) = 4711

And there are a couple of rows in the table, this certainly is a killer.

But that would be the fault of the resulting tablescan.....

Not only. There will be a table scan, but there is a huge difference
between a set-based table scan and one which is row by row. Say that
the table has some 30000 rows. A table scan on such a table does not need
to be a big deal. But add the UDF, and it may go over the edge.
--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 20 '05 #7

Serge Rielau

Erland Sommarskog wrote:

Serge Rielau (sr*****@ca.eye-be-em.com) writes:
Erland Sommarskog wrote:
I don't know what your SQL looks like, but if your UPDATE looks like:

UPDATE tbl
SET col = ...
WHERE dbo.udf(some_col) = 4711

And there are a couple of rows in the table, this certainly is a killer.

But that would be the fault of the resulting tablescan.....

Not only. There will be a table scan, but there is a huge difference
between a set-based table scan and one which is row by row. Say that
the table has some 30000 rows. A table scan on such a table does not need
to be a big deal. But add the UDF, and it may go over the edge.

Hmm, I'm likley loosing you on the set, vs. cursor based scan.
At the end of the day everything is cursor based because the data is
coming from a disk in form of records. Naturally these "cursors" are at
much lower level in the DBMS and hence faster than a cursor in T-SQL.
A predicate which is simple enough may be executed in the lowest level,
somewhere in the neighbourhood of the bufferpool.
Predicates that are too complex must be driven by the regular runtime
engine of the DBMS. Runtime still is record driven.
Orthogonal to that function invocations may be victims of security
measues by the DBMS (own process, I/O validation etc...) which increases
cost significantly. If I get that straight then this later part may be
one of the things that CLR is about in Yukon (?)

So my take is that extra cost of predicates is associated with a need to
be serviced higher up in the food chain.
General functions may suffer from context switches and, depending of the
language used impedance mismatch of the API (value formats).
Now all this was general DBMS talk. I don't know the guts of SQL Server.

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab

Jul 20 '05 #8

by: roger | last post by:

I'm having difficulties invoking a user defined table function, when passing to it a parameter that is the result of another user defined function. My functions are defined like so: drop...

Microsoft SQL Server

Scalar Function Columns

by: Tyler Hudson | last post by:

Is it ill-advised to have columns whose values pull from scalar functions using other fields in the record as parameters? For example, if I have create table a(iID int primary key) create table...

Microsoft SQL Server

Help: template function specialisation must accept scalar types only

by: Martin MacRobert | last post by:

Hi, I'm trying to make a specialisation of a template function, so that the second parameter accepts scalar types only (int,double,float etc.). How can I do this without writing an explicit...

C / C++

Throwing errors in UDF scalar functions

by: Simon Riggs | last post by:

I'm trying to create some User Defined Functions, which act as scalar functions. I can successfully create a UDF like this: drop function arraylistset; CREATE FUNCTION ArrayListSet (ArrayList...

DB2 Database

abysmal performance with scalar query

by: Bob Stearns | last post by:

When I run the following query with the two sections commented out, the response time is between 1 an 2 seconds; with the first indicated section enabled, the response goes up to 15 seconds even...

DB2 Database

VS 2003 Sloooow Debugging

by: John E Katich | last post by:

I'm just getting into VC++ managed code (VS 2003). Using the App Wizard I build a VC++ .NET Console Application. When I press F5 it takes 30 seconds to get to the Console::Writeline(). ...

.NET Framework

check whether a value is scalar

by: Eli | last post by:

Hi, I want to check whether a value is a scalar. A scalar can be: - None (null) - string - number (integer, float) - boolean How can I validate a value is one of these types? I care about...

Python

Creating Scalar UDF

by: deepdata | last post by:

Hi, I am trying to create a function which returns a scalar data. My function tries to insert a row in a table as well. /** CREATE function getManufacturerId ( manufacturer VARCHAR(128) )

DB2 Database

calling a scalar function in sql from vb

by: Rayne | last post by:

I've got some scalar functions defined in sql server 2005. In VB.NET they are set up as stored procedures that return a single value in my tableadapter. When I preview data on the query in design...

Visual Basic .NET

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Merging data from multiple Excel files

by: ryjfgjl | last post by:

In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...

Data Management

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

Windows Server

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

General

Why are scalar functions soooo sloooow?

Similar topics