By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
429,263 Members | 2,640 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 429,263 IT Pros & Developers. It's quick & easy.

parameterized queries running slower than non-parameterized queries

P: n/a
Hello

When I use a PreparedStatement (in jdbc) with the following query:

SELECT store_groups_id
FROM store_groups
WHERE store_groups_id IS NOT NULL
AND type = ?
ORDER BY group_name

It takes a significantly longer time to run (the time it takes for
executeQuery() to return ) than if I use

SELECT store_groups_id
FROM store_groups
WHERE store_groups_id IS NOT NULL
AND type = 'M'
ORDER BY group_name

After tracing the problem down, it appears that this is not precisely
a java issue, but rather has to do with the underlying cost of running
parameterized queries.

When I open up MS Enterprise Manager and type the same query in - it
also takes far longer for the parameterized query to run when I use
the version of the query with bind (?) parameters.

This only happens when the table in question is large - I am seeing
this behaviour for a table with > 1,000,000 records. It doesn't make
sense to me why a parameterized query would run SLOWER than a
completely ad-hoc query when it is supposed to be more efficient.

Furthermore, if one were to say that the reason for this behaviour is
that the query is first getting compliled and then the parameters are
getting sent over - thus resulting in a longer percieved execution
time - I would respond that if this were the case then A) it shouldn't
be any different if it were run against a large or small table B) this
performance hit should only be experienced the first time that the
query is run C) the performance hit should only be 2x the time for the
non-parameterized query takes to run - the difference in response time
is more like 4-10 times the time it takes for the non parameterized
version to run!!!

Is this a sql-server specific problem or something that would pertain
to other databases as well? I there something about the coorect use of
bind parameters that I overall don't understand?

If I can provide some hints in Java then this would be great..
otherwise, do I need to turn/off certain settings on the database
itself?

If nothing else works, I will have to either find or write a wrapper
around the Statement object that acts like a prepared statement but in
reality sends regular Statement objects to the JDBC driver. I would
then put some inteligence in the database layer for deciding whether
to use this special -hack- object or a regular prepared statement
depending on the expected overhead. (Obviously this logic would only
be written in once place.. etc.. IoC.. ) HOWEVER, I would desperately
want to avoid doing this.

Please help :)
Jul 20 '05 #1
Share this Question
Share on Google+
1 Reply


P: n/a
[posted and mailed, please reply in news]

gary b (bo*******@hotmail.com) writes:
SELECT store_groups_id
FROM store_groups
WHERE store_groups_id IS NOT NULL
AND type = ?
ORDER BY group_name

It takes a significantly longer time to run (the time it takes for
executeQuery() to return ) than if I use

SELECT store_groups_id
FROM store_groups
WHERE store_groups_id IS NOT NULL
AND type = 'M'
ORDER BY group_name
...
This only happens when the table in question is large - I am seeing
this behaviour for a table with > 1,000,000 records. It doesn't make
sense to me why a parameterized query would run SLOWER than a
completely ad-hoc query when it is supposed to be more efficient.
It would have helped if you had included the CREATE TABLE and CREATE INDEX
statements for your table. Now, I will have to guess that your table has
a non-clustered index on type and that index does not include both of
store_groups_id and groups_name, and also at least one these colunms is
absent from the clustered index.

When it is said that prepared queries are more effecient, this is because
the query plan for these are cached and can be reused. This means that
if you resubmit the query you are saved the compilation phase.

Ad-hoc statements can be cached too, and also auto-parameterized, so they
would be very similar to prepared queries. But auto-parameterization
does not always happen, and this seems to be such a case.

When SQL Server builds a query plan, it looks ar the values in the queries
and also at the statistics saved for the table. Now, if a value is a
constant, it knows that this is the value that applies, and no other.

If the statement is parameterized, SQL Server still looks at the current
parameter value as a hint, but it has to consider the possibility that
next time, the value passed might be different.

For this query it seems that SQL Server has a choices between two
alternatives:

1) Scan the data pages of the table.
2) Use the non-clustered index on type.

For a value of type with few rows, using the index is much faster. However,
say that 40% of the rows has ' ' in type. Using the index to retrieve
these rows would be very expensive - a lot more expensive than scanning
the table, because each data page would be read more than once.

SQL Server tends to be conservative here. I've more often seen it choose
table scan incorrectly than incorrectly seeking a non-clustered index.

Had the index included all columns in the query, this would have changed
the scene completely, since SQL Server would have no reason to access
the data pages. As I mentioned the clustered index also has importance.
This is because the pointer to the data page is actually the value of
the clustered-index key. So the columns of the clustered index is present
in all non-clustered indexes.
Is this a sql-server specific problem or something that would pertain
to other databases as well? I there something about the coorect use of
bind parameters that I overall don't understand?
The behaviour is cetainly related to the architecture of SQL Server. I
don't have experience of other engines, but I would expect most engines
can give you surprises, although which surprises you get may differ
from engine to engine.
If I can provide some hints in Java then this would be great..
otherwise, do I need to turn/off certain settings on the database
itself?


The best in this case may be to create a covering index for this
particular query. You can also use an index hint to force use of
the non-clustered index, but if you actually pass a value with many
occurrances your application might grind to an actual standstill.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #2

This discussion thread is closed

Replies have been disabled for this discussion.