By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
438,265 Members | 1,333 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 438,265 IT Pros & Developers. It's quick & easy.

Optimizator and indexes

P: n/a
Is there a way to force optimizer to use indexes without hints? (some
server setting or index type...)
I'll give an example to clarify :

I have a table with fields
Customer_Code char(10) not null
Invoice_Number int not null
and an index on those fields IX_1.

there are about 2,000,000 records in the table and those two fields are
not unique, and I cant use clustered index because of the nature of the
table (need it for something else).

When I use query like :

Select * from CustInv where Customer_Code='ABC' and invoice_Number=2

depending on the ammount of data statistics computes, query is executed
using the IX_1 or IX_1 is ignored. The documentation says that using of
indexes is determened by the uniqueness of the data, but my tests show
that every usage of index is faster.
I can fool the optimizer when using query like

Select * from CustInv where Customer_Code='ABC' and invoice_Number>=2
and invoice_number<=2

but all that it does is extends the optimizer's tolerance (IX_1) is
ignored for some queries in that form also.

Because my database and indexes are dynamically created, I cannot use
hints for indexes, so if anyone knows a way to tell the database
something like : "If you have an index on where fields, use it ALWAYS",
the knowledge would be gratly appretiated! :)

Thanx in advance! Sorry for a longer post and maybe fuzzy explanation of
the problem...

Igor
Jul 23 '05 #1
Share this Question
Share on Google+
9 Replies


P: n/a
Igor,

If you create the tables (and indexes) dynamically, then it should be no
problem for you to name your indexes and constraints. And if you can
name your indexes/constraints, then you can use index hints.

BTW: There is no way to force the use of a particular index without
using hints. Such option would make no sense at all.

SQL-Server uses a cost based optimizer and will minimize I/O, it will
choose indexes that will minimize query execution times.

With a hot cache, an index hint can in some situations increase
performance. Note however that the index hint can become troublesome
when the amount of data grows (or unused space increases), or when the
cache size shrinks (because other processes need the server's memory).

If the query that you are actually using is more complex then the one
below, then please post it. There may be other options...

Hope this helps,
Gert-Jan
Igor wrote:

Is there a way to force optimizer to use indexes without hints? (some
server setting or index type...)
I'll give an example to clarify :

I have a table with fields
Customer_Code char(10) not null
Invoice_Number int not null
and an index on those fields IX_1.

there are about 2,000,000 records in the table and those two fields are
not unique, and I cant use clustered index because of the nature of the
table (need it for something else).

When I use query like :

Select * from CustInv where Customer_Code='ABC' and invoice_Number=2

depending on the ammount of data statistics computes, query is executed
using the IX_1 or IX_1 is ignored. The documentation says that using of
indexes is determened by the uniqueness of the data, but my tests show
that every usage of index is faster.
I can fool the optimizer when using query like

Select * from CustInv where Customer_Code='ABC' and invoice_Number>=2
and invoice_number<=2

but all that it does is extends the optimizer's tolerance (IX_1) is
ignored for some queries in that form also.

Because my database and indexes are dynamically created, I cannot use
hints for indexes, so if anyone knows a way to tell the database
something like : "If you have an index on where fields, use it ALWAYS",
the knowledge would be gratly appretiated! :)

Thanx in advance! Sorry for a longer post and maybe fuzzy explanation of
the problem...

Igor

Jul 23 '05 #2

P: n/a
Igor (bi**@baja.com) writes:
Because my database and indexes are dynamically created, I cannot use
hints for indexes, so if anyone knows a way to tell the database
something like : "If you have an index on where fields, use it ALWAYS",
the knowledge would be gratly appretiated! :)


Believe me, you don't want that. Say that you have:

SELECT * FROM tbl WHERE nonclusteredindexedcol = @value

If there is a good distribution of values in nonclusteredindexedcol,
the index will be good. But say that for 30% of the rows, the value
is the one the same. The cost for using the index rather than scanning
the table, will be considerably lower than using the index, since when
using the index the same page will be accessed more than once.

It is true, though, that the optimizer is overly conservative when it
comes to using non-clustered index, and in my opinion, the threshold
when it switches to table scan is a little low.

For a table where I know my data, I don't shudder for using an index
hint. But for a dynamically created database like yours, this sounds
risky.

A better alternative might be to build non-clustered indexes which
comprises all columns you need. If that is all columns of the table,
that is in practice a second clustered index on the table. It may
not be good for update speed though.

You could also investigate indexed views.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #3

P: n/a
Thanks for the reply.

Unfortunately, the query is as simple as I posted.
I didn't mean forcing usage of an PARTICULAR index, but telling the
optimizer if there is an index on the table with my where clause, use
it. :)
What I ment by dynamically creating indexes is that some tables index
themselves depending on the usage and data in them. The index names are
created dynamically and for me to choose one would mean scanning the
where clause and index names table...

And, if the only way to get the optimizer to actually optimize is to
turn it off, I can use DBF+NTX :))

btw, the irony is that the more complex the query, the smarter the
optimizer gets! :((
Gert-Jan Strik wrote:
Igor,

If you create the tables (and indexes) dynamically, then it should be no
problem for you to name your indexes and constraints. And if you can
name your indexes/constraints, then you can use index hints.

BTW: There is no way to force the use of a particular index without
using hints. Such option would make no sense at all.

SQL-Server uses a cost based optimizer and will minimize I/O, it will
choose indexes that will minimize query execution times.

With a hot cache, an index hint can in some situations increase
performance. Note however that the index hint can become troublesome
when the amount of data grows (or unused space increases), or when the
cache size shrinks (because other processes need the server's memory).

If the query that you are actually using is more complex then the one
below, then please post it. There may be other options...

Hope this helps,
Gert-Jan

Jul 23 '05 #4

P: n/a
Thanks for the reply!
I agree with you on the threshold of the optimizer. But my problem
manifests itself like this :

My table is about 2,000,000 records, with an estimated growth of about
250,000 a month (don't think a giant index would be nice :))

When a result set will contain 3 records the query time is about 20secs
(stupid thing scans the table)

When a result set will contain 50,000 records (or so the optimizer
thinks), the query time is about 5 secs!
(uses index)
When I show this to my user, I get beat up by a stick! :))

Also, the server has only 1G of ram, so it cannot cache everything (if
it could, i wouldn't have noticed this) and is a dedicated server (only
SQL2000).
Not dealing with the inteligence (stupidity?) of the optimizer, is there
a way to increase the threshold for the table scan???

database is optimized every night and is currently about 10G!

Erland Sommarskog wrote:
Igor (bi**@baja.com) writes:
Because my database and indexes are dynamically created, I cannot use
hints for indexes, so if anyone knows a way to tell the database
something like : "If you have an index on where fields, use it ALWAYS",
the knowledge would be gratly appretiated! :)

Believe me, you don't want that. Say that you have:

SELECT * FROM tbl WHERE nonclusteredindexedcol = @value

If there is a good distribution of values in nonclusteredindexedcol,
the index will be good. But say that for 30% of the rows, the value
is the one the same. The cost for using the index rather than scanning
the table, will be considerably lower than using the index, since when
using the index the same page will be accessed more than once.

It is true, though, that the optimizer is overly conservative when it
comes to using non-clustered index, and in my opinion, the threshold
when it switches to table scan is a little low.

For a table where I know my data, I don't shudder for using an index
hint. But for a dynamically created database like yours, this sounds
risky.

A better alternative might be to build non-clustered indexes which
comprises all columns you need. If that is all columns of the table,
that is in practice a second clustered index on the table. It may
not be good for update speed though.

You could also investigate indexed views.

Jul 23 '05 #5

P: n/a
Hi Igor
have you tried manually updating statistics more often?

Jul 23 '05 #6

P: n/a
Tzvika Barenholz wrote:
Hi Igor
have you tried manually updating statistics more often?


Yes. Database has a plan to do it every night. but optimizer decides on
the basis of the statistics which is bad in my case... :(
Jul 23 '05 #7

P: n/a
Igor,

If table CustInv has a nonclustered compound index on
(Customer_Code,Invoice_Number) and you run the query

Select * from CustInv where Customer_Code='ABC' and invoice_Number=2

and this query (0) will only return a few rows, then I would be *very*
surprised if the optimizer did not use the nonclustered index.

However, if what you are actually doing is this (1)

EXEC MySProc 'ABC', 2

or this (2)

Declare @Customer_Code char(10)
Declare @invoice_Number int
Set @Customer_Code='ABC'
Set @invoice_Number=2
Select * from CustInv where Customer_Code=@Customer_Code and
invoice_Number=@invoice_Number

then that is a different story.

Situation 1 is different because of parameter sniffing (check out Google
for more info). If you are using that, adding WITH RECOMPILE to the
stored procdure definition could be a solution.

In situation 2, it is not the statistics of the value pair ('ABC',2)
that is used, but the selectivity of the entire index, because the
variable values are not known compile time. The query optimizer will
then create a plan for the worst case scenario. If your data is evenly
distributed, then it could use the nonclustered index. However, if your
data distribution is skewed, and there is a combination with a very high
occurrence, then SQL-Server will probably choose a query plan with a
clustered index scan. Whatever query plan is chosen, it is used for all
different variable values.

Please let us know which type of query is actually causing the bad
performance: Type 0, 1 or 2.

Gert-Jan
Igor wrote:

Thanks for the reply.

Unfortunately, the query is as simple as I posted.
I didn't mean forcing usage of an PARTICULAR index, but telling the
optimizer if there is an index on the table with my where clause, use
it. :)
What I ment by dynamically creating indexes is that some tables index
themselves depending on the usage and data in them. The index names are
created dynamically and for me to choose one would mean scanning the
where clause and index names table...

And, if the only way to get the optimizer to actually optimize is to
turn it off, I can use DBF+NTX :))

btw, the irony is that the more complex the query, the smarter the
optimizer gets! :((

Gert-Jan Strik wrote:
Igor,

If you create the tables (and indexes) dynamically, then it should be no
problem for you to name your indexes and constraints. And if you can
name your indexes/constraints, then you can use index hints.

BTW: There is no way to force the use of a particular index without
using hints. Such option would make no sense at all.

SQL-Server uses a cost based optimizer and will minimize I/O, it will
choose indexes that will minimize query execution times.

With a hot cache, an index hint can in some situations increase
performance. Note however that the index hint can become troublesome
when the amount of data grows (or unused space increases), or when the
cache size shrinks (because other processes need the server's memory).

If the query that you are actually using is more complex then the one
below, then please post it. There may be other options...

Hope this helps,
Gert-Jan

Jul 23 '05 #8

P: n/a
Gert-Jan,
Unfortunately, query (0) is the one with a problem! :)
But, i've tried q(2) and this works fine????
Also, I've tried this on a box with 4G of ram - works perfectly (not the
speed, the plan)... If only I could find a way to increase the threshold
for I/O performance... :)

Anyway, thanks a lot for a great post - i'll play around with it some
more, but my nerves are breaking... :)
Gert-Jan Strik wrote:
Igor,

If table CustInv has a nonclustered compound index on
(Customer_Code,Invoice_Number) and you run the query

Select * from CustInv where Customer_Code='ABC' and invoice_Number=2

and this query (0) will only return a few rows, then I would be *very*
surprised if the optimizer did not use the nonclustered index.

However, if what you are actually doing is this (1)

EXEC MySProc 'ABC', 2

or this (2)

Declare @Customer_Code char(10)
Declare @invoice_Number int
Set @Customer_Code='ABC'
Set @invoice_Number=2
Select * from CustInv where Customer_Code=@Customer_Code and
invoice_Number=@invoice_Number

then that is a different story.

Situation 1 is different because of parameter sniffing (check out Google
for more info). If you are using that, adding WITH RECOMPILE to the
stored procdure definition could be a solution.

In situation 2, it is not the statistics of the value pair ('ABC',2)
that is used, but the selectivity of the entire index, because the
variable values are not known compile time. The query optimizer will
then create a plan for the worst case scenario. If your data is evenly
distributed, then it could use the nonclustered index. However, if your
data distribution is skewed, and there is a combination with a very high
occurrence, then SQL-Server will probably choose a query plan with a
clustered index scan. Whatever query plan is chosen, it is used for all
different variable values.

Please let us know which type of query is actually causing the bad
performance: Type 0, 1 or 2.

Gert-Jan
Igor wrote:
Thanks for the reply.

Unfortunately, the query is as simple as I posted.
I didn't mean forcing usage of an PARTICULAR index, but telling the
optimizer if there is an index on the table with my where clause, use
it. :)
What I ment by dynamically creating indexes is that some tables index
themselves depending on the usage and data in them. The index names are
created dynamically and for me to choose one would mean scanning the
where clause and index names table...

And, if the only way to get the optimizer to actually optimize is to
turn it off, I can use DBF+NTX :))

btw, the irony is that the more complex the query, the smarter the
optimizer gets! :((

Gert-Jan Strik wrote:
Igor,

If you create the tables (and indexes) dynamically, then it should be no
problem for you to name your indexes and constraints. And if you can
name your indexes/constraints, then you can use index hints.

BTW: There is no way to force the use of a particular index without
using hints. Such option would make no sense at all.

SQL-Server uses a cost based optimizer and will minimize I/O, it will
choose indexes that will minimize query execution times.

With a hot cache, an index hint can in some situations increase
performance. Note however that the index hint can become troublesome
when the amount of data grows (or unused space increases), or when the
cache size shrinks (because other processes need the server's memory).

If the query that you are actually using is more complex then the one
below, then please post it. There may be other options...

Hope this helps,
Gert-Jan

Jul 23 '05 #9

P: n/a
Igor (bi**@baja.com) writes:
Thanks for the reply!
I agree with you on the threshold of the optimizer. But my problem
manifests itself like this :

My table is about 2,000,000 records, with an estimated growth of about
250,000 a month (don't think a giant index would be nice :))

When a result set will contain 3 records the query time is about 20secs
(stupid thing scans the table)

When a result set will contain 50,000 records (or so the optimizer
thinks), the query time is about 5 secs!
(uses index)
When I show this to my user, I get beat up by a stick! :))


Not seeing the table, its indexes and its plan, it's difficult to say
something useful.

But it is worth keeping in mind that stastistics are normally maintained
for single columns, if your query has two conditions of which each
value is fairly common, but the combination is not, the optimizer
may be confused. You can create two-column statistics, but I don't know
whether they actually have any practical importance.

Assuming that your query is dynamically generated, you could add an
index hint like:

WITH INDEX (2, 3, 4, ...)

where you simply include the index ids of the non-clustered indexes
of the table. You would have to query sysindexes to find them. (You
need to exclude statistics and hypothetical indexes.)

If that doesn't work out, you probably need to reconsider your
strategy with dynamic indexing.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #10

This discussion thread is closed

Replies have been disabled for this discussion.