472,984 Members | 2,123 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,984 software developers and data experts.

Optimizator and indexes

Is there a way to force optimizer to use indexes without hints? (some
server setting or index type...)
I'll give an example to clarify :

I have a table with fields
Customer_Code char(10) not null
Invoice_Number int not null
and an index on those fields IX_1.

there are about 2,000,000 records in the table and those two fields are
not unique, and I cant use clustered index because of the nature of the
table (need it for something else).

When I use query like :

Select * from CustInv where Customer_Code='ABC' and invoice_Number=2

depending on the ammount of data statistics computes, query is executed
using the IX_1 or IX_1 is ignored. The documentation says that using of
indexes is determened by the uniqueness of the data, but my tests show
that every usage of index is faster.
I can fool the optimizer when using query like

Select * from CustInv where Customer_Code='ABC' and invoice_Number>=2
and invoice_number<=2

but all that it does is extends the optimizer's tolerance (IX_1) is
ignored for some queries in that form also.

Because my database and indexes are dynamically created, I cannot use
hints for indexes, so if anyone knows a way to tell the database
something like : "If you have an index on where fields, use it ALWAYS",
the knowledge would be gratly appretiated! :)

Thanx in advance! Sorry for a longer post and maybe fuzzy explanation of
the problem...

Igor
Jul 23 '05 #1
9 1624
Igor,

If you create the tables (and indexes) dynamically, then it should be no
problem for you to name your indexes and constraints. And if you can
name your indexes/constraints, then you can use index hints.

BTW: There is no way to force the use of a particular index without
using hints. Such option would make no sense at all.

SQL-Server uses a cost based optimizer and will minimize I/O, it will
choose indexes that will minimize query execution times.

With a hot cache, an index hint can in some situations increase
performance. Note however that the index hint can become troublesome
when the amount of data grows (or unused space increases), or when the
cache size shrinks (because other processes need the server's memory).

If the query that you are actually using is more complex then the one
below, then please post it. There may be other options...

Hope this helps,
Gert-Jan
Igor wrote:

Is there a way to force optimizer to use indexes without hints? (some
server setting or index type...)
I'll give an example to clarify :

I have a table with fields
Customer_Code char(10) not null
Invoice_Number int not null
and an index on those fields IX_1.

there are about 2,000,000 records in the table and those two fields are
not unique, and I cant use clustered index because of the nature of the
table (need it for something else).

When I use query like :

Select * from CustInv where Customer_Code='ABC' and invoice_Number=2

depending on the ammount of data statistics computes, query is executed
using the IX_1 or IX_1 is ignored. The documentation says that using of
indexes is determened by the uniqueness of the data, but my tests show
that every usage of index is faster.
I can fool the optimizer when using query like

Select * from CustInv where Customer_Code='ABC' and invoice_Number>=2
and invoice_number<=2

but all that it does is extends the optimizer's tolerance (IX_1) is
ignored for some queries in that form also.

Because my database and indexes are dynamically created, I cannot use
hints for indexes, so if anyone knows a way to tell the database
something like : "If you have an index on where fields, use it ALWAYS",
the knowledge would be gratly appretiated! :)

Thanx in advance! Sorry for a longer post and maybe fuzzy explanation of
the problem...

Igor

Jul 23 '05 #2
Igor (bi**@baja.com) writes:
Because my database and indexes are dynamically created, I cannot use
hints for indexes, so if anyone knows a way to tell the database
something like : "If you have an index on where fields, use it ALWAYS",
the knowledge would be gratly appretiated! :)


Believe me, you don't want that. Say that you have:

SELECT * FROM tbl WHERE nonclusteredindexedcol = @value

If there is a good distribution of values in nonclusteredindexedcol,
the index will be good. But say that for 30% of the rows, the value
is the one the same. The cost for using the index rather than scanning
the table, will be considerably lower than using the index, since when
using the index the same page will be accessed more than once.

It is true, though, that the optimizer is overly conservative when it
comes to using non-clustered index, and in my opinion, the threshold
when it switches to table scan is a little low.

For a table where I know my data, I don't shudder for using an index
hint. But for a dynamically created database like yours, this sounds
risky.

A better alternative might be to build non-clustered indexes which
comprises all columns you need. If that is all columns of the table,
that is in practice a second clustered index on the table. It may
not be good for update speed though.

You could also investigate indexed views.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #3
Thanks for the reply.

Unfortunately, the query is as simple as I posted.
I didn't mean forcing usage of an PARTICULAR index, but telling the
optimizer if there is an index on the table with my where clause, use
it. :)
What I ment by dynamically creating indexes is that some tables index
themselves depending on the usage and data in them. The index names are
created dynamically and for me to choose one would mean scanning the
where clause and index names table...

And, if the only way to get the optimizer to actually optimize is to
turn it off, I can use DBF+NTX :))

btw, the irony is that the more complex the query, the smarter the
optimizer gets! :((
Gert-Jan Strik wrote:
Igor,

If you create the tables (and indexes) dynamically, then it should be no
problem for you to name your indexes and constraints. And if you can
name your indexes/constraints, then you can use index hints.

BTW: There is no way to force the use of a particular index without
using hints. Such option would make no sense at all.

SQL-Server uses a cost based optimizer and will minimize I/O, it will
choose indexes that will minimize query execution times.

With a hot cache, an index hint can in some situations increase
performance. Note however that the index hint can become troublesome
when the amount of data grows (or unused space increases), or when the
cache size shrinks (because other processes need the server's memory).

If the query that you are actually using is more complex then the one
below, then please post it. There may be other options...

Hope this helps,
Gert-Jan

Jul 23 '05 #4
Thanks for the reply!
I agree with you on the threshold of the optimizer. But my problem
manifests itself like this :

My table is about 2,000,000 records, with an estimated growth of about
250,000 a month (don't think a giant index would be nice :))

When a result set will contain 3 records the query time is about 20secs
(stupid thing scans the table)

When a result set will contain 50,000 records (or so the optimizer
thinks), the query time is about 5 secs!
(uses index)
When I show this to my user, I get beat up by a stick! :))

Also, the server has only 1G of ram, so it cannot cache everything (if
it could, i wouldn't have noticed this) and is a dedicated server (only
SQL2000).
Not dealing with the inteligence (stupidity?) of the optimizer, is there
a way to increase the threshold for the table scan???

database is optimized every night and is currently about 10G!

Erland Sommarskog wrote:
Igor (bi**@baja.com) writes:
Because my database and indexes are dynamically created, I cannot use
hints for indexes, so if anyone knows a way to tell the database
something like : "If you have an index on where fields, use it ALWAYS",
the knowledge would be gratly appretiated! :)

Believe me, you don't want that. Say that you have:

SELECT * FROM tbl WHERE nonclusteredindexedcol = @value

If there is a good distribution of values in nonclusteredindexedcol,
the index will be good. But say that for 30% of the rows, the value
is the one the same. The cost for using the index rather than scanning
the table, will be considerably lower than using the index, since when
using the index the same page will be accessed more than once.

It is true, though, that the optimizer is overly conservative when it
comes to using non-clustered index, and in my opinion, the threshold
when it switches to table scan is a little low.

For a table where I know my data, I don't shudder for using an index
hint. But for a dynamically created database like yours, this sounds
risky.

A better alternative might be to build non-clustered indexes which
comprises all columns you need. If that is all columns of the table,
that is in practice a second clustered index on the table. It may
not be good for update speed though.

You could also investigate indexed views.

Jul 23 '05 #5
Hi Igor
have you tried manually updating statistics more often?

Jul 23 '05 #6
Tzvika Barenholz wrote:
Hi Igor
have you tried manually updating statistics more often?


Yes. Database has a plan to do it every night. but optimizer decides on
the basis of the statistics which is bad in my case... :(
Jul 23 '05 #7
Igor,

If table CustInv has a nonclustered compound index on
(Customer_Code,Invoice_Number) and you run the query

Select * from CustInv where Customer_Code='ABC' and invoice_Number=2

and this query (0) will only return a few rows, then I would be *very*
surprised if the optimizer did not use the nonclustered index.

However, if what you are actually doing is this (1)

EXEC MySProc 'ABC', 2

or this (2)

Declare @Customer_Code char(10)
Declare @invoice_Number int
Set @Customer_Code='ABC'
Set @invoice_Number=2
Select * from CustInv where Customer_Code=@Customer_Code and
invoice_Number=@invoice_Number

then that is a different story.

Situation 1 is different because of parameter sniffing (check out Google
for more info). If you are using that, adding WITH RECOMPILE to the
stored procdure definition could be a solution.

In situation 2, it is not the statistics of the value pair ('ABC',2)
that is used, but the selectivity of the entire index, because the
variable values are not known compile time. The query optimizer will
then create a plan for the worst case scenario. If your data is evenly
distributed, then it could use the nonclustered index. However, if your
data distribution is skewed, and there is a combination with a very high
occurrence, then SQL-Server will probably choose a query plan with a
clustered index scan. Whatever query plan is chosen, it is used for all
different variable values.

Please let us know which type of query is actually causing the bad
performance: Type 0, 1 or 2.

Gert-Jan
Igor wrote:

Thanks for the reply.

Unfortunately, the query is as simple as I posted.
I didn't mean forcing usage of an PARTICULAR index, but telling the
optimizer if there is an index on the table with my where clause, use
it. :)
What I ment by dynamically creating indexes is that some tables index
themselves depending on the usage and data in them. The index names are
created dynamically and for me to choose one would mean scanning the
where clause and index names table...

And, if the only way to get the optimizer to actually optimize is to
turn it off, I can use DBF+NTX :))

btw, the irony is that the more complex the query, the smarter the
optimizer gets! :((

Gert-Jan Strik wrote:
Igor,

If you create the tables (and indexes) dynamically, then it should be no
problem for you to name your indexes and constraints. And if you can
name your indexes/constraints, then you can use index hints.

BTW: There is no way to force the use of a particular index without
using hints. Such option would make no sense at all.

SQL-Server uses a cost based optimizer and will minimize I/O, it will
choose indexes that will minimize query execution times.

With a hot cache, an index hint can in some situations increase
performance. Note however that the index hint can become troublesome
when the amount of data grows (or unused space increases), or when the
cache size shrinks (because other processes need the server's memory).

If the query that you are actually using is more complex then the one
below, then please post it. There may be other options...

Hope this helps,
Gert-Jan

Jul 23 '05 #8
Gert-Jan,
Unfortunately, query (0) is the one with a problem! :)
But, i've tried q(2) and this works fine????
Also, I've tried this on a box with 4G of ram - works perfectly (not the
speed, the plan)... If only I could find a way to increase the threshold
for I/O performance... :)

Anyway, thanks a lot for a great post - i'll play around with it some
more, but my nerves are breaking... :)
Gert-Jan Strik wrote:
Igor,

If table CustInv has a nonclustered compound index on
(Customer_Code,Invoice_Number) and you run the query

Select * from CustInv where Customer_Code='ABC' and invoice_Number=2

and this query (0) will only return a few rows, then I would be *very*
surprised if the optimizer did not use the nonclustered index.

However, if what you are actually doing is this (1)

EXEC MySProc 'ABC', 2

or this (2)

Declare @Customer_Code char(10)
Declare @invoice_Number int
Set @Customer_Code='ABC'
Set @invoice_Number=2
Select * from CustInv where Customer_Code=@Customer_Code and
invoice_Number=@invoice_Number

then that is a different story.

Situation 1 is different because of parameter sniffing (check out Google
for more info). If you are using that, adding WITH RECOMPILE to the
stored procdure definition could be a solution.

In situation 2, it is not the statistics of the value pair ('ABC',2)
that is used, but the selectivity of the entire index, because the
variable values are not known compile time. The query optimizer will
then create a plan for the worst case scenario. If your data is evenly
distributed, then it could use the nonclustered index. However, if your
data distribution is skewed, and there is a combination with a very high
occurrence, then SQL-Server will probably choose a query plan with a
clustered index scan. Whatever query plan is chosen, it is used for all
different variable values.

Please let us know which type of query is actually causing the bad
performance: Type 0, 1 or 2.

Gert-Jan
Igor wrote:
Thanks for the reply.

Unfortunately, the query is as simple as I posted.
I didn't mean forcing usage of an PARTICULAR index, but telling the
optimizer if there is an index on the table with my where clause, use
it. :)
What I ment by dynamically creating indexes is that some tables index
themselves depending on the usage and data in them. The index names are
created dynamically and for me to choose one would mean scanning the
where clause and index names table...

And, if the only way to get the optimizer to actually optimize is to
turn it off, I can use DBF+NTX :))

btw, the irony is that the more complex the query, the smarter the
optimizer gets! :((

Gert-Jan Strik wrote:
Igor,

If you create the tables (and indexes) dynamically, then it should be no
problem for you to name your indexes and constraints. And if you can
name your indexes/constraints, then you can use index hints.

BTW: There is no way to force the use of a particular index without
using hints. Such option would make no sense at all.

SQL-Server uses a cost based optimizer and will minimize I/O, it will
choose indexes that will minimize query execution times.

With a hot cache, an index hint can in some situations increase
performance. Note however that the index hint can become troublesome
when the amount of data grows (or unused space increases), or when the
cache size shrinks (because other processes need the server's memory).

If the query that you are actually using is more complex then the one
below, then please post it. There may be other options...

Hope this helps,
Gert-Jan

Jul 23 '05 #9
Igor (bi**@baja.com) writes:
Thanks for the reply!
I agree with you on the threshold of the optimizer. But my problem
manifests itself like this :

My table is about 2,000,000 records, with an estimated growth of about
250,000 a month (don't think a giant index would be nice :))

When a result set will contain 3 records the query time is about 20secs
(stupid thing scans the table)

When a result set will contain 50,000 records (or so the optimizer
thinks), the query time is about 5 secs!
(uses index)
When I show this to my user, I get beat up by a stick! :))


Not seeing the table, its indexes and its plan, it's difficult to say
something useful.

But it is worth keeping in mind that stastistics are normally maintained
for single columns, if your query has two conditions of which each
value is fairly common, but the combination is not, the optimizer
may be confused. You can create two-column statistics, but I don't know
whether they actually have any practical importance.

Assuming that your query is dynamically generated, you could add an
index hint like:

WITH INDEX (2, 3, 4, ...)

where you simply include the index ids of the non-clustered indexes
of the table. You would have to query sysindexes to find them. (You
need to exclude statistics and hypothetical indexes.)

If that doesn't work out, you probably need to reconsider your
strategy with dynamic indexing.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: Paul | last post by:
I'm confused about something I did this week. I have an 8.1.7.4 database with tables partitioned by dates. I recently dropped my maxvalue tablespace, added a few more partitions for up-coming dates,...
2
by: Chris | last post by:
Any help would be appreciated. I am running a script that does the following in succession. 1-Drop existing database and create new database 2-Defines tables, stored procedures and functions...
1
by: Steve_CA | last post by:
Hi, The more I read, the more confused I'm getting ! (no wonder they say ignorance is bliss) I just got back from the bookstore and was flipping through some SQL Server Administration...
9
by: pheonix1t | last post by:
hello, I've been assigned to do performance tuning on an SQL2000 database (around 10GB in size, several instances). So far, I see a single RAID5 array, 4CPU (xeon 700MHZ), 4GB RAM. I see the...
4
by: mairhtin o'feannag | last post by:
Hello, I want to ask the question "how large is each of the indexes in a particular tablespace??" since I want to know which user-created indexes are taking up huge amounts of space. I used the...
14
by: Jeff | last post by:
This is the first time that I remember ever having too many indexes on a table, but it has happened. I have en employees table and store in many places, on tables, the id of the employee that...
17
by: boa | last post by:
I'm currently planning disk layouts and use for a new version of our database. The current version has all data and indexes in the default filegroup, placed on one big raid-5 array(6 drives) along...
10
by: lesperancer | last post by:
you start with a small application in access97, then you have more modules and more... and you reach the point where tables like 'item' and 'employee' reach the limit and you know there's more...
0
MMcCarthy
by: MMcCarthy | last post by:
The more data you include in your tables the more you will need to have indexes to search and sort that data. However, there is a balance between having enough indexes and too many. Too many...
0
by: lllomh | last post by:
Define the method first this.state = { buttonBackgroundColor: 'green', isBlinking: false, // A new status is added to identify whether the button is blinking or not } autoStart=()=>{
0
by: Aliciasmith | last post by:
In an age dominated by smartphones, having a mobile app for your business is no longer an option; it's a necessity. Whether you're a startup or an established enterprise, finding the right mobile app...
0
tracyyun
by: tracyyun | last post by:
Hello everyone, I have a question and would like some advice on network connectivity. I have one computer connected to my router via WiFi, but I have two other computers that I want to be able to...
2
by: giovanniandrean | last post by:
The energy model is structured as follows and uses excel sheets to give input data: 1-Utility.py contains all the functions needed to calculate the variables and other minor things (mentions...
3
NeoPa
by: NeoPa | last post by:
Introduction For this article I'll be using a very simple database which has Form (clsForm) & Report (clsReport) classes that simply handle making the calling Form invisible until the Form, or all...
1
by: Teri B | last post by:
Hi, I have created a sub-form Roles. In my course form the user selects the roles assigned to the course. 0ne-to-many. One course many roles. Then I created a report based on the Course form and...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 1 Nov 2023 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM) Please note that the UK and Europe revert to winter time on...
0
NeoPa
by: NeoPa | last post by:
Introduction For this article I'll be focusing on the Report (clsReport) class. This simply handles making the calling Form invisible until all of the Reports opened by it have been closed, when it...
0
isladogs
by: isladogs | last post by:
The next online meeting of the Access Europe User Group will be on Wednesday 6 Dec 2023 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, Mike...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.