By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
429,313 Members | 2,725 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 429,313 IT Pros & Developers. It's quick & easy.

Index Design Recommendation - Examine Column Uniqueness

P: n/a
I am reading "SQL Server Query Performance Tuning Distilled",
on page 104 it talks about one of the index design recommendations
which is to choose the column that has very high selectivity of values
instead of a column that has very few selectivity of values.

My question is if I have currently indexes on my tables that have
1, 2, 3, 4, ... values only on thousands of rows, are these nonclustered
indexes pretty much useless indexes that I should get rid of?

And I know that pretty much the number of selectivity values will
always remain very low.

Thank you

Nov 30 '05 #1
Share this Question
Share on Google+
1 Reply


P: n/a
serge (se****@nospam.ehmail.com) writes:
I am reading "SQL Server Query Performance Tuning Distilled",
on page 104 it talks about one of the index design recommendations
which is to choose the column that has very high selectivity of values
instead of a column that has very few selectivity of values.

My question is if I have currently indexes on my tables that have
1, 2, 3, 4, ... values only on thousands of rows, are these nonclustered
indexes pretty much useless indexes that I should get rid of?

And I know that pretty much the number of selectivity values will
always remain very low.


As always in the database world, it depends. An index on a bit column sound
like a bad idea in general, but consider this query:

SELECT ... FROM tbl WHERE unprocessed = convert(bit, 0)

Typically in such a table, there will be only a small number of unprocessed
rows, so the column is very selective for unprocessed = 0, and you almost
need an index on unprocessed here. (And for the index to be useful, you need
the convert as well, a subtlety with SQL Server data-type precedence.)

It also matters here whether the index is clustered or not. To continue with
the bit column, a non-clustered index on a bit column with a 50/50 split
is useless (almost see below), where as a clustered index actually reduces
the scan to only half of the table. Take this a little further and consider
a column with ten different values with equal distribution. The non-
clustered index is still not much of use, where as a clustered index reduces
the reads for a query like:

SELECT ... FROM tbl WHERE col = 'G' AND ...

to 10% of a full scan.

The reason the non-clustered index is useless, is because the optimizer
will find it more expensive to seek the index and then look up rows from
the data pages.

But all this changes if all you read is columns from the index. Consider
the bit column with a 50/50 split, and assume that you often need to run

SELECT bitcol, COUNT(*) FROM tbl GROUP BY bitcol

The non-clustered index is now a covering index and very useful.

So bottom line is: good indexes are indexes that are used.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Nov 30 '05 #2

This discussion thread is closed

Replies have been disabled for this discussion.