Index Design Recommendation - Examine Column Uniqueness

serge

I am reading "SQL Server Query Performance Tuning Distilled".
On page 104 it gives one of the index design recommendations:
choose a column with very high selectivity (many distinct values)
over a column with low selectivity (few distinct values).

My question: I currently have indexes on columns that hold only a few
distinct values (1, 2, 3, 4, ...) across thousands of rows. Are these
nonclustered indexes pretty much useless, and should I get rid of them?

I also know that the number of distinct values in these columns will
always remain very low.

Thank you

Nov 30 '05 #1

Erland Sommarskog

serge (se****@nospam.ehmail.com) writes:

> I am reading "SQL Server Query Performance Tuning Distilled".
> On page 104 it gives one of the index design recommendations:
> choose a column with very high selectivity (many distinct values)
> over a column with low selectivity (few distinct values).
>
> My question: I currently have indexes on columns that hold only a few
> distinct values (1, 2, 3, 4, ...) across thousands of rows. Are these
> nonclustered indexes pretty much useless, and should I get rid of them?
>
> I also know that the number of distinct values in these columns will
> always remain very low.

As always in the database world, it depends. An index on a bit column sounds
like a bad idea in general, but consider this query:

SELECT ... FROM tbl WHERE unprocessed = convert(bit, 0)

Typically in such a table, only a small number of rows are unprocessed,
so the column is very selective for unprocessed = 0, and you all but
need an index on unprocessed here. (And for the index to be useful, you need
the convert as well, a subtlety of SQL Server data-type precedence: a bare
0 is an integer literal, and int outranks bit.)
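
A minimal sketch of that case in T-SQL. Only the column unprocessed and the
WHERE clause come from the query above; the table layout (id, payload) and
the index name ix_tbl_unprocessed are made up for illustration.

```sql
-- Hypothetical table: only the column 'unprocessed' comes from the post.
CREATE TABLE dbo.tbl (
    id          int          NOT NULL PRIMARY KEY,
    payload     varchar(100) NOT NULL,
    unprocessed bit          NOT NULL
);

-- Nonclustered index on the low-cardinality bit column.
CREATE NONCLUSTERED INDEX ix_tbl_unprocessed ON dbo.tbl (unprocessed);

-- Check the real distribution first: the index only pays off if the
-- value you search for is rare.
SELECT unprocessed, COUNT(*) AS row_count
FROM dbo.tbl
GROUP BY unprocessed;

-- Explicit convert so the comparison is bit-to-bit.
SELECT id, payload
FROM dbo.tbl
WHERE unprocessed = CONVERT(bit, 0);
```

If the distribution query shows that the searched value covers a large share
of the rows, the index is unlikely to be chosen for this query.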

It also matters here whether the index is clustered or not. To continue with
the bit column, a non-clustered index on a bit column with a 50/50 split
is useless (almost; see below), whereas a clustered index actually reduces
the scan to only half of the table. Take this a little further and consider
a column with ten different values, equally distributed. The non-clustered
index is still not of much use, whereas a clustered index reduces
the reads for a query like:

SELECT ... FROM tbl WHERE col = 'G' AND ...

to 10% of a full scan.

The non-clustered index is useless because the optimizer will find it
more expensive to seek the index and then look up each matching row in
the data pages than to scan the whole table.
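
One way to check this on your own data is to compare the logical reads of
the competing plans. The sketch below assumes a hypothetical table dbo.tbl
with columns id, payload and col; the index name ix_tbl_col is also made up.
The numbers you get depend entirely on your table, so none are shown here.

```sql
-- Hypothetical nonclustered index on the ten-value column.
CREATE NONCLUSTERED INDEX ix_tbl_col ON dbo.tbl (col);

-- Report logical reads per statement in the Messages output.
SET STATISTICS IO ON;

-- 1. Let the optimizer choose the plan.
SELECT id, payload FROM dbo.tbl WHERE col = 'G';

-- 2. Force the nonclustered index: a seek plus one lookup per matching row.
SELECT id, payload FROM dbo.tbl WITH (INDEX(ix_tbl_col)) WHERE col = 'G';

-- 3. Force a scan of the base table (clustered index scan or table scan).
SELECT id, payload FROM dbo.tbl WITH (INDEX(0)) WHERE col = 'G';

SET STATISTICS IO OFF;
```

The hints are only there for the comparison; they are not something to
leave in production queries.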

But all this changes if all you read is columns from the index. Consider
the bit column with a 50/50 split, and assume that you often need to run

SELECT bitcol, COUNT(*) FROM tbl GROUP BY bitcol

The non-clustered index is now a covering index and very useful.
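
As a sketch, again assuming a hypothetical table dbo.tbl with a bit column
bitcol and a made-up index name:

```sql
-- The index contains every column the query touches, so it covers the query.
CREATE NONCLUSTERED INDEX ix_tbl_bitcol ON dbo.tbl (bitcol);

-- Can be answered from the index alone, without visiting the data pages.
SELECT bitcol, COUNT(*) AS row_count
FROM dbo.tbl
GROUP BY bitcol;
```

The index pages are normally much narrower than the data pages, so scanning
them should cost far fewer reads than scanning the table.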

So the bottom line is: good indexes are indexes that are used.
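
On SQL Server 2005 and later, one way to see whether an index is used is the
sys.dm_db_index_usage_stats view. This is a sketch, not a complete audit:
the counters are reset when SQL Server restarts, so a row of zeros (or a
missing row) only means the index has not been used since then.

```sql
-- Usage counters for nonclustered indexes on user tables in the
-- current database. NULL counters mean no recorded use since startup.
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       s.user_seeks,
       s.user_scans,
       s.user_lookups,
       s.user_updates           -- maintenance cost paid on writes
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON s.object_id   = i.object_id
      AND s.index_id    = i.index_id
      AND s.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.type_desc = 'NONCLUSTERED'
ORDER BY COALESCE(s.user_seeks, 0) + COALESCE(s.user_scans, 0);
```

An index with many user_updates and no seeks or scans over a representative
period is a candidate for removal.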

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx

Nov 30 '05 #2
