By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
429,435 Members | 2,036 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 429,435 IT Pros & Developers. It's quick & easy.

clustered vs. non clustered

P: n/a
I've been doing a bit of reading and have read in quite a few places
that an identity column is a good clustered index and that all or at
least most tables should have a clustered index. The tool I used to
generate tables made them all with non clustered indexes so I would
like to drop all of them and generate clustered indexes. So my
questions is a) good idea? and b) how? There are foreign key references
to most of them so those would need to be dropped first and then
re-created after the clustered one was created and that could cascade
(I think?)

Any existing scripts out there that might do this? I found something
similar and modified it, the sql is included below. This gives me the
list of all the columns I need, I just need to get the foreign keys for
each from here before each one and generate all the create/drop
scripts.

All the columns I am looking to do this for are called "Id" making this
somewhat simpler. I'm just looking to incrementally make the SQL side
better and don't want to rewrite a bunch of application level code to
make the column names ISO compliant, etc.

/*
-- Returns whether the column is ASC or DESC
CREATE FUNCTION dbo.GetIndexColumnOrder
(
@object_id INT,
@index_id TINYINT,
@column_id TINYINT
)
RETURNS NVARCHAR(5)
AS
BEGIN
DECLARE @r NVARCHAR(5)
SELECT @r = CASE INDEXKEY_PROPERTY
(
@object_id,
@index_id,
@column_id,
'IsDescending'
)
WHEN 1 THEN N' DESC'
ELSE N''
END
RETURN @r
END

-- Returns the list of columns in the index
CREATE FUNCTION dbo.GetIndexColumns
(
@table_name SYSNAME,
@object_id INT,
@index_id TINYINT
)
RETURNS NVARCHAR(4000)
AS
BEGIN
DECLARE
@colnames NVARCHAR(4000),
@thisColID INT,
@thisColName SYSNAME

SET @colnames = INDEX_COL(@table_name, @index_id, 1)
+ dbo.GetIndexColumnOrder(@object_id, @index_id, 1)

SET @thisColID = 2
SET @thisColName = INDEX_COL(@table_name, @index_id, @thisColID)
+ dbo.GetIndexColumnOrder(@object_id, @index_id, @thisColID)

WHILE (@thisColName IS NOT NULL)
BEGIN
SET @thisColID = @thisColID + 1
SET @colnames = @colnames + ', ' + @thisColName

SET @thisColName = INDEX_COL(@table_name, @index_id,
@thisColID)
+ dbo.GetIndexColumnOrder(@object_id, @index_id,
@thisColID)
END
RETURN @colNames
END

CREATE VIEW dbo.vAllIndexes
AS
begin
SELECT
TABLE_NAME = OBJECT_NAME(i.id),
INDEX_NAME = i.name,
COLUMN_LIST = dbo.GetIndexColumns(OBJECT_NAME(i.id), i.id,
i.indid),
IS_CLUSTERED = INDEXPROPERTY(i.id, i.name, 'IsClustered'),
IS_UNIQUE = INDEXPROPERTY(i.id, i.name, 'IsUnique'),
FILE_GROUP = g.GroupName
FROM
sysindexes i
INNER JOIN
sysfilegroups g
ON
i.groupid = g.groupid
WHERE
(i.indid BETWEEN 1 AND 254)
-- leave out AUTO_STATISTICS:
AND (i.Status & 64)=0
-- leave out system tables:
AND OBJECTPROPERTY(i.id, 'IsMsShipped') = 0
end
*/

SELECT
v.*
FROM
dbo.vAllIndexes v
INNER JOIN
INFORMATION_SCHEMA.TABLE_CONSTRAINTS T
ON
T.CONSTRAINT_NAME = v.INDEX_NAME
AND T.TABLE_NAME = v.TABLE_NAME
AND T.CONSTRAINT_TYPE = 'PRIMARY KEY'
AND v.COLUMN_LIST = 'Id'
AND v.IS_CLUSTERED = 0
ORDER BY v.TABLE_NAME

Aug 14 '06 #1
Share this Question
Share on Google+
5 Replies


P: n/a
Stu
It's OK to have a clustered index that is seperate from your
nonclustered primary key, even if the two indexes cover the same
columns. In fact, I usually build my indexes in this way in case I
ever have to move the clustered index to a different column and I don't
want to mess with my established foreign key constraints.

That being said, I would simply add the clustered index to each table
and not worry about dropping the pre-existing primary key constraint.
It'll take a while, but it will work.

Stu
pb648174 wrote:
I've been doing a bit of reading and have read in quite a few places
that an identity column is a good clustered index and that all or at
least most tables should have a clustered index. The tool I used to
generate tables made them all with non clustered indexes so I would
like to drop all of them and generate clustered indexes. So my
questions is a) good idea? and b) how? There are foreign key references
to most of them so those would need to be dropped first and then
re-created after the clustered one was created and that could cascade
(I think?)

Any existing scripts out there that might do this? I found something
similar and modified it, the sql is included below. This gives me the
list of all the columns I need, I just need to get the foreign keys for
each from here before each one and generate all the create/drop
scripts.

All the columns I am looking to do this for are called "Id" making this
somewhat simpler. I'm just looking to incrementally make the SQL side
better and don't want to rewrite a bunch of application level code to
make the column names ISO compliant, etc.

/*
-- Returns whether the column is ASC or DESC
CREATE FUNCTION dbo.GetIndexColumnOrder
(
@object_id INT,
@index_id TINYINT,
@column_id TINYINT
)
RETURNS NVARCHAR(5)
AS
BEGIN
DECLARE @r NVARCHAR(5)
SELECT @r = CASE INDEXKEY_PROPERTY
(
@object_id,
@index_id,
@column_id,
'IsDescending'
)
WHEN 1 THEN N' DESC'
ELSE N''
END
RETURN @r
END

-- Returns the list of columns in the index
CREATE FUNCTION dbo.GetIndexColumns
(
@table_name SYSNAME,
@object_id INT,
@index_id TINYINT
)
RETURNS NVARCHAR(4000)
AS
BEGIN
DECLARE
@colnames NVARCHAR(4000),
@thisColID INT,
@thisColName SYSNAME

SET @colnames = INDEX_COL(@table_name, @index_id, 1)
+ dbo.GetIndexColumnOrder(@object_id, @index_id, 1)

SET @thisColID = 2
SET @thisColName = INDEX_COL(@table_name, @index_id, @thisColID)
+ dbo.GetIndexColumnOrder(@object_id, @index_id, @thisColID)

WHILE (@thisColName IS NOT NULL)
BEGIN
SET @thisColID = @thisColID + 1
SET @colnames = @colnames + ', ' + @thisColName

SET @thisColName = INDEX_COL(@table_name, @index_id,
@thisColID)
+ dbo.GetIndexColumnOrder(@object_id, @index_id,
@thisColID)
END
RETURN @colNames
END

CREATE VIEW dbo.vAllIndexes
AS
begin
SELECT
TABLE_NAME = OBJECT_NAME(i.id),
INDEX_NAME = i.name,
COLUMN_LIST = dbo.GetIndexColumns(OBJECT_NAME(i.id), i.id,
i.indid),
IS_CLUSTERED = INDEXPROPERTY(i.id, i.name, 'IsClustered'),
IS_UNIQUE = INDEXPROPERTY(i.id, i.name, 'IsUnique'),
FILE_GROUP = g.GroupName
FROM
sysindexes i
INNER JOIN
sysfilegroups g
ON
i.groupid = g.groupid
WHERE
(i.indid BETWEEN 1 AND 254)
-- leave out AUTO_STATISTICS:
AND (i.Status & 64)=0
-- leave out system tables:
AND OBJECTPROPERTY(i.id, 'IsMsShipped') = 0
end
*/

SELECT
v.*
FROM
dbo.vAllIndexes v
INNER JOIN
INFORMATION_SCHEMA.TABLE_CONSTRAINTS T
ON
T.CONSTRAINT_NAME = v.INDEX_NAME
AND T.TABLE_NAME = v.TABLE_NAME
AND T.CONSTRAINT_TYPE = 'PRIMARY KEY'
AND v.COLUMN_LIST = 'Id'
AND v.IS_CLUSTERED = 0
ORDER BY v.TABLE_NAME
Aug 14 '06 #2

P: n/a
pb648174 (go****@webpaul.net) writes:
I've been doing a bit of reading and have read in quite a few places
that an identity column is a good clustered index and that all or at
least most tables should have a clustered index. The tool I used to
generate tables made them all with non clustered indexes so I would
like to drop all of them and generate clustered indexes.
Yes, having clustered indexes on all tables is a good idea, but the
IDENTITY column is not always the best choice. It's a good choice if
you have a high transaction rate, and you want to avoid fragmentation
and page splits.

But for SELECT queries it is likely that in most tables that there
are better candidates for the clustered index, as you don't do
range queries on ids that often. So I would suggest that you review
your tables and look for better columns to cluster on.

Here I had single-column PKs in mind. Clustering on a multi-column PK,
or part of it is another matter. Take an OrderDetails table for instance.
"SELECT ... FROM OrderDetails WHERE OrderID = @id" is a very likely
query and a clustred index may be great here.

Stu's suggestion of keeping the PK non-clustered, and adding a clustered
index as well is not that bad. If you have a multi-column key that is 4
30 bytes long, but the first key column is four bytes, the clustering on
the first columns means that the key size for the clustered index is
only 8 bytes. (key col + uniquifier). Since cluster-key colunms appear
in non-clustered index, this matters quite a bit.

As for looking up the foreign keys, the tables are sysreferences in
SQL 2000 and sys.forein_keys in SQL 2005.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Aug 14 '06 #3

P: n/a
Well that makes things simpler then.. I'll try adding the clustered
columns to one area of the app and see if it makes a positive or
negative performance impact. Thanks for the info guys.

Erland Sommarskog wrote:
pb648174 (go****@webpaul.net) writes:
I've been doing a bit of reading and have read in quite a few places
that an identity column is a good clustered index and that all or at
least most tables should have a clustered index. The tool I used to
generate tables made them all with non clustered indexes so I would
like to drop all of them and generate clustered indexes.

Yes, having clustered indexes on all tables is a good idea, but the
IDENTITY column is not always the best choice. It's a good choice if
you have a high transaction rate, and you want to avoid fragmentation
and page splits.

But for SELECT queries it is likely that in most tables that there
are better candidates for the clustered index, as you don't do
range queries on ids that often. So I would suggest that you review
your tables and look for better columns to cluster on.

Here I had single-column PKs in mind. Clustering on a multi-column PK,
or part of it is another matter. Take an OrderDetails table for instance.
"SELECT ... FROM OrderDetails WHERE OrderID = @id" is a very likely
query and a clustred index may be great here.

Stu's suggestion of keeping the PK non-clustered, and adding a clustered
index as well is not that bad. If you have a multi-column key that is 4
30 bytes long, but the first key column is four bytes, the clustering on
the first columns means that the key size for the clustered index is
only 8 bytes. (key col + uniquifier). Since cluster-key colunms appear
in non-clustered index, this matters quite a bit.

As for looking up the foreign keys, the tables are sysreferences in
SQL 2000 and sys.forein_keys in SQL 2005.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Aug 15 '06 #4

P: n/a
Performance was actually worse once I added the clustered index. A
query that takes 4 seconds took 5 seconds after adding clustered
indexes to all the tables for a particular module. I turned the actual
execution plan display on and saw that it was using the clustered index
instead of the non clustered. So without the clustered index the
largest time used is an "index seek" and a "table spool/lazy spool" and
with the clustered index the index seek just becomes a clustered index
seek... No big difference except it takes longer!

pb648174 wrote:
Well that makes things simpler then.. I'll try adding the clustered
columns to one area of the app and see if it makes a positive or
negative performance impact. Thanks for the info guys.

Erland Sommarskog wrote:
pb648174 (go****@webpaul.net) writes:
I've been doing a bit of reading and have read in quite a few places
that an identity column is a good clustered index and that all or at
least most tables should have a clustered index. The tool I used to
generate tables made them all with non clustered indexes so I would
like to drop all of them and generate clustered indexes.
Yes, having clustered indexes on all tables is a good idea, but the
IDENTITY column is not always the best choice. It's a good choice if
you have a high transaction rate, and you want to avoid fragmentation
and page splits.

But for SELECT queries it is likely that in most tables that there
are better candidates for the clustered index, as you don't do
range queries on ids that often. So I would suggest that you review
your tables and look for better columns to cluster on.

Here I had single-column PKs in mind. Clustering on a multi-column PK,
or part of it is another matter. Take an OrderDetails table for instance.
"SELECT ... FROM OrderDetails WHERE OrderID = @id" is a very likely
query and a clustred index may be great here.

Stu's suggestion of keeping the PK non-clustered, and adding a clustered
index as well is not that bad. If you have a multi-column key that is 4
30 bytes long, but the first key column is four bytes, the clustering on
the first columns means that the key size for the clustered index is
only 8 bytes. (key col + uniquifier). Since cluster-key colunms appear
in non-clustered index, this matters quite a bit.

As for looking up the foreign keys, the tables are sysreferences in
SQL 2000 and sys.forein_keys in SQL 2005.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Aug 16 '06 #5

P: n/a
Index tuning is not black-and-white, especially when it comes to the
clustered index decision. It is likely than some queries will benefit by
the PK clustered index while others will not. You'll need run a mix of
queries that is representative of the actual workload mix to ascertain
overall performance impact. IMHO, an all-or-nothing clustered index
decision is naive.

It is also possible that some tables will benefit with the clustered PK and
others will not. I know that this adds a wrinkle to automated schema
generation but this is reality. You might consider using the Index Tuning
Wizard (SQL 2000) or Database Engine Tuning Advisor (SQL 2005) for index
recommendations based on workload.
--
Hope this helps.

Dan Guzman
SQL Server MVP

"pb648174" <go****@webpaul.netwrote in message
news:11**********************@74g2000cwt.googlegro ups.com...
Performance was actually worse once I added the clustered index. A
query that takes 4 seconds took 5 seconds after adding clustered
indexes to all the tables for a particular module. I turned the actual
execution plan display on and saw that it was using the clustered index
instead of the non clustered. So without the clustered index the
largest time used is an "index seek" and a "table spool/lazy spool" and
with the clustered index the index seek just becomes a clustered index
seek... No big difference except it takes longer!

pb648174 wrote:
>Well that makes things simpler then.. I'll try adding the clustered
columns to one area of the app and see if it makes a positive or
negative performance impact. Thanks for the info guys.

Erland Sommarskog wrote:
pb648174 (go****@webpaul.net) writes:
I've been doing a bit of reading and have read in quite a few places
that an identity column is a good clustered index and that all or at
least most tables should have a clustered index. The tool I used to
generate tables made them all with non clustered indexes so I would
like to drop all of them and generate clustered indexes.

Yes, having clustered indexes on all tables is a good idea, but the
IDENTITY column is not always the best choice. It's a good choice if
you have a high transaction rate, and you want to avoid fragmentation
and page splits.

But for SELECT queries it is likely that in most tables that there
are better candidates for the clustered index, as you don't do
range queries on ids that often. So I would suggest that you review
your tables and look for better columns to cluster on.

Here I had single-column PKs in mind. Clustering on a multi-column PK,
or part of it is another matter. Take an OrderDetails table for
instance.
"SELECT ... FROM OrderDetails WHERE OrderID = @id" is a very likely
query and a clustred index may be great here.

Stu's suggestion of keeping the PK non-clustered, and adding a
clustered
index as well is not that bad. If you have a multi-column key that is 4
30 bytes long, but the first key column is four bytes, the clustering
on
the first columns means that the key size for the clustered index is
only 8 bytes. (key col + uniquifier). Since cluster-key colunms appear
in non-clustered index, this matters quite a bit.

As for looking up the foreign keys, the tables are sysreferences in
SQL 2000 and sys.forein_keys in SQL 2005.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx

Aug 16 '06 #6

This discussion thread is closed

Replies have been disabled for this discussion.