Hi,
I have been running some queries against a table in a my database and
have noted an odd (at least it seems odd to me) performance issue.
The table has approximately 5 million rows and includes the following
columns:
DocID (INTEGER, PRIMARY KEY, CLUSTERED)
IsRecord (INTEGER, NONCLUSTERED)
Title (VARCHAR(255), NONCLUSTERED)
If I issue the following query:
SELECT DocID, IsRecord FROM DocTable WHERE Title LIKE '%process%'
it takes about 23 seconds to return the 481 hits.
The execution plan shows a non-clustered index scan being performed on
the Title index (returning 481 rows) and a non-clustered index scan on
the IsRecord index (returning 4.9 million rows). These are then merged
in a hash match/inner join operation.
The Title index scan has an estimated row size of 41 and an I/O cost
of 9.82 (cost is 27%). The IsRecord index scan has an estimated row
size of 33 and an I/O cost of 6.32 (cost is 21%). The Hash Match
accounts for a further 52% of the cose with the SELECT at the head of
the plan listed as 0% cost.
If I issue the following query:
SELECT DocID, Title FROM DocTable WHERE Title LIKE '%process%'
it takes about 12 seconds to return the 481 hits and consists solely
of a non-clustered index scan of the Title Index.
Again the Title index scan has an estimated row size of 41 and an I/O
cost of 9.82 ans it's cost is listed as 78%. The SELECT at the head of
the plan is attributed the other 22% of the cost.
All this is fine, however when I issue the following query:
SELECT DocID, Title, IsRecord FROM DocTable WHERE Title LIKE
'%process%'
it takes 1 minute 50 seconds to run the query. The execution plans
shows that a clustered index scan is occurring and this accounts for
96% of the cost. The estimated row size is 463 and the I/O cost is
111.
What on earth is going on here. I can understand the need to scan the
Title index because of the wildcards, but why on earth would the query
perform a scan of the clustered (primary key) index? And what is going
on with the row size and I/O cost?
All the indexes and statistics are up to date, so I am at a complete
loss to explain what is going on here. Can anyone explain why the 3rd
query is so much slower (and possibly suggest a way to improve the
performance)/
Thanks
Paul Mateer
Meridio Limted
I am at a complete loss to explain what is happening here,