Need to tune a table for performance gains

Hi:

I have a table, TableA, with around 10 columns of varchar and numeric
datatypes. It has 500 million records and its size is 999999999 KB (I
believe the unit is KB); I got this figure by running sp_spaceused on it.
The index_size was also pretty big, in six digits.

On looking at TableA, I found it did not have any PK and hence no
clustered index. It did have these other indexes:
IX_1 on ColA
IX_2 on ColB
IX_3 on ColC
IX_4 on ColA, ColB and ColC together.

Queries against this table are very slow, and I have been asked to tune
it. At this point I know only as much about the table as you do.

Data prior to 2004 can be archived into another table. I need to run a
query to find out how many records that is.
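Probably something along these lines, assuming (I have not confirmed the
actual column names) that TableA has a datetime column such as
EventDateTime marking each record:

-- Rough sketch: EventDateTime is only a placeholder column name.
SELECT COUNT(*) AS RecordsBefore2004
FROM   TableA
WHERE  EventDateTime < '20040101';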

I am thinking the following, but don't know if I am correct:
1. I need to add a new PK column (which will increase the size of
TableA) and put the clustered index on it. Right now there is no
clustered index at all.

2. I would like help in understanding whether I should remove IX_1, IX_2
and IX_3, since their columns are all covered by IX_4 anyway.

3. I forget what the textbox on the index properties page is called. It
is set to 0 and can be set from 0 to 100. What would be a good value for it?

Thank you.
RS

May 2 '07 #1
I'm afraid I might not be able to offer all the hints you are looking
for, but here are some things I've found helpful:

1) LOOK AT YOUR QUERIES!! We've obtained huge performance boosts
just by looking at the queries and stored procedures and optimizing
them. Usually when a query runs extremely slowly it is because of
poor programming techniques.
2) LOOK AT YOUR QUERIES!! Are they sometimes filtering only on
ColB? If so, IX_4 won't help those queries, and removing IX_2 would
leave them with no useful index at all. IX_4 kicks in when you are
filtering on several columns, starting with ColA. And are there other
columns being filtered on that have no index?
3) Are you dynamically parsing or calculating columns? For instance,
if you have a column EventDateTime holding values like 01-01-2007 12:34PM
but you are filtering on the date only (i.e. WHERE CONVERT(varchar,
EventDateTime, 101) = '01-01-07'), that is going to slow down your query,
because wrapping the column in an expression prevents the index from
being used. Create a new EventDate column, populate it from the event
date and time, and then stick an index on it (see the sketch after this
list).
4) Do you look at the query execution plan? That can tell you whether
you are doing full table scans or hitting an index.
5) Have you used the Index Tuning Wizard (called the Database Engine
Tuning Advisor in SQL Server 2005)?
6) Breaking the table up will help a lot. At one company I consult
for, we have a similar table (half a million records added per day)
with data going back to 2003. We create three (what we call) _Mini
tables: TableName_Mini_Last45Days, TableName_Mini_Last6Months, and
TableName_Mini_Last1Year. This does take up space, but since 80% of
the queries only want data from the last month, the first mini table
makes sense to hit. If they need to go back further, they hit the
appropriate table. The main table (TableName) is rarely queried except
right after its morning data load (off hours), when the _Mini tables
are created, so if somebody does need information from it, it isn't
bogged down by heavy usage.
7) I've never done this personally so I cannot attest to its success
(though I should test it). I have heard you can recreate/rebuild an
index and that should get rid of some of the fragmentation. I think
I'll give it a test, but you might want to do that as well (a sketch of
the commands follows at the end of this post). But since it means
re-indexing half a billion records, you may want to do that off-hours.
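
To make point 3 concrete, here is a rough sketch; the names EventDateTime
and EventDate are only placeholders for whatever date column TableA
actually has, and on a table this size the UPDATE should really be run in
batches:

-- Add a date-only column, backfill it, and index it so date filters can seek.
ALTER TABLE TableA ADD EventDate smalldatetime NULL;

UPDATE TableA
SET    EventDate = DATEADD(day, DATEDIFF(day, 0, EventDateTime), 0);  -- strips the time portion

CREATE INDEX IX_TableA_EventDate ON TableA (EventDate);

-- A date filter can now use the new index instead of scanning:
SELECT COUNT(*) FROM TableA WHERE EventDate = '20070101';

(Alternatively, rewriting the filter as WHERE EventDateTime >= '20070101'
AND EventDateTime < '20070102' avoids the new column and still lets an
index on EventDateTime be used.)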

I hope one or more of these helps you out!
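
As a footnote to point 7, the rebuild itself is easy to try; which command
applies depends on the SQL Server version, and the table name is of course
yours to substitute:

-- SQL Server 2000: rebuild all indexes on the table.
DBCC DBREINDEX ('TableA');

-- SQL Server 2005 equivalent:
-- ALTER INDEX ALL ON TableA REBUILD;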

May 2 '07 #2
Thank you Utahduck, your reply was helpful. I am also reading up on
indexes and that is helping me.
Your idea of laying out mini tables by the year is a great suggestion I
hadn't thought of. I will definitely be using that.
Thank you,
RS

May 2 '07 #3
Think of IX_4 like a phone book.

City, Last Name, First Name, Phone Number.

That's great, as long as you know what city a person is in. If you
don't know the city, then it takes you a lot longer to look up the
person's name, because you have to go through every city to see whether
the person is listed there.

That's where IX_1, IX_2, and IX_3 come in. You could easily get rid of
IX_1, though, because ColA is the leading column of IX_4, so IX_4 is
already in the same order.
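
In query terms (the literal values below are just placeholders):

-- Can seek IX_4, because ColA is its leading column; IX_1 adds little here:
SELECT COUNT(*) FROM TableA WHERE ColA = 'SomeValue';

-- Cannot seek IX_4, because there is no filter on the leading column ColA;
-- without IX_2 this would fall back to a scan:
SELECT COUNT(*) FROM TableA WHERE ColB = 'SomeOtherValue';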

Cheers,
Jason Lepack


May 2 '07 #4
(rs*********@gmail.com) writes:
I have a TableA with around 10 columns with varchar and numeric
datatypes. It has 500 million records and its size is 999999999 KB. I
believe it is KB; I got this data after running sp_spaceused on it. The
index_size was also pretty big, in six digits.
Could you post the exact output from sp_spaceused and also from
DBCC SHOWCONTIG on the table? (The latter will take some time to
run on a table this size.)
On looking at TableA, it did not have any PKs and hence no clustered index.
Maybe the table has neither; however, there is no necessary connection
between the two. A table can have a clustered index but no primary key,
or vice versa. Most (or rather all) tables should really have a PK, and
the vast majority of tables should have a clustered index. But far from
all tables should have their clustered index on the primary key.

Tables without a clustered index are known as heaps. Heaps are more
prone to fragmentation for various reasons, and given the size of your
table, I suspect yours is a victim of that.
It had other indices
IX_1 on ColA
IX_2 on ColB
IX_3 on ColC
IX_4 on ColA, ColB and ColC put together.

Queries performed in this table are very slow. I have been asked to
tune up this table.
I know as much info as you.
There is every reason to investigate whether the queries align with
the indexes. Utahduck made a very good point about queries hiding a
column inside an expression.
Data prior to 2004 can be archived into another table. I need to run a
query to find out how many records that is.
Since we don't know the table definition, we cannot help you with that.
I am thinking the following, but dont know if i am correct ?
I need to add a new PK column (which will increase the size of the
tableA) which will add a clustered index.
Better to investigate whether there are any columns that are typically
used in range queries, like a datetime column.

You should also investigate whether there are columns in the table that
in fact constitute a key, but for which someone has forgotten to define
that key. After all, adding an IDENTITY column as a PK makes, in
practice, very little difference compared to not having a key at all.
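
For example, if it turned out that most queries restrict on a date range
over some datetime column (EventDateTime below is only a placeholder
name), that would be a natural home for the clustered index; on a table
this size, building it is something for off-hours:

-- Placeholder names; use the column that actually carries the range queries.
CREATE CLUSTERED INDEX CIX_TableA_EventDateTime ON TableA (EventDateTime);

-- Range queries can then seek along the clustered index:
SELECT COUNT(*)
FROM   TableA
WHERE  EventDateTime >= '20040101' AND EventDateTime < '20050101';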
2. I would like help in understanding should i remove IX_1, IX_2, IX_3
as they are all used in IX_4 anyway .
That is impossible to say without knowing the queries. But a query
like

SELECT .... FROM tbl WHERE colB = @value

will not be helped much by the index on (ColA, ColB, ColC). Possibly,
the index on ColA is redundant, but there are queries that will run
faster with this index in place, for instance:

SELECT ColA, COUNT(*) FROM tbl GROUP BY ColA
3. I forget what the textbox is called on the index page. it is set to
0 and can be set from 0 to 100. what would be a good value for it ?
That is the fillfactor. We would need to know more about the table to
be able to make recommendations about it.
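
For what it's worth, the fill factor is just an option on the index
definition; the 90 below is purely an illustration, not a recommendation
for this table:

-- Rebuild an existing index with a new fill factor (SQL Server 2000 syntax;
-- SQL Server 2005 also accepts WITH (FILLFACTOR = 90, DROP_EXISTING = ON)).
CREATE INDEX IX_1 ON TableA (ColA)
WITH FILLFACTOR = 90, DROP_EXISTING;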
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
May 2 '07 #5
Thank you all.
Erland, I will post answers to your questions shortly.
I was looking at the datatype of one column: it is numeric(15,0), which
occupies 9 bytes but allows a maximum precision of 15 digits.
This could also be made a varchar field. So, comparing a varchar(15) to
a numeric(15,0): which occupies more space, or are they the same?

May 4 '07 #6
exec sp_spaceused TableA
TableA 1 16 KB 8 KB 8 KB 0 KB

exec sp_spaceused TableB
TableB 1 16 KB 8 KB 8 KB 0 KB
Hi: I created two tables.
TableA has a varchar field of size 18.
TableB has a numeric(15,0) field.

I put a value of eighteen 1's in TableA and fifteen 1's in TableB.

Why is there an index_size when I have not created any PKs or indexes?
The two tables look like they are the same storage size; is that strange?
Right now there is only one row in each, so if I had more rows of the
same data, would the sizes differ?

May 4 '07 #7
DBCC SHOWCONTIG (TableA)-- 18:39 minutes

DBCC SHOWCONTIG scanning 'TableA' table...
Table: 'TableA' (549576996); index ID: 0, database ID: 15
TABLE level scan performed.
- Pages Scanned................................: 6007257
- Extents Scanned..............................: 750994
- Extent Switches..............................: 750993
- Avg. Pages per Extent........................: 8.0
- Scan Density [Best Count:Actual Count].......: 99.99%
[750908:750994]
- Extent Scan Fragmentation ...................: 21.20%
- Avg. Bytes Free per Page.....................: 368.0
- Avg. Page Density (full).....................: 95.45%
DBCC execution completed. If DBCC printed error messages, contact your
system administrator.

exec SP_SPACEUSED TableA

name         rows         reserved        data           index_size     unused
tbl06521AA   504609530    125679832 KB    48058056 KB    77594288 KB    27488 KB

May 4 '07 #8
(rs*********@gmail.com) writes:
Erland, I will post answers to your questions shortly.
I was looking at the datatype of a column: it is numeric(15,0), which
occupies 9 bytes but allows a maximum precision of 15 digits.
This could also be a varchar field. So, comparing a varchar(15) to a
numeric(15,0): which occupies more space, or are they the same?
They are not the same. Each numeric(15,0) value occupies nine bytes,
including NULL values. (Unless you are on SQL 2005 SP2 and use the
new vardecimal feature.)

A varchar(15) value can occupy anything from 2 to 17 bytes. That is, two
bytes for the length and then as many bytes as needed for the value.

I don't really understand why you are making these considerations, but
if you are choosing between the two for a key value, I would recommend
a numeric type (and rather bigint over numeric(15,0)), since varchar
is subject to more complex sorting and comparison rules (unless you
pick a binary collation).
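
If you want to see the per-value sizes for yourself, a quick check (just
an illustrative sketch) is:

-- DATALENGTH reports the bytes used by the value itself; the two-byte
-- length prefix that a varchar carries in the row is not included.
DECLARE @n numeric(15,0), @v varchar(15);
SELECT @n = 111111111111111, @v = '111111111111111';
SELECT DATALENGTH(@n) AS numeric_bytes,   -- 9
       DATALENGTH(@v) AS varchar_bytes;   -- 15 (plus 2 bytes of row overhead)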

I looked at your SHOWCONTIG and sp_spaceused data. The table has some
fragmentation, but it is not frightening. And the table certainly
calls for respect with its 125 GB.

I think you should examine exactly what queries are run against this
table. With a table this size, you really need to have indexes to
support all queries.

I guess that since the table has such a low rate of fragmentation,
data is only inserted, never updated or deleted. Therefore it may not
be a pressing issue to add a clustered index, which is an operation
that, given the size of the table, will take some time.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
May 4 '07 #9
