472,353 Members | 2,200 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,353 software developers and data experts.

Need to tune a table for performance gains

Hi :

I have a TableA with around 10 columns with varchar and numeric
datatypes
It has 500 million records and its size is 999999999 KB. i believe it
is kb
i got this data after running sp_spaceused on it. The index_size was
also pretty big in 6 digits.

On looking at the tableA
it didnot have any pks and hence no clustered index.
It had other indices
IX_1 on ColA
IX_2 on ColB
IX_3 on ColC
IX_4 on ColA, ColB and ColC put together.

Queries performed in this table are very slow. I have been asked to
tune up this table.
I know as much info as you.

Data prior to 2004 can be archived into another table. I need to run a
query to find out how many records that is.

I am thinking the following, but dont know if i am correct ?
I need to add a new PK column (which will increase the size of the
tableA) which will add a clustered index.
Right now there are no clustered indices

2. I would like help in understanding should i remove IX_1, IX_2, IX_3
as they are all used in IX_4 anyway .

3. I forget what the textbox is called on the index page. it is set to
0 and can be set from 0 to 100. what would be a good value for it ?

thank you.
RS

May 2 '07 #1
8 2285
On May 1, 6:29 pm, rshivara...@gmail.com wrote:
Hi :

I have a TableA with around 10 columns with varchar and numeric
datatypes
It has 500 million records and its size is 999999999 KB. i believe it
is kb
i got this data after running sp_spaceused on it. The index_size was
also pretty big in 6 digits.

On looking at the tableA
it didnot have any pks and hence no clustered index.
It had other indices
IX_1 on ColA
IX_2 on ColB
IX_3 on ColC
IX_4 on ColA, ColB and ColC put together.

Queries performed in this table are very slow. I have been asked to
tune up this table.
I know as much info as you.

Data prior to 2004 can be archived into another table. I need to run a
query to find out how many records that is.

I am thinking the following, but dont know if i am correct ?
I need to add a new PK column (which will increase the size of the
tableA) which will add a clustered index.
Right now there are no clustered indices

2. I would like help in understanding should i remove IX_1, IX_2, IX_3
as they are all used in IX_4 anyway .

3. I forget what the textbox is called on the index page. it is set to
0 and can be set from 0 to 100. what would be a good value for it ?

thank you.
RS
I'm afraid I might not be able to offer all the hints you are looking
for, but here are some things I've found helpful:

1) LOOK AT YOUR QUERIES!! We've obtained huge performance boosts
just by looking at the queries and stored procedures and optimizing
them. Usually if a query is going extremely slow it is because of
poor programming techniques.
2) LOOK AT YOUR QUERIES!! Are they sometimes filtering only on
ColB? If so, IX_4 will be useless if you remove IX_2. IX_4 will kick
in when you are filtering on several columns. But are there other
columns that are being filtered on?
3) Are you dynamically parcing or calculating columns? For instance,
if you have a column with EventDateTime of 01-01-2007 12:34PM but you
are filtering on dates (ie WHERE CONVERT(varchar, EventDateTime, 101)
= '01-01-07') then that is going to slow down your query. Create a
new column of EventDate and calculate the event date from the event
date and time and then stick an index on it. That will add some
boosts as the sampled WHERE clause won't reference the index.
4) Do you look at the query execution plan? That can tell you if you
are doing full table scans or hitting an index.
5) Have you used the query index wizard?
6) Breaking the table up will help a lot. At one company I consult
for we have a similar table (a half million records added per day)
with data going back to 2003. We create two (what we call) _Mini
tables. TableName_Mini_Last45Days, TableName_Mini_Last6Months, and
TableName_Mini_Last1Year. This does take up space but since 80% of
the queries only want data from the last month the first mini table
makes sense to hit. If they need to go further back further then they
hit the appropriate table. The main table (TableName) is rarely
queries except after its morning data load (off hours) to create the
_Mini tables so if somebody does need information from it, it isn't
bogged down by heavy usage.
7) I've never done this personally so I cannot attest to its success
(though I should test it). I have heard you can recreate/rebuild an
index and that should get rid of some of the fragmentation. I think
I'll give it a test, but you might want to do that as well. But since
it is re-indexing half a billion records you may want to do that off-
hours.

I hope one or more of these helps you out!

May 2 '07 #2
On May 2, 10:39 am, Utahduck <Utahd...@hotmail.comwrote:
On May 1, 6:29 pm, rshivara...@gmail.com wrote:


Hi :
I have a TableA with around 10 columns with varchar and numeric
datatypes
It has 500 million records and its size is 999999999 KB. i believe it
is kb
i got this data after running sp_spaceused on it. The index_size was
also pretty big in 6 digits.
On looking at the tableA
it didnot have any pks and hence no clustered index.
It had other indices
IX_1 on ColA
IX_2 on ColB
IX_3 on ColC
IX_4 on ColA, ColB and ColC put together.
Queries performed in this table are very slow. I have been asked to
tune up this table.
I know as much info as you.
Data prior to 2004 can be archived into another table. I need to run a
query to find out how many records that is.
I am thinking the following, but dont know if i am correct ?
I need to add a new PK column (which will increase the size of the
tableA) which will add a clustered index.
Right now there are no clustered indices
2. I would like help in understanding should i remove IX_1, IX_2, IX_3
as they are all used in IX_4 anyway .
3. I forget what the textbox is called on the index page. it is set to
0 and can be set from 0 to 100. what would be a good value for it ?
thank you.
RS

I'm afraid I might not be able to offer all the hints you are looking
for, but here are some things I've found helpful:

1) LOOK AT YOUR QUERIES!! We've obtained huge performance boosts
just by looking at the queries and stored procedures and optimizing
them. Usually if a query is going extremely slow it is because of
poor programming techniques.
2) LOOK AT YOUR QUERIES!! Are they sometimes filtering only on
ColB? If so, IX_4 will be useless if you remove IX_2. IX_4 will kick
in when you are filtering on several columns. But are there other
columns that are being filtered on?
3) Are you dynamically parcing or calculating columns? For instance,
if you have a column with EventDateTime of 01-01-2007 12:34PM but you
are filtering on dates (ie WHERE CONVERT(varchar, EventDateTime, 101)
= '01-01-07') then that is going to slow down your query. Create a
new column of EventDate and calculate the event date from the event
date and time and then stick an index on it. That will add some
boosts as the sampled WHERE clause won't reference the index.
4) Do you look at the query execution plan? That can tell you if you
are doing full table scans or hitting an index.
5) Have you used the query index wizard?
6) Breaking the table up will help a lot. At one company I consult
for we have a similar table (a half million records added per day)
with data going back to 2003. We create two (what we call) _Mini
tables. TableName_Mini_Last45Days, TableName_Mini_Last6Months, and
TableName_Mini_Last1Year. This does take up space but since 80% of
the queries only want data from the last month the first mini table
makes sense to hit. If they need to go further back further then they
hit the appropriate table. The main table (TableName) is rarely
queries except after its morning data load (off hours) to create the
_Mini tables so if somebody does need information from it, it isn't
bogged down by heavy usage.
7) I've never done this personally so I cannot attest to its success
(though I should test it). I have heard you can recreate/rebuild an
index and that should get rid of some of the fragmentation. I think
I'll give it a test, but you might want to do that as well. But since
it is re-indexing half a billion records you may want to do that off-
hours.

I hope one or more of these helps you out!- Hide quoted text -

- Show quoted text -
Thank you Utahduck, your mail was helpful. Also am reading up on
indexes and that is helping me.
You idea of laying out mini tables by the year is a great suggestion i
hadnt thought of. I will be definitely using that
Thank you
RS

May 2 '07 #3
Think of IX_4 like a phone book.

City, Last Name, First Name, Phone Number.

That's great, as long as you know what city a person is in. If you
don't know the city then it takes you a lot longer to look up the
perosns name because you have to loop through each city to find if the
person is in there.

That's where IX_1, Ix_2, and IX_3 come in. You could easily get rid of
IX_1, because it will be in the same order as IX_4.

Cheers,
Jason Lepack

On May 2, 1:07 pm, rshivara...@gmail.com wrote:
On May 2, 10:39 am, Utahduck <Utahd...@hotmail.comwrote:


On May 1, 6:29 pm, rshivara...@gmail.com wrote:
Hi :
I have a TableA with around 10 columns with varchar and numeric
datatypes
It has 500 million records and its size is 999999999 KB. i believe it
is kb
i got this data after running sp_spaceused on it. The index_size was
also pretty big in 6 digits.
On looking at the tableA
it didnot have any pks and hence no clustered index.
It had other indices
IX_1 on ColA
IX_2 on ColB
IX_3 on ColC
IX_4 on ColA, ColB and ColC put together.
Queries performed in this table are very slow. I have been asked to
tune up this table.
I know as much info as you.
Data prior to 2004 can be archived into another table. I need to run a
query to find out how many records that is.
I am thinking the following, but dont know if i am correct ?
I need to add a new PK column (which will increase the size of the
tableA) which will add a clustered index.
Right now there are no clustered indices
2. I would like help in understanding should i remove IX_1, IX_2, IX_3
as they are all used in IX_4 anyway .
3. I forget what the textbox is called on the index page. it is set to
0 and can be set from 0 to 100. what would be a good value for it ?
thank you.
RS
I'm afraid I might not be able to offer all the hints you are looking
for, but here are some things I've found helpful:
1) LOOK AT YOUR QUERIES!! We've obtained huge performance boosts
just by looking at the queries and stored procedures and optimizing
them. Usually if a query is going extremely slow it is because of
poor programming techniques.
2) LOOK AT YOUR QUERIES!! Are they sometimes filtering only on
ColB? If so, IX_4 will be useless if you remove IX_2. IX_4 will kick
in when you are filtering on several columns. But are there other
columns that are being filtered on?
3) Are you dynamically parcing or calculating columns? For instance,
if you have a column with EventDateTime of 01-01-2007 12:34PM but you
are filtering on dates (ie WHERE CONVERT(varchar, EventDateTime, 101)
= '01-01-07') then that is going to slow down your query. Create a
new column of EventDate and calculate the event date from the event
date and time and then stick an index on it. That will add some
boosts as the sampled WHERE clause won't reference the index.
4) Do you look at the query execution plan? That can tell you if you
are doing full table scans or hitting an index.
5) Have you used the query index wizard?
6) Breaking the table up will help a lot. At one company I consult
for we have a similar table (a half million records added per day)
with data going back to 2003. We create two (what we call) _Mini
tables. TableName_Mini_Last45Days, TableName_Mini_Last6Months, and
TableName_Mini_Last1Year. This does take up space but since 80% of
the queries only want data from the last month the first mini table
makes sense to hit. If they need to go further back further then they
hit the appropriate table. The main table (TableName) is rarely
queries except after its morning data load (off hours) to create the
_Mini tables so if somebody does need information from it, it isn't
bogged down by heavy usage.
7) I've never done this personally so I cannot attest to its success
(though I should test it). I have heard you can recreate/rebuild an
index and that should get rid of some of the fragmentation. I think
I'll give it a test, but you might want to do that as well. But since
it is re-indexing half a billion records you may want to do that off-
hours.
I hope one or more of these helps you out!- Hide quoted text -
- Show quoted text -

Thank you Utahduck, your mail was helpful. Also am reading up on
indexes and that is helping me.
You idea of laying out mini tables by the year is a great suggestion i
hadnt thought of. I will be definitely using that
Thank you
RS- Hide quoted text -

- Show quoted text -

May 2 '07 #4
(rs*********@gmail.com) writes:
I have a TableA with around 10 columns with varchar and numeric
datatypes
It has 500 million records and its size is 999999999 KB. i believe it
is kb i got this data after running sp_spaceused on it. The index_size was
also pretty big in 6 digits.
Could you post the exact output from sp_spaceused and also from
DBCC SHOWCONTIG on the table? (The latter will take some time to
run on a table this size.)
On looking at the tableA
it didnot have any pks and hence no clustered index.
Maybe the table has neither, however there is no connection between
the two. A table could have a clustered index, but no primary key or
vice versa. Most or rather all tables should really have a PK. And
the very most tables should have a clustered index. But far from all
tables should have their clustered index on their primary key.

Tables without a clustered index are known as heaps. Heaps are more
prone to fragmentation for various reasons, and given the size of your
table, I suspect yours is victim to that.
It had other indices
IX_1 on ColA
IX_2 on ColB
IX_3 on ColC
IX_4 on ColA, ColB and ColC put together.

Queries performed in this table are very slow. I have been asked to
tune up this table.
I know as much info as you.
There are all reasons to investigate whether the queries align with
the index. Utahduck made a very good point about queries hiding the
column in an expression.
Data prior to 2004 can be archived into another table. I need to run a
query to find out how many records that is.
Since we don't know the table definiton, we cannot help you with that.
I am thinking the following, but dont know if i am correct ?
I need to add a new PK column (which will increase the size of the
tableA) which will add a clustered index.
Better investigate if there are any columns for which there are
typically range queries, like a datetime column.

You should also investigate if there are columns in the table that in
fact constitute a key, but someone has forgotten to define that key.
After all, adding an IDENTITY column as a PK, makes in practice very
little difference to not having a key at all.
2. I would like help in understanding should i remove IX_1, IX_2, IX_3
as they are all used in IX_4 anyway .
That is impossible to say without knowing the queries. But a query
like

SELECT .... FROM tbl WHERE colB = @value

will not be helped much by the index on (ColA, ColB, ColC). Possibly,
the index on ColA is redudant, but there are queries that will run
faster with this index in place, for instance:

SELECT ColA, COUNT(*) FROM tbl GROUP BY ColA
3. I forget what the textbox is called on the index page. it is set to
0 and can be set from 0 to 100. what would be a good value for it ?
That is the fillfactor. We would need to know more about the table to
be able to make recommendations about it.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
May 2 '07 #5
Thank you All .
Erland, i will post answers to your queries shortly.
I was looking at a datatype of a column and it a numerica 9(15,0),
occupies 9 bytes but allowed a max lenght of 15.
This can also be a varchar field. So a varchar(15) compared to a
numeric 9(15,0)
Which occupies more space? or they are the same ?

May 4 '07 #6
exec sp_spaceused TableA
TableA 1 16 KB 8 KB 8 KB 0 KB

exec sp_spaceused Tableb
TableB 1 16 KB 8 KB 8 KB 0 KB
Hi : I created two tables.
TableA has a varchar field of size 18
TableB has a numeric field of size 9(15,0)

I put data of 18 1's in TableA and 15 1's in TableB

Why is there an index_size whey i have not created any pks and ixs.
Looks like they are the same storage size ? Is that strange ?
Right now,there are only one rows in each, so if i had more rows of
same data, will the size differ ?

May 4 '07 #7
DBCC SHOWCONTIG (TableA)-- 18:39 minutes

DBCC SHOWCONTIG scanning 'TableA' table...
Table: 'TableA' (549576996); index ID: 0, database ID: 15
TABLE level scan performed.
- Pages Scanned................................: 6007257
- Extents Scanned..............................: 750994
- Extent Switches..............................: 750993
- Avg. Pages per Extent........................: 8.0
- Scan Density [Best Count:Actual Count].......: 99.99%
[750908:750994]
- Extent Scan Fragmentation ...................: 21.20%
- Avg. Bytes Free per Page.....................: 368.0
- Avg. Page Density (full).....................: 95.45%
DBCC execution completed. If DBCC printed error messages, contact your
system administrator.

exec SP_SPACEUSED TableA

name rows reserved data
index_size unused
tbl06521AA 504609530 125679832 KB 48058056 KB 77594288 KB 27488 KB

May 4 '07 #8
(rs*********@gmail.com) writes:
Erland, i will post answers to your queries shortly.
I was looking at a datatype of a column and it a numerica 9(15,0),
occupies 9 bytes but allowed a max lenght of 15.
This can also be a varchar field. So a varchar(15) compared to a
numeric 9(15,0)
Which occupies more space? or they are the same ?
They are not the same. Each numeric(15,0) value occupies nine bytes,
including NULL values. (Unless you are on SQL 2005 SP2 and use the
new vardecimal feature.)

A varchar(15) values can occupy anything from 2 to 17 bytes. That is, two
bytes for the length and then as many bytes as needed for the value.

I don't really understand why you are making these considerations, but
if you are choosing between the two for a key value, I would recommend
a numeric type (and rather bigint over numeric(15,0)), since varchar
is subject to more complex sorting and comparison rules (unless you
pick a binary collation.)

I looked at your SHOWCONTIG and spaceused data. The table has some
fragmentation, but it is not frightening. And the table certainly
calls for respect with its 125 GB.

I think you should examine exactly what queries that are run against
this table. With a table this size, you really to have indexes to
support all queries.

I guess that since the table has so low rate of fragmentation, that
data is only inserted, but never updated or deleted. Therefore it
may not be a pressing issue to add a clustered index - an operation
given the size of the table that will take some time.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
May 4 '07 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: | last post by:
As we're on this topic in another thread right now: Say I have a SELECT query from more than one table and with some = conditions, does it matter...
7
by: CAH | last post by:
I once wrote on usenet and got a very good answer . This one JN wrote: > >| Should one place all the text in one big table, or is i better to...
4
by: Aaron | last post by:
In a test environment we have dropped over 100 RI restrictions which showed a huge instant performance gain where some jobs were running with 2 or 3...
11
by: my-wings | last post by:
I think I've painted myself into a corner, and I'm hoping someone can help me out. I have a table of books (tblBooks), which includes a field...
1
by: Quizical | last post by:
First, is null considered a Value Type or a Reference Type. I have a question about performance and memory. Does the performance change or the...
6
by: kvsnramesh | last post by:
hi, I have a problem asked by one of my senior person and finding the answer . What is the step by step procedure for tune a large sql query. OR...
3
by: Jesper Stocholm | last post by:
I have developed a data-cleaner that extracts some data from a database, cleans it for illegal/unwanted data and writes it to a CSV-fil for later...
4
by: tarscher | last post by:
Hi all, I have 2 questions regarding performance: 1) I'm building a monitoring system that has to store lots of sensor data that I 'll have to...
4
by: technocrat | last post by:
I am trying to update a million rows in a table and doing this reading from a file which has the record pkids and constructing a preparedstatemtn...
0
by: antdb | last post by:
Ⅰ. Advantage of AntDB: hyper-convergence + streaming processing engine In the overall architecture, a new "hyper-convergence" concept was...
0
by: Matthew3360 | last post by:
Hi there. I have been struggling to find out how to use a variable as my location in my header redirect function. Here is my code. ...
2
by: Matthew3360 | last post by:
Hi, I have a python app that i want to be able to get variables from a php page on my webserver. My python app is on my computer. How would I make it...
0
by: AndyPSV | last post by:
HOW CAN I CREATE AN AI with an .executable file that would suck all files in the folder and on my computerHOW CAN I CREATE AN AI with an .executable...
0
by: Arjunsri | last post by:
I have a Redshift database that I need to use as an import data source. I have configured the DSN connection using the server, port, database, and...
0
by: Matthew3360 | last post by:
Hi, I have been trying to connect to a local host using php curl. But I am finding it hard to do this. I am doing the curl get request from my web...
0
Oralloy
by: Oralloy | last post by:
Hello Folks, I am trying to hook up a CPU which I designed using SystemC to I/O pins on an FPGA. My problem (spelled failure) is with the...
0
by: Carina712 | last post by:
Setting background colors for Excel documents can help to improve the visual appeal of the document and make it easier to read and understand....
0
BLUEPANDA
by: BLUEPANDA | last post by:
At BluePanda Dev, we're passionate about building high-quality software and sharing our knowledge with the community. That's why we've created a SaaS...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.