473,700 Members | 2,301 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Need to tune a table for performance gains

Hi :

I have a TableA with around 10 columns with varchar and numeric
datatypes
It has 500 million records and its size is 999999999 KB. i believe it
is kb
i got this data after running sp_spaceused on it. The index_size was
also pretty big in 6 digits.

On looking at the tableA
it didnot have any pks and hence no clustered index.
It had other indices
IX_1 on ColA
IX_2 on ColB
IX_3 on ColC
IX_4 on ColA, ColB and ColC put together.

Queries performed in this table are very slow. I have been asked to
tune up this table.
I know as much info as you.

Data prior to 2004 can be archived into another table. I need to run a
query to find out how many records that is.

I am thinking the following, but dont know if i am correct ?
I need to add a new PK column (which will increase the size of the
tableA) which will add a clustered index.
Right now there are no clustered indices

2. I would like help in understanding should i remove IX_1, IX_2, IX_3
as they are all used in IX_4 anyway .

3. I forget what the textbox is called on the index page. it is set to
0 and can be set from 0 to 100. what would be a good value for it ?

thank you.
RS

May 2 '07 #1
8 2387
On May 1, 6:29 pm, rshivara...@gma il.com wrote:
Hi :

I have a TableA with around 10 columns with varchar and numeric
datatypes
It has 500 million records and its size is 999999999 KB. i believe it
is kb
i got this data after running sp_spaceused on it. The index_size was
also pretty big in 6 digits.

On looking at the tableA
it didnot have any pks and hence no clustered index.
It had other indices
IX_1 on ColA
IX_2 on ColB
IX_3 on ColC
IX_4 on ColA, ColB and ColC put together.

Queries performed in this table are very slow. I have been asked to
tune up this table.
I know as much info as you.

Data prior to 2004 can be archived into another table. I need to run a
query to find out how many records that is.

I am thinking the following, but dont know if i am correct ?
I need to add a new PK column (which will increase the size of the
tableA) which will add a clustered index.
Right now there are no clustered indices

2. I would like help in understanding should i remove IX_1, IX_2, IX_3
as they are all used in IX_4 anyway .

3. I forget what the textbox is called on the index page. it is set to
0 and can be set from 0 to 100. what would be a good value for it ?

thank you.
RS
I'm afraid I might not be able to offer all the hints you are looking
for, but here are some things I've found helpful:

1) LOOK AT YOUR QUERIES!! We've obtained huge performance boosts
just by looking at the queries and stored procedures and optimizing
them. Usually if a query is going extremely slow it is because of
poor programming techniques.
2) LOOK AT YOUR QUERIES!! Are they sometimes filtering only on
ColB? If so, IX_4 will be useless if you remove IX_2. IX_4 will kick
in when you are filtering on several columns. But are there other
columns that are being filtered on?
3) Are you dynamically parcing or calculating columns? For instance,
if you have a column with EventDateTime of 01-01-2007 12:34PM but you
are filtering on dates (ie WHERE CONVERT(varchar , EventDateTime, 101)
= '01-01-07') then that is going to slow down your query. Create a
new column of EventDate and calculate the event date from the event
date and time and then stick an index on it. That will add some
boosts as the sampled WHERE clause won't reference the index.
4) Do you look at the query execution plan? That can tell you if you
are doing full table scans or hitting an index.
5) Have you used the query index wizard?
6) Breaking the table up will help a lot. At one company I consult
for we have a similar table (a half million records added per day)
with data going back to 2003. We create two (what we call) _Mini
tables. TableName_Mini_ Last45Days, TableName_Mini_ Last6Months, and
TableName_Mini_ Last1Year. This does take up space but since 80% of
the queries only want data from the last month the first mini table
makes sense to hit. If they need to go further back further then they
hit the appropriate table. The main table (TableName) is rarely
queries except after its morning data load (off hours) to create the
_Mini tables so if somebody does need information from it, it isn't
bogged down by heavy usage.
7) I've never done this personally so I cannot attest to its success
(though I should test it). I have heard you can recreate/rebuild an
index and that should get rid of some of the fragmentation. I think
I'll give it a test, but you might want to do that as well. But since
it is re-indexing half a billion records you may want to do that off-
hours.

I hope one or more of these helps you out!

May 2 '07 #2
On May 2, 10:39 am, Utahduck <Utahd...@hotma il.comwrote:
On May 1, 6:29 pm, rshivara...@gma il.com wrote:


Hi :
I have a TableA with around 10 columns with varchar and numeric
datatypes
It has 500 million records and its size is 999999999 KB. i believe it
is kb
i got this data after running sp_spaceused on it. The index_size was
also pretty big in 6 digits.
On looking at the tableA
it didnot have any pks and hence no clustered index.
It had other indices
IX_1 on ColA
IX_2 on ColB
IX_3 on ColC
IX_4 on ColA, ColB and ColC put together.
Queries performed in this table are very slow. I have been asked to
tune up this table.
I know as much info as you.
Data prior to 2004 can be archived into another table. I need to run a
query to find out how many records that is.
I am thinking the following, but dont know if i am correct ?
I need to add a new PK column (which will increase the size of the
tableA) which will add a clustered index.
Right now there are no clustered indices
2. I would like help in understanding should i remove IX_1, IX_2, IX_3
as they are all used in IX_4 anyway .
3. I forget what the textbox is called on the index page. it is set to
0 and can be set from 0 to 100. what would be a good value for it ?
thank you.
RS

I'm afraid I might not be able to offer all the hints you are looking
for, but here are some things I've found helpful:

1) LOOK AT YOUR QUERIES!! We've obtained huge performance boosts
just by looking at the queries and stored procedures and optimizing
them. Usually if a query is going extremely slow it is because of
poor programming techniques.
2) LOOK AT YOUR QUERIES!! Are they sometimes filtering only on
ColB? If so, IX_4 will be useless if you remove IX_2. IX_4 will kick
in when you are filtering on several columns. But are there other
columns that are being filtered on?
3) Are you dynamically parcing or calculating columns? For instance,
if you have a column with EventDateTime of 01-01-2007 12:34PM but you
are filtering on dates (ie WHERE CONVERT(varchar , EventDateTime, 101)
= '01-01-07') then that is going to slow down your query. Create a
new column of EventDate and calculate the event date from the event
date and time and then stick an index on it. That will add some
boosts as the sampled WHERE clause won't reference the index.
4) Do you look at the query execution plan? That can tell you if you
are doing full table scans or hitting an index.
5) Have you used the query index wizard?
6) Breaking the table up will help a lot. At one company I consult
for we have a similar table (a half million records added per day)
with data going back to 2003. We create two (what we call) _Mini
tables. TableName_Mini_ Last45Days, TableName_Mini_ Last6Months, and
TableName_Mini_ Last1Year. This does take up space but since 80% of
the queries only want data from the last month the first mini table
makes sense to hit. If they need to go further back further then they
hit the appropriate table. The main table (TableName) is rarely
queries except after its morning data load (off hours) to create the
_Mini tables so if somebody does need information from it, it isn't
bogged down by heavy usage.
7) I've never done this personally so I cannot attest to its success
(though I should test it). I have heard you can recreate/rebuild an
index and that should get rid of some of the fragmentation. I think
I'll give it a test, but you might want to do that as well. But since
it is re-indexing half a billion records you may want to do that off-
hours.

I hope one or more of these helps you out!- Hide quoted text -

- Show quoted text -
Thank you Utahduck, your mail was helpful. Also am reading up on
indexes and that is helping me.
You idea of laying out mini tables by the year is a great suggestion i
hadnt thought of. I will be definitely using that
Thank you
RS

May 2 '07 #3
Think of IX_4 like a phone book.

City, Last Name, First Name, Phone Number.

That's great, as long as you know what city a person is in. If you
don't know the city then it takes you a lot longer to look up the
perosns name because you have to loop through each city to find if the
person is in there.

That's where IX_1, Ix_2, and IX_3 come in. You could easily get rid of
IX_1, because it will be in the same order as IX_4.

Cheers,
Jason Lepack

On May 2, 1:07 pm, rshivara...@gma il.com wrote:
On May 2, 10:39 am, Utahduck <Utahd...@hotma il.comwrote:


On May 1, 6:29 pm, rshivara...@gma il.com wrote:
Hi :
I have a TableA with around 10 columns with varchar and numeric
datatypes
It has 500 million records and its size is 999999999 KB. i believe it
is kb
i got this data after running sp_spaceused on it. The index_size was
also pretty big in 6 digits.
On looking at the tableA
it didnot have any pks and hence no clustered index.
It had other indices
IX_1 on ColA
IX_2 on ColB
IX_3 on ColC
IX_4 on ColA, ColB and ColC put together.
Queries performed in this table are very slow. I have been asked to
tune up this table.
I know as much info as you.
Data prior to 2004 can be archived into another table. I need to run a
query to find out how many records that is.
I am thinking the following, but dont know if i am correct ?
I need to add a new PK column (which will increase the size of the
tableA) which will add a clustered index.
Right now there are no clustered indices
2. I would like help in understanding should i remove IX_1, IX_2, IX_3
as they are all used in IX_4 anyway .
3. I forget what the textbox is called on the index page. it is set to
0 and can be set from 0 to 100. what would be a good value for it ?
thank you.
RS
I'm afraid I might not be able to offer all the hints you are looking
for, but here are some things I've found helpful:
1) LOOK AT YOUR QUERIES!! We've obtained huge performance boosts
just by looking at the queries and stored procedures and optimizing
them. Usually if a query is going extremely slow it is because of
poor programming techniques.
2) LOOK AT YOUR QUERIES!! Are they sometimes filtering only on
ColB? If so, IX_4 will be useless if you remove IX_2. IX_4 will kick
in when you are filtering on several columns. But are there other
columns that are being filtered on?
3) Are you dynamically parcing or calculating columns? For instance,
if you have a column with EventDateTime of 01-01-2007 12:34PM but you
are filtering on dates (ie WHERE CONVERT(varchar , EventDateTime, 101)
= '01-01-07') then that is going to slow down your query. Create a
new column of EventDate and calculate the event date from the event
date and time and then stick an index on it. That will add some
boosts as the sampled WHERE clause won't reference the index.
4) Do you look at the query execution plan? That can tell you if you
are doing full table scans or hitting an index.
5) Have you used the query index wizard?
6) Breaking the table up will help a lot. At one company I consult
for we have a similar table (a half million records added per day)
with data going back to 2003. We create two (what we call) _Mini
tables. TableName_Mini_ Last45Days, TableName_Mini_ Last6Months, and
TableName_Mini_ Last1Year. This does take up space but since 80% of
the queries only want data from the last month the first mini table
makes sense to hit. If they need to go further back further then they
hit the appropriate table. The main table (TableName) is rarely
queries except after its morning data load (off hours) to create the
_Mini tables so if somebody does need information from it, it isn't
bogged down by heavy usage.
7) I've never done this personally so I cannot attest to its success
(though I should test it). I have heard you can recreate/rebuild an
index and that should get rid of some of the fragmentation. I think
I'll give it a test, but you might want to do that as well. But since
it is re-indexing half a billion records you may want to do that off-
hours.
I hope one or more of these helps you out!- Hide quoted text -
- Show quoted text -

Thank you Utahduck, your mail was helpful. Also am reading up on
indexes and that is helping me.
You idea of laying out mini tables by the year is a great suggestion i
hadnt thought of. I will be definitely using that
Thank you
RS- Hide quoted text -

- Show quoted text -

May 2 '07 #4
(rs*********@gm ail.com) writes:
I have a TableA with around 10 columns with varchar and numeric
datatypes
It has 500 million records and its size is 999999999 KB. i believe it
is kb i got this data after running sp_spaceused on it. The index_size was
also pretty big in 6 digits.
Could you post the exact output from sp_spaceused and also from
DBCC SHOWCONTIG on the table? (The latter will take some time to
run on a table this size.)
On looking at the tableA
it didnot have any pks and hence no clustered index.
Maybe the table has neither, however there is no connection between
the two. A table could have a clustered index, but no primary key or
vice versa. Most or rather all tables should really have a PK. And
the very most tables should have a clustered index. But far from all
tables should have their clustered index on their primary key.

Tables without a clustered index are known as heaps. Heaps are more
prone to fragmentation for various reasons, and given the size of your
table, I suspect yours is victim to that.
It had other indices
IX_1 on ColA
IX_2 on ColB
IX_3 on ColC
IX_4 on ColA, ColB and ColC put together.

Queries performed in this table are very slow. I have been asked to
tune up this table.
I know as much info as you.
There are all reasons to investigate whether the queries align with
the index. Utahduck made a very good point about queries hiding the
column in an expression.
Data prior to 2004 can be archived into another table. I need to run a
query to find out how many records that is.
Since we don't know the table definiton, we cannot help you with that.
I am thinking the following, but dont know if i am correct ?
I need to add a new PK column (which will increase the size of the
tableA) which will add a clustered index.
Better investigate if there are any columns for which there are
typically range queries, like a datetime column.

You should also investigate if there are columns in the table that in
fact constitute a key, but someone has forgotten to define that key.
After all, adding an IDENTITY column as a PK, makes in practice very
little difference to not having a key at all.
2. I would like help in understanding should i remove IX_1, IX_2, IX_3
as they are all used in IX_4 anyway .
That is impossible to say without knowing the queries. But a query
like

SELECT .... FROM tbl WHERE colB = @value

will not be helped much by the index on (ColA, ColB, ColC). Possibly,
the index on ColA is redudant, but there are queries that will run
faster with this index in place, for instance:

SELECT ColA, COUNT(*) FROM tbl GROUP BY ColA
3. I forget what the textbox is called on the index page. it is set to
0 and can be set from 0 to 100. what would be a good value for it ?
That is the fillfactor. We would need to know more about the table to
be able to make recommendations about it.
--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
May 2 '07 #5
Thank you All .
Erland, i will post answers to your queries shortly.
I was looking at a datatype of a column and it a numerica 9(15,0),
occupies 9 bytes but allowed a max lenght of 15.
This can also be a varchar field. So a varchar(15) compared to a
numeric 9(15,0)
Which occupies more space? or they are the same ?

May 4 '07 #6
exec sp_spaceused TableA
TableA 1 16 KB 8 KB 8 KB 0 KB

exec sp_spaceused Tableb
TableB 1 16 KB 8 KB 8 KB 0 KB
Hi : I created two tables.
TableA has a varchar field of size 18
TableB has a numeric field of size 9(15,0)

I put data of 18 1's in TableA and 15 1's in TableB

Why is there an index_size whey i have not created any pks and ixs.
Looks like they are the same storage size ? Is that strange ?
Right now,there are only one rows in each, so if i had more rows of
same data, will the size differ ?

May 4 '07 #7
DBCC SHOWCONTIG (TableA)-- 18:39 minutes

DBCC SHOWCONTIG scanning 'TableA' table...
Table: 'TableA' (549576996); index ID: 0, database ID: 15
TABLE level scan performed.
- Pages Scanned........ ............... .........: 6007257
- Extents Scanned........ ............... .......: 750994
- Extent Switches....... ............... ........: 750993
- Avg. Pages per Extent......... ............... : 8.0
- Scan Density [Best Count:Actual Count].......: 99.99%
[750908:750994]
- Extent Scan Fragmentation ............... ....: 21.20%
- Avg. Bytes Free per Page........... ..........: 368.0
- Avg. Page Density (full)......... ............: 95.45%
DBCC execution completed. If DBCC printed error messages, contact your
system administrator.

exec SP_SPACEUSED TableA

name rows reserved data
index_size unused
tbl06521AA 504609530 125679832 KB 48058056 KB 77594288 KB 27488 KB

May 4 '07 #8
(rs*********@gm ail.com) writes:
Erland, i will post answers to your queries shortly.
I was looking at a datatype of a column and it a numerica 9(15,0),
occupies 9 bytes but allowed a max lenght of 15.
This can also be a varchar field. So a varchar(15) compared to a
numeric 9(15,0)
Which occupies more space? or they are the same ?
They are not the same. Each numeric(15,0) value occupies nine bytes,
including NULL values. (Unless you are on SQL 2005 SP2 and use the
new vardecimal feature.)

A varchar(15) values can occupy anything from 2 to 17 bytes. That is, two
bytes for the length and then as many bytes as needed for the value.

I don't really understand why you are making these considerations, but
if you are choosing between the two for a key value, I would recommend
a numeric type (and rather bigint over numeric(15,0)), since varchar
is subject to more complex sorting and comparison rules (unless you
pick a binary collation.)

I looked at your SHOWCONTIG and spaceused data. The table has some
fragmentation, but it is not frightening. And the table certainly
calls for respect with its 125 GB.

I think you should examine exactly what queries that are run against
this table. With a table this size, you really to have indexes to
support all queries.

I guess that since the table has so low rate of fragmentation, that
data is only inserted, but never updated or deleted. Therefore it
may not be a pressing issue to add a clustered index - an operation
given the size of the table that will take some time.
--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
May 4 '07 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
1706
by: | last post by:
As we're on this topic in another thread right now: Say I have a SELECT query from more than one table and with some = conditions, does it matter in what order I enter the tables in the FROM = clause and in what order the WHERE conditions appear in my query? Or = does it make any difference if I use WHERE or HAVING? (I see that MS = Access likes those HAVINGs...) Of course my tables contain (maybe very much) more than some 100 records...
7
1835
by: CAH | last post by:
I once wrote on usenet and got a very good answer . This one JN wrote: > >| Should one place all the text in one big table, or is i better to place > >| it in several tabels. Should one make 4 colums one for each language, > >| or 4 rows one for each language? > > I use: > CREATE TABLE `usr_languages` ( > `pageName` varchar(25) default '',
4
1765
by: Aaron | last post by:
In a test environment we have dropped over 100 RI restrictions which showed a huge instant performance gain where some jobs were running with 2 or 3 times the performance than previously. The problem is that we have now dropped the same restrictions in the production environment but have seen absolutely no performance difference since the drop of RI. Any idea's why we have not seen the expected performance improvement? We have...
11
2802
by: my-wings | last post by:
I think I've painted myself into a corner, and I'm hoping someone can help me out. I have a table of books (tblBooks), which includes a field (strPubName) for Publisher Name and another field (strPubCity) for Publisher City. These two fields have a many-to-one relationship with tables, (tlkpPubName and tlkpPubCity) respectively. The lookup tables only have one field (strPubName and strPubCity), which is their primary key. I also have...
1
1531
by: Quizical | last post by:
First, is null considered a Value Type or a Reference Type. I have a question about performance and memory. Does the performance change or the amount of memory that is consumed when you have a difference in flow control statements as follows: //Arbitrary Reference Type Mailmessage mm = new Mailmessage(); //A if (null != mm)
6
4469
by: kvsnramesh | last post by:
hi, I have a problem asked by one of my senior person and finding the answer . What is the step by step procedure for tune a large sql query. OR how do we tune a large SQL query with somany joins
3
1947
by: Jesper Stocholm | last post by:
I have developed a data-cleaner that extracts some data from a database, cleans it for illegal/unwanted data and writes it to a CSV-fil for later insertion to a SQL Server 2000 database. My problem is that it performs like an old, limb man :o( The method is: public static StringBuilder RemoveChars(StringBuilder dataToClean_, string illegalChars_) { // only try to remove chars if there is data to clean
4
1680
by: tarscher | last post by:
Hi all, I have 2 questions regarding performance: 1) I'm building a monitoring system that has to store lots of sensor data that I 'll have to query from time to time. I have pressure and temperature. Since we sample every 500 ms we will get lots of data after some time. Will my performance increase by making 2 tables; one with pressure and one with temperature? Thus when querying (eg pressure) mysql will only need to look in the...
4
4806
by: technocrat | last post by:
I am trying to update a million rows in a table and doing this reading from a file which has the record pkids and constructing a preparedstatemtn with that pkid and adding it to the batch... count is the recordnumber read from the file... so for every count the st is added to the batch.....since for every record read from the file has a update statement(st)... st.setString(1, arg1); st.setString(2, arg2); st.setString(3, arg3);
0
8644
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
9214
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
1
8970
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
7807
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
5902
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
4656
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
3088
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
2392
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2027
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.