471,853 Members | 927 Online
Bytes | Software Development & Data Engineering Community
Post +

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 471,853 software developers and data experts.

SQL Server storing large amounts of data in multiple tables

Hello,
Currently we have a database, and it is our desire for it to be able
to store millions of records. The data in the table can be divided up
by client, and it stores nothing but about 7 integers.
| table |
| id | clientId | int1 | int2 | int 3 | ... |

Right now, our benchmarks indicate a drastic increase in performance
if we divide the data into different tables. For example,
table_clientA, table_clientB, table_clientC, despite the fact the
tables contain the exact same columns. This however does not seem very
clean or elegant to me, and rather illogical since a database exists
as a single file on the harddrive.

| table_clientA |
| id | clientId | int1 | int2 | int 3 | ...

| table_clientB |
| id | clientId | int1 | int2 | int 3 | ...

| table_clientC |
| id | clientId | int1 | int2 | int 3 | ...

Is there anyway to duplicate this increase in database performance
gained by splitting the table, perhaps by using a certain type of
index?

Thanks,
Jeff Brubaker
Software Developer
Jul 20 '05 #1
4 4037
Why not create a view that will combine the separate tables back into your
original format. You could even place the different base tables in different
locations (i.e., different servers) - the whole distributed partitioned view
concept.

Other than this, what do you have indexed on this table? What kind of query
is showing a drastic improvement when you separate the table like this?

-Chuck Urwiler, MCSD, MCDBA

"jeff brubaker" <je**@priva.com> wrote in message
news:b7*************************@posting.google.co m...
Hello,
Currently we have a database, and it is our desire for it to be able
to store millions of records. The data in the table can be divided up
by client, and it stores nothing but about 7 integers.
| table |
| id | clientId | int1 | int2 | int 3 | ... |

Right now, our benchmarks indicate a drastic increase in performance
if we divide the data into different tables. For example,
table_clientA, table_clientB, table_clientC, despite the fact the
tables contain the exact same columns. This however does not seem very
clean or elegant to me, and rather illogical since a database exists
as a single file on the harddrive.

| table_clientA |
| id | clientId | int1 | int2 | int 3 | ...

| table_clientB |
| id | clientId | int1 | int2 | int 3 | ...

| table_clientC |
| id | clientId | int1 | int2 | int 3 | ...

Is there anyway to duplicate this increase in database performance
gained by splitting the table, perhaps by using a certain type of
index?

Thanks,
Jeff Brubaker
Software Developer

Jul 20 '05 #2
[posted and mailed, please reply in news]

jeff brubaker (je**@priva.com) writes:
Currently we have a database, and it is our desire for it to be able
to store millions of records. The data in the table can be divided up
by client, and it stores nothing but about 7 integers.
| table |
| id | clientId | int1 | int2 | int 3 | ... |

Right now, our benchmarks indicate a drastic increase in performance
if we divide the data into different tables. For example,
table_clientA, table_clientB, table_clientC, despite the fact the
tables contain the exact same columns. This however does not seem very
clean or elegant to me, and rather illogical since a database exists
as a single file on the harddrive.
...

Is there anyway to duplicate this increase in database performance
gained by splitting the table, perhaps by using a certain type of
index?


It is not implausible, but with out further knowledge of your tables
and the benchmark queries, it is impossible to tell.

You could get a more informative answer, if you posted:

o The CREATE TABLE statements (both for the unpartitioned table,
and the partitioned table).
o Any indexes on the tables.
o The queries you use for the benchmark.
o If you have scripts that generates data for the benchmarks, that
would extremely useful. (Provided that they reasonably small.)

Which client did you use for the benchmark? Query Analyzer?

--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #3
Erland Sommarskog <so****@algonet.se> wrote in message news:<Xn**********************@127.0.0.1>...
[posted and mailed, please reply in news]

jeff brubaker (je**@priva.com) writes:
Currently we have a database, and it is our desire for it to be able
to store millions of records. The data in the table can be divided up
by client, and it stores nothing but about 7 integers.
| table |
| id | clientId | int1 | int2 | int 3 | ... |

Right now, our benchmarks indicate a drastic increase in performance
if we divide the data into different tables. For example,
table_clientA, table_clientB, table_clientC, despite the fact the
tables contain the exact same columns. This however does not seem very
clean or elegant to me, and rather illogical since a database exists
as a single file on the harddrive.
...

Is there anyway to duplicate this increase in database performance
gained by splitting the table, perhaps by using a certain type of
index?


It is not implausible, but with out further knowledge of your tables
and the benchmark queries, it is impossible to tell.

You could get a more informative answer, if you posted:

o The CREATE TABLE statements (both for the unpartitioned table,
and the partitioned table).
o Any indexes on the tables.
o The queries you use for the benchmark.
o If you have scripts that generates data for the benchmarks, that
would extremely useful. (Provided that they reasonably small.)

Which client did you use for the benchmark? Query Analyzer?

Okay, sorry for the delay. Here is a SQL Script to setup the
experiment. Basically it builts one table with 100,000 records, and 10
small tables with 10,000 records. To select all the records from
table_5 is substantially faster than selecting all the records from
bigTable where clientID = 5

SET NOCOUNT ON
/* Drop any tables that might exist */

if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[bigTable]') and OBJECTPROPERTY(id, N'IsUserTable')
= 1)
drop table [dbo].[bigTable]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_0]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_0]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_1]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_1]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_2]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_2]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_3]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_3]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_4]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_4]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_5]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_5]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_6]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_6]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_7]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_7]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_8]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_8]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_9]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_9]
/* Create the tables */

CREATE TABLE [dbo].[bigTable] (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id110] [int] NULL ,
[clientid] [int] NULL
) ON [PRIMARY]

CREATE TABLE table_1 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]

CREATE TABLE table_2 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]
CREATE TABLE table_3 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]
CREATE TABLE table_4 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]
CREATE TABLE table_5 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]
CREATE TABLE table_6 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]
CREATE TABLE table_7 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]
CREATE TABLE table_8 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]
CREATE TABLE table_9 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]
CREATE TABLE table_0 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]
DECLARE @countPerClient int
SET @countPerClient = 10000
DECLARE @counter int
SET @counter = 1

/* Fill the big table with the 10 clients */

WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (0)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (1)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (2)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (3)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (4)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (5)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (6)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (7)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (8)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (9)
SET @counter = @counter + 1
END
/* Fill each of the table with 1 clients */
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_1 DEFAULT VALUES
SET @counter = @counter + 1
END

SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_2 DEFAULT VALUES
SET @counter = @counter + 1
END
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_3 DEFAULT VALUES
SET @counter = @counter + 1
END
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_4 DEFAULT VALUES
SET @counter = @counter + 1
END
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_5 DEFAULT VALUES
SET @counter = @counter + 1
END
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_6 DEFAULT VALUES
SET @counter = @counter + 1
END
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_7 DEFAULT VALUES
SET @counter = @counter + 1
END
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_8 DEFAULT VALUES
SET @counter = @counter + 1
END

SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_9 DEFAULT VALUES
SET @counter = @counter + 1
END
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_0 DEFAULT VALUES
SET @counter = @counter + 1
END
GO
/* Now time for the queries */
DECLARE @x datetime
SELECT @x = GetDate()
select count(*) from table_5
SELECT 'Split Tables' as label,DateDiff(millisecond, @x, GetDate())

SELECT @x = GetDate()
select count(*) from bigTable where clientId=5
SELECT 'Big Table' as label,DateDiff(millisecond, @x, GetDate())
Jul 20 '05 #4
[posted and mailed, please reply in news]

jeff brubaker (je**@priva.com) writes:
Okay, sorry for the delay. Here is a SQL Script to setup the
experiment. Basically it builts one table with 100,000 records, and 10
small tables with 10,000 records. To select all the records from
table_5 is substantially faster than selecting all the records from
bigTable where clientID = 5


Yes, since there is no index at all at your tables, this is not
strange.

First, I had to increase the number of rows per client to 100000 to
get a significant difference.

When I had run the first test, I ran these two statements:

CREATE CLUSTERED INDEX clientid_ix on bigTable (clientId)
go
DBCC DROPCLEANBUFFERS

The first statement builds an index on bigTable.clientId. The second
just cleans out the cache, so that all data will be read from disk.
(Don't do this on a production machine!)

I then ran the benchmarks. table_5 was still faster with 450 ms, where
as the SELECT COUNT(*) from bigTable needed 563 ms. However, on
successive runs, table_5 took 110 ms, where as the read from bigTable
was 16 ms.

Before you consider advanced techniques like splitting tables, you should
make sure that you have a sound index strategy. A clustered index has
the data as its leaf pages, so after adding the clustered index,
bigTable is really like table_1 to table_9 glued together.
--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #5

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

14 posts views Thread by diskoduro | last post: by
10 posts views Thread by Kevin Lawrence | last post: by
2 posts views Thread by Ilyas | last post: by
46 posts views Thread by RAZZ | last post: by
reply views Thread by YellowAndGreen | last post: by

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.