472,145 Members | 1,419 Online
Bytes | Software Development & Data Engineering Community
Post +

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,145 software developers and data experts.

Need advise on horizontal table partitioning

3
Hello,

I need some advise in table design, and more specifically about table partitioning. I read some papers and there was mentioned that if a table is expected to hold millions of rows, it's a good idea to partition it.

Vertical partitioning, as I understood it, is separating data that differs in some way in a separate table, adding a key field as an identifier to what segment it belongs.

The particular table holds signal measurements from different sites and has the following layout :

SiteID int,
MeasID int,
Timestamp datetime,
Value real,
Status int
etcJunk int

SiteID, MeasID and Timestamp form the PK.

The horizontal partitioning should be obviously done by Timestamp, but that creates pretty bad problems with automatic table creation (when it's time to do so), inserts from measurement system and selects from report tools - generally with the application code.

I would appreciate any consistent strategy for implementing such partitioning.

Ops, just to mention - OS - Windows, MSSQL 2000, in case that matters in any way.
Jul 19 '07 #1
2 1945
mandor
3
An update...

After reading some stuff for partitions, views etc, I decided to split the data in two tables - one for the last month (that will need to be queried regularly), called RDATA_CURR and one for the the older data for the last year (that will be queried really rarely). A scheduled stored procedure should move data between them on daily basis. For the experiment I created a table RDATA_FULL that holds the sane data as the above tables. Also created a view (union all) on the first two tables. The thing is that there are some selects that need to be transparent upon the splitted tables and don't need to be really fast (while other are to be fast and always from the last month or so). The inserts are always at the current, because the timestamps are current time, actually.

So I generated few millions raws of junk and filled the tables. After that made the same typical select from the view, the full table and the current (the needed data is within the current period). The execution plan shows that actually select from the view is twice as slow as select from the full table, which is very close to the select from the current (full has millions and current has tens of thousands). This is not really what I expected - the split does not really make things better.

So my question is, first, does really the execution plan shows correctly queries cost? If it is, then such kind of splitting the data does not really improve performance on selects from all data.

The typical select is a select for a certain time interval for a certain ID. I wonder if I could make the select in the view more sophisticated, so the query is actually done on the partitioned table that actually holds the data.
Jul 20 '07 #2
mandor
3
Well, it seems that I will drop the partitioning thing, especially with the results I get from very large tables.

There are two tables - rdata_f holding data for all sites, about 10 millions generated rows for various sites and rdata_1 (check constraint on site_id) holding data only from site 1, having about 2 millions rows. The tables have identical layout and PK (the only difference is CHECK on Site_ID). The thing is that selects on index are much faster with the larger table.

How is that possible? I use dbcc dropcleanbuffers before each select and the check constraint should not matter for selects. The larger table is 15 times faster...
Jul 20 '07 #3

Post your reply

Sign in to post your reply or Sign up for a free account.

Similar topics

1 post views Thread by Jay | last post: by
18 posts views Thread by Jeff Boes | last post: by
2 posts views Thread by jorge | last post: by
2 posts views Thread by shsandeep | last post: by
10 posts views Thread by shsandeep | last post: by
9 posts views Thread by Veeru71 | last post: by
15 posts views Thread by Piero 'Giops' Giorgi | last post: by

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.