Bytes | Software Development & Data Engineering Community
Need advice on horizontal table partitioning

Hello,

I need some advice on table design, and more specifically about table partitioning. I read some papers that mentioned that if a table is expected to hold millions of rows, it's a good idea to partition it.

Vertical partitioning, as I understand it, means separating data that differs in some way into a separate table, adding a key field that identifies which segment it belongs to.

The particular table holds signal measurements from different sites and has the following layout:

SiteID int,
MeasID int,
Timestamp datetime,
Value real,
Status int,
etcJunk int

SiteID, MeasID and Timestamp form the PK.

The horizontal partitioning should obviously be done by Timestamp, but that creates serious problems with automatic table creation (when a new partition is due), with inserts from the measurement system, and with selects from the reporting tools - in general, with the application code.
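For reference, a timestamp-based scheme on SQL Server 2000 is usually built as a "local partitioned view": one member table per period, each with a CHECK constraint on the partitioning column, joined by a UNION ALL view. This is only a sketch under assumed names (RDATA_200707 etc. and the date ranges are made up; the columns follow the layout above):

```sql
-- One member table per month; the CHECK constraint on [Timestamp]
-- lets the optimizer skip tables whose range cannot match a query.
-- For the view to be updatable, the partitioning column must be part
-- of the primary key and the CHECK ranges must not overlap.
CREATE TABLE RDATA_200707 (
    SiteID      int      NOT NULL,
    MeasID      int      NOT NULL,
    [Timestamp] datetime NOT NULL
        CHECK ([Timestamp] >= '20070701' AND [Timestamp] < '20070801'),
    Value       real,
    Status      int,
    etcJunk     int,
    PRIMARY KEY (SiteID, MeasID, [Timestamp])
)

CREATE TABLE RDATA_200708 (
    SiteID      int      NOT NULL,
    MeasID      int      NOT NULL,
    [Timestamp] datetime NOT NULL
        CHECK ([Timestamp] >= '20070801' AND [Timestamp] < '20070901'),
    Value       real,
    Status      int,
    etcJunk     int,
    PRIMARY KEY (SiteID, MeasID, [Timestamp])
)
GO

-- The application queries the view; the member tables stay hidden.
CREATE VIEW RDATA AS
    SELECT SiteID, MeasID, [Timestamp], Value, Status, etcJunk FROM RDATA_200707
    UNION ALL
    SELECT SiteID, MeasID, [Timestamp], Value, Status, etcJunk FROM RDATA_200708
```

The monthly member tables would still have to be created by a scheduled job, which is exactly the automation problem described above.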

I would appreciate any consistent strategy for implementing such partitioning.

Oops, just to mention: the OS is Windows and the database is MSSQL 2000, in case that matters in any way.
Jul 19 '07 #1
mandor
An update...

After reading up on partitions, views, etc., I decided to split the data into two tables: one for the last month (which will be queried regularly), called RDATA_CURR, and one for the older data from the last year (which will be queried quite rarely). A scheduled stored procedure should move data between them on a daily basis. For the experiment I also created a table RDATA_FULL that holds the same data as the two tables above, and a view (UNION ALL) over the first two tables.

The thing is that some selects need to be transparent over the split tables and don't need to be particularly fast (while others must be fast and always hit the last month or so). The inserts always go to the current table, because the timestamps are effectively the current time.
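The daily move described above could be sketched like this (RDATA_OLD is an assumed name for the unnamed older-data table; the one-month cutoff follows the post):

```sql
-- Moves rows older than one month from RDATA_CURR to RDATA_OLD
-- in a single transaction, so a failure leaves both tables intact.
CREATE PROCEDURE dbo.MoveAgedRows AS
BEGIN
    DECLARE @cutoff datetime
    SET @cutoff = DATEADD(month, -1, GETDATE())

    BEGIN TRANSACTION
        INSERT INTO RDATA_OLD (SiteID, MeasID, [Timestamp], Value, Status, etcJunk)
        SELECT SiteID, MeasID, [Timestamp], Value, Status, etcJunk
        FROM RDATA_CURR
        WHERE [Timestamp] < @cutoff

        DELETE FROM RDATA_CURR
        WHERE [Timestamp] < @cutoff
    COMMIT TRANSACTION
END
```

On large volumes the insert/delete pair may be worth batching (e.g. one day at a time) to keep the transaction log manageable, but that is a tuning detail.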

So I generated a few million rows of junk data and filled the tables. Then I ran the same typical select against the view, the full table, and the current table (the needed data falls within the current period). The execution plan shows that the select from the view is actually twice as slow as the select from the full table, which in turn is very close to the select from the current table (the full table has millions of rows, the current one tens of thousands). This is not really what I expected - the split does not make things better.

So my first question is: does the execution plan really report query cost correctly? If it does, then this kind of split does not actually improve performance for selects over all the data.

The typical select is over a certain time interval for a certain ID. I wonder if I could make the select in the view more sophisticated, so the query actually runs only against the partitioned table that holds the data.
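One way to get the view to touch only the table that holds the data is to give each member table a trusted CHECK constraint on the partitioning column, so the optimizer can rule out member tables whose constraint contradicts the WHERE clause. A sketch under assumed values (the dates and constraint name are made up; note that with a sliding one-month boundary the constraint itself would have to be maintained by the daily move job, which may be why the view is not pruning here):

```sql
-- A trusted (WITH CHECK) constraint on RDATA_CURR tells the
-- optimizer which rows this member table can possibly contain.
ALTER TABLE RDATA_CURR WITH CHECK
    ADD CONSTRAINT CK_rdata_curr_range CHECK ([Timestamp] >= '20070620')

-- A typical interval-plus-ID query against the UNION ALL view.
-- The plan should then show only RDATA_CURR being accessed.
SELECT SiteID, MeasID, [Timestamp], Value
FROM RDATA           -- the UNION ALL view over the two tables
WHERE SiteID = 1
  AND MeasID = 42
  AND [Timestamp] >= '20070701' AND [Timestamp] < '20070708'
```

Without such constraints, SQL Server 2000 has to probe every member table of the view, which would explain the view being roughly twice as slow as a single table.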
Jul 20 '07 #2
mandor
Well, it seems that I will drop the partitioning idea, especially given the results I get from very large tables.

There are two tables: rdata_f, holding data for all sites (about 10 million generated rows across various sites), and rdata_1 (with a CHECK constraint on Site_ID), holding data only from site 1 (about 2 million rows). The tables have an identical layout and PK (the only difference is the CHECK on Site_ID). The strange thing is that index selects are much faster against the larger table.

How is that possible? I run DBCC DROPCLEANBUFFERS before each select, and the CHECK constraint should not matter for selects. The larger table is 15 times faster...
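Estimated plan cost percentages can be misleading; comparing actual I/O for the two tables is more telling. A sketch (the filter values below are made up to stand in for the "typical select"):

```sql
-- Flush the buffer cache, then measure real reads and elapsed time
-- for the same query against each table.
DBCC DROPCLEANBUFFERS
SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT SiteID, MeasID, [Timestamp], Value
FROM rdata_f
WHERE SiteID = 1 AND MeasID = 42
  AND [Timestamp] >= '20070701' AND [Timestamp] < '20070708'

SELECT SiteID, MeasID, [Timestamp], Value
FROM rdata_1
WHERE SiteID = 1 AND MeasID = 42
  AND [Timestamp] >= '20070701' AND [Timestamp] < '20070708'

SET STATISTICS IO OFF
SET STATISTICS TIME OFF
```

If rdata_1 shows far more logical reads per returned row, the usual suspects are index fragmentation from how the test rows were inserted (DBCC SHOWCONTIG can confirm) or out-of-date statistics (fixable with UPDATE STATISTICS), rather than anything to do with the CHECK constraint.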
Jul 20 '07 #3

