473,543 Members | 2,172 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Need advise on horizontal table partitioning

3 New Member

I need some advise in table design, and more specifically about table partitioning. I read some papers and there was mentioned that if a table is expected to hold millions of rows, it's a good idea to partition it.

Vertical partitioning, as I understood it, is separating data that differs in some way in a separate table, adding a key field as an identifier to what segment it belongs.

The particular table holds signal measurements from different sites and has the following layout :

SiteID int,
MeasID int,
Timestamp datetime,
Value real,
Status int
etcJunk int

SiteID, MeasID and Timestamp form the PK.

The horizontal partitioning should be obviously done by Timestamp, but that creates pretty bad problems with automatic table creation (when it's time to do so), inserts from measurement system and selects from report tools - generally with the application code.

I would appreciate any consistent strategy for implementing such partitioning.

Ops, just to mention - OS - Windows, MSSQL 2000, in case that matters in any way.
Jul 19 '07 #1
2 2007
3 New Member
An update...

After reading some stuff for partitions, views etc, I decided to split the data in two tables - one for the last month (that will need to be queried regularly), called RDATA_CURR and one for the the older data for the last year (that will be queried really rarely). A scheduled stored procedure should move data between them on daily basis. For the experiment I created a table RDATA_FULL that holds the sane data as the above tables. Also created a view (union all) on the first two tables. The thing is that there are some selects that need to be transparent upon the splitted tables and don't need to be really fast (while other are to be fast and always from the last month or so). The inserts are always at the current, because the timestamps are current time, actually.

So I generated few millions raws of junk and filled the tables. After that made the same typical select from the view, the full table and the current (the needed data is within the current period). The execution plan shows that actually select from the view is twice as slow as select from the full table, which is very close to the select from the current (full has millions and current has tens of thousands). This is not really what I expected - the split does not really make things better.

So my question is, first, does really the execution plan shows correctly queries cost? If it is, then such kind of splitting the data does not really improve performance on selects from all data.

The typical select is a select for a certain time interval for a certain ID. I wonder if I could make the select in the view more sophisticated, so the query is actually done on the partitioned table that actually holds the data.
Jul 20 '07 #2
3 New Member
Well, it seems that I will drop the partitioning thing, especially with the results I get from very large tables.

There are two tables - rdata_f holding data for all sites, about 10 millions generated rows for various sites and rdata_1 (check constraint on site_id) holding data only from site 1, having about 2 millions rows. The tables have identical layout and PK (the only difference is CHECK on Site_ID). The thing is that selects on index are much faster with the larger table.

How is that possible? I use dbcc dropcleanbuffer s before each select and the check constraint should not matter for selects. The larger table is 15 times faster...
Jul 20 '07 #3

Sign in to post your reply or Sign up for a free account.

Similar topics

by: Jay | last post by:
Hi I have a huge table with over 100million records and on regular basis ineed to delete nearly a million records and insert a million records. Currently I delete indexes before going through the process and recreate the indexes which takes a very very long time. IS there a way to disable indexes and re enable them after doing insert and...
by: Jeff Boes | last post by:
I'm sure this is a concept that's been explored here. I have a table (fairly simple, just two columns, one of which is a 32-digit checksum) with several million rows (currently, about 7 million). About a million times a day we do select * from my_table where md5 = ? to verify presence or absence of the row, and base further processing on...
by: jorge | last post by:
I have just started working with DB2 Data Partition Feature.. Given a table name, how can I find out the partitioning key of that table? Is there a SYSIBM.* table that I can look at? Thanks, Jorge
by: shsandeep | last post by:
Hi all, I have heard and read this many times: "Partitions should only be used for 'very large' tables". What actually determines whether a table is 'very large' or not? I have tables containing 0.5 million rows, 8 million rows, 14 & 29 million rows as well. How do I categorize them? Any comments will be helpful.
by: shsandeep | last post by:
DB2 V8.2 (not Viper yet and no range partitioning!!) I have created a table T1 (col1, col2) with col1 as the primary key. When I try to create a partitioning key on col2, it gives me error that it should have all primary keys included. So, I created table T1 again with col2 as the partitioning key. Now, I do not have col1 as the primary...
by: Veeru71 | last post by:
Can someone point me to good documentation on 'WITH clause" ? (I couldn't get much out of Queries section from SQL Reference manual). We are getting better performance when we explicity use global temp tables to store intermediate results than using "WITH cluase" in our queries. Where does DB2 store the intermediate results if the query uses...
by: Chessie | last post by:
I'm currently working on Table Partitioning. I have done everything succesfull for partitioning, what I'm struggling with is to use " SELECT MAX(NAME) from sys.filegroups WHERE NAME NOT LIKE 'PRIMARY' " to get the last used FileGroup. Now this works just fine, but when I run the following script: --Decalre variables DECLARE @LastFilegroupName...
by: Piero 'Giops' Giorgi | last post by:
Hi! I have a question: I already have a DB that uses partitions to divide data in US Counties, partitioned by state. Can I use TWO levels of partitioning? I mean... 3077 filegroups and 50 partition functions that address
by: amuven | last post by:
Hi All, I need to put a horizontal scroll bar for 4 cells alone where my first cell in table should not contain any horizontal scroll bar . In clear, let us say there are 5 columns in my table i want to put horizontal scroll bar only for 2nd,3rd,4th and 5th column alone where my 1st column should not horizontal scroll bar . so that I can...
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main...
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. ...
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For...
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the...
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then...
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert...
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in...
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.