Bytes | Software Development & Data Engineering Community

Large table redesign

Hi all,

We are logging approx. 3 million records every day into a history
table.
Last week we ran into the 64 GB table-size limit in UDB 8, so we
recreated the table with an 8 K page size to get some breathing room
before we hit the 128 GB limit.

We are considering partitioning and I just wanted to check with you
whether our proposal is a good one:

Table structure is:

Column name    Type schema  Type name  Length  Scale  Nulls
-----------    -----------  ---------  ------  -----  -----
CREATED        SYSIBM       TIMESTAMP      10      0  No
OWNER_ID       SYSIBM       INTEGER         4      0  No
TRANS_ID       SYSIBM       BIGINT          8      0  No
EVENT_ID       SYSIBM       SMALLINT        2      0  No
OBJECT_ID      SYSIBM       INTEGER         4      0  No
CLASS_ID       SYSIBM       SMALLINT        2      0  No
PARAM_INDEX    SYSIBM       SMALLINT        2      0  No
VALUE_CHANGE   SYSIBM       BIGINT          8      0  No
VALUE_ACC      SYSIBM       BIGINT          8      0  No

Indexes are:

CREATED
OBJECT_ID,CREATED
OWNER_ID,CREATED
TRANS_ID

Our assumptions are:

The row size is 48 bytes, which means we will not benefit from a
page size larger than 8 kB.

We create the table with PCTFREE 0 and APPEND ON in a dedicated SMS
tablespace.

CREATED and TRANS_ID increase with every insert, but since many
hosts write into this table, the rows will not arrive perfectly
ordered, which leads us to assume that we will not benefit from
clustering.
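For reference, the setup described above could be sketched along these lines. This is only an illustration, not our exact DDL: the table, index, and tablespace names (HISTORY_LOG, HISTORY_SMS, IX_*) are placeholders, and in DB2 the PCTFREE and APPEND options are set via ALTER TABLE rather than in CREATE TABLE:

```sql
-- Sketch of the table and indexes described above; all names are placeholders.
CREATE TABLE HISTORY_LOG (
    CREATED      TIMESTAMP NOT NULL,
    OWNER_ID     INTEGER   NOT NULL,
    TRANS_ID     BIGINT    NOT NULL,
    EVENT_ID     SMALLINT  NOT NULL,
    OBJECT_ID    INTEGER   NOT NULL,
    CLASS_ID     SMALLINT  NOT NULL,
    PARAM_INDEX  SMALLINT  NOT NULL,
    VALUE_CHANGE BIGINT    NOT NULL,
    VALUE_ACC    BIGINT    NOT NULL
) IN HISTORY_SMS;                      -- dedicated SMS tablespace, 8 K pages

ALTER TABLE HISTORY_LOG PCTFREE 0;     -- reserve no free space on data pages
ALTER TABLE HISTORY_LOG APPEND ON;     -- always insert at the end of the table

CREATE INDEX IX_CREATED     ON HISTORY_LOG (CREATED) PCTFREE 0;
CREATE INDEX IX_OBJ_CREATED ON HISTORY_LOG (OBJECT_ID, CREATED);
CREATE INDEX IX_OWN_CREATED ON HISTORY_LOG (OWNER_ID, CREATED);
CREATE INDEX IX_TRANS       ON HISTORY_LOG (TRANS_ID);
```

PCTFREE 0 on the CREATED index assumes an ever-increasing key (inserts always land on the last leaf page, so reserved free space would be wasted); the other indexes would keep the default PCTFREE since their keys arrive unordered.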

We are thinking that we want to partition on "CREATED" and keep one
month of data in each partition. This is currently about 100 million
rows per partition, and the insert rate grows by about 4% every
month (so next month will bring 104 million new rows) - that is
currently about 4.8 GB of data per monthly partition.

For business reasons we want to keep about two years online, after
which we will purge monthly.

Inserts and searches run 24/7; searches mostly target recent data,
but older data is sometimes searched as well.

So here are my questions:

a) Do you have better suggestions for creating this table?
b) Should we alter the XXX,CREATED indexes to CREATED,XXX?
c) Is there a significant performance gain in changing TIMESTAMP to
something with less precision?

And finally, I would really appreciate recommendations on how to
configure the new disk array that will arrive soon to get the best
performance with the table mentioned above.

(HP MSA30, dual-channel U320 with 14 drives)

Kind regards,
/Mats
Nov 12 '05 #1
"Mats Kling" <ma********@gmail.com> wrote in message
news:ff**************************@posting.google.c om...
[quoted post snipped]


You would have to explain a lot more about this application, and whether you
are trying to optimize insert performance or query performance. A LOAD would
be faster than inserts, if you can do that. The indexes will slow
loads/inserts down considerably, so make sure they are needed and that they
are configured with an appropriate PCTFREE (and that you REORG the indexes
when possible) - unless the index key is a column with an always-increasing
value.

If you want to use SMS with large inserts, make sure you run db2empfa to
enable multipage file allocation (allocate more than one new page at a
time). See the Command Reference.

DB2 LUW partitioning is a random hash partitioning on the key you specify.
You cannot put a particular month in a particular partition. You might want
to look at creating a separate table for each month and using UNION ALL
views for retrieval. See this link:
http://www-106.ibm.com/developerwork...202zuzarte.pdf
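As a sketch of that approach (all names are placeholders; HISTORY_LOG stands for the history table described in the post): each month gets its own table with a check constraint on CREATED, and a UNION ALL view over them lets the optimizer eliminate branches whose constraint rules out the query's date predicate. Purging a month is then just redefining the view and dropping one table, instead of deleting ~100 million rows:

```sql
-- One table per month, each constrained to its date range so the
-- optimizer can prune branches of the UNION ALL view.
CREATE TABLE HISTORY_2005_10 LIKE HISTORY_LOG;
ALTER TABLE HISTORY_2005_10 ADD CONSTRAINT CK_RANGE
    CHECK (CREATED >= '2005-10-01-00.00.00.000000'
       AND CREATED <  '2005-11-01-00.00.00.000000');

CREATE TABLE HISTORY_2005_11 LIKE HISTORY_LOG;
ALTER TABLE HISTORY_2005_11 ADD CONSTRAINT CK_RANGE
    CHECK (CREATED >= '2005-11-01-00.00.00.000000'
       AND CREATED <  '2005-12-01-00.00.00.000000');

-- Searches go against the view; a predicate like
-- WHERE CREATED > '2005-11-15-00.00.00.000000' touches only one branch.
CREATE VIEW HISTORY_ALL AS
    SELECT * FROM HISTORY_2005_10
    UNION ALL
    SELECT * FROM HISTORY_2005_11;

-- Monthly purge: CREATE OR redefine the view without the oldest month,
-- then DROP TABLE on that month's table.
```

Inserts would target the current month's table directly (or the view, with the check constraints routing rows), and indexes are created per monthly table, which also keeps each index small.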

I don't think the timestamp column is a big issue.

I don't understand your question b) about the indexes.

Are all the disks in one single array? RAID 5? Need more details about this.

Nov 12 '05 #2
