
Large table redesign

Hi all,

We are logging approx. 3 million records every day into a history
table. Last week we ran into the 64 GB table size limit in UDB 8, so
we recreated the table with an 8 KB page size to get some breathing
room before we hit the 128 GB limit.

We are considering partitioning, and I wanted to check with you
whether our proposal is the best one:

Table structure is:

Column name     Type schema  Type name   Length  Scale  Nulls
--------------  -----------  ----------  ------  -----  -----
CREATED         SYSIBM       TIMESTAMP       10      0  No
OWNER_ID        SYSIBM       INTEGER          4      0  No
TRANS_ID        SYSIBM       BIGINT           8      0  No
EVENT_ID        SYSIBM       SMALLINT         2      0  No
OBJECT_ID       SYSIBM       INTEGER          4      0  No
CLASS_ID        SYSIBM       SMALLINT         2      0  No
PARAM_INDEX     SYSIBM       SMALLINT         2      0  No
VALUE_CHANGE    SYSIBM       BIGINT           8      0  No
VALUE_ACC       SYSIBM       BIGINT           8      0  No

Indexes are:

CREATED
OBJECT_ID,CREATED
OWNER_ID,CREATED
TRANS_ID

Our assumptions are:

The row size is 48 bytes, which means we will not benefit from a page
size larger than 8 KB (DB2 stores at most 255 rows per page, so
255 * 48 bytes, roughly 12 KB, is the most a page could ever hold).

We create the table with PCTFREE 0 and APPEND ON in a dedicated SMS
tablespace.
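
To be concrete, the DDL is roughly the following (the bufferpool,
tablespace, table, and container names here are illustrative, not our
real ones):

    -- 8 KB bufferpool and a dedicated SMS tablespace for the history table
    CREATE BUFFERPOOL BP8K SIZE 10000 PAGESIZE 8K;
    CREATE TABLESPACE TS_HIST PAGESIZE 8K
        MANAGED BY SYSTEM USING ('/db2/hist01')
        BUFFERPOOL BP8K;

    CREATE TABLE HIST (
        CREATED      TIMESTAMP NOT NULL,
        OWNER_ID     INTEGER   NOT NULL,
        TRANS_ID     BIGINT    NOT NULL,
        EVENT_ID     SMALLINT  NOT NULL,
        OBJECT_ID    INTEGER   NOT NULL,
        CLASS_ID     SMALLINT  NOT NULL,
        PARAM_INDEX  SMALLINT  NOT NULL,
        VALUE_CHANGE BIGINT    NOT NULL,
        VALUE_ACC    BIGINT    NOT NULL
    ) IN TS_HIST;

    -- No free space on data pages; always append new rows at the end
    ALTER TABLE HIST PCTFREE 0;
    ALTER TABLE HIST APPEND ON;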

CREATED and TRANS_ID increase with every insert, but since many hosts
write into this table, rows will not arrive perfectly ordered, which
leads us to assume that we will not benefit from clustering.

We are thinking of partitioning on CREATED with one month of data in
each partition. That is currently about 100 million rows per
partition, and the insert rate grows by about 4% every month (so next
month will add about 104 million new rows); at 48 bytes per row that
is currently about 4.8 GB of data per monthly partition.

For business reasons we want to keep about two years online; after
that we will purge monthly, something like the sketch below.
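
A single-statement version of the purge (in practice we would delete
in chunks to keep log usage under control):

    -- Drop everything older than two years
    DELETE FROM HIST
    WHERE CREATED < CURRENT TIMESTAMP - 24 MONTHS;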

Inserts and searches are performed 24/7; searches mostly hit recent
data, but older data is sometimes searched as well.

So here are my questions:

a) Do you have better suggestions for creating this table?
b) Should we change the XXX,CREATED indexes to CREATED,XXX?
c) Is there a significant performance gain in changing TIMESTAMP to
something with less precision?

And finally, I would really appreciate recommendations on how to
configure the new disk array that will arrive soon, to get the best
performance with the table mentioned above.

(HP MSA30, dual-channel U320 with 14 drives)

Kind regards,
/Mats
Nov 12 '05 #1
"Mats Kling" <ma********@gmail.com> wrote in message
news:ff**************************@posting.google.c om...
Hi all,

We are logging approx. 3 million records every day into a history
table.
Last week we ran into the 64 GB limit in UDB 8 so we recreated the
table with 8 k pagesize to get some breathingroom before we hit the
128 GB limit.

We are considering partitioning and I just wanted to check with you
that our proposal is the best one:

Table structure is:
Column Type Type
name schema name Length
Scale Nulls
------------------------------ --------- ------------------ --------
----- -----

CREATED SYSIBM TIMESTAMP 10
0 No

OWNER_ID SYSIBM INTEGER 4
0 No

TRANS_ID SYSIBM BIGINT 8
0 No

EVENT_ID SYSIBM SMALLINT 2
0 No

OBJECT_ID SYSIBM INTEGER 4
0 No

CLASS_ID SYSIBM SMALLINT 2
0 No

PARAM_INDEX SYSIBM SMALLINT 2
0 No

VALUE_CHANGE SYSIBM BIGINT 8
0 No

VALUE_ACC SYSIBM BIGINT 8
0 No

Indexes are:

CREATED
OBJECT_ID,CREATED
OWNER_ID,CREATED
TRANS_ID

Our assumption is:

The rowsize is 48 bytes which means we will not benefit from a
pagesize larger than 8kB

We create with option PCTFREE 0 and APPEND ON in a dedicated SMS
tablespace

CREATED and TRANS_ID are increasing with every insert, but since many
hosts are writing into this table, they will not arrive perfectly
ordered which leads to the assumption that we will not benefit from
clustering.

We are thinking that we want to partition on "CREATED" and have one
month of data in each partition. This is currently about 100 million
rows / partition and the speed of entry increases with about 4% every
month (so next month will be 104 new million rows) - that will
currently be about 4,8 GB data per current partition.

For business reasons we want to keep about two years online, and after
what we will purge monthly.

Inserts and searches are performed 24/7, where searches are mostly
performed on recent data but sometimes older data is searched also.

So here are my questions:

a) Do you have better suggestions for creating this table?
b) Should we alter the XXX,CREATED indexes to CREATED,XXX ?
c) Is there a siginificant performance gain in altering TIMESTAMP to
something with less precision?

and finally, I would really appreciate recommendations on how to
configure the new diskarray that soon arrive to get best performance
with the table mentioned above.

(HP MSA30 , dualchannel U320 with 14 drives)

Kindly regards,
/Mats


You would have to explain a lot more about this application, and
whether you are trying to optimize insert performance or query
performance. A LOAD would be faster than inserts, if you can do that.
The indexes will slow loads/inserts down considerably, so make sure
they are needed and that their PCTFREE is configured properly (and
reorg the indexes when possible), unless an index is on a column with
an ever-increasing value.
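
For example, an index whose leading column only ever increases can be
built with no free space, since new keys always go at the end (index
name illustrative):

    -- New CREATED values always land in the last leaf page,
    -- so reserving free space in this index buys nothing
    CREATE INDEX HIST_IX_CREATED ON HIST (CREATED) PCTFREE 0;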

If you want to use SMS with large inserts, make sure you run db2empfa
to enable multipage file allocation (so DB2 allocates a full extent at
a time instead of one page at a time). See the Command Reference.
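
It is a one-time command per database, e.g. (database alias
illustrative):

    db2empfa HISTDB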

DB2 LUW partitioning is hash partitioning on the key you specify; you
cannot put a particular month in a particular partition. You might
want to look at creating a separate table for each month and using a
UNION ALL view for retrieval. See this link:
http://www-106.ibm.com/developerwork...202zuzarte.pdf
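
A minimal sketch of that approach (table and constraint names are
illustrative); the range check constraints are what let the optimizer
skip monthly tables that cannot match a query's predicate on CREATED:

    -- One table per month, each with a constraint bounding CREATED
    CREATE TABLE HIST_2005_11 LIKE HIST;
    ALTER TABLE HIST_2005_11 ADD CONSTRAINT CK_2005_11
        CHECK (CREATED >= '2005-11-01-00.00.00.000000'
           AND CREATED <  '2005-12-01-00.00.00.000000');

    CREATE TABLE HIST_2005_12 LIKE HIST;
    ALTER TABLE HIST_2005_12 ADD CONSTRAINT CK_2005_12
        CHECK (CREATED >= '2005-12-01-00.00.00.000000'
           AND CREATED <  '2006-01-01-00.00.00.000000');

    -- All queries go through the view; purging a month is DROP TABLE
    -- plus recreating the view without that branch
    CREATE VIEW HIST_ALL AS
        SELECT * FROM HIST_2005_11
        UNION ALL
        SELECT * FROM HIST_2005_12;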

I don't think the timestamp column is a big issue.

I don't understand your question b) about the indexes.

Are all the disks in one single array? RAID 5? Need more details about this.

Nov 12 '05 #2


