Bytes | Developer Community
Definition of a 'Very Large Table'

Hi all,
I have heard and read this many times: "Partitions should only be used for
'very large' tables".
What actually determines whether a table is 'very large' or not?
I have tables containing 0.5 million, 8 million, 14 million, and 29 million
rows.
How do I categorize them?

Any comments will be helpful.

Cheers,
San.
May 12 '06 #1
"shsandeep" <sa**********@gmail.com> wrote in message
news:ab******************************@localhost.ta lkaboutdatabases.com...


Partitioning (with DB2 DPF) should be used when a significant number of
queries would benefit from parallel processing. These usually include
queries where it is necessary to read all (or a large percentage) of the
rows in a table in order to return the answer, the table has a lot of rows,
and response time is unacceptable in a non-parallel environment. This
usually occurs when the access plan is one or more table scans of large
tables. There is no magic number of rows that tells you when DPF is
recommended.
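One way to see whether the access plan is in fact a table scan is to run the statement through DB2's explain facility. A minimal sketch, assuming the explain tables have already been created (via EXPLAIN.DDL) and using a made-up SALES table for illustration:

```sql
-- Capture the access plan for a full-aggregation query, then look for
-- TBSCAN operators on the large table in the populated explain tables.
EXPLAIN PLAN FOR
    SELECT CUSTOMER_ID, SUM(AMOUNT)
    FROM   SALES
    GROUP  BY CUSTOMER_ID;

SELECT OPERATOR_ID, OPERATOR_TYPE, TOTAL_COST
FROM   EXPLAIN_OPERATOR
ORDER  BY OPERATOR_ID;
```

If the plan is dominated by a TBSCAN over a table with many rows, that is the profile of a query that parallelism can help.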

An OLTP system which uses indexes to quickly return only a few rows may not
benefit at all from DPF, regardless of the number of rows in the table. In
fact, OLTP queries may run slower with DPF because of the overhead required
for each partition to process its portion of the table and for DB2 to
assemble the results into one answer. (There are some situations where DPF
can be used effectively with OLTP, but only experienced professionals should
attempt this.)

DPF (the Data Partitioning Feature) enables inter-partition parallelism. But
you can also do intra-partition parallelism with only one partition (and
without DPF). UNION ALL views are one way to accomplish this, and I presume
that range partitioning in V9.1 (Viper), due out later this summer, will do
the same.
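The UNION ALL approach can be sketched as follows. The table and column names here are made up for illustration; the key idea is that each piece carries a check constraint, so the optimizer can skip branches of the view whose constraint cannot match a query's predicate:

```sql
-- Hypothetical yearly "partitions" held as separate tables.
CREATE TABLE SALES_2005 (
    SALE_DATE DATE NOT NULL,
    AMOUNT    DECIMAL(12,2),
    CONSTRAINT CK_2005 CHECK (SALE_DATE BETWEEN '2005-01-01' AND '2005-12-31')
);
CREATE TABLE SALES_2006 (
    SALE_DATE DATE NOT NULL,
    AMOUNT    DECIMAL(12,2),
    CONSTRAINT CK_2006 CHECK (SALE_DATE BETWEEN '2006-01-01' AND '2006-12-31')
);

-- The view presents the pieces as one logical table; a query with a
-- SALE_DATE predicate only touches the tables whose constraint can match.
CREATE VIEW SALES AS
    SELECT SALE_DATE, AMOUNT FROM SALES_2005
    UNION ALL
    SELECT SALE_DATE, AMOUNT FROM SALES_2006;
```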


May 12 '06 #2
San,

My definition is:
a table/database is very large if you are approaching some limit that
hurts your business.

Possible limits are:
time to perform a backup, response time of a query, maximum number of
rows that one table/tablespace can hold (4 x 10^9 per partition),
tablespace size (e.g. 512 GB for a 32 KB page in version 8; 16 TB in
version 9), high machine utilization, etc.

For example, if your environment is powerful enough to serve your
workload, you can consider your database "small enough", even if
it's 10 times larger than your neighbor's.

A 29-million-row table with wide rows can be large enough to hit the maximum
tablespace size. In that case you should probably use a larger page size,
partition the table using UNION ALL, or wait for version 9 (DPF? Probably
not, unless you have a terabyte data warehouse).
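A rough way to check how close a table is to such a limit is to estimate its size from the catalog. A sketch assuming DB2 LUW's SYSCAT views and a placeholder table name SALES; the CARD and NPAGES statistics are only current after RUNSTATS has been run:

```sql
-- Estimated size = pages occupied by the table x page size of its
-- tablespace; compare EST_BYTES against the tablespace size limit.
SELECT T.TABNAME,
       T.CARD                 AS EST_ROWS,
       T.NPAGES * TS.PAGESIZE AS EST_BYTES
FROM   SYSCAT.TABLES      T
JOIN   SYSCAT.TABLESPACES TS
       ON T.TBSPACE = TS.TBSPACE
WHERE  T.TABNAME = 'SALES';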

-- Artur Wronski

May 12 '06 #3
