By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
426,012 Members | 988 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 426,012 IT Pros & Developers. It's quick & easy.

Performance of union all views

P: n/a
Hi,

I have a large table (about 50G) which stores data for over 7 years. I
decided to split this table up into a yearly basis and in order to
allow minimum changes to the applications which access this table, I
created a union all view over the 7 yearly tables.
What I have noticed is that queries against the union all view is
considerably slower than queries against the original table. When I ran
db2batch, I noticed cpu usage was higher + the number of piped sorts
was higher (proportional to the number of branches in my union all
view). Has anyone come across performance problems with union all views
and can offer some suggestions ??
I changed the locking level to TABLE for the read only year tables and
this helped a little bit.

Thanks
Lyn

Nov 12 '05 #1
Share this Question
Share on Google+
8 Replies


P: n/a
<ly*******@gmail.com> wrote in message
news:11**********************@f14g2000cwb.googlegr oups.com...
Hi,

I have a large table (about 50G) which stores data for over 7 years. I
decided to split this table up into a yearly basis and in order to
allow minimum changes to the applications which access this table, I
created a union all view over the 7 yearly tables.
What I have noticed is that queries against the union all view is
considerably slower than queries against the original table. When I ran
db2batch, I noticed cpu usage was higher + the number of piped sorts
was higher (proportional to the number of branches in my union all
view). Has anyone come across performance problems with union all views
and can offer some suggestions ??
I changed the locking level to TABLE for the read only year tables and
this helped a little bit.

Thanks
Lyn

If you enable intra-partition parallelism, DB2 will process each table in
parallel. For this reason, you need to balance the number of tables and the
degree of parallelism, with the number of processors and the distribution of
each table on different disk drives (or arrays), and if possible on
different disk controllers.

Obviously, with UNION ALL views, you don't always know how many tables will
be accessed in each query, so the optimization of this is tricky, especially
with respect to CPU's. But you should try and spread the data across
separate disk devices to the extent possible.

You can use the explain function to determine how much parallelism you are
getting and to tune your db, dbm, and other parms to optimum results.

Obviously, if you processing in parallel, and you do a lot of piped sorts,
you need lot of sort heap memory. You also need lots of space for your
system temporary tablespace(s).
Nov 12 '05 #2

P: n/a
For real optimize, IMHO, You need add like

ALTER TABLE table_YYYY ADD CONSTRAINT PERIOD_YYYY CHECK
(year_field = YYYY )
ENFORCED ENABLE QUERY OPTIMIZATION;

and useful for read only tables pctfree 0 (if its no default) and VALUE
COMPRESSION

Andy

Nov 12 '05 #3

P: n/a
I have placed the tables in a tablespace which is split across 4
devices.
After running explain, the union all actually looks better
performance-wise however there's an extra group-by step in the union
all access plan that i can't understand. the overall results is grouped
incorrectly. Both have paralellism of 3

Any ideas ?
Thanks again!
Lyn

Nov 12 '05 #4

P: n/a
I have placed the tables in a tablespace which is split across 4
devices.
After running explain, the union all actually looks better
performance-wise however there's an extra group-by step in the union
all access plan that i can't understand. the overall results is grouped
incorrectly. Both have paralellism of 3

Any ideas ?
Thanks again!
Lyn

Nov 12 '05 #5

P: n/a
I already have these contraints on the tables thanks.
but i will try pctfree 0 and value compression

Nov 12 '05 #6

P: n/a
<ly*******@gmail.com> wrote in message
news:11**********************@g43g2000cwa.googlegr oups.com...
I have placed the tables in a tablespace which is split across 4
devices.
After running explain, the union all actually looks better
performance-wise however there's an extra group-by step in the union
all access plan that i can't understand. the overall results is grouped
incorrectly. Both have paralellism of 3

Any ideas ?
Thanks again!
Lyn

If your queries typically access more than table at a time, you would be
probably be better off if you placed each table on a separate tablespace,
each on a separate physical device (separate disk or array). They way you
have it now, there is a lot of disk contention (potentially causing delays
as the heads move back and forth to different parts of the disk) if you are
doing parallel operations. Of course, this depends somewhat on what kind of
disk subsystem you have (JBOD, RAID, SAN, NAS).

If you only have 4 devices, then put two tablespaces on each device (for a
total of 8 tables). Make sure you do this in a round robin fashion, so that
the most frequently accessed tables do not end up on the same device.

Separate tablespaces will also allow you to backup one table at a time, then
delete the rows, and then restore the tablespace later if you need access to
the historical data. Obviously, SMS tablespaces will be easier to administer
in this scenario (tablespace size will shrink automatically), but DMS could
also work if you manually shrink the tablespace for tables that are
archived.

Sounds like your degree of parallelism is set to 4, and you might want to
change to the number of processors or a multiple thereof. Of course, if only
3 tables are accessed in a query, that would be the reason you have
parallelism of 3.
Nov 12 '05 #7

P: n/a
Ian
ly*******@gmail.com wrote:
Hi,

I have a large table (about 50G) which stores data for over 7 years. I
decided to split this table up into a yearly basis and in order to
allow minimum changes to the applications which access this table, I
created a union all view over the 7 yearly tables.
What I have noticed is that queries against the union all view is
considerably slower than queries against the original table. When I ran
db2batch, I noticed cpu usage was higher + the number of piped sorts
was higher (proportional to the number of branches in my union all
view). Has anyone come across performance problems with union all views
and can offer some suggestions ??
I changed the locking level to TABLE for the read only year tables and
this helped a little bit.


There are a couple of reasons to use UNION ALL views:

1) Query performance, which is achieved through branch elimination
(branch = a table in the view).

This really only applies if you decide how to break up your table with
the queries in mind. If none of your queries against the view have
predicates that allow for elimination of any of the branches you will
likely see performance degradation.

2) Ease of maintenance (i.e. roll out an entire year's data very
quickly).

Implementing UNION ALL views for ease of maintenance can easily lead to
performance problems like you describe. With proper planning (as Mark A
described), you can minimize the impact of the design change on overall
query performance.


Nov 12 '05 #8

P: n/a
Ian wrote:
ly*******@gmail.com wrote:
Hi,

I have a large table (about 50G) which stores data for over 7 years. I
decided to split this table up into a yearly basis and in order to
allow minimum changes to the applications which access this table, I
created a union all view over the 7 yearly tables.
What I have noticed is that queries against the union all view is
considerably slower than queries against the original table. When I ran
db2batch, I noticed cpu usage was higher + the number of piped sorts
was higher (proportional to the number of branches in my union all
view). Has anyone come across performance problems with union all views
and can offer some suggestions ??
I changed the locking level to TABLE for the read only year tables and
this helped a little bit.

There are a couple of reasons to use UNION ALL views:

1) Query performance, which is achieved through branch elimination
(branch = a table in the view).

This really only applies if you decide how to break up your table with
the queries in mind. If none of your queries against the view have
predicates that allow for elimination of any of the branches you will
likely see performance degradation.

2) Ease of maintenance (i.e. roll out an entire year's data very
quickly).

Implementing UNION ALL views for ease of maintenance can easily lead to
performance problems like you describe. With proper planning (as Mark A
described), you can minimize the impact of the design change on overall
query performance.

Lyn,

Can you post the DDL, the slow query and the explain plan (ideally
either db2expln with the -graph option or a db2exfmt output?

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #9

This discussion thread is closed

Replies have been disabled for this discussion.