Connecting Tech Pros Worldwide Help | Site Map

dramatical performance degradation on dynamic sql over night

  #1  
Old November 12th, 2005, 07:38 AM
florian
Guest
 
Posts: n/a
Hello,

we are running DB2 UDB EEE Version 7.2 Fixpack 12 on a two machine
Windows 2000 Advanced Server Cluster in a dss environment.

Some dynamic sql statements for etl processes and even some for online
user queries switched overnight from some minutes runtime to a few
hours or "never come back" statements.

We didn't change a single database or instance parameter. Even the
statistics are still the same for the involved queries. Unfortunately
we cannot compare the before and after access plans because we only
have the actual ones. Query rewriting and index optimization helped us
to tune the bad performing dynamic sqls but we still see bad
performing statements which used to run well.

Most (not all) of the statements are using inner joins and left outer
joins. Some of the tables have more than 100 million rows. However the
statements used to perform well some days ago without changing
parameters. Even the reorgchk output seems to be ok.

My questions:
Has anybody seen a similar behaviour in his/her environment?
What else could affect the access plan for dynamic sql statements?
Are there any issues besides the access plans which may be caused the
performance degradation?

thank you
Florian
  #2  
Old November 12th, 2005, 07:38 AM
Mark A
Guest
 
Posts: n/a

re: dramatical performance degradation on dynamic sql over night


"florian" <autosms@web.de> wrote in message
news:35381911.0406160551.1db80e03@posting.google.c om...[color=blue]
> Hello,
>
> we are running DB2 UDB EEE Version 7.2 Fixpack 12 on a two machine
> Windows 2000 Advanced Server Cluster in a dss environment.
>
> Some dynamic sql statements for etl processes and even some for online
> user queries switched overnight from some minutes runtime to a few
> hours or "never come back" statements.
>
> We didn't change a single database or instance parameter. Even the
> statistics are still the same for the involved queries. Unfortunately
> we cannot compare the before and after access plans because we only
> have the actual ones. Query rewriting and index optimization helped us
> to tune the bad performing dynamic sqls but we still see bad
> performing statements which used to run well.
>
> Most (not all) of the statements are using inner joins and left outer
> joins. Some of the tables have more than 100 million rows. However the
> statements used to perform well some days ago without changing
> parameters. Even the reorgchk output seems to be ok.
>
> My questions:
> Has anybody seen a similar behaviour in his/her environment?
> What else could affect the access plan for dynamic sql statements?
> Are there any issues besides the access plans which may be caused the
> performance degradation?
>
> thank you
> Florian[/color]

Make sure you have sufficient sizes for your system temporary tablespaces.
Sometimes when they are not large enough, the queries can take a lot longer.


  #3  
Old November 12th, 2005, 07:38 AM
Haider Rizvi
Guest
 
Posts: n/a

re: dramatical performance degradation on dynamic sql over night


Has anything changed at the system level? If you haven't changed any
db2 parameter settings, and it has the old stats then plans can not
change. I'd suggest looking at your system first and try to understand,
with the various perf monitors that Windows provides, if the system is
performing badly. May be a disk has gone bad on a raid array? may be
you are using a lot more paging now, etc.

autosms@web.de (florian) writes:
[color=blue]
> Hello,
>
> we are running DB2 UDB EEE Version 7.2 Fixpack 12 on a two machine
> Windows 2000 Advanced Server Cluster in a dss environment.
>
> Some dynamic sql statements for etl processes and even some for online
> user queries switched overnight from some minutes runtime to a few
> hours or "never come back" statements.
>
> We didn't change a single database or instance parameter. Even the
> statistics are still the same for the involved queries. Unfortunately
> we cannot compare the before and after access plans because we only
> have the actual ones. Query rewriting and index optimization helped us
> to tune the bad performing dynamic sqls but we still see bad
> performing statements which used to run well.
>
> Most (not all) of the statements are using inner joins and left outer
> joins. Some of the tables have more than 100 million rows. However the
> statements used to perform well some days ago without changing
> parameters. Even the reorgchk output seems to be ok.
>
> My questions:
> Has anybody seen a similar behaviour in his/her environment?
> What else could affect the access plan for dynamic sql statements?
> Are there any issues besides the access plans which may be caused the
> performance degradation?
>
> thank you
> Florian[/color]

--
Regards,
--
Haider
  #4  
Old November 12th, 2005, 07:38 AM
Paul Vernon
Guest
 
Posts: n/a

re: dramatical performance degradation on dynamic sql over night


"florian" <autosms@web.de> wrote in message
news:35381911.0406160551.1db80e03@posting.google.c om...[color=blue]
> Hello,
>
> we are running DB2 UDB EEE Version 7.2 Fixpack 12 on a two machine
> Windows 2000 Advanced Server Cluster in a dss environment.
>
> Some dynamic sql statements for etl processes and even some for online
> user queries switched overnight from some minutes runtime to a few
> hours or "never come back" statements.[/color]

I've occasionally seen such behaviour. Adding RUNSTATs jobs into key bits of
the ETL fixed the problem, however I've never been sure about the root
cause.

The immediate cause is differing access plans, but whether this is because
the optimiser has poor statistics and is on a knife edge between two
separate plans - one a good one in practice and another a very bad one. In
this case, a few more rows in one table might be enough to push the
optimiser to choose the good plan to the bad one.

The solution here is to improve the stats that the optimiser is working
with. For example look at some of the estimated filter factors and
cardinalities in your access plans. If these are way out to what you know
the real filter factors and cardinalities to be, look to RUNSTATs with more
freqvalues and quantiles, or in extreme cases, manually alter the stats so
that the calculated filter factors, cardinalities etc look better.

The other possible root cause is a suspicion that the optimiser might look
at SYSCAT.TABLES.STATS_TIME and then if the timestamp is 'too old' it decide
to fabricate statistics instead. I've never actually found out if the
optimiser does indeed do something like that, because as I said, I tend to
include RUNSTATs in my ETL (when it is not too costly to do such). If
STATS_TIME is a factor, then update some stats column to a new value to get
the STATS_TIME to be updated.

Regards
Paul Vernon
Business Intelligence, IBM Global Services


  #5  
Old November 12th, 2005, 07:38 AM
Paul Vernon
Guest
 
Posts: n/a

re: dramatical performance degradation on dynamic sql over night


"Haider Rizvi" <haider@nouce.ca.ibm.com> wrote in message
news:7zsmcvk1ht.fsf@thinkhr.torolab.ibm.com...[color=blue]
> Has anything changed at the system level? If you haven't changed any
> db2 parameter settings, and it has the old stats then plans can not
> change.[/color]

Is that 100% true?

Looking in the manuals for a whiff of my suspicion about STATS_DATE, all I
could find is this statement.

"If RUNSTATS is not executed or the optimizer suspects that RUNSTATS was
executed on empty or nearly empty tables, it may either use defaults or
attempt to derive certain statistics based on the number of file pages used
to store the table on disk (FPAGES)."

However, that does imply that access plans can change solely due to I/U/D
activity. I.e. if you runstat a 'nearly empty table' the you will get one
set of access plans. If you then insert lots of rows into said table, DB2
may decide to derive certain statistics instead for subsequent queries, so
by producing different access plans without any change in RUNSTATs
statistics.

Regards
Paul Vernon
Business Intelligence, IBM Global Services


  #6  
Old November 12th, 2005, 07:38 AM
Haider Rizvi
Guest
 
Posts: n/a

re: dramatical performance degradation on dynamic sql over night


"Paul Vernon" <paul.vernon@ukk.ibmm.comm> writes:
[color=blue]
> "Haider Rizvi" <haider@nouce.ca.ibm.com> wrote in message
> news:7zsmcvk1ht.fsf@thinkhr.torolab.ibm.com...[color=green]
>> Has anything changed at the system level? If you haven't changed any
>> db2 parameter settings, and it has the old stats then plans can not
>> change.[/color]
>
> Is that 100% true?[/color]

You are right about this possibility. I was keying off of Florian's
statement that stats had not changed, also implicitly assuming that the
runstats was done on a reasonably loaded table.

[color=blue]
> Looking in the manuals for a whiff of my suspicion about STATS_DATE, all I
> could find is this statement.
>
> "If RUNSTATS is not executed or the optimizer suspects that RUNSTATS was
> executed on empty or nearly empty tables, it may either use defaults or
> attempt to derive certain statistics based on the number of file pages used
> to store the table on disk (FPAGES)."
>
> However, that does imply that access plans can change solely due to I/U/D
> activity. I.e. if you runstat a 'nearly empty table' the you will get one
> set of access plans. If you then insert lots of rows into said table, DB2
> may decide to derive certain statistics instead for subsequent queries, so
> by producing different access plans without any change in RUNSTATs
> statistics.
>
> Regards
> Paul Vernon
> Business Intelligence, IBM Global Services
>
>[/color]

--
Regards,
--
Haider
  #7  
Old November 12th, 2005, 08:16 AM
florian
Guest
 
Posts: n/a

re: dramatical performance degradation on dynamic sql over night


Mark, Haider, Paul

thank you very much for your fast reply.
[color=blue]
>Make sure you have sufficient sizes for your system temporary[/color]
tablespaces.[color=blue]
>Sometimes when they are not large enough, the queries can take a lot[/color]
longer.

this could be an issue, sometimes we see a completely filled tempspace
with 20 GB

[color=blue]
>I'd suggest looking at your system first and try to understand,
>with the various perf monitors that Windows provides, if the system[/color]
is[color=blue]
>performing badly. May be a disk has gone bad on a raid array? may be
>you are using a lot more paging now, etc.[/color]

sometimes i wish we had a unix system which makes it easier to get os
performance information.
It is always hard to find performance changes if you don't have a
history of "normal" operation performance values. But i agree that it
is time to look at system level now, too.

[color=blue]
>The immediate cause is differing access plans....[/color]

Our Runstats are collected every 2 weeks. Most of the tables have
millions of rows and change within 2 weeks a few ten to hundred
thousand rows so that the statistics are close to the real world. But
i think your suggestion is a very good one and the access path could
be the key here to solve our problems.

On another machine some month ago we had the same behaviour of a
dynamic sql statement where runtime decreased from 7 minutes to 3
hours. I tuned it by rewriting (left outer join to fullselect was
possible) and got it back to 4 minutes. After one week the new
statement needed again 2-3 hours and we tried the old one which
finished again in 8 minutes. But in that case the statistics changed
slighly overtime and i was able to find the two points of time where
another access plan was chosen, so we think we understand what
happened.

I am going to create some test tables and use the problem queries on
them. I am interested in seeing access plan changes after updating the
statistics manually.

I will keep you updated and appreciate your help

Thanks again
Florian
Closed Thread