
Dramatic performance degradation on dynamic SQL overnight

Hello,

we are running DB2 UDB EEE Version 7.2 Fixpack 12 on a two-machine
Windows 2000 Advanced Server cluster in a DSS environment.

Some dynamic SQL statements for ETL processes, and even some online
user queries, went overnight from a runtime of a few minutes to several
hours, or to "never coming back" at all.

We didn't change a single database or instance parameter, and the
statistics for the involved queries are still the same. Unfortunately
we cannot compare the before and after access plans because we only
have the current ones. Query rewriting and index optimization helped us
tune the badly performing dynamic SQL, but we still see poorly
performing statements that used to run well.

Most (not all) of the statements use inner joins and left outer
joins. Some of the tables have more than 100 million rows. However, the
statements performed well until a few days ago, without any parameter
changes. Even the reorgchk output looks fine.

My questions:
Has anybody seen similar behaviour in his/her environment?
What else could affect the access plan for dynamic SQL statements?
Are there any issues besides the access plans that may have caused the
performance degradation?

thank you
Florian
Nov 12 '05 #1
"florian" <au*****@web.de> wrote in message
news:35**************************@posting.google.com...
<snip>


Make sure your system temporary tablespaces are sufficiently sized.
When they are not large enough, queries can take a lot longer.
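If you want a quick look at how full the system temporary tablespaces get
while one of the slow statements is running, something along these lines
should work on V7 (the database name MYDB is just a placeholder; on EEE the
output is per partition, so check each node):

    db2 connect to MYDB
    db2 list tablespaces show detail

Compare "Used pages" against "Total pages" for the SYSTEM TEMPORARY
tablespaces while the problem query runs; if it hits the ceiling, the temp
space is probably the bottleneck.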
Nov 12 '05 #2
Has anything changed at the system level? If you haven't changed any
db2 parameter settings, and it still has the old stats, then plans cannot
change. I'd suggest looking at your system first and trying to understand,
with the various perf monitors that Windows provides, whether the system is
performing badly. Maybe a disk has gone bad in a RAID array, or maybe
you are paging a lot more now, etc.
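A reasonable starting set of System Monitor (perfmon) counters that usually
shows quickly whether disk or paging is the problem (counter names from
memory, so verify them in perfmon itself):

    Memory\Pages/sec
    Paging File\% Usage
    PhysicalDisk\Avg. Disk Queue Length
    PhysicalDisk\Avg. Disk sec/Transfer
    Processor\% Processor Time

On Windows 2000 these can be logged over time with Performance Logs and
Alerts, which also gives you a history of "normal" values to compare
against later.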

au*****@web.de (florian) writes:
<snip>


--
Regards,
--
Haider
Nov 12 '05 #3
"florian" <au*****@web.de> wrote in message
news:35**************************@posting.google.com...
<snip>


I've occasionally seen such behaviour. Adding RUNSTATS jobs into key bits of
the ETL fixed the problem, although I've never been sure about the root
cause.

The immediate cause is differing access plans. One possible root cause is
that the optimiser has poor statistics and is on a knife edge between two
separate plans - one good in practice and another very bad. In that case, a
few more rows in one table might be enough to push the optimiser from the
good plan to the bad one.

The solution here is to improve the stats the optimiser is working with.
For example, look at some of the estimated filter factors and cardinalities
in your access plans. If these are way out compared to what you know the
real filter factors and cardinalities to be, look at RUNSTATS with more
freqvalues and quantiles, or in extreme cases manually alter the stats so
that the calculated filter factors, cardinalities etc. look better.
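A minimal sketch of what that could look like on V7 (schema, table, column
and the numbers are placeholders; on V7 the number of frequent values and
quantiles comes from the num_freqvalues and num_quantiles database
configuration parameters rather than from per-column RUNSTATS options, and
the change may only take effect once all applications have disconnected):

    db2 update db cfg for MYDB using NUM_FREQVALUES 50 NUM_QUANTILES 50
    db2 connect to MYDB
    db2 runstats on table MYSCHEMA.BIGTABLE with distribution and detailed indexes all

or, in extreme cases, hand-edit the stats through the SYSSTAT views:

    db2 "update sysstat.tables set card = 120000000 where tabschema = 'MYSCHEMA' and tabname = 'BIGTABLE'"
    db2 "update sysstat.columns set colcard = 4000000 where tabschema = 'MYSCHEMA' and tabname = 'BIGTABLE' and colname = 'CUST_ID'"

Either way, the new statistics only influence a dynamic statement the next
time it is prepared.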

The other possible root cause is a suspicion that the optimiser might look
at SYSCAT.TABLES.STATS_TIME and, if the timestamp is 'too old', decide
to fabricate statistics instead. I've never actually found out whether the
optimiser really does something like that, because as I said, I tend to
include RUNSTATS in my ETL (when it is not too costly to do so). If
STATS_TIME is a factor, then update some stats column to a new value to get
STATS_TIME refreshed.
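For example, to see how old the stats look and to touch them (schema and
table names are placeholders, and I've not verified that bumping CARD like
this is harmless, so treat it only as a sketch of the idea):

    db2 "select tabname, stats_time, card from syscat.tables where tabschema = 'MYSCHEMA' order by stats_time"
    db2 "update sysstat.tables set card = card + 1 where tabschema = 'MYSCHEMA' and tabname = 'BIGTABLE'"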

Regards
Paul Vernon
Business Intelligence, IBM Global Services
Nov 12 '05 #4
"Haider Rizvi" <ha****@nouce.ca.ibm.com> wrote in message
news:7z************@thinkhr.torolab.ibm.com...
Has anything changed at the system level? If you haven't changed any
db2 parameter settings, and it still has the old stats, then plans cannot
change.


Is that 100% true?

Looking in the manuals for a whiff of my suspicion about STATS_TIME, all I
could find is this statement:

"If RUNSTATS is not executed or the optimizer suspects that RUNSTATS was
executed on empty or nearly empty tables, it may either use defaults or
attempt to derive certain statistics based on the number of file pages used
to store the table on disk (FPAGES)."

However, that does imply that access plans can change solely due to I/U/D
activity. I.e. if you runstat a 'nearly empty table' then you will get one
set of access plans. If you then insert lots of rows into that table, DB2
may decide to derive certain statistics instead for subsequent queries,
thereby producing different access plans without any change in the RUNSTATS
statistics.
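If that is what is happening, the catalog should hint at it: a table whose
recorded CARD is tiny (or -1) but which you know now holds millions of rows
is a candidate. A rough way to list the suspects (the schema name is a
placeholder; CARD, NPAGES and FPAGES here are the values recorded at
RUNSTATS time, so the real comparison is against what you know the tables
hold today):

    db2 "select tabname, card, npages, fpages, stats_time from syscat.tables where tabschema = 'MYSCHEMA' and card < 1000 order by card"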

Regards
Paul Vernon
Business Intelligence, IBM Global Services
Nov 12 '05 #5
"Paul Vernon" <pa*********@ukk.ibmm.comm> writes:
"Haider Rizvi" <ha****@nouce.ca.ibm.com> wrote in message
news:7z************@thinkhr.torolab.ibm.com...
Has anything changed at the system level? If you haven't changed any
db2 parameter settings, and it still has the old stats, then plans cannot
change.
Is that 100% true?


You are right about this possibility. I was keying off Florian's
statement that the stats had not changed, and also implicitly assuming that
the runstats was done on a reasonably loaded table.

<snip>


--
Regards,
--
Haider
Nov 12 '05 #6
Mark, Haider, Paul

thank you very much for your fast reply.
Make sure your system temporary tablespaces are sufficiently sized.
When they are not large enough, queries can take a lot longer.

This could be an issue; sometimes we see the temp tablespace completely
filled, with 20 GB used.

I'd suggest looking at your system first and trying to understand,
with the various perf monitors that Windows provides, whether the system is
performing badly. Maybe a disk has gone bad in a RAID array, or maybe
you are paging a lot more now, etc.
Sometimes I wish we had a Unix system, which would make it easier to get
OS performance information.
It is always hard to spot performance changes if you don't have a
history of performance values from "normal" operation. But I agree that it
is time to look at the system level now, too.
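To start building that history I am thinking of appending regular snapshots
to a file from a small batch job run in a DB2 command window via the Task
Scheduler (paths and the database name below are just placeholders; some
snapshot elements are only collected when the corresponding monitor
switches are turned on):

    rem baseline.bat - schedule e.g. every 30 minutes
    db2 connect to MYDB
    db2 get snapshot for database on MYDB >> D:\perf\db_baseline.txt
    db2 get snapshot for tablespaces on MYDB >> D:\perf\tbs_baseline.txt
    db2 terminate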

The immediate cause is differing access plans....


Our RUNSTATS are collected every 2 weeks. Most of the tables have
millions of rows and change by a few tens to hundreds of thousands of rows
within those 2 weeks, so the statistics stay close to the real world. But
I think your suggestion is a very good one, and the access path could
be the key to solving our problems here.

On another machine some months ago we saw the same behaviour with a
dynamic SQL statement whose runtime jumped from 7 minutes to 3
hours. I tuned it by rewriting it (the left outer join could be turned
into a fullselect) and got it back to 4 minutes. After one week the new
statement again needed 2-3 hours, and we tried the old one, which
finished again in 8 minutes. But in that case the statistics changed
slightly over time, and I was able to find the two points in time where
another access plan was chosen, so we think we understand what
happened.

I am going to create some test tables and run the problem queries against
them. I am interested in seeing how the access plans change after updating
the statistics manually.
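For comparing the plans before and after the manual statistics updates I
plan to do roughly the following (the explain tables must exist first, e.g.
created from the EXPLAIN.DDL script in sqllib\misc; database and file names
are placeholders, and the exact db2exfmt options may differ slightly on V7,
so check db2exfmt -h):

    db2 connect to MYDB
    db2 -tf EXPLAIN.DDL
    db2 set current explain mode explain
    db2 -tf problem_query.sql
    db2 set current explain mode no
    db2exfmt -d MYDB -g TIC -w -1 -n % -s % -# 0 -o plan_before.txt

With the explain mode set to EXPLAIN the statement is only compiled, not
executed, so the plan can be captured even for the "never come back"
queries.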

I will keep you updated and appreciate your help

Thanks again
Florian
Nov 12 '05 #7
