
Federated DB Bad Query performance

I have a query running on a federated database that takes the form

select col1, col2
from nickname1
where <conditions exist>

union all

select col1,col2
from nickname2
where <conditions exist>
Nickname1 refers to a table on my primary database. Nickname2 refers
to an identically structured table in my secondary database.

The performance for this query is abysmal. I found the bottleneck to
be the second half of the query referencing nickname2. However, if I
connect to the secondary database directly and access the table with
the same query (bypassing the federated layer), the performance is
lightning fast.

The plan for the nickname2 branch uses many SHIP and MSJOIN operators,
whereas the plan for nickname1 uses hash joins. Statistics are up to date
on both the primary and secondary databases. I need to figure out how to
make the federated layer choose a better plan. Any clues?

Using 8.2 FP14 on AIX 5.3

Thanks,
Evan

Oct 31 '07 #1
4 Replies


Are the predicates pushed down to the federated server?
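
One way to check this, sketched under some assumptions: explain the nickname2 branch on its own and look at the remote statement attached to each SHIP operator. If the WHERE clause is missing from that remote text, the predicates are being applied locally after the rows are shipped. The predicate `col1 = 'X'` is a hypothetical stand-in for the elided conditions, the explain tables are assumed to exist already (created from EXPLAIN.DDL), and the `RMTQTXT` argument name is what db2exfmt labels "Remote statement"; verify it against your own explain output.

```sql
-- Populate the explain tables for the slow branch only.
-- col1 = 'X' is a hypothetical predicate, not from the original post.
EXPLAIN PLAN FOR
  SELECT col1, col2
  FROM nickname2
  WHERE col1 = 'X';

-- Show the SQL text that each SHIP operator sends to the remote server.
-- Long statements land in LONG_ARGUMENT_VALUE instead of ARGUMENT_VALUE.
SELECT o.OPERATOR_ID,
       a.ARGUMENT_VALUE,
       a.LONG_ARGUMENT_VALUE
FROM EXPLAIN_OPERATOR o
JOIN EXPLAIN_ARGUMENT a
  ON  a.EXPLAIN_REQUESTER = o.EXPLAIN_REQUESTER
  AND a.EXPLAIN_TIME      = o.EXPLAIN_TIME
  AND a.SOURCE_NAME       = o.SOURCE_NAME
  AND a.SOURCE_SCHEMA     = o.SOURCE_SCHEMA
  AND a.EXPLAIN_LEVEL     = o.EXPLAIN_LEVEL
  AND a.STMTNO            = o.STMTNO
  AND a.SECTNO            = o.SECTNO
  AND a.OPERATOR_ID       = o.OPERATOR_ID
WHERE o.OPERATOR_TYPE = 'SHIP'
  AND a.ARGUMENT_TYPE = 'RMTQTXT'
ORDER BY o.EXPLAIN_TIME DESC, o.OPERATOR_ID;
```

Formatting the same explain data with db2exfmt shows the full plan, with the remote statement listed under each SHIP operator's arguments.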

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Oct 31 '07 #2

Are the statistics on the federated server up to date as well? The
federated server can't come up with a good plan if its statistics differ
completely from those at the remote data sources (what you called the
"primary" and "secondary" databases).
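
A minimal sketch of such a comparison, assuming the nicknames are literally named NICKNAME1 and NICKNAME2 (placeholders taken from the question, not real object names): read the catalog statistics stored for the nicknames on the federated server, then run the equivalent query against the base tables on each remote database and compare the numbers.

```sql
-- On the federated server: statistics currently stored for the nicknames.
-- TYPE = 'N' restricts the catalog view to nicknames.
SELECT TABSCHEMA, TABNAME, CARD, NPAGES, FPAGES, OVERFLOW, STATS_TIME
FROM SYSCAT.TABLES
WHERE TYPE = 'N'
  AND TABNAME IN ('NICKNAME1', 'NICKNAME2');

-- On each remote database: the same columns for the underlying table.
-- REMOTE_TABLE is a hypothetical name for the table behind the nickname.
SELECT TABSCHEMA, TABNAME, CARD, NPAGES, FPAGES, OVERFLOW, STATS_TIME
FROM SYSCAT.TABLES
WHERE TYPE = 'T'
  AND TABNAME = 'REMOTE_TABLE';
```

A large gap in CARD or a stale STATS_TIME on the nickname side suggests the federated optimizer is costing the plan from outdated numbers.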

Could you also show us the access plan that's been generated?

--
Knut Stolze
DB2 z/OS Utilities Development
IBM Germany
Nov 1 '07 #3

Thanks, Michael, for the tip.

I think I finally got to the root cause of the issue here. I am using
a common table expression up front. Previously, the optimizer was smart
enough to recognize that the query could be pushed to the remote
database before the results were materialized in the federated database.
It appears that it is now trying to ship the entire table and materialize
it locally before retrieving the values.

I have validated that the statistics on the federated database are
current with the statistics on the remote database. Ironically, I
think that this is the problem. Since it was originally tested, the
remote database has undergone a thorough "cleansing" with reorgs and
runstats. I believe that the new statistics are what's tipping the
optimizer's choice in favor of shipping the data.

Which of the updatable statistics on the federated DB can I change to
"fool" the optimizer into not shipping the data? Here are the current
values:

TABLE:
-------------------------------------
CARD: 2695569
FPAGES: 130688
NPAGES: 130688
OVERFLOW: 0

COLUMNS:
-------------------------------------
ACCKEY--HIGH2KEY: 'ZZZZZZ-0000000'
ACCKEY--LOW2KEY: ' '
CUSIP--HIGH2KEY: 'Z8065852'
CUSIP--LOW2KEY: ' '

INDEX (composite +ACCKEY+CUSIP):
-------------------------------------
CLUSTERFACTOR: 21
CLUSTERRATIO: -1
NLEAF: 87220
NLEVELS: 4
FIRSTKEYCARD: 4355185
FULLKEYCARD: 4948815

Thanks again,
Evan

On Nov 1, 4:45 pm, Michael Ortega-Binderberger <m...@ics.uci.edu>
wrote:
Hi Evan,

Runstats won't work on nicknames; instead, you can use
the nnstat utility, which refreshes nickname statistics.

Nicknames use substantially the same statistics as regular
tables. They are just collected by a different mechanism
than runstats. If you can update the statistics on your
target tables and then run nnstat for the nicknames you need,
that should help.

Michael
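
For reference, a hedged sketch of what an nnstat call might look like through the SYSPROC.NNSTAT stored procedure. The server, schema, and nickname values are hypothetical, and the parameter order shown (server, schema, nickname, column list, index list, method, log file path, output message) is recalled from the DB2 documentation rather than taken from this thread, so check it against the documentation for your fix pack before relying on it.

```sql
-- Run on the federated database, after RUNSTATS has been run on the
-- remote table itself.
-- REMOTESRV, MYSCHEMA and NICKNAME2 are hypothetical names.
-- NULL column and index lists are assumed to mean "all columns/indexes";
-- method 0 and a NULL log path are likewise assumptions to verify.
CALL SYSPROC.NNSTAT(
  'REMOTESRV',   -- server definition the nickname belongs to
  'MYSCHEMA',    -- nickname schema
  'NICKNAME2',   -- nickname whose statistics should be refreshed
  NULL,          -- column names
  NULL,          -- index names
  0,             -- collection method
  NULL,          -- log file path
  ?              -- output: trace message
);
```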


Nov 2 '07 #4

Thanks again for the suggestions. We tried manually altering the
statistics and also set the DB2_MAXIMAL_PUSHDOWN server option,
but never got the execution plan we needed. In the end we chose to
materialize the rows that would have come from the common table
expressions into temp tables and then join those tables with the other
nicknames. Performance came back to the sub-second times we were used
to before "cleaning house."
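
A rough sketch of that kind of workaround, not the actual statements used here: the temp table name, the predicate, and the join column are all hypothetical, and a user temporary table space is assumed to exist.

```sql
-- Declare a session temp table shaped like the rows the CTE produced.
DECLARE GLOBAL TEMPORARY TABLE SESSION.CTE_ROWS AS
  (SELECT col1, col2 FROM nickname2)
  DEFINITION ONLY
  ON COMMIT PRESERVE ROWS
  NOT LOGGED;

-- Materialize the rows once. A simple single-nickname query like this
-- is a good candidate for being pushed to the remote database whole.
-- col1 = 'X' is a hypothetical stand-in for the real conditions.
INSERT INTO SESSION.CTE_ROWS
  SELECT col1, col2
  FROM nickname2
  WHERE col1 = 'X';

-- Join the small local result with the other nickname.
SELECT t.col1, n.col2
FROM SESSION.CTE_ROWS t
JOIN nickname1 n
  ON n.col1 = t.col1;
```

The trade-off is an extra step and a session-scoped table in exchange for taking the pushdown decision for the CTE out of the optimizer's hands.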

Evan

On Nov 2, 2:05 pm, Michael Ortega-Binderberger <m...@ics.uci.edu>
wrote:
Hi Evan,

Off the top of my head, I don't know what would change the
behavior without spending much more time on it.

Many stats are used by the nicknames besides the ones you
mention. You can play with increasing the size of the
tablecard and then also perhaps manipulating colcard (it
needs to be below tablecard for consistency).
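
As an illustration of that suggestion, a minimal sketch using the updatable SYSSTAT catalog views. The schema name and the new values are hypothetical (the CARD shown is simply ten times the 2695569 posted above), and the current values should be recorded first so they can be restored.

```sql
-- Record the current values before changing anything.
SELECT CARD, NPAGES, FPAGES
FROM SYSSTAT.TABLES
WHERE TABSCHEMA = 'MYSCHEMA' AND TABNAME = 'NICKNAME2';

-- Inflate the table cardinality seen by the federated optimizer.
UPDATE SYSSTAT.TABLES
   SET CARD = 26955690
 WHERE TABSCHEMA = 'MYSCHEMA' AND TABNAME = 'NICKNAME2';

-- Column cardinality must not exceed the table cardinality.
-- The value here is a hypothetical example for the ACCKEY column.
UPDATE SYSSTAT.COLUMNS
   SET COLCARD = 4000000
 WHERE TABSCHEMA = 'MYSCHEMA' AND TABNAME = 'NICKNAME2'
   AND COLNAME = 'ACCKEY';
```

A later nnstat run on the nickname would be expected to overwrite these manual values.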

I think that increasing the tablecard would convince the
optimizer not to ship so much data and try to do things at
the data source. Or you could try to enable the
DB2_MAXIMAL_PUSHDOWN server option (need to check the
spelling of it).
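
For reference, a sketch of the two usual ways to set that option, assuming a server definition named REMOTESRV (a hypothetical name):

```sql
-- Persistent: store the option on the server definition.
-- Use SET instead of ADD if the option has already been added.
ALTER SERVER REMOTESRV OPTIONS (ADD DB2_MAXIMAL_PUSHDOWN 'Y');

-- Temporary: override the option for the current connection only,
-- which is convenient for comparing plans with and without it.
SET SERVER OPTION DB2_MAXIMAL_PUSHDOWN TO 'Y' FOR SERVER REMOTESRV;
```

With the option set to 'Y', the optimizer favors plans that push as much work as possible to the remote source rather than choosing purely on estimated cost, so it is worth re-explaining the query afterwards to confirm the plan actually changed.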

Let me know how it goes, and if you need more help I would
suggest opening a PMR.

Michael

Nov 6 '07 #5
