473,847 Members | 1,714 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Help performace question DB2 UDB v 8.1.9 Linux

The following query runs very slowly compared to other queries on my
system. If I read the explain plan and its data correctly, it is due to
a table scan on animals caused by the join separated from the others by
space. There is an index on both dam_bhid and recip_bhid. I have tried
converting the on condition to a case statement to no good effect.
Likewise reordering the conditions. Do you have any suggestions.

The semantics of the join is that an1 has a calf on the ground if there
is a calf who she is the dam of when there is no recipient for that calf
or there is a calf of which she was the recipient. A calf is less than
283 days old and unweaned.

select
an1.bhid,
an1.breed_1,
an1.pct_1,
ap_donor.herd_i d as donor_id,
ap_sire.herd_id as sire_id,
ap_dam.herd_id as dam_id,
loc.namex as loc_name,
lh.start_date as as_of_date,
case
when wn.WEIGH_DATE is null
then ap_calf.herd_id
else ''
end AS calf_id,
case
when wn.WEIGH_DATE is null
then char(an_calf.BI RTH_DATE, usa)
else ''
end AS calf_birth_date ,
case
when wn.weigh_date is null
then an_calf.sex
else ''
end AS calf_sex,
ap_calf_sire.he rd_id as calf_sire_id,
col_def.COLOR_D ESC as calf_color,
char((select max(act_recov_d ate)
from $schema.flushes
WHERE donor_bhid=an1. bhid), usa)
AS last_flush_date ,
char(proj_mate. BREED_TO_DATE, usa)as proj_breed_date ,
proj_mate_detai l.herd_id as proj_breed_mate

from $schema.animals an1
left outer
join $schema.ANIMALS _PRIV ap_donor
on ap_donor.bhid = an1.BHID
and ap_donor.herd_o wner_id = $empid
left outer
join $schema.animals _priv ap_sire
on ap_sire.bhid = an1.SIRE_BHID
and ap_sire.herd_ow ner_id = $empid
left outer
join $schema.ANIMALS _PRIV ap_dam
on ap_dam.bhid = an1.DAM_BHID
and ap_dam.herd_own er_id = $empid
left outer
join $schema.locatio ns_hist lh
on an1.bhid = lh.bhid
and lh.end_date is null
left outer
join $schema.locatio ns loc
on loc.loc_id = lh.loc_id

-- The problem is here

left outer
join $schema.animals an_calf
on days(current_da te) -
days(an_calf.bi rth_date) < 280
and an_calf.ACTIVEX = 'Y'
and ((an_calf.dam_b hid = an1.bhid AND
an_calf.recip_b hid = 0)
OR an_calf.RECIP_B HID = an1.bhid)

-- end problem

left outer
join $schema.animals _priv ap_calf
on ap_calf.HERD_OW NER_ID = $empid
and ap_calf.bhid = an_calf.bhid
left outer
join $schema.weaning wn
on wn.bhid = an_calf.bhid
left outer
join $schema.ANIMALS _PRIV ap_calf_sire
on an_calf.SIRE_BH ID=ap_calf_sire .bhid
and ap_calf_sire.he rd_owner_id = $empid
left outer
join $schema.COLOR_D EFN col_def
on col_def.BH_COLO R_CODE=an_calf. COLOR
left outer
join $schema.ANIMALS _PRIV_$empid proj_mate
on proj_mate.bhid = an1.bhid
left outer
join $schema.ANIMALS _PRIV_$empid proj_mate_detai l
on proj_mate_detai l.BHID=
proj_mate.BREED _TO_BHID
where
an1.BHID in
(select bhid
from $schema.animal_ sets
where set_name = '$setname'
and userid='$setuse r')
Apr 11 '06 #1
5 1457
Bob Stearns wrote:
left outer
join $schema.animals an_calf
on days(current_da te) -
days(an_calf.bi rth_date) < 280
and an_calf.ACTIVEX = 'Y'
and ((an_calf.dam_b hid = an1.bhid AND
an_calf.recip_b hid = 0)
OR an_calf.RECIP_B HID = an1.bhid)

There are two potential problems. The first is the OR which pretty much
makes teh last AND predicate useless for index exploitation.
The second is the expression on days which disallows index exploitation
on birth_date and, worse, forces an NLJOIN.
What about a small rewrite:
an_calf.birthda te > current date - 280 days
Now DB2 has a clear startkey for an index scan over birth date.

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Apr 12 '06 #2
Serge Rielau wrote:
Bob Stearns wrote:
left outer
join $schema.animals an_calf
on days(current_da te) -
days(an_calf.bi rth_date) < 280
and an_calf.ACTIVEX = 'Y'
and ((an_calf.dam_b hid = an1.bhid AND
an_calf.recip_b hid = 0)
OR an_calf.RECIP_B HID = an1.bhid)


There are two potential problems. The first is the OR which pretty much
makes teh last AND predicate useless for index exploitation.
The second is the expression on days which disallows index exploitation
on birth_date and, worse, forces an NLJOIN.
What about a small rewrite:
an_calf.birthda te > current date - 280 days
Now DB2 has a clear startkey for an index scan over birth date.

Cheers
Serge

Thank you. Creating the index and changing the query as you indicated
changed the time from over a minute to less than 2 seconds.

How could I convince the optimizer to do what I would do in "procedural
code", which uses the dam_bhid and recip_bhid indices to minimize the
time required (in no actual language that I know of):
found = -1
while (SELECT * FROM animals WHERE dam_bhid=an1.bh id) {
if(recip_bhid=0 } continue
if(activex<>'Y' ) continue
if(birth_date < current_date-280 days) continue
found = bhid
break
}
if(found=-1) {
while (SELECT * FROM animals WHERE recip_bhid=an1. bhid) {
if(activex<>'Y' ) continue
if(birth_date < current_date-280 days) continue
found = bhid
break
}
}
if(found=-1) {nomatch)
else (matched the current record, with bhid=found}
Apr 12 '06 #3
Bob Stearns wrote:
Serge Rielau wrote:
Bob Stearns wrote:
left outer
join $schema.animals an_calf
on days(current_da te) -
days(an_calf.bi rth_date) < 280
and an_calf.ACTIVEX = 'Y'
and ((an_calf.dam_b hid = an1.bhid AND
an_calf.recip_b hid = 0)
OR an_calf.RECIP_B HID = an1.bhid)
There are two potential problems. The first is the OR which pretty
much makes teh last AND predicate useless for index exploitation.
The second is the expression on days which disallows index
exploitation on birth_date and, worse, forces an NLJOIN.
What about a small rewrite:
an_calf.birthda te > current date - 280 days
Now DB2 has a clear startkey for an index scan over birth date.

Cheers
Serge

Thank you. Creating the index and changing the query as you indicated
changed the time from over a minute to less than 2 seconds.

How could I convince the optimizer to do what I would do in "procedural
code", which uses the dam_bhid and recip_bhid indices to minimize the
time required (in no actual language that I know of):


CREATE FUNCTION purplecow(...) RETURNS VARCHAR(10)
READS SQL DATA
BEGIN ATOMIC found = -1 DECLARE found INTEGER; while (SELECT * FROM animals WHERE dam_bhid=an1.bh id) {
if(recip_bhid=0 } continue
if(activex<>'Y' ) continue
if(birth_date < current_date-280 days) continue
found = bhid
break
} SET found = (SELECT bhid FROM animals
WHERE dam_bhid=an1.bh id
AND recip_bhid <> 0
AND activex='Y'
AND birth_date >= current_date-280 days
FETCH FIRST ROW ONLY) if(found=-1) {
while (SELECT * FROM animals WHERE recip_bhid=an1. bhid) {
if(activex<>'Y' ) continue
if(birth_date < current_date-280 days) continue
found = bhid
break
}
} SET found = (SELECT bhid FROM animals
WHERE recip_bhid=an1. bhid
AND activex = 'Y'
AND birth_date >= current_date - 280 days
AND found IS NULL
FETCH FIRST ROW ONLY); if(found=-1) {nomatch)
else (matched the current record, with bhid=found}

RETURN CASE WHEN found IS NULL THEN 'No match'
ELSE CHAR(found) END;
END
@

VALUES purplecow();

If you want to take the optimizer out of the equation SQL Functions
rock. Need to be careful with DPF, but in serial they contribute their
fair share to our TPC-C results.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Apr 12 '06 #4
Serge Rielau wrote:

CREATE FUNCTION purplecow(...) RETURNS VARCHAR(10)
READS SQL DATA
BEGIN ATOMIC

DECLARE found INTEGER;

SET found = (SELECT bhid FROM animals
WHERE dam_bhid=an1.bh id
AND recip_bhid <> 0
AND activex='Y'
AND birth_date >= current_date-280 days
FETCH FIRST ROW ONLY)

SET found = (SELECT bhid FROM animals
WHERE recip_bhid=an1. bhid
AND activex = 'Y'
AND birth_date >= current_date - 280 days
AND found IS NULL
FETCH FIRST ROW ONLY);

RETURN CASE WHEN found IS NULL THEN 'No match'
ELSE CHAR(found) END;
END
@

VALUES purplecow();

If you want to take the optimizer out of the equation SQL Functions
rock. Need to be careful with DPF, but in serial they contribute their
fair share to our TPC-C results.

Cheers
Serge

Could I easily return the entire row found rather than its key? It would
dave 1 join. How do I use it in the middle of a long JOIN sequence?

from an1
..
..
..
left outer join values purplecow(an1.b hid)
as tx(bhid)
on tx.bhid is not null
left outer join animals an_calf
on an_calf.bhid=tx .bhid
..
..
..

is my best guess, but I'm sure that's not quite right
Apr 13 '06 #5
Bob Stearns wrote:
Serge Rielau wrote:

CREATE FUNCTION purplecow(...) RETURNS VARCHAR(10)
READS SQL DATA
BEGIN ATOMIC

DECLARE found INTEGER;

SET found = (SELECT bhid FROM animals
WHERE dam_bhid=an1.bh id
AND recip_bhid <> 0
AND activex='Y'
AND birth_date >= current_date-280 days
FETCH FIRST ROW ONLY)

SET found = (SELECT bhid FROM animals
WHERE recip_bhid=an1. bhid
AND activex = 'Y'
AND birth_date >= current_date - 280 days
AND found IS NULL
FETCH FIRST ROW ONLY);

RETURN CASE WHEN found IS NULL THEN 'No match'
ELSE CHAR(found) END;
END
@

VALUES purplecow();

If you want to take the optimizer out of the equation SQL Functions
rock. Need to be careful with DPF, but in serial they contribute their
fair share to our TPC-C results.

Cheers
Serge

Could I easily return the entire row found rather than its key? It would
dave 1 join. How do I use it in the middle of a long JOIN sequence?

from an1
.
.
.
left outer join values purplecow(an1.b hid)
as tx(bhid)
on tx.bhid is not null
left outer join animals an_calf
on an_calf.bhid=tx .bhid
.
.
.

is my best guess, but I'm sure that's not quite right

If you want to return the entire row you sue a TABLE function.
Simply change the function to:
RETURNS TABLE (pk INT, C1 INT, ...)

and the return to:
RETURN VALUES (......);
...
of course you will also have to extend the SET statements
SET (...) = (SELECT ....)

take a look here for examples:
http://www-128.ibm.com/developerwork...dm-0411rielau/

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Apr 13 '06 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
3738
by: chris | last post by:
Does anyone have any tips for comparing large amounts of data? The data in question is around 1000 lines of up to 400 delimited key value pairs. Each line has a unique identifier stored in one of the fields. Each field needs to be compared to the corresponding line/field in a "master file". In addition, each field has formatting rules, such as trimming spaces and zeroes, ignore capitalization, etc., which are stored in a database. I...
3
1536
by: TPJ | last post by:
"The advantage of xrange() over range() is minimal (since xrange() still has to create the values when asked for them) except when a very large range is used on a memory-starved machine or when all of the range's elements are never used (such as when the loop is usually terminated with break)." - from Python Library Reference. I decided to measure the performance of range and xrange. I did it with the following functions: def rprint(...
5
1538
by: Lada 'Ray' Lostak | last post by:
Dear list, First of all I want to say sory, if my question was answered somewhere. Or if it is my fault. If so, please, give me link/hint. My own search fails :( I have experimence with MySql, MsSql and Oracle (and MS Access huch :) I am new to PgSql. We are running server - OpenBSD 2.9, latest apache, latest PHP, latest PgSql, latest mod_ssl. No other related packages are used. All programs are compiled 'normal' way, including PgSql...
4
1595
by: Lan W via DBMonster.com | last post by:
Hi, How to change the Db2 server's IP address? I have DB2 UDB 7.1 installed on AIX 5.2 machine and DB2 UDB 8.1 installed on Linux machine. Thanks! Lan --
7
1276
by: trialproduct2004 | last post by:
Hi all I am having application which has three classes. in one class say class1, i am having arraylist where i am adding and removing string. there is one method in class1 which is creating object of class2 and calling method of class2. class2 has one member of type class1.
19
2285
by: Tony Johansson | last post by:
Hello!! Which have best performance between i++ or ++i. Is it exact the same or is it some very small difference in performace betwwwn these two. //Tony
1
7497
by: crabbie_upk | last post by:
Hi, Need help in tuning JAVA on LINUX. From http://www.trutek.com/index.php?id=173 I understand that on Linux operating system with 4GB memory, JVM can be run with maximum memory of 3800m. But I had no luck. Could someone help me in achieving on Red Hat Linux server. Here is the details of my Linux server. OS: Red Hat Linux (Linux xyzsrvr 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:30:39 EST 2005 i686 i686 i386 GNU/Linux) Java: java version...
3
2014
by: Veeru71 | last post by:
We have got 2 DB2-UDB databases (DB1 & DB2) running on separate instances( Inst1 & Inst2). DB1 has got Schema1 and DB2 has got Schema2. We would like to setup some kind of replication to replicate both Schema1 & Schema2 onto a 3 rd database (DB3) running on a 3 rd instance (Inst3). This is basically to separate out Data Reads (Reports module of the app) from the primary databases for performace reasons. Also, Some of the Reports join...
4
4533
by: Dave | last post by:
I have a global.asax file with Application_Start defined and create some static data there and in another module used in the asp.net application and I realize that static data is shared amongst child apps of an IIS application and can be used by multiple users during the application life cycle and for multiple page loads for the same or different page under a root application. What I don't understand and need to know is whether that...
7
2083
by: Peted | last post by:
Hi, im hoping someone cane provide or point to a definitive accurate explantion of dotnet compilation when run and the best way to optimise peformace when dotnet code is run first time and execution speed after compliation. Ive searched the net and seem to find various opinions on what actually happens when you compile and run a c# dotnet application. What im looking for is basicly how to make dotnet dlls load as fast as
0
9887
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9730
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10983
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
1
10707
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
9485
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
7056
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5719
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5911
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
4529
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.