473,396 Members | 1,734 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

abysmal performance with scalar query

When I run the following query with the two sections commented out, the
response time is between 1 an 2 seconds; with the first indicated
section enabled, the response goes up to 15 seconds even though
t1.bh_disposal_code IS NOT NULL in only one row; with the second section
enabled, it goes up to 592 seconds, even though t1.mating is NULL in all
the rows chosen. Why should the (supposedly never executed) scalar
subqueries cost so much? It is as if the CASE statements were being
evaluated on every row in t1 rather than just the selected ones. Both of
these scalar subqueries are here because the are infrequent and the
corresponding join is difficult or impossible.

SELECT t2p.herd_id AS dam_tag,
is3.nalf_prefix(t2.tattoo) AS dam_prefix,
is3.nalf_id(t2.tattoo) AS dam_id,
upper(right(rtrim(t2.tattoo),1)) AS dam_yr_code,
rtrim(t2.prefix) || '-' || rtrim(t2.regnum) AS dam_regnum,
case when t2.breed_1='AN' AND t2.pct_1=100
THEN '11111111' end AS dam_breed,
char(t2.birth_date,USA) AS dam_birth_date,

// section 1
case
when t1.bh_disposal_code is null then ''
when t1.bh_disposal_code=0 then ''
else (select disposal_code
from is3.disposal_assoc_defn tx
where assoc_code='LM'
and tx.bh_disposal_code= t2.bh_disposal_code)
END AS dam_stat,
// end section 1

rtrim(t3.prefix) || '-' || rtrim(t3.regnum) AS sire_regnum,
case when t1.mating='A' then 'Y' end AS ai,

// section 2
case when t1.mating='A' then
(SELECT t2a.datex FROM is3.service_ai t2a
WHERE t2a.cow_bhid=t1.dam_bhid
AND t2a.datex =
(SELECT MAX(datex) FROM is3.service_ai
WHERE days(datex) < (days(t1.birth_date)-283+45)
AND days(datex) > (days(t1.birth_date)-283-45) )
)
end AS breed_date,
// end section 2

t1p.herd_id AS tag,
t1.birth_date,
t1b.birth_mgmt AS birth_grp,
case when t1.sex='C' then 'H' else t1.sex end AS sex,
translate(t1b.twin_code,'SSTTTT','NSCBMT') AS typex,
t1.hps,
t1c.color_desc AS color,
t1b.birth_weight AS weight,
t1.namex AS name,
rtrim(t1.prefix) || '-' || rtrim(t1.regnum) AS regnum
FROM is3.animals t1
JOIN is3.animals t2 ON t1.dam_bhid=t2.bhid
JOIN is3.animals t3 ON t1.sire_bhid=t3.bhid
JOIN is3.color_defn t1c ON t1.color=t1c.bh_color_code
LEFT OUTER JOIN is3.birth t1b ON t1.bhid=t1b.bhid
LEFT OUTER JOIN is3.animals_priv t1p
ON t1.bhid=t1p.bhid AND t1p.herd_owner_id=1
LEFT OUTER JOIN is3.animals_priv t2p
ON t2.bhid=t2p.bhid AND t2p.herd_owner_id=1
WHERE t1.bhid IN (SELECT bhid FROM is3.animal_sets
WHERE set_name='Lm_ET' AND userid='jhough')
ORDER BY t1.bhid
Nov 12 '05 #1
5 1662
Bob Stearns wrote:
When I run the following query with the two sections commented out, the
response time is between 1 an 2 seconds; with the first indicated
section enabled, the response goes up to 15 seconds even though
t1.bh_disposal_code IS NOT NULL in only one row; with the second section
enabled, it goes up to 592 seconds, even though t1.mating is NULL in all
the rows chosen. Why should the (supposedly never executed) scalar
subqueries cost so much? It is as if the CASE statements were being
evaluated on every row in t1 rather than just the selected ones. Both of
these scalar subqueries are here because the are infrequent and the
corresponding join is difficult or impossible.

SELECT t2p.herd_id AS dam_tag,
is3.nalf_prefix(t2.tattoo) AS dam_prefix,
is3.nalf_id(t2.tattoo) AS dam_id,
upper(right(rtrim(t2.tattoo),1)) AS dam_yr_code,
rtrim(t2.prefix) || '-' || rtrim(t2.regnum) AS dam_regnum,
case when t2.breed_1='AN' AND t2.pct_1=100
THEN '11111111' end AS dam_breed,
char(t2.birth_date,USA) AS dam_birth_date,

// section 1
case
when t1.bh_disposal_code is null then ''
when t1.bh_disposal_code=0 then ''
else (select disposal_code
from is3.disposal_assoc_defn tx
where assoc_code='LM'
and tx.bh_disposal_code= t2.bh_disposal_code)
END AS dam_stat,
// end section 1

rtrim(t3.prefix) || '-' || rtrim(t3.regnum) AS sire_regnum,
case when t1.mating='A' then 'Y' end AS ai,

// section 2
case when t1.mating='A' then
(SELECT t2a.datex FROM is3.service_ai t2a
WHERE t2a.cow_bhid=t1.dam_bhid
AND t2a.datex =
(SELECT MAX(datex) FROM is3.service_ai
WHERE days(datex) < (days(t1.birth_date)-283+45)
AND days(datex) > (days(t1.birth_date)-283-45) )
)
end AS breed_date,
// end section 2

t1p.herd_id AS tag,
t1.birth_date,
t1b.birth_mgmt AS birth_grp,
case when t1.sex='C' then 'H' else t1.sex end AS sex,
translate(t1b.twin_code,'SSTTTT','NSCBMT') AS typex,
t1.hps,
t1c.color_desc AS color,
t1b.birth_weight AS weight,
t1.namex AS name,
rtrim(t1.prefix) || '-' || rtrim(t1.regnum) AS regnum
FROM is3.animals t1
JOIN is3.animals t2 ON t1.dam_bhid=t2.bhid
JOIN is3.animals t3 ON t1.sire_bhid=t3.bhid
JOIN is3.color_defn t1c ON t1.color=t1c.bh_color_code
LEFT OUTER JOIN is3.birth t1b ON t1.bhid=t1b.bhid
LEFT OUTER JOIN is3.animals_priv t1p
ON t1.bhid=t1p.bhid AND t1p.herd_owner_id=1
LEFT OUTER JOIN is3.animals_priv t2p
ON t2.bhid=t2p.bhid AND t2p.herd_owner_id=1
WHERE t1.bhid IN (SELECT bhid FROM is3.animal_sets
WHERE set_name='Lm_ET' AND userid='jhough')
ORDER BY t1.bhid

Your observation is correct! DB2 loves to turn subqueries to joins.
It is pretty oblivious to CASE expression.
The easiest fix is to push/replicate the WHEN predicate into the WHERE
clause of the scalar subquery.
DB2 is smart enough to realize that the predicate is outside of the
context of the subquery and compute it before going all out for the
subquery.

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #2
Serge Rielau wrote:
Bob Stearns wrote:
When I run the following query with the two sections commented out,
the response time is between 1 an 2 seconds; with the first indicated
section enabled, the response goes up to 15 seconds even though
t1.bh_disposal_code IS NOT NULL in only one row; with the second
section enabled, it goes up to 592 seconds, even though t1.mating is
NULL in all the rows chosen. Why should the (supposedly never
executed) scalar subqueries cost so much? It is as if the CASE
statements were being evaluated on every row in t1 rather than just
the selected ones. Both of these scalar subqueries are here because
the are infrequent and the corresponding join is difficult or impossible.

SELECT t2p.herd_id AS dam_tag,
is3.nalf_prefix(t2.tattoo) AS dam_prefix,
is3.nalf_id(t2.tattoo) AS dam_id,
upper(right(rtrim(t2.tattoo),1)) AS dam_yr_code,
rtrim(t2.prefix) || '-' || rtrim(t2.regnum) AS dam_regnum,
case when t2.breed_1='AN' AND t2.pct_1=100
THEN '11111111' end AS dam_breed,
char(t2.birth_date,USA) AS dam_birth_date,

// section 1
case
when t1.bh_disposal_code is null then ''
when t1.bh_disposal_code=0 then ''
else (select disposal_code
from is3.disposal_assoc_defn tx
where assoc_code='LM'
and tx.bh_disposal_code= t2.bh_disposal_code)
END AS dam_stat,
// end section 1

rtrim(t3.prefix) || '-' || rtrim(t3.regnum) AS sire_regnum,
case when t1.mating='A' then 'Y' end AS ai,

// section 2
case when t1.mating='A' then
(SELECT t2a.datex FROM is3.service_ai t2a
WHERE t2a.cow_bhid=t1.dam_bhid
AND t2a.datex =
(SELECT MAX(datex) FROM is3.service_ai
WHERE days(datex) < (days(t1.birth_date)-283+45)
AND days(datex) > (days(t1.birth_date)-283-45) )
)
end AS breed_date,
// end section 2

t1p.herd_id AS tag,
t1.birth_date,
t1b.birth_mgmt AS birth_grp,
case when t1.sex='C' then 'H' else t1.sex end AS sex,
translate(t1b.twin_code,'SSTTTT','NSCBMT') AS typex,
t1.hps,
t1c.color_desc AS color,
t1b.birth_weight AS weight,
t1.namex AS name,
rtrim(t1.prefix) || '-' || rtrim(t1.regnum) AS regnum
FROM is3.animals t1
JOIN is3.animals t2 ON t1.dam_bhid=t2.bhid
JOIN is3.animals t3 ON t1.sire_bhid=t3.bhid
JOIN is3.color_defn t1c ON t1.color=t1c.bh_color_code
LEFT OUTER JOIN is3.birth t1b ON t1.bhid=t1b.bhid
LEFT OUTER JOIN is3.animals_priv t1p
ON t1.bhid=t1p.bhid AND t1p.herd_owner_id=1
LEFT OUTER JOIN is3.animals_priv t2p
ON t2.bhid=t2p.bhid AND t2p.herd_owner_id=1
WHERE t1.bhid IN (SELECT bhid FROM is3.animal_sets
WHERE set_name='Lm_ET' AND userid='jhough')
ORDER BY t1.bhid


Your observation is correct! DB2 loves to turn subqueries to joins.
It is pretty oblivious to CASE expression.
The easiest fix is to push/replicate the WHEN predicate into the WHERE
clause of the scalar subquery.
DB2 is smart enough to realize that the predicate is outside of the
context of the subquery and compute it before going all out for the
subquery.

Cheers
Serge


Thanks for the quick reply (when do you sleep:-).

I changed to the syntax below and the time is now 39 seconds. That is a
great improvement over 592 seconds, but is much slower than the 1-2
seconds to do everything but these two fields. I get the impression that
the subselects are still being performed for each row of t1 rather than
the rows selected by the JOINS.

Should I be looking at temporary tables or dedicated functions?

SELECT t2p.herd_id AS dam_tag,
is3.nalf_prefix(t2.tattoo) AS dam_prefix,
is3.nalf_id(t2.tattoo) AS dam_id,
upper(right(rtrim(t2.tattoo),1)) AS dam_yr_code,
rtrim(t2.prefix) || '-' || rtrim(t2.regnum) AS dam_regnum,
case when t2.breed_1='AN' AND t2.pct_1=100 THEN '11111111' end
AS dam_breed,
char(t2.birth_date,USA) AS dam_birth_date,
(select disposal_code
from is3.disposal_assoc_defn tx
where assoc_code='LM'
and tx.bh_disposal_code= t2.bh_disposal_code
and ( (t1.bh_disposal_code is not null)
AND
(t1.bh_disposal_code <> 0)
)
) AS dam_stat,
rtrim(t3.prefix) || '-' || rtrim(t3.regnum) AS sire_regnum,
case when t1.mating='A' then 'Y' end AS ai,
(SELECT t2a.datex FROM is3.service_ai t2a
WHERE t2a.cow_bhid=t1.dam_bhid
AND t2a.datex = (SELECT MAX(datex) FROM is3.service_ai
WHERE days(datex) <
(days(t1.birth_date)-283+45)
AND days(datex) >
(days(t1.birth_date)-283-45) )
AND t1.mating='A'
) AS breed_date,
t1p.herd_id AS tag,
t1.birth_date,
t1b.birth_mgmt AS birth_grp,
case when t1.sex='C' then 'H' else t1.sex end AS sex,
translate(t1b.twin_code,'SSTTTT','NSCBMT') AS typex,
t1.hps,
t1c.color_desc AS color,
t1b.birth_weight AS weight,
t1.namex AS name,
rtrim(t1.prefix) || '-' || rtrim(t1.regnum) AS regnum
FROM is3.animals t1
JOIN is3.animals t2 ON t1.dam_bhid=t2.bhid
JOIN is3.animals t3 ON t1.sire_bhid=t3.bhid
JOIN is3.color_defn t1c ON t1.color=t1c.bh_color_code
LEFT OUTER JOIN is3.birth t1b ON t1.bhid=t1b.bhid
LEFT OUTER JOIN is3.animals_priv t1p
ON t1.bhid=t1p.bhid AND t1p.herd_owner_id=1
LEFT OUTER JOIN is3.animals_priv t2p
ON t2.bhid=t2p.bhid AND t2p.herd_owner_id=1
WHERE t1.bhid IN (SELECT bhid FROM is3.animal_sets WHERE
set_name='Lm_ET' AND userid='jhough')
ORDER BY t1.bhid
go
Nov 12 '05 #3
Bob Stearns wrote:
Serge Rielau wrote:
Bob Stearns wrote:
When I run the following query with the two sections commented out,
the response time is between 1 an 2 seconds; with the first indicated
section enabled, the response goes up to 15 seconds even though
t1.bh_disposal_code IS NOT NULL in only one row; with the second
section enabled, it goes up to 592 seconds, even though t1.mating is
NULL in all the rows chosen. Why should the (supposedly never
executed) scalar subqueries cost so much? It is as if the CASE
statements were being evaluated on every row in t1 rather than just
the selected ones. Both of these scalar subqueries are here because
the are infrequent and the corresponding join is difficult or
impossible.

SELECT t2p.herd_id AS dam_tag,
is3.nalf_prefix(t2.tattoo) AS dam_prefix,
is3.nalf_id(t2.tattoo) AS dam_id,
upper(right(rtrim(t2.tattoo),1)) AS dam_yr_code,
rtrim(t2.prefix) || '-' || rtrim(t2.regnum) AS dam_regnum,
case when t2.breed_1='AN' AND t2.pct_1=100
THEN '11111111' end AS dam_breed,
char(t2.birth_date,USA) AS dam_birth_date,

// section 1
case
when t1.bh_disposal_code is null then ''
when t1.bh_disposal_code=0 then ''
else (select disposal_code
from is3.disposal_assoc_defn tx
where assoc_code='LM'
and tx.bh_disposal_code= t2.bh_disposal_code)
END AS dam_stat,
// end section 1

rtrim(t3.prefix) || '-' || rtrim(t3.regnum) AS sire_regnum,
case when t1.mating='A' then 'Y' end AS ai,

// section 2
case when t1.mating='A' then
(SELECT t2a.datex FROM is3.service_ai t2a
WHERE t2a.cow_bhid=t1.dam_bhid
AND t2a.datex =
(SELECT MAX(datex) FROM is3.service_ai
WHERE days(datex) < (days(t1.birth_date)-283+45)
AND days(datex) > (days(t1.birth_date)-283-45) )
)
end AS breed_date,
// end section 2

t1p.herd_id AS tag,
t1.birth_date,
t1b.birth_mgmt AS birth_grp,
case when t1.sex='C' then 'H' else t1.sex end AS sex,
translate(t1b.twin_code,'SSTTTT','NSCBMT') AS typex,
t1.hps,
t1c.color_desc AS color,
t1b.birth_weight AS weight,
t1.namex AS name,
rtrim(t1.prefix) || '-' || rtrim(t1.regnum) AS regnum
FROM is3.animals t1
JOIN is3.animals t2 ON t1.dam_bhid=t2.bhid
JOIN is3.animals t3 ON t1.sire_bhid=t3.bhid
JOIN is3.color_defn t1c ON t1.color=t1c.bh_color_code
LEFT OUTER JOIN is3.birth t1b ON t1.bhid=t1b.bhid
LEFT OUTER JOIN is3.animals_priv t1p
ON t1.bhid=t1p.bhid AND t1p.herd_owner_id=1
LEFT OUTER JOIN is3.animals_priv t2p
ON t2.bhid=t2p.bhid AND t2p.herd_owner_id=1
WHERE t1.bhid IN (SELECT bhid FROM is3.animal_sets
WHERE set_name='Lm_ET' AND userid='jhough')
ORDER BY t1.bhid

Your observation is correct! DB2 loves to turn subqueries to joins.
It is pretty oblivious to CASE expression.
The easiest fix is to push/replicate the WHEN predicate into the WHERE
clause of the scalar subquery.
DB2 is smart enough to realize that the predicate is outside of the
context of the subquery and compute it before going all out for the
subquery.

Cheers
Serge

Thanks for the quick reply (when do you sleep:-).

I changed to the syntax below and the time is now 39 seconds. That is a
great improvement over 592 seconds, but is much slower than the 1-2
seconds to do everything but these two fields. I get the impression that
the subselects are still being performed for each row of t1 rather than
the rows selected by the JOINS.

Should I be looking at temporary tables or dedicated functions?

SELECT t2p.herd_id AS dam_tag,
is3.nalf_prefix(t2.tattoo) AS dam_prefix,
is3.nalf_id(t2.tattoo) AS dam_id,
upper(right(rtrim(t2.tattoo),1)) AS dam_yr_code,
rtrim(t2.prefix) || '-' || rtrim(t2.regnum) AS dam_regnum,
case when t2.breed_1='AN' AND t2.pct_1=100 THEN '11111111' end AS
dam_breed,
char(t2.birth_date,USA) AS dam_birth_date,
(select disposal_code
from is3.disposal_assoc_defn tx
where assoc_code='LM'
and tx.bh_disposal_code= t2.bh_disposal_code

-- small tweak here and ( (t1.bh_disposal_code is not null)
AND
(t1.bh_disposal_code <> 0)
) AND (t1.bh_disposal_code <> 0) is good enough because: (NULL <> 0) is
UNKNOWN != TRUE ) AS dam_stat,
rtrim(t3.prefix) || '-' || rtrim(t3.regnum) AS sire_regnum,
case when t1.mating='A' then 'Y' end AS ai,
(SELECT t2a.datex FROM is3.service_ai t2a
WHERE t2a.cow_bhid=t1.dam_bhid -- AND t2a.datex = (SELECT MAX(datex) FROM is3.service_ai
WHERE days(datex) <
(days(t1.birth_date)-283+45)
AND days(datex) >
(days(t1.birth_date)-283-45) ) AND t2a.datex = (SELECT MAX(datex) FROM is3.service_ai
WHERE datex < t1.birth_date - 283 days + 45 days
AND datex > t1.birt_date - 283 days + 45 days
AND t1.mating = 'A' ) AND t1.mating='A'
) AS breed_date, -- Or what about only:
(SELECT MAX(datex) FROM is3.service_ai
WHERE datex < t1.birth_date - 283 days + 45 days
AND datex > t1.birt_date - 283 days + 45 days
AND t1.mating = 'A' ) AS breed_date
-- If no row is found the results will still be NULL...
t1p.herd_id AS tag,
t1.birth_date,
t1b.birth_mgmt AS birth_grp,
case when t1.sex='C' then 'H' else t1.sex end AS sex,
translate(t1b.twin_code,'SSTTTT','NSCBMT') AS typex,
t1.hps,
t1c.color_desc AS color,
t1b.birth_weight AS weight,
t1.namex AS name,
rtrim(t1.prefix) || '-' || rtrim(t1.regnum) AS regnum
FROM is3.animals t1
JOIN is3.animals t2 ON t1.dam_bhid=t2.bhid
JOIN is3.animals t3 ON t1.sire_bhid=t3.bhid
JOIN is3.color_defn t1c ON t1.color=t1c.bh_color_code
LEFT OUTER JOIN is3.birth t1b ON t1.bhid=t1b.bhid
LEFT OUTER JOIN is3.animals_priv t1p
ON t1.bhid=t1p.bhid AND t1p.herd_owner_id=1
LEFT OUTER JOIN is3.animals_priv t2p
ON t2.bhid=t2p.bhid AND t2p.herd_owner_id=1
WHERE t1.bhid IN (SELECT bhid FROM is3.animal_sets WHERE
set_name='Lm_ET' AND userid='jhough')
ORDER BY t1.bhid
go


Cheers
Serge

--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #4
Serge Rielau wrote:
Bob Stearns wrote:
Serge Rielau wrote:
Bob Stearns wrote:

When I run the following query with the two sections commented out,
the response time is between 1 an 2 seconds; with the first
indicated section enabled, the response goes up to 15 seconds even
though t1.bh_disposal_code IS NOT NULL in only one row; with the
second section enabled, it goes up to 592 seconds, even though
t1.mating is NULL in all the rows chosen. Why should the (supposedly
never executed) scalar subqueries cost so much? It is as if the CASE
statements were being evaluated on every row in t1 rather than just
the selected ones. Both of these scalar subqueries are here because
the are infrequent and the corresponding join is difficult or
impossible.

SELECT t2p.herd_id AS dam_tag,
is3.nalf_prefix(t2.tattoo) AS dam_prefix,
is3.nalf_id(t2.tattoo) AS dam_id,
upper(right(rtrim(t2.tattoo),1)) AS dam_yr_code,
rtrim(t2.prefix) || '-' || rtrim(t2.regnum) AS dam_regnum,
case when t2.breed_1='AN' AND t2.pct_1=100
THEN '11111111' end AS dam_breed,
char(t2.birth_date,USA) AS dam_birth_date,

// section 1
case
when t1.bh_disposal_code is null then ''
when t1.bh_disposal_code=0 then ''
else (select disposal_code
from is3.disposal_assoc_defn tx
where assoc_code='LM'
and tx.bh_disposal_code= t2.bh_disposal_code)
END AS dam_stat,
// end section 1

rtrim(t3.prefix) || '-' || rtrim(t3.regnum) AS sire_regnum,
case when t1.mating='A' then 'Y' end AS ai,

// section 2
case when t1.mating='A' then
(SELECT t2a.datex FROM is3.service_ai t2a
WHERE t2a.cow_bhid=t1.dam_bhid
AND t2a.datex =
(SELECT MAX(datex) FROM is3.service_ai
WHERE days(datex) < (days(t1.birth_date)-283+45)
AND days(datex) > (days(t1.birth_date)-283-45) )
)
end AS breed_date,
// end section 2

t1p.herd_id AS tag,
t1.birth_date,
t1b.birth_mgmt AS birth_grp,
case when t1.sex='C' then 'H' else t1.sex end AS sex,
translate(t1b.twin_code,'SSTTTT','NSCBMT') AS typex,
t1.hps,
t1c.color_desc AS color,
t1b.birth_weight AS weight,
t1.namex AS name,
rtrim(t1.prefix) || '-' || rtrim(t1.regnum) AS regnum
FROM is3.animals t1
JOIN is3.animals t2 ON t1.dam_bhid=t2.bhid
JOIN is3.animals t3 ON t1.sire_bhid=t3.bhid
JOIN is3.color_defn t1c ON t1.color=t1c.bh_color_code
LEFT OUTER JOIN is3.birth t1b ON t1.bhid=t1b.bhid
LEFT OUTER JOIN is3.animals_priv t1p
ON t1.bhid=t1p.bhid AND t1p.herd_owner_id=1
LEFT OUTER JOIN is3.animals_priv t2p
ON t2.bhid=t2p.bhid AND t2p.herd_owner_id=1
WHERE t1.bhid IN (SELECT bhid FROM is3.animal_sets
WHERE set_name='Lm_ET' AND userid='jhough')
ORDER BY t1.bhid

Your observation is correct! DB2 loves to turn subqueries to joins.
It is pretty oblivious to CASE expression.
The easiest fix is to push/replicate the WHEN predicate into the
WHERE clause of the scalar subquery.
DB2 is smart enough to realize that the predicate is outside of the
context of the subquery and compute it before going all out for the
subquery.

Cheers
Serge


Thanks for the quick reply (when do you sleep:-).

I changed to the syntax below and the time is now 39 seconds. That is
a great improvement over 592 seconds, but is much slower than the 1-2
seconds to do everything but these two fields. I get the impression
that the subselects are still being performed for each row of t1
rather than the rows selected by the JOINS.

Should I be looking at temporary tables or dedicated functions?

SELECT t2p.herd_id AS dam_tag,
is3.nalf_prefix(t2.tattoo) AS dam_prefix,
is3.nalf_id(t2.tattoo) AS dam_id,
upper(right(rtrim(t2.tattoo),1)) AS dam_yr_code,
rtrim(t2.prefix) || '-' || rtrim(t2.regnum) AS dam_regnum,
case when t2.breed_1='AN' AND t2.pct_1=100 THEN '11111111' end
AS dam_breed,
char(t2.birth_date,USA) AS dam_birth_date,
(select disposal_code
from is3.disposal_assoc_defn tx
where assoc_code='LM'
and tx.bh_disposal_code= t2.bh_disposal_code


-- small tweak here
and ( (t1.bh_disposal_code is not null)
AND
(t1.bh_disposal_code <> 0)
)


AND (t1.bh_disposal_code <> 0) is good enough because: (NULL <> 0) is
UNKNOWN != TRUE
) AS dam_stat,
rtrim(t3.prefix) || '-' || rtrim(t3.regnum) AS sire_regnum,
case when t1.mating='A' then 'Y' end AS ai,
(SELECT t2a.datex FROM is3.service_ai t2a
WHERE t2a.cow_bhid=t1.dam_bhid


--
AND t2a.datex = (SELECT MAX(datex) FROM is3.service_ai
WHERE days(datex) <
(days(t1.birth_date)-283+45)
AND days(datex) >
(days(t1.birth_date)-283-45) )


AND t2a.datex = (SELECT MAX(datex) FROM is3.service_ai
WHERE datex < t1.birth_date - 283 days + 45 days
AND datex > t1.birt_date - 283 days + 45 days
AND t1.mating = 'A' )
AND t1.mating='A'
) AS breed_date,


-- Or what about only:
(SELECT MAX(datex) FROM is3.service_ai
WHERE datex < t1.birth_date - 283 days + 45 days
AND datex > t1.birt_date - 283 days + 45 days
AND t1.mating = 'A' ) AS breed_date
-- If no row is found the results will still be NULL...
t1p.herd_id AS tag,
t1.birth_date,
t1b.birth_mgmt AS birth_grp,
case when t1.sex='C' then 'H' else t1.sex end AS sex,
translate(t1b.twin_code,'SSTTTT','NSCBMT') AS typex,
t1.hps,
t1c.color_desc AS color,
t1b.birth_weight AS weight,
t1.namex AS name,
rtrim(t1.prefix) || '-' || rtrim(t1.regnum) AS regnum
FROM is3.animals t1
JOIN is3.animals t2 ON t1.dam_bhid=t2.bhid
JOIN is3.animals t3 ON t1.sire_bhid=t3.bhid
JOIN is3.color_defn t1c ON t1.color=t1c.bh_color_code
LEFT OUTER JOIN is3.birth t1b ON t1.bhid=t1b.bhid
LEFT OUTER JOIN is3.animals_priv t1p
ON t1.bhid=t1p.bhid AND t1p.herd_owner_id=1
LEFT OUTER JOIN is3.animals_priv t2p
ON t2.bhid=t2p.bhid AND t2p.herd_owner_id=1
WHERE t1.bhid IN (SELECT bhid FROM is3.animal_sets WHERE
set_name='Lm_ET' AND userid='jhough')
ORDER BY t1.bhid
go

Cheers
Serge

Thanks again. Neither change resulted in a significant change. We were
looking in the wrong place. I, at least, expected the optimizer to be
better than it is. I reached this epiphany while dreaming about this
problem (talk about nightmares). By forcing the most restrictive WHERE
clause to be done first, the time was reduced to 1-2 seconds. I changed
the FROM phrase to:

FROM (SELECT *
FROM is3.animals
WHERE bhid IN (SELECT bhid
FROM is3.animal_sets
WHERE set_name='Junk' AND userid='jhough')
) t1
JOIN is3.animals t2 ON t1.dam_bhid=t2.bhid
JOIN is3.animals t3 ON t1.sire_bhid=t3.bhid
JOIN is3.color_defn t1c ON t1.color=t1c.bh_color_code
LEFT OUTER JOIN is3.birth t1b ON t1.bhid=t1b.bhid
LEFT OUTER JOIN is3.animals_priv t1p
ON t1.bhid=t1p.bhid AND t1p.herd_owner_id=1
LEFT OUTER JOIN is3.animals_priv t2p
ON t2.bhid=t2p.bhid AND t2p.herd_owner_id=1

Do parentheses force the order of JOINS to be done the parenthesized
way? And why does the optimizer not see this optimization in the
presence of the problematical sections, but does without them?
Nov 12 '05 #5
Bob Stearns wrote:
Serge Rielau wrote:
Bob Stearns wrote:
Serge Rielau wrote:

Bob Stearns wrote:

> When I run the following query with the two sections commented out,
> the response time is between 1 an 2 seconds; with the first
> indicated section enabled, the response goes up to 15 seconds even
> though t1.bh_disposal_code IS NOT NULL in only one row; with the
> second section enabled, it goes up to 592 seconds, even though
> t1.mating is NULL in all the rows chosen. Why should the
> (supposedly never executed) scalar subqueries cost so much? It is
> as if the CASE statements were being evaluated on every row in t1
> rather than just the selected ones. Both of these scalar subqueries
> are here because the are infrequent and the corresponding join is
> difficult or impossible.
>
> SELECT t2p.herd_id AS dam_tag,
> is3.nalf_prefix(t2.tattoo) AS dam_prefix,
> is3.nalf_id(t2.tattoo) AS dam_id,
> upper(right(rtrim(t2.tattoo),1)) AS dam_yr_code,
> rtrim(t2.prefix) || '-' || rtrim(t2.regnum) AS dam_regnum,
> case when t2.breed_1='AN' AND t2.pct_1=100
> THEN '11111111' end AS dam_breed,
> char(t2.birth_date,USA) AS dam_birth_date,
>
> // section 1
> case
> when t1.bh_disposal_code is null then ''
> when t1.bh_disposal_code=0 then ''
> else (select disposal_code
> from is3.disposal_assoc_defn tx
> where assoc_code='LM'
> and tx.bh_disposal_code= t2.bh_disposal_code)
> END AS dam_stat,
> // end section 1
>
> rtrim(t3.prefix) || '-' || rtrim(t3.regnum) AS sire_regnum,
> case when t1.mating='A' then 'Y' end AS ai,
>
> // section 2
> case when t1.mating='A' then
> (SELECT t2a.datex FROM is3.service_ai t2a
> WHERE t2a.cow_bhid=t1.dam_bhid
> AND t2a.datex =
> (SELECT MAX(datex) FROM is3.service_ai
> WHERE days(datex) < (days(t1.birth_date)-283+45)
> AND days(datex) >
> (days(t1.birth_date)-283-45) )
> )
> end AS breed_date,
> // end section 2
>
> t1p.herd_id AS tag,
> t1.birth_date,
> t1b.birth_mgmt AS birth_grp,
> case when t1.sex='C' then 'H' else t1.sex end AS sex,
> translate(t1b.twin_code,'SSTTTT','NSCBMT') AS typex,
> t1.hps,
> t1c.color_desc AS color,
> t1b.birth_weight AS weight,
> t1.namex AS name,
> rtrim(t1.prefix) || '-' || rtrim(t1.regnum) AS regnum
> FROM is3.animals t1
> JOIN is3.animals t2 ON t1.dam_bhid=t2.bhid
> JOIN is3.animals t3 ON t1.sire_bhid=t3.bhid
> JOIN is3.color_defn t1c ON t1.color=t1c.bh_color_code
> LEFT OUTER JOIN is3.birth t1b ON t1.bhid=t1b.bhid
> LEFT OUTER JOIN is3.animals_priv t1p
> ON t1.bhid=t1p.bhid AND t1p.herd_owner_id=1
> LEFT OUTER JOIN is3.animals_priv t2p
> ON t2.bhid=t2p.bhid AND t2p.herd_owner_id=1
> WHERE t1.bhid IN (SELECT bhid FROM is3.animal_sets
> WHERE set_name='Lm_ET' AND userid='jhough')
> ORDER BY t1.bhid


Your observation is correct! DB2 loves to turn subqueries to joins.
It is pretty oblivious to CASE expression.
The easiest fix is to push/replicate the WHEN predicate into the
WHERE clause of the scalar subquery.
DB2 is smart enough to realize that the predicate is outside of the
context of the subquery and compute it before going all out for the
subquery.

Cheers
Serge


Thanks for the quick reply (when do you sleep:-).

I changed to the syntax below and the time is now 39 seconds. That is
a great improvement over 592 seconds, but is much slower than the 1-2
seconds to do everything but these two fields. I get the impression
that the subselects are still being performed for each row of t1
rather than the rows selected by the JOINS.

Should I be looking at temporary tables or dedicated functions?

SELECT t2p.herd_id AS dam_tag,
is3.nalf_prefix(t2.tattoo) AS dam_prefix,
is3.nalf_id(t2.tattoo) AS dam_id,
upper(right(rtrim(t2.tattoo),1)) AS dam_yr_code,
rtrim(t2.prefix) || '-' || rtrim(t2.regnum) AS dam_regnum,
case when t2.breed_1='AN' AND t2.pct_1=100 THEN '11111111' end
AS dam_breed,
char(t2.birth_date,USA) AS dam_birth_date,
(select disposal_code
from is3.disposal_assoc_defn tx
where assoc_code='LM'
and tx.bh_disposal_code= t2.bh_disposal_code

-- small tweak here
and ( (t1.bh_disposal_code is not null)
AND
(t1.bh_disposal_code <> 0)
)

AND (t1.bh_disposal_code <> 0) is good enough because: (NULL <> 0) is
UNKNOWN != TRUE
) AS dam_stat,
rtrim(t3.prefix) || '-' || rtrim(t3.regnum) AS sire_regnum,
case when t1.mating='A' then 'Y' end AS ai,
(SELECT t2a.datex FROM is3.service_ai t2a
WHERE t2a.cow_bhid=t1.dam_bhid

--
AND t2a.datex = (SELECT MAX(datex) FROM is3.service_ai
WHERE days(datex) <
(days(t1.birth_date)-283+45)
AND days(datex) >
(days(t1.birth_date)-283-45) )

AND t2a.datex = (SELECT MAX(datex) FROM is3.service_ai
WHERE datex < t1.birth_date - 283 days + 45 days
AND datex > t1.birt_date - 283 days + 45 days
AND t1.mating = 'A' )
AND t1.mating='A'
) AS breed_date,

-- Or what about only:
(SELECT MAX(datex) FROM is3.service_ai
WHERE datex < t1.birth_date - 283 days + 45 days
AND datex > t1.birt_date - 283 days + 45 days
AND t1.mating = 'A' ) AS breed_date
-- If no row is found the results will still be NULL...
t1p.herd_id AS tag,
t1.birth_date,
t1b.birth_mgmt AS birth_grp,
case when t1.sex='C' then 'H' else t1.sex end AS sex,
translate(t1b.twin_code,'SSTTTT','NSCBMT') AS typex,
t1.hps,
t1c.color_desc AS color,
t1b.birth_weight AS weight,
t1.namex AS name,
rtrim(t1.prefix) || '-' || rtrim(t1.regnum) AS regnum
FROM is3.animals t1
JOIN is3.animals t2 ON t1.dam_bhid=t2.bhid
JOIN is3.animals t3 ON t1.sire_bhid=t3.bhid
JOIN is3.color_defn t1c ON t1.color=t1c.bh_color_code
LEFT OUTER JOIN is3.birth t1b ON t1.bhid=t1b.bhid
LEFT OUTER JOIN is3.animals_priv t1p
ON t1.bhid=t1p.bhid AND t1p.herd_owner_id=1
LEFT OUTER JOIN is3.animals_priv t2p
ON t2.bhid=t2p.bhid AND t2p.herd_owner_id=1
WHERE t1.bhid IN (SELECT bhid FROM is3.animal_sets WHERE
set_name='Lm_ET' AND userid='jhough')
ORDER BY t1.bhid
go


Cheers
Serge

Thanks again. Neither change resulted in a significant change. We were
looking in the wrong place. I, at least, expected the optimizer to be
better than it is. I reached this epiphany while dreaming about this
problem (talk about nightmares). By forcing the most restrictive WHERE
clause to be done first, the time was reduced to 1-2 seconds. I changed
the FROM phrase to:

FROM (SELECT *
FROM is3.animals
WHERE bhid IN (SELECT bhid
FROM is3.animal_sets
WHERE set_name='Junk' AND userid='jhough')
) t1
JOIN is3.animals t2 ON t1.dam_bhid=t2.bhid
JOIN is3.animals t3 ON t1.sire_bhid=t3.bhid
JOIN is3.color_defn t1c ON t1.color=t1c.bh_color_code
LEFT OUTER JOIN is3.birth t1b ON t1.bhid=t1b.bhid
LEFT OUTER JOIN is3.animals_priv t1p
ON t1.bhid=t1p.bhid AND t1p.herd_owner_id=1
LEFT OUTER JOIN is3.animals_priv t2p
ON t2.bhid=t2p.bhid AND t2p.herd_owner_id=1

Do parentheses force the order of JOINS to be done the parenthesized
way? And why does the optimizer not see this optimization in the
presence of the problematical sections, but does without them?

You have three kinds of joins in te hmix here:
INNER, LEFT OUTER and LEFT OUTER with enforced inner of cardinality 1
that's your scalar subquery.
That si somewhat trickier than just having INNER and LEFT (without the
enforcement). Not knowing which version or fixpack you are on the
combinations I do know that theer have been improvement in that area in
recent V8 fixpacks.
Now one could point out that Sybase apparently has solved the problem
and shame on DB2's famed optimizer. BUT to the best of my knowledge
Sybase does NOT enforce scslar subqueries to be scalar (returning
_any_one_ row). So teh DBMS oversimplifies the problem.
To get Sybase semantics you would need to use a regular LEFT OUTER JOIN
insetad the scalar subquery and add a FETCH FIRST ROW ONLY clause to the
right. That would put the systems on more levl grounds.

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
by: Mountain Man | last post by:
Hi, I have an array derived from a set of radio buttons that I want to break down into a single variable for use with a database. How can I do this? $gender is the array, and I want $gender2 to...
8
by: Rodd Snook | last post by:
I have an application which makes extensive use of the Scripting.Dictionary object. I'm not doing anything silly like putting them outside the page scope -- just creating quite a few of them and...
5
by: Warren Wright | last post by:
Hi group, I have a select statement that if run against a 1 million record database directly in query analyzer takes less than 1 second. However, if I execute the select statement in a stored...
12
by: serge | last post by:
I have an SP that is big, huge, 700-800 lines. I am not an expert but I need to figure out every possible way that I can improve the performance speed of this SP. In the next couple of weeks I...
2
by: martin | last post by:
When I execute Scalar and it finds no data,it cannot be trapped. Function with scalar returns nothing instead of value in catch statement.
8
by: Xu, Wei | last post by:
Hi, I have wrote the following sql sentence.Do you have comments to improve the performance.I have created all the indexed. But it's still very slow.Thanks The primary key is proj_ID and...
2
by: Stephan Rose | last post by:
Still been beating on my Multi Unit Scalar structure and...it's coming along well. Since it is a core component of my app that virtually everything builds on, even the smalles amount of performance...
5
by: pantagruel | last post by:
Hi, It is generally stated that stringbuilder should be used instead of just concatenating strings with the plus operator. That's fine enough what I'm wondering in cases I have: String S =...
4
by: pavanponnapalli | last post by:
hi, I need to send some values to a subroutine where i use insert query. suppose say my scalar is as follows: my ($name,$number,$address) etc and i get the values into those...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.