469,315 Members | 1,573 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,315 developers. It's quick & easy.

COALESCE as a Performance Enhancer??!?

I am using a clever product called SQL Expert Pro to optimize
problematic SQL statements. It works by generating syntactically
identical variations of the original SQL, finding all unique execution
plans, and batch testing them.

Anyway, it generally turns out that a very odd change dramatically
improves performance (for a PeopleSoft database on SQL Server 7). Run
time goes from 52 seconds to 8 seconds.

The change is to replace a join condition in a where clause with an
odd equivalent COALESCE construct, e.g. replace WHERE C.PAY_ID =
D.PAY_ID with WHERE C.PAY_ID = COALESCE(D.PAY_ID, D.PAY_ID).

Has anyone seen this sort of behavior before? Why would it be
advantageous to COALESCE on the same field twice?

The original and COALESCE'd versions are shown below. Does the fact
that this is a nine-table join over large tables have anything to do
with it?

Original Query (52 sec)

SELECT F.ACCT_ID,
I.ENTITY_NAME,
A.TNDR_SOURCE_CD,
C.PAY_EVENT_ID,
C.NON_CIS_NAME,
C.NON_CIS_REF_NBR,
C.NON_CIS_COMMENT,
C.PAY_AMT,
D.PAY_SEG_AMT,
E.ACCOUNTING_DT,
':1',
':2',
':3',
E.FREEZE_OPRID,
E.FREEZE_DTTM,
E.FT_TYPE_FLG
FROM PS_CI_TNDR_CTL A,
PS_CI_PAY_TNDR B,
PS_CI_PAY C,
PS_CI_PAY_SEG D,
PS_CI_SA F,
PS_CI_SA_TYPE G,
PS_CI_ACCT_PER H,
PS_CI_PER_NAME I,
PS_CW_FT E
WHERE A.TNDR_CTL_ID = B.TNDR_CTL_ID
AND A.TNDR_SOURCE_CD LIKE 'STK%'
AND B.PAY_EVENT_ID = C.PAY_EVENT_ID
AND C.PAY_ID = D.PAY_ID
AND D.PAY_SEG_ID = E.SIBLING_ID
AND E.ACCOUNTING_DT BETWEEN '2003-10-01' AND '2003-10-31'
AND E.FT_TYPE_FLG IN ('PS', 'PX')
AND NOT EXISTS (SELECT 'X'
FROM PS_CW_INTERFACE_ID J
WHERE J.PAYOR_ACCT_ID = F.ACCT_ID)
AND E.SA_ID = F.SA_ID
AND F.SA_TYPE_CD = G.SA_TYPE_CD
AND G.DEBT_CL_CD = 'NCIS'
AND F.ACCT_ID = H.ACCT_ID
AND H.PER_ID = I.PER_ID
ORDER BY 2

Optimized Query (8 sec)

SELECT F.ACCT_ID,
I.ENTITY_NAME,
A.TNDR_SOURCE_CD,
C.PAY_EVENT_ID,
C.NON_CIS_NAME,
C.NON_CIS_REF_NBR,
C.NON_CIS_COMMENT,
C.PAY_AMT,
D.PAY_SEG_AMT,
E.ACCOUNTING_DT,
':1',
':2',
':3',
E.FREEZE_OPRID,
E.FREEZE_DTTM,
E.FT_TYPE_FLG
FROM PS_CI_TNDR_CTL A,
PS_CI_PAY_TNDR B,
PS_CI_PAY C,
PS_CI_PAY_SEG D,
PS_CI_SA F,
PS_CI_SA_TYPE G,
PS_CI_ACCT_PER H,
PS_CI_PER_NAME I,
PS_CW_FT E
WHERE A.TNDR_CTL_ID = COALESCE(B.TNDR_CTL_ID,B.TNDR_CTL_ID)
AND A.TNDR_SOURCE_CD LIKE 'STK%'
AND B.PAY_EVENT_ID = COALESCE(C.PAY_EVENT_ID, C.PAY_EVENT_ID)
AND C.PAY_ID = COALESCE(D.PAY_ID, D.PAY_ID)
AND D.PAY_SEG_ID = COALESCE(E.SIBLING_ID, E.SIBLING_ID)
AND COALESCE(E.ACCOUNTING_DT, E.ACCOUNTING_DT) BETWEEN '2003-10-01'
AND '2003-10-31'
AND COALESCE(E.FT_TYPE_FLG, E.FT_TYPE_FLG) IN ('PS', 'PX')
AND NOT EXISTS (SELECT 'X'
FROM PS_CW_INTERFACE_ID J
WHERE COALESCE(J.PAYOR_ACCT_ID, J.PAYOR_ACCT_ID) =
F.ACCT_ID)
AND E.SA_ID = COALESCE(F.SA_ID,F.SA_ID)
AND F.SA_TYPE_CD = COALESCE(G.SA_TYPE_CD,G.SA_TYPE_CD)
AND G.DEBT_CL_CD = 'NCIS'
AND F.ACCT_ID = COALESCE(H.ACCT_ID ,H.ACCT_ID)
AND H.PER_ID = COALESCE(I.PER_ID ,I.PER_ID)
ORDER BY 2
Jul 20 '05 #1
2 4595
Check out the query plan for both queries, and you are most likely to
see the difference.

I suppose the COALESCE(JoinColumn,JoinColumn) will be interpreted as a
non-optimizable expression. Because of this, the access path analysis
and join strategy will change. My guess is that it will change in favor
of loop joins. Also, the compilation time is most likely to drop
dramatically.

Gert-Jan
Jeff Roughgarden wrote:

I am using a clever product called SQL Expert Pro to optimize
problematic SQL statements. It works by generating syntactically
identical variations of the original SQL, finding all unique execution
plans, and batch testing them.

Anyway, it generally turns out that a very odd change dramatically
improves performance (for a PeopleSoft database on SQL Server 7). Run
time goes from 52 seconds to 8 seconds.

The change is to replace a join condition in a where clause with an
odd equivalent COALESCE construct, e.g. replace WHERE C.PAY_ID =
D.PAY_ID with WHERE C.PAY_ID = COALESCE(D.PAY_ID, D.PAY_ID).

Has anyone seen this sort of behavior before? Why would it be
advantageous to COALESCE on the same field twice?

The original and COALESCE'd versions are shown below. Does the fact
that this is a nine-table join over large tables have anything to do
with it?

Original Query (52 sec)

SELECT F.ACCT_ID,
I.ENTITY_NAME,
A.TNDR_SOURCE_CD,
C.PAY_EVENT_ID,
C.NON_CIS_NAME,
C.NON_CIS_REF_NBR,
C.NON_CIS_COMMENT,
C.PAY_AMT,
D.PAY_SEG_AMT,
E.ACCOUNTING_DT,
':1',
':2',
':3',
E.FREEZE_OPRID,
E.FREEZE_DTTM,
E.FT_TYPE_FLG
FROM PS_CI_TNDR_CTL A,
PS_CI_PAY_TNDR B,
PS_CI_PAY C,
PS_CI_PAY_SEG D,
PS_CI_SA F,
PS_CI_SA_TYPE G,
PS_CI_ACCT_PER H,
PS_CI_PER_NAME I,
PS_CW_FT E
WHERE A.TNDR_CTL_ID = B.TNDR_CTL_ID
AND A.TNDR_SOURCE_CD LIKE 'STK%'
AND B.PAY_EVENT_ID = C.PAY_EVENT_ID
AND C.PAY_ID = D.PAY_ID
AND D.PAY_SEG_ID = E.SIBLING_ID
AND E.ACCOUNTING_DT BETWEEN '2003-10-01' AND '2003-10-31'
AND E.FT_TYPE_FLG IN ('PS', 'PX')
AND NOT EXISTS (SELECT 'X'
FROM PS_CW_INTERFACE_ID J
WHERE J.PAYOR_ACCT_ID = F.ACCT_ID)
AND E.SA_ID = F.SA_ID
AND F.SA_TYPE_CD = G.SA_TYPE_CD
AND G.DEBT_CL_CD = 'NCIS'
AND F.ACCT_ID = H.ACCT_ID
AND H.PER_ID = I.PER_ID
ORDER BY 2

Optimized Query (8 sec)

SELECT F.ACCT_ID,
I.ENTITY_NAME,
A.TNDR_SOURCE_CD,
C.PAY_EVENT_ID,
C.NON_CIS_NAME,
C.NON_CIS_REF_NBR,
C.NON_CIS_COMMENT,
C.PAY_AMT,
D.PAY_SEG_AMT,
E.ACCOUNTING_DT,
':1',
':2',
':3',
E.FREEZE_OPRID,
E.FREEZE_DTTM,
E.FT_TYPE_FLG
FROM PS_CI_TNDR_CTL A,
PS_CI_PAY_TNDR B,
PS_CI_PAY C,
PS_CI_PAY_SEG D,
PS_CI_SA F,
PS_CI_SA_TYPE G,
PS_CI_ACCT_PER H,
PS_CI_PER_NAME I,
PS_CW_FT E
WHERE A.TNDR_CTL_ID = COALESCE(B.TNDR_CTL_ID,B.TNDR_CTL_ID)
AND A.TNDR_SOURCE_CD LIKE 'STK%'
AND B.PAY_EVENT_ID = COALESCE(C.PAY_EVENT_ID, C.PAY_EVENT_ID)
AND C.PAY_ID = COALESCE(D.PAY_ID, D.PAY_ID)
AND D.PAY_SEG_ID = COALESCE(E.SIBLING_ID, E.SIBLING_ID)
AND COALESCE(E.ACCOUNTING_DT, E.ACCOUNTING_DT) BETWEEN '2003-10-01'
AND '2003-10-31'
AND COALESCE(E.FT_TYPE_FLG, E.FT_TYPE_FLG) IN ('PS', 'PX')
AND NOT EXISTS (SELECT 'X'
FROM PS_CW_INTERFACE_ID J
WHERE COALESCE(J.PAYOR_ACCT_ID, J.PAYOR_ACCT_ID) =
F.ACCT_ID)
AND E.SA_ID = COALESCE(F.SA_ID,F.SA_ID)
AND F.SA_TYPE_CD = COALESCE(G.SA_TYPE_CD,G.SA_TYPE_CD)
AND G.DEBT_CL_CD = 'NCIS'
AND F.ACCT_ID = COALESCE(H.ACCT_ID ,H.ACCT_ID)
AND H.PER_ID = COALESCE(I.PER_ID ,I.PER_ID)
ORDER BY 2

Jul 20 '05 #2
Hi Jeff,

That's pretty bizarre. I can only guess that the old style JOIN
creates a cartesian product with a lot of nulls everywhere, before the
WHERE filters the rows. And that coalesce somehow handles the nulls
better?
Jul 20 '05 #3

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

3 posts views Thread by Steffen Vulpius | last post: by
5 posts views Thread by John Greve | last post: by
10 posts views Thread by db2group88 | last post: by
2 posts views Thread by raysefo | last post: by
reply views Thread by lscbtfws | last post: by
reply views Thread by luigileldsak | last post: by
1 post views Thread by =?Utf-8?B?R1ROMTcwNzc3?= | last post: by
reply views Thread by zhoujie | last post: by
reply views Thread by suresh191 | last post: by
reply views Thread by harlem98 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.