Help | Site Map
Connecting Tech Pros Worldwide
 
 
LinkBack Thread Tools
  #1  
Old November 17th, 2006, 07:05 AM
Sigmazen
Guest
 
Posts: n/a
Default DPSIs: Materialisation of column values for DPSI use

Hi
I am looking in to utilising the new functionality of DPSIs on zOS UDB
(DB2 v8), but I have a question regarding the following scenario.

- The first table is an Account table whose columns are Account,
Invoice Date, Partition#. This is a NPSI table.
There are about 500,000 accounts which gives me 6MM records (500k * 12
months)

- The second is an Invoice table whose columns are Account, Invoice
Date, Partition#, Invoice. This is a DPSI table partitioned on
Partition# which corresponds to the Invoice Julian Day, and is
clustered within a partition by Account.
There are 365 partitions, and an invoice comes through for a given
account once a month. Therefore for a year for a given account I will
find the invoices within 12 partitions.
A given invoice may take up 20 records, therefore there are 120MM
records on this table (6MM * 20 records)

- Finally I have a Driver table with a list of Accounts I need to pull
off for use.
There might be 100,000 records on here with accounts spanning all the
DPSI partitions for the year.

In a total NPSI world I could have the following query:

SELECT INV.INVOICE
FROM DRVR_TBL AS DRVR
, ACCT_TBL AS ACCT
, INV_TBL AS INV
WHERE ACCT.ACCT_NO = DRVR.ACCT_NO
AND INV.PART_NO = ACCT.PART_NO
AND INV.ACCT_NO = ACCT.ACCT_NO

However, by utilising the utility-based data loading flexibility of the
DPSI tables (ie one partition is offline whilst the rest are online
24x7, and being read), the PART-NO needs to be materialised first
before passing into a query otherwise DB2 goes against all the
partitions in order to index search for the account number.

Therefore the options I can see are :

OPTION#1: Open two CURSORs in my COBOL program thus:
CURSOR#1
SELECT ACCT.PART_NO, ACCT.ACCT_NO
FROM DRVR_TBL AS DRVR
, ACCT_TBL AS ACCT
WHERE ACCT.ACCT_NO = DRVR.ACCT_NO

CURSOR#2
SELECT INV.INVOICE
FROM INV_TBL AS INV
WHERE INV.PART_NO = :part-no
AND INV.ACCT_NO = :acct-no


OPTION#2: A rework of the following query which can be done on UDB for
AIX which is effecively a non-correlated sub-query but using two
columns instead of one.
SELECT INV.INVOICE
FROM INV_TBL AS INV
WHERE (INV.PART_NO, INV.ACCT_NO) IN (SELECT ACCT.PART_NO,
ACCT.ACCT_NO
FROM DRVR_TBL AS DRVR
, ACCT_TBL AS ACCT
WHERE ACCT.ACCT_NO =
DRVR.ACCT_NO)


OPTION#3: The rework could be a concatonation of the two columns

- The problem I have with #1 is that I have to OPEN, FETCH and CLOSE my
CURSOR#2 for all the 100k accounts I am reading from my CURSOR#1 on my
Driver table, thereby giving me an overhead ... am I worrying to much
about this overhead? is it really that much?
- The problem I have with #2 is that it can't be done on zOS UDB. Also
I'm worried that wouldn't DB2 have to materialise the whole resultset
before passing back thereby giving me possible temp-space issues, or
does it give me the results in part-no¦¦acct-no chunks?
- The problem with #3 is would the optimiser end up using any of my
column based indexes?

Any ideas and solutions would be most appreciated
Thanks in advance
Simon

  #2  
Old November 17th, 2006, 05:45 PM
Phil Sherman
Guest
 
Posts: n/a
Default Re: DPSIs: Materialisation of column values for DPSI use

Your description indicates that you start with a list of accounts and
want to retrieve all of the invoices for the year for each of the accounts.

If you use your option #2, you should retrieve all 12 rows of account
information first, then use an IN clause to retrieve all of the invoices
for that account. This should limit you to 12 of the invoice partitions
and the indexes should take you directly to the rows. Plan on having 12
values for the IN clause, fill all of them with zero then overlay the
values as you read the account rows. This will allow you to use static
SQL when there are less than 12 invoices in a year.

As an interesting aside - what do you do in leap years? Those have 366
days in them.

Second aside - why are you processing 20% (100,000 of 500,000) of the
accounts every night? Who looks at all that data? Who has the time to
set up the driver table data every day? If you bill each account once a
month, then less than 25k accounts change each day. If you have 100k
records and each represents an invoice for an account over a year's
time, then you have only about 8300 accounts being processed. Using the
IN clause above will dramatically cut the number of SQL statements
issued against the invoice table.


Phil Sherman



Sigmazen wrote:
Quote:
Hi
I am looking in to utilising the new functionality of DPSIs on zOS UDB
(DB2 v8), but I have a question regarding the following scenario.
>
- The first table is an Account table whose columns are Account,
Invoice Date, Partition#. This is a NPSI table.
There are about 500,000 accounts which gives me 6MM records (500k * 12
months)
>
- The second is an Invoice table whose columns are Account, Invoice
Date, Partition#, Invoice. This is a DPSI table partitioned on
Partition# which corresponds to the Invoice Julian Day, and is
clustered within a partition by Account.
There are 365 partitions, and an invoice comes through for a given
account once a month. Therefore for a year for a given account I will
find the invoices within 12 partitions.
A given invoice may take up 20 records, therefore there are 120MM
records on this table (6MM * 20 records)
>
- Finally I have a Driver table with a list of Accounts I need to pull
off for use.
There might be 100,000 records on here with accounts spanning all the
DPSI partitions for the year.
>
In a total NPSI world I could have the following query:
>
SELECT INV.INVOICE
FROM DRVR_TBL AS DRVR
, ACCT_TBL AS ACCT
, INV_TBL AS INV
WHERE ACCT.ACCT_NO = DRVR.ACCT_NO
AND INV.PART_NO = ACCT.PART_NO
AND INV.ACCT_NO = ACCT.ACCT_NO
>
However, by utilising the utility-based data loading flexibility of the
DPSI tables (ie one partition is offline whilst the rest are online
24x7, and being read), the PART-NO needs to be materialised first
before passing into a query otherwise DB2 goes against all the
partitions in order to index search for the account number.
>
Therefore the options I can see are :
>
OPTION#1: Open two CURSORs in my COBOL program thus:
CURSOR#1
SELECT ACCT.PART_NO, ACCT.ACCT_NO
FROM DRVR_TBL AS DRVR
, ACCT_TBL AS ACCT
WHERE ACCT.ACCT_NO = DRVR.ACCT_NO
>
CURSOR#2
SELECT INV.INVOICE
FROM INV_TBL AS INV
WHERE INV.PART_NO = :part-no
AND INV.ACCT_NO = :acct-no
>
>
OPTION#2: A rework of the following query which can be done on UDB for
AIX which is effecively a non-correlated sub-query but using two
columns instead of one.
SELECT INV.INVOICE
FROM INV_TBL AS INV
WHERE (INV.PART_NO, INV.ACCT_NO) IN (SELECT ACCT.PART_NO,
ACCT.ACCT_NO
FROM DRVR_TBL AS DRVR
, ACCT_TBL AS ACCT
WHERE ACCT.ACCT_NO =
DRVR.ACCT_NO)
>
>
OPTION#3: The rework could be a concatonation of the two columns
>
- The problem I have with #1 is that I have to OPEN, FETCH and CLOSE my
CURSOR#2 for all the 100k accounts I am reading from my CURSOR#1 on my
Driver table, thereby giving me an overhead ... am I worrying to much
about this overhead? is it really that much?
- The problem I have with #2 is that it can't be done on zOS UDB. Also
I'm worried that wouldn't DB2 have to materialise the whole resultset
before passing back thereby giving me possible temp-space issues, or
does it give me the results in part-no¦¦acct-no chunks?
- The problem with #3 is would the optimiser end up using any of my
column based indexes?
>
Any ideas and solutions would be most appreciated
Thanks in advance
Simon
>
 

Bookmarks

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are Off
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On

What is Bytes?

We are a network of experts and professionals in IT and software development that help one another with answers to tough questions and share insights. Get the best answers to your questions from over network members.
Post your question now . . .
It's fast and it's free

Popular Articles