Did you consider having the process that creates the keys generate
separate files for each of the "worker" processes? It would mean you'd
need a separate procedure to start each of the multiple tasks, or to use
a parameter that forms [part of] the file name.
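A minimal sketch of that per-task file approach (the file naming scheme,
the `keys_task_` prefix, and the one-key-per-line layout are my
assumptions, not anything from the original setup):

```python
# Sketch: split one day's key list into one file per worker task,
# round-robin, so each worker reads only its own file. The naming
# convention (prefix + task number) is an assumption for illustration.

def split_keys(keys, num_tasks, prefix="keys_task_"):
    """Write keys round-robin into num_tasks files; return the file names."""
    names = [f"{prefix}{i}.txt" for i in range(1, num_tasks + 1)]
    files = [open(n, "w") for n in names]
    try:
        for i, key in enumerate(keys):
            # Key i goes to task (i mod num_tasks), so the files stay
            # close to equal in size and the tasks finish together.
            files[i % num_tasks].write(key + "\n")
    finally:
        for f in files:
            f.close()
    return names
```

Each worker would then be started with its task number as the parameter
and open only its own file.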
Extending this to a database should be possible by setting up a control
table with an identifier for each of the multiple tasks. A predicate
indicating the task id will separate out the rows for processing.
Either of these should work with "a large number of rows to process" and
have all of the tasks complete in around the same amount of time. If you
truly need to have the tasks process on a "do the next input key"
sequence then the following MIGHT be a way to do this:
Create your keys table each day with a column containing a numeric key
starting with 1. You'll need a unique index on the numeric key column.
After creating the keys table, create a control table having an
autogenerated identity column starting with 1. This table needs no
columns other than the identity column and should not be indexed. It
must have row level locking. Each task inserts a row into the control
table then retrieves the identity key for the inserted row. This key is
used to do a single row retrieval from the keys table. This keys table
retrieval should include an "OPTIMIZE FOR 1 ROWS" clause. Commits must
be taken at appropriate intervals to avoid lock escalation on the
control table.
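The steps above can be sketched as follows, again with sqlite3 standing
in for DB2: sqlite's rowid autoincrement plays the role of the
autogenerated identity column, and `lastrowid` plays the role of reading
IDENTITY_VAL_LOCAL() after the insert. Row-level locking and OPTIMIZE
FOR clauses are DB2 concerns that sqlite can't demonstrate, so they
appear only as comments:

```python
# Sketch of the "next input key" protocol: a worker claims the next key
# by inserting into the control table and using the generated identity
# value to fetch exactly one row from the keys table.
import sqlite3

def setup_tables(conn, keys):
    # Keys table with a unique numeric key starting at 1.
    conn.execute("CREATE TABLE keys (num INTEGER PRIMARY KEY, key TEXT)")
    conn.executemany("INSERT INTO keys VALUES (?, ?)",
                     [(i + 1, k) for i, k in enumerate(keys)])
    # Control table: nothing but the autogenerated identity column.
    conn.execute("CREATE TABLE control (id INTEGER PRIMARY KEY AUTOINCREMENT)")
    conn.commit()

def next_key(conn):
    """Claim and return the next key, or None once the list is exhausted."""
    cur = conn.execute("INSERT INTO control DEFAULT VALUES")
    claimed = cur.lastrowid        # DB2: VALUES IDENTITY_VAL_LOCAL()
    row = conn.execute(            # DB2: add OPTIMIZE FOR 1 ROWS here
        "SELECT key FROM keys WHERE num = ?", (claimed,)).fetchone()
    conn.commit()                  # commit promptly to avoid lock escalation
    return row[0] if row else None
```

Because each insert generates a distinct identity value, no two workers
can claim the same key, and no coordinator process is needed.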
Phil Sherman
pfa wrote:
Perhaps there's an alternative way to achieve my end result. I'll try
to explain better:
We have a program which "programmatically" builds a selection of keys
to be processed at a later date by multiple processes run in parallel.
As this list is "potentially" too big to be used in a WHERE key IN
(...) I figured a table function which extracts each key and returns it
as a row would work better:
e.g. SELECT u.key, t.data_column FROM TABLE(udftable('listfile')) u, table t
WHERE u.key = t.key
(btw 'listfile' is currently a file on the OS which the udftable reads
in and scans through looking for delimiters and returning a null
terminated char * as the key, keeping the position in the list for the
next fetch)
Each process run in parallel feeds off the above SELECT (now this is a
work in progress project so may not be the best way to go...we are not
DB2 experts) and processes the data_column contents in some way. So the
catch is that each row in "table t" must only be processed once, so I
got the idea of using a NEXTVAL sequence from DB2 so that each process,
when issuing a FETCH, would get a unique key u, data_column t result
because the udftable's function would skip keys in the list until the
NEXTVAL position has been reached.
The only other way I could see to do this was having a separate process
(MQ series perhaps? never used it) which has this cursor and the other
processes would fetch from it ensuring each row is only processed once.
Splitting the list up is not an option as the number of processes
started is up to the customer and they can start more after the others
are already running.