Bytes IT Community

using nextval in external udf

pfa
I have a UDF which returns a table. I was hoping to access NEXTVAL from
a sequence to affect the logic flow, and consequently the returned
result. I'm using a LANGUAGE C function.

Here's the fetch snippet

case SQLUDF_TF_FETCH:
    /* fetch next row */
    {
        char *nextid = myrecids++;
        char *ptr;

        --pScratArea->recids_len;
        if (pScratArea->recids_len < 1)
        {
            /* SQLUDF_STATE is part of SQLUDF_TRAIL_ARGS_ALL */
            strcpy(outRecid, "");
            strcpy(SQLUDF_STATE, "02000");   /* no more rows */
            break;
        }

        /* look for the record delimiter (0xFE) and terminate nextid */
        for (ptr = nextid; *ptr != '\376'; ++ptr)
            --pScratArea->recids_len;
        *ptr = '\0';
        myrecids = ptr + 1;

        /* copy the now null-terminated key to outRecid (return arg) */
        strcpy(outRecid, nextid);
    }

    *recidNullInd = 0;
    /* next row of data */
    pScratArea->file_pos++;
    break;
What I'm hoping to do is use the result of a NEXTVAL call to cross-check
against a counter in my scratch pad, so that multiple processes can feed
off this table function and each row from it would only be processed
once.

Nov 12 '05 #1
7 Replies


pfa wrote:
[original question and code snipped]


I still don't quite understand your scenario, but you could use embedded SQL
and simply query the sequence that way. You have to register the UDF with
READS SQL DATA, however.
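
To make the READS SQL DATA route concrete: with that clause on the CREATE FUNCTION statement, the UDF body may issue SQL itself. A rough embedded-SQL sketch follows; the sequence name MYSEQ is an assumption, and this fragment needs the DB2 precompiler (a .sqc source file with EXEC SQL INCLUDE SQLCA), so it is not compilable as plain C:

```c
/* Somewhere in the SQLUDF_TF_FETCH case.  Assumes the function was
 * registered with READS SQL DATA; MYSEQ is a placeholder name. */
EXEC SQL BEGIN DECLARE SECTION;
sqlint32 next_pos;
EXEC SQL END DECLARE SECTION;

EXEC SQL VALUES NEXTVAL FOR MYSEQ INTO :next_pos;
if (SQLCODE != 0)
{
    strcpy(SQLUDF_STATE, "38999");   /* report the failure to DB2 */
    break;
}
/* ... compare next_pos against the scratch-pad counter here ... */
```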

--
Knut Stolze
Information Integration Development
IBM Germany / University of Jena
Nov 12 '05 #2

pfa
Hmmm ok, can't be done via CLI? i.e. are there any handles available?

Knut Stolze wrote:
pfa wrote:
[original question and code snipped]

I still don't quite understand your scenario, but you could use embedded SQL
and simply query the sequence that way. You have to register the UDF with
READS SQL DATA, however.

Nov 12 '05 #3

pfa
Perhaps there's an alternative way to achieve my end result. I'll try
to explain better:

We have a program which "programmatically" builds a selection of keys
to be processed at a later date by multiple processes run in parallel.
As this list is "potentially" too big to be used in a WHERE key IN
(...), I figured a table function which extracts each key and returns
it as a row would work better:

e.g. SELECT key u, data_column t from udftable('listfile') u, table t
WHERE u.key = t.key

(btw 'listfile' is currently a file on the OS which the udftable reads
in and scans through, looking for delimiters and returning a
null-terminated char * as the key, keeping the position in the list for
the next fetch)

Each process run in parallel feeds off the above SELECT (now, this is a
work-in-progress project, so it may not be the best way to go... we are
not DB2 experts) and processes the data_column contents in some way. The
catch is that each row in "table t" must only be processed once, so I
got the idea of using a NEXTVAL sequence from DB2: each process issuing
a FETCH would get a unique (key u, data_column t) result, because the
udftable's function would skip keys in the list until the NEXTVAL
position has been reached.

The only other way I could see to do this was having a separate process
(MQ Series perhaps? never used it) which holds this cursor; the other
processes would fetch from it, ensuring each row is only processed once.

Splitting the list up is not an option as the number of processes
started is up to the customer and they can start more after the others
are already running.

Nov 12 '05 #4

pfa wrote:
[scenario explanation snipped]

The table-function solution is a no-go. Check out the NO PARALLEL
option - which happens to be mandatory for external table functions.
You will not achieve the parallelizing effect you are after.
If DB2 (hypothetically) were to support "partitioned" table functions
(which you want) or "replicated" ones (which you don't want), the likely
way to achieve your goal would be an extension of the DBINFO structure
with the partition number, so you could partition the file.

Now, the problem you are facing is not new by any means. It often
appears during data cleansing in a warehouse.
In those cases, scripts are used to fire the same query from each data
node (using the DB2NODE environment variable to connect) and pass the
database partition number as an argument to the table function.
It's not as pretty as having the optimizer do the job, but it works
quite well.

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #5

pfa wrote:
Hmmm ok, can't be done via CLI? i.e. are there any handles available?


You can use CLI. There is a description in the Application Development
Guide that explains how to obtain the (default) connection handle inside
the UDF for the current connection.

--
Knut Stolze
Information Integration Development
IBM Germany / University of Jena
Nov 12 '05 #6

pfa wrote:
[scenario explanation snipped]


My first idea would have been to use a temp table. You populate the
table once, and then the different processes could use sequences to
coordinate which process operates on which rows. However, that really
depends on the functionality of the table function; copying its results
into a temp table might not be preferable, in which case Serge's
suggestion is the way to go.

--
Knut Stolze
Information Integration Development
IBM Germany / University of Jena
Nov 12 '05 #7

Did you consider having the process that creates the keys generate
separate files for each of the "worker" processes? It would mean you'd
need a separate procedure to start each of the multiple tasks, or use a
parameter which will be [part of] the file name.

Extending this to a database should be possible by setting up a control
table with an identifier for each of the multiple tasks. A predicate
indicating the task id will separate out the rows for processing.

Either of these should work with "a large number of rows to process" and
have all of the tasks complete in around the same amount of time. If you
truly need to have the tasks process on a "do the next input key"
sequence, then the following MIGHT be a way to do it:

Create your keys table each day with a column containing a numeric key
starting with 1. You'll need a unique index on the numeric key column.
After creating the keys table, create a control table having an
autogenerated identity column starting with 1. This table needs no
columns other than the identity column and should not be indexed. It
must have row level locking. Each task inserts a row into the control
table then retrieves the identity key for the inserted row. This key is
used to do a single row retrieval from the keys table. This keys table
retrieval should include an "OPTIMIZE FOR 1 ROWS" clause. Commits must
be taken at appropriate intervals to avoid lock escalation on the
control table.
Phil Sherman


pfa wrote:
[scenario explanation snipped]

Nov 12 '05 #8

This discussion thread is closed; replies have been disabled.