Bytes | Developer Community
Problem with VARCHAR FOR BIT DATA

I need to update a fairly wide non-indexed column for a very large
number of rows in several tables. One solution we were tossing around
was to avoid all the problems with performance and log space by
exporting the data to flat files, using an external program to make the
changes, then reload the data into the target tables. Preliminary tests
showed good results.

We ran into a problem where one table had a column defined as VARCHAR
FOR BIT DATA. Several of the rows had binary values that, when exported
to flat files, showed up as strange control and newline characters,
which confused most text-processing utilities. Is there a way to get
around this by materializing the data in a non-binary format and
reloading it as such?

Thanks,

Mike L. Bell

Oct 23 '06 #1
You could use the HEX() function on the column.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/
Oct 23 '06 #2
I just tested that function, and it does produce output that can be
processed as I need. Once I've done the conversion, though, I'm not sure
how to tell DB2 to take the character-ized version of the data and
convert it back into binary format via the LOAD utility. My first pass
just loaded the converted text into the varchar field as-is, truncating
the last half.

What used to be x'2000' gets converted to x'32303030' (where 32 is
ASCII "2" and 30 is ASCII "0"). I didn't see anything immediately
obvious under the MODIFIED BY modifiers in the docs. Is there a mod
that will do this?

Mike
Serge Rielau wrote:
You could use the HEX() function on the column.
Oct 23 '06 #3
mikelbell2000 wrote:
[...] Once I've done the conversion, I'm not sure how I tell DB2 to take
the character-ized version of the data and convert back into binary
format via the LOAD utility. [...]
Given the table created as:

D:\SQLLIB\BIN>db2 create table junk (c1 varchar(16) for bit data)
DB20000I The SQL command completed successfully.

you can insert hex value of 2000 into it:

D:\SQLLIB\BIN>db2 insert into junk values x'2000'
DB20000I The SQL command completed successfully.

which you can retrieve in two different ways:

D:\SQLLIB\BIN>db2 select * from junk

C1
-----------------------------------
x'2000'

1 record(s) selected.
D:\SQLLIB\BIN>db2 select hex(c1) from junk

1
--------------------------------
2000

1 record(s) selected.
If this is not what you want - then I don't understand your problem;
please elaborate.

Jan M. Nelken
Oct 23 '06 #4
This is what currently happens

insert into junk values x'2000'
select * from junk
c1
----------
x'2000'

export to dumpfile.txt of del select hex(c1) from junk
dumpfile.txt contains:
"2000"
load from dumpfile.txt of del replace into junk

select * from junk
c1
--------
x'32303030'

I need junk.c1 to be x'2000'. Some kind of conversion during the load
process needs to happen to un-HEX the character string back to
binary.
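The mismatch here is hex encoding on the way out with no matching decode on the way back in. A quick Python sketch (standing in for DB2, using the same example value) illustrates it:

```python
# What HEX() does on export, and the un-hex step LOAD would need.
original = b"\x20\x00"             # the column value x'2000'

hexed = original.hex().upper()     # what "select hex(c1)" exports
assert hexed == "2000"

# Loading that text verbatim stores the ASCII codes of "2000":
naive = hexed.encode("ascii")      # x'32303030' -- the bad result above
assert naive.hex().upper() == "32303030"

# The missing step: decode the hex string back to raw bytes.
restored = bytes.fromhex(hexed)
assert restored == original        # x'2000' again
```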

For a more realistic scenario, we need to unload data similar to:

ITEM_ID                         CONT_ID
------------------------------  ----------------
x'20060918142255954039000000'   EWKN3TQ7M2J32Q1C

then edit the file, substituting
"94+3+ICM8+icmnlsdb14+Srt_Transfer59+26+PCBB%24A4DCROTYHKX++++++++++18+PCBB%24A4DCROTYHKX++1+14+1043"
for "EWKN3TQ7M2J32Q1C", then reload the table from the new file.
Exporting the ITEM_ID field to a delimited file produces:

" ^F ^X^T""U^@9","EWKN3TQ7M2J32Q1C" (with three trailing
null characters at the end of the first field)

Depending on the control characters in the field, utilities like grep,
sed, awk, etc. will do wild things. This is why it would be nice to HEX
the value as suggested (thus removing the control characters), perform
the text substitution, then reload the table from the newly processed
file, converting the HEXed value back to binary along the way to
preserve its original content.

Hope this clarifies.

Mike
Jan M. Nelken wrote:
[...] If this is not what you want - then I don't understand your
problem; please elaborate.
Oct 23 '06 #5
mikelbell2000 wrote:
[...] I need junk.c1 to be x'2000'. Some kind of conversion during the
load process needs to happen to un-HEX the character string back to
binary.
Ok - I think I got it.

It appears that load/import does not accept hexadecimal notation in the
input file (x'2000' would be loaded literally, as hex 58273230303027).

What is working is this:

D:\Working>db2 -x -o- -z junk.txt "select 'INSERT INTO JUNK VALUES X''' concat hex(c1) concat ''' ;' from junk"

Pay attention to the placement and number of single quotes.

This creates a file junk.txt like this:

D:\Working>type junk.txt
INSERT INTO JUNK VALUES X'4672616E6B' ;
INSERT INTO JUNK VALUES X'1248163264' ;

You can then use junk.txt as input batch file to db2:

db2 -tvf junk.txt

to insert those values back.

Seems almost straightforward ... :-)

Jan M. Nelken
Oct 24 '06 #6
Thanks, Jan. Your approach would definitely solve the immediate issue,
but at the expense of incurring logging and being much slower than the
high speed LOAD command which we were trying to take advantage of.
We've got millions of rows to update and a short window to do it in.
The hope was to dump the data while still online, run the substitution
scripts, then reload the tables quickly.

Our research up to now indicates that there is no translation function
available to the LOAD command to aid in our plight. Your solution is
elegant, but not viable for our constraints. If there are other
"outside-the-box" ideas, we're open to them as well.

Thanks again,
Mike
Jan M. Nelken wrote:
[...] Seems almost straightforward ... :-)
Oct 24 '06 #7
mikelbell2000 wrote:
Thanks, Jan. Your approach would definitely solve the immediate issue,
but at the expense of incurring logging and being much slower than the
high speed LOAD command which we were trying to take advantage of.
We've got millions of rows to update and a short window to do it in.
The hopes were to dump the data while still online, run the
substitution scripts then reload the tables quickly.

Our research up to now indicates that there is no translation function
available to the LOAD command to aid in our plight. Your solution is
elegant, but not viable for our constraints. If there are other
"outside-the-box" ideas, we're open to them as well.
You may consider the following "improvements" to address your concerns:

a) Use AUTOCOMMIT OFF and NOT LOGGED INITIALLY to address the logging issue;
b) Use blocking insert to improve speed:

instead of:

INSERT INTO JUNK VALUES X'4672616E6B' ;
....
INSERT INTO JUNK VALUES X'1248163264' ;

construct:

INSERT INTO JUNK VALUES
(X'4672616E6B'),
....
(X'1248163264')
;

thus using a single INSERT for multiple rows - up to the SQL statement
size limit.
Combining both a) and b) should yield acceptable results.
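As a sketch of b), batching the hex values into multi-row INSERT statements could look like this in Python (the table name and chunk size here are illustrative, not from the thread):

```python
# Group hex-encoded values into multi-row INSERT statements, `chunk`
# rows per statement, to stay under the SQL statement size limit.
def batch_inserts(hex_values, table="JUNK", chunk=3):
    statements = []
    for i in range(0, len(hex_values), chunk):
        rows = ",\n".join(f"(X'{h}')" for h in hex_values[i:i + chunk])
        statements.append(f"INSERT INTO {table} VALUES\n{rows};")
    return statements

stmts = batch_inserts(["4672616E6B", "1248163264", "2000", "CAFE"])
# Two statements: the first inserts three rows, the second one row.
```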

If you insist on using LOAD - consider LOAD FROM CURSOR and feed the
cursor with data processed by an application program massaging the input
data...

Jan M. Nelken
Oct 24 '06 #8
Jan M. Nelken wrote:
If you insist on using LOAD - considering LOAD FROM CURSOR and feed cursor
with data processed by an application program massaging input data...
Or you can use pipes, which is probably a more obvious way to involve an
application that modifies the data to be loaded.
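A rough Python illustration of the pipe idea (POSIX only; the plain reader below stands in for the db2 LOAD command, and the replacement value is made up):

```python
import os
import tempfile
import threading

# A filter writes the edited rows into a named pipe while the loader
# reads from it concurrently -- no intermediate file on disk.
fifo = os.path.join(tempfile.mkdtemp(), "loadpipe")
os.mkfifo(fifo)

def filter_and_feed():
    # Stand-in for the substitution script editing the exported rows.
    with open(fifo, "w") as out:
        for line in ['"2000","EWKN3TQ7M2J32Q1C"']:
            out.write(line.replace("EWKN3TQ7M2J32Q1C", "NEWVALUE") + "\n")

t = threading.Thread(target=filter_and_feed)
t.start()
# In real life db2 would be the reader: LOAD FROM loadpipe OF DEL ...
with open(fifo) as pipe:
    rows = pipe.read().splitlines()
t.join()
# rows is now the transformed data the loader would have consumed.
```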

--
Knut Stolze
DB2 Information Integration Development
IBM Germany
Oct 25 '06 #9
How about writing a simple C or Java program to re-translate the
human-readable form back to the binary format on a byte-by-byte basis
for the specific field in the loadable file? It would read two characters
at a time, then write the single byte to the output file. This would have
the effect of shrinking the field to half its size and representing it in
the same binary format DB2 would expect on input. It would have to pass
the rest of the fields through as-is, without translation. It seems like
it should be about a page of code. You could make it go the proverbial
extra mile by making it a filter, like the aforementioned grep, sed, and
awk utilities, so everything can be processed in one shot.

Although not a solution using the built-in functionality of the DB2
arsenal, it would keep to the spirit of the original scenario you
suggested.
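A minimal Python take on such a filter (Evan suggested C or Java, but the logic is identical; the field position and delimiter here are assumptions for illustration):

```python
import sys

# Un-hex the first field of each comma-delimited line and pass the rest
# through untouched; output is bytes since the field may hold any value.
def unhex_first_field(line: str) -> bytes:
    first, _, rest = line.rstrip("\n").partition(",")
    raw = bytes.fromhex(first.strip('"'))   # "2000" -> b'\x20\x00'
    return raw + b"," + rest.encode() + b"\n"

# Used as a stream filter, grep/sed style:
def run_filter(inp, out):
    for line in inp:
        out.write(unhex_first_field(line))

run_filter(['"2000","EWKN3TQ7M2J32Q1C"\n'], sys.stdout.buffer)
```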

Evan

Jan M. Nelken wrote:
[...] If you insist on using LOAD - considering LOAD FROM CURSOR and
feed cursor with data processed by an application program massaging
input data...
Oct 25 '06 #10

This discussion thread is closed

Replies have been disabled for this discussion.
