
DB2 V7.2 fixpak 12 / UTF-8 db doing extra Unicode -> UTF-8 conversion on client?

Help!

We have DB2 V7.2 (fixpak 12) installed on Windows Server 2003, and the latest
V7.2 client installed on another system. DB2CODEPAGE is set to 1208 on all
systems, and the database was created with code set UTF-8 / codepage 1208.
(Note: running the test application described below on the database host
instead of a separate client system produced the same results.)

When we perform an INSERT statement passing UTF-8 encoded data to a VARCHAR
column, it appears the client ODBC driver is encoding this again, corrupting
the data. Since the codepages of the client and database are the same, it
shouldn't do anything to the passed data, should it?

We are using VB6 (SP5) / ADO (latest 2.7 release) / ODBC on the client
system. We have written a test application that simply performs an INSERT of
a UTF-8 encoded string. A trace enabled on the client system shows the
following (I know the UTF-8 characters look like junk, but you can still see
that something changed the string for the 3rd column). Note that the initial
connection reports a codepage of 1208 on both client and server. A check of
the database using HEX( ) confirms that the retranslated (2nd) form is what
gets saved in the database.

--------------------------------------------------------------
[ Process: 5232, Thread: 3992 ]
[ Date & Time: 08-08-2004 13:59:49.000004 ]
[ Product: QDB2/NT 7.1.0.98 ]
[ Level Identifier: 030A0105 ]
[ CLI Driver Version: 07.02.0001 ]
[ Informational Tokens: "DB2 v7.1.0.98","n040510","WR21337" ]

DBMS NAME="DB2/NT", Version="07.02.0009", Fixpack="0x230a0105"
Application Codepage=1208, Database Codepage=1208, Char Send/Recv
Codepage=1208, Graphic Send/Recv Codepage=1200, Application Char
Codepage=1208, Application Graphic Codepage=1200

SQLExecDirectW( hStmt=1:2, pszSqlStr="INSERT INTO IP.LOG_PROP
(LOG_ID,NAME,VALUE) VALUES
(x'20040807184740321428000000','PUB_DESC-1033','пеѬозаводск')"
-
X"49004E005300450052005400200049004E0054004F002000490050002E004C004F0047005F
00500052004F005000200028004C004F0047005F00490044002C004E0041004D0045002C0056
0041004C005500450029002000560041004C0055004500530020002800780027003200300030
0034003000380030003700310038003400370034003000330032003100340032003800300030
00300030003000300027002C0027005000550042005F0044004500530043002D003100300033
00330027002C002700D000BF00D000B500D1001A20D100AC20D000BE00D000B700D000B000D0
00B200D000BE00D000B400D1008100D000BA0027002900", cbSqlStr=125 )

StmtOut="INSERT INTO IP.LOG_PROP (LOG_ID,NAME,VALUE) VALUES
(x'20040807184740321428000000','PUB_DESC-1033','пе'?s
',озавод'Ðº')"

sqlccsend( ulBytes - 388 )
sqlccsend( Handle - 26037648 )
sqlccsend( ) - rc - 0, time elapsed - +2.350000E-004
sqlccrecv( )
sqlccrecv( ulBytes - 456 ) - rc - 0, time elapsed - +7.520000E-004

Row=1, iCol=1, fCType=SQL_C_CHAR, rgbValue="PUB_DESC-1036" -
X"5055425F444553432D31303336", pcbValue=13, piIndicatorValue=13
Row=1, iCol=2, fCType=SQL_C_CHAR, rgbValue="пе'?s
',озавод'Ðº" -
X"C390C2BFC390C2B5C391E2809AC391E282ACC390C2BEC390C2B7C390C2B0C390C2B2C390C2
BEC390C2B4C391C281C390C2BA", pcbValue=50, piIndicatorValue=50
--------------------------------------------------------------
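As a rough illustration of what the trace above seems to show, the stored bytes can be reproduced by taking the UTF-8 bytes, treating each byte as a single Windows-1252 character, and encoding the result as UTF-8 a second time. This is only a sketch: the use of Windows-1252 is an assumption inferred from the wide hex in the trace (0x82 appears as U+201A and 0x80 as U+20AC), the test word is the one decoded from the hex dumps in this post, and the helper name `widen_as_cp1252` is made up for the example. It says nothing about which layer (ADO, the ODBC driver manager, or the DB2 driver) performs the conversion.

```python
def widen_as_cp1252(raw: bytes) -> str:
    """Treat every byte as one Windows-1252 character.

    Python's cp1252 codec leaves a few bytes (such as 0x81) undefined.
    The trace shows 0x81 widened to U+0081, so undefined bytes are mapped
    to the code point with the same numeric value (an assumption).
    """
    chars = []
    for b in raw:
        try:
            chars.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            chars.append(chr(b))
    return "".join(chars)


original = "петрозаводск"          # word decoded from the hex in the traces
utf8_once = original.encode("utf-8")
utf8_twice = widen_as_cp1252(utf8_once).encode("utf-8")

print(utf8_once.hex().upper())
# D0BFD0B5D182D180D0BED0B7D0B0D0B2D0BED0B4D181D0BA
print(utf8_twice.hex().upper())
# C390C2BFC390C2B5C391E2809AC391E282ACC390C2BEC390C2B7C390C2B0C390C2B2C390C2BEC390C2B4C391C281C390C2BA
print(len(utf8_once), len(utf8_twice))
# 24 50
```

The second hex string and its length of 50 match the rgbValue and pcbValue reported for the V7.2 / FP12 client, which is consistent with a second encoding pass rather than random corruption.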

We've run this test application on another system which uses the V7.1
client, and this client operates as expected - no additional translation is
performed on the UTF-8 encoded string, and it appears in the database
exactly as it was passed. DB2CODEPAGE on this client is also set to 1208.
Here are some lines from that trace. (Note - we have been using a V7.1
database with this setup for some time successfully).

--------------------------------------------------------------
[ Process: 2772, Thread: 4040 ]
[ Date & Time: 08-08-2004 10:59:46.000005 ]
[ Product: QDB2/NT 7.1.0 ]
[ Level Identifier: 02010105 ]
[ CLI Driver Version: 07.01.0000 ]
[ Informational Tokens: "DB2 v7.1.0","n000510","" ]

SQLExecDirect( hStmt=1:2, pszSqlStr="INSERT INTO IP.LOG_PROP
(LOG_ID,NAME,VALUE) VALUES
(x'20040807184740321428000000','PUB_DESC-1013','пе,?озаводск')",
cbSqlStr=125 )

( StmtOut="INSERT INTO IP.LOG_PROP (LOG_ID,NAME,VALUE) VALUES
(x'20040807184740321428000000','PUB_DESC-1013','пе,?озаводск')"
)

sqlccsend( ulBytes - 320 )
sqlccsend( Handle - 90010096 )
sqlccsend( ) - rc - 0, time elapsed - +2.222000E-003
sqlccrecv( )
sqlccrecv( ulBytes - 163 ) - rc - 0, time elapsed - +1.011970E-001
Row=1, iCol=1, fCType=SQL_C_CHAR, rgbValue="PUB_DESC-1034", pcbValue=13,
piIndicatorValue=13
Row=1, iCol=2, fCType=SQL_C_CHAR, rgbValue="пе,?озаводск",
pcbValue=24, piIndicatorValue=24

--------------------------------------------------------------

Here, the 3rd column string remains untouched. Checking the database
confirms that, with the 7.1 client, the correct UTF-8 encoded data is stored.

Notice that the V7.2/fixpak 12 client is using SQLExecDirectW (wide
version?) while the V7.1 client uses SQLExecDirect -- perhaps this is an
indicator of something?
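One way to look at the SQLExecDirectW difference is to decode a piece of the wide hex from the first trace as UTF-16LE and list the code points. The sketch below does that for the fragment that follows the opening quote of the third value. The fragment is copied from the trace; the choice of Windows-1252 for the reverse mapping is an assumption based on the code points that appear.

```python
# Fragment of the pszSqlStr hex dump from the V7.2 / FP12 trace
# (the first eight UTF-16 code units of the third value).
wide_hex = "D000BF00D000B500D1001A20D100AC20"

wide_text = bytes.fromhex(wide_hex).decode("utf-16-le")
code_points = [f"U+{ord(c):04X}" for c in wide_text]
print(code_points)
# ['U+00D0', 'U+00BF', 'U+00D0', 'U+00B5', 'U+00D1', 'U+201A', 'U+00D1', 'U+20AC']

# Assumption: mapping the characters back through Windows-1252 recovers
# the bytes the application originally passed.
recovered = wide_text.encode("cp1252")
print(recovered.hex().upper())
# D0BFD0B5D182D180
print(recovered.decode("utf-8"))
# петр
```

Each UTF-8 byte shows up as its own 16-bit code unit in the argument to SQLExecDirectW, which suggests the byte string was widened as if it were ANSI text before or at the wide call. The trace alone does not show which component did the widening.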

Further experimentation showed that if, instead of passing a UTF-8 string, we
decode it into Unicode and pass that in the INSERT statement with the 7.2 /
FP12 client, we can see it get translated into UTF-8 in the trace:
--------------------------------------------------------------
SQLExecDirectW( hStmt=1:2, pszSqlStr="INSERT INTO IP.LOG_PROP
(LOG_ID,NAME,VALUE) VALUES
(x'20040807184740321428000000','PUB_DESC-1035','?5B@>702>4A:')" -
X"49004E005300450052005400200049004E0054004F002000490050002E004C004F0047005F
00500052004F005000200028004C004F0047005F00490044002C004E0041004D0045002C0056
0041004C005500450029002000560041004C0055004500530020002800780027003200300030
0034003000380030003700310038003400370034003000330032003100340032003800300030
00300030003000300027002C0027005000550042005F0044004500530043002D003100300033
00350027002C0027003F043504420440043E043704300432043E04340441043A0427002900",
cbSqlStr=113

StmtOut="INSERT INTO IP.LOG_PROP (LOG_ID,NAME,VALUE) VALUES
(x'20040807184740321428000000','PUB_DESC-1035','пе,?озаводск')"

sqlccsend( ulBytes - 320 )
sqlccsend( Handle - 26037648 )
sqlccsend( ) - rc - 0, time elapsed - +2.290000E-004
sqlccrecv( )
sqlccrecv( ulBytes - 163 ) - rc - 0, time elapsed - +3.997000E-003
Row=1, iCol=1, fCType=SQL_C_CHAR, rgbValue="PUB_DESC-1035" -
X"5055425F444553432D31303335", pcbValue=13, piIndicatorValue=13
Row=1, iCol=2, fCType=SQL_C_CHAR, rgbValue="пе,?озаводск" -
X"D0BFD0B5D182D180D0BED0B7D0B0D0B2D0BED0B4D181D0BA", pcbValue=24,
piIndicatorValue=24
--------------------------------------------------------------

Again, the proper UTF-8 data gets stored in the database. However, updating
all of our code to deal with Unicode rather than UTF-8 as currently
configured could take a little time, and I don't want to go through that
exercise if what we are seeing is in fact a bug in DB2.
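For reference, the decode-first workaround can be checked against the last trace with a few lines of Python. This is only a sketch of the byte-level relationship, not of the VB6 code involved: the UTF-8 hex is the value reported for the stored column, and the UTF-16LE output is compared by eye with the wide hex of the statement text.

```python
# UTF-8 bytes of the value, as reported by the trace for the stored column.
utf8_bytes = bytes.fromhex("D0BFD0B5D182D180D0BED0B7D0B0D0B2D0BED0B4D181D0BA")

# Decode once to a Unicode string, which is what gets passed in the workaround.
text = utf8_bytes.decode("utf-8")
print(text)
# петрозаводск

# The wide call sees the string as UTF-16LE code units.
wide = text.encode("utf-16-le")
print(wide.hex().upper())
# 3F043504420440043E043704300432043E04340441043A04

# Converting the Unicode string to UTF-8 gives back the original bytes.
print(text.encode("utf-8") == utf8_bytes)
# True
```

The UTF-16LE hex matches the value portion of pszSqlStr in the last trace, and a single conversion to UTF-8 yields the 24 bytes that end up in the database.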

Any help at all would be appreciated!

--
Timothy G. Northrup
IP.com, Inc.
Nov 12 '05 #1