On 2004-04-12, Kanv scribbled:
Hi Dave,
There are only a couple of fields and they are all VARCHAR. My DB is
on NT, and I also tried installing the Thai language set on the
OS...just to be sure. But still.... no can do Thai.
:(
KP
Hi Kanv,
Okay. Try changing the definition of the fields from VARCHAR to
VARGRAPHIC instead (GRAPHIC is DB2's slightly bizarre name for 2-byte
characters). As I mentioned in the previous post, VARGRAPHIC fields
undergo no translation so, assuming it's the UTF-8 conversion that is
trashing the data, this might make a difference.
One thing to note though: the length of a VARGRAPHIC field is counted
in 2-byte characters, whereas the length of a VARCHAR field is counted
in bytes (and the UTF-8 encoding scheme needs 3 bytes for each Thai
character), so the maximum length of a VARGRAPHIC field is half that
of an equivalent VARCHAR field. Hence, while the maximum size for a
VARCHAR field definition is 32672, the maximum size for a VARGRAPHIC
field is 16336.
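
To make the sizing concrete, here is a rough Python sketch (nothing
DB2-specific, and the sample word is just my own example) that counts
how much room a Thai string would need in each kind of field. It
assumes the VARCHAR holds UTF-8 bytes and the VARGRAPHIC holds 2-byte
units:

```python
# Sample text is an assumption for illustration: the Thai word for
# "Thai language", written with escapes so the file encoding is irrelevant.
text = "\u0e20\u0e32\u0e29\u0e32\u0e44\u0e17\u0e22"

# VARCHAR(n): n is a byte count, so measure the UTF-8 encoding.
utf8_bytes = len(text.encode("utf-8"))

# VARGRAPHIC(n): n is a count of 2-byte characters, so measure the
# big-endian UTF-16 encoding and divide by two.
double_byte_chars = len(text.encode("utf-16-be")) // 2

print("code points:          ", len(text))
print("VARCHAR bytes needed: ", utf8_bytes)
print("VARGRAPHIC chars used:", double_byte_chars)
```

So a Thai string needs three times its character count in a VARCHAR
definition, but only its character count in a VARGRAPHIC one.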
I must admit I'm skeptical that the UTF-8 conversion could be trashing
the data (it's not exactly a difficult piece of code - and there are
numerous reference implementations). I think it's more likely that
something "either side" of the database (the application(s) being used
to insert and/or retrieve the data) is doing something strange. Maybe
the output isn't being translated from UTF-8 and you're seeing raw
UTF-8 encoding?
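
If you want to see what "raw UTF-8" would look like, the following
Python sketch reproduces the effect. It is only an illustration under
my own assumptions: the sample greeting is made up, and I use Latin-1
as the stand-in for whatever single-byte code page your application
might be applying by mistake.

```python
# Sample text is an assumption for illustration: a Thai greeting,
# written with escapes so the file encoding is irrelevant.
original = "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35"

# What the database would hold in a UTF-8 VARCHAR field.
stored = original.encode("utf-8")

# An application that treats those bytes as a single-byte code page
# shows one junk character per byte instead of one Thai character
# per three bytes.
garbled = stored.decode("latin-1")

# The data itself is intact: reversing the wrong step recovers it.
recovered = garbled.encode("latin-1").decode("utf-8")

print(len(original), "characters became", len(garbled))
print("recovered intact:", recovered == original)
```

If your trashed output is roughly three times as long as the text you
inserted, that would point at the display side rather than at the
database.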
If the above suggestion doesn't work, could you send a sample of the
trashed data to my e-mail? If you can manage it, a screen shot would be
good too (to avoid any mail gateways translating the data any further).
I'm on DSL so it doesn't matter if it's large (obviously, don't post it
here! :-)
HTH, Dave.
--
Dave
Remove "_nospam" for valid e-mail address
"Never underestimate the bandwidth of a station wagon full of CDs doing
a ton down the highway" -- Anon.