
Transcode Japanese??

I'm on a Japanese Solaris 9 machine with an Ultra 5 SPARC CPU. I'm using
the Xerces 2.6 DOM.

I've got a document in UTF-8 format:
<?xml version="1.0" encoding="UTF-8"?>
<Name>ja_alert-\343\201\250\343\201\241\343\201\244\343\201\252\343\201\256\343\201\253</Name>
(I'm not sure if the Japanese came out right here, but everything after
ja_alert- is UTF-8 for Japanese).

When I extract the text element I get an XMLCh* that claims to be 15
characters long. However, when I get a char* from it, all the Japanese is
truncated and it comes out only 9 chars long.

// transcode() allocates a new buffer on every call, so transcode once and release.
char* name = XMLString::transcode(pNode->getNodeName());
char* value = XMLString::transcode(pNode->getNodeValue());
cout << "original length is " << strlen(value) << endl;
cout << "Its a text named " << name
     << " value " << value
     << " size is " << XMLString::stringLen(pNode->getNodeValue())
     << endl;
XMLString::release(&name);
XMLString::release(&value);

I get back...

original length is 9
Its a text named #text value ja_alert- size is 15

(notice the Japanese is gone).
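
The two numbers can be checked by hand. The following is a minimal sketch using only the standard library, not Xerces; `decode_octal_escapes` and `count_code_points` are hypothetical helper names. It turns the backslash-octal text from the post back into raw bytes and counts both bytes and characters.

```cpp
#include <cstddef>
#include <string>

// True for the characters '0' through '7'.
bool is_octal_digit(char c) { return c >= '0' && c <= '7'; }

// Turn three-digit backslash-octal escapes (e.g. "\343") into raw bytes.
// Every other character is copied through unchanged.
std::string decode_octal_escapes(const std::string& text) {
    std::string bytes;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\\' && i + 3 < text.size() &&
            is_octal_digit(text[i + 1]) &&
            is_octal_digit(text[i + 2]) &&
            is_octal_digit(text[i + 3])) {
            int value = (text[i + 1] - '0') * 64 +
                        (text[i + 2] - '0') * 8 +
                        (text[i + 3] - '0');
            bytes.push_back(static_cast<char>(value & 0xFF));
            i += 3;  // skip the three digits just consumed
        } else {
            bytes.push_back(text[i]);
        }
    }
    return bytes;
}

// Count UTF-8 code points by skipping continuation bytes (10xxxxxx).
std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) ++count;
    }
    return count;
}
```

Run on the content of `<Name>` above, the decoded string is 27 bytes long and holds 15 code points: the 9 ASCII characters of `ja_alert-` plus 6 hiragana of 3 bytes each. That matches the XMLCh length of 15, and the 9 reported by strlen is exactly the ASCII prefix, so the conversion appears to stop at the first non-ASCII character.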
My locale looks like...
=> locale
LANG=ja
LC_CTYPE="ja"
LC_NUMERIC="ja"
LC_TIME="ja"
LC_COLLATE="ja"
LC_MONETARY="ja"
LC_MESSAGES="ja"
LC_ALL=
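
One thing worth checking, offered as an assumption rather than a confirmed diagnosis: the `locale` command reports the shell's environment, but a C or C++ program starts in the "C" locale and stays there until it calls setlocale. If this Xerces build does its local-code-page conversion through the C library (an assumption; a build using ICU would behave differently), the transcoder would see an ASCII-only target. A minimal sketch, with `program_locale` and `adopt_environment_locale` as hypothetical helper names:

```cpp
#include <clocale>
#include <string>

// Name of the locale the running program is actually using.
// Passing a null pointer only queries; it changes nothing.
std::string program_locale() {
    const char* name = std::setlocale(LC_ALL, nullptr);
    return name ? name : "";
}

// Adopt the locale described by LANG / LC_* in the environment.
// Returns false if the C library does not support that locale.
bool adopt_environment_locale() {
    return std::setlocale(LC_ALL, "") != nullptr;
}
```

Calling `adopt_environment_locale()` once at startup and then comparing `program_locale()` before and after is a cheap experiment. Whether it cures the truncation depends on how the library was built.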
Do I need to do something to tell the transcoder what encoding to transcode
to?
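
If the goal is simply to get UTF-8 bytes back out regardless of the machine's locale, the conversion itself is mechanical. The following is a minimal sketch, not Xerces code: `utf16_to_utf8` is a hypothetical helper, and `char16_t` stands in for `XMLCh` on the assumption that the node value holds UTF-16 code units. Xerces-C also ships its own transcoder classes for named encodings; this sketch does not use them.

```cpp
#include <cstddef>
#include <string>

// Encode UTF-16 code units as UTF-8.
// Unpaired surrogates are replaced with U+FFFD.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate with no preceding high surrogate
        }

        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

The result is a plain byte string, so strlen on it counts bytes, not characters; for the 15-character value in this post it would be 27.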

-Robert
Jul 20 '05 #1
2 Replies


Opinion 1 :

If you "overwrite" the load/save nomenclature for characters,

you will render the contentType useless.

It will not render.

======

Opinion 2 :

If this triple-digit backslash garbage
was supposed to be ANY Unicode or Japanese,

then I live on the moon.




Jul 20 '05 #2

On Mon, 18 Apr 2005, Robert M. Gary wrote:
X-Newsreader: Microsoft Outlook Express 6.00.2900.2180

<Name>ja_alert-\343\201\250\343\201\241\343\201\244\343\201\252\343\201\256\343\201\253</Name>
(I'm not sure if the Japanese came out right here, but everything after
ja_alert- is UTF-8 for Japanese).


If you continue to use this newsreader surrogate instead of an
actual newsreader, you need to select

Tools > Options > Send
Mail Sending Format > Plain Text Settings > Message format MIME
News Sending Format > Plain Text Settings > Message format MIME
Encode text using: None

in order to transmit special, non-ASCII characters.

--
Top-posting.
What's the most irritating thing on Usenet?

Jul 20 '05 #3
