MSXML and UTF-8 chinese characters

K
I have an XML file in UTF-8.
It contains some Chinese characters (both Simplified Chinese and
Traditional Chinese).

When loading the XML file with the MSXML parser, I used the code below to
retrieve the data in a node. The CString was then displayed in a CListCtrl.
The Traditional Chinese characters were shown correctly, but for the
Simplified characters I encountered many "?", although some characters were
correct.

if (MSXML::NODE_ELEMENT == pChild->nodeType)
{
    MSXML::IXMLDOMNamedNodeMapPtr pAttrs = pChild->attributes;
    MSXML::IXMLDOMNodePtr pAttr;

    pAttr = pAttrs->getNamedItem(L"id");
    CString id = OLE2T(pAttr->text);

    MSXML::IXMLDOMNodePtr pWording = pChild->firstChild;
    CString wording = OLE2T(pWording->text);

    // add the wording to the language
    pMessageLanguage->m_wordingList.insert(MessageWordingListPair(id, wording));
}
Nov 16 '05 #1
You should compile with UNICODE and _UNICODE defined!
Otherwise you have to convert the Unicode text to MBCS...
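
To illustrate why the non-Unicode path can produce "?", here is a minimal, portable sketch of a lossy wide-to-narrow conversion. The ToyCodePage type, the toNarrow function, and the byte values are made up for illustration; real Windows code pages are much larger and multi-byte, and in an ANSI build the actual conversion goes through the system's ANSI code page, not through this code.

```cpp
#include <map>
#include <string>

// Toy "code page": the UTF-16 code units a hypothetical narrow encoding
// can represent, mapped to made-up single-byte values.
using ToyCodePage = std::map<char16_t, char>;

// Convert wide text to the toy narrow encoding. Anything the code page
// cannot represent becomes '?', mirroring the usual default replacement
// character of a lossy wide-to-narrow conversion.
std::string toNarrow(const std::u16string& wide, const ToyCodePage& page) {
    std::string narrow;
    for (char16_t unit : wide) {
        if (unit < 0x80) {
            narrow.push_back(static_cast<char>(unit));  // ASCII passes through
        } else {
            auto it = page.find(unit);
            narrow.push_back(it != page.end() ? it->second : '?');
        }
    }
    return narrow;
}
```

With a page that contains U+9AD4 (Traditional) but not U+4F53 (Simplified), toNarrow(u"\u9AD4\u4F53", page) yields one mapped byte followed by "?". That is the same pattern as the symptom: characters covered by the active code page survive, the rest turn into question marks.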

--
Greetings
Jochen

Do you need a memory-leak finder ?
http://www.codeproject.com/tools/leakfinder.asp
Nov 16 '05 #2
K,

Does your XML file begin with the following line?

<?xml version="1.0" encoding="UTF-8" ?>

If not, add this line and see what happens. If you do have this line (or
you add it) and still have problems, then you may be using characters that
Windows cannot support or your fonts cannot display (e.g. Traditional
Chinese).

Windows supports Unicode up to version 2.1 only. The XML parser converts
your XML source to UTF-16 and parses it internally. When the XML parser sees
the line above, it converts your XML file from UTF-8 with no loss of
information. However, without this line (specifically without the encoding
clue) the system default ANSI code page will be used when converting to
UTF-16.

Even with this line, you may still have characters that your fonts can't
display; however, no loss will occur in the conversion to/from UTF-8.
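
The lossless UTF-8 to UTF-16 step can be sketched in portable C++ as follows. This is only an illustration of the encoding arithmetic, not MSXML's actual implementation; the function name utf8ToUtf16 is hypothetical, and the sketch does not reject overlong forms, encoded surrogates, or values above U+10FFFF.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode UTF-8 bytes into UTF-16 code units. Minimal sketch: it rejects
// bad lead bytes, bad continuation bytes, and truncated sequences only.
std::u16string utf8ToUtf16(const std::string& bytes) {
    std::u16string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + extra >= bytes.size())
            throw std::runtime_error("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char cont = static_cast<unsigned char>(bytes[i + k]);
            if ((cont & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (cont & 0x3F);
        }
        i += extra + 1;

        if (cp >= 0x10000) {
            // Outside the BMP: split into a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}
```

For example, the bytes E7 AE 80 decode to the single code unit U+7B80, a Simplified Chinese character. Nothing is replaced or dropped at this stage, so a "?" has to come from a later conversion.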

Hope this helps (and I hope I know what I'm talking about :-)

-MerkX

Nov 16 '05 #3
K
My project is compiled as a UNICODE build, and my XML begins with the
<?xml ... ?> line, but the problem still persists.

After reading the node in MSXML, can I use the OLE2T macro and then assign
the result to a CString?

What does CString store internally? I'm using VS.NET to compile my
projects.

I can see and edit the XML file in Dreamweaver, so the fonts must be
supported by my system. However, after loading the XML file with MSXML,
getting the node, assigning its text to a CString, and displaying it, the
problem happens: some Simplified Chinese characters become "?", but some
are okay.

Nov 16 '05 #4
> After reading in the node in MSXML, can I use the macro OLE2T then
> assign it to a CString?
>
> What does CString store internally? I'm using VS.NET to compile my
> projects.

CString stores ANSI in an ANSI application and Unicode in a UNICODE app.
If your app is Unicode, there is no need to use

But question marks are usually the result of bad code page conversions.
Are you sure there are no conversions happening
(maybe in m_wordingList.insert, or in MessageWordingListPair)?
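
One way to find where a conversion sneaks in is to dump the code units of the string at each stage (right after reading the node, after the assignment, after the insert). Below is a minimal, portable sketch; dumpCodeUnits is a hypothetical helper, and std::u16string stands in for the 16-bit wide strings used on Windows.

```cpp
#include <cstdio>
#include <string>

// Render each UTF-16 code unit as "U+XXXX" so that a genuine Chinese
// character (e.g. U+7B80) can be told apart from a literal question
// mark (U+003F) left behind by a lossy conversion.
std::string dumpCodeUnits(const std::u16string& text) {
    std::string out;
    char buf[8];  // "U+XXXX" plus terminating NUL fits in 7 bytes
    for (char16_t unit : text) {
        std::snprintf(buf, sizeof buf, "U+%04X", static_cast<unsigned>(unit));
        if (!out.empty()) out.push_back(' ');
        out += buf;
    }
    return out;
}
```

If the dump shows U+003F where a Chinese character is expected, the data was already damaged before it reached the list control. If it shows the real code point, the string is intact and the display side is the place to look.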

Mihai
Nov 16 '05 #5
