MSXML and UTF-8 chinese characters

K
I have an XML file in UTF-8.
It contains some Chinese characters (both Simplified Chinese and
Traditional Chinese).

When loading the XML file with the MSXML parser, I used the code below to
retrieve the data in a node. The CString was then displayed in a CListCtrl.
The Traditional Chinese characters were shown correctly, but for the
Simplified characters I encountered many "?", although some characters were
correct.

if (MSXML::NODE_ELEMENT == pChild->nodeType)
{
    MSXML::IXMLDOMNamedNodeMapPtr pAttrs = pChild->attributes;
    MSXML::IXMLDOMNodePtr pAttr;

    pAttr = pAttrs->getNamedItem(L"id");
    CString id = OLE2T(pAttr->text);

    MSXML::IXMLDOMNodePtr pWording = pChild->firstChild;
    CString wording = OLE2T(pWording->text);

    // add the wording to language
    pMessageLanguage->m_wordingList.insert(MessageWordingListPair(id, wording));
}
Nov 16 '05 #1
You should compile with UNICODE and _UNICODE defined!
Otherwise you have to convert the Unicode text to MBCS...
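
As a rough illustration of what those defines change: a TCHAR-style character type follows a compile-time switch. This is a portable sketch, not the Windows headers. DEMO_UNICODE, DemoTChar, DEMO_T, DemoTString and from_wide are made-up names that only stand in for UNICODE/_UNICODE, TCHAR and _T, and the narrow branch is a toy, not a real MBCS conversion.

```cpp
#include <string>

// DEMO_UNICODE is a made-up stand-in for the UNICODE/_UNICODE defines.
#define DEMO_UNICODE 1

#if DEMO_UNICODE
using DemoTChar = wchar_t;        // wide build: text stays wide
#define DEMO_T(s) L##s
#else
using DemoTChar = char;           // narrow build: text has to be converted
#define DEMO_T(s) s
#endif

using DemoTString = std::basic_string<DemoTChar>;

// Hand wide text (such as what a parser returns) to the build's string type.
DemoTString from_wide(const std::wstring& wide) {
#if DEMO_UNICODE
    return wide;                  // same type, nothing is converted or lost
#else
    DemoTString out;
    for (wchar_t c : wide) {
        // Toy narrowing: anything outside ASCII cannot be kept.
        out.push_back(static_cast<unsigned long>(c) < 0x80
                          ? static_cast<char>(c)
                          : '?');
    }
    return out;
#endif
}
```

With DEMO_UNICODE set to 1, from_wide(L"\u7B80") hands the character back unchanged. With it set to 0, the same call returns "?", which is the kind of loss a real narrow build has to avoid by converting to a code page that actually contains the character.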

--
Greetings
Jochen

Do you need a memory-leak finder ?
http://www.codeproject.com/tools/leakfinder.asp
Nov 16 '05 #2
K,

Does your XML file begin with the following line?

<?xml version="1.0" encoding="UTF-8" ?>

If not, add this line and see what happens. If you do have this line (or
you add it) and still have problems, then you may be using characters that
Windows cannot support or that your fonts cannot display (e.g. Traditional
Chinese).

As far as I know, Windows supports Unicode up to version 2.1 only. The XML
parser converts your XML source to UTF-16 and parses it internally. When the
XML parser sees the line above, it converts your XML file from UTF-8 with no
loss of information. Without this line (specifically, without the encoding
clue) the parser has to infer the encoding; the XML specification tells it to
assume UTF-8 unless a byte order mark says otherwise, so an explicit
declaration mainly removes any doubt.

Even with this line, you may still have characters that your fonts can't
display; however, no loss will occur in the conversion to/from UTF-8.
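
To show why the UTF-8 to UTF-16 step itself loses nothing, here is a minimal, portable sketch of that decoding. It is not MSXML's code, and utf8_to_utf16 is a name made up for this example. It rejects obviously malformed input but, as an assumption to keep it short, does not check for overlong encodings.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Decode a UTF-8 byte string into UTF-16 code units.
// Sketch only: no check for overlong forms or UTF-8-encoded surrogates.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;      // code point being assembled
        std::size_t extra = 0;     // number of continuation bytes expected
        if (b0 < 0x80)                { cp = b0;        extra = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (in.size() - i <= extra)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }
        i += extra + 1;

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else if (cp <= 0x10FFFF) {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            throw std::invalid_argument("code point out of range");
        }
    }
    return out;
}
```

For example, the bytes E7 AE 80 E4 B8 AD decode to the two code units U+7B80 and U+4E2D. Every valid UTF-8 sequence maps to exactly one code point, so any "?" has to come from a later step.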

Hope this helps (and I hope I know what I'm talking about :-)

-MerkX

Nov 16 '05 #3
K
My project is compiled as a UNICODE build, and my XML begins with the
<?xml ... ?> line, but the problem still persists.

After reading the node with MSXML, can I use the OLE2T macro and then assign
the result to a CString?

What does CString store internally? I'm using VS.NET to compile my
projects.

I can see and edit the XML file in Dreamweaver, so the fonts must be
supported by my system. However, after loading the XML file with MSXML,
getting the node, assigning it to a CString, and displaying it, the problem
happens: some Simplified Chinese characters become "?", but some are okay.

Nov 16 '05 #4
> After reading in the node in MSXML, can I use the macro OLE2T then
> assign it to a CString?
>
> What does CString store internally? I'm using VS.NET to compile my
> projects.

CString stores ANSI in an ANSI application and Unicode in a UNICODE app.
If your app is Unicode, there is no need to use

But question marks are usually the result of bad code page conversions.
Are you sure there are no conversions happening
(maybe in m_wordingList.insert, or in MessageWordingListPair)?
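
To make the "bad code page conversion" point concrete, here is a toy sketch of narrowing UTF-16 text to a code page that lacks some of the characters. Everything in it is an assumption for illustration: ToyCodePage and narrow_lossy are made-up names, the table is not a real Windows code page, and the bracketed strings are placeholders rather than real encoded bytes.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for one ANSI code page: the UTF-16 code units it
// can represent, mapped to placeholder "bytes".
using ToyCodePage = std::unordered_map<char16_t, std::string>;

// Narrow UTF-16 text to the toy code page. Characters that the page does
// not contain are replaced by '?', and that replacement cannot be undone.
std::string narrow_lossy(const std::u16string& text, const ToyCodePage& page) {
    std::string out;
    for (char16_t unit : text) {
        if (unit < 0x80) {                  // ASCII passes through unchanged
            out.push_back(static_cast<char>(unit));
            continue;
        }
        const auto it = page.find(unit);
        if (it != page.end()) {
            out += it->second;              // representable in this page
        } else {
            out.push_back('?');             // not representable: data is lost
        }
    }
    return out;
}
```

If the toy page holds U+4E2D and U+570B but not U+56FD, then narrowing the text U+4E2D U+56FD keeps the first character and turns the second into "?". That matches the symptom of some characters surviving while others become question marks, so any narrow (char-based) string on the path from the parser to the list control is worth checking.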

Mihai
Nov 16 '05 #5
