MSXML and UTF-8 chinese characters

K
I have an XML file in UTF-8.
It contains some Chinese characters (both Simplified Chinese and
Traditional Chinese).

After loading the XML file with the MSXML parser, I used the code below to
retrieve the data in a node. The CString was then displayed in a CListCtrl.
The Traditional Chinese characters were shown correctly, but for the
Simplified Chinese characters I encountered many "?", although some
characters were correct.

if (MSXML::NODE_ELEMENT == pChild->nodeType)
{
    MSXML::IXMLDOMNamedNodeMapPtr pAttrs = pChild->attributes;
    MSXML::IXMLDOMNodePtr pAttr;

    pAttr = pAttrs->getNamedItem(L"id");
    CString id = OLE2T(pAttr->text);

    MSXML::IXMLDOMNodePtr pWording = pChild->firstChild;
    CString wording = OLE2T(pWording->text);

    // add the wording to language
    pMessageLanguage->m_wordingList.insert(MessageWordingListPair(id, wording));
}
Nov 16 '05 #1
You should compile with UNICODE and _UNICODE defined!
Or you have to convert the Unicode text to MBCS...
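
For what it's worth, "some characters survive, some turn into ?" is the typical pattern of a lossy Unicode-to-MBCS conversion. Below is a minimal portable sketch of that behaviour, not the Windows API itself: `narrowWithFallback` and the `repertoire` set are hypothetical names standing in for the real conversion and the real code page table, and the assumption is that the target code page holds the traditional forms but not the simplified ones.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for a Unicode -> MBCS conversion. Characters that the
// target "code page" (here just a set of UTF-16 code units) cannot represent
// are replaced by '?'; ASCII and representable characters are copied as-is.
// Returns the number of characters that were replaced.
std::size_t narrowWithFallback(const std::u16string& text,
                               const std::set<char16_t>& repertoire,
                               std::u16string& out)
{
    std::size_t lost = 0;
    out.clear();
    for (char16_t ch : text) {
        if (ch < 0x80 || repertoire.count(ch) != 0) {
            out.push_back(ch);      // representable: copied unchanged
        } else {
            out.push_back(u'?');    // not representable: default character
            ++lost;
        }
    }
    // One output unit per input unit in this simplified model.
    assert(out.size() == text.size());
    return lost;
}
```

With a repertoire that contains 中 (U+4E2D) and 國 (U+570B), the text 中国 (U+4E2D U+56FD) comes out as 中?: the character shared by both scripts survives and the simplified-only one is lost.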

--
Greetings
Jochen

Do you need a memory-leak finder ?
http://www.codeproject.com/tools/leakfinder.asp
Nov 16 '05 #2
K,

Does your XML file begin with the following line?

<?xml version="1.0" encoding="UTF-8" ?>

If not, add this line and see what happens. If you do have this line (or
you add it) and still have problems, then you may be using characters that
Windows cannot support or that your fonts cannot display (e.g. Traditional
Chinese).

Windows supports Unicode up to version 2.1 only. The XML parser converts
your XML source to UTF-16 and parses it internally. When the XML parser sees
the line above, it converts your XML file from UTF-8 with no loss of
information. However, without this line (specifically, without the encoding
declaration) the system default ANSI code page will be used when converting
to UTF-16.

Even with this line, you may still have characters that your fonts can't
display, but no loss will occur in the conversion to/from UTF-8.
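
As a rough illustration of the lossless UTF-8 to UTF-16 step described above, here is a small portable decoder. It is only a sketch: `utf8ToUtf16` is a hypothetical helper, not what MSXML uses internally, and it does not reject overlong encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Decodes UTF-8 bytes into UTF-16 code units. Throws std::runtime_error on
// bad lead bytes, bad continuation bytes, truncated input, or code points
// above U+10FFFF. Overlong forms are not detected in this sketch.
std::u16string utf8ToUtf16(const std::string& bytes)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");
        assert(extra <= 3);

        if (i + extra >= bytes.size())
            throw std::runtime_error("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF)
            throw std::runtime_error("code point out of range");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Both Simplified and Traditional Chinese characters sit in the three-byte range, so each one decodes to a single UTF-16 code unit. If "?" shows up later, it was introduced after this stage, not by the UTF-8 decoding.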

Hope this helps (and I hope I know what I'm talking about :-)

-MerkX
Nov 16 '05 #3
K
My project is compiled as a UNICODE build, and my XML begins with the
<?xml ... ?> line, but my problem still persists.

After reading the node from MSXML, can I use the OLE2T macro and then assign
the result to a CString?

What does CString store internally? I'm using VS.NET to compile my
projects.

I can see and edit the XML file in Dreamweaver, so the fonts must be
supported by my system. However, after loading the XML file with MSXML,
getting the node, assigning its text to a CString, and displaying it, the
problem happens: some Simplified Chinese characters become "?", but some
are okay.
Nov 16 '05 #4
> After reading in the node in MSXML, can I use the macro OLE2T then
> assign it to a CString ??
>
> What does CString store internally ?? I'm using VS.NET to compile my
> projects.

CString stores ANSI in an ANSI application and Unicode in a UNICODE app.
If your app is Unicode, there is no need to use

But question marks are usually the result of bad code page conversions.
Are you sure there are no conversions happening
(maybe in m_wordingList.insert, or in MessageWordingListPair)?
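
The way the character type follows the build settings can be sketched in portable C++ like this. `SKETCH_UNICODE`, `TChar`, `tstring` and `buildFlavour` are made-up names that only imitate the UNICODE/_UNICODE switch and a TCHAR-based string; the real MFC headers are not involved.

```cpp
#include <string>
#include <type_traits>

// SKETCH_UNICODE is a hypothetical stand-in for the UNICODE/_UNICODE macros.
#if defined(SKETCH_UNICODE)
using TChar = wchar_t;  // "Unicode build": wide characters
#else
using TChar = char;     // "ANSI build": bytes in the current code page
#endif

// A string whose element type follows the build setting.
using tstring = std::basic_string<TChar>;

// True only when this file is compiled with SKETCH_UNICODE defined.
constexpr bool kWideBuild = std::is_same<TChar, wchar_t>::value;

// Reports which flavour this translation unit was compiled as.
constexpr const char* buildFlavour()
{
    return kWideBuild ? "wide" : "narrow";
}
```

Anything in the chain that is declared with a plain `char` string (a container key, a pair type, a helper function) forces a narrowing conversion even in a wide build, so those declarations are worth checking.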

Mihai
Nov 16 '05 #5
