I'm working on a .NET application that requests an XML document, in
string form, from a legacy COM component, then deserializes it. In
order to deserialize the document, the string needs to be placed into a
stream. AFAIK, .NET strings are UTF-16 encoded, but the COM component
returns a UTF-8 encoded document, so my first attempt at creating and
filling a stream used this code:
string response = <some UTF-8 encoded xml from a COM component>;
MemoryStream result = new MemoryStream(response.Length);
UTF8Encoding utf8Encoding = new UTF8Encoding();
result.Write(utf8Encoding.GetBytes(response), 0, response.Length);
The deserialization then worked fine until one of the XML documents
contained a UK pound character - £ - at which point an exception was
thrown indicating an invalid document.
After searching Google, I came across the following alternative code to
create and fill the stream for deserialization. After testing with
pound and a few other problem characters, this seems to work properly.
string response = <some UTF-8 encoded xml from a COM component>;
MemoryStream result = new MemoryStream(response.Length);
StreamWriter writer = new StreamWriter(result, new UTF8Encoding());
writer.Write( response );
writer.Flush();
I have a couple of questions: First, is my understanding of the
encoding issues correct? If I have a UTF-8 encoded document, is it up
to me to decode it into the stream explicitly? Secondly, my reading of
the two code snippets is that they should produce an identical result,
but in reality the first one doesn't seem to be decoding the document
correctly for all characters - can anyone explain what is causing the
different behaviour?
Thanks
Ian
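On the second question, one likely cause of the difference: response.Length counts UTF-16 code units, while GetBytes returns UTF-8 bytes, and a character such as £ (U+00A3) takes two bytes in UTF-8. Passing response.Length as the count to Write would then drop the tail of the byte array, leaving a truncated document. The sketch below illustrates this with a made-up sample string; the class and method names are hypothetical and are not from the original post.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8LengthDemo
{
    // Mirrors the first snippet: the count passed to Write is the
    // string's length in chars, not the length of the encoded byte array.
    public static byte[] WriteUsingCharCount(string response)
    {
        byte[] bytes = new UTF8Encoding().GetBytes(response);
        MemoryStream result = new MemoryStream();
        result.Write(bytes, 0, response.Length);
        return result.ToArray();
    }

    // Same approach, but the count comes from the encoded byte array.
    public static byte[] WriteUsingByteCount(string response)
    {
        byte[] bytes = new UTF8Encoding().GetBytes(response);
        MemoryStream result = new MemoryStream();
        result.Write(bytes, 0, bytes.Length);
        return result.ToArray();
    }

    public static void Main()
    {
        // Hypothetical sample document; \u00A3 is the pound sign.
        string response = "<price>\u00A35</price>";

        Console.WriteLine(response.Length);                      // 17
        Console.WriteLine(WriteUsingCharCount(response).Length); // 17, final '>' is lost
        Console.WriteLine(WriteUsingByteCount(response).Length); // 18
    }
}
```

The StreamWriter version avoids the problem because the writer does the encoding itself and writes however many bytes result. In either case the stream position is left at the end after writing, so it presumably needs rewinding (result.Position = 0) before being handed to a reader.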
Ian Harding wrote: I'm working on a .NET application that requests an XML document, in string form, from a legacy COM component, then deserializes it. In order to deserialize the document, the string needs to be placed into a stream. AFAIK, .NET strings are UTF-16 encoded, but the COM component returns a UTF-8 encoded document, so my first attempt at creating and filling a stream used this code:
string response = <some UTF-8 encoded xml from a COM component>;
I think you should do nothing here. Just parse the string. Why do you
need a stream? This should work:
XmlDocument doc = new XmlDocument();
doc.LoadXml(response);
--
Oleg Tkachenko [XML MVP, MCP] http://blog.tkachenko.com
Oleg Tkachenko [MVP] wrote: Ian Harding wrote:
I'm working on a .NET application that requests an XML document, in string form, from a legacy COM component, then deserializes it. In order to deserialize the document, the string needs to be placed into a stream. AFAIK, .NET strings are UTF-16 encoded, but the COM component returns a UTF-8 encoded document, so my first attempt at creating and filling a stream used this code:
string response = <some UTF-8 encoded xml from a COM component>;
I think you should do nothing here. Just parse the string. Why do you need a stream? This should work: XmlDocument doc = new XmlDocument(); doc.LoadXml(response);
I probably should have explained what I was doing with the data more
clearly.
We have a class library, containing serializable classes that represent
each type of document that can be returned by the COM component. For a
given request, we always know the type of the returned document, so we
just use XmlSerializer to populate a class instance from the XML. Saves
messing about with DOM and XPath on the client side. As I understand
it, it isn't possible to pass a string directly to the serializer for
de-serialization. A MemoryStream seemed like the lowest-overhead way of
getting it into a stream.
Thanks
Ian
Ian Harding wrote: We have a class library, containing serializable classes that represent each type of document that can be returned by the COM component. For a given request, we always know the type of the returned document, so we just use XmlSerializer to populate a class instance from the XML. Saves messing about with DOM and XPath on the client side. As I understand it, it isn't possible to pass a string directly to the serializer for de-serialization. A MemoryStream seemed like the lowest-overhead way of getting it into a stream.
XmlSerializer.Deserialize accepts a TextReader, which means you can pass
it new StringReader(response). Fiddling with encodings and a MemoryStream
is usually very error-prone. Basically, if your XML is in a .NET string,
it's already UTF-16 encoded, even though its XML declaration says UTF-8.
.NET handles that case just fine by switching to UTF-16.
--
Oleg Tkachenko [XML MVP, MCP] http://blog.tkachenko.com
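A minimal sketch of the StringReader approach suggested above. The Quote class, the ResponseParser helper and the sample XML are hypothetical stand-ins for one of the class library's serializable types and a real COM response; only XmlSerializer, StringReader and Deserialize(TextReader) are actual framework APIs.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical document type, standing in for a class from the class library.
public class Quote
{
    public string Currency;
    public decimal Amount;
}

public static class ResponseParser
{
    // Deserializes straight from the string; no MemoryStream and no
    // explicit encoding step are involved.
    public static T Deserialize<T>(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        using (StringReader reader = new StringReader(xml))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up response; the declaration still says utf-8 although the
        // data is already a .NET string. \u00A3 is the pound sign.
        string response =
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>" +
            "<Quote><Currency>\u00A3</Currency><Amount>5</Amount></Quote>";

        Quote quote = ResponseParser.Deserialize<Quote>(response);
        Console.WriteLine(quote.Currency + quote.Amount);
    }
}
```

Because the reader hands the parser characters rather than bytes, there is no byte count to get wrong, which is the main attraction over the MemoryStream versions.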
Oleg Tkachenko [MVP] wrote: Ian Harding wrote:
We have a class library, containing serializable classes that represent each type of document that can be returned by the COM component. For a given request, we always know the type of the returned document, so we just use XmlSerializer to populate a class instance from the XML. Saves messing about with DOM and XPath on the client side. As I understand it, it isn't possible to pass a string directly to the serializer for de-serialization. A MemoryStream seemed like the lowest-overhead way of getting it into a stream.
XmlSerializer accepts TextReader, that means you can pass it new StringReader(response). Fiddling with encoding with MemoryStream is usually very error-prone. Basically if your XML is in .NET string that means it's already UTF-16 encoded, but its XML declaration says UTF-8. .NET supports such case just fine by switching to UTF-16.
Thank you Oleg. I hoped there was an easier way than getting involved
in encoding issues but just couldn't see it.