incorrect encoding after serialisation to XML

Using the code below I am trying, in VB .Net 2003, to serialise classes
defined in a couple of XSD documents. The encoding for both is
Unicode(UTF-8). However the resulting XML is encoded as UTF-16. This is
causing me problems when I try to load it into an XPath document. I would
imagine I should be able to use System.Text.Encoding to define the encoding
as UTF-8 but I haven't been able to figure out how so far.

Dim ser As New XmlSerializer(GetType(GTPAPP))
Dim sw As New StringWriter
ser.Serialize(sw, domainObj.getGTPApp)
Return sw.ToString()

any help would be appreciated.
Nov 12 '05 #1
one additional point on this is that when I try to load the XML when encoded
as UTF-16 I get the error
"There is no Unicode byte order mark. Cannot switch to Unicode."

Nov 12 '05 #2
"Stephen" <St*****@discussions.microsoft.com> wrote in message news:8C**********************************@microsof t.com...
"Stephen" wrote:
> Using the code below I am trying, in VB .Net 2003, to serialise classes
> defined in a couple of XSD documents. The encoding for both is
> Unicode(UTF-8).

The encoding of the schema documents is irrelevant, and the
encoding of the classes -- huh?

Once the Framework loads a piece of XML, that XML becomes
a node set. Metaphysically, think of it as astral projection (an
out-of-body experience) for the XML ... it goes to a higher plane
of existence where encodings no longer matter (whether attributes
are delimited by single- or double- quotes no longer matters,
whether content came from a CDATA section no longer matters,
etc.)

The challenges you're facing seem to center on entering and
leaving 'the Body' (think of the XML as being corporeal
whenever you see angle brackets).

> However the resulting XML is encoded as UTF-16.
: :
> Dim sw As New StringWriter
> ser.Serialize(sw, domainObj.getGTPApp)
> Return sw.ToString()

A String is always UTF-16 encoded. There is no such thing as a UTF-8
string. It's a myth. It's fiction. UTF-8 strings went out with the dragon,
leprechauns, three-headed dogs guarding the Underworld, and Java.
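
To make that concrete, here is a minimal sketch, not code from the original poster. Sample and StringXmlDemo are hypothetical stand-ins for GTPAPP and its surrounding code. It shows that a StringWriter reports UTF-16 as its encoding, which is why the serializer writes encoding="utf-16" into the XML declaration. It also shows that the String can be loaded for XPath through a StringReader without ever being converted to bytes. The "no Unicode byte order mark" error typically appears only once such a String has been turned into bytes in some other encoding.

```vb
Imports System
Imports System.IO
Imports System.Xml.Serialization
Imports System.Xml.XPath

' Hypothetical stand-in for the GTPAPP class in the thread.
Public Class Sample
    Public Name As String
End Class

Public Module StringXmlDemo
    ' Serializes to a String. The XML declaration reports the
    ' encoding of the StringWriter, which is UTF-16.
    Public Function SerializeToString(ByVal value As Sample) As String
        Dim ser As New XmlSerializer(GetType(Sample))
        Dim sw As New StringWriter()
        ser.Serialize(sw, value)
        Return sw.ToString()
    End Function

    ' Loads the String through a StringReader, so the characters are
    ' read directly and no byte decoding takes place.
    Public Function ReadName(ByVal xml As String) As String
        Dim doc As New XPathDocument(New StringReader(xml))
        Dim nav As XPathNavigator = doc.CreateNavigator()
        Return CStr(nav.Evaluate("string(/Sample/Name)"))
    End Function

    Public Sub Main()
        Dim s As New Sample()
        s.Name = "example"
        Dim xml As String = SerializeToString(s)
        ' Name of the encoding a StringWriter reports.
        Console.WriteLine(New StringWriter().Encoding.WebName)
        ' The serialized declaration and the value read back via XPath.
        Console.WriteLine(xml)
        Console.WriteLine(ReadName(xml))
    End Sub
End Module
```

If the XML has to leave the process as bytes, the Stream-based approach described further down is the safer route.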

> one additional point on this is that when I try to load the XML when encoded
> as UTF-16 I get the error
> "There is no Unicode byte order mark. Cannot switch to Unicode."


This depends on how the XML was saved. Was it saved with a Stream
that used System.Text.UnicodeEncoding?

You'll get this XmlException when the XML isn't encoded in the encoding
it says it is.

My advice is not to write the XML to a String. If you want to put it into
a file (and have control over its encoding), use an XmlTextWriter and
wrap it around a Stream. Then read it back in with a StreamReader that
is wrapped around a Stream, with a matching encoding (and any
encoding declaration that appears in the XML declaration should
match both the encoding you used to serialize out and the one you
use to deserialize in.)

- - - utf8xml.vb (excerpt)
' . . .
Dim tw As New System.Xml.XmlTextWriter( _
New System.IO.FileStream( "file.xml", System.IO.FileMode.Create), _
New System.Text.UTF8Encoding( True) )

' XML will be serialized to file.xml, in UTF-8, with BOM.
ser.Serialize( tw, domainObj.getGTPApp)

' Finish writing the file and close it.
tw.Flush( )
tw.Close( )

' TextReader is abstract (MustInherit), so use the concrete StreamReader.
Dim tr As New System.IO.StreamReader( _
New System.IO.FileStream( "file.xml", System.IO.FileMode.Open), _
New System.Text.UTF8Encoding( True) )

' XML will be deserialized from file.xml, in UTF-8, with BOM.
domainObj.setGTPApp( CType( _
ser.Deserialize( tr), GTPApp) )

tr.Close( )
' . . .
- - -
Derek Harmon
Nov 12 '05 #3
Derek
I have a further question on this query. What I am actually trying to do is
generate XML that can be used as the input XML for a call to a second
application. So rather than generating a file I need to generate the XML to
be used in code. Would you know how to do this?

Nov 12 '05 #4
Stephen,

Certainly, no problem. In the code below, simply replace references to
FileStream with references to MemoryStream. This will yield a Byte( )
array that you can pass along in binary form to this other application.
They're Bytes and not Chars, so the Byte( ) can be UTF-8 encoded.

The XmlTextWriter declaration below,

> Dim tw As New System.Xml.XmlTextWriter( _
> New System.IO.FileStream( "file.xml", System.IO.FileMode.Create), _
> New System.Text.UTF8Encoding( True) )

becomes,

Dim stream As New System.IO.MemoryStream( )
Dim tw As New System.Xml.XmlTextWriter( _
stream, New System.Text.UTF8Encoding( True) )

Again, this produces the Byte Order Mark because I'm creating the
UTF8Encoding with True. When you're sending a binary stream, you
might not want a BOM (or at least the consumer might not want the
BOM). You may want to try it both ways, with- and w/o BOM.
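
A quick way to try it both ways is to look at the first three bytes of the output. The following is only a sketch: BomDemo, WriteXml and HasUtf8Bom are made-up names, and the tiny root element stands in for the real serialized object. It writes the same document with UTF8Encoding(True) and UTF8Encoding(False) and checks for the UTF-8 BOM (EF BB BF).

```vb
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Public Module BomDemo
    ' Writes a tiny XML document to memory and returns the raw bytes.
    ' emitBom is passed straight to the UTF8Encoding constructor.
    Public Function WriteXml(ByVal emitBom As Boolean) As Byte()
        Dim stream As New MemoryStream()
        Dim tw As New XmlTextWriter(stream, New UTF8Encoding(emitBom))
        tw.WriteStartDocument()
        tw.WriteElementString("root", "x")
        tw.WriteEndDocument()
        tw.Flush()
        Return stream.ToArray()
    End Function

    ' True when the bytes start with the UTF-8 byte order mark.
    Public Function HasUtf8Bom(ByVal data As Byte()) As Boolean
        Return data.Length >= 3 _
            AndAlso data(0) = &HEF _
            AndAlso data(1) = &HBB _
            AndAlso data(2) = &HBF
    End Function

    Public Sub Main()
        Console.WriteLine(HasUtf8Bom(WriteXml(True)))
        Console.WriteLine(HasUtf8Bom(WriteXml(False)))
    End Sub
End Module
```

Whether the consumer accepts the BOM is something only a test against that consumer can settle.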

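After you are done Serializing to the tw and have called tw.Flush( ),
just like in the code example below, you would extract the Byte( ) from
the MemoryStream. Use ToArray( ) rather than GetBuffer( ): GetBuffer( )
returns the whole internal buffer, including any unused bytes past the
end of the XML.

Dim buffer As Byte( ) = stream.ToArray( )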

and you can pass this forward to another application (i.e., through
a Socket) or call a method,

Me.OtherMethod( buffer)

' . . .

Public Sub OtherMethod( ByVal incomingXml As Byte( ) )
' OtherMethod proceeds to wrap incomingXml in a
' MemoryStream and deserialize it using an XmlReader
' . . .
End Sub

The deserialization works much the same as the translation of the serialization
I've shown above. Polymorphism is what makes this possible. To the XmlReader
and XmlWriter classes it doesn't matter what variety of System.IO.Stream
you use (MemoryStream, FileStream, NetworkStream), only that it is a Stream.
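
As a rough sketch of both halves together, assuming a hypothetical Payload class in place of GTPApp and made-up helper names (RoundTripDemo, ToUtf8Bytes, FromUtf8Bytes), the round trip through a Byte( ) could look like this. It writes without a BOM; pass True to UTF8Encoding if the consumer expects one.

```vb
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml
Imports System.Xml.Serialization

' Hypothetical stand-in for the GTPApp class in the thread.
Public Class Payload
    Public Name As String
End Class

Public Module RoundTripDemo
    ' Serializes to UTF-8 bytes in memory (no BOM in this sketch).
    Public Function ToUtf8Bytes(ByVal value As Payload) As Byte()
        Dim ser As New XmlSerializer(GetType(Payload))
        Dim stream As New MemoryStream()
        Dim tw As New XmlTextWriter(stream, New UTF8Encoding(False))
        ser.Serialize(tw, value)
        tw.Flush()
        ' ToArray copies only the bytes actually written.
        Return stream.ToArray()
    End Function

    ' Wraps the incoming bytes in a MemoryStream and deserializes them.
    ' The reader picks up the encoding from the XML declaration.
    Public Function FromUtf8Bytes(ByVal incomingXml As Byte()) As Payload
        Dim ser As New XmlSerializer(GetType(Payload))
        Dim reader As New XmlTextReader(New MemoryStream(incomingXml))
        Try
            Return CType(ser.Deserialize(reader), Payload)
        Finally
            reader.Close()
        End Try
    End Function

    Public Sub Main()
        Dim p As New Payload()
        p.Name = "example"
        Dim bytes As Byte() = ToUtf8Bytes(p)
        Console.WriteLine(FromUtf8Bytes(bytes).Name)
    End Sub
End Module
```

Swapping the MemoryStream for a FileStream or NetworkStream leaves the rest of the code unchanged.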
Derek Harmon
"Stephen" <St*****@discussions.microsoft.com> wrote in message news:0A**********************************@microsof t.com...
> Derek
> I have a further question on this query. What I am actually trying to do is
> generate XML that can be used as the input XML for a call to a second
> application. So rather than generating a file I need to generate the XML to
> be used in code. Would you know how to do this?
>
> "Derek Harmon" wrote:
: :
>> - - - utf8xml.vb (excerpt)
>> ' . . .
>> Dim tw As New System.Xml.XmlTextWriter( _
>> New System.IO.FileStream( "file.xml", System.IO.FileMode.Create), _
>> New System.Text.UTF8Encoding( True) )
>>
>> ' XML will be serialized to file.xml, in UTF-8, with BOM.
>> ser.Serialize( tw, domainObj.getGTPApp)
>>
>> ' Finish writing the file and close it.
>> tw.Flush( )
>> tw.Close( )
>>
>> ' TextReader is abstract (MustInherit), so use the concrete StreamReader.
>> Dim tr As New System.IO.StreamReader( _
>> New System.IO.FileStream( "file.xml", System.IO.FileMode.Open), _
>> New System.Text.UTF8Encoding( True) )
>>
>> ' XML will be deserialized from file.xml, in UTF-8, with BOM.
>> domainObj.setGTPApp( CType( _
>> ser.Deserialize( tr), GTPApp) )
>>
>> tr.Close( )
>> ' . . .
>> - - -

Nov 12 '05 #5
