XMLTextWriter Encoding problem

The following code sample should write a valid XML document to the
console. However, when I run it in C# (Visual Studio 2003, .NET
Framework 1.1), an extra question mark precedes the rest of the
content.

MemoryStream ms = new MemoryStream();
XmlTextWriter xtw = new XmlTextWriter(ms, Encoding.UTF8);
xtw.Namespaces = false;
xtw.Indentation = 5;
xtw.Formatting = Formatting.Indented;
xtw.WriteStartDocument();
xtw.WriteStartElement("root");
xtw.WriteStartElement("People");
xtw.WriteStartElement("Person");
xtw.WriteAttributeString("FirstName", "John");
xtw.WriteAttributeString("LastName", "Smith");
xtw.WriteEndElement();
xtw.WriteStartElement("Person");
xtw.WriteAttributeString("FirstName", "Jane");
xtw.WriteAttributeString("LastName", "Smith");
xtw.WriteEndElement();
xtw.WriteEndElement();
xtw.WriteEndElement();
xtw.WriteEndDocument();
xtw.Flush();
xtw.Close();

Console.WriteLine( Encoding.UTF8.GetString( ms.ToArray()));

Everything works as expected when I use ASCII encoding. Any encoding
works when I write to a file instead of a memory stream.

Anyone have the answer to this problem?

Thanks in advance.
Nov 12 '05 #1
Hi Adam,

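The first three bytes in the byte array are the BOM (byte order mark,
0xEF 0xBB 0xBF), which lets applications easily detect UTF-8 encoded text.
Encoding.UTF8.GetString does not strip it, so it is decoded as the character
U+FEFF, which a console that cannot display it typically shows as a question
mark. If you want to create UTF-8 encoded content without a BOM, don't use the
default instance available through Encoding.UTF8, but create a new instance: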

Encoding utf8 = new UTF8Encoding(false);

In your program, you can do the following:

XmlTextWriter xtw = new XmlTextWriter(ms, new UTF8Encoding(false));
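
To see the difference side by side, here is a minimal sketch that writes the same small document with both encodings and prints the first byte of each result. The class name BomDemo and the helper WriteXml are made up for this illustration, and the document is shortened from the one in the question.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

class BomDemo
{
    // Hypothetical helper: writes a small document with the given
    // encoding and returns the raw bytes from the memory stream.
    public static byte[] WriteXml(Encoding encoding)
    {
        MemoryStream ms = new MemoryStream();
        XmlTextWriter xtw = new XmlTextWriter(ms, encoding);
        xtw.Formatting = Formatting.Indented;
        xtw.WriteStartDocument();
        xtw.WriteStartElement("root");
        xtw.WriteStartElement("Person");
        xtw.WriteAttributeString("FirstName", "John");
        xtw.WriteEndElement();
        xtw.WriteEndElement();
        xtw.WriteEndDocument();
        xtw.Close(); // flushes the writer; ToArray() still works on a closed MemoryStream
        return ms.ToArray();
    }

    static void Main()
    {
        byte[] withBom = WriteXml(Encoding.UTF8);
        byte[] withoutBom = WriteXml(new UTF8Encoding(false));

        // With the default instance the output starts with the BOM (0xEF);
        // without it, the output starts with '<' (0x3C).
        Console.WriteLine("Encoding.UTF8 starts with 0x{0:X2}", withBom[0]);
        Console.WriteLine("UTF8Encoding(false) starts with 0x{0:X2}", withoutBom[0]);
        Console.WriteLine(Encoding.UTF8.GetString(withoutBom));
    }
}
```

The two byte arrays should differ only by the three leading BOM bytes, so the BOM-free version can be passed straight to Encoding.UTF8.GetString without the stray character.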

If anything is unclear, please feel free to reply to the post.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."

Nov 12 '05 #2
Thank you. That worked great.
Nov 12 '05 #3
Kevin,

could you explain what the problem is with using the default instance?

Thanks,
Christoph Schittko [MVP, XmlInsider]
Software Architect, .NET Mentor

Nov 12 '05 #4
Hi Christoph,

Encoding.UTF8 is a static property that returns an instance of the
UTF8Encoding class. A writer using this instance emits a UTF-8 identifier
(the preamble), which is the three bytes I mentioned in my last post. The
identifier helps programs recognize the UTF-8 encoding. If we create a new
instance with the first constructor parameter set to false, the identifier
is not emitted.
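
As a quick check of the above, the preamble of each instance can be inspected directly with Encoding.GetPreamble(). This is only a sketch; the class name PreambleCheck and the helper PreambleHex are invented for the example.

```csharp
using System;
using System.Text;

class PreambleCheck
{
    // Hypothetical helper: returns the preamble bytes of an encoding
    // as a hex string, or an empty string if there is no preamble.
    public static string PreambleHex(Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetPreamble());
    }

    static void Main()
    {
        // The default instance reports the three-byte UTF-8 identifier.
        Console.WriteLine("Encoding.UTF8:       [{0}]", PreambleHex(Encoding.UTF8));

        // An instance created with false reports an empty preamble.
        Console.WriteLine("UTF8Encoding(false): [{0}]", PreambleHex(new UTF8Encoding(false)));
    }
}
```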

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."

Nov 12 '05 #5
Thanks :)

--
HTH
Christoph Schittko [MVP]
Software Architect, .NET Mentor
Nov 12 '05 #6
