XmlTextWriter and Text.Encoding.UTF8 issues

Hi,

I'm an XML newbie and not sure whether this is the appropriate newsgroup for
my question, so feel free to suggest other newsgroups where I should post
this message instead.

I'm having issues using XmlTextWriter, saving it out to a file with UTF8
encoding, and seeing "dirty", or "human unreadable" characters show up
*right before* the XML declaration.
I need to have the XML declaration state "encoding = utf-8", but also get
rid of the dirty characters. I was hoping that using
System.Text.Encoding.UTF8 in the constructor of the XmlTextWriter class
would prove nice and simple, but I'm finding out I'm having some issues with
it.

Ultimately, I want to stream the XmlDocument over to the vendor via HTTP
POST, but right now, the vendor's saying my transmissions have been invalid,
perhaps due to the dirty chars prior to the XML declaration. I included
code for both my local file dumping activities and my streaming activities.
Code snippets:
----------------------------------------------------------------------------

XmlDocument xDoc = <some XmlDocument>;
/*

For the following line of code, no explicit encoding is specified, but the
encoding is utf-16 according to the XML declaration. The vendor wants it to
say "utf-8". The resulting file doesn't have dirty chars prior to the XML
declaration, as follows:

"<?xml version="1.0" encoding="utf-16"?>
<wPOREQ payloadID=""............."

*/

xDoc.Save("c:\\temp\\tpg\\" + dtNow + "_XMLPOREQ_Raw.xml");

----------------------------------------------------------------------------

/* For the following line of code, I tried dumping the XmlDocument to a file
as UTF-8, but notice the dirty chars right before the XML declaration:

<?xml version="1.0" encoding="utf-8"?><wPOREQ payloadID="" ..............

*/

xDoc.Save(new XmlTextWriter(@"c:\temp\tpg\" + dtNow +
"_POREQ_UTF8_ENCODED.xml", System.Text.Encoding.UTF8));

----------------------------------------------------------------------------

/* Ultimately, I want to send the XmlDocument over to vendor, using the
following code:

*/

string url = PORequestURL;

System.Net.WebRequest w = System.Net.WebRequest.Create(url);

w.Method = "POST";

w.Timeout = Convert.ToInt32(PORequestTimeout);

w.ContentType = PORequestContentType;

System.IO.Stream s = w.GetRequestStream();

XmlTextWriter writer = new XmlTextWriter(s, System.Text.Encoding.UTF8);

xDoc.Save(writer);

writer.Close();

----------------------------------------------------------------------------

When I attempted to capture the Response, after sending a Request using the
above code, and dump the Response to a local file using the following code,
I see the same dirty chars show up in the local file:

----------------------------------------------------------------------------

/*

Dumped file begins with the following:

<?xml version="1.0" encoding="utf-8"?><!DOCTYPE cXML.............

*/

System.Net.WebResponse response = w.GetResponse();

XmlDocument xDoc1 = new XmlDocument();

XmlTextReader reader = new XmlTextReader(response.GetResponseStream());

xDoc1.Load(reader);

reader.Close();

xDoc1.Save(new XmlTextWriter(@"c:\temp\tpg\" + dtNow +
"_RESPONSE.xml",System.Text.Encoding.UTF8));

----------------------------------------------------------------------------

Basically, I need to encode my XML request in UTF-8, without the junk
characters before the XML declaration. Any help or pointers would be
greatly appreciated.

Thanks in advance,

- Howard
Nov 12 '05 #1
4 Replies


The standard UTF-8 encoding that you get from Encoding.UTF8 has the UTF-8
identifier (the byte-order mark, bytes EF BB BF) turned on by default. Those
are the "invalid" characters you see before the XML declaration.

For clients that don't understand this identifier, you have to use the
constructor on the UTF8Encoding class where you turn off the
identifier.

Encoding enc = new UTF8Encoding(false);
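
As a rough illustration of the suggestion above, here is a minimal sketch that serializes the same document twice, once with Encoding.UTF8 and once with new UTF8Encoding(false), and prints the leading bytes. The class name Utf8XmlDemo, the Serialize helper, and the sample wPOREQ element are made up for the example. It writes to a MemoryStream, but the same encoding object could presumably be passed to an XmlTextWriter over a file or a request stream.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical demo class; not part of the original poster's code.
public static class Utf8XmlDemo
{
    // Serializes the document into a byte array using the given encoding.
    public static byte[] Serialize(XmlDocument doc, Encoding encoding)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            XmlTextWriter writer = new XmlTextWriter(stream, encoding);
            doc.Save(writer);   // the XML declaration picks up the writer's encoding
            writer.Flush();     // make sure everything reaches the stream
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<wPOREQ payloadID=\"\" />");   // sample document (assumption)

        byte[] withBom = Serialize(doc, Encoding.UTF8);
        byte[] withoutBom = Serialize(doc, new UTF8Encoding(false));

        // First three bytes of the Encoding.UTF8 output are the identifier.
        Console.WriteLine(BitConverter.ToString(withBom, 0, 3));   // EF-BB-BF
        // With the identifier turned off, the output starts directly with '<'.
        Console.WriteLine((char)withoutBom[0]);                    // <
    }
}
```

Both outputs should still declare encoding="utf-8"; only the three leading bytes differ.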

Nov 12 '05 #2

That did the trick. What a relief!

Any pros and cons with having a client understand the identifier?

Thank you so much for the help!

- Howard

<cb*****@hotmail.com> wrote in message
news:11**********************@c13g2000cwb.googlegroups.com...
> The standard UTF8 encoding that you get from Encoding.UTF8 has the utf8
> identifier on by default. That's the "invalid" characters you see
> before the xml declaration.
>
> For clients that don't understand this identifier, you have to use the
> constructor on the UTF8Encoding class where you turn off the
> identifier.
>
> Encoding enc = new UTF8Encoding(false);

Nov 12 '05 #4

H Lee wrote:
> That did the trick. What a relief!
>
> Any pro's and con's with having a client understand the identifier?


In fact, the byte-order mark (BOM) is optional in UTF-8, so don't worry about it.
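
To make "optional" a bit more concrete, here is a small sketch, assuming a receiver that parses with XmlDocument.Load (which may not be what the vendor uses). It shows that the two encodings differ only in their preamble and that the .NET parser accepts the document either way. BomOptionalDemo, Encode and RootName are hypothetical names used only for this example.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical demo class; names are for illustration only.
public static class BomOptionalDemo
{
    // Encodes the text, prefixed with the encoding's preamble (BOM) if it has one.
    public static byte[] Encode(string xml, Encoding encoding)
    {
        byte[] preamble = encoding.GetPreamble();
        byte[] body = encoding.GetBytes(xml);
        byte[] result = new byte[preamble.Length + body.Length];
        Buffer.BlockCopy(preamble, 0, result, 0, preamble.Length);
        Buffer.BlockCopy(body, 0, result, preamble.Length, body.Length);
        return result;
    }

    // Parses the bytes as XML and returns the root element's name.
    public static string RootName(byte[] bytes)
    {
        XmlDocument doc = new XmlDocument();
        using (MemoryStream stream = new MemoryStream(bytes))
        {
            doc.Load(stream);
        }
        return doc.DocumentElement.Name;
    }

    public static void Main()
    {
        string xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><wPOREQ payloadID=\"\" />";

        // The identifier is 3 bytes when enabled and absent when disabled.
        Console.WriteLine(new UTF8Encoding(true).GetPreamble().Length);   // 3
        Console.WriteLine(new UTF8Encoding(false).GetPreamble().Length);  // 0

        // The parser reads the document with or without the BOM.
        Console.WriteLine(RootName(Encode(xml, new UTF8Encoding(true))));
        Console.WriteLine(RootName(Encode(xml, new UTF8Encoding(false))));
    }
}
```

A receiver that does not skip the BOM would see three unexpected bytes before the XML declaration, which is consistent with the vendor rejecting the original transmissions.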

--
Oleg Tkachenko [XML MVP]
http://blog.tkachenko.com
Nov 12 '05 #5
