How to load an XmlTextReader from stream

I have an XML document in a file (e:\bobo.xml) saved using unicode
encoding with declaration:
<?xml version="1.0" encoding="UTF-16"?>

I can load that file into an XmlTextReader and read it just fine:
XmlTextReader reader = new XmlTextReader( @"e:\bobo.xml" );
reader.Read();

But this fails reporting, "This is an unexpected token. The expected
token is 'NAME'. Line 1, position 2.":
StreamReader sr = new StreamReader( @"e:\bobo.xml" );
reader = new XmlTextReader( sr );
reader.Read();

Can anyone tell me why?
Nov 12 '05 #1
Actually, this was simply because I should have used the StreamReader
constructor overload that takes an Encoding parameter; with that it works fine.
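
For reference, a minimal sketch of that fix, assuming the file really is little-endian UTF-16. The class and method names (XmlLoadSketch, ReadFirstElementName) are made up for illustration and are not from the original post.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class XmlLoadSketch
{
    // Hypothetical helper: opens the file with an explicit encoding and
    // returns the name of the first element the reader encounters.
    public static string ReadFirstElementName(string path)
    {
        // Encoding.Unicode is UTF-16 little-endian. Without it, StreamReader
        // assumes UTF-8 unless the file starts with a byte-order mark.
        using (StreamReader sr = new StreamReader(path, Encoding.Unicode))
        {
            XmlTextReader reader = new XmlTextReader(sr);
            reader.MoveToContent();   // skips the XML declaration and whitespace
            return reader.Name;
        }
    }
}
```

Calling XmlLoadSketch.ReadFirstElementName(@"e:\bobo.xml") should then return the root element's name instead of throwing.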

However, I don't know why this won't work:
MemoryStream ms = new MemoryStream();
BinaryWriter bw = new BinaryWriter( ms, Encoding.Unicode );
// this just reads with StreamReader and returns ReadToEnd()
string fileContents = FileUtils.fileToString(@"e:\bobo.xml");
bw.Write( fileContents );
reader = new XmlTextReader( ms );
reader.Read();

I thought that plain .NET strings are encoded in Unicode. The file is
encoded in Unicode, the XML declaration includes 'encoding="utf-16"',
and the BinaryWriter is set to write Unicode, so what is the problem?

Brad Wood wrote:
But this fails reporting, "This is an unexpected token. The expected
token is 'NAME'. Line 1, position 2.":
sr = new StreamReader( @"e:\bobo.xml" );
reader = new XmlTextReader( sr );
reader.Read();

Nov 12 '05 #2
"Brad Wood" <bradley|.wood|@ndsu|.edu> wrote in message news:uG**************@tk2msftngp13.phx.gbl...
I thought that plain .net strings are encoded in unicode. The file is encoded in unicode, the XML declaration includes
'encoding="utf-16"', and the BinaryWriter is set to write unicode, so what is the problem?


1. Make sure when you create the StreamReader in your fileToString( )
method that you use the overload that takes an encoding, and that you
are passing the same Encoding.Unicode (UTF-16, little-endian) to it.
Encoding.Unicode does define a BOM, but BinaryWriter never writes one.

2. After you call bw.Write( ) you should call bw.Flush( ). You may
get away with skipping the Flush( ) on small writes, but it can cause
many sleepless hours if you read back content that hasn't been
committed to the stream yet.

3. Normally, after you write to a MemoryStream the 'pointer' inside of
the MemoryStream will be resting at the end-of-the-Stream. This
requires you to reset the pointer back to the beginning-of-the-Stream
using either of:

ms.Seek( 0, SeekOrigin.Begin);

or alternately,

ms.Position = 0;

before you begin reading from it. The usual error for this mistake reads
to the effect, "There is no root element [at the end of your Stream!]"

4. However, # 3 has a wrinkle to it in your specific program because
you're using BinaryWriter.Write( string ), and that's a special method.
It's special because it always prefixes the string it writes out with the
string's encoded length in bytes, stored as a 7-bit-encoded integer. The
UTF-16 data therefore starts only after that prefix, and the prefix alone
is quite enough to throw off a Unicode decoder. The usual error for this
mistake reads to the effect "There is invalid data at the root level.
[First character!]"

So you want to set the MemoryStream's Position past that prefix rather
than to 0. The prefix holds 7 bits of the byte count per byte, and with
Encoding.Unicode every char takes 2 bytes, so if fileContents.Length < 64
(under 128 bytes) you want to set it to 1, if it's between 64 and 8191
inclusive then you want to set it to 2, etc. It has to put the prefix
there, btw, because that's how BinaryReader.ReadString( ) knows how far
to go.
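
An alternative to skipping the prefix is not to write it at all. The following is only a sketch under a few assumptions: the helper name (MemoryStreamXmlSketch.ReadFirstElementName) is hypothetical, it uses BinaryWriter.Write( char[] ), which writes the encoded characters with no length prefix, and it writes the UTF-16 byte-order mark by hand, which is optional but leaves the reader no doubt about the encoding.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class MemoryStreamXmlSketch
{
    // Hypothetical helper: copies an XML string into a MemoryStream as raw
    // UTF-16 and returns the name of the first element read back from it.
    public static string ReadFirstElementName(string xml)
    {
        MemoryStream ms = new MemoryStream();
        // The writer is deliberately not disposed: disposing it would also
        // close the MemoryStream we still need to read from.
        BinaryWriter bw = new BinaryWriter(ms, Encoding.Unicode);

        // Assumption: write the BOM (FF FE) ourselves, since BinaryWriter
        // does not emit one on its own.
        bw.Write(Encoding.Unicode.GetPreamble());

        // Write(char[]) emits only the encoded characters; unlike
        // Write(string) it adds no length prefix.
        bw.Write(xml.ToCharArray());

        bw.Flush();        // point 2: commit anything still pending
        ms.Position = 0;   // point 3: rewind before reading

        XmlTextReader reader = new XmlTextReader(ms);
        reader.MoveToContent();   // skips the XML declaration and whitespace
        return reader.Name;
    }
}
```

With this approach the stream position can simply be reset to 0, because the stream contains nothing but the byte-order mark and the document itself.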
Derek Harmon
Nov 12 '05 #3
Sorry I took so long to check back. Your info is very good; explained
everything.

It does seem a bit strange to me, however, that the
BinaryWriter.Write( string )
adds the string length indicator to its MemoryStream, but calling
new StreamReader( @"e:\bobo.xml", Encoding.Unicode );
doesn't.
It would seem that if the BinaryWriter's MemoryStream needs to know the
length of its data, the StreamReader's would as well...
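
One way to look at the difference: Write( string ) produces a self-delimiting record that a BinaryReader can pick out of a stream holding other values too, whereas a StreamReader simply decodes text until the stream ends and so needs no length. A small sketch of the first half of that, with hypothetical class and method names:

```csharp
using System.IO;
using System.Text;

public static class LengthPrefixSketch
{
    // Returns the raw bytes BinaryWriter.Write(string) produces for a value.
    public static byte[] WriteString(string value)
    {
        MemoryStream ms = new MemoryStream();
        BinaryWriter bw = new BinaryWriter(ms, Encoding.Unicode);
        bw.Write(value);
        bw.Flush();
        return ms.ToArray();
    }

    // Writes a string followed by an int, then reads both back. The prefix
    // tells ReadString where the string stops and the int begins.
    public static int RoundTripTrailingInt(string value, int trailing)
    {
        MemoryStream ms = new MemoryStream();
        BinaryWriter bw = new BinaryWriter(ms, Encoding.Unicode);
        bw.Write(value);
        bw.Write(trailing);
        bw.Flush();
        ms.Position = 0;

        BinaryReader br = new BinaryReader(ms, Encoding.Unicode);
        br.ReadString();        // consumes the prefix plus exactly that many bytes
        return br.ReadInt32();  // lands on the int that followed the string
    }
}
```

LengthPrefixSketch.WriteString("<a/>") yields 9 bytes: a prefix byte of 8, then the 8 bytes of UTF-16 for the four characters. That leading byte is what the XML parser trips over.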

Nov 12 '05 #4
