XmlDocument read from Network Stream



I have been using the following code (successfully) to read XML-formatted
text packets from a TCP stream. The server output consists of a sequence of
well-formed XML documents written to the output stream.
We are willing to pay $ to any expert (e.g. MVP) consultant who has the time
to help us track down this problem
(we will discuss rates if you can demonstrate your expertise and
problem-solving approach).

The code below is for the client that is reading the stream.

All this worked well as long as there were long delays between "packets". But
when multiple packets are sent in quick succession, it drops packets.
I have been trying to do this more "correctly" using an XmlTextReader,
but with no luck.

E.g.

XmlTextReader reader = new XmlTextReader(oStream);
reader.MoveToContent();
string s = reader.ReadOuterXml();
dgram.LoadXml(s);

It basically hangs on the ReadOuterXml() call.

Here is the original "working" code:
--------------------------------------------

private void OrchTalk()
{
    int bytesRead = 0;
    XmlDocument dgram = new XmlDocument();

    while (true)
    {
        byte[] buffer = new byte[1024]; // buffer for socket read stream
        bytesRead = 0;
        try { bytesRead = oStream.Read(buffer, 0, buffer.Length); }
        catch (Exception e)
        {
            if (stateCode == Common.stateCodes.shutdown) { break; }
            Tools.Debug(10, "OrchTalk: read error {0} ", e.ToString());
            break;
        }
        if (bytesRead == 0)
        {
            if (this.stateCode == Common.stateCodes.shutdown)
            {
                Tools.Debug(10, "OrchTalk: Orchestrator died");
                this.Disconnect();
                return;
            }
            break;
        }
        else
        {
            string s = Tools.ByteToStr(buffer, 0);
            try { dgram.LoadXml(s); }
            catch (Exception e)
            {
                Tools.Debug(0, "OnReadComplete: {0} Runt packet: {1}", e.ToString(), s);
            }
            int endTime = Environment.TickCount;
            this.inPacket = dgram.DocumentElement;
            Tools.Debug(10, "packet: {0}", inPacket.OuterXml);
            this.Dispatch();
            int startTime = (int)this.stats["start"];
            int elapsed = endTime - startTime;
            this.stats["stop"] = endTime;
            this.stats["time"] = elapsed;
            this.stats["bytes"] = bytesRead;
            this.stats["packetText"] = inPacket.OuterXml;
            if (StationMaster.isConnected)
            {
                string outPacket = inPacket.OuterXml;
                FlashComm.itself.Send(outPacket);
            }
        }
    }
    this.Disconnect();
}
Nov 12 '05 #1
Hello Yechezkal,

Thanks for posting in the group.

I noticed that the same issue is posted in dotnet.framework group. If you
have time, please check my reply there.

If you have any more concerns about it, please feel free to post in that group
and I will follow up there.

Thanks again for participating in the community.

Best regards,
Yanhong Huang
Microsoft Online Partner Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.

Nov 12 '05 #2
You say the output from the server is a sequence of well-formed XML
documents. By definition each document is variable length, so calling
LoadXml() on whatever happens to land in your 1024-byte buffer is wrong:
the buffer may hold only part of a document, or more than one. LoadXml()
expects exactly one complete document; you cannot call it repeatedly with
partial fragments.
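
To make the point above concrete, here is a minimal sketch of how XmlDocument.LoadXml reacts to a complete document, a truncated one, and two documents run together. The class and method names (PartialLoadDemo, TryLoad) and the sample strings are made up for illustration; they are not from the original code.

```csharp
using System;
using System.Xml;

public static class PartialLoadDemo
{
    // Returns true if the text parses as exactly one well-formed document.
    public static bool TryLoad(string text)
    {
        try
        {
            new XmlDocument().LoadXml(text);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // One complete document: loads fine.
        Console.WriteLine(TryLoad("<msg><body>hi</body></msg>")); // True

        // A document cut off mid-tag, as a short read might deliver it.
        Console.WriteLine(TryLoad("<msg><body>hi</bo"));          // False

        // Two documents in one buffer: rejected, a document has one root.
        Console.WriteLine(TryLoad("<a/><b/>"));                   // False
    }
}
```

The second and third cases are the ones a fixed-size read can produce when messages arrive back to back.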

So try this:
using System;
using System.IO;
using System.Net;
using System.Xml;

public class XmlStreamReader {
    XmlTextReader reader;
    XmlDocument doc;

    public XmlStreamReader(string url) {
        XmlNameTable nt = new NameTable();
        XmlNamespaceManager nsMgr = new XmlNamespaceManager(nt);
        XmlParserContext context = new XmlParserContext(nt, nsMgr, null,
            XmlSpace.None);
        WebRequest wr = WebRequest.Create(url);
        Stream stream = wr.GetResponse().GetResponseStream();
        // Parse the stream as a fragment so several top-level elements are allowed.
        this.reader = new XmlTextReader(stream, XmlNodeType.Element,
            context);
        this.doc = new XmlDocument();
    }

    // Returns the next top-level element, or null at the end of the stream.
    public XmlElement ReadXmlElement() {
        XmlNode node = null;
        while ((node = doc.ReadNode(reader)) != null && !(node is XmlElement)) {
            // skip whitespace, comments and other non-element nodes
        }
        return node as XmlElement;
    }
}

and test it like this:

public class Test {
    static void Main(string[] args) {
        string url = args[0];
        XmlStreamReader r = new XmlStreamReader(url);
        XmlElement e;
        while ((e = r.ReadXmlElement()) != null) {
            Console.WriteLine(e.OuterXml);
        }
        return;
    }
}

This might not work if the server injects an <?xml declaration and/or
<!DOCTYPE tag at the start of each document fragment...

Chris.

Nov 12 '05 #3
Yechezkal Gutfreund <sg********@hotmail.com> wrote:
> I have been using the following code (successfully) to read XML-formatted
> text packets from a TCP stream. The server output consists of a sequence of
> well-formed XML documents written to the output stream.
> We are willing to pay $ to any expert (e.g. MVP) consultant who has the time
> to help us track down this problem
> (we will discuss rates if you can demonstrate your expertise and
> problem-solving approach).
>
> The code below is for the client that is reading the stream.
>
> All this worked well as long as there were long delays between "packets". But
> when multiple packets are sent in quick succession, it drops packets.
> I have been trying to do this more "correctly" using an XmlTextReader,
> but with no luck.

<snip>

Do you have a short but complete pair of programs (one sender, one
receiver) which reliably demonstrates the problem?

Have you used a network analyser to check that the packets are being
sent correctly in the first place?

Your receiving code looks pretty odd though: you're ignoring the
bytesRead value after checking whether or not it's 0. What does
your Tools.ByteToStr method do? Does it try to find the first 0 byte
and assume that the "good data" ends there? Far better would be to use
a Decoder and GetChars, or just a straight StreamReader on top of the
network stream.
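
As a rough illustration of the bytesRead point, the sketch below decodes only the bytes that Read actually returned instead of the whole buffer. A MemoryStream stands in for the network stream, and the names ReadDecodeSketch and ReadChunk are hypothetical; ASCII is assumed as the wire encoding.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReadDecodeSketch
{
    // Reads one chunk and decodes only the bytes that were actually received.
    // Returns null when the other side has closed the stream.
    public static string ReadChunk(Stream stream, byte[] buffer)
    {
        int bytesRead = stream.Read(buffer, 0, buffer.Length);
        if (bytesRead == 0)
        {
            return null;
        }
        // The third argument is a count, so pass bytesRead, not buffer.Length.
        return Encoding.ASCII.GetString(buffer, 0, bytesRead);
    }

    public static void Main()
    {
        // Assumption: a MemoryStream stands in for the real network stream.
        byte[] payload = Encoding.ASCII.GetBytes("<msg id=\"1\"/>");
        using (Stream stream = new MemoryStream(payload))
        {
            string text = ReadChunk(stream, new byte[1024]);
            Console.WriteLine(text.Length); // prints 13, not 1024
        }
    }
}
```

This only removes the padding of undelivered zero bytes; a chunk can still contain part of a document or several documents, so it does not solve the framing problem by itself.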

You also *seem* to be assuming that the whole document will come in a
single call to Read - you don't wait for the end of the stream in order
to get the whole document in a string before calling
XmlDocument.LoadXml.

What does your incoming stream look like? Is it meant to have several
documents in it? If so, you could have trouble if you don't have any
way of knowing in advance where one document will end and another one
start.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 12 '05 #4
Chris Lovett wrote:
> This might not work if the server injects an <?xml declaration and/or
> <!DOCTYPE tag at the start of each document fragment...


Happily, XML external entities (which I believe are the same as XML
fragments) are not allowed to have a DOCTYPE. But they are allowed to have
a text declaration such as <?xml version="1.0" encoding="utf-8"?>.
I think it is a bad idea to use XmlTextReader to read a series of XML
fragments rather than a single fragment; some delimiter should be used.
--
Oleg Tkachenko
XML Insider
http://www.tkachenko.com/blog

Nov 12 '05 #5
1. Thank you Yan, Chris, Oleg and Jon. Your points are well taken, and I
apologize for the lack of clarity.

2. Yes, the design needs improvement, and the choice of mechanisms is poor,
mainly due to my lack of familiarity with the various stream readers.

Let's cover a few of the system issues:

a. There is a central server, and about 10-20 PDA clients running .NET CF.
For testing purposes, I am still at one client, so there is no massive data
overload, lost packets, etc. This problem occurs whether it is over WiFi, GPRS,
or even on the same box. Communication is via TCP for now. We are planning
to change to UDP to optimize packet latency, recovery, etc. So things look
odd, and now you know why: the code uses dgrams, but for the moment we are
going over a TCP stream.

b. Messages are small, almost always less than 100 bytes.

c. We are not using binary or other serial formatters because we are 1)
prototyping and 2) need to run on slow PDAs.

d. We are using GetString:
public static string ByteToStr(byte[] buffer, int start)
{
    return System.Text.Encoding.ASCII.GetString(buffer, start,
        buffer.Length);
}

e. The problem is most likely that the server uses a '\0' to delimit
messages. If I pause the server between messages (with a debug pop-up),
the messages arrive in two separate packets. Without the pause, Nagle's
algorithm on the TCP link may combine them into one packet, and the
GetString stops in the middle of the buffer (packet) read.

f. OK, one solution is to run through the buffer a byte at a time, find
the null delimiter (or better, an EOD), and convert the strings one at a
time. But remember this is a low-CPU PDA. Why run through the buffer twice
and parse twice: once to split the packets, and a second time to turn the
string into an XmlDocument?

g. TextReaderStream from a URL (and all of that) is not meaningful. These are
raw TCP sockets, and we are going to UDP eventually.

Is there not some way to use the XML stream readers to gobble up tokens and
even complete fragments (.ReadOuterXml) or complete docs, so that only one
pass over the stream is needed?

The goal then is simple: take a byte buffer and extract (ONE at a time) XML
packets and stuff them into XmlDocuments. Assume that the buffer will be big
enough that there will never be fragmentary documents: the buffer is big
enough for 10 messages, and there will never be more than 1 or 2 unread
ones. So buffer overflow or splits are not a concern.
Is this clear?
--
==================================
Yechezkal Gutfreund
Chief Scientist
Kesser Technical Group, Inc.
==================================
Nov 12 '05 #6
Yechezkal Gutfreund <sg********@hotmail.com> wrote:
> d. We are using GetString:
> public static string ByteToStr(byte[] buffer, int start)
> {
>     return System.Text.Encoding.ASCII.GetString(buffer, start,
>         buffer.Length);
> }

In that case you're decoding a load of 0s which weren't actually
received. You need to specify how much to decode, at which point
there's no real point in having the extra method anyway.

> e. The problem is most likely that the server uses a '\0' to delimit
> messages. If I pause the server between messages (with a debug pop-up),
> the messages arrive in two separate packets. Without the pause, Nagle's
> algorithm on the TCP link may combine them into one packet, and the
> GetString stops in the middle of the buffer (packet) read.

Relying on packets doing exactly what you want sounds like a really bad
idea to me, to be honest - there's a reason why all of this is exposed
as a stream.

> f. OK, one solution is to run through the buffer a byte at a time, find
> the null delimiter (or better, an EOD), and convert the strings one at a
> time. But remember this is a low-CPU PDA. Why run through the buffer twice
> and parse twice: once to split the packets, and a second time to turn the
> string into an XmlDocument?

I don't think looking for a \0 really counts as full parsing - and
you're already running through the byte stream twice, once to decode
from bytes to characters. In fact, with your current code you're
decoding 1024 bytes every time, even if only 100 bytes actually contain
any text.

> g. TextReaderStream from a URL (and all of that) is not meaningful. These are
> raw TCP sockets, and we are going to UDP eventually.

You've got a stream though, which means you can create a StreamReader.
When it goes to UDP, you'd have to rewrite stuff anyway.
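
One way to read the StreamReader suggestion, sketched under the assumption (from point e) that the server ends every message with a '\0': pull characters from a reader until the delimiter shows up, then hand the text to LoadXml. The names DelimitedXmlReader and ReadMessage are hypothetical, and a MemoryStream stands in for the socket stream.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class DelimitedXmlReader
{
    // Reads characters up to the next '\0' delimiter (or the end of the stream).
    // Returns null once the stream has ended and no text is pending.
    public static string ReadMessage(TextReader reader)
    {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.Read()) != -1)
        {
            if (c == 0)
            {
                if (sb.Length > 0)
                {
                    return sb.ToString();
                }
                continue; // ignore empty segments between consecutive delimiters
            }
            sb.Append((char)c);
        }
        // Text left over at end of stream had no delimiter; it may be incomplete.
        return sb.Length > 0 ? sb.ToString() : null;
    }

    public static void Main()
    {
        // Assumption: two messages arriving back to back in one chunk.
        byte[] wire = Encoding.ASCII.GetBytes("<a/>\0<b x=\"1\"/>\0");
        using (StreamReader reader = new StreamReader(new MemoryStream(wire), Encoding.ASCII))
        {
            string text;
            while ((text = ReadMessage(reader)) != null)
            {
                XmlDocument doc = new XmlDocument();
                doc.LoadXml(text);
                Console.WriteLine(doc.DocumentElement.Name);
            }
        }
    }
}
```

Because the reader keeps its own position in the stream, it does not matter whether a message arrives in one packet, is split across several, or shares a packet with the next message.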

> Is there not some way to use the XML stream readers to gobble up tokens and
> even complete fragments (.ReadOuterXml) or complete docs, so that only one
> pass over the stream is needed?

I'm not *exactly* sure what you mean here, but I don't think so.

> The goal then is simple: take a byte buffer and extract (ONE at a time) XML
> packets and stuff them into XmlDocuments. Assume that the buffer will be big
> enough that there will never be fragmentary documents: the buffer is big
> enough for 10 messages, and there will never be more than 1 or 2 unread
> ones. So buffer overflow or splits are not a concern.
>
> Is this clear?


Well, if you're *really* sure you've got everything in a buffer, I'd
suggest scanning for your zero bytes, then decode each delimited
section using Encoding.ASCII.GetString (buffer, start, length) and then
feeding the results of that into XmlDocument.LoadXml.
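
The suggestion above could be sketched roughly as follows. It relies on the stated assumptions that the buffer only ever holds complete documents and that '\0' is the delimiter; BufferSplitter, Split and Load are hypothetical names, and the sample bytes are invented.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Xml;

public static class BufferSplitter
{
    // Scans the first bytesRead bytes for zero delimiters and loads each
    // delimited section into its own XmlDocument.
    public static List<XmlDocument> Split(byte[] buffer, int bytesRead)
    {
        List<XmlDocument> docs = new List<XmlDocument>();
        int start = 0;
        for (int i = 0; i < bytesRead; i++)
        {
            if (buffer[i] != 0)
            {
                continue;
            }
            if (i > start)
            {
                docs.Add(Load(buffer, start, i - start));
            }
            start = i + 1;
        }
        // Assumption: data after the last delimiter is a complete document.
        // If it can be partial, keep it for the next read instead.
        if (start < bytesRead)
        {
            docs.Add(Load(buffer, start, bytesRead - start));
        }
        return docs;
    }

    private static XmlDocument Load(byte[] buffer, int start, int length)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Encoding.ASCII.GetString(buffer, start, length));
        return doc;
    }

    public static void Main()
    {
        byte[] buffer = new byte[1024];
        byte[] wire = Encoding.ASCII.GetBytes("<a/>\0<b n=\"2\"/>\0");
        Array.Copy(wire, buffer, wire.Length);

        foreach (XmlDocument doc in Split(buffer, wire.Length))
        {
            Console.WriteLine(doc.DocumentElement.Name);
        }
    }
}
```

The scan for zero bytes is a plain comparison per byte, so it is cheap next to the XML parse that follows it.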

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 12 '05 #7
