XmlDocument read from Network Stream



I have been using the following code (successfully) to read XML-formatted
text packets from a TCP stream. The output from the server consists
of a sequence of well-formed XML documents written to the output stream.
We are willing to pay $ to any expert (e.g. MVP) consultant who has the time
to help us track down this problem
(we will discuss rates if you can prove your expertise and problem-solving
approach).

The code below is for the client that is reading the stream.

All this worked well as long as there were long delays between each "packet". But
when multiple packets are sent in quick succession, it drops packets.
I have been trying to do this more "correctly" using an XmlTextReader,
but with no luck.

E.g.

XmlTextReader reader = new XmlTextReader(oStream);
reader.MoveToContent();
string s = reader.ReadOuterXml();
dgram.LoadXml(s);

It basically hangs on the ReadOuterXml();

Here is the original "working" code:
--------------------------------------------

private void OrchTalk()
{
    int bytesRead = 0;
    XmlDocument dgram = new XmlDocument();

    while (true)
    {
        byte[] buffer = new byte[1024]; // buffer for socket read stream
        bytesRead = 0;
        try { bytesRead = oStream.Read(buffer, 0, buffer.Length); }
        catch (Exception e)
        {
            if (stateCode == Common.stateCodes.shutdown) { break; }
            Tools.Debug(10, "OrchTalk: read error {0} ", e.ToString());
            break;
        }
        if (bytesRead == 0)
        {
            if (this.stateCode == Common.stateCodes.shutdown)
            {
                Tools.Debug(10, "OrchTalk: Orchestrator died");
                this.Disconnect();
                return;
            }
            break;
        }
        else
        {
            string s = Tools.ByteToStr(buffer, 0);
            try { dgram.LoadXml(s); }
            catch (Exception e)
            { Tools.Debug(0, "OnReadComplete: {0} Runt packet: {1}", e.ToString(), s); }
            int endTime = Environment.TickCount;
            this.inPacket = dgram.DocumentElement;
            Tools.Debug(10, "packet: {0}", inPacket.OuterXml);
            this.Dispatch();
            int startTime = (int)this.stats["start"];
            int elapsed = endTime - startTime;
            this.stats["stop"] = endTime;
            this.stats["time"] = elapsed;
            this.stats["bytes"] = bytesRead;
            this.stats["packetText"] = inPacket.OuterXml;
            if (StationMaster.isConnected)
            {
                string outPacket = inPacket.OuterXml;
                FlashComm.itself.Send(outPacket);
            }
        }
    }
    this.Disconnect();
}
Nov 12 '05 #1
Hello Yechezkal,

Thanks for posting in the group.

I noticed that the same issue is posted in dotnet.framework group. If you
have time, please check my reply there.

If you have any more concerns about it, please feel free to post in that group
and I will follow up there.

Thanks again for participating in the community.

Best regards,
Yanhong Huang
Microsoft Online Partner Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.

Nov 12 '05 #2
You say the output from the server is a sequence of well-formed XML
documents. By definition, each document is variable length, so calling
LoadXml() on whatever happens to be in your 1024-byte buffer is wrong: the
buffer may hold a partial document, or more than one. LoadXml expects one
complete document; you cannot call it repeatedly with partial fragments.

So try this:
using System;
using System.IO;
using System.Net;
using System.Xml;

public class XmlStreamReader {
    XmlTextReader reader;
    XmlDocument doc;

    public XmlStreamReader(string url) {
        XmlNameTable nt = new NameTable();
        XmlNamespaceManager nsMgr = new XmlNamespaceManager(nt);
        XmlParserContext context = new XmlParserContext(nt, nsMgr, null, XmlSpace.None);
        WebRequest wr = WebRequest.Create(url);
        Stream stream = wr.GetResponse().GetResponseStream();
        // XmlNodeType.Element puts the reader in fragment mode,
        // which allows more than one root element in the stream.
        this.reader = new XmlTextReader(stream, XmlNodeType.Element, context);
        this.doc = new XmlDocument();
    }

    // Returns the next top-level element, or null at end of stream.
    public XmlElement ReadXmlElement() {
        XmlNode node = null;
        // Skip anything that is not an element (comments, whitespace, etc.).
        while ((node = doc.ReadNode(reader)) != null && !(node is XmlElement)) {
        }
        return node as XmlElement;
    }
}

and test it like this:

public class Test {
    static void Main(string[] args) {
        string url = args[0];
        XmlStreamReader r = new XmlStreamReader(url);
        XmlElement e;
        while ((e = r.ReadXmlElement()) != null) {
            Console.WriteLine(e.OuterXml);
        }
        return;
    }
}

This might not work if the server injects an <?xml declaration and/or
<!DOCTYPE tag at the start of each document fragment...
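
The same reader setup should work over any Stream, not only a web response. Below is a minimal sketch, assuming the documents are simply concatenated with no declaration and no delimiter bytes between them; a MemoryStream stands in for the socket stream, and the names FragmentStreamDemo and ReadElements are made up for illustration. One caveat to be aware of: after loading a node, ReadNode advances the reader to the following node, so on a live socket the call may block until more data (or the end of the stream) arrives.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

public static class FragmentStreamDemo
{
    // Reads every top-level element from a stream of concatenated XML documents.
    public static List<XmlElement> ReadElements(Stream stream)
    {
        XmlNameTable nt = new NameTable();
        XmlNamespaceManager nsMgr = new XmlNamespaceManager(nt);
        XmlParserContext context = new XmlParserContext(nt, nsMgr, null, XmlSpace.None);

        // Fragment mode (XmlNodeType.Element) permits more than one root element.
        XmlTextReader reader = new XmlTextReader(stream, XmlNodeType.Element, context);
        XmlDocument doc = new XmlDocument();

        List<XmlElement> elements = new List<XmlElement>();
        XmlNode node;
        while ((node = doc.ReadNode(reader)) != null)
        {
            XmlElement element = node as XmlElement;
            if (element != null)
            {
                elements.Add(element);
            }
        }
        return elements;
    }

    public static void Main()
    {
        // Stand-in for the socket: two documents written back to back.
        byte[] wire = Encoding.ASCII.GetBytes("<a x=\"1\"/><b>hi</b>");
        using (MemoryStream stream = new MemoryStream(wire))
        {
            foreach (XmlElement element in ReadElements(stream))
            {
                Console.WriteLine(element.OuterXml);
            }
        }
    }
}
```

In a real client you would probably hand each element to the dispatcher as it is read instead of collecting them in a list.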

Chris.

Nov 12 '05 #3
Yechezkal Gutfreund <sg********@hotmail.com> wrote:

<snip>

Do you have a short but complete pair of programs (one sender, one
receiver) which reliably demonstrates the problem?

Have you used a network analyser to check that the packets are being
sent correctly in the first place?

Your receiving code looks pretty odd though: you're ignoring the
bytesRead parameter after checking whether or not it's 0. What does
your Tools.ByteToStr method do? Does it try to find the first 0 byte
and assume that the "good data" ends there? Far better would be to use
a Decoder and GetChars, or just a straight StreamReader on top of the
network stream.
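
To make the bytesRead point concrete, here is a minimal sketch, assuming ASCII text as in the rest of the thread. A MemoryStream stands in for the network stream, and the names DecodeDemo and ReadChunk are hypothetical, not part of the original code.

```csharp
using System;
using System.IO;
using System.Text;

public static class DecodeDemo
{
    // Reads one chunk from the stream and decodes only the bytes actually received.
    // Returns null when the stream has ended.
    public static string ReadChunk(Stream stream, byte[] buffer)
    {
        int bytesRead = stream.Read(buffer, 0, buffer.Length);
        if (bytesRead == 0)
        {
            return null; // the other side closed the connection
        }
        // Decode bytesRead bytes, not buffer.Length: the rest of the buffer
        // was never filled by this read.
        return Encoding.ASCII.GetString(buffer, 0, bytesRead);
    }

    public static void Main()
    {
        byte[] payload = Encoding.ASCII.GetBytes("<msg id=\"1\"/>");
        using (MemoryStream stream = new MemoryStream(payload))
        {
            byte[] buffer = new byte[1024];
            string text = ReadChunk(stream, buffer);
            // The string is as long as the payload, not as long as the buffer.
            Console.WriteLine("{0} chars: {1}", text.Length, text);
        }
    }
}
```

Note that a chunk is still just whatever one Read call returned; it is not guaranteed to be exactly one document.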

You also *seem* to be assuming that the whole document will come in a
single call to Read - you don't wait for the end of the stream in order
to get the whole document in a string before calling
XmlDocument.LoadXml.

What does your incoming stream look like? Is it meant to have several
documents in it? If so, you could have trouble if you don't have any
way of knowing in advance where one document will end and another one
start.
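
One common way to know in advance where a document ends is to send its length first. The sketch below assumes a 4-byte big-endian length prefix before each ASCII document; this framing is an illustration only and is not what the server in this thread sends. The names LengthPrefixedFraming, ReadFully, WriteMessage and ReadMessage are hypothetical. It also shows looping on Read until the whole message has arrived, since a single Read may return only part of it.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class LengthPrefixedFraming
{
    // Keeps calling Read until exactly count bytes have arrived.
    // Returns false if the stream ends first.
    static bool ReadFully(Stream stream, byte[] buffer, int count)
    {
        int offset = 0;
        while (offset < count)
        {
            int n = stream.Read(buffer, offset, count - offset);
            if (n == 0) return false;
            offset += n;
        }
        return true;
    }

    // Writes a 4-byte big-endian length followed by the ASCII bytes of xml.
    public static void WriteMessage(Stream stream, string xml)
    {
        byte[] body = Encoding.ASCII.GetBytes(xml);
        byte[] header = new byte[4];
        header[0] = (byte)(body.Length >> 24);
        header[1] = (byte)(body.Length >> 16);
        header[2] = (byte)(body.Length >> 8);
        header[3] = (byte)body.Length;
        stream.Write(header, 0, header.Length);
        stream.Write(body, 0, body.Length);
    }

    // Reads one framed message and parses it. Returns null at end of stream.
    public static XmlDocument ReadMessage(Stream stream)
    {
        byte[] header = new byte[4];
        if (!ReadFully(stream, header, 4)) return null;
        int length = (header[0] << 24) | (header[1] << 16) | (header[2] << 8) | header[3];
        if (length < 0) throw new InvalidDataException("negative message length");

        byte[] body = new byte[length];
        if (!ReadFully(stream, body, length)) return null;

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Encoding.ASCII.GetString(body, 0, length));
        return doc;
    }

    public static void Main()
    {
        using (MemoryStream stream = new MemoryStream())
        {
            WriteMessage(stream, "<a/>");
            WriteMessage(stream, "<b n=\"2\"/>");
            stream.Position = 0; // rewind so the same stream can be read back

            XmlDocument doc;
            while ((doc = ReadMessage(stream)) != null)
            {
                Console.WriteLine(doc.DocumentElement.OuterXml);
            }
        }
    }
}
```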

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 12 '05 #4
Chris Lovett wrote:
This might not work if the server injects an <?xml declaration and/or
<!DOCTYPE tag at the start of each document fragment...


Happily, XML external entities (which I believe are the same as XML
fragments) are not allowed to have a DOCTYPE. But they are allowed to have a
text declaration like <?xml version="1.0" encoding="utf-8"?>
I think it's a bad idea to use XmlTextReader to read a series of XML
fragments instead of a single fragment; some delimiter should be used.
--
Oleg Tkachenko
XML Insider
http://www.tkachenko.com/blog

Nov 12 '05 #5
1. Thank you Yan, Chris, Oleg and Jon. Your points are well taken, and I
apologize for the lack of clarity.

2. Yes, the design needs improvement, and the choice of mechanisms is poor,
mainly due to my lack of familiarity with the various stream readers.

Let's cover a few of the system issues:

a. There is a central server, and about 10-20 PDA clients running .NET CF.
For testing purposes, I am still at one client, so there is no massive data
overload, lost packets, etc. This problem occurs whether it is WiFi, GPRS,
or even on the same box. Communication is via TCP for now. We are planning
to change to UDP to optimize packet latency, recovery, etc. So things look
odd, and now you know why: the code uses dgrams, but for the moment we are
going over a TCP stream.

b. Messages are small, almost always less than 100 bytes.

c. We are not using binary or other serial formatters because we are 1)
prototyping and 2) need to run on slow PDAs.

d. We are using GetString:
public static string ByteToStr(byte[] buffer, int start)
{
    return System.Text.Encoding.ASCII.GetString(buffer, start, buffer.Length);
}

e. The problem is most likely that the server uses a '\0' to delimit
messages. Thus if I pause the server between each message (a debug pop-up),
the Nagle algorithm on the TCP link splits the stream into two packets. But if
not, then they may both come in one packet, and the GetString stops in the
middle of the buffer (packet) read.

f. OK, one solution is to run through the buffer a byte at a time, find
the null delimiter (or better, an EOD) and do the strings one at a time. But
remember this is a low-CPU PDA. Why run through the buffer twice and parse
twice: once to split the packets, and a second time to turn the string into an
XmlDocument?

g. TextReaderStream from URL (and all of that) is not meaningful. These are
raw TCP sockets, and we are going to UDP eventually.

Is there not some way to use the XmlStreamReaders to gobble up tokens and
even complete fragments (.ReadOuterXml) or complete docs, so that only one
pass of the stream needs to be done?

The goal then is simple. Take a byte buffer, and extract (ONE at a time) XML
packets and stuff them into XmlDocuments. Assume that the buffer will be big
enough so that there will never be fragmentary documents, and that the
buffer is big enough for 10 messages, and there will never be more than 1 or
2 unread ones. So buffer overflow or splits are not a concern.
Is this clear?
--
==================================
Yechezkal Gutfreund
Chief Scientist
Kesser Technical Group, Inc.
==================================
Nov 12 '05 #6
Yechezkal Gutfreund <sg********@hotmail.com> wrote:
d. We are using GetString:
public static string ByteToStr(byte[] buffer, int start)
{
    return System.Text.Encoding.ASCII.GetString(buffer, start, buffer.Length);
}

In that case you're decoding a load of 0s which weren't actually
received. You need to specify how much to decode, at which point
there's no real point in having the extra method anyway.

e. The problem is most likely that the server uses a '\0' to delimit
messages. Thus if I pause the server between each message (a debug pop-up),
the Nagle algorithm on the TCP link splits the stream into two packets. But if
not, then they may both come in one packet, and the GetString stops in the
middle of the buffer (packet) read.

Relying on packets doing exactly what you want sounds like a really bad
idea to me, to be honest - there's a reason why all of this is exposed
as a stream.

f. OK, one solution is to run through the buffer a byte at a time, find
the null delimiter (or better, an EOD) and do the strings one at a time. But
remember this is a low-CPU PDA. Why run through the buffer twice and parse
twice: once to split the packets, and a second time to turn the string into an
XmlDocument?

I don't think looking for a \0 really counts as full parsing - and
you're already running through the byte stream twice, once to decode
from bytes to characters. In fact, with your current code you're
decoding 1024 bytes every time, even if only 100 bytes actually contain
any text.

g. TextReaderStream from URL (and all of that) is not meaningful. These are
raw TCP sockets, and we are going to UDP eventually.

You've got a stream though, which means you can create a StreamReader.
When it goes to UDP, you'd have to rewrite stuff anyway.
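
As a rough illustration of the StreamReader route, here is a minimal sketch that reads characters until the '\0' delimiter described in point e. It assumes ASCII text and a zero terminator after every message; a MemoryStream stands in for the socket stream, and DelimitedTextReader and ReadMessage are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

public static class DelimitedTextReader
{
    // Reads characters up to (not including) the next '\0' delimiter.
    // Returns null if the stream ends before a delimiter is seen,
    // so a trailing partial message is dropped.
    public static string ReadMessage(TextReader reader)
    {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.Read()) != -1)
        {
            if (c == 0)
            {
                return sb.ToString();
            }
            sb.Append((char)c);
        }
        return null;
    }

    public static void Main()
    {
        // Stand-in for the socket: two messages delivered in a single chunk.
        byte[] wire = Encoding.ASCII.GetBytes("<a/>\0<b n=\"2\"/>\0");
        using (StreamReader reader = new StreamReader(new MemoryStream(wire), Encoding.ASCII))
        {
            string message;
            while ((message = ReadMessage(reader)) != null)
            {
                Console.WriteLine(message);
            }
        }
    }
}
```

Because the StreamReader does its own buffering, it does not matter whether the two messages arrive in one network read or in several.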

Is there not some way to use the XmlStreamReaders to gobble up tokens and
even complete fragments (.ReadOuterXml) or complete docs, so that only one
pass of the stream needs to be done?

I'm not *exactly* sure what you mean here, but I don't think so.

The goal then is simple. Take a byte buffer, and extract (ONE at a time) XML
packets and stuff them into XmlDocuments. Assume that the buffer will be big
enough so that there will never be fragmentary documents, and that the
buffer is big enough for 10 messages, and there will never be more than 1 or
2 unread ones. So buffer overflow or splits are not a concern.

Is this clear?


Well, if you're *really* sure you've got everything in a buffer, I'd
suggest scanning for your zero bytes, then decode each delimited
section using Encoding.ASCII.GetString (buffer, start, length) and then
feeding the results of that into XmlDocument.LoadXml.
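
A minimal sketch of that suggestion, under the same assumptions stated above: every message in the buffer is complete and is followed by a zero byte. ZeroDelimitedBuffer and SplitMessages are hypothetical names, and any bytes after the last delimiter are ignored in this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Xml;

public static class ZeroDelimitedBuffer
{
    // Scans the first bytesRead bytes of buffer for zero bytes and loads each
    // delimited section into its own XmlDocument.
    public static List<XmlDocument> SplitMessages(byte[] buffer, int bytesRead)
    {
        List<XmlDocument> docs = new List<XmlDocument>();
        int start = 0;
        for (int i = 0; i < bytesRead; i++)
        {
            if (buffer[i] != 0)
            {
                continue;
            }
            int length = i - start;
            if (length > 0) // skip empty sections between consecutive delimiters
            {
                XmlDocument doc = new XmlDocument();
                doc.LoadXml(Encoding.ASCII.GetString(buffer, start, length));
                docs.Add(doc);
            }
            start = i + 1;
        }
        return docs;
    }

    public static void Main()
    {
        // Simulate one socket read that delivered two messages at once.
        byte[] received = Encoding.ASCII.GetBytes("<a/>\0<b n=\"2\"/>\0");
        byte[] buffer = new byte[1024];
        Array.Copy(received, buffer, received.Length);

        foreach (XmlDocument doc in SplitMessages(buffer, received.Length))
        {
            Console.WriteLine(doc.DocumentElement.OuterXml);
        }
    }
}
```

Passing bytesRead keeps the scan from walking over the unused tail of the 1024-byte buffer.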

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 12 '05 #7
