Bytes IT Community

Parsing XML over TCP/IP

Thomas Polan
Sorry if this has been posted before...

I am receiving XML messages over a TCP client. Messages vary in size and
sometimes can arrive in groups. Thus, I am not guaranteed to receive a full
message in a single read from my socket.

I am loading each message into an XmlDocument after it arrives. I need a
way to make sure I have a complete XML message before I load it into an
XmlDocument or it will throw an exception. I am currently finding complete
messages by turning each read into a string and manually looking for closing
tags. Surely there is an easier way to accomplish this, perhaps with an
XmlReader or XmlTextReader?

Thanks
Thomas


Nov 11 '05 #1
6 Replies


Oleg Tkachenko
Thomas Polan wrote:
> Sorry if this has been posted before...
>
> I am receiving XML messages over a TCP client. Messages vary in size and
> sometimes can arrive in groups. Thus, I am not guaranteed to receive a full
> message in a single read from my socket.
>
> I am loading each message into an XmlDocument after it arrives. I need a
> way to make sure I have a complete XML message before I load it into an
> XmlDocument or it will throw an exception. I am currently finding complete
> messages by turning each read into a string and manually look for closing
> tags. Surely there is an easier way to accomplish this, perhaps with an
> XmlReader or XmlTextReader?

I see two separate problems here: partial documents and grouped documents.
The former is simple: just create an XmlTextReader over the stream and it will
read until EOF. But if a second document arrives immediately after the first
one, XmlTextReader will throw an XmlException, because two root elements break
the XML well-formedness rules. So you need some way to separate the XML
documents in the stream, e.g. put \0 between them. It will stop XmlTextReader
just as if EOF had been encountered; you can then check whether the stream can
still be read and start another XmlTextReader.
--
Oleg Tkachenko
http://www.tkachenko.com/blog
Multiconn Technologies, Israel
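
The separator idea above can be sketched without any XML parsing at all. This is only a minimal sketch, assuming the sender can be changed to append a \0 after every document and that the bytes have already been decoded to text; the class name NulFramer and its Append method are made up for illustration and are not part of .NET.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: buffers decoded socket text and emits one string
// per complete message, where messages are assumed to be terminated by '\0'.
sealed class NulFramer
{
    private readonly StringBuilder _pending = new StringBuilder();

    // Feed one chunk as received; returns the messages completed by this chunk.
    public List<string> Append(string chunk)
    {
        var complete = new List<string>();
        foreach (char c in chunk)
        {
            if (c == '\0')
            {
                // Separator reached: everything buffered so far is one message.
                if (_pending.Length > 0) complete.Add(_pending.ToString());
                _pending.Clear();
            }
            else
            {
                _pending.Append(c);
            }
        }
        return complete; // an unfinished tail stays buffered for the next call
    }
}
```

Each returned string should then be safe to pass to XmlDocument.LoadXml, since it holds exactly one document. If the bytes are decoded per read, a stateful System.Text.Decoder is probably needed so that multi-byte characters split across reads are not corrupted.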

Nov 11 '05 #2

Thomas Polan
What I'm really looking for is an XmlReader that blocks on a network stream
until data arrives. When xml arrives, it will read until it finds a
complete record and then notify me. I'm trying to avoid having to touch or
modify the stream at all.

Thomas


Nov 11 '05 #3

Oleg Tkachenko
Thomas Polan wrote:
> What I'm really looking for is an XmlReader that blocks on a network stream
> until data arrives. When xml arrives, it will read until it finds a
> complete record and then notify me.
There is no such thing as a "complete record" in XML. XmlReader reads to
the end of the file or the end of the stream, so the following
<first-doc>
</first-doc>
<second-doc>
</second-doc>
will be treated as a non-well-formed XML document with two root element
nodes. If your XML documents don't come with XML declarations, you can
read the above as a single XML fragment with XmlTextReader, but then you have
to separate the documents yourself. That can be done based on the Depth
property: a top-level element is complete once its end tag is read at depth 0.
--
Oleg Tkachenko
http://www.tkachenko.com/blog
Multiconn Technologies, Israel
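
The Depth-based separation described above could look roughly like this. It is a sketch, not the approach anyone in this thread actually used: it assumes the newer XmlReader.Create API with ConformanceLevel.Fragment (which allows several top-level elements), and the names DepthBoundaries and CompletedRoots are invented for the example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class DepthBoundaries
{
    // Returns the names of all top-level elements that are fully closed
    // in the given text. An unfinished tail is ignored rather than reported.
    public static List<string> CompletedRoots(string xml)
    {
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        var done = new List<string>();
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
            {
                while (reader.Read())
                {
                    if (reader.Depth != 0) continue; // still inside some top-level element

                    // A top-level element is complete at <x/> or at its end tag </x>.
                    bool emptyRoot = reader.NodeType == XmlNodeType.Element && reader.IsEmptyElement;
                    bool closedRoot = reader.NodeType == XmlNodeType.EndElement;
                    if (emptyRoot || closedRoot) done.Add(reader.Name);
                }
            }
        }
        catch (XmlException)
        {
            // The text ended in the middle of an element: more data is needed.
        }
        return done;
    }
}
```

For the two-document example above, CompletedRoots should return "first-doc" and "second-doc". For input such as "<a/><b><c/>" it should return only "a", because b is never closed.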

Nov 11 '05 #4

Alexander Gnauck
Hello,
> Sorry if this has been posted before...
>
> I am receiving XML messages over a TCP client. Messages vary in size
> and sometimes can arrive in groups. Thus, I am not guaranteed to
> receive a full message in a single read from my socket.
>
> I am loading each message into an XmlDocument after it arrives. I
> need a way to make sure I have a complete XML message before I load
> it into an XmlDocument or it will throw an exception. I am currently
> finding complete messages by turning each read into a string and
> manually look for closing tags. Surely there is an easier way to
> accomplish this, perhaps with an XmlReader or XmlTextReader?

I'm working on the same problem. At first I used XmlDocument to check whether
I have a complete node that I can parse, but XmlDocument is very slow,
especially on the Compact Framework. That is why I'm currently working on a
better and faster way using XmlReader or XmlTextReader. You read with an
XmlReader and check the depth of the reader; then you know when you have a
complete node. The XmlReader must be bound to a stream. My problem is that so
far I have found no way to create an XmlDocument from the reader once I know
that I have a complete node at the current position.
Below is the code of my old SocketQueue class. xmlReader here is a small
wrapper class around XmlDocument. I can send it to you if it helps.

Alex

public class SocketQueue
{
    // Static buffer where we spool the socket data when a read
    // contains only part of a node
    private static string m_XML = "";

    public SocketQueue(string strSocket, Session s)
    {
        XML_Node xmlReader = new XML_Node(m_XML + strSocket);

        if (xmlReader.Valid)
        {
            // We have one valid XML node: send it to the parser
            s.RaiseOnIncomingXML(m_XML + strSocket);
            Parser parse = new Parser(m_XML + strSocket, s);
            m_XML = "";
        }
        else
        {
            // Wrap the buffer in a dummy root so that several sibling
            // nodes parse as a single document
            //XML_Node xmlReader2 = new XML_Node("<dummy>" + strSocket + "</dummy>");
            XML_Node xmlReader2 = new XML_Node("<dummy>" + m_XML + strSocket + "</dummy>");
            if (xmlReader2.Valid)
            {
                // We have more than one node:
                // get all child nodes and send them to the parser
                m_XML = "";

                for (int i = 0; i < xmlReader2.XMLObj.ChildNodes[0].ChildNodes.Count; i++)
                {
                    s.RaiseOnIncomingXML(xmlReader2.XMLObj.ChildNodes[0].ChildNodes[i].OuterXml);
                    Parser parse = new Parser(xmlReader2.XMLObj.ChildNodes[0].ChildNodes[i].OuterXml, s);
                }
            }
            else
            {
                // We have no complete XML node yet: keep buffering
                m_XML = m_XML + strSocket;
            }
        }
    }
}
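
On the open question above (building a DOM node from a reader that is positioned on a complete node), XmlDocument.ReadNode may be worth a look. The following is a small sketch under that assumption; NodeFromReader and ReadCurrentElement are illustrative names, and the fragment-mode reader in the usage note is one possible setup, not the one from the post.

```csharp
using System.Xml;

static class NodeFromReader
{
    // Builds a DOM node (owned by 'owner') from the element the reader is
    // positioned on, or from the next content node if it is not on one yet.
    // The reader is left just after that node. Returns null at end of input.
    public static XmlNode ReadCurrentElement(XmlDocument owner, XmlReader reader)
    {
        reader.MoveToContent();          // skip whitespace, comments, declarations
        return owner.ReadNode(reader);   // consumes the node including its subtree
    }
}
```

With a reader created over "<a><b/></a><c/>" using ConformanceLevel.Fragment, two consecutive calls should yield the a element (with its b child) and then the c element. The returned node is not attached to the document; owner.AppendChild(node) would make it the document element if a full XmlDocument is needed. Note that ReadNode still throws an XmlException if the element is not complete in the underlying stream.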


Nov 11 '05 #5

Thomas Polan
Oleg,
Thank you for your suggestions. Using the depth property, I am able to
identify when I have a complete node.

Alexander,
I do not have my XmlReader bound to my TCP stream. Rather I point the
XmlReader to a string buffer. When I receive data from the server, I cast
the read into a global string variable. Each time I get data from the
socket, I concatenate this to my global string (I suppose I could be using a
stream instead of a string). I then try to look for a complete node. If I
find a complete node, I spawn a thread and call a procedure that loads the
complete node into an XmlDocument. Afterwards, I clear that xml from the
buffer and try to find another complete node until the XmlReader returns an
error. Here is my code:

'Requires: Imports System.Xml and Imports System.Threading
'XmlFrag -> global string variable

'Each time I receive data from the socket:

XmlFrag = XmlFrag & str 'str = read from socket converted to string
'Runs until the entire buffer is parsed or the XmlReader throws an exception
Do While XmlFrag <> ""
    parseXml()
Loop

Private Sub parseXml()
    Try
        Dim nt As NameTable = New NameTable()
        Dim nsmgr As XmlNamespaceManager = New XmlNamespaceManager(nt)
        Dim context As XmlParserContext = New XmlParserContext(Nothing, nsmgr, Nothing, XmlSpace.None)
        Dim reader As XmlTextReader = New XmlTextReader(XmlFrag, XmlNodeType.Element, context)
        Dim sXML As String = ""

        While reader.Read()
            Select Case reader.NodeType
                Case XmlNodeType.Element
                    If reader.Depth = 0 Then 'At the beginning of a top-level node
                        'Try to read to the end of the node; if an error is thrown,
                        'there was not a complete node in the read
                        sXML = reader.ReadOuterXml()
                        'Hand the XML to a pool thread (loads it into an XmlDocument)
                        ThreadPool.QueueUserWorkItem(AddressOf routeXMLString, sXML)
                        'Clear the parsed XML from the buffer. This assumes sXML has
                        'the same length as the raw text consumed from XmlFrag.
                        XmlFrag = XmlFrag.Remove(0, sXML.Length)
                        reader.Close()
                        Exit While
                    End If
            End Select
        End While
    Catch ex As Exception
        'Keep the original exception as the inner exception
        Throw New Exception(ex.Message, ex)
    End Try

End Sub

You also might want to check out this SAX parser for .NET at:
http://www.xmlforasp.net/codeSection.aspx?csID=36

-Thomas


Nov 11 '05 #6

Alexander Gnauck
hi Thomas,
> You also might want to check out this SAX parser for .NET at:
> http://www.xmlforasp.net/codeSection.aspx?csID=36

The link to the SAX parser looks very good.
Your code works too. I think you always restart the reader from the
beginning of the string. The advantage of a stream is that you can add bytes
to the stream without restarting the reader, so it would be a bit faster.
But the problem I have is that when I call reader.ReadOuterXml() inside the
while reader.Read() loop, the reader stops. In my application I could also
have two or more complete nodes to parse in the socket stream, and then the
reader shouldn't stop. So I need a way to get the current node as a string or
as an XmlDocument without stopping the reader.

Alex
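
One likely explanation for the "reader stops" symptom: ReadOuterXml already moves the reader past the element it returns, so a following Read() in the loop condition steps over the next node instead of landing on it. A loop that calls Read() only when the current node was not consumed should pick up every top-level element. This is a sketch under that assumption, using XmlReader.Create in fragment mode; OuterXmlSplitter and SplitTopLevel are names made up for the example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class OuterXmlSplitter
{
    // Returns the markup of every top-level element in a multi-root text.
    public static List<string> SplitTopLevel(string xml)
    {
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        var parts = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    // ReadOuterXml consumes the element and leaves the reader on
                    // the node that follows it, so no Read() is needed here.
                    parts.Add(reader.ReadOuterXml());
                }
                else
                {
                    reader.Read(); // initial state, whitespace, comments, etc.
                }
            }
        }
        return parts;
    }
}
```

For "<a><b/></a>" followed by "<c/>" this should return two strings, each loadable with XmlDocument.LoadXml. The returned markup may be re-serialized rather than byte-identical to the input, so its length should not be relied on for trimming a buffer. An incomplete trailing element still raises an XmlException, which the caller would catch before waiting for more data.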


Nov 11 '05 #7
