
How to Parse Mixed Content

 
  #1  
Old April 3rd, 2007, 02:25 PM
Iain
Guest
 
Posts: n/a
Default How to Parse Mixed Content

I've spent a while researching this, and my analysis so far indicates it
can't easily be done in .NET.

What I want to do is take XML like this:
<stuff>
<node id="1"/>Now<node id="2"/> <node id="7"/>is<node id="14"/> <node
id="15"/>the<node id="19"/> <node id="20"/>winter<node id="21"/>
</stuff>

and extract a list of node identifiers and text (the text can be whitespace,
as in this example).

This seems impossible with XmlSerializer (which is a shame, as this is
embedded in a somewhat complex XML file!).

It *may* be possible with XmlReader, but I'm not too clear on how.

Any help would be much appreciated!

Iain

  #2  
Old April 3rd, 2007, 02:55 PM
Martin Honnen
Guest
 
Posts: n/a
Default Re: How to Parse Mixed Content

Iain wrote:
Quote:
what I want to do is to take xml like this
<stuff>
<node id="1"/>Now<node id="2"/> <node id="7"/>is<node id="14"/> <node
id="15"/>the<node id="19"/> <node id="20"/>winter<node id="21"/>
</stuff>

and extract a list with node identifiers and text (which can be whitespace
as in this example)

Here is a .NET 2.0 XmlReader example:

// Requires: using System; using System.Xml;
using (XmlReader xmlReader = XmlReader.Create(@"file.xml")) {
  while (xmlReader.Read()) {
    if (xmlReader.NodeType == XmlNodeType.Element &&
        xmlReader.Name == "stuff") {
      // Walk the children of <stuff> and stop only at its own end tag.
      while (xmlReader.Read() &&
             !(xmlReader.NodeType == XmlNodeType.EndElement &&
               xmlReader.Name == "stuff")) {
        switch (xmlReader.NodeType) {
          case XmlNodeType.Element:
            Console.WriteLine("Found element {0} with id: {1}.",
                xmlReader.Name, xmlReader.GetAttribute("id"));
            break;
          case XmlNodeType.Text:
            Console.WriteLine("Found text node with contents \"{0}\"",
                xmlReader.Value);
            break;
          case XmlNodeType.Whitespace:
            Console.WriteLine("Found white space \"{0}\"",
                xmlReader.Value);
            break;
        }
      }
    }
  }
}

Output for that sample is

Found white space "
"
Found element node with id: 1.
Found text node with contents "Now"
Found element node with id: 2.
Found white space " "
Found element node with id: 7.
Found text node with contents "is"
Found element node with id: 14.
Found white space " "
Found element node with id: 15.
Found text node with contents "the"
Found element node with id: 19.
Found white space " "
Found element node with id: 20.
Found text node with contents "winter"
Found element node with id: 21.
Found white space "
"
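
To get the list asked for in the original question rather than console output, the same kind of loop can collect pairs instead of printing them. The following is only a sketch under some assumptions: the class name MixedContent and the method Parse are made up for illustration, each node id is paired with whatever text follows it, and any text before the first node is dropped.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the original example.
public static class MixedContent
{
    // Pairs each child element's id with the text that follows it inside <stuff>.
    public static List<KeyValuePair<string, string>> Parse(TextReader input)
    {
        List<KeyValuePair<string, string>> pairs =
            new List<KeyValuePair<string, string>>();

        using (XmlReader reader = XmlReader.Create(input))
        {
            // Nothing to collect if <stuff> is missing or written as <stuff/>.
            if (!reader.ReadToFollowing("stuff") || reader.IsEmptyElement)
                return pairs;

            string id = null;      // id of the most recent child element
            string text = "";      // text seen since that element
            bool seenNode = false;

            reader.Read();         // step onto the first child of <stuff>
            while (!reader.EOF && reader.NodeType != XmlNodeType.EndElement)
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    // A new element closes the pair for the previous one.
                    if (seenNode)
                        pairs.Add(new KeyValuePair<string, string>(id, text));
                    id = reader.GetAttribute("id");
                    text = "";     // also discards text before the first element
                    seenNode = true;
                    reader.Skip(); // move past the element and anything inside it
                }
                else
                {
                    if (reader.NodeType == XmlNodeType.Text ||
                        reader.NodeType == XmlNodeType.CDATA ||
                        reader.NodeType == XmlNodeType.Whitespace ||
                        reader.NodeType == XmlNodeType.SignificantWhitespace)
                    {
                        text += reader.Value;
                    }
                    reader.Read();
                }
            }

            // The last element has no successor, so flush it here.
            if (seenNode)
                pairs.Add(new KeyValuePair<string, string>(id, text));
        }
        return pairs;
    }
}
```

Calling MixedContent.Parse(new StringReader(xml)) on the sample should give one entry per node, with whitespace-only text kept as the value where nothing else follows the node. Because Skip() moves past nested content, the only end tag the loop can stop on is the one closing stuff.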

--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
  #3  
Old April 3rd, 2007, 03:15 PM
Iain
Guest
 
Posts: n/a
Default Re: How to Parse Mixed Content

On Tue, 03 Apr 2007 16:44:51 +0200, Martin Honnen wrote:
Quote:
Here is a .NET 2.0 XmlReader example:

Wow.

Thanks!

Do you happen to know if I can zip this into an xml tree otherwise
deserialized by XmlSerializer?

Iain
  #4  
Old April 4th, 2007, 11:55 AM
Martin Honnen
Guest
 
Posts: n/a
Default Re: How to Parse Mixed Content

Iain wrote:
Quote:
Do you happen to know if I can zip this into an xml tree otherwise
deserialized by XmlSerializer?

I am not sure what you want to achieve but I don't think there is an
easy way, you would need to implement a custom XmlReader.


--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
  #5  
Old April 4th, 2007, 12:55 PM
Iain
Guest
 
Posts: n/a
Default Re: How to Parse Mixed Content

On Wed, 04 Apr 2007 13:42:17 +0200, Martin Honnen wrote:
Quote:
> Iain wrote:
>> Do you happen to know if I can zip this into an xml tree otherwise
>> deserialized by XmlSerializer?
>
> I am not sure what you want to achieve but I don't think there is an
> easy way, you would need to implement a custom XmlReader.

Thanks Martin.

What I wanted to do was use XmlSerializer to serialise and deserialise
everything except the mixed content section, which I would handle with
XmlReader by overriding the XmlSerializer for that particular class in
some way. There appeared to be no obvious way of doing this, so I used
brute force!

I've ended up coding the whole structure up (the sample plus a half dozen
other entities) with XmlReader which is a pain. If I could have done it
with XmlSerializer it would have taken 10 mins not 2 hours.

Now I'm struggling to cope with whitespace.

Nonetheless it's all working (more or less).

Thanks again...

Iain
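
Follow-up note: one way to get the split described above (XmlSerializer for everything except the mixed content section) may be to have the class for that section implement IXmlSerializable, which lets it read and write its own XML with XmlReader and XmlWriter while XmlSerializer handles the rest. This is a sketch only, not something proposed in the thread. The type names Document, MixedSection, Piece and Demo and the title element are hypothetical; only stuff and node with its id attribute come from the sample. It also assumes the XmlReader is created with XmlReader.Create and passed to Deserialize, so that whitespace-only nodes are still reported.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical outer type: deserialized by XmlSerializer as usual.
public class Document
{
    [XmlElement("title")]
    public string Title;

    [XmlElement("stuff")]
    public MixedSection Section;
}

// One item of mixed content: either a node id or a run of text.
public class Piece
{
    public string NodeId; // non-null for <node id="..."/>
    public string Text;   // non-null for text or whitespace
}

// Hypothetical mixed content section that reads and writes itself.
public class MixedSection : IXmlSerializable
{
    public List<Piece> Pieces = new List<Piece>();

    public XmlSchema GetSchema() { return null; }

    public void ReadXml(XmlReader reader)
    {
        // The reader starts on the section's start tag.
        if (reader.IsEmptyElement) { reader.Read(); return; }
        reader.Read(); // step onto the first child
        while (!reader.EOF && reader.NodeType != XmlNodeType.EndElement)
        {
            Piece piece = new Piece();
            if (reader.NodeType == XmlNodeType.Element)
            {
                piece.NodeId = reader.GetAttribute("id") ?? "";
                Pieces.Add(piece);
                reader.Skip(); // move past the child element
            }
            else
            {
                if (reader.NodeType == XmlNodeType.Text ||
                    reader.NodeType == XmlNodeType.Whitespace ||
                    reader.NodeType == XmlNodeType.SignificantWhitespace)
                {
                    piece.Text = reader.Value;
                    Pieces.Add(piece);
                }
                reader.Read();
            }
        }
        reader.ReadEndElement(); // consume the section's end tag
    }

    public void WriteXml(XmlWriter writer)
    {
        // The serializer writes the enclosing start and end tags itself.
        foreach (Piece piece in Pieces)
        {
            if (piece.NodeId != null)
            {
                writer.WriteStartElement("node");
                writer.WriteAttributeString("id", piece.NodeId);
                writer.WriteEndElement();
            }
            else
            {
                writer.WriteString(piece.Text);
            }
        }
    }
}

public static class Demo
{
    // Deserializes through an explicitly created XmlReader (see the note above).
    public static Document Load(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Document));
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            return (Document)serializer.Deserialize(reader);
        }
    }
}
```

The reason for passing an explicit XmlReader is the whitespace: as far as I know, the Deserialize overloads that take a Stream or TextReader build their own reader, which may not report whitespace-only nodes, so the single spaces between nodes could be lost.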
 
