
Finding/removing nodes with attribute value

Hi. I have a rather large xml document (object) that can have one or
more nodes with a certain attribute throughout (at ANY depth, not at
the same level necessarily). I need to find this attribute and remove
the containing node (and child nodes) if it has a certain value.

I'm able to find the attributes using an XmlTextReader. Once found,
can someone help me get the XPath at that point? I would then use
this to remove the node from the document object.

Or can someone suggest another approach to navigate the xml and find
and remove the nodes? Thanks.
Nov 12 '05 #1
"Greg" <gr********@earthlink.net> wrote in message news:c5**************************@posting.google.com...
Hi. I have a rather large xml document (object) that can have one or
more nodes with a certain attribute throughout (at ANY depth, not at
the same level necessarily). I need to find this attribute and remove
the containing node (and child nodes) if it has a certain value.


Since you can find these attributes in your XmlTextReader, why bother
with loading it into a XmlDocument and then using XPath to remove
the nodes when you can filter them out all within your XmlTextReader?

Here's an example XmlTextReader named NodeStripReader that takes
out all elements marked with an attribute named exclude (whatever its
value, although it should be trivial to fine-tune the filter to apply whatever
attribute value criteria your application requires), including all child nodes
of these elements.

- - - NodeStripReader.cs
using System;
using System.IO;
using System.Text;
using System.Xml;

namespace NS1
{
    /// <summary>
    /// Specialized <see cref="System.Xml.XmlTextReader"/> that strips out elements
    /// marked with an <c>exclude</c> attribute, and the child nodes of an excluded
    /// element, in a streaming manner.
    /// </summary>
    public class NodeStripReader : XmlTextReader
    {
        private string excludeAttrName;

        /// <summary>
        /// Creates a <see cref="NodeStripReader"/> that excludes all nodes beneath
        /// an element having an attribute named <c>exclude</c>.
        /// </summary>
        /// <param name="source">
        /// The <see cref="System.IO.TextReader">TextReader</see> serving as the
        /// source for the streaming XML.
        /// </param>
        public NodeStripReader(TextReader source) : base(source)
        {
            excludeAttrName = NameTable.Add("exclude");
        }

        /// <summary>
        /// Excludes the element at the <see cref="NodeStripReader"/>'s current
        /// position, together with all nodes beneath it.
        /// </summary>
        /// <returns>
        /// True when there are nodes remaining in the document to read.
        /// </returns>
        protected bool ConsumeExcludedNodes()
        {
            // An empty element such as <a exclude="x"/> has no child nodes and
            // no EndElement, so a single Read( ) is enough to step past it.
            //
            if (base.IsEmptyElement)
            {
                return base.Read();
            }

            // moreToRead is necessary in case I exclude the document element,
            // and Read( ) needs to bubble up that the end-of-document has been
            // reached.
            //
            bool moreToRead;

            // As long as I haven't ascended to at least the next sibling
            // of the node I started excluding at, I keep consuming nodes.
            // moreToConsume lets me check this for equality after a first
            // pass through the loop, since I must consume at least one
            // node.
            //
            int startingDepth = base.Depth;
            bool moreToConsume = true;

            // As long as there are nodes, and I'm not finished consuming them.
            // base.Read( ) is evaluated first on purpose: once moreToConsume
            // turns false, this last Read( ) steps past the closing tag.
            //
            while ((moreToRead = base.Read()) && moreToConsume)
            {
                if (base.NodeType == XmlNodeType.EndElement)
                {
                    // An EndElement reports the same Depth as its start tag,
                    // so when the depths match I've reached the end of the
                    // element I meant to exclude.
                    //
                    moreToConsume = (startingDepth != base.Depth);
                }
                // XmlNodeTypes of Element, Text, Comment, everything beneath
                // the Element I started to exclude are all quietly eaten.
            }

            return moreToRead;
        }

        /// <summary>
        /// Reads nodes from the source, excluding any elements (and their
        /// descendants) that carry an attribute named <c>exclude</c>.
        /// </summary>
        /// <returns>
        /// True when nodes remain in the document to be read.
        /// </returns>
        /// <remarks>
        /// <seealso cref="System.Xml.XmlTextReader.Read"/>
        /// </remarks>
        public override bool Read()
        {
            // Must call base Read( ) to do anything, including initial transition
            // from ReadState.Initial -> ReadState.Interactive.
            //
            bool moreToRead = base.Read();

            // Nothing to do if this read was the End-of-Document. Otherwise, if
            // this node is the start of an element, check for the presence of
            // an exclude attribute. This is a loop rather than a single test
            // because the node I land on after consuming an excluded element
            // may itself be an excluded element (e.g., two excluded siblings
            // in a row).
            //
            while (moreToRead
                && base.NodeType == XmlNodeType.Element
                && base.GetAttribute(excludeAttrName, "") != null)
            {
                // Consume all descendant-or-self nodes at this point;
                // the reader is at the next node after the excluded element
                // (a sibling, or the parent's end tag) or the end-of-document
                // when this call returns.
                //
                moreToRead = this.ConsumeExcludedNodes();
            }

            return moreToRead;
        }
    }
}
- - -

All pretty simple, and more efficient than building XPath expressions (I mean,
you could use a Stack as your XmlTextReader subclass descends to save off
the element names until you get to an excluded element, and then as soon as
you do you'd store that Stack and use it later to build the location steps of an
XPath expression from the bottom-up).
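
For comparison, the Stack idea in the parenthetical above could look roughly like the following. This is only a sketch under some assumptions: XPathLocator and FindPaths are made-up names, the positional predicates ([1], [2], ...) are an addition so that same-named siblings can be told apart, and the document is assumed not to use namespace prefixes (prefixed names would need an XmlNamespaceManager when the path is evaluated later).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper (not from the thread): records an XPath for every
// element that carries attrName with the value attrValue.
public static class XPathLocator
{
    public static List<string> FindPaths(TextReader source, string attrName, string attrValue)
    {
        List<string> paths = new List<string>();

        // Location steps of the elements that are currently open.
        Stack<string> steps = new Stack<string>();

        // One name -> count table per open element, used to number its children.
        Stack<Dictionary<string, int>> siblingCounts = new Stack<Dictionary<string, int>>();
        siblingCounts.Push(new Dictionary<string, int>()); // document level

        using (XmlTextReader reader = new XmlTextReader(source))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    // Position of this element among same-named siblings.
                    Dictionary<string, int> counts = siblingCounts.Peek();
                    int seen;
                    counts.TryGetValue(reader.Name, out seen); // 0 when absent
                    seen++;
                    counts[reader.Name] = seen;

                    steps.Push(reader.Name + "[" + seen + "]");

                    if (reader.GetAttribute(attrName) == attrValue)
                    {
                        paths.Add(BuildPath(steps));
                    }

                    if (reader.IsEmptyElement)
                    {
                        steps.Pop(); // no EndElement will follow
                    }
                    else
                    {
                        siblingCounts.Push(new Dictionary<string, int>());
                    }
                }
                else if (reader.NodeType == XmlNodeType.EndElement)
                {
                    siblingCounts.Pop();
                    steps.Pop();
                }
            }
        }

        return paths;
    }

    // Builds the path from the bottom up: a Stack enumerates top-first,
    // so the steps are reversed before they are joined.
    private static string BuildPath(Stack<string> steps)
    {
        string[] ordered = steps.ToArray();
        Array.Reverse(ordered);
        return "/" + string.Join("/", ordered);
    }
}
```

Each returned string could then be handed to XmlDocument.SelectSingleNode. Note that this means reading the XML twice (once to find the paths, once to load the document), which is why filtering inside the reader is the cheaper route.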

Here's a test app to demonstrate how NodeStripReader can be applied to a
situation such as yours.

- - - TestApp.cs
using System;
using System.IO;
using System.Text;
using System.Xml;

namespace NS1
{
    class NodeStripReaderApp
    {
        static void Main(string[] args)
        {
            string xml = "<root><firstChild exclude=\"true\"><grandChild/>" +
                "<text>Text</text></firstChild><secondChild>Text</secondChild></root>";
            StringReader reader = new StringReader(xml);
            NodeStripReader textReader = new NodeStripReader(reader);
            XmlDocument doc = new XmlDocument();
            doc.Load(textReader);
            Console.WriteLine(doc.OuterXml);

            Console.ReadLine();
        }
    }
}
- - -

This produces the following output,

<root><secondChild>Text</secondChild></root>

and you'll observe that the excluded element and its child nodes, the following tree,

<firstChild exclude='true'><grandChild/><text>Text</text></firstChild>

have all been filtered out from the resulting XML.
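
Since the original question only wants the node removed when the attribute has a certain value, the filter can be narrowed along these lines. Treat it as a minimal sketch: ValueStripReader, SkipCurrentElement and ValueStripDemo are names invented for illustration, and the attribute name and value are passed in by the caller instead of being fixed to exclude.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical variant of the reader above: only elements whose attribute
// attrName equals attrValue are stripped (both supplied by the caller).
public class ValueStripReader : XmlTextReader
{
    private readonly string attrName;
    private readonly string attrValue;

    public ValueStripReader(TextReader source, string attrName, string attrValue)
        : base(source)
    {
        this.attrName = attrName;
        this.attrValue = attrValue;
    }

    public override bool Read()
    {
        bool more = base.Read();

        // Loop, because the node that follows a skipped subtree may itself
        // be an element that has to be skipped.
        while (more
            && base.NodeType == XmlNodeType.Element
            && base.GetAttribute(attrName) == attrValue)
        {
            more = SkipCurrentElement();
        }

        return more;
    }

    // Moves past the current element and everything beneath it.
    // Returns false when the end of the input is reached.
    private bool SkipCurrentElement()
    {
        if (base.IsEmptyElement)
        {
            return base.Read(); // <a x="y"/> has no end tag to look for
        }

        int depth = base.Depth;
        while (base.Read())
        {
            // An EndElement reports the same Depth as its start tag.
            if (base.NodeType == XmlNodeType.EndElement && base.Depth == depth)
            {
                return base.Read(); // step onto the node after the end tag
            }
        }

        return false;
    }
}

public static class ValueStripDemo
{
    // Loads xml through the filtering reader and returns what is left.
    public static XmlDocument Load(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(new ValueStripReader(new StringReader(xml), "exclude", "true"));
        return doc;
    }
}
```

With this variant an element marked exclude="false" stays in the document, while one marked exclude="true" is dropped along with its children.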
Derek Harmon
Nov 12 '05 #2


Greg wrote:
I have a rather large xml document (object) that can have one or
more nodes with a certain attribute throughout (at ANY depth, not at
the same level necessarily). I need to find this attribute and remove
the containing node (and child nodes) if it has a certain value.

I'm able to find the attributes using an XmlTextReader. Once found,
can someone help me get the XPath at that point? I would then use
this to remove the node from the document object.

Or can someone suggest another approach to navigate the xml and find
and remove the nodes?


Well, XPath allows you to find nodes at any level, e.g.
//element-name[@attribute-name = 'attribute value']
gives you all the elements at any level with the attribute value as
specified, so I am not sure you need an XmlTextReader if you later on
want to use an XmlDocument anyway to remove nodes.
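
A rough sketch of that XPath route, under the assumption that the whole document fits comfortably in memory. XPathRemoval and RemoveMarked are illustrative names only, and //* is used in place of a specific element name so that any element carrying the attribute is matched.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper (not from the thread): removes every element, at any
// depth, whose attribute attrName has the value attrValue.
public static class XPathRemoval
{
    public static string RemoveMarked(string xml, string attrName, string attrValue)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // attrValue is spliced into the expression as-is, so this sketch
        // assumes it does not contain a single quote.
        string xpath = "//*[@" + attrName + "='" + attrValue + "']";

        // Copy the matches into a list first, so the document is not
        // modified while the XPath result is still being enumerated.
        List<XmlNode> matches = new List<XmlNode>();
        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            matches.Add(node);
        }

        foreach (XmlNode node in matches)
        {
            if (node.ParentNode != null)
            {
                node.ParentNode.RemoveChild(node); // child nodes go with it
            }
        }

        return doc.OuterXml;
    }
}
```

Called with the XML from the test app earlier in the thread, the attribute name "exclude" and the value "true", this returns <root><secondChild>Text</secondChild></root>.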

--

Martin Honnen
http://JavaScript.FAQTs.com/
Nov 12 '05 #3
