XPath subtree pattern matching

Hello -

Is there any way to match complex subtree patterns with XPath? The
functions I see all seem to match along a single path from root to leaf.
I would like to match full subtrees.

For example, given the XHTML:

<html>
  <body>
    <p>
      <a>#text</a>
      <br/>
      #text
      <b>#text</b>
      #text
      <br/>
      <font>
        <a>#text</a>
      </font>
    </p>
    <p>
      <a>#text</a>
      <br/>
      #text
      <br/>
      <font>
        <a>#text</a>
      </font>
    </p>
  </body>
</html>

I would like to construct a "pattern" using XPath to match all subtrees
like:

<p>
  <a>*</a>
  <br/>
  *
  (<b>*</b>)?
  (*)?
  <br/>
  <font>
    <a>*</a>
  </font>
</p>

where the "*" means that any text can be matched, and the "?" means that
0 or 1 instances of the item may be matched, similar to a regular
expression.

Is there an easy way to do this kind of "subtree pattern matching" in
XPath? Would I be better off writing a wrapper over XPath and using
several XPath queries to represent and retrieve my pattern?

Thanks in advance,

Andrew Hogue

Jul 20 '05 #1
"ahogue at theory dot lcs dot mit dot edu" <"ahogue at theory dot lcs dot
mit dot edu"> wrote:
> Hello -
>
> Is there any way to match complex subtree patterns with XPath? The
> functions I see all seem to match along a single path from root to leaf.
> I would like to match full subtrees.


XPath is basically a tree language, not a path language, so you *can*
specify tree patterns. This is usually done by using qualifiers. To match
e.g.

<f>
  <a/>
  <b>Text</b>
  <c>Other Text</c>
</f>

and select "Text", an XPath expression could be used as follows:
f/b[preceding-sibling::a][following-sibling::c]
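
To make the two qualifiers concrete, here is a minimal Python sketch of what they test. It is an emulation, not an XPath evaluation: the standard-library ElementTree module supports only a subset of XPath without the sibling axes, so the sibling checks are written by hand. The function name b_between_a_and_c is made up for this illustration.

```python
import xml.etree.ElementTree as ET

DOC = "<f><a/><b>Text</b><c>Other Text</c></f>"

def b_between_a_and_c(f):
    """Return the <b> children of f that have an <a> sibling somewhere
    before them and a <c> sibling somewhere after them."""
    children = list(f)
    hits = []
    for i, child in enumerate(children):
        if child.tag != "b":
            continue
        # [preceding-sibling::a]: at least one earlier sibling named a
        has_a_before = any(s.tag == "a" for s in children[:i])
        # [following-sibling::c]: at least one later sibling named c
        has_c_after = any(s.tag == "c" for s in children[i + 1:])
        if has_a_before and has_c_after:
            hits.append(child)
    return hits

root = ET.fromstring(DOC)
matches = [b.text for b in b_between_a_and_c(root)]
print(matches)
```

Note that each qualifier only asks whether such a sibling exists anywhere before or after; neither requires the sibling to be directly adjacent.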

However, tree matching in XPath has two restrictions:
1. It is not "nice", since you basically encode the tree in a linear
representation that is not straightforward to read, as it does not
resemble the XML document.
2. It is not possible to select content at several positions (e.g.
"Text" and "Other Text" together).

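One possible way around both restrictions is the wrapper idea from the original question. The sketch below is an assumption-laden illustration, not an XPath feature: it flattens the children of each <p> into a token string and matches that with an ordinary regular expression, so "?" keeps its usual meaning and several positions can be returned together. The sample document, the names tokens, P_PATTERN and match_p, and the choice to return the two link texts are all invented for this example.

```python
import re
import xml.etree.ElementTree as ET

# Sample data modelled on the question; the text values are made up.
XHTML = (
    "<html><body>"
    "<p><a>one</a><br/>text<b>bold</b>more<br/><font><a>two</a></font></p>"
    "<p><a>three</a><br/>text<br/><font><a>four</a></font></p>"
    "<p><a>five</a><b>bold</b></p>"
    "</body></html>"
)

def tokens(elem):
    """Flatten elem's direct children into a list of tag names,
    with '#text' standing for each run of non-blank text."""
    out = []
    if elem.text and elem.text.strip():
        out.append("#text")
    for child in elem:
        out.append(child.tag)
        if child.tail and child.tail.strip():
            out.append("#text")
    return out

# a, br, text, optional b, optional text, br, font
P_PATTERN = re.compile(r"a br #text( b)?( #text)? br font")

def match_p(p):
    """Return (first link text, last link text) if p fits the pattern,
    otherwise None."""
    if not P_PATTERN.fullmatch(" ".join(tokens(p))):
        return None
    font = p.find("font")
    # The <font> child must contain exactly one element, an <a>.
    if [child.tag for child in font] != ["a"]:
        return None
    return p.find("a").text, font.find("a").text

root = ET.fromstring(XHTML)
results = [m for m in map(match_p, root.iter("p")) if m is not None]
print(results)
```

The first two paragraphs fit the pattern (with and without the optional items) and the third does not. The sketch checks only one level of nesting below <p> plus the content of <font>; deeper patterns would need the same idea applied recursively.
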
I don't want to advertise too much again, but if you want a language
with "real" tree patterns, you might want to have a look at
http://www.xcerpt.org.

--
Sebastian


Jul 20 '05 #2
As easy as:

node()[count(ancestor-or-self::someNode | theRoot-someNode)
       =
       count(ancestor-or-self::someNode)
      ]

This matches all nodes of the tree with root theRoot-someNode, which is a
specific "someNode" element.
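
The count comparison works as a set-membership test: adding the chosen root to the node's set of "someNode" ancestors (including the node itself) leaves the size unchanged only if the root was already in that set. Below is a rough Python sketch of the same test, assuming an invented sample document and the hypothetical names parent, ancestor_or_self and in_subtree. ElementTree has no parent pointers and no count(), so both are emulated, and only element nodes are considered, whereas node() in XPath also covers text nodes, comments and so on.

```python
import xml.etree.ElementTree as ET

# Invented sample: two someNode elements, we pick the first as the root.
root = ET.fromstring(
    "<r>"
    "<someNode id='1'><x><y/></x></someNode>"
    "<someNode id='2'><z/></someNode>"
    "</r>"
)

# ElementTree elements do not know their parent, so build a lookup.
parent = {child: p for p in root.iter() for child in p}

def ancestor_or_self(node):
    """Return node followed by all of its ancestors."""
    chain = []
    while node is not None:
        chain.append(node)
        node = parent.get(node)
    return chain

the_root = root.find("someNode[@id='1']")

def in_subtree(node):
    """Emulate count(ancestor-or-self::someNode | the_root)
    = count(ancestor-or-self::someNode)."""
    anc = {n for n in ancestor_or_self(node) if n.tag == "someNode"}
    return len(anc | {the_root}) == len(anc)

selected = [n.tag for n in root.iter() if in_subtree(n)]
print(selected)
```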

If we simply want to select all nodes of a given tree, we can use the
following simpler XPath expression. It is not a match pattern, because
the location steps (not the predicates) of a match pattern may only use
the child and attribute axes:

theRoot-someNode//descendant-or-self::node()

This selects all nodes of the tree rooted at a "theRoot-someNode" element.
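
For comparison, a small sketch of the same selection in Python's standard library, where Element.iter() walks an element and all of its descendants in document order. The sample document is invented, and only element nodes are returned (text is stored on the elements rather than as separate nodes).

```python
import xml.etree.ElementTree as ET

# Invented sample: select everything under the someNode element.
root = ET.fromstring("<r><someNode><x>hi<y/></x></someNode><other/></r>")
the_root = root.find("someNode")

# iter() yields the element itself first, then its descendants.
subtree_tags = [e.tag for e in the_root.iter()]
print(subtree_tags)
```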

=====
Cheers,

Dimitre Novatchev.
http://fxsl.sourceforge.net/ -- the home of FXSL
Jul 20 '05 #3
