libxml2 and XPath - Iterate through repeating elements?

I'm trying to iterate through repeating elements to extract data using
libxml2 but I'm having zero luck - any help would be appreciated.

My XML source is similar to the following - I'm trying to extract the
line number and product code from the repeating line elements:

<order xmlns="some-ns">
<header>
<orderno>123456</orderno>
</header>
<lines>
<line>
<lineno>1</lineno>
<productcode>PENS</productcode>
</line>
<line>
<lineno>2</lineno>
<productcode>STAPLER</productcode>
</line>
<line>
<lineno>3</lineno>
<productcode>RULER</productcode>
</line>
</lines>
</order>

With the following code I can get at the non-repeating elements in the
header, and get the lines elements, but cannot extract the
lineno/productcode data via xpath:

import libxml2

XmlDoc = libxml2.parseFile(XmlFile)
XPathDoc = XmlDoc.xpathNewContext()
XPathDoc.xpathRegisterNs('so', "some-ns")

# Extract data from the order header
PurchaseOrderNo = XPathDoc.xpathEval('//so:order/so:header/so:orderno')

# Extract data from the order lines
for line in XPathDoc.xpathEval('//so:order/so:lines/so:line'):
    print(line.content)

# Explicitly free Xml document and XPath context
XmlDoc.freeDoc()
XPathDoc.xpathFreeContext()

Ideally, I'd like to select the line data using xpath (similar to an
XSLT query after a 'for-each' - i.e. xpathEval('so:lineno') and
xpathEval('so:productcode') once I've got the line element).

Any suggestions greatly appreciated!

Cheers, Nick.

Dec 2 '05 #1
nickhepples...@gmail.com wrote:
I'm trying to iterate through repeating elements to extract data using
libxml2 but I'm having zero luck - any help would be appreciated.


Here's how I attempt to solve the problem using libxml2dom [1] (and I
imagine others will suggest their own favourite modules, too):

import libxml2dom
d = libxml2dom.parseFile(filename)
order_numbers = d.xpath("//so:order/so:header/so:orderno",
                        namespaces={"so": "some-ns"})

At this point, you have a list of nodes. (I imagine that whatever
object the libxml2 module API produces probably has those previous and
next attributes to navigate the result list instead.) The nodes in the
list represent the orderno elements in this case, and in libxml2dom you
can choose to invoke the usual DOM methods on such node objects, or
even the toString method if you want the document text. For the line
items...

lines = d.xpath("//so:order/so:lines/so:line",
                namespaces={"so": "some-ns"})
for line in lines:
    print(line.toString())

I can't remember what the libxml2 module produces for the content
attribute of a node, although the underlying libxml2 API produces a
"text-only" representation of the document text, as opposed to the
actual document text that the toString method produces in the above
example. I imagine that an application working with the line item
information would use additional DOM or XPath processing to get the
line item index and the product code directly.
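
The difference between a "text-only" view and the serialized document text can be sketched as follows. This is only an illustration and it swaps in the standard library's xml.etree.ElementTree rather than libxml2 or libxml2dom, so the calls shown (itertext, tostring) are ElementTree's, not the API discussed in this thread. The XML sample is a cut-down copy of the order document from the question.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the order document from the question.
XML = """<order xmlns="some-ns">
  <lines>
    <line>
      <lineno>1</lineno>
      <productcode>PENS</productcode>
    </line>
  </lines>
</order>"""

# Prefix-to-URI map; the prefix "so" is arbitrary, the URI must match the document.
NS = {"so": "some-ns"}

root = ET.fromstring(XML)
line = root.find("so:lines/so:line", NS)

# "Text-only" view: every text node under the element, concatenated, no tags.
text_only = "".join(line.itertext())

# Serialized view: the element's markup, tags included.
serialized = ET.tostring(line, encoding="unicode")

print(repr(text_only))
print(serialized)
```

The first value holds only the character data (the line number and product code, plus the whitespace between elements), while the second holds the markup itself. For extracting fields, neither is ideal on its own, which is why a further XPath or DOM lookup per line element is the usual next step.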

Anyway, I recommend libxml2dom because if you're already using the
bundled libxml2 module, you should be able to install libxml2dom and
plug into the same infrastructure that the bundled module depends upon.
Moreover, libxml2dom is a "pure Python" package that doesn't require
any extension module compilation, so it should be quite portable to
whatever platform you're using.

Paul

[1] http://www.python.org/pypi/libxml2dom

Dec 2 '05 #2
On Friday 2 December 2005 18:31, ni************@gmail.com wrote:
I'm trying to iterate through repeating elements to extract data using
libxml2 but I'm having zero luck - any help would be appreciated.

My XML source is similar to the following - I'm trying to extract the
line number and product code from the repeating line elements:

<order xmlns="some-ns">
<header>
<orderno>123456</orderno>
</header>
<lines>
<line>
<lineno>1</lineno>
<productcode>PENS</productcode>
</line>
<line>
<lineno>2</lineno>
<productcode>STAPLER</productcode>
</line>
<line>
<lineno>3</lineno>
<productcode>RULER</productcode>
</line>
</lines>
</order>

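The result of an XPath evaluation is a list of nodes, and each of those nodes
can serve as the context for a further XPath evaluation. Because the document
declares a default namespace, the relative expressions need the registered
prefix too, so the sketch below reuses one context and points it at each line
node with setContextNode():

import libxml2
doc = libxml2.parseFile(XmlFile)
ctxt = doc.xpathNewContext()
# The elements live in the default namespace, so bind a prefix to it
ctxt.xpathRegisterNs('so', 'some-ns')
for line_node in ctxt.xpathEval('//so:order/so:lines/so:line'):
    # Evaluate the relative expressions against the current line element
    ctxt.setContextNode(line_node)
    print(ctxt.xpathEval('so:lineno')[0].content)
    print(ctxt.xpathEval('so:productcode')[0].content)
ctxt.xpathFreeContext()
doc.freeDoc()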
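
For anyone who wants to try the nested iteration without the libxml2 bindings installed, here is a minimal sketch of the same idea using the standard library's xml.etree.ElementTree. It is a different API from the one used in this thread, and the names extract_lines, NS and XML are made up for the illustration. It also shows why unprefixed paths find nothing in a document with a default namespace.

```python
import xml.etree.ElementTree as ET

# The order document from the question, inlined so the sketch is self-contained.
XML = """<order xmlns="some-ns">
  <header><orderno>123456</orderno></header>
  <lines>
    <line><lineno>1</lineno><productcode>PENS</productcode></line>
    <line><lineno>2</lineno><productcode>STAPLER</productcode></line>
    <line><lineno>3</lineno><productcode>RULER</productcode></line>
  </lines>
</order>"""

# Prefix-to-URI map; the prefix is arbitrary, the URI must match the xmlns value.
NS = {"so": "some-ns"}


def extract_lines(xml_text):
    """Return (lineno, productcode) pairs, one per repeating line element."""
    root = ET.fromstring(xml_text)
    rows = []
    # Outer query selects the repeating elements...
    for line in root.findall("so:lines/so:line", NS):
        # ...inner queries are relative to the current line element.
        lineno = int(line.findtext("so:lineno", namespaces=NS))
        code = line.findtext("so:productcode", namespaces=NS)
        rows.append((lineno, code))
    return rows


root = ET.fromstring(XML)

# Unprefixed names only match elements in no namespace, so this finds nothing.
unprefixed = root.findall("lines/line")

order_no = root.findtext("so:header/so:orderno", namespaces=NS)
rows = extract_lines(XML)

print(order_no)
for lineno, code in rows:
    print(lineno, code)
```

The pattern is the same as with libxml2: one query for the repeating elements, then relative, namespace-qualified queries against each result. Only the calls differ.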

--
Cordially

Jean-Roch SOTTY
Dec 5 '05 #3
