libxml2 and XPath - Iterate through repeating elements?

I'm trying to iterate through repeating elements to extract data using
libxml2 but I'm having zero luck - any help would be appreciated.

My XML source is similar to the following - I'm trying to extract the
line number and product code from the repeating line elements:

<order xmlns="some-ns">
<header>
<orderno>123456</orderno>
</header>
<lines>
<line>
<lineno>1</lineno>
<productcode>PENS</productcode>
</line>
<line>
<lineno>2</lineno>
<productcode>STAPLER</productcode>
</line>
<line>
<lineno>3</lineno>
<productcode>RULER</productcode>
</line>
</lines>
</order>

With the following code I can get at the non-repeating elements in the
header, and get the lines elements, but cannot extract the
lineno/productcode data via xpath:

import libxml2

XmlDoc = libxml2.parseFile(XmlFile)
XPathDoc = XmlDoc.xpathNewContext()
XPathDoc.xpathRegisterNs('so', "some-ns")
# Extract data from the order header
PurchaseOrderNo = XPathDoc.xpathEval('//so:order/so:header/so:orderno')

# Extract data from the order lines
for line in XPathDoc.xpathEval('//so:order/so:lines/so:line'):
    print line.content

# Explicitly free Xml document and XPath context
XmlDoc.freeDoc()
XPathDoc.xpathFreeContext()

Ideally, I'd like to select the line data using xpath (similar to an
XSLT query after a 'for-each' - i.e. xpathEval('so:lineno') and
xpathEval('so:productcode') once I've got the line element).

Any suggestions greatly appreciated!

Cheers, Nick.

Dec 2 '05 #1
nickhepples...@gmail.com wrote:
I'm trying to iterate through repeating elements to extract data using
libxml2 but I'm having zero luck - any help would be appreciated.


Here's how I attempt to solve the problem using libxml2dom [1] (and I
imagine others will suggest their own favourite modules, too):

import libxml2dom
d = libxml2dom.parseFile(filename)
order_numbers = d.xpath("//so:order/so:header/so:orderno",
                        namespaces={"so": "some-ns"})

At this point, you have a list of nodes. (I imagine that whatever
object the libxml2 module API produces probably has those previous and
next attributes to navigate the result list instead.) The nodes in the
list represent the orderno elements in this case, and in libxml2dom you
can choose to invoke the usual DOM methods on such node objects, or
even the toString method if you want the document text. For the line
items...

lines = d.xpath("//so:order/so:lines/so:line",
                namespaces={"so": "some-ns"})
for line in lines:
    print line.toString()

I can't remember what the libxml2 module produces for the content
attribute of a node, although the underlying libxml2 API produces a
"text-only" representation of the document text, as opposed to the
actual document text that the toString method produces in the above
example. I imagine that an application working with the line item
information would use additional DOM or XPath processing to get the
line item index and the product code directly.
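
As a rough illustration of that "additional processing" step, here is a minimal sketch that pulls the line number and product code out of each line element. It deliberately swaps in the standard library's xml.etree.ElementTree rather than libxml2 or libxml2dom, and it is written for Python 3, so treat it as an analogy and not as either module's API. The names ORDER_XML, NS and extract_lines are made up for this example; the sample document is the one from the question.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question; "some-ns" is its default namespace.
ORDER_XML = """<order xmlns="some-ns">
<header><orderno>123456</orderno></header>
<lines>
<line><lineno>1</lineno><productcode>PENS</productcode></line>
<line><lineno>2</lineno><productcode>STAPLER</productcode></line>
<line><lineno>3</lineno><productcode>RULER</productcode></line>
</lines>
</order>"""

# The prefix "so" is arbitrary; only the namespace URI has to match.
NS = {"so": "some-ns"}


def extract_lines(xml_text):
    """Return a list of (lineno, productcode) tuples, one per line element."""
    root = ET.fromstring(xml_text)
    rows = []
    for line in root.findall("so:lines/so:line", NS):
        # These lookups are relative to the current line element.
        lineno = line.findtext("so:lineno", namespaces=NS)
        code = line.findtext("so:productcode", namespaces=NS)
        rows.append((int(lineno), code))
    return rows


for lineno, code in extract_lines(ORDER_XML):
    print(lineno, code)
```

The point of the sketch is the shape of the loop: select the repeating elements once, then run short relative queries against each one, much like the body of an XSLT for-each.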

Anyway, I recommend libxml2dom because if you're already using the
bundled libxml2 module, you should be able to install libxml2dom and
plug into the same infrastructure that the bundled module depends upon.
Moreover, libxml2dom is a "pure Python" package that doesn't require
any extension module compilation, so it should be quite portable to
whatever platform you're using.

Paul

[1] http://www.python.org/pypi/libxml2dom

Dec 2 '05 #2
On Friday 2 December 2005 18:31, ni************@gmail.com wrote:
I'm trying to iterate through repeating elements to extract data using
libxml2 but I'm having zero luck - any help would be appreciated.

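The result of an XPath evaluation is a list of nodes, and you can evaluate
further XPath expressions relative to each node. Because your document
declares a default namespace, the relative expressions need a registered
prefix as well, so set the context node on your XPath context:

import libxml2
doc = libxml2.parseFile(XmlFile)
ctxt = doc.xpathNewContext()
# The document uses a default namespace, so unprefixed names match nothing;
# register a prefix for it and use that prefix in every step.
ctxt.xpathRegisterNs('so', 'some-ns')
for line_node in ctxt.xpathEval('/so:order/so:lines/so:line'):
    # Evaluate the following expressions relative to this line element
    ctxt.setContextNode(line_node)
    print ctxt.xpathEval('so:lineno')[0].content
    print ctxt.xpathEval('so:productcode')[0].content
ctxt.xpathFreeContext()
doc.freeDoc()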
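
One thing worth checking against the sample document is the default namespace: in XPath 1.0 an unprefixed name only matches elements that are in no namespace, so a path such as lines/line finds nothing when the document declares xmlns="some-ns". The small sketch below shows the effect using the standard library's xml.etree.ElementTree (Python 3), not libxml2 itself; the variable names and the shortened document are only for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened version of the order document, keeping the default namespace.
DOC = """<order xmlns="some-ns">
<lines>
<line><lineno>1</lineno></line>
<line><lineno>2</lineno></line>
</lines>
</order>"""

root = ET.fromstring(DOC)

# Unprefixed names refer to elements in no namespace, so nothing matches.
unqualified = root.findall("lines/line")

# Mapping a prefix to the namespace URI makes the same path match.
qualified = root.findall("so:lines/so:line", {"so": "some-ns"})

print(len(unqualified), len(qualified))
```

If a relative query unexpectedly returns an empty list, a missing prefix on one of the steps is a likely cause.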

--
Cordially

Jean-Roch SOTTY
Dec 5 '05 #3
