How do I get the value out of a DOM Element

I have been able to get xml.dom.minidom.parse('somefile.xml') and then
dom.getElementsByTagName('LLobjectID') to work to the point where I
get something like: [<DOM Element: LLobjectID at 0x13cba08>] which I
can get down to <DOM Element: LLobjectID at 0x13cba08> but then I
can't find any way to just get the value out from the thing!

.toxml() returns something like:
u'<LLobjectID><![CDATA[1871203]]></LLobjectID>'.

How do I just get the 1871203 out of the DOM Element?

Thanks,

Sep 27 '07 #1
kj7ny wrote:
> I have been able to get xml.dom.minidom.parse('somefile.xml') and then
> dom.getElementsByTagName('LLobjectID') to work to the point where I
> get something like: [<DOM Element: LLobjectID at 0x13cba08>] which I
> can get down to <DOM Element: LLobjectID at 0x13cba08> but then I
> can't find any way to just get the value out from the thing!
>
> .toxml() returns something like:
> u'<LLobjectID><![CDATA[1871203]]></LLobjectID>'.
>
> How do I just get the 1871203 out of the DOM Element?

It contains a CDATA section node that holds the text (in minidom this is
a kind of Text node), so you have to walk through the children to get
what you want.
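
A minimal sketch of that walk with minidom, assuming the element looks like the toxml() output quoted above. The XML string and the wrapping root element are made up for illustration, and the code is written for a current Python 3 rather than the 2.x of this thread:

```python
from xml.dom import minidom

# Hypothetical sample mirroring the toxml() output in the question.
dom = minidom.parseString(
    "<root><LLobjectID><![CDATA[1871203]]></LLobjectID></root>"
)
element = dom.getElementsByTagName("LLobjectID")[0]

# Look at each child: its node type and the character data it carries.
for child in element.childNodes:
    is_cdata = child.nodeType == child.CDATA_SECTION_NODE
    print(is_cdata, child.data)

# With a single CDATA child, its data attribute is the wanted string.
value = element.firstChild.data
```

Note that value is a string; convert it with int(value) if a number is needed.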

Alternatively, try an XML API that makes it easy to handle XML, like
ElementTree (part of the stdlib in Python 2.5) or lxml, both of which have
compatible APIs. The code would look like this:

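import xml.etree.ElementTree as etree  # or: from lxml import etree

tree = etree.parse("some_file.xml")
# ".//" searches all descendants of the root for the first match
node = tree.find(".//LLobjectID")
print(node.text)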
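
To try this without a file, here is a small self-contained sketch. The XML string is invented to match the element in the question, and etree.fromstring stands in for etree.parse. The parser folds the CDATA content into the element's text, so no child walking is needed:

```python
import xml.etree.ElementTree as etree

# Hypothetical sample document; only the LLobjectID element comes from the thread.
root = etree.fromstring(
    "<root><LLobjectID><![CDATA[1871203]]></LLobjectID></root>"
)

# find() returns None when nothing matches, so guard before reading .text.
node = root.find(".//LLobjectID")
object_id = node.text if node is not None else None
print(object_id)
```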

Stefan
Sep 27 '07 #2
On 27 Sep, 07:50, kj7ny <kj...@nakore.com> wrote:
> I have been able to get xml.dom.minidom.parse('somefile.xml') and then
> dom.getElementsByTagName('LLobjectID') to work to the point where I
> get something like: [<DOM Element: LLobjectID at 0x13cba08>] which I
> can get down to <DOM Element: LLobjectID at 0x13cba08> but then I
> can't find any way to just get the value out from the thing!
>
> .toxml() returns something like:
> u'<LLobjectID><![CDATA[1871203]]></LLobjectID>'.
>
> How do I just get the 1871203 out of the DOM Element?

DOM Level 3 provides the textContent property:

http://www.w3.org/TR/DOM-Level-3-Cor...e3-textContent

You'll find this in libxml2dom and possibly some other packages such
as pxdom. For the above case with minidom specifically (at least with
versions I've used), you need to iterate over the childNodes of the
element, obtaining the nodeValue for each node and joining them
together. Something like this might do it:

"".join([n.nodeValue for n in element.childNodes])

It's not pretty, but encapsulating stuff like this is what functions
are good for.
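
One way such a function might look, as a sketch rather than anything minidom itself provides: get_text is a hypothetical helper name and the sample XML is made up. It also recurses into child elements, because nodeValue is None for element nodes and the one-line join above would raise a TypeError on them:

```python
from xml.dom import minidom


def get_text(node):
    """Concatenate the data of all text and CDATA descendants of node."""
    parts = []
    for child in node.childNodes:
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            # Descend into nested elements instead of reading nodeValue.
            parts.append(get_text(child))
        # Comments and processing instructions are skipped.
    return "".join(parts)


# Hypothetical sample document for illustration.
dom = minidom.parseString(
    "<root><LLobjectID><![CDATA[1871203]]></LLobjectID></root>"
)
element = dom.getElementsByTagName("LLobjectID")[0]
print(get_text(element))
```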

Paul

Sep 27 '07 #3
Forgot to mention I'm using Python 2.4.3.

Sep 27 '07 #4
kj7ny wrote:
> Forgot to mention I'm using Python 2.4.3.

You can install both lxml and ET on Python 2.4 (and 2.3). It's just that ET
went into the stdlib from 2.5 on.
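
A common way to write code that works with whichever of these is installed is a chain of fallback imports. This is only a sketch: the order of preference is an assumption, and the last branch relies on the standalone ElementTree distribution being importable as elementtree.ElementTree:

```python
try:
    from lxml import etree  # third-party, if installed
except ImportError:
    try:
        import xml.etree.ElementTree as etree  # stdlib from Python 2.5 on
    except ImportError:
        import elementtree.ElementTree as etree  # standalone ET on 2.3/2.4

# Hypothetical sample matching the element from the question.
root = etree.fromstring("<LLobjectID><![CDATA[1871203]]></LLobjectID>")
print(root.text)
```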

Stefan
Sep 27 '07 #5