DOM question

Hi there,

I have an XML document which contains a mixture of structural nodes
(called 'section' and with unique 'id' attributes) and non-structural
nodes (called anything else). The structural elements ('section's) can
contain, as well as non-structural elements, other structural elements.
I'm doing Python DOM programming with this document and have got
stuck on something.

I want to be able to get all the non-structural elements which are
children of a given 'section' element (identified by its 'id' attribute)
but not children of any child 'section' elements of the given 'section'.

e.g.:

<section id="a">
<foo>bar</foo>
</section>
<section id="b">
<foo>baz</foo>
<section id="c">
<bar>foo</bar>
</section>
</section>

Given this document, the working function would return "<foo>baz</foo>"
for id='b' and "<bar>foo</bar>" for id='c'.

Normally, recursion is used for DOM traversals. I've tried this function
which uses recursion with a generator (can the two be mixed?)

    def content_elements(node):
        if node.hasChildNodes():
            node = node.firstChild

            if not page_node(node):
                yield node

            for e in self.content_elements(node):
                yield e

            node = node.nextSibling

which didn't work. So I tried it without using a generator:

    def content_elements(node, elements):
        if node.hasChildNodes():
            node = node.firstChild

            if node.nodeType == Node.ELEMENT_NODE: print node.tagName
            if not page_node(node):
                elements.append(node)

            self.content_elements(node, elements)

            node = node.nextSibling

        return elements

However, I got exactly the same problem: each time I use this function I
just get a DOM Text node with a little whitespace (tabs and returns) in
it. I guess this is the indentation in my source document? But why do I
not get the proper element nodes?

Cheers,
Richard
Jul 19 '05 #1
> However, I got exactly the same problem: each time I use this function I
> just get a DOM Text node with a little whitespace (tabs and returns) in
> it. I guess this is the indentation in my source document? But why do I
> not get the proper element nodes?


Welcome to the wonderful world of DOM, where insignificant whitespace
becomes a first-class citizen!
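
To see the effect concretely, here is a minimal sketch using xml.dom.minidom from the standard library. The sample string is an assumption modelled on the example in the question; it only shows that the first child of an indented element is a whitespace-only Text node, and that the element nodes are still there among the siblings.

```python
from xml.dom import minidom, Node

# Assumed sample: one section whose child is indented with a newline and a tab.
doc = minidom.parseString('<section id="b">\n\t<foo>baz</foo>\n</section>')
section = doc.documentElement

# firstChild is the indentation before <foo>, not <foo> itself.
first = section.firstChild
is_text = first.nodeType == Node.TEXT_NODE
whitespace_only = first.data.strip() == ""

# The element nodes only show up when every child is visited.
element_children = [n for n in section.childNodes
                    if n.nodeType == Node.ELEMENT_NODE]
```

So a function that looks at firstChild once and never loops over the siblings will only ever see that whitespace node.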

Use XPath. Really. It's well worth the effort, as it is suited for exactly
the tasks you presented us, and allows for a concise formulation of these.
Yours would be (untested)

//section[id==$id_param]/node()[!name() == section]

It looks from the root through all descendant nodes

//

for nodes named section

section

that fulfill the predicate

[id==$id_param]

From there we collect all immediate children

/node()

that are not named section [!name() == section]
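
The same idea can be sketched with xml.etree.ElementTree, which ships with current Python versions. This is only an illustration under stated assumptions: ElementTree supports a limited subset of XPath (no name() function and no $variables), so the section is located with a path expression and the "not a section" test is done in plain Python. The wrapping <root> element and the helper name direct_content are hypothetical; the wrapper is needed because the example in the question has no single root element. In full XPath 1.0 the equivalent expression would be //section[@id=$id_param]/*[not(self::section)].

```python
import xml.etree.ElementTree as ET

# The example from the question, wrapped in an assumed <root> element.
DOC = """<root>
<section id="a">
<foo>bar</foo>
</section>
<section id="b">
<foo>baz</foo>
<section id="c">
<bar>foo</bar>
</section>
</section>
</root>"""


def direct_content(root, section_id):
    """Return the non-section elements that are immediate children of the
    section with the given id (hypothetical helper, not from the thread)."""
    # Note: an id containing a single quote would break this path expression.
    section = root.find(".//section[@id='%s']" % section_id)
    if section is None:
        return []
    # Iterating an Element yields only its immediate child elements.
    return [child for child in section if child.tag != "section"]


root = ET.fromstring(DOC)
for section_id in ("a", "b", "c"):
    found = direct_content(root, section_id)
    print(section_id, [(e.tag, e.text) for e in found])
```

Because ElementTree keeps whitespace in the text and tail attributes rather than as separate nodes, the indentation never appears among the children.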
--
Regards,

Diez B. Roggisch
Jul 19 '05 #2

On Thu, 02 Jun 2005 14:34:47 +0200, "Diez B. Roggisch"
<de*********@web.de> said:
>> However, I got exactly the same problem: each time I use this function I
>> just get a DOM Text node with a few white space (tabs and returns) in
>> it. I guess this is the indentation in my source document? But why do I
>> not get the propert element nodes?
>
> Welcome to the wonderful world of DOM, Where insignificant whitespace
> becomes a first-class citizen!
>
> Use XPath. Really. It's well worth the effort, as it is suited for exactly
> the tasks you presented us, and allows for a concise formulation of these.
> Yours would be (untested)
>
> //section[id==$id_param]/node()[!name() == section]

Yes, in fact:

//section[@id=$id_param]//*[name()!='section']

would do the trick.

I was trying to avoid using anything not in the standard Python
distribution if I could help it; I need to be able to use my code on
Linux, OS X and Windows.
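
For what it's worth, a standard-library-only version seems possible with xml.dom.minidom. The sketch below is one assumed way to do it, not a tested drop-in: the helper find_section and the wrapping <root> element are hypothetical, and a 'section' is recognised purely by its tag name. Unlike the attempts earlier in the thread, it loops over every child node, skips anything that is not an element (which drops the whitespace Text nodes), and does not recurse, so the contents of nested sections are left out.

```python
from xml.dom import minidom, Node

# The example from the question, wrapped in an assumed <root> element.
DOC = """<root>
<section id="a">
<foo>bar</foo>
</section>
<section id="b">
<foo>baz</foo>
<section id="c">
<bar>foo</bar>
</section>
</section>
</root>"""


def find_section(doc, section_id):
    """Return the 'section' element with the given id, or None."""
    for section in doc.getElementsByTagName("section"):
        if section.getAttribute("id") == section_id:
            return section
    return None


def content_elements(section):
    """Yield the immediate child elements that are not sections."""
    for child in section.childNodes:
        if child.nodeType != Node.ELEMENT_NODE:
            continue  # skip whitespace Text nodes, comments, etc.
        if child.tagName == "section":
            continue  # nested sections own their own content
        yield child


doc = minidom.parseString(DOC)
for section_id in ("b", "c"):
    section = find_section(doc, section_id)
    print(section_id, [e.toxml() for e in content_elements(section)])
```

Generators and loops mix fine here; recursion would only be needed if non-structural elements nested inside other non-structural elements should be returned as well.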

The xml.xpath package is from PyXML, yes? I'll just have to battle with
installing PyXML on OS X ;-)

Cheers,
Richard
Jul 19 '05 #3
> Yes, in fact:
>
> //section[@id=$id_param]//*[name()!='section']
>
> would do the trick.
>
> I was trying to avoid using anything not in the standard Python
> distribution if I could help it; I need to be able to use my code on
> Linux, OS X and Windows.
>
> The xml.xpath package is from PyXML, yes? I'll just have to battle with
> installing PyXML on OS X ;-)


As a fresh member of the Mac OS X community I can say that so far I've
got everything except pygame running. So I don't expect that to be too
much of a problem.

Diez
Jul 19 '05 #4
