python xml dom help please

Apologies if this post appears more than once.

The file -

---------------
<?xml version="1.0" encoding="utf-8"?>
<Game><A/><B/><C/></Game>
---------------

is processed by this program -

---------------
#!/usr/bin/env python

from xml.dom.ext.reader import PyExpat
from xml.dom.ext import PrettyPrint

import sys

def deepen(nodeList):
    for node in nodeList:
        print(node.nodeName)
        if node.previousSibling != None:
            if node.previousSibling.nodeType == node.ELEMENT_NODE:
                if node.previousSibling.hasChildNodes():
                    print("has children")
                    node.previousSibling.lastChild.appendChild(node)
                else:
                    node.previousSibling.appendChild(node)
        deepen(node.childNodes)

# get DOM object
reader = PyExpat.Reader()
doc = reader.fromUri(sys.argv[1])

# call func
deepen(doc.childNodes)

# display altered document
PrettyPrint(doc)
---------------

which outputs the following -

---------------
Game
Game
A
B
<?xml version='1.0' encoding='UTF-8'?>
<Game>
  <A>
    <B/>
  </A>
  <C/>
</Game>

---------------

Can anybody explain why the line 'print(node.nodeName)' never prints 'C'?

Also, why is 'has children' never printed?

I am trying to output

---------------
<?xml version='1.0' encoding='UTF-8'?>
<Game>
  <A>
    <B>
      <C/>
    </B>
  </A>
</Game>
---------------

I know there are easier ways to do this, but I want to do it using the DOM.

Thanks in advance.
Jul 18 '05 #1
Without having taken a thorough look at your (recursive) 'deepen' function, I
can see there's no termination condition for the recursion.
So that's one reason this won't work the way you want it to.

Miklós
Jul 18 '05 #2
Miklós wrote:
Without having any thorough look at your (recursive) 'deepen' function, I
can see there's no termination condition for the recursion....
So that's one reason this won't work the way you want it to.

Nope - he has a termination condition. deepen is called on each node's
childNodes, so the recursion stops at nodes that have no children. It is
simply a traversal of all nodes.

Regards,

Diez
Jul 18 '05 #3
Hi,

Also, why 'has children' is never printed?

The code is somewhat complicated, but the reason "has children" is never
printed is simply that, in this example, no node matches the condition:
nodes A, B and C are the only ones with siblings, and none of them has a
child node when it is checked.

I know there are easier ways to do this, but I want to do it using the DOM.

I'm not sure what easier ways _you_ have in mind, but to me this looks like a
classic case for XSLT, which is much more convenient to deal with. DOM is
usually a PITA; don't mess around with it if you're not forced to.

Diez

Jul 18 '05 #4
sp***********@ntlworld.com (deglog) wrote:
def deepen(nodeList):
    for node in nodeList:
        [...]
        node.previousSibling.appendChild(node)


Bzzt: destructive iteration gotcha.

DOM NodeLists are 'live': when you move a child Element out of the parent,
it no longer exists in the childNodes list. So in the example:

<a/>
<b/>
<c/>

the first element (a) cannot be moved and is skipped; the second element (b)
is moved into its previousSibling (a); the third element... wait, there is no
third element any more because (c) is now the second element. So the loop
stops.

A solution would be to make a static copy of the list beforehand. There's no
standard-DOM way of doing that and the Python copy() method is not guaranteed
to work here, so use a list comprehension or map:

identity = lambda x: x
# map() returns a new list in Python 2; in Python 3 it is lazy and copies nothing
for node in map(identity, nodeList):
    ...
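
The skipped node is easy to reproduce. The sketch below uses the standard library's xml.dom.minidom rather than 4DOM (an assumption, made so that it runs on a current Python); minidom's childNodes is an ordinary mutable list, and moving a node while iterating over it shifts the remaining items in the same way. The helper name visit_and_nest is hypothetical.

```python
from xml.dom import minidom

SOURCE = "<Game><A/><B/><C/></Game>"

def visit_and_nest(snapshot):
    """Move each element into its previous sibling; return the names visited.

    snapshot=False iterates the childNodes list while it is being modified;
    snapshot=True iterates a copy made before the loop starts.
    """
    game = minidom.parseString(SOURCE).documentElement
    children = list(game.childNodes) if snapshot else game.childNodes
    visited = []
    for node in children:
        visited.append(node.nodeName)
        if node.previousSibling is not None:
            # appendChild() first removes the node from its current parent,
            # which shortens game.childNodes by one item.
            node.previousSibling.appendChild(node)
    return visited

live_visited = visit_and_nest(snapshot=False)
copy_visited = visit_and_nest(snapshot=True)
print(live_visited)
print(copy_visited)
```

With the live list the loop ends after B, because C has slid into the slot that was just visited; with the copy all three elements are seen.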

--
Andrew Clover
mailto:an*@doxdesk.com
http://www.doxdesk.com/
Jul 18 '05 #5
an********@doxdesk.com (Andrew Clover) writes:
sp***********@ntlworld.com (deglog) wrote:
[...]
A solution would be to make a static copy of the list beforehand. There's no
standard-DOM way of doing that and the Python copy() method is not guaranteed
to work here, so use a list comprehension or map:

identity = lambda x: x
for node in map(identity, nodeList):
    ...


Why not just

for node in list(nodeList):
    ...

?
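
Applied to the original function, the list() change might look like the sketch below. It assumes xml.dom.minidom in place of 4DOM so that it runs without PyXML, and it has only been reasoned through for the three-element input from the first post; longer sibling runs would need the move to descend further than lastChild.

```python
from xml.dom import minidom

def deepen(node_list):
    # list() takes a snapshot, so moving nodes cannot disturb the iteration.
    for node in list(node_list):
        previous = node.previousSibling
        if previous is not None and previous.nodeType == node.ELEMENT_NODE:
            if previous.hasChildNodes():
                previous.lastChild.appendChild(node)
            else:
                previous.appendChild(node)
        deepen(node.childNodes)

doc = minidom.parseString("<Game><A/><B/><C/></Game>")
deepen(doc.childNodes)
nested = doc.documentElement.toxml()
print(nested)
```

By the time C is visited its previous sibling is A, which now contains B, so the "has children" branch is finally taken and C ends up inside B.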
John
Jul 18 '05 #6
Thanks for the help - this works, and I understand how and why.

jj*@pobox.com (John J. Lee) wrote in message news:<87************@pobox.com>...

Why not just

for node in list(nodeList):
    ...

?
John

The following also works (as I intended):

from xml.dom.NodeFilter import NodeFilter

def appendToDescendant(node):
    walker.previousSibling()
    while 1:
        if walker.currentNode.hasChildNodes():
            next = walker.nextNode()
        else: break
    walker.currentNode.appendChild(node)

walker = doc.createTreeWalker(doc.documentElement, NodeFilter.SHOW_ELEMENT,
                              None, 0)
while 1:
    print(walker.currentNode.nodeName)
    if walker.currentNode.previousSibling != None:
        print("ps " + walker.currentNode.previousSibling.nodeName)
        if walker.currentNode.previousSibling.nodeName != "Game":
            if walker.currentNode.previousSibling.hasChildNodes():
                appendToDescendant(walker.currentNode)
            else:
                walker.currentNode.previousSibling.appendChild(walker.currentNode)
    next = walker.nextNode()
    if next is None: break

Strangely, the line checking for "Game" is needed, because this first node
appears to be its own previous sibling - how can this be right?

For example, with the input file:
---
<?xml version="1.0" encoding="utf-8"?>
<Game/>
---
the output is:
---
Game
ps Game
<?xml version='1.0' encoding='UTF-8'?>
<Game/>
---
Jul 18 '05 #7
John J. Lee <jj*@pobox.com> wrote:
Why not just for node in list(nodeList)?

You're right! I never trusted list() to make a copy if its argument was already
a native list (as it sometimes is in e.g. minidom), but having bothered to check
the docs, it is guaranteed to after all. Hurrah.

sp***********@ntlworld.com (deglog) wrote:
def appendToDescendant(node):
    walker.previousSibling()
    while 1:
        if walker.currentNode.hasChildNodes():
            next = walker.nextNode()
        else: break
    walker.currentNode.appendChild(node)

Are you sure this is doing what you want? A TreeWalker's nextNode() method
goes to a node's next matching sibling, not into its children. To go into
the matching children you'd use TreeWalker.firstChild().

The function as written above appends the argument node to the first sibling
to have no child nodes, starting from the TreeWalker's current node or its
previous sibling if there is one.

I'm not wholly sure I understand the problem you're trying to solve. If you
just want to nest sibling elements as first children, you could do it without
Traversal or recursion, for example:

def nestChildrenIntoFirstElements(parent):
    elements = [c for c in parent.childNodes if c.nodeType == c.ELEMENT_NODE]
    if len(elements) >= 2:
        insertionPoint = elements[0]
        for element in elements[1:]:
            insertionPoint.appendChild(element)
            insertionPoint = element

(Untested but no reason it shouldn't work.)
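
For what it's worth, the function can be exercised with the standard library's xml.dom.minidom (an assumption; the thread itself uses 4DOM). The sketch below repeats the function unchanged and runs it on a four-element input, which is hypothetical and chosen only to show that the chain continues past three siblings.

```python
from xml.dom import minidom

def nestChildrenIntoFirstElements(parent):
    # The list comprehension is itself a static copy of the element children.
    elements = [c for c in parent.childNodes if c.nodeType == c.ELEMENT_NODE]
    if len(elements) >= 2:
        insertionPoint = elements[0]
        for element in elements[1:]:
            # Each element becomes the child of the one before it.
            insertionPoint.appendChild(element)
            insertionPoint = element

doc = minidom.parseString("<Game><A/><B/><C/><D/></Game>")
nestChildrenIntoFirstElements(doc.documentElement)
chained = doc.documentElement.toxml()
print(chained)
```

Because the elements are collected before anything is moved, the live-list problem never arises here.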

Strangely, the line checking "Game" is needed, because this first node
is its own previous sibling - how can this be right?


4DOM is fooling you. It has inserted a <!DOCTYPE> declaration automatically
for you. (It probably shouldn't do that.) So the previous sibling of the
documentElement is the doctype; of course the doctype has the same nodeName
as the documentElement, so the debugging output is misleading.
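
The name clash can be seen with the standard library's xml.dom.minidom as well. This is only a sketch under one assumption: minidom does not add a doctype by itself, so the DOCTYPE is written into the (hypothetical) input to stand in for the one 4DOM inserts.

```python
from xml.dom import minidom

# The DOCTYPE is written out explicitly here; minidom does not add one.
doc = minidom.parseString("<!DOCTYPE Game><Game/>")

# Both top-level nodes report the same nodeName; only nodeType differs.
top_level = [(n.nodeName, n.nodeType) for n in doc.childNodes]
previous = doc.documentElement.previousSibling
print(top_level)
print(previous.nodeType == previous.DOCUMENT_TYPE_NODE)
```

Comparing nodeType against ELEMENT_NODE, as the original deepen() did, tells the two apart where a nodeName comparison cannot.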

--
Andrew Clover
mailto:an*@doxdesk.com
http://www.doxdesk.com/
Jul 18 '05 #8
an********@doxdesk.com (Andrew Clover) wrote in message news:<2c**************************@posting.google.com>...
def appendToDescendant(node):
    walker.previousSibling()
    while 1:
        if walker.currentNode.hasChildNodes():
            next = walker.nextNode()
        else: break
    walker.currentNode.appendChild(node)

Are you sure this is doing what you want? A TreeWalker's nextNode() method
goes to a node's next matching sibling, not into its children. To go into
the matching children you'd use TreeWalker.firstChild().


Right.

I'm not wholly sure I understand the problem you're trying to solve.

Actually, I'm trying to change the relationship 'is next sibling of' to
'is child of' throughout a document.

My latest idea is to go to the end of the document, then walk it
backwards (for Christmas? :-). Towards this end I wrote:
---
walker = doc.createTreeWalker(doc.documentElement, NodeFilter.SHOW_ELEMENT,
                              None, 0)
while 1:
    print('1 ' + walker.currentNode.nodeName)
    next = walker.nextNode()
    if next is None: break
print('2 ' + walker.currentNode.nodeName)
---
which, given
---
<?xml version="1.0" encoding="utf-8"?>
<Game><A/></Game>

---
outputs
---
1 Game
1 A
2 Game
---
Foiled again. How come the current node is back at the start after
the loop has finished?
Jul 18 '05 #9
sp***********@ntlworld.com (deglog) wrote:
Actually, I'm trying to change the relationship 'is next sibling of' to
'is child of' throughout a document.

Well, the snippet in the posting above should do that well enough. What
happens to any existing nested children is not defined.

How come the current node is back at the start after the loop has finished?


Bug. I've just submitted a patch to the PyXML tracker to address this issue.

(Note: earlier versions of TreeWalker - certainly 0.8.0 - have more significant
bugs, that can lead to infinite recursion.)

That said, I'm not sure how using a TreeWalker or walking backwards actually
helps you here! If you are just using it to filter out non-element children,
remember that moving the current node takes the position of the TreeWalker
with it. It's not like NodeIterator.

--
Andrew Clover
mailto:an*@doxdesk.com
http://www.doxdesk.com/
Jul 18 '05 #10
