xml.dom.minidom weirdness: bug?

JYA
Hi.

I was writing an XMLTV parser in Python when I ran into some weirdness
that I couldn't explain.

What I'm doing is reading an XML file, creating another DOM object, and
copying elements from one to the other.

At no time do I ever modify the original dom object, yet it gets modified.

Unless I missed something, it sounds like a bug to me.

The XML file is simply:
<?xml version="1.0" encoding="utf-8"?>
<tv><channel id="id1"><display-name lang="en">full name</display-name></channel></tv>

which I store under the name test.xmltv

Here is the code. I've removed everything that isn't relevant to my
description; I can't make it any simpler, I'm afraid:

from xml.dom.minidom import Document
import xml.dom.minidom

def adjusttimezone(docxml, timezone):
    doc = Document()

    # Create the <tv> base element
    tv_xml = doc.createElement("tv")
    doc.appendChild(tv_xml)

    # Create the channel list
    channellist = docxml.getElementsByTagName('channel')

    for x in channellist:
        # Copy the original attributes
        elem = doc.createElement("channel")
        for y in x.attributes.keys():
            name = x.attributes[y].name
            value = x.attributes[y].value
            elem.setAttribute(name, value)
        for y in x.getElementsByTagName('display-name'):
            elem.appendChild(y)
        tv_xml.appendChild(elem)

    return doc

if __name__ == '__main__':
    handle = open('test.xmltv', 'r')
    docxml = xml.dom.minidom.parse(handle)
    print 'step1'
    print docxml.toprettyxml(indent=" ", encoding="utf-8")
    doc = adjusttimezone(docxml, 1000)
    print 'step2'
    print docxml.toprettyxml(indent=" ", encoding="utf-8")

Now at "step 1" I display the content of the DOM object; quite
naturally, it shows:
<?xml version="1.0" encoding="utf-8"?>
<tv>
 <channel id="id1">
  <display-name lang="en">
   full name
  </display-name>
 </channel>
</tv>

After a call to adjusttimezone, "step 2" however will show:
<?xml version="1.0" encoding="utf-8"?>
<tv>
 <channel id="id1"/>
</tv>

That's it!

You'll note that at no time do I modify the content of docxml, yet it
gets modified.

The weirdness disappears if I change the line
    channellist = docxml.getElementsByTagName('channel')
to
    channellist = copy.deepcopy(docxml.getElementsByTagName('channel'))

However, my understanding is that it shouldn't be necessary.

Any thoughts on this weirdness?

Thanks
Jean-Yves

--
They who would give up an essential liberty for temporary security,
deserve neither liberty or security (Benjamin Franklin)

Jun 27 '08 #1
2 Replies


On Tue, 29 Apr 2008 23:51:14 -0300, JYA <no****@nospam.blah> wrote:
> What I'm doing, is read an xml file, create another dom object and copy
> the element from one to the other.
>
> At no time do I ever modify the original dom object, yet it gets
> modified.
>
>         for y in x.getElementsByTagName('display-name'):
>             elem.appendChild(y)
>         tv_xml.appendChild(elem)
>
> You'll note that at no time do I modify the content of docxml, yet it
> gets modified.
>
> The weirdness disappear if I change the line
>     channellist = docxml.getElementsByTagName('channel')
> to
>     channellist = copy.deepcopy(docxml.getElementsByTagName('channel'))
>
> However, my understanding is that it shouldn't be necessary.

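I think that any element can have only a single parent. If you take an
element from one document and insert it into another document, it gets
removed from the first. That is why elem.appendChild(y) empties the
original <channel>: it moves the node rather than copying it.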
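
A minimal sketch of the behaviour described above, written for Python 3 (the original post uses Python 2 print statements). The inline XML string stands in for test.xmltv, and the variable names are only for illustration. It suggests that appendChild moves a node that already has a parent instead of copying it:

```python
from xml.dom.minidom import Document, parseString

# Inline stand-in for the test.xmltv file from the question (assumption).
source = parseString(
    '<tv><channel id="id1">'
    '<display-name lang="en">full name</display-name>'
    '</channel></tv>'
)

# A second, independent document with its own <channel> element.
target = Document()
new_channel = target.createElement("channel")
target.appendChild(new_channel)

channel = source.getElementsByTagName("channel")[0]
name = channel.getElementsByTagName("display-name")[0]

parent_before = name.parentNode   # the <channel> in the source document
new_channel.appendChild(name)     # detaches the node from its old parent first
parent_after = name.parentNode    # now the <channel> in the target document

moved = parent_before is channel and parent_after is new_channel
remaining = len(channel.getElementsByTagName("display-name"))

print(moved)           # True
print(remaining)       # 0
print(source.toxml())  # the source <channel> is now empty
```

Nothing in the source document was modified explicitly; the change is a side effect of re-parenting the node.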

--
Gabriel Genellina

Jun 27 '08 #2

JYA <no****@nospam.blah> wrote:
>         for y in x.getElementsByTagName('display-name'):
>             elem.appendChild(y)

As Gabriel wrote, a node can have only one parent. Use

    elem.appendChild(y.cloneNode(True))

instead, or y.cloneNode(False) if you want a shallow copy (i.e. without
any of the children, such as the text content).
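
A rough sketch of the cloneNode approach, again in Python 3 and with an inline XML string standing in for test.xmltv (an assumption; the variable names are illustrative, not from the original code). It copies each channel into a new document and leaves the source untouched, then compares a deep and a shallow clone:

```python
from xml.dom.minidom import Document, parseString

# Inline stand-in for the test.xmltv file from the question (assumption).
source = parseString(
    '<tv><channel id="id1">'
    '<display-name lang="en">full name</display-name>'
    '</channel></tv>'
)
original_xml = source.toxml()

target = Document()
tv = target.createElement("tv")
target.appendChild(tv)

for channel in source.getElementsByTagName("channel"):
    channel_copy = target.createElement("channel")
    # Copy the attributes by hand, as the original code does.
    for attr_name, attr_value in channel.attributes.items():
        channel_copy.setAttribute(attr_name, attr_value)
    # Append deep clones so the originals keep their parent.
    for name in channel.getElementsByTagName("display-name"):
        channel_copy.appendChild(name.cloneNode(True))
    tv.appendChild(channel_copy)

first_name = source.getElementsByTagName("display-name")[0]
deep = first_name.cloneNode(True)      # element, attributes and children
shallow = first_name.cloneNode(False)  # element and attributes only

print(source.toxml() == original_xml)  # True
print(deep.toxml())     # <display-name lang="en">full name</display-name>
print(shallow.toxml())  # <display-name lang="en"/>
```

The deep clone is usually what is wanted here, since the text of a display-name is stored as a child text node and a shallow clone drops it.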

Marc
Jun 27 '08 #3
