ElementTree/DTD question

I'm trying to convert from minidom to ElementTree for handling XML,
and am having trouble with entities in DTDs. My Python script looks
like this:

----------------------------------------------------------------------

#!/usr/bin/env python

import sys, os
from elementtree import ElementTree

for filename in sys.argv[1:]:
    ElementTree.parse(filename)

----------------------------------------------------------------------

My first attempt was this XML file:

----------------------------------------------------------------------

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE lec [
<!ENTITY ldots "&#x8230;">
]>
<lec title="Introduction">
<topic title="Motivation" summary="motivation for course">
<slide>
<b1>Write an introduction&ldots;</b1>
</slide>
</topic>
</lec>

----------------------------------------------------------------------

Running "python validate.py first.xml" produces:

----------------------------------------------------------------------

Traceback (most recent call last):
  File "validate.py", line 7, in ?
    ElementTree.parse(filename)
  File "C:\Python23\Lib\site-packages\elementtree\ElementTree.py", line 865, in parse
    tree.parse(source, parser)
  File "C:\Python23\Lib\site-packages\elementtree\ElementTree.py", line 589, in parse
    parser.feed(data)
  File "C:\Python23\Lib\site-packages\elementtree\ElementTree.py", line 1160, in feed
    self._parser.Parse(data, 0)
  File "C:\Python23\Lib\site-packages\elementtree\ElementTree.py", line 1113, in _default
    raise expat.error(
xml.parsers.expat.ExpatError: undefined entity &ldots;: line 9, column 27

----------------------------------------------------------------------

All right, pull the DTD out, and use this XML file:

----------------------------------------------------------------------

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE lec SYSTEM "swc.dtd">
<lec title="Introduction">
<topic title="Motivation" summary="motivation for course">
<slide>
<b1>Write an introduction&ldots;</b1>
</slide>
</topic>
</lec>

----------------------------------------------------------------------

with this minimalist DTD (saved as "swc.dtd" in the same directory as
both the XML file and the script):

----------------------------------------------------------------------

<!ENTITY ldots "&#x8230;">

----------------------------------------------------------------------

Same error; only the line number changed. Anyone know what I'm doing
wrong? (Note: minidom loads it just fine...)

Thanks,
Greg Wilson
gv******@cs.utoronto.ca
Jul 18 '05 #1


Greg Wilson wrote:
My first attempt was this XML file:

----------------------------------------------------------------------

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE lec [
<!ENTITY ldots "&#x8230;">
]>
<lec title="Introduction">
<topic title="Motivation" summary="motivation for course">
<slide>
<b1>Write an introduction&ldots;</b1>
</slide>
</topic>
</lec>

----------------------------------------------------------------------

Running "python validate.py first.xml" produces:

----------------------------------------------------------------------

Traceback (most recent call last):
  File "validate.py", line 7, in ?
    ElementTree.parse(filename)
  File "C:\Python23\Lib\site-packages\elementtree\ElementTree.py", line 865, in parse
    tree.parse(source, parser)
  File "C:\Python23\Lib\site-packages\elementtree\ElementTree.py", line 589, in parse
    parser.feed(data)
  File "C:\Python23\Lib\site-packages\elementtree\ElementTree.py", line 1160, in feed
    self._parser.Parse(data, 0)
  File "C:\Python23\Lib\site-packages\elementtree\ElementTree.py", line 1113, in _default
    raise expat.error(
xml.parsers.expat.ExpatError: undefined entity &ldots;: line 9, column 27

----------------------------------------------------------------------


looks like a bug in the Python version of elementtree. you can either switch
to cElementTree, or apply the following patch:

Index: elementtree/ElementTree.py
===================================================================
--- elementtree/ElementTree.py (revision 2315)
+++ elementtree/ElementTree.py (working copy)
@@ -1120,7 +1120,7 @@
         self._target = target
         self._names = {} # name memo cache
         # callbacks
-        parser.DefaultHandler = self._default
+        parser.DefaultHandlerExpand = self._default
         parser.StartElementHandler = self._start
         parser.EndElementHandler = self._end
         parser.CharacterDataHandler = self._data
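
To see what the one-line change in the patch affects, here is a minimal sketch that uses pyexpat directly instead of elementtree. As documented for pyexpat, installing DefaultHandler inhibits expansion of internal entities, while DefaultHandlerExpand does not. The helper name parse_with and the trimmed-down document are made up for illustration, and the entity value is written as &#x2026; (U+2026, horizontal ellipsis) on the assumption that this is the character the original declaration was aiming for.

```python
import xml.parsers.expat as expat

# Trimmed-down version of the document from the question (illustrative only).
# Assumption: the entity is meant to be U+2026, HORIZONTAL ELLIPSIS.
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE lec [
<!ENTITY ldots "&#x2026;">
]>
<lec><b1>Write an introduction&ldots;</b1></lec>
"""


def parse_with(handler_name):
    """Parse DOC with the given default-handler attribute installed.

    Returns (character data seen, list of strings passed to the default handler).
    parse_with is a hypothetical helper, not part of elementtree or pyexpat.
    """
    parser = expat.ParserCreate()
    text_chunks = []
    default_chunks = []
    parser.CharacterDataHandler = text_chunks.append
    # handler_name is either "DefaultHandler" or "DefaultHandlerExpand"
    setattr(parser, handler_name, default_chunks.append)
    parser.Parse(DOC, True)
    return "".join(text_chunks), default_chunks


# With DefaultHandler, the raw reference is handed to the default handler
# and never reaches the character data handler in expanded form.
plain_text, plain_default = parse_with("DefaultHandler")

# With DefaultHandlerExpand, expat expands the internal entity itself.
expand_text, expand_default = parse_with("DefaultHandlerExpand")

print(repr(plain_text), "&ldots;" in plain_default)
print(repr(expand_text), "&ldots;" in expand_default)
```

In the first call the default handler receives the unexpanded reference, which is the situation in which the _default method in the traceback raises "undefined entity". In the second call the reference is expanded by expat before any handler sees it.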

(for quicker responses to elementtree questions, use the xml-sig mailing list)
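
For the other route (switching implementations instead of patching), a quick check along these lines may help. It is only a sketch: it assumes a current Python where the library ships in the standard library as xml.etree.ElementTree, not the standalone elementtree or cElementTree packages discussed in this thread, and it again assumes the entity is meant to be U+2026.

```python
import xml.etree.ElementTree as ET

# The first document from the question, with the entity value assumed
# to be U+2026 (HORIZONTAL ELLIPSIS).
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE lec [
<!ENTITY ldots "&#x2026;">
]>
<lec title="Introduction">
<topic title="Motivation" summary="motivation for course">
<slide>
<b1>Write an introduction&ldots;</b1>
</slide>
</topic>
</lec>
"""

# Entities declared in the internal DTD subset are expanded during parsing.
root = ET.fromstring(DOC)
b1_text = root.find("topic/slide/b1").text
print(repr(b1_text))
```

This only covers the internal-subset case from the first attempt. The external swc.dtd variant is a separate matter, since the parser has to be told about (or be able to load) the external file.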

</F>

Jul 18 '05 #2
