PyXML, Sax, error in processing external entity reference

I'm attempting to read an XHTML 1.1 file[1], perform some DOM manipulation,
then write the results to a different file.

I've found myself rather stuck at the first hurdle.

I have the following:

from xml.dom.ext.reader import Sax2
reader = Sax2.Reader()
f = open('dorward.me.uk/sitemap.html', 'r')
doc = reader.fromStream(f)

(dorward.me.uk/sitemap.html being a local copy of
http://dorward.me.uk/sitemap.html)

... which outputs the following:

Traceback (most recent call last):
  File "x.py", line 4, in ?
    doc = reader.fromStream(f)
  File "/usr/lib/python2.3/site-packages/_xmlplus/dom/ext/reader/Sax2.py", line 372, in fromStream
    self.parser.parse(s)
  File "/usr/lib/python2.3/site-packages/_xmlplus/sax/expatreader.py", line 109, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python2.3/site-packages/_xmlplus/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib/python2.3/site-packages/_xmlplus/sax/expatreader.py", line 220, in feed
    self._err_handler.fatalError(exc)
  File "/usr/lib/python2.3/site-packages/_xmlplus/dom/ext/reader/Sax2.py", line 340, in fatalError
    raise exception
xml.sax._exceptions.SAXParseException: http://www.w3.org/TR/xhtml-modulariz...s-1.mod:115:0: error in processing external entity reference

I'm not sure where I should proceed from here. Is it a bug in my code? In
PyXML? In the DTD itself? What should I do next?

Thanks.

[1] Actually, lots of files, but one at a time.

--
David Dorward <http://dorward.me.uk/>
Jul 18 '05 #1
I think you need a parser:

import xml.sax
parser = xml.sax.make_parser()
file = "dorward.me.uk/sitemap.html"
parser.parse(file)

I don't know how to go further than that; I'm stuck too!

Try the site http://pyxml.sourceforge.net/topics/howto/xml-howto.html.
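
A parser created this way does nothing visible until a content handler is attached. The following is a minimal sketch, assuming the standard-library xml.sax of a current Python 3 rather than the PyXML build used in the thread. The XHTML string and the LinkCollector class are illustrative assumptions, not part of the original posts. It also switches off external entity processing explicitly, so the expat-based reader should not try to fetch the DTD modules named in the traceback.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_external_pes,
)

# Hypothetical stand-in for the local copy of sitemap.html.
XHTML = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Site map</title></head>
  <body><ul><li><a href="/">Home</a></li></ul></body>
</html>
"""


class LinkCollector(ContentHandler):
    """Collect the href attribute of every <a> element seen."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def startElement(self, name, attrs):
        if name == "a" and "href" in attrs:
            self.hrefs.append(attrs["href"])


parser = xml.sax.make_parser()
# Do not follow external general or parameter entities (no DTD download).
parser.setFeature(feature_external_ges, False)
parser.setFeature(feature_external_pes, False)

handler = LinkCollector()
parser.setContentHandler(handler)
parser.parse(io.BytesIO(XHTML.encode("utf-8")))

print(handler.hrefs)
```

With a real file, passing the file name to parser.parse() in place of the in-memory stream should behave the same way.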

Bennie,
Jul 18 '05 #2
The bug is with the W3C. Through a chain of parameter entity refs,
http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd references
http://www.w3.org/TR/xhtml-modulariz...11-model-1.mod,
which gives a 404 (and yes, XML heads, it is in an INCLUDE section, so the
URI must be traversed unless there's a resolution through the public ID).

I'm actually rather amazed at such carelessness by the W3C, but I
don't have time to dig further to see if I can figure out how things
got broken.

I can tell you that you can use minidom OK with this because it
does not even read the external DTD subset:

>>> from xml.dom import minidom
>>> doc = minidom.parse('sitemap.html')
>>> doc
<xml.dom.minidom.Document instance at 0x400635ec>

Also, 4Suite's cDomlette makes it easy for you to avoid the DTD
problem:

>>> from Ft.Xml.Domlette import NoExtDtdReader
>>> doc = NoExtDtdReader.parseUri("file:sitemap.html")
>>> doc
<cDocument at 0x0x403ab42c>
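
For the complete task in the original question (read, manipulate the DOM, write to a different file), a minimal sketch using only the standard-library minidom on a current Python 3 might look like the following. The XHTML string, the add_list_item helper and the output file name are illustrative assumptions, not taken from the thread. The DOCTYPE is kept in the document, but minidom does not fetch the DTD it names.

```python
import os
import tempfile
from xml.dom import minidom

# Hypothetical stand-in for the local copy of sitemap.html.
XHTML = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Site map</title></head>
  <body><ul><li><a href="/">Home</a></li></ul></body>
</html>
"""


def add_list_item(doc, href, text):
    """Append <li><a href="...">text</a></li> to the first <ul>."""
    ul = doc.getElementsByTagName("ul")[0]
    li = doc.createElement("li")
    a = doc.createElement("a")
    a.setAttribute("href", href)
    a.appendChild(doc.createTextNode(text))
    li.appendChild(a)
    ul.appendChild(li)
    return li


# Read: the external DTD subset is not loaded, so no network access occurs.
doc = minidom.parseString(XHTML)

# Manipulate: add one more link to the list.
add_list_item(doc, "/about/", "About")

# Write: serialise the modified tree to a different file.
out_path = os.path.join(tempfile.mkdtemp(), "sitemap-out.html")
with open(out_path, "w", encoding="utf-8") as out:
    doc.writexml(out, encoding="utf-8")
```

For a file on disk, minidom.parse('sitemap.html') replaces the parseString call. One caveat: because the DTD is never read, named entities that XHTML defines only in the DTD (such as &nbsp;) are unknown to the parser, so a file that uses them would need separate handling.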


http://4suite.org
http://uche.ogbuji.net/tech/akara/no...1-01/domlettes

Good luck.

--Uche
http://uche.ogbuji.net
Jul 18 '05 #3
