I decided to use SAX to parse my xml file.
But the parser crashes on:
  File "/usr/lib/python2.3/site-packages/_xmlplus/sax/handler.py", line 38, in fatalError
    raise exception
xml.sax._exceptions.SAXParseException: NCBI_Entrezgene.dtd:8:0: error in processing external entity reference
This is caused by:
<!DOCTYPE Entrezgene-Set PUBLIC "-//NCBI//NCBI Entrezgene/EN"
 "NCBI_Entrezgene.dtd">
If I remove it, it parses normally.
I've created my parser like this:
import sys
from xml.sax import make_parser
from handler import EntrezGeneHandler

fopen = open("mouse2.xml", "r")
ch = EntrezGeneHandler()
saxparser = make_parser()
saxparser.setContentHandler(ch)
saxparser.parse(fopen)
And the handler is:
from xml.sax import ContentHandler

class EntrezGeneHandler(ContentHandler):
    """
    A handler to deal with EntrezGene in XML
    """
    def startElement(self, name, attrs):
        print "Start element:", name
So it doesn't do much yet. And still it crashes...
How can I tell the parser not to look at the DOCTYPE declaration?
A website (http://www.devarticles.com/c/a/XML/P...-and-Python/1/) states
that the SAX parsers are not validating, so shouldn't this error not
occur at all?
Cheers,
Willem
On Sat, 2005-04-23 at 15:20 +0200, Willem Ligtenberg wrote:
How can I tell the parser not to look at the DOCTYPE declaration. On a website: http://www.devarticles.com/c/a/XML/P...-and-Python/1/ it states that the SAX parsers are not validating, so this error shouldn't even occur?
Just because it's not validating doesn't mean that the parser won't try
to read the external entity.
Maybe you're looking for
"""
feature_external_ges
Value: "http://xml.org/sax/features/external-general-entities"
true: Include all external general (text) entities.
false: Do not include external general entities.
access: (parsing) read-only; (not parsing) read/write
"""
Quote from: http://docs.python.org/lib/module-xml.sax.handler.html
But you're on pretty shaky ground in any XML 1.x toolkit using a bogus
DTDecl in this way. Why go through the hassle? Why not use a catalog,
or remove the DTDecl?
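For what it's worth, here is a minimal sketch of how that feature could be switched off before parsing. It assumes the expat-based reader that make_parser() normally returns, and it is written in Python 3 syntax. The inline SAMPLE document and the parse_element_names helper are made up for illustration. Whether the old _xmlplus reader from the traceback honours the flag in the same way is not confirmed anywhere in this thread. In Python 3.7.1 and later the feature is already off by default, so the setFeature call is only there to make the intent explicit.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import ContentHandler, feature_external_ges

# Hypothetical stand-in for mouse2.xml: same DOCTYPE, but no DTD file on disk.
SAMPLE = (
    b'<?xml version="1.0"?>\n'
    b'<!DOCTYPE Entrezgene-Set PUBLIC "-//NCBI//NCBI Entrezgene/EN"\n'
    b' "NCBI_Entrezgene.dtd">\n'
    b'<Entrezgene-Set><Entrezgene/></Entrezgene-Set>\n'
)


class EntrezGeneHandler(ContentHandler):
    """Collects element names instead of printing them."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_element_names(stream):
    """Parse stream with external general entities switched off."""
    handler = EntrezGeneHandler()
    saxparser = make_parser()
    # Has to happen before parse(): the feature is read-only while parsing.
    saxparser.setFeature(feature_external_ges, False)
    saxparser.setContentHandler(handler)
    saxparser.parse(stream)
    return handler.names


print(parse_element_names(io.BytesIO(SAMPLE)))
```

The XML file itself stays untouched; only the reader is told not to follow the external reference.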
--
Uche Ogbuji Fourthought, Inc. http://uche.ogbuji.net http://fourthought.com http://copia.ogbuji.net http://4Suite.org
Use CSS to display XML, part 2 - http://www-128.ibm.com/developerwork...xmlcss2-i.html
XML Output with 4Suite & AMara - http://www.xml.com/pub/a/2005/04/20/py-xml.html
Use XSLT to prepare XML for import into OpenOffice Calc - http://www.ibm.com/developerworks/xml/library/x-oocalc/
Schema standardization for top-down semantic transparency - http://www-128.ibm.com/developerwork...x-think31.html
I didn't make the XML file, and I don't like messing with other people's
data. So I just want my SAX parser to ignore the declaration. I can't help
it if other people make it hard for me to read their XML file...
On Sat, 23 Apr 2005 13:48:49 -0600, Uche Ogbuji wrote:
But you're on pretty shaky ground in any XML 1.x toolkit using a bogus DTDecl in this way. Why go through the hassle? Why not use a catalog, or remove the DTDecl?
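A rough sketch of something in the spirit of the catalog suggestion, without editing the XML file: register an EntityResolver that answers the request for the DTD itself. This is an assumption about how one might approach it, not something anyone in the thread tested. The EmptyDTDResolver and NameCollector classes, the parse_with_resolver helper and the inline SAMPLE document are all hypothetical, and the code uses Python 3 syntax. A real catalog would map the public identifier to a local copy of the DTD; the stand-in here just hands back an empty stream.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Hypothetical stand-in for mouse2.xml: same DOCTYPE, but no DTD file on disk.
SAMPLE = (
    b'<?xml version="1.0"?>\n'
    b'<!DOCTYPE Entrezgene-Set PUBLIC "-//NCBI//NCBI Entrezgene/EN"\n'
    b' "NCBI_Entrezgene.dtd">\n'
    b'<Entrezgene-Set><Entrezgene/></Entrezgene-Set>\n'
)


class EmptyDTDResolver(EntityResolver):
    """Answers every external entity request with an empty stream."""

    def __init__(self):
        self.requests = []

    def resolveEntity(self, publicId, systemId):
        # Record what the parser asked for, then return nothing to read.
        self.requests.append((publicId, systemId))
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(b""))
        return source


class NameCollector(ContentHandler):
    """Collects element names as they are encountered."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_with_resolver(stream, resolver):
    handler = NameCollector()
    saxparser = make_parser()
    # Recent Python versions skip external entities unless this is enabled,
    # in which case the resolver would never be consulted.
    saxparser.setFeature(feature_external_ges, True)
    saxparser.setEntityResolver(resolver)
    saxparser.setContentHandler(handler)
    saxparser.parse(stream)
    return handler.names


resolver = EmptyDTDResolver()
print(parse_with_resolver(io.BytesIO(SAMPLE), resolver))
print(resolver.requests)
```

Note that any entities the real DTD defines would be unknown to the parser with this approach, so it only suits documents that do not rely on them.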