Bytes IT Community

SAX XML Parse Python error message

Hi,
My first attempt at SAX, but have an error message I need help with.

I cite the error message, code, and xml below.

Be grateful if anyone can tell me what the fix is.
Thanks.

>>>
Traceback (most recent call last):
  File "C:\Python24\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 310, in RunScript
    exec codeObject in __main__.__dict__
  File "C:\pythonscripts\xml\parse3.py", line 43, in ?
    parser.parse(r'C:\perlscripts\xml\Document2.kml')
  File "C:\Python24\lib\xml\sax\expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "C:\Python24\lib\xml\sax\xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "C:\Python24\lib\xml\sax\expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File "C:\Python24\lib\xml\sax\expatreader.py", line 303, in end_element
    self._cont_handler.endElement(name)
  File "C:\pythonscripts\xml\parse3.py", line 39, in endElement
    print self.description, str(self.coordinates)
AttributeError: G_Handler instance has no attribute 'coordinates'
>>>

Code:

from xml.sax import make_parser
from xml.sax.handler import ContentHandler
import string

class G_Handler(ContentHandler):

    def __init__ (self):
        self.isFolderElement = 0
        self.isdescriptionElement = 0
        self.iscoordinatesElement = 0

    def startElement(self, name , attrs):
        if name == 'Folder':
            self.isFolderElement= 1
            self.Folder = ""
        if name == 'description':
            self.isdescriptionElement= 1
            self.description = ""
        if name == 'coordinates':
            self.iscoordinatesElement = 1
            self.coordinates = ""
    def characters (self, ch):
        if self.isFolderElement == 1:
            self.Folder = ch
        if self.isdescriptionElement == 1:
            self.description = ch
        if self.iscoordinatesElement == 1:
            self.coordinates = ch

    def endElement(self, name):
        if name == 'Folder':
            self.isFolderElement = 0
        if name == 'description':
            self.isdescriptionElement= 0
        if name == 'coordinates':
            self.iscoordinatesElement = 0
        print self.description, str(self.coordinates)

parser = make_parser()
parser.setContentHandler(G_Handler())
parser.parse(r'C:\perlscripts\xml\Document2.kml')

<?xml version="1.0" encoding="UTF-8"?>
<Folder>
<description>
abc
</description>
<coordinates>
-84.4, 33.7
</coordinates>
<description>
abc
</description>
<coordinates>
-86.7, 36.1
</coordinates>
</Folder>
Jul 13 '08 #1
5 Replies


goldtech wrote:
> My first attempt at SAX, but have an error message I need help with.

Just in case you prefer writing readable code over debugging SAX code into
existence, try lxml.

http://codespeak.net/lxml/

Here is a presentation you might find interesting.

http://codespeak.net/lxml/s5/lxml-ep2008.html

Stefan
Jul 13 '08 #2

I would be grateful for support with the code I cited. It's not long
and fairly standard. I'm sure my error(s) would be glaring to more
experienced coders. I appreciated the "heads-up" about other options
but I would be grateful for help getting this code to run. Thanks

Jul 13 '08 #3

On Jul 13, 3:00 pm, goldtech <goldt...@worldpost.com> wrote:
> I would be grateful for support with the code I cited. It's not long
> and fairly standard. I'm sure my error(s) would be glaring to more
> experienced coders. I appreciated the "heads-up" about other options
> but I would be grateful for help getting this code to run. Thanks

Initialize self.coordinates in __init__,
or indent the "print self.description, str(self.coordinates)"
one more level.
You have to remember that "endElement" is called at the end
of every element. In your case it is called for </description>, but
the parser has not seen <coordinates> yet.
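
To make that ordering visible, here is a minimal sketch that only logs the callbacks. It is written for Python 3 (the thread itself uses Python 2.4), the class name EventLogger is made up for illustration, and the document is a trimmed copy of the sample from the question.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Trimmed copy of the sample document from the question.
SAMPLE = b"""<Folder>
<description>abc</description>
<coordinates>-84.4, 33.7</coordinates>
</Folder>"""


class EventLogger(ContentHandler):
    """Hypothetical helper: records the order of start/end callbacks."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        self.events.append("start " + name)

    def endElement(self, name):
        self.events.append("end " + name)


logger = EventLogger()
xml.sax.parseString(SAMPLE, logger)
for event in logger.events:
    print(event)
```

In the log, "end description" comes before "start coordinates", so an endElement that reads self.coordinates unconditionally runs before that attribute has ever been assigned.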

In "def characters" you should be collecting the "ch" in a buffer.
It may be called multiple times for the same element.
Something like "self.description += ch" would do for starters.
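
A rough sketch of what that could look like, combining the buffering with the __init__ suggestion above. It assumes Python 3 (print as a function) and parses the sample from an in-memory string rather than the .kml file; the names BufferingHandler, current and pairs are invented for this example and are not from the original code.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# The sample document from the question, held in memory for the example.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<Folder>
<description>
abc
</description>
<coordinates>
-84.4, 33.7
</coordinates>
<description>
abc
</description>
<coordinates>
-86.7, 36.1
</coordinates>
</Folder>"""


class BufferingHandler(ContentHandler):
    """Hypothetical rewrite of G_Handler that buffers character data."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.current = None    # name of the element being read, if any
        self.description = ""  # defined up front so endElement can always read it
        self.coordinates = ""
        self.pairs = []        # collected (description, coordinates) tuples

    def startElement(self, name, attrs):
        if name == "description":
            self.current = name
            self.description = ""  # start a fresh buffer
        elif name == "coordinates":
            self.current = name
            self.coordinates = ""

    def characters(self, ch):
        # May be called several times per element, so append, never assign.
        if self.current == "description":
            self.description += ch
        elif self.current == "coordinates":
            self.coordinates += ch

    def endElement(self, name):
        if name == self.current:
            self.current = None
        if name == "coordinates":
            # Only report once both values for this pair have been read.
            self.pairs.append((self.description.strip(), self.coordinates.strip()))


handler = BufferingHandler()
xml.sax.parseString(SAMPLE, handler)
for description, coordinates in handler.pairs:
    print(description, coordinates)
```

Collecting the pairs in a list rather than printing inside endElement is only a design choice for the sketch; it keeps the handler easy to check.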

Also you do not need to convert self.coordinates to string before
printing, it is already a string and even if it was not "print"
would convert it for you.

That's it for now :-) Others may spot more issues with
your code or my response.
On the positive side I really liked how you asked
the question. There was a short runnable example and traceback.

Waldemar
Jul 13 '08 #4

Putting the print statements where they won't cause trouble and
using ... += ch (vs. only =) in the characters method fixed it:

....
    def endElement(self, name):
....
        if name == 'description':
            self.isdescriptionElement= 0
            print self.description
        if name == 'coordinates':
            self.iscoordinatesElement = 0
            print self.coordinates
....

I need to read your answer again carefully - I don't know if what I
did is best - but it seemed to fix it. Thank you for the clear and
cogent answer.

Lee G.
Jul 13 '08 #5

goldtech wrote:
> I would be grateful for support with the code I cited. It's not long
> and fairly standard. I'm sure my error(s) would be glaring to more
> experienced coders. I appreciated the "heads-up" about other options
> but I would be grateful for help getting this code to run. Thanks

For comparison, here's how an experienced Python programmer might prefer
to write your code:

import xml.etree.cElementTree as ET

description = None # most recently seen description

# iterparse (not parse) yields (event, elem) pairs as elements are completed
for event, elem in ET.iterparse("somefile.xml"):
    if elem.tag == "description":
        description = elem.text
    elif elem.tag == "coordinates":
        print description.strip(), elem.text.strip()

You may want to ask yourself why you prefer to struggle with obsolete,
error-prone, and slow technology when there are more efficient tools
available in Python's standard library.

(the lxml library that Stefan linked to is a superset of xml.etree, in
case you want more XML features).
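
As a quick check of the iterparse approach against the sample document from the question, here is a sketch under a few assumptions: it targets Python 3, where the module is imported as xml.etree.ElementTree (cElementTree no longer exists in recent versions), it reads from an in-memory buffer instead of a file, and the function name iter_pairs is made up for the example.

```python
import io
import xml.etree.ElementTree as ET

# The sample document from the question, held in memory for the example.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<Folder>
<description>
abc
</description>
<coordinates>
-84.4, 33.7
</coordinates>
<description>
abc
</description>
<coordinates>
-86.7, 36.1
</coordinates>
</Folder>"""


def iter_pairs(source):
    """Yield (description, coordinates) pairs from a file name or file object."""
    description = None  # most recently seen description
    # By default iterparse reports only "end" events, so elem.text is complete.
    for event, elem in ET.iterparse(source):
        if elem.tag == "description":
            description = elem.text
        elif elem.tag == "coordinates":
            yield description.strip(), elem.text.strip()


pairs = list(iter_pairs(io.BytesIO(SAMPLE)))
for description, coordinates in pairs:
    print(description, coordinates)
```

Like the loop above, this assumes every coordinates element is preceded by a description; a document that breaks that assumption would need a guard before calling strip().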

</F>

Jul 15 '08 #6
