Bytes IT Community

Using TreeBuilder in an ElementTree-like way

I am trying to write some Python code for a library that reads an
XML-like language from a file into elementtree data structures. I then
want to be able to read and/or modify the structure and write it out
either as XML or in the original format. I would like the API for the
XML-like language to match the elementtree API, to reduce confusion and
make it easier to learn.

While reading the elementtree documentation I found the
ElementTree.TreeBuilder class, which the documentation says can be used
to create parsers for XML-like languages. So I wrote the code below. The
code works, but I am not sure that this is really the intended way to
use the ElementTree.TreeBuilder class.

Essentially I was trying to implement the following advice from Frederik
Lundh (Wed, Sep 8 2004 12:54 am):
by the way, it's trivial to build trees from arbitrary SAX-style sources.
just create an instance of the ElementTree.TreeBuilder class, and call
the "start", "end", and "data" methods as appropriate.

builder = ElementTree.TreeBuilder()
builder.start("tag", {})
builder.data("text")
builder.end("tag")
elem = builder.close()

But in another post he wrote (Wed, May 21 2003 2:56 am):

usage:

from elementtree import ElementTree, HTMLTreeBuilder

# file is either a filename or an open stream
tree = ElementTree.parse(file, parser=HTMLTreeBuilder.TreeBuilder())
root = tree.getroot()

or

from elementtree import HTMLTreeBuilder

parser = HTMLTreeBuilder.TreeBuilder()
parser.feed(data)
root = parser.close()


This second one makes me think I should have implemented a parser class
using TreeBuilder. Also, when I used return builder.close() in the code
below, it returned an _ElementInterface rather than an ElementTree
structure.

So my question is really how I should structure the code so that using
this XML-like format is as similar as possible to using XML itself in
elementtree.
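
On the return type mentioned above: TreeBuilder.close() hands back the root element rather than a tree, so wrapping the result in ElementTree.ElementTree is the usual way to get getroot(), write() and so on. A minimal sketch of that behaviour follows. It assumes the standard-library xml.etree.ElementTree module rather than the standalone elementtree package used in this post (where elements are instances of _ElementInterface), and the tag names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Drive the builder by hand, as in the quoted advice.
builder = ET.TreeBuilder()
builder.start("settings", {})   # hypothetical tag names
builder.start("name", {})
builder.data("demo")
builder.end("name")
builder.end("settings")

root = builder.close()          # the root element, not an ElementTree
tree = ET.ElementTree(root)     # wrap it to get the tree-level API

# Serialize the tree's root back to XML text.
xml_text = ET.tostring(tree.getroot(), encoding="unicode")
print(xml_text)
```

Under that assumption, the wrapping step is what turns the builder's output into something with the same interface as the result of ElementTree.parse().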

from elementtree import ElementTree
from nltk_lite.corpora.shoebox import ShoeboxFile

class Settings(ShoeboxFile):
    def __init__(self):
        super(Settings, self).__init__()

    def parse(self, encoding=None):
        builder = ElementTree.TreeBuilder()
        for mkr, value in self.fields(encoding, unwrap=False):
            # A leading "+" or "-" on the marker opens or closes a block.
            block = mkr[0]
            if block in ("+", "-"):
                mkr = mkr[1:]
            else:
                block = None
            if block == "+":
                # Block start: open the element and leave it open.
                builder.start(mkr, {})
                builder.data(value)
            elif block == "-":
                # Block end: close the matching element.
                builder.end(mkr)
            else:
                # Plain field: a complete element with text content.
                builder.start(mkr, {})
                builder.data(value)
                builder.end(mkr)
        # close() returns the root element; wrap it in an ElementTree.
        return ElementTree.ElementTree(builder.close())
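
One possible way to get the parser-style usage from the second quoted post is to wrap the same start/data/end logic in a class that exposes feed() and close(), since ElementTree.parse() only calls those two methods on whatever is passed as parser. The following is a rough sketch under several assumptions, not a tested replacement for the code above: it uses the standard-library xml.etree.ElementTree, the class name MarkerParser is hypothetical, and the input is a simplified format of "\marker value" lines that stands in for whatever ShoeboxFile.fields() actually yields.

```python
import io
import xml.etree.ElementTree as ET


class MarkerParser:
    """Hypothetical feed()/close() parser for a simplified marker format."""

    def __init__(self):
        self._builder = ET.TreeBuilder()
        self._pending = ""  # text received but not yet split into lines

    def feed(self, data):
        if isinstance(data, bytes):
            data = data.decode("utf-8")  # assumed encoding
        self._pending += data
        lines = self._pending.split("\n")
        self._pending = lines.pop()  # keep the last, possibly partial line
        for line in lines:
            self._handle(line)

    def _handle(self, line):
        line = line.strip()
        if not line.startswith("\\"):
            return  # skip blank lines and anything without a marker
        mkr, _, value = line[1:].partition(" ")
        if mkr.startswith("+"):
            # Block start: open the element and leave it open.
            self._builder.start(mkr[1:], {})
            if value:
                self._builder.data(value)
        elif mkr.startswith("-"):
            # Block end: close the matching element.
            self._builder.end(mkr[1:])
        else:
            # Plain field: a complete element with optional text.
            self._builder.start(mkr, {})
            if value:
                self._builder.data(value)
            self._builder.end(mkr)

    def close(self):
        if self._pending:
            self._handle(self._pending)
            self._pending = ""
        return self._builder.close()  # the root element


# Usage mirrors parsing ordinary XML with a custom parser object.
sample = "\\+db\n\\ver 5.0\n\\+mkr lx\n\\nam Lexeme\n\\-mkr\n\\-db\n"
tree = ET.parse(io.StringIO(sample), parser=MarkerParser())
root = tree.getroot()
print(ET.tostring(root, encoding="unicode"))
```

With this shape the caller gets an ElementTree back from parse() and never touches the builder directly, which seems closer to the goal of making the format feel like ordinary XML in elementtree.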
Jun 28 '06 #1