XML-like language from a file into elementtree data structures. Then I
want to be able to read and/or modify the structure and then be able to
write it out either as XML or in the original format. I really want the
API for the XML-like language to be the same as the elementtree API, to
reduce confusion and make it easier to learn.

In reading the elementtree documentation I found the
ElementTree.TreeBuilder class which it says can be used to create
parsers for XML-like languages. So I wrote the code below. The code is
working but I am not sure that this is really the intended way to use
the ElementTree.TreeBuilder class.

Essentially I was trying to implement the following advice from Fredrik
Lundh (Wed, Sep 8 2004 12:54 am):

    by the way, it's trivial to build trees from arbitrary SAX-style sources.
    just create an instance of the ElementTree.TreeBuilder class, and call
    the "start", "end", and "data" methods as appropriate.

        builder = ElementTree.TreeBuilder()
        builder.start("tag", {})
        builder.data("text")
        builder.end("tag")
        elem = builder.close()
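
For reference, here is a slightly larger sketch of the same start/data/end pattern with one nested element. It assumes the standard-library xml.etree.ElementTree module rather than the standalone elementtree package used in this post, and the tag names are made up for illustration.

```python
from xml.etree import ElementTree

# Build <settings><name>example</name></settings> from SAX-style calls.
builder = ElementTree.TreeBuilder()
builder.start("settings", {})   # open the outer element
builder.start("name", {})       # open a nested element
builder.data("example")         # text content of <name>
builder.end("name")
builder.end("settings")

# close() returns the root element, not an ElementTree wrapper.
root = builder.close()

xml_text = ElementTree.tostring(root, encoding="unicode")
print(xml_text)
```

Calls to start() and end() have to nest properly, in the same way that opening and closing tags do in XML.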

But in another post he wrote (Wed, May 21 2003 2:56 am):

    usage:

        from elementtree import ElementTree, HTMLTreeBuilder
        # file is either a filename or an open stream
        tree = ElementTree.parse(file, parser=HTMLTreeBuilder.TreeBuilder())
        root = tree.getroot()

    or

        from elementtree import HTMLTreeBuilder
        parser = HTMLTreeBuilder.TreeBuilder()
        parser.feed(data)
        root = parser.close()

This second one makes me think I should have implemented a parser class
using TreeBuilder. Also, when I used return builder.close() in the code
below, it returned an _ElementInterface rather than an ElementTree.
So my question is really about how I should structure the code so that
using this XML-like format is as similar as possible to using XML itself
in elementtree.
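
To make the parser-class idea concrete, here is a minimal sketch of an object with feed() and close() methods that delegates to a TreeBuilder, so it can be passed as the parser argument to ElementTree.parse(). Several things in it are assumptions: it uses the standard-library xml.etree.ElementTree, the class name MarkerParser is hypothetical, the input is assumed to be an already opened text stream, and each line is assumed to have the form backslash, marker, space, value, with "+" and "-" prefixes marking the start and end of a block.

```python
import io
from xml.etree import ElementTree


class MarkerParser:
    """Hypothetical parser exposing the feed()/close() interface."""

    def __init__(self):
        self._builder = ElementTree.TreeBuilder()
        self._buffer = ""

    def feed(self, data):
        # Keep any incomplete final line in the buffer for the next chunk.
        self._buffer += data
        *lines, self._buffer = self._buffer.split("\n")
        for line in lines:
            self._handle_line(line)

    def _handle_line(self, line):
        line = line.strip()
        if not line.startswith("\\"):
            return  # skip blank lines and anything that is not a field
        mkr, _, value = line[1:].partition(" ")
        if mkr.startswith("+"):
            # Block start: open an element and leave it open.
            self._builder.start(mkr[1:], {})
            if value:
                self._builder.data(value)
        elif mkr.startswith("-"):
            # Block end: close the matching element.
            self._builder.end(mkr[1:])
        else:
            # Ordinary field: a leaf element holding the value.
            self._builder.start(mkr, {})
            if value:
                self._builder.data(value)
            self._builder.end(mkr)

    def close(self):
        if self._buffer:
            self._handle_line(self._buffer)
            self._buffer = ""
        return self._builder.close()  # the root element


source = io.StringIO("\\+settings\n\\name example\n\\-settings\n")
tree = ElementTree.parse(source, parser=MarkerParser())
root = tree.getroot()
```

With this shape, close() still returns only the root element, and it is ElementTree.parse() that wraps the result in an ElementTree, which matches the first usage pattern quoted above.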

from elementtree import ElementTree
from nltk_lite.corpora.shoebox import ShoeboxFile

class Settings(ShoeboxFile):
    def __init__(self):
        super(Settings, self).__init__()

    def parse(self, encoding=None):
        builder = ElementTree.TreeBuilder()
        for mkr, value in self.fields(encoding, unwrap=False):
            # A leading "+" opens a block and a leading "-" closes it.
            block = mkr[0]
            if block in ("+", "-"):
                mkr = mkr[1:]
            else:
                block = None
            if block == "+":
                builder.start(mkr, {})
                builder.data(value)
            elif block == "-":
                builder.end(mkr)
            else:
                # Ordinary field: a leaf element holding the value.
                builder.start(mkr, {})
                builder.data(value)
                builder.end(mkr)
        # close() returns the root element; wrap it in an ElementTree.
        return ElementTree.ElementTree(builder.close())
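
For the other half of the goal, writing a modified tree back out either as XML or in the original format, something along these lines might work. This is only a sketch: to_markers is a hypothetical helper, not part of elementtree or nltk_lite, the backslash-marker line layout is an assumption, and the tree is built by hand with the standard-library xml.etree.ElementTree so that the example runs without a settings file.

```python
from xml.etree import ElementTree


def to_markers(elem, lines=None):
    """Hypothetical helper: turn an element into marker-format lines."""
    if lines is None:
        lines = []
    text = (elem.text or "").strip()
    if len(elem):
        # Elements with children become a "+" ... "-" block.
        lines.append(("\\+%s %s" % (elem.tag, text)).rstrip())
        for child in elem:
            to_markers(child, lines)
        lines.append("\\-%s" % elem.tag)
    else:
        # Leaf elements become a single field line.
        lines.append(("\\%s %s" % (elem.tag, text)).rstrip())
    return lines


# Stand-in for the tree that parse() would return.
root = ElementTree.Element("settings")
name = ElementTree.SubElement(root, "name")
name.text = "example"

# Modify through the ordinary element API, then write out both ways.
root.find("name").text = "changed"
xml_text = ElementTree.tostring(root, encoding="unicode")
marker_text = "\n".join(to_markers(root))
```

One limitation of this sketch is that a block with no child fields would be written back as an ordinary field, so it does not round-trip every possible input.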