pa***********@gmail.com wrote:
    def parse_for_products(filename):
        for event, elem in iterparse(filename):
            if elem.tag == "Products":
                root = ElementTree(elem)
                print_all(root)
            else:
                elem.clear()
My problem is that if I pass the 'elem' found by iterparse and then try to
print all attributes, children and tail text, I only get elem.tag;
elem.keys() returns nothing, as do all of the other previously useful
ElementTree methods.
Am I right in thinking that you can pass an element into ElementTree?
How might I manually iterate through <product>...</product>, grabbing
everything?
by default, iterparse only returns "end" events, which means that the
iterator will visit the children of Products before you see the Products
element itself. with the code above, this means that the children will
be nuked (by the elem.clear() call in the else branch) before you get
around to processing the parent.
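to see the ordering for yourself, here's a minimal sketch. the sample document and the helper names (end_event_order, children_after_clearing) are made up for illustration, and it assumes the standard library's xml.etree.ElementTree rather than the standalone elementtree package:

```python
from io import BytesIO
from xml.etree.ElementTree import iterparse

# made-up sample document, for illustration only
SAMPLE = (
    b"<Products>"
    b"<Product id='1'>widget</Product>"
    b"<Product id='2'>gadget</Product>"
    b"</Products>"
)

def end_event_order(source):
    # default events: only "end", so children are reported before the parent
    return [elem.tag for event, elem in iterparse(source)]

def children_after_clearing(source):
    # same shape as the code in the question: clear everything that
    # isn't Products, then look at what is left of the children
    for event, elem in iterparse(source):
        if elem.tag == "Products":
            return [(child.tag, child.attrib, child.text) for child in elem]
        else:
            elem.clear()

order = end_event_order(BytesIO(SAMPLE))
cleared = children_after_clearing(BytesIO(SAMPLE))
print(order)    # ['Product', 'Product', 'Products']
print(cleared)  # [('Product', {}, None), ('Product', {}, None)]
```

the children are still attached to Products, but their attributes and text are gone by the time the parent shows up.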
depending on how much rubbish you have in the file, you can do
    for event, elem in iterparse(filename):
        if elem.tag == "Products":
            process(elem)
            elem.clear()
or
    for event, elem in iterparse(filename):
        if elem.tag == "Products":
            process(elem)
            elem.clear()
        elif elem.tag in ("Rubbish1", "Rubbish2"):
            elem.clear()
or
    inside = False
    for event, elem in iterparse(filename, events=("start", "end")):
        if event == "start":
            # we've seen the start tag for this element, but not
            # necessarily the end tag
            if elem.tag == "Products":
                inside = True
        else:
            # we've seen the end tag
            if elem.tag == "Products":
                process(elem)
                elem.clear()
                inside = False
            elif not inside:
                elem.clear()
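a self-contained sketch of the last variant, in case you want something to run as-is. the sample data, the Catalog wrapper element, and this particular process() are assumptions for illustration only; replace process() with whatever you need to do with each Products element:

```python
from io import BytesIO
from xml.etree.ElementTree import iterparse

# made-up sample: one rubbish element outside Products, two products inside
SAMPLE = (
    b"<Catalog>"
    b"<Rubbish1 x='1'>junk</Rubbish1>"
    b"<Products>"
    b"<Product id='1'>widget</Product>"
    b"<Product id='2'>gadget</Product>"
    b"</Products>"
    b"</Catalog>"
)

def process(elem):
    # hypothetical stand-in: collect tag, attributes and text of each child
    return [(child.tag, dict(child.attrib), child.text) for child in elem]

def parse_for_products(source):
    results = []
    inside = False
    for event, elem in iterparse(source, events=("start", "end")):
        if event == "start":
            # start tag seen; the element is not necessarily complete yet
            if elem.tag == "Products":
                inside = True
        else:
            # end tag seen; the element and its children are complete
            if elem.tag == "Products":
                results.extend(process(elem))
                elem.clear()
                inside = False
            elif not inside:
                # only throw away elements that live outside Products
                elem.clear()
    return results

found = parse_for_products(BytesIO(SAMPLE))
print(found)
```

iterparse accepts either a filename or a file object, so the BytesIO wrapper is only there to keep the example free of external files.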
for more info, see
http://effbot.org/zone/element-iterparse.htm
</F>