Damjan wrote:
Is there any way I could get everything between the <div> and </div> tag?
<div>
text
some other text<br/>
and then some more
</div>
gettext(et)
'\n text\n some other text\n and then some more\n'
I actually need to get
'\n text\n some other text<br/>\n and then some more\n'
that's not the tree content, that's a serialized XML fragment.
the quickest way to do that is to serialize the entire element, and
strip off the start and end tags:
text = ElementTree.tostring(elem)
text = text.split(">", 1)[1].rsplit("<", 1)[0]
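for reference, here is a minimal sketch of the same serialize-and-strip idea, assuming Python 3 and the standard-library xml.etree.ElementTree, applied to the <div> sample from the question. the name inner_xml is made up for illustration, and encoding="unicode" is passed so that tostring() returns a str rather than bytes. the serializer writes empty elements as <br />, so the result is equivalent to the source text but not byte-identical.

```python
import xml.etree.ElementTree as ET


def inner_xml(elem):
    """Return the serialized content between elem's start and end tags.

    Hypothetical helper: serialize the whole element, then strip the tags.
    """
    # encoding="unicode" makes tostring() return str instead of bytes.
    text = ET.tostring(elem, encoding="unicode")
    # Drop everything up to the first ">" (the start tag), then everything
    # from the last "<" onwards (the end tag and any tail text after it).
    return text.split(">", 1)[1].rsplit("<", 1)[0]


elem = ET.fromstring(
    "<div>\ntext\nsome other text<br/>\nand then some more\n</div>"
)
print(repr(inner_xml(elem)))
```

this relies on the serializer escaping ">" inside attribute values and "<" inside text, so the first ">" and the last "<" always belong to the element's own tags.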
alternatively, you can serialize the subelements, and add in properly
encoded text and tail attributes:
def innersource(elem, encoding="ascii"):
    text = ElementTree._encode(elem.text or "", encoding)
    for subelem in elem:
        text = text + ElementTree.tostring(subelem)
        if subelem.tail:
            text = text + ElementTree._encode(subelem.tail, encoding)
    return text
(but _encode is not an official part of the elementtree API, so this code
may not work in post-1.2 releases)
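
a rough sketch of the subelement-by-subelement approach that avoids the private _encode helper, assuming Python 3 and the standard-library xml.etree.ElementTree. two things here are assumptions rather than part of the code above: the leading text is escaped with xml.sax.saxutils.escape, and no tail is appended by hand, because the standard-library tostring() already includes a subelement's tail in its output.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def innersource(elem):
    """Serialize elem's content (text plus subelements) without its own tags."""
    # The leading text is plain character data, so escape &, < and > here.
    parts = [escape(elem.text or "")]
    for subelem in elem:
        # In the standard library, tostring() emits the subelement followed
        # by its tail text, so the tail is not added separately.
        parts.append(ET.tostring(subelem, encoding="unicode"))
    return "".join(parts)


elem = ET.fromstring("<div>a &amp; b<br/>tail<i>x</i></div>")
print(innersource(elem))
```

unlike the split-based version, this never looks at the outer tags, so it does not depend on where ">" or "<" appear in the serialized string.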
</F>