ElementTree, how to get the whole content of a tag

Given the following XML snippet, I build an ElementTree instance with
et=ElementTree.fromstring(..). Now et.text returns just '\n text\n some
other text'.
Is there any way I could get everything between the <div> and </div> tags?

<div>
text
some other text<br/>
and then some more
</div>
--
damjan
Jul 18 '05 #1
Damjan <gd*****@gmail.com> wrote:
> Given the following XML snippet, I build an ElementTree instance with
> et=ElementTree.fromstring(..). Now et.text returns just '\n text\n some
> other text'.
> Is there any way I could get everything between the <div> and </div> tag?
>
> <div>
> text
> some other text<br/>
> and then some more
> </div>


def gettext(elem):
    text = elem.text or ""
    for subelem in elem:
        text = text + gettext(subelem)
        if subelem.tail:
            text = text + subelem.tail
    return text

>>> gettext(et)
'\n text\n some other text\n and then some more\n'
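For anyone reading this on a current Python 3: a minimal sketch of the same idea using Element.itertext() from the standard library's xml.etree.ElementTree, which does the text/tail walk for you. The SNIPPET string is an assumption reconstructed from the question, so the exact whitespace may differ from the original document.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of the snippet from the question.
SNIPPET = "<div>\n text\n some other text<br/>\n and then some more\n</div>"


def gettext(elem):
    # itertext() visits the element and its descendants in document order
    # and yields every non-empty .text and .tail string it finds.
    return "".join(elem.itertext())


et = ET.fromstring(SNIPPET)
print(repr(gettext(et)))
```

Like the recursive version, this returns only character data; the <br/> markup itself is not part of the result.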

</F>

Jul 18 '05 #2
>> Is there any way I could get everything between the <div> and </div> tag?
>>
>> <div>
>> text
>> some other text<br/>
>> and then some more
>> </div>
>
> gettext(et)
>
> '\n text\n some other text\n and then some more\n'


I actually need to get
'\n text\n some other text<br/>\n and then some more\n'

And if there were attributes in <br/>, I'd want them too, where they were.
Can't I just get ALL the text between the <div> tags?

--
damjan
Jul 18 '05 #3
Damjan wrote:
>>> Is there any way I could get everything between the <div> and </div> tag?
>>>
>>> <div>
>>> text
>>> some other text<br/>
>>> and then some more
>>> </div>
>>
>> gettext(et)
>>
>> '\n text\n some other text\n and then some more\n'
>
> I actually need to get
> '\n text\n some other text<br/>\n and then some more\n'


that's not the tree content, that's a serialized XML fragment.

the quickest way to do that is to serialize the entire element, and
strip off the start and end tags:

text = ElementTree.tostring(elem)
text = text.split(">", 1)[1].rsplit("<", 1)[0]
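On Python 3 the two lines above need one adjustment, because tostring() returns bytes by default. A hedged sketch, assuming the standard library's xml.etree.ElementTree; the function name inner_xml_quick is made up for illustration:

```python
import xml.etree.ElementTree as ET


def inner_xml_quick(elem):
    # encoding="unicode" makes tostring() return str instead of bytes.
    text = ET.tostring(elem, encoding="unicode")
    # Drop everything up to the end of the start tag, then everything
    # from the start of the end tag onwards.
    return text.split(">", 1)[1].rsplit("<", 1)[0]


div = ET.fromstring(
    "<div>\n text\n some other text<br/>\n and then some more\n</div>"
)
print(repr(inner_xml_quick(div)))
```

Note that the serializer writes empty elements as <br /> (with a space), so the result is equivalent XML but not byte-for-byte what was in the input.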

alternatively, you can serialize the subelements, and add in properly
encoded text and tail attributes:

def innersource(elem, encoding="ascii"):
    text = ElementTree._encode(elem.text or "", encoding)
    for subelem in elem:
        text = text + ElementTree.tostring(subelem)
        if subelem.tail:
            text = text + ElementTree._encode(subelem.tail, encoding)
    return text

(but _encode is not an official part of the elementtree API, so this code
may not work in post-1.2 releases)
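A sketch of the same approach that avoids the private helper, assuming Python 3's standard library. Two things differ from the code above and are assumptions on my part: the leading text is escaped with xml.sax.saxutils.escape instead of _encode, and the tail is not appended separately, because in the current standard library tostring() on a subelement already includes that subelement's tail.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def innersource(elem):
    # Escape &, < and > in the leading text so the result is valid XML.
    parts = [escape(elem.text or "")]
    for subelem in elem:
        # tostring() serializes the subelement followed by its tail,
        # so the tail must not be added a second time.
        parts.append(ET.tostring(subelem, encoding="unicode"))
    return "".join(parts)


div = ET.fromstring('<div>a &amp; b<br id="1"/>c &lt; d</div>')
print(innersource(div))
```

Attributes on the subelements come through in place, which is what the question asked for.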

</F>

Jul 18 '05 #4
