
ElementTree surprise

I have a doc with a bunch of fields like:

<foo bar="spam">stuff</foo>
<foo bar="penguin">other stuff</foo>

and sometimes

<foo bar="parrot"></foo>

I use ElementTree to parse the doc and I use the .text attribute
to get "stuff" or "other stuff" in the spam and penguin examples.

I'd expect .text to be the empty string in the parrot example, but
instead it is None.

I can fix my script to deal with this, but it's surprising. Is it
intentional? I could understand it being None if the doc had said

<foo bar="parrot"/>

but that is different.

Disclaimer: I'm not even slightly an XML expert, I just find myself
having to deal with a lot of it.
Aug 16 '07 #1
Hi there!

Paul Rubin writes:
> I have a doc with a bunch of fields like:
>
> <foo bar="spam">stuff</foo>
> <foo bar="penguin">other stuff</foo>
>
> and sometimes
>
> <foo bar="parrot"></foo>
>
> I use ElementTree to parse the doc and I use the .text attribute
> to get "stuff" or "other stuff" in the spam and penguin examples.
>
> I'd expect .text to be the empty string in the parrot example, but
> instead it is None.
Technically, text is held in nodes, just like elements. In the
parrot example, there is not an empty text node; there is no text node at all.
> I can fix my script to deal with this, but it's surprising. Is it
> intentional? I could understand it being None if the doc had said
>
> <foo bar="parrot"/>
>
> but that is different.
<foo bar="parrot"/> and <foo bar="parrot"></foo> are mapped to the
same thing by any XML parser, and I think it wouldn't be standards-
conforming for an XML parser to pass this difference on to a caller.
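A quick way to check this is the following minimal sketch, which uses only the standard-library xml.etree.ElementTree. The wrapper element `doc` and the variable names are made up for illustration; they are not from the original document.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the "parrot" element is written both ways.
DOC = """<doc>
<foo bar="spam">stuff</foo>
<foo bar="parrot"></foo>
<foo bar="parrot"/>
</doc>"""

root = ET.fromstring(DOC)
texts = [foo.text for foo in root.findall("foo")]
print(texts)  # both "parrot" elements report the same .text value

# Once parsed, the two spellings also serialize to the same bytes.
explicit = ET.tostring(ET.fromstring('<foo bar="parrot"></foo>'))
short = ET.tostring(ET.fromstring('<foo bar="parrot"/>'))
print(explicit == short)
```

After parsing, nothing in the tree records which spelling the source used.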
> Disclaimer: I'm not even slightly an XML expert, I just find myself
> having to deal with a lot of it.
I think the question is how XMLish the access via ElementTree should
be. While it is in principle correct that there is no text node in
parrot, it may be sensible to set it to "" for practical reasons.
As far as I can see, there is no empty text node in XML, so no
ambiguity would occur.

Bye,
Torsten.

--
Torsten Bronger, aquisgrana, europa vetus
Jabber ID: br*****@jabber.org
(See http://ime.webhop.org for ICQ, MSN, etc.)
Aug 16 '07 #2
Torsten Bronger <br*****@physik.rwth-aachen.de> writes:
>> <foo bar="parrot"></foo>
> Technically, text is held in nodes, just like elements. In the
> parrot example, there is not an empty text node; there is no text node at all.
Is that required by the XML standard? If so, ElementTree is doing
the right thing, but it surprises me; I would have expected an empty
string. Thanks.
Aug 16 '07 #3
Paul Rubin wrote:
> Torsten Bronger <br*****@physik.rwth-aachen.de> writes:
>>> <foo bar="parrot"></foo>
>> Technically, text is held in nodes, just like elements. In the
>> parrot example, there is not an empty text node; there is no text node at all.
>
> Is that required by the XML standard? If so, ElementTree is doing
> the right thing, but it surprises me; I would have expected an empty
> string. Thanks.
The XML standard defines both as being equivalent, so any XML parser would
handle them exactly the same. Also, as most XML parsers have a SAX(-like)
interface, which always generates events in the "<foo></foo>" form, there is
not even a way for applications or libraries to distinguish between the two.
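
To illustrate the SAX point, here is a small sketch using the standard-library xml.sax module. The `EventRecorder` class and the `sax_events` helper are hypothetical names invented for this example; the handler just records whatever callbacks the parser makes for each spelling.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Collects the SAX events the parser reports, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        self.events.append(("characters", content))

    def endElement(self, name):
        self.events.append(("end", name))


def sax_events(xml_bytes):
    """Parse xml_bytes and return the list of recorded events."""
    handler = EventRecorder()
    xml.sax.parseString(xml_bytes, handler)
    return handler.events


explicit = sax_events(b'<foo bar="parrot"></foo>')
short = sax_events(b'<foo bar="parrot"/>')
print(explicit)
print(explicit == short)
```

Both inputs produce a start event followed by an end event, with no characters event in between, so a tree builder sitting on top of this interface has nothing to tell them apart by.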

So it's not even an ElementTree thing. ET just doesn't know what exactly was
in the original XML byte stream. A very simple way to make sure you always get
a string back is
    >>> text = element.text or ""
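
Wrapped up as a helper, that idiom could look like the sketch below. The function name `text_of` and the sample document are assumptions made for illustration. As far as I know, `findtext()` is documented to return an empty string when the matched element has no text, so it can serve the same purpose when looking up a child by path.

```python
import xml.etree.ElementTree as ET


def text_of(element):
    """Return element.text, or "" when the parser left it as None."""
    return element.text or ""


# Hypothetical sample document with one filled and one empty element.
root = ET.fromstring(
    '<doc><foo bar="spam">stuff</foo><foo bar="parrot"></foo></doc>'
)

values = {foo.get("bar"): text_of(foo) for foo in root.findall("foo")}
print(values)

# findtext() normalizes a missing text value to "" for a matched element.
parrot_text = root.findtext("foo[@bar='parrot']")
print(repr(parrot_text))
```

Note that `or ""` also turns any other falsy value into an empty string, which is harmless here because .text is only ever a string or None.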
BTW, you'd be even more surprised to see that ET can actually /store/ "" as
text if you tell it to, and then returns an empty string when you ask for the
.text property. But any empty text coming from the parser will always be None.
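
A short sketch of that difference, using only the standard library (the variable names are made up for this example):

```python
import xml.etree.ElementTree as ET

elem = ET.Element("foo", bar="parrot")
fresh = elem.text        # a newly created element has no text set yet

elem.text = ""           # store an empty string explicitly
stored = elem.text       # the tree hands back exactly what was stored

# Serialize and parse again: the value now comes from the parser.
reparsed = ET.fromstring(ET.tostring(elem))

print(repr(fresh), repr(stored), repr(reparsed.text))
```

The explicitly stored "" does not survive a round trip through serialization, because the parser reports empty content as None.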

Oh, and lxml.etree behaves exactly the same as ElementTree here. :)

Stefan
Aug 16 '07 #4
Stefan Behnel <st******************@web.de> writes:
> So it's not even an ElementTree thing. ET just doesn't know what
> exactly was in the original XML byte stream. A very simple way to
> make sure you always get a string back is
>     >>> text = element.text or ""
Thanks, I ended up doing something like that. What I wondered about
the standard was whether it specified that parrot had no text node,
as opposed to having an empty text node. I guess it doesn't matter,
it just caught me by surprise.
Aug 16 '07 #5
