ElementTree should parse string and file in the same way

Peter Pei

One bad design in ElementTree is that it parses a string and a file in
different ways; even worse, they return different objects:
1) When you parse a file, you can simply call parse, which returns an
ElementTree, on which you can then apply XPath;
2) To parse a string (an XML section), you can call XML or fromstring, but both
return an Element instead of an ElementTree. This alone is bad. To make it worse,
you have to create an ElementTree from this Element before you can use
XPath.
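
For readers following along, the difference described above can be sketched as follows. This assumes the standard-library `xml.etree.ElementTree` module (the thread may equally have meant the standalone `elementtree` package); the sample document and the temporary file are made up for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

XML_TEXT = "<root><item>a</item><item>b</item></root>"  # made-up sample

# Write the sample to a temporary file so parse() has something to read.
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as handle:
    handle.write(XML_TEXT)
    path = handle.name

try:
    tree = ET.parse(path)        # parsing a file gives an ElementTree
finally:
    os.remove(path)

root = ET.fromstring(XML_TEXT)   # parsing a string gives an Element

tree_type = type(tree).__name__
root_type = type(root).__name__
same_root_tag = tree.getroot().tag == root.tag
```

The two results hold the same data; the ElementTree is a thin wrapper whose `getroot()` returns the same kind of Element that `fromstring()` returns directly.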

Dec 31 '07 #1

Paddy

On Dec 31, 3:42 am, "Peter Pei" <yan...@telus.com> wrote:

> One bad design in ElementTree is that it parses a string and a file in
> different ways; even worse, they return different objects:
> 1) When you parse a file, you can simply call parse, which returns an
> ElementTree, on which you can then apply XPath;
> 2) To parse a string (an XML section), you can call XML or fromstring, but both
> return an Element instead of an ElementTree. This alone is bad. To make it worse,
> you have to create an ElementTree from this Element before you can use
> XPath.

I haven't tried this, but you should be able to wrap your text string
so that it looks like a file using the stringio module and pass that
to elementtree:

http://blog.doughellmann.com/2007/04...cstringio.html

- Paddy.
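
A minimal sketch of the suggestion above, assuming Python 3, where `io.StringIO` plays the role of the older `StringIO` module; the sample text is made up.

```python
import io
import xml.etree.ElementTree as ET

xml_text = "<root><item>a</item></root>"  # made-up sample

# io.StringIO gives the string a file-like read() interface,
# so parse() treats it as an open file and returns an ElementTree.
tree = ET.parse(io.StringIO(xml_text))
first_item = tree.find("item").text
```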

Dec 31 '07 #2

Stefan Behnel

Peter Pei wrote:

> One bad design in ElementTree is that it parses a string and a file in
> different ways; even worse, they return different objects:
> 1) When you parse a file, you can simply call parse, which returns an
> ElementTree, on which you can then apply XPath;

ElementTree doesn't support XPath. In case you mean the simpler ElementPath
language that is supported by the find*() methods, I do not see a reason why
you can't use it on elements.
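
To illustrate the point, here is a short sketch of the `find*()` methods called directly on an Element returned by `fromstring()`. It assumes the standard-library `xml.etree.ElementTree`; the sample document and tag names are invented.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<library>"
    "<book><title>A</title></book>"
    "<book><title>B</title></book>"
    "</library>"
)

# All of these are methods of Element; no ElementTree wrapper is involved.
first_title = root.find("book/title").text
all_titles = [t.text for t in root.findall("book/title")]
descendant_titles = [t.text for t in root.findall(".//title")]
missing = root.findtext("book/isbn", default="n/a")
```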

> 2) To parse a string (an XML section), you can call XML or fromstring, but
> both return an Element instead of an ElementTree. This alone is bad. To make
> it worse, you have to create an ElementTree from this Element before you
> can use XPath.

a) how hard is it to write a wrapper function around fromstring() that wraps
the result Element in an ElementTree object and returns it?

b) the same as above applies: I can't see the problem you are talking about.
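
The wrapper mentioned in (a) might look like the sketch below. The name `parse_string` is hypothetical and not part of ElementTree; only `fromstring()` and the `ElementTree` class come from the library.

```python
import xml.etree.ElementTree as ET


def parse_string(text):
    """Hypothetical helper: parse XML text and return an ElementTree."""
    return ET.ElementTree(ET.fromstring(text))


tree = parse_string("<root><item>a</item></root>")
root_tag = tree.getroot().tag
item_text = tree.findtext("item")
```

With such a helper, code that expects the result of `parse()` can be handed a tree built from a string without further changes.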

Stefan

Dec 31 '07 #3

Peter Pei

You are talking shit. It is never about whether it is hard to write a
wrapper. It is about bad design. I should be able to parse a string and a
file in exactly the same way, and that should be provided as part of the
package.

Looks like you are just a code monkey not a designer, so I forgive you. You
didn't understand the issue I described? That's your issue. You are not at
the same level to talk to me, so chill.

Jan 1 '08 #4

Peter Pei

To be preise, XPath is not fully supported. Don't be a smart asshole.

Jan 1 '08 #5

Steven D'Aprano

On Tue, 01 Jan 2008 01:53:47 +0000, Peter Pei wrote:

> You are talking shit. It is never about whether it is hard to write a
> wrapper. It is about bad design. I should be able to parse a string and
> a file in exactly the same way, and that should be provided as part of the
> package.

Oh my, somebody decided to start the new year with all guns blazing.

Before abusing anyone else, have you considered asking *why* ElementTree
does not treat files and strings the same way? I believe the writer of
ElementTree, Fredrik Lundh, frequents this newsgroup.

It may be that Fredrik doesn't agree with you that you should be able to
parse a string and a file the same way, in which case there's nothing you
can do but work around it. On the other hand, perhaps he just hasn't had
a chance to implement that functionality, and would welcome a patch.

Fredrik, if you're reading this, I'm curious what your reason is. I don't
have an opinion on whether you should or shouldn't treat files and
strings the same way. Over to you...

--
Steven

Jan 1 '08 #6

Stefan Behnel

Peter Pei wrote:

> To be preise

[...]

Preise the lord, not me. :)

Happy New Year!

Stefan

Jan 1 '08 #7

Steven D'Aprano

On Tue, 01 Jan 2008 13:36:57 +0100, Diez B. Roggisch wrote:

> And codemonkeys know that in python
>
>     doc = et.parse(StringIO(string))
>
> is just one import away

Yes, but to play devil's advocate for a moment,

doc = et.parse(string_or_file)

would be even simpler.

Is there any reason why it should not behave that way? It could be as
simple as adding a couple of lines to the parse method:

    if isinstance(arg, str):
        from StringIO import StringIO  # Python 3: from io import StringIO
        arg = StringIO(arg)

I'm not saying it *should*, I'm asking if there's a reason it *shouldn't*.

"I find it aesthetically distasteful" would be a perfectly acceptable
answer -- not one I would agree with, but I could accept it.

--
Steven

Jan 1 '08 #8

Steven Bethard

Steven D'Aprano wrote:

> On Tue, 01 Jan 2008 13:36:57 +0100, Diez B. Roggisch wrote:
>
>> And codemonkeys know that in python
>>
>>     doc = et.parse(StringIO(string))
>>
>> is just one import away
>
> Yes, but to play devil's advocate for a moment,
>
>     doc = et.parse(string_or_file)
>
> would be even simpler.

I assume the problem with this is that it would be ambiguous. You can
already use either a string or a file with ``et.parse``. A string is
interpreted as a file name, while a file object is used directly.

How would you differentiate between a string that's supposed to be a
file name, and a string that's supposed to be XML?
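
The ambiguity can be seen in a small sketch, assuming the standard-library `xml.etree.ElementTree` and a working directory that contains no file with this odd name: a string handed to `parse()` is opened as a file name, even when it holds XML.

```python
import xml.etree.ElementTree as ET

xml_text = "<root/>"  # XML text, but parse() will read it as a path

try:
    ET.parse(xml_text)           # tries to open a file named "<root/>"
except OSError as exc:           # e.g. FileNotFoundError on most systems
    outcome = type(exc).__name__
else:
    outcome = "parsed"

# fromstring() is the call that treats the same string as XML content.
root_tag = ET.fromstring(xml_text).tag
```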

Steve

Jan 1 '08 #9

Steven D'Aprano

On Tue, 01 Jan 2008 12:59:44 -0700, Steven Bethard wrote:

> Steven D'Aprano wrote:
>> On Tue, 01 Jan 2008 13:36:57 +0100, Diez B. Roggisch wrote:
>>
>>> And codemonkeys know that in python
>>>
>>>     doc = et.parse(StringIO(string))
>>>
>>> is just one import away
>>
>> Yes, but to play devil's advocate for a moment,
>>
>>     doc = et.parse(string_or_file)
>>
>> would be even simpler.
>
> I assume the problem with this is that it would be ambiguous. You can
> already use either a string or a file with ``et.parse``. A string is
> interpreted as a file name, while a file object is used directly.

Ah! I wasn't aware that parse() operated on either an open file object or
a string file name. That's an excellent reason for not treating strings
the same as files in ElementTree.

> How would you differentiate between a string that's supposed to be a
> file name, and a string that's supposed to be XML?

Well, naturally I wouldn't.

I *could*, if I assumed that a multi-line string that started with "<"
was XML, and a single-line string with the path separator character or
ending in ".xml" was a file name, but that sort of Do What I Mean coding
is foolish in a library function that can't afford to occasionally Do The
Wrong Thing.
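
A rough sketch of the guessing rules described above shows how easily they misfire. The function `guess_kind` and its return labels are hypothetical, written only to illustrate the heuristic; nothing like it exists in ElementTree.

```python
PATH_SEPARATORS = ("/", "\\")  # assumption: treat both common separators alike


def guess_kind(text):
    """Hypothetical guesser following the rules sketched in the post."""
    one_line = "\n" not in text
    if not one_line and text.lstrip().startswith("<"):
        return "xml"
    if one_line and (
        any(sep in text for sep in PATH_SEPARATORS) or text.endswith(".xml")
    ):
        return "filename"
    return "unknown"


multi_line_xml = guess_kind("<root>\n  <item/>\n</root>")
one_line_xml = guess_kind("<root><item/></root>")
bare_name = guess_kind("data")
```

The one-line XML string is classified as a file name because its closing tags contain "/", and a plain file name such as "data" matches neither rule, which is the kind of wrong guess a library function cannot afford.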
--
Steven

Jan 1 '08 #10

Peter Pei

To answer something posted deep down... It is fine with me if there are two
functions - one to parse a file or file handler and one to parse a string,
yet the returned objects should be consistent.

Jan 2 '08 #11

Fredrik Lundh

Steven D'Aprano wrote:

> Fredrik, if you're reading this, I'm curious what your reason is. I don't
> have an opinion on whether you should or shouldn't treat files and
> strings the same way. Over to you...

as Diez shows, it's all about use cases.

and as anyone who's used my libraries or read my code knows, I'm a big
fan of minimalistic but highly composable object API:s and liberal use
of short helper functions to wire them up to fit the task at hand.

kitchen sink API design is a really bad idea, for more reasons than I
can fit in this small editor window.

</F>

Jan 2 '08 #12
