I'm confused, I want to read/write XML files but I don't really understand
what library to use.
I've used DOM-based libraries in other languages, is PyXML the library to
use?
You can try xml.dom and xml.sax. Both are built into Python's standard
library, and you can read and write XML files with them quite easily.
There are a number of third-party modules for Python that manipulate
XML, but these are the basic ones.
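For anyone coming from DOM libraries in other languages, here is a minimal sketch of reading and writing with xml.dom.minidom from the standard library. The document contents and element names (library, book) are made up for illustration.

```python
from xml.dom import minidom

# Parse a small document from a string; minidom.parse() takes a
# filename or file object instead.
doc = minidom.parseString(
    '<library><book id="1">Dive Into Python</book></library>'
)

# Read: walk the tree with the usual DOM calls.
book = doc.getElementsByTagName("book")[0]
title = book.firstChild.data        # content of the text node
book_id = book.getAttribute("id")

# Write: create a new element, attach it, and serialize the tree.
new_book = doc.createElement("book")
new_book.setAttribute("id", "2")
new_book.appendChild(doc.createTextNode("Python in a Nutshell"))
doc.documentElement.appendChild(new_book)

xml_text = doc.toxml()
print(title, book_id)
print(xml_text)
```

To write to disk rather than a string, doc.writexml() accepts an open file object.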
On Tue, 13 Sep 2005 19:23:50 +0200
Kalle Anke <sk*****@gmail.com> wrote: I'm confused, I want to read/write XML files but I don't really understand what library to use.
I've used DOM-based libraries in other languages, is PyXML the library to use?
PyXML will do the job. I'm currently using it in one of my projects.
4suite has their cDomlette also, which provides a high-speed
lightweight DOM implementation. Although, if you don't need
canonicalization, XPath, or those kinds of extensions, you can probably
get by with minidom and the other code included in the standard Python
distribution, and avoid the need to install additional libraries.
I have also heard excellent things about ElementTree; I haven't used it
myself though (largely because I can't find any resources on doing XML
canonicalization with it).
-Michael
me******@scl.ameslab.gov wrote: I have also heard excellent things about ElementTree; I haven't used it myself though (largely because I can't find any resources on doing XML canonicalization with it).
ElementTree/cElementTree is really easy to use and Pythonic.
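As a rough illustration of what "Pythonic" means here, the following sketch reads and extends a small document with ElementTree. It assumes a current Python, where the library ships as xml.etree.ElementTree; the sample data and element names are invented.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<library><book id="1">Dive Into Python</book></library>'
)

# Read: children are found with simple path expressions, text and
# attributes are plain Python strings.
titles = [book.text for book in root.findall("book")]
first_id = root.find("book").get("id")

# Write: SubElement creates a child and attaches it in one step.
new_book = ET.SubElement(root, "book", id="2")
new_book.text = "Python in a Nutshell"

# Serialize to a string; ET.ElementTree(root).write(path) writes a file.
xml_text = ET.tostring(root, encoding="unicode")
print(titles, first_id)
print(xml_text)
```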
--
Michael Hoffman
me******@scl.ameslab.gov wrote: I have also heard excellent things about ElementTree; I haven't used it myself though (largely because I can't find any resources on doing XML canonicalization with it).
You can use lxml which is an implementation of the ElementTree API using
libxml2 and libxslt under the covers for greater standards compliance
including c14n. I've been using it extensively recently and highly
recommend it. http://codespeak.net/lxml
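To show what canonicalization buys you, here is a small sketch. Note the substitution: lxml is a third-party package, so this uses xml.etree.ElementTree.canonicalize from the standard library instead (available from Python 3.8, implementing C14N 2.0). It is not the lxml API, and the sample documents are made up.

```python
from xml.etree.ElementTree import canonicalize

# Two serializations of the same logical document: attribute order,
# quote style, whitespace inside the tag and empty-element syntax differ.
doc_a = '<item b="2" a="1"/>'
doc_b = "<item a='1'   b='2'></item>"

# canonicalize() returns the canonical form as a string when no
# output file is given.
canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

# The raw strings differ, the canonical forms should not.
print(doc_a == doc_b)
print(canon_a == canon_b)
```

With lxml itself, canonical output is, as far as I know, requested through etree.tostring(tree, method="c14n"); check the lxml documentation before relying on that.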
--
Robert Kern rk***@ucsd.edu
"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
Kalle Anke <sk*****@gmail.com> writes: I'm confused, I want to read/write XML files but I don't really understand what library to use.
I've used DOM-based libraries in other languages, is PyXML the library to use?
It depends. Just as there's no best car - "best" is very dependent on the use
of the vehicle concerned, in addition to personal preferences - there's no best
XML module either. Some seem very good in many respects, though :)
I recommend using EffBot's ElementTree. It's very simple to use (you get to do
stuff without thinking about the intricacies of parsing/generating), and it is
_fast_. Now let me repeat the last part - normally speed is of no concern
with the computers we have nowadays, but when using e.g. xml.dom.minidom to
process files larger than 10 MB, your system might get very sluggish unless you
are quite careful in traversing the parse tree (and maybe even then).
Using a SAX / full-compliant DOM parser could be good for learning things,
though. As I said, depends a lot.
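For the learning angle, this is roughly what event-driven parsing with xml.sax from the standard library looks like. The handler class name (BookCollector) and the sample document are hypothetical; the point is that you keep track of state yourself as events arrive.

```python
import xml.sax


class BookCollector(xml.sax.ContentHandler):
    """Counts <book> elements and collects their text as events stream past."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self.titles = []
        self._in_book = False
        self._chunks = []

    def startElement(self, name, attrs):
        if name == "book":
            self.count += 1
            self._in_book = True
            self._chunks = []

    def characters(self, content):
        # Text may arrive in several pieces, so accumulate it.
        if self._in_book:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "book":
            self.titles.append("".join(self._chunks))
            self._in_book = False


handler = BookCollector()
xml.sax.parseString(
    b"<library><book>A</book><book>B</book></library>", handler
)
print(handler.count, handler.titles)
```

Nothing is held in memory beyond what the handler chooses to keep, which is the main attraction for large inputs.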
--
# Edvard Majakari Software Engineer
# PGP PUBLIC KEY available Soli Deo Gloria!
$_ = '456476617264204d616a616b6172692c20612043687269737469616e20'; print
join('',map{chr hex}(split/(\w{2})/)),uc substr(crypt(60281449,'es'),2,4),"\n";
Kalle Anke wrote: I've used DOM-based libraries in other languages, is PyXML the library to use?
I would start off with minidom; a tutorial I once wrote can be found
here: http://www.boddie.org.uk/python/XML_intro.html
That should demonstrate some minor differences between PyXML-style DOMs
and those for languages like Java. Should you need a faster DOM
implementation, you might want to look at libxml2dom: http://www.boddie.org.uk/python/libxml2dom.html
It's a pure Python module that uses the lower levels of libxml2's own
Python bindings, so if you already have libxml2 plus bindings
installed, it should be very convenient. Although libxml2dom isn't by
any means complete, I do use it myself and would welcome any feedback
which would make it better.
Paul
One more vote for Amara! I think it's unmatched for ease of use, if you
already know Python.
Ron
Edvard Majakari wrote: Using a SAX / full-compliant DOM parser could be good for learning things, though. As I said, depends a lot.
since there are no *sane* reasons to use SAX or DOM in Python, that's mainly
a job security issue...
</F>
Fredrik Lundh wrote: since there are no *sane* reasons to use SAX or DOM in Python, that's mainly a job security issue...
While I doubt that anyone would really recommend exclusive DOM API
usage for significant XML processing tasks (or for anything other than
educational purposes), I think you're overstating some case or other
here. Interoperability is a pretty sane argument for using DOM-based
technologies, whether that be skills interoperability (possibly related
to job security) or just using many different technologies together.
For example, PyQt and PyKDE expose various DOMs of the purest
"non-Pythonic" kind; Mozilla exposes DOMs for XML and HTML; adding a
layer of PyXML varnish to any of these isn't a huge job. Using
different technologies with the same foundations shouldn't have to
involve breaking open yet another API for the "fun" of it.
Paul
Paul Boddie wrote: For example, PyQt and PyKDE expose various DOMs of the purest "non-Pythonic" kind; Mozilla exposes DOMs for XML and HTML
I didn't see anything about manipulating an application's internal
data structures in the original post, but I might have missed
something.
For stand-alone XML manipulation in Python, my point still stands:
programs using DOM and SAX are more bloated and slower than
the alternatives.
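As one possible example of such an alternative (an assumption about what is meant, since the post does not name one), ElementTree's iterparse handles streaming without a handler class. A sketch with invented data:

```python
import io
import xml.etree.ElementTree as ET

# Stand-in for a large file on disk; iterparse also accepts a filename.
source = io.BytesIO(
    b"<log><entry level='info'>started</entry>"
    b"<entry level='error'>disk full</entry></log>"
)

errors = []
# "end" events fire once an element has been read completely.
for event, elem in ET.iterparse(source, events=("end",)):
    if elem.tag == "entry":
        if elem.get("level") == "error":
            errors.append(elem.text)
        # Discard the element's contents once handled to keep memory low.
        elem.clear()

print(errors)
```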
</F>
Fredrik Lundh wrote: since there are no *sane* reasons to use SAX or DOM in Python, that's mainly a job security issue...
I can see two reasons (sane or not):
- You're familiar with those APIs and use them in e.g. C++.
- You don't want to rely on third party libraries unless you must.
In many cases, xml.dom.minidom etc will do fine...
Having said this, I must admit that I much prefer Fredrik's ElementTree.
Although Fredrik (did you know that?) had a part in Carmen's move to
Python around five years ago, ElementTree isn't installed by default
on our machines, and I think our CM people are happier if we use as
few third party libraries as possible...
Fredrik Lundh wrote: Paul Boddie wrote:
[On interoperability] For example, PyQt and PyKDE expose various DOMs of the purest "non-Pythonic" kind; Mozilla exposes DOMs for XML and HTML
I didn't see anything about manipulating an application's internal data structures in the original post, but I might have missed something.
Well, manipulating documents in Mozilla and KHTML are just examples, as
I pointed out, and whilst I'd agree that the in-process restrictions
Mozilla appears to place on full participants in its component system
do kind of mean that the Mozilla DOM is an "application's internal
data structure", the opportunities are more open for KHTML in that one
isn't limited to just automating some application. Moreover, the XML
APIs exposed by PyQt are also available for general XML processing,
whether you regard them as performant or not.
For stand-alone XML manipulation in Python, my point still stands: programs using DOM and SAX are more bloated and slower than the alternatives.
Your point was that "there are no *sane* reasons to use SAX or DOM in
Python", which actually isn't true. Sure, processing a 100GB XML
document using the DOM isn't a sensible strategy (with this generation
of hardware!), and SAX isn't necessarily the most elegant way of
expressing the processing logic, and other tools and APIs exist to
perform such tasks more efficiently and elegantly, but then "to
read/write XML files" leaves the questioner's field of endeavour pretty
much open to interpretation. Somewhere amongst the many fields of
endeavour there are places where the DOM (whilst not as "Pythonic" as
some might like) certainly is a valid choice, possibly because it's the
only choice - all thanks to interoperability, as I said. ;-)
Paul
Paul Boddie wrote: For stand-alone XML manipulation in Python, my point still stands: programs using DOM and SAX are more bloated and slower than the alternatives.
Your point was that "there are no *sane* reasons to use SAX or DOM in Python", which actually isn't true.
I replied in the context of this thread. If you chose to ignore the
context, or if you're not capable of reading and understanding the
posts you're replying to, that's your problem.
</F>
Fredrik Lundh wrote: Paul Boddie wrote:
For stand-alone XML manipulation in Python, my point still stands: programs using DOM and SAX are more bloated and slower than the alternatives.
Your point was that "there are no *sane* reasons to use SAX or DOM in Python", which actually isn't true.
I replied in the context of this thread. If you chose to ignore the context, or if you're not capable of reading and understanding the posts you're replying to, that's your problem.
His interpretation of your words is a perfectly valid one even in the
context of this thread. "in Python" explicitly provides a context for
the rest of the sentence. In English, at least, it is perfectly
reasonable to presume that explicit contexts override implicit ones.
--
Robert Kern rk***@ucsd.edu
"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
Robert Kern wrote: His interpretation of your words is a perfectly valid one even in the context of this thread. "in Python" explicitly provides a context for the rest of the sentence.
Exactly. "in Python", not "in an application with an existing API".
(also, if the OP had been forced to use an existing API by external
constraints, don't you think he would have mentioned it?)
In English, at least, it is perfectly reasonable to presume that explicit contexts override implicit ones.
Letting a part of a sentence override the context of the discussion is
perhaps popular in certain tabloid journalist circles, and among
slashdot editors and US political bloggers, but most people do in fact have
a context buffer memory that can hold more than a few words. (how
come you're so sure I wasn't talking about, say, the Python Lisp
compiler? or the Monty Python sketch with the sadistic Belgian
instrument-making monk? or a Harry Potter book?)
I know what I meant. You know what I meant. Paul knows what I
meant. If you still want to play the "but there is a way to interpret
this in another way" game, file a bug report against the python.org
"what is python?" summary page.
</F>
Fredrik Lundh wrote:
Robert Kern wrote: His interpretation of your words is a perfectly valid one even in the context of this thread. "in Python" explicitly provides a context for the rest of the sentence.
Exactly. "in Python", not "in an application with an existing API".
Well, if you're still not convinced that DOMs exist outside monolithic
applications... ;-)
In English, at least, it is perfectly reasonable to presume that explicit contexts override implicit ones.
Letting a part of a sentence override the context of the discussion is perhaps popular in certain tabloid journalist circles, and among slashdot editors and US political bloggers, but most people do in fact have a context buffer memory that can hold more than a few words.
I don't really see how an absolutely-qualified complete sentence...
Q> since there are no *sane* reasons to use SAX or DOM in Python,
Q> that's mainly a job security issue...
...can somehow be qualified by the preceding discussion, when the only
ambiguous context is "that" == "good for learning things". It's an
absolute statement of opinion! (And yes, I think we all agree on what
"Python" is.)
I know what I meant. You know what I meant. Paul knows what I meant.
I actually do know what you mean, but that doesn't mean that the
statement in question wasn't misleading, especially to people who
aren't familiar with or accustomed to discovering this missing context.
It's like saying "there are no *sane* reasons to drive a Volvo",
possibly in a follow-up to a discussion about how bad Volvos are
compared to Saabs. There may well be a sane reason to drive a Volvo,
but the statement doesn't allow for the possibility, unless in the hunt
for the missing context you're willing to take the term "significant
whitespace" to a whole new level.
Paul
Fredrik Lundh wrote: Edvard Majakari wrote:
Using a SAX / full-compliant DOM parser could be good for learning things, though. As I said, depends a lot.
since there are no *sane* reasons to use SAX or DOM in Python, that's mainly a job security issue...
One sane reason is that ElementTree is not part of the standard library. There
are cases where you write a simple python script of 400 lines and you want it
to stay single-file. While ElementTree is very easy to distribute (for basic
features it's just a single file), it still won't fit some scenarios.
So, why hasn't it made it into the standard library yet, given that it's so much
better than the alternatives?
--
Giovanni Bajo