Desc of packages for XML processing

There are various packages available for XML processing in Python.
So which one should you choose, and when? I have summarized the features,
advantages and disadvantages of some of these packages in the following text.
Have a look at it. I hope it gets you out of the dilemma of choice.

Here we go:

OPTIONS
=========
- libxml2
- lxml
- PyXML
- 4Suite

DESCRIPTION
=============
-------
libxml2
-------
A quote by Mark Pilgrim: "Programming with libxml2 is like the
thrilling embrace of an exotic stranger. It seems to have the potential
to fulfill your wildest dreams, but there's a nagging voice somewhere
in your head warning you that you're about to get screwed in the worst
way."

Features:
=========
- Namespaces in XML
- XPath, XPointer, XInclude, XML Base
- XML Schema Part 2: Datatypes
- Relax NG
- SAX: a SAX2-like interface and a minimal SAX1 implementation
  compatible with early expat versions
- No DOM: it supports DOM to some extent, but it does not implement
  the API itself; gdome2 does (see "Packages for DOM" below).
- It is written in plain C, making as few assumptions as possible and
  sticking closely to ANSI C/POSIX for easy embedding.
- Platform: Linux/Unix/Windows
Advantages
==========
- Standards-compliant XML support.
- Full-featured.
- Actively maintained by XML experts.
- fast. fast! FAST!
- Stable.

Disadvantages
=============
The library already ships with Python bindings, but these bindings
have some problems:
- Very low level and C-ish (not Pythonic).
- Underdocumented and huge; you get lost in them.
- UTF-8 in the API instead of Python unicode strings.
- Can cause segfaults from Python.
- Manual memory management: the library calls are more or less an
  exact mapping of the C API, so you have to think about memory
  management yourself.

For those who want to go for the DOM API:
Packages for DOM
================
- gdome2: provides DOM support on top of libxml2. C-based.
  (http://gdome2.cs.unibo.it/)
- libxml2dom: another available option.
  (http://cheeseshop.python.org/pypi/libxml2dom/0.3.3)
- libxml_domlib: a Python extension module that enables you to use
  the DOM interface to libxml2.
  (http://www.rexx.com/~dkuhlman/libxml_domlib.html)
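
To give a feel for what working with a DOM-style API looks like in Python, here is a minimal sketch. It uses xml.dom.minidom from the standard library as a stand-in, not any of the packages listed above, and it is written for a current Python 3. The sample document and the helper name titles() are made up for illustration.

```python
from xml.dom import minidom

# Hypothetical sample document, used only for this illustration.
SAMPLE = (
    "<library>"
    "<book id='1'><title>XML Basics</title></book>"
    "<book id='2'><title>Python and XML</title></book>"
    "</library>"
)


def titles(xml_text):
    """Return (id, title) pairs using W3C DOM-style calls."""
    doc = minidom.parseString(xml_text)
    try:
        pairs = []
        for book in doc.getElementsByTagName("book"):
            title_node = book.getElementsByTagName("title")[0]
            # The text lives in a separate Text node under <title>.
            pairs.append((book.getAttribute("id"), title_node.firstChild.data))
        return pairs
    finally:
        # minidom lets you release the tree explicitly when you are done.
        doc.unlink()
```

Calling titles(SAMPLE) walks the whole tree in memory, which is the usual trade-off of DOM: convenient navigation at the cost of loading the full document.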
Resources
==========
- http://xmlsoft.org/index.html
- http://codespeak.net/lxml/intro.html
----
lxml
----
lxml follows the ElementTree API as much as possible, building it on
top of the native libxml2 tree.

Features
========
- lxml provides all the libxml2 features listed above, but through
  the ElementTree API.
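
As a rough illustration of the ElementTree style of API, here is a small sketch using xml.etree.ElementTree from the standard library of a current Python 3. Since lxml follows the same API, the same calls should be available from lxml.etree, but that swap is an assumption I have not tested here. The sample document and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for this illustration.
SAMPLE = (
    "<library>"
    "<book id='1'><title>XML Basics</title></book>"
    "<book id='2'><title>Python and XML</title></book>"
    "</library>"
)


def titles_by_id(xml_text):
    """Map each book id to its title text."""
    root = ET.fromstring(xml_text)
    # Elements behave like lists of children with dict-like attributes.
    return {book.get("id"): book.findtext("title") for book in root.findall("book")}


def add_book(xml_text, book_id, title):
    """Return the document as a string with one more <book> appended."""
    root = ET.fromstring(xml_text)
    book = ET.SubElement(root, "book", id=book_id)
    ET.SubElement(book, "title").text = title
    return ET.tostring(root, encoding="unicode")
```

Compared with the DOM style, there are no separate text nodes to dig through and no explicit cleanup call, which is roughly what "Pythonic API" means in the list below.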

Advantages
==========
- Pythonic API.
- Documented.
- Uses Python unicode strings in the API.
- Safe (no segfaults).
- No manual memory management
Disadvantages
==============
- No DOM support, as with libxml2.
- It is still an early release (the latest is lxml 0.7).
Resources
=========
- http://codespeak.net/lxml/
------
PyXML
------
Features
=========
- xmlproc: a validating XML parser.
- Expat: a fast non-validating parser.
- sgmlop: a C helper module that can speed up xmllib.py and sgmllib.py
  by a factor of 5.
- PySAX: SAX 1 and SAX 2 libraries with drivers for most of the parsers.
- 4DOM: a fully compliant DOM Level 2 implementation.
- pulldom: a DOM implementation that supports lazy instantiation of
  nodes.
- marshal: a module with several options for serializing Python
  objects to XML.
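
For readers who have not used SAX before, here is a minimal sketch of the event-driven style. It uses xml.sax from the standard library of a current Python 3 (which parses with Expat by default) as a stand-in with the same SAX2-style handler interface; it is not PyXML-specific code. The TitleCollector class, collect_titles() and the sample document are made up for illustration.

```python
import xml.sax

# Hypothetical sample document, used only for this illustration.
SAMPLE = (
    "<library>"
    "<book id='1'><title>XML Basics</title></book>"
    "<book id='2'><title>Python and XML</title></book>"
    "</library>"
)


class TitleCollector(xml.sax.ContentHandler):
    """Collect the text of every <title> element as parser events arrive."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # a list while inside <title>, otherwise None

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []

    def characters(self, content):
        # The parser may deliver text in several chunks, so accumulate them.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None


def collect_titles(xml_text):
    """Parse xml_text with SAX and return the list of title strings."""
    handler = TitleCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles
```

No tree is built at any point; the handler keeps only what it needs, which is why SAX suits large documents.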
Advantages
==========
- A lot of documentation is available, and almost all resources and
  examples are based on it.

Disadvantages
=============
- No Schema support

Packages for Schema (for those who want schema support too)
===================
XSV: currently in progress; provides XML Schema Part 1: Structures.
Depends on another package, PyLTXML.
(http://www.ltg.ed.ac.uk/~ht/xsv-status.html)


-------
4Suite
-------
Features:
=========
- XML, XSLT, XPath, DOM, XInclude, XPointer, XLink, XUpdate, RELAX NG,
  XML Catalogs
- Platform: Posix, Windows

Advantages
============
- It provides RELAX NG, a simple schema language for XML based on
  [RELAX] and [TREX]. A RELAX NG schema specifies a pattern for the
  structure and content of an XML document.
[1] http://www.oasis-open.org/committees...3.html#IDAGDYR
[2] http://xmlbuddy.com/2.0/features.html
[3] http://www.xml.com/pub/a/2001/12/12/...re.html?page=2

* Note that RELAX NG is not a W3C standard; it is provided by OASIS.
Site:
======
[4] http://cheeseshop.python.org/pypi/4Suite-XML/1.0b3

Dec 23 '05 #1
ankit wrote:
> There are various packages available for XML processing in Python.
> So which one should you choose, and when? I have summarized the features,
> advantages and disadvantages of some of these packages in the following text.
> Have a look at it. I hope it gets you out of the dilemma of choice.
>
> Here we go:
>
> OPTIONS
> =========
> - libxml2
> - lxml
> - PyXML
> - 4Suite

Also ElementTree, Amara

> ----
> lxml
> ----
> Disadvantages
> ==============

- No Windows release to date :-(

Kent
Dec 23 '05 #2
