Description of packages for XML processing

There are various packages available for XML processing using Python.
So which one to choose, and when? I have summarized some of the
features, advantages and disadvantages of some packages in the
following text. Have a look at it. May this get you out of the
dilemma of choice.

Here we go:

OPTIONS
=========
- libxml2
- lxml
- PyXML
- 4Suite

DESCRIPTION
=============
-------
libxml2
-------
A quote by Mark Pilgrim: "Programming with libxml2 is like the
thrilling embrace of an exotic stranger. It seems to have the potential
to fulfill your wildest dreams, but there's a nagging voice somewhere
in your head warning you that you're about to get screwed in the worst
way."

Features:
=========
- Namespaces in XML
- XPath, XPointer, XInclude, XML Base
- XML Schemas Part 2: Datatypes
- Relax NG
- SAX: a SAX2-like interface and a minimal SAX1 implementation
  compatible with early expat versions
- NO DOM: it supports the DOM to some extent, BUT it does not
  implement the DOM API itself; gdome2 does that on top of libxml2.
- It is written in plain C, making as few assumptions as possible,
  and sticking closely to ANSI C/POSIX for easy embedding.
- Platform: Linux/Unix/Windows

Advantages
==========
- Standards-compliant XML support.
- Full-featured.
- Actively maintained by XML experts.
- fast. fast! FAST!
- Stable.

Disadvantages
=============
The library already ships with Python bindings, but these bindings
have some problems:
- Very low level and C-ish (not Pythonic).
- Underdocumented and huge; you get lost in them.
- UTF-8 in the API, instead of Python unicode strings.
- Can cause segfaults from Python.
- Manual memory management: the library calls are more or less an
  exact mapping of the C API, so you have to think about memory
  management yourself.
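To give an idea of what "manual memory management" means here, below is a rough sketch using the libxml2 Python bindings. It assumes the bindings are installed (the import is guarded, and the function returns None if they are not); the function name item_texts and the sample element name "item" are made up for illustration.

```python
try:
    import libxml2  # the bindings that ship with libxml2; may not be installed
except ImportError:
    libxml2 = None


def item_texts(xml_text):
    """Return the text of every <item> element, or None without libxml2."""
    if libxml2 is None:
        return None
    doc = libxml2.parseDoc(xml_text)      # allocates a C-level document
    try:
        ctxt = doc.xpathNewContext()      # allocates a C-level XPath context
        try:
            # Copy the text out before anything is freed.
            return [node.content for node in ctxt.xpathEval("//item")]
        finally:
            ctxt.xpathFreeContext()       # has to be freed by hand
    finally:
        doc.freeDoc()                     # has to be freed by hand
```

Forgetting either of the two free calls leaks memory, and touching a node after freeDoc() is the kind of thing that can crash the interpreter.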

For those who want to go for the DOM API:
Packages for DOM
================
- gdome2: provides DOM support on top of libxml2. C-based.
  (http://gdome2.cs.unibo.it/)
- libxml2dom: another available option.
  (http://cheeseshop.python.org/pypi/libxml2dom/0.3.3)
- libxml_domlib: a Python extension module that enables you to use
  the DOM interface to libxml2.
  (http://www.rexx.com/~dkuhlman/libxml_domlib.html)
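If you have not used a DOM API before, here is a small sketch of what the style looks like. It does not use any of the three packages above; it uses xml.dom.minidom from the Python standard library as a stand-in, on the assumption that the W3C DOM method names are what matter. The sample document is made up.

```python
from xml.dom import minidom

# Parse a small document into a DOM tree.
doc = minidom.parseString(
    "<packages><package name='lxml'/><package name='4Suite'/></packages>"
)

# Read: W3C-style method names (getElementsByTagName, getAttribute).
names = [el.getAttribute("name") for el in doc.getElementsByTagName("package")]

# Modify: nodes are created by the document, then attached to the tree.
new_package = doc.createElement("package")
new_package.setAttribute("name", "PyXML")
doc.documentElement.appendChild(new_package)

package_count = len(doc.getElementsByTagName("package"))
```

The API is verbose compared to ElementTree, but it is the same in every language that implements the DOM.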
Resources
==========
- http://xmlsoft.org/index.html
- http://codespeak.net/lxml/intro.html
----
lxml
----
lxml follows the ElementTree API as much as possible, building it on
top of the native libxml2 tree.

Features
========
- lxml provides all of the libxml2 features listed above, but through
  the ElementTree API.
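A small sketch of the ElementTree style, for comparison. It tries lxml first and falls back to the ElementTree module in the Python standard library if lxml is not installed; only calls that both provide are used. The sample catalog document is made up.

```python
try:
    from lxml import etree                    # ElementTree API on top of libxml2
except ImportError:
    import xml.etree.ElementTree as etree     # standard-library ElementTree

root = etree.fromstring(
    "<catalog>"
    "<book id='b1'><title>XML Basics</title></book>"
    "<book id='b2'><title>Python and XML</title></book>"
    "</catalog>"
)

# Elements behave like lists of children with a dict-like attribute API.
titles = [book.findtext("title") for book in root.findall("book")]
ids = [book.get("id") for book in root.iter("book")]

# Adding a child is a single call; nothing has to be freed afterwards.
etree.SubElement(root, "book", id="b3")
book_count = len(root.findall("book"))
```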

Advantages
==========
- Pythonic API.
- Documented.
- Uses Python unicode strings in the API.
- Safe (no segfaults).
- No manual memory management.

Disadvantages
==============
- No DOM support, as with libxml2.
- Still in its early releases (latest is lxml 0.7).

Resources
=========
- http://codespeak.net/lxml/
------
PyXML
------
Features
=========
- xmlproc: a validating XML parser.
- Expat: a fast non-validating parser.
- sgmlop: a C helper module that can speed up xmllib.py and
  sgmllib.py by a factor of 5.
- PySAX: SAX 1 and SAX 2 libraries with drivers for most of the
  parsers.
- 4DOM: a fully compliant DOM Level 2 implementation.
- pulldom: a DOM implementation that supports lazy instantiation of
  nodes.
- marshal: a module with several options for serializing Python
  objects to XML.
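For anyone who has not seen SAX, here is a minimal sketch of the event-driven style. It uses the xml.sax package from the Python standard library (Expat-based) rather than a PyXML install, on the assumption that the handler interface is the same. The class name TitleCollector, the function collect_titles and the sample document are made up.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collect the text of every <title> element as the parser streams by."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._chunks = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._chunks = []

    def characters(self, content):
        # Text may arrive in several pieces, so buffer it.
        if self._in_title:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._chunks))
            self._in_title = False


def collect_titles(xml_bytes):
    handler = TitleCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.titles
```

No tree is built at all, which is why SAX suits very large documents; the price is that you keep track of state yourself.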
Advantages
==========
- A lot of documentation is available, and almost all resources and
  examples are based on it.

Disadvantages
=============
- No XML Schema support

Packages for Schema (for those who want schema support too)
===================
XSV: currently in progress; provides XML Schema Part 1: Structures.
Depends on another package, PyLTXML.
(http://www.ltg.ed.ac.uk/~ht/xsv-status.html)


-------
4Suite
-------
Features:
=========
- XML, XSLT, XPath, DOM, XInclude, XPointer, XLink, XUpdate, RELAX NG,
  XML Catalogs
- Platform: Posix, Windows

Advantages
============
- It provides RELAX NG, a simple schema language for XML based on
  [RELAX] and [TREX]. A RELAX NG schema specifies a pattern for the
  structure and content of an XML document.
[1] http://www.oasis-open.org/committees...3.html#IDAGDYR
[2] http://xmlbuddy.com/2.0/features.html
[3] http://www.xml.com/pub/a/2001/12/12/...re.html?page=2

* But RELAX NG is not a W3C standard; it is specified by OASIS.
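To show what "a pattern for the structure and content" looks like in practice, here is a rough sketch of RELAX NG validation. Note that it does not use 4Suite; it uses lxml (which gets RELAX NG support from libxml2), and only if lxml is installed, otherwise the function returns None. The schema, the element names and the function name is_valid are made up for illustration.

```python
try:
    from lxml import etree   # RELAX NG support comes from libxml2
except ImportError:
    etree = None

# A <package> must carry a name attribute and one or more <feature> children.
SCHEMA = """
<element name="package" xmlns="http://relaxng.org/ns/structure/1.0">
  <attribute name="name"/>
  <oneOrMore>
    <element name="feature"><text/></element>
  </oneOrMore>
</element>
"""


def is_valid(xml_text):
    """Return True/False for the schema above, or None without lxml."""
    if etree is None:
        return None
    schema = etree.RelaxNG(etree.fromstring(SCHEMA))
    return schema.validate(etree.fromstring(xml_text))
```

With lxml available, a package element with at least one feature child should validate, and one with no feature children should not.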
Site:
======
[4] http://cheeseshop.python.org/pypi/4Suite-XML/1.0b3

Dec 23 '05 #1
ankit wrote:
> There are various packages available for XML processing using Python.
> So which one to choose, and when? I have summarized some of the
> features, advantages and disadvantages of some packages in the
> following text. Have a look at it. May this get you out of the
> dilemma of choice.
>
> Here we go:
>
> OPTIONS
> =========
> - libxml2
> - lxml
> - PyXML
> - 4Suite

Also ElementTree, Amara

> ----
> lxml
> ----
> Disadvantages
> ==============

- No Windows release to date :-(

Kent
Dec 23 '05 #2
