Creating referenceable objects from XML

Hi All,

I'm looking for a quality Python XML implementation. All of the DOM
and SAX implementations I've come across so far are rather
convoluted. Are there any quality implementations that will (after
parsing the XML) return an object that is accessible by name? Such as
the following:


xml = """
<book>
<title>MyBook</title>
<author>the author</author>
</book>
"""

And after parsing the XML, allow me to access it like so:

book.title

I need it to somehow convert my XML to an intuitively referenceable
object. Any ideas? I could even do it myself if I knew the
mechanism by which Python classes do this (create variables on the fly).

Thanks in advance!
Dec 5 '05 #1


You might want to take a look at Fredrik Lundh's ElementTree
(and cElementTree) modules:
http://effbot.org/zone/element-index.htm
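
For the book example from the question, a minimal sketch with ElementTree might look like the following. It assumes the copy that ships in the standard library as xml.etree.ElementTree (Python 2.5 and later) rather than the separate download linked above.

```python
import xml.etree.ElementTree as ET

xml = """
<book>
<title>MyBook</title>
<author>the author</author>
</book>
"""

# fromstring() parses the text and returns the root Element (<book>)
book = ET.fromstring(xml)

# findtext() returns the text of the first matching child element
title = book.findtext("title")
author = book.findtext("author")

print(title)   # MyBook
print(author)  # the author
```

This is not quite book.title, but the child name is the only thing that has to be spelled out.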

Dec 5 '05 #2
Michael Williams wrote:
I'm looking for a quality Python XML implementation. All of the DOM
and SAX implementations I've come across so far are rather
convoluted.

Welcome to the wonderful world of XML.

I need it to somehow convert my XML to intuitively referenceable
object. Any ideas? I could even do it myself if I knew the
mechanism by which python classes do this (create variables on the fly).


You've been given the advice to use ElementTree - I can only second
that.

But if for whatever reason you do want to do it yourself (or for future
use), the

getattr/setattr

functions are what you are looking for. Look them up in TFM.
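
As a rough illustration of what getattr/setattr make possible, here is a minimal sketch that copies each child element onto a plain object. The names Record and to_object are made up for this example, and the parsing uses the standard library's xml.etree.ElementTree; treat it as one possible approach, not a finished binding.

```python
import xml.etree.ElementTree as ET


class Record(object):
    """Empty container; attributes are attached at run time."""


def to_object(element):
    """Turn an element into a Record, one attribute per child tag."""
    obj = Record()
    for child in element:
        if len(child):
            # the child has children of its own, so recurse
            value = to_object(child)
        else:
            value = child.text
        # setattr(obj, "title", ...) has the same effect as obj.title = ...
        setattr(obj, child.tag, value)
    return obj


book = to_object(ET.fromstring(
    "<book><title>MyBook</title><author>the author</author></book>"))

print(book.title)                # MyBook
print(getattr(book, "author"))   # the author
```

Note the limits of the sketch: a repeated tag overwrites the earlier value, and a tag that is not a valid Python identifier can only be reached through getattr.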

Regards,

Diez

Dec 5 '05 #3


Michael

Here's an approach using ElementTree that worked for me. It's not generic
and it's a bit brittle (e.g. it won't handle missing nodes), but it may
do for a simple, flat schema or for a prototype.

All the best

Gerard

(TOY CODE - NOT TESTED MUCH)

# elementwrapper.py
# (standalone ElementTree package; from Python 2.5 on, the same module is
# available as xml.etree.ElementTree)
from elementtree import ElementTree

class ElementWrapper(object):

    def __tostring(self):
        return ElementTree.tostring(self.element)

    def __fromstring(self, xml):
        self.element = ElementTree.fromstring(xml)

    # reading .xml serialises the element; assigning to .xml parses a string
    xml = property( __tostring, __fromstring )

    def __init__(self, element=None):
        self.element = element

    def __str__(self):
        return self.xml

    def parse(self, infile):
        tree = ElementTree.parse(infile)
        self.element = tree.getroot()

    def write(self, outfile):
        ElementTree.ElementTree(self.element).write(outfile)

###########
from elementtree.ElementTree import Element
from elementwrapper import ElementWrapper

xmlns = 'http://schemas/email/0.1'

class MailDocument(ElementWrapper):

    # Children are addressed by position throughout:
    # 0 Date, 1 From, 2 Subject, 3 To, 4 Cc, 5 Body, 6 Attachments
    def __build_element(self):
        root = Element('{%s}Mail' % xmlns)
        root.append( Element('{%s}Date' % xmlns) )
        root.append( Element('{%s}From' % xmlns) )
        root.append( Element('{%s}Subject' % xmlns) )
        root.append( Element('{%s}To' % xmlns) )
        root.append( Element('{%s}Cc' % xmlns) )
        root.append( Element('{%s}Body' % xmlns) )
        root.append( Element('{%s}Attachments' % xmlns) )
        self.element = root

    #####################################################
    # Properties
    #
    def __get_uid(self):
        return self.element.get('id')

    def __set_uid(self, id=''):
        self.element.set('id', id)

    def __get_date(self):
        return self.element[0].text

    def __set_date(self, value=''):
        self.element[0].text = value

    def __get_from(self):
        addr = self.element[1].get('address')
        nm = self.element[1].get('name')
        return addr, nm

    def __get_subject(self):
        return self.element[2].text

    def __set_subject(self, value=''):
        self.element[2].text = value

    def __get_body(self):
        return self.element[5].text

    def __set_body(self, value=''):
        self.element[5].text = value

    uid = property( __get_uid, __set_uid )
    From = property( __get_from)
    subject = property( __get_subject, __set_subject )
    date = property( __get_date, __set_date )
    body = property( __get_body, __set_body )

    def set_from_header(self, address='', name=''):
        self.element[1].set('address', address)
        self.element[1].set('name', name)
    #
    # End Properties
    #####################################################

    #####################################################
    # Lists
    #
    def add_to_header(self, address='', name=''):
        self.__add_mailto( self.element[3], address, name )

    def remove_to_header(self, index):
        elem = self.element[3][index]
        self.element[3].remove(elem)

    def add_cc_header(self, address='', name=''):
        self.__add_mailto( self.element[4], address, name )

    def remove_cc_header(self, index):
        elem = self.element[4][index]
        self.element[4].remove(elem)

    def add_attachment(self, filename='', fileuri='', filetype=''):
        elem = Element("{%s}Uri" % xmlns, value=fileuri, type=filetype)
        elem.text = filename
        self.element[6].append( elem )

    def remove_attachment(self, index):
        elem = self.element[6][index]
        self.element[6].remove(elem)

    def __add_mailto(self, element, Address='', Name=''):
        element.append( Element("{%s}mailto" % xmlns, address=Address,
                                name=Name ) )

    def get_to_headers(self):
        hdrs = []
        for item in self.element[3]:
            hdrs.append( ( item.get('address'), item.get('name') ) )
        return hdrs

    def get_cc_headers(self):
        hdrs = []
        for item in self.element[4]:
            hdrs.append( (item.get('address'), item.get('name') ) )
        return hdrs

    def get_attachments(self):
        # fixed: iterate self.element (not self.__element) and fill the
        # list that is actually returned
        ret = []
        for item in self.element[6]:
            ret.append( (item.text, item.get('value'),
                         item.get('type') ) )
        return ret
    #
    # End Lists
    ########################################################

    ########################################################
    # Initialise
    #
    def __init__(self):
        self.__build_element()
        self.__set_uid()
        self.__set_date()
        self.__set_subject()
        self.set_from_header()
        self.__set_body()
    #
    # End Initialise
    ########################################################

# e-mail addresses below appear as masked by the forum software
xml_test ='''
<mail:Mail xmlns:mail="http://schemas/email/0.1">
<mail:Date>10/10/05</mail:Date>
<mail:From address='ad*********@org.org' name='Mr. Jones'/>
<mail:Subject>just a note</mail:Subject>
<mail:To>
<mail:mailto address='ad**********@org.org' name='Mrs Jones' />
<mail:mailto address='ad***************@org.org' name='Alan Nother' />
</mail:To>
<mail:Cc></mail:Cc>
<mail:Body>hi there,
just a note to say hi there!</mail:Body>
<mail:Attachments></mail:Attachments>
</mail:Mail>
'''
if __name__ == '__main__':
    mail = MailDocument()
    mail.xml = xml_test
    #mail.parse('test/data/test.xml')
    print 'From: ' + mail.From[0]
    print 'Subject: ' + mail.subject
    mail.set_from_header('n**@new.com')
    print 'From: ' + mail.From[0]
    mail.add_to_header('aaa.bbb@ccc', 'aaaaaa')
    mail.add_to_header('fff.ggg@hhh', 'ffffff')
    print 'To:'
    for hdr in mail.get_to_headers():
        print hdr
    mail.remove_to_header(1)
    print 'To:'
    for hdr in mail.get_to_headers():
        print hdr
    #mail.write('test_copy.xml')
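
The wrapper above reaches children by position, which is where the brittleness comes from. One possible variation, sketched below, looks children up by qualified tag name and tolerates missing nodes. This is not Gerard's code: child_text and the cut-down Mail class are hypothetical names, only the text-valued fields are covered, and the standard library's xml.etree.ElementTree is assumed.

```python
import xml.etree.ElementTree as ET

XMLNS = 'http://schemas/email/0.1'


def child_text(tag):
    """Build a property that reads/writes the text of the child named tag."""
    qname = '{%s}%s' % (XMLNS, tag)

    def getter(self):
        # findtext() returns None when the child is missing
        return self.element.findtext(qname)

    def setter(self, value):
        child = self.element.find(qname)
        if child is None:
            # create the missing child instead of failing
            child = ET.SubElement(self.element, qname)
        child.text = value

    return property(getter, setter)


class Mail(object):
    subject = child_text('Subject')
    date = child_text('Date')
    body = child_text('Body')

    def __init__(self, xml):
        self.element = ET.fromstring(xml)


mail = Mail('<m:Mail xmlns:m="http://schemas/email/0.1">'
            '<m:Subject>just a note</m:Subject></m:Mail>')
print(mail.subject)   # just a note
print(mail.date)      # None, because <m:Date> is missing
mail.date = '10/10/05'
print(mail.date)      # 10/10/05
```

The trade-off is that the order of children is no longer guaranteed, which matters if the schema requires a fixed sequence.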

Dec 5 '05 #4

Another tool (ElementTree was already mentioned): Amara
( http://uche.ogbuji.net/uche.ogbuji.n.../4suite/amara/ )

[never tested, but bookmarked as it seems interesting]

A+

Laurent.
Dec 5 '05 #5


Looks as if Michael is working with Amara now, but I did want to note
for the record that APIs that allow one to access a node in the
"book.title" fashion are what I call Python data bindings.

The Python data bindings I usually point out are:

Amara Bindery: http://www.xml.com/pub/a/2005/01/19/amara.html
Gnosis: http://www.xml.com/pub/a/2003/07/02/py-xml.html
generateDS: http://www.xml.com/pub/a/2003/06/11/py-xml.html

Based on updates to EaseXML in response to my article, another entry
might be:

EaseXML: http://www.xml.com/pub/a/2005/07/27/py-xml.html

ElementTree ( http://www.xml.com/pub/a/2003/02/12/py-xml.html ) is a
Python InfoSet rather than a Python data binding. You access nodes
using generic names related to the node type rather than the node name.
Whether data bindings or Infosets are your preference is a matter of
taste, but it's a useful distinction to make between the approaches.
It looks as if Gerard Flanagan has constructed a little specialized
binding tool on top of ElementTree, and that's one possible hybrid
approach.
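
To make the distinction concrete: with an infoset API the element name is data, reached through generic members such as tag, text and attrib. A minimal sketch, assuming the standard library's xml.etree.ElementTree (the describe helper and the id attribute are made up for the example):

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    '<book id="b1"><title>MyBook</title><author>the author</author></book>')


def describe(element):
    """Return (tag, attributes, text) tuples for an element and its children."""
    rows = [(element.tag, dict(element.attrib), element.text)]
    for child in element:
        rows.extend(describe(child))
    return rows


for row in describe(book):
    print(row)

# Infoset style: the name "title" is an argument, not an attribute
print(book.find('title').text)   # MyBook
```

A data binding would instead expose the same node along the lines of book.title, with the element name built into the Python object.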

xmltramp ( http://www.aaronsw.com/2002/xmltramp/ ) is another
interesting hybrid.

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

Dec 11 '05 #6
[Michael Williams]
I need it to somehow convert my XML to intuitively referenceable
object. Any ideas? I could even do it myself if I knew the mechanism
by which python classes do this (create variables on the fly).


You seem to already have a fair idea what kind of model you need, and to
know that there is a simple way for you to create one. I encourage you
to progress on this path: it will increase the depth of your understanding.

One mistake that I think some people make about XML is relying on other
people's interpretations of the subject, rather than forming their own
opinions.

The multitude of document models provided by everyone and his mother all
make assumptions about how the components of the model will be accessed,
in what order those components will be accessed, how often and when, how
memory efficient the model is, etc, etc.

To really understand the trade-offs and strengths of all the different
models, it is a good exercise to build your own object model. It's a
simple exercise, due to Python's highly dynamic nature. Understanding
your own model will help you understand what the other models do and do
not provide. You can then evaluate other off-the-shelf models for your
specific applications: I always find different XML tools suit different
situations.
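
As a starting point for such an exercise, here is a minimal sketch of a home-made object model that resolves child elements lazily through __getattr__. The Node class is hypothetical and sits on top of the standard library's xml.etree.ElementTree; it ignores namespaces, attributes and repeated children.

```python
import xml.etree.ElementTree as ET


class Node(object):
    """Wrap an Element; child elements are looked up by attribute name."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # called only when normal attribute lookup fails
        child = self._element.find(name)
        if child is None:
            raise AttributeError(name)
        return Node(child)

    def __str__(self):
        return self._element.text or ''


book = Node(ET.fromstring(
    "<book><title>MyBook</title><author>the author</author></book>"))

print(book.title)            # MyBook
print(str(book.author))      # the author
```

Deciding what to do about the cases this sketch ignores is exactly where the trade-offs between the existing models become visible.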

See this post of mine from a couple years back about different ways of
building your own document/data models.

http://groups.google.com/group/comp....a4a1c35395ffec

I think the reference to the ActiveState recipe will be of particular
interest, since you could have a running example very quickly indeed.

See also my tutorial post on extracting document content from a SAX
stream. I gave the example of a simple stack-based xpath-style
expression matcher.

http://groups.google.com/group/comp....53bddbb9326948
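
The general idea of a stack-based matcher can be sketched as follows. This is not the code from the linked thread: PathMatcher is a hypothetical class built on the standard xml.sax module, and it only understands absolute paths made of plain element names.

```python
import xml.sax


class PathMatcher(xml.sax.ContentHandler):
    """Collect the text of every element whose path equals `path`."""

    def __init__(self, path):
        xml.sax.ContentHandler.__init__(self)
        self.target = path.split('/')
        self.stack = []      # names of the currently open elements
        self.buffer = None   # text pieces of the element being captured
        self.matches = []

    def startElement(self, name, attrs):
        self.stack.append(name)
        if self.stack == self.target:
            self.buffer = []

    def characters(self, content):
        if self.buffer is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if self.stack == self.target:
            self.matches.append(''.join(self.buffer))
            self.buffer = None
        self.stack.pop()


handler = PathMatcher('book/title')
xml.sax.parseString(
    b"<book><title>MyBook</title><author>the author</author></book>",
    handler)
print(handler.matches)   # ['MyBook']
```

Because the handler keeps only the stack of open element names and the text of the current match, memory use stays small however large the document is.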

Also contained in that thread is an illuminating and productive
discussion between the effbot and myself about how wonderfully simple
ElementTree makes this, not to mention unbeatably efficient.

this-week-i-ave-been-mostly-using-kid-for-templating-ly'yrs,

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan
Dec 11 '05 #7
uc*********@gmail.com wrote:
ElementTree ( http://www.xml.com/pub/a/2003/02/12/py-xml.html ) is a
Python InfoSet rather than a Python data binding. You access nodes
using generic names related to the node type rather than the node name.
Whether data bindings or Infosets are your preference is a matter of
taste, but it's a useful distinction to make between the approaches.
It looks as if Gerard Flanagan has constructed a little specialized
binding tool on top of ElementTree, and that's one possible hybrid
approach.


in my experience, it's hard to make a python/xml mapping that's well suited for all
possible use cases (many bindings suffer from issues with namespaces, collisions
between tags/attribute names and python names, etc), but it's usually trivial to write
a custom wrapper for a specific case.

for most normal use, manual infoset navigation is often the easiest way to pull out
data from the infoset (find, get, findtext, int, float, etc).
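
A minimal sketch of that style of manual navigation, using the standard library's xml.etree.ElementTree. The order document is invented for the example; find-style calls locate the nodes and int/float convert the extracted strings.

```python
import xml.etree.ElementTree as ET

order = ET.fromstring("""
<order id="42">
<item sku="A1"><qty>3</qty><price>9.50</price></item>
<item sku="B2"><qty>1</qty><price>20.00</price></item>
</order>
""")

order_id = int(order.get("id"))          # attribute -> int
total = 0.0
for item in order.findall("item"):
    qty = int(item.findtext("qty"))      # element text -> int
    price = float(item.findtext("price"))
    total += qty * price

print(order_id)   # 42
print(total)      # 48.5
```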

for certain cases, creating wrappers on demand can be quite efficient; e.g.

http://online.effbot.org/2003_07_01_...element-tricks

and for highly regular cases, incremental parsing/conversion up front is often the
fastest and most efficient way to deal with data; e.g.

http://effbot.org/zone/element-iterparse.htm#plist
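
As a small sketch of incremental conversion (not the plist example from the link above), the following turns each book element into a plain dict as soon as it has been read. The library document is invented for the example, and the standard library's xml.etree.ElementTree is assumed.

```python
import io
import xml.etree.ElementTree as ET

data = io.BytesIO(b"""<library>
<book><title>MyBook</title><author>the author</author></book>
<book><title>Another</title><author>someone else</author></book>
</library>""")

books = []
# iterparse() yields (event, element) pairs while the input is being read;
# by default only "end" events are reported
for event, elem in ET.iterparse(data):
    if elem.tag == "book":
        books.append({child.tag: child.text for child in elem})
        elem.clear()   # drop the children once converted to save memory

print(books)
```

After the loop, only ordinary Python dicts remain, so the rest of the program never touches the XML tree.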

</F>

Dec 12 '05 #8
