By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
443,907 Members | 2,039 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 443,907 IT Pros & Developers. It's quick & easy.

pure python based XML parsing

P: 38
I am developing a program for the PSP, and so far the methods for processing XML's depend on dll's, so i cannot use them as they are not compiled for use with the PlayStation Portable, bout everything i see depends on expat which needs window dll's to work

so is there a "pure" python library? one where EVERYTHING needed for processing XML is contained in a .py file?
Mar 21 '08 #1
Share this Question
Share on Google+
3 Replies

P: 87
You should probably look into pypy. That is a python implementation written in python. You probably won't want to use the interpreter because it is incredibly slow, but you can use the modules, which have been converted to python instead of being written in C.
Mar 21 '08 #2

P: 38
nvr mind, that was easy:
Expand|Select|Wrap|Line Numbers
  1. class XMLParsers:
  2.     def __init__(self, inputString):
  3.         self.xml = inputString
  4.         self.returne = []
  5.         self.toplevel = 0
  6.     def output(self, inputl):
  7.         wow = []
  8.         returnme = {}
  9.         while inputl.find("</") > -1:
  10.             key = inputl.split("</")[len(inputl.split("</"))-1].split(">")[0]
  11.             m = inputl.split("</"+key+'>')[0].split("<"+key+">")[1]
  12.             if (m.find("</")>-1):
  13.                 if len(m.split("<",1)[0]) > 0:
  14.                     returnme[key]=[m.split("<",1)[0],self.output("<"+m+">")]
  15.                 else:
  16.                     returnme[key]=[self.output("<"+m+">")]
  17.             else:
  18.                 returnme[key]=m
  19.             inputl = "<"+inputl.replace("<"+key+">"+m+"</"+key+">","").strip("<>")+">"
  20.         return returnme
  22.     def retn(self):
  23.         return self.output(self.xml)
That was pretty cool making that
only problem so far is it doesnt take attributes, but it does process XML quite nicely, goes down as far as needed:
Expand|Select|Wrap|Line Numbers
  1. >>> XMLParsers("<p>lol<m>ds</m><i>p</i></p><l>my</l><oiu>wow xml parsing!</oiu>").retn()
  2. {'p': ['lol', {'i': 'p', 'm': 'ds'}], 'oiu': 'wow xml parsing!', 'l': 'my'}
i just need to code for multiple instances of the same thing, and then i should be good.
Mar 22 '08 #3

P: 38
ok, so i think i got this down...

Expand|Select|Wrap|Line Numbers
  1. >>> XMLParsers('<?xml version="1.0" encoding="UTF-8" ?><rss version="2.0"><channel><title>RSS Example</title><description>This is an example of an RSS feed</description><link></link><lastBuildDate>Mon, 28 Aug 2006 11:12:55 -0400 </lastBuildDate><pubDate>Tue, 29 Aug 2006 09:00:00 -0400</pubDate><item><title>Item Example</title><description>This is an example of an Item</description><link></link><guid isPermaLink="false"> 1102345</guid><pubDate>Tue, 29 Aug 2006 09:00:00 -0400</pubDate></item></channel></rss>').retn()
Expand|Select|Wrap|Line Numbers
  1. {'rss': [[{'channel': [[{'lastBuildDate': ['Mon, 28 Aug 2006 11:12:55 -0400 '], 'description': ['This is an example of an RSS feed'], 'pubDate': ['Tue, 29 Aug 2006 09:00:00 -0400'], 'title': ['RSS Example'], 'item': [[{'guid': [' 1102345'], 'link': [''], 'description': ['This is an example of an Item'], 'pubDate': ['Tue, 29 Aug 2006 09:00:00 -0400'], 'title': ['Item Example']}]], 'link': ['']}]]}]]}

so does the output make sense to everybody?

here is if it has multiple instances, and it handles attributes now:
Expand|Select|Wrap|Line Numbers
  1. XMLParsers("<p>first p</p><p>second p<m>ds</m><i>p</i></p><p>Third P</p><l>my</l><oiu o=moi bot>wow xml parsing!</oiu>").retn()

Expand|Select|Wrap|Line Numbers
  1. {'p': ['first p', ['second p', {'i': ['p'], 'm': ['ds']}], 'Third P'], 'oiu': ['wow xml parsing!'], 'l': ['my']}
Mar 22 '08 #4

Post your reply

Sign in to post your reply or Sign up for a free account.