simple XML question

Member
 
Join Date: Jun 2007
Posts: 66
#1: Sep 14 '09
I know it's quite simple, but I've looked around for hours and I can't seem to get it, so please excuse my newbie question.

I need to parse an XML document and pull out the values inside, since I'm using it as a database.

Here's a sample XML file:

<?xml version="1.0" ?><amman Address="3rd area" Hazard="weapons" Name="third" Owner="tarik" Phone="6453222" PhotoAddress="C:\Users\tarik\Desktop\DSC_0018.jpg" PrevUsed="project1" />

I need to extract the values as plain strings, so if I ask for Address I'll just get 3rd area.

Assume the XML file is located at c:/tempDB.

Eternally grateful for an answer!

T

Newbie
 
Join Date: Aug 2009
Location: Louisville
Posts: 13
#2: Sep 14 '09

re: simple XML question


I'm not familiar with XML myself but I think this should get you started.

Using xml.dom.minidom (http://docs.python.org/library/xml.dom.minidom.html) you can parse a file, which should return a Document object. You can then (I believe) use the Document's methods to grab elements, attribute values and the like.
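As a minimal sketch of that idea, assuming the sample document from the first post: the snippet parses the XML from a string, takes the root element and asks for one attribute by name. The variable names are only illustrative, and for a file on disk `xml.dom.minidom.parse()` takes a path or file object instead of a string.

```python
from xml.dom.minidom import parseString

# Sample document from the first post. The raw string (r'''...''') keeps the
# backslashes in the Windows path from being read as escape sequences.
xml_text = r'''<?xml version="1.0" ?>
<amman Address="3rd area" Hazard="weapons" Name="third" Owner="tarik"
       Phone="6453222" PhotoAddress="C:\Users\tarik\Desktop\DSC_0018.jpg"
       PrevUsed="project1" />'''

doc = parseString(xml_text)      # Document object
root = doc.documentElement       # the <amman> element

# getAttribute returns the attribute value as a string
# (an empty string if the attribute does not exist).
address = root.getAttribute("Address")
print(address)                   # 3rd area
```

If the attribute might be missing, `root.hasAttribute("Address")` can be checked first.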

Hope this helps!
bvdet
Moderator
 
Join Date: Oct 2006
Location: Nashville, TN
Posts: 1,563
#3: Sep 15 '09

re: simple XML question


I have done some XML parsing, and it was not simple to me! The following will parse the document into a dictionary:
from xml.dom.minidom import parseString

# Raw string: keeps the backslashes in the Windows path literal, so the
# "\t" in "\tarik" is not turned into a tab character.
xmlStr = r'''<?xml version="1.0" ?>
    <amman Address="3rd area"
           Hazard="weapons"
           Name="third"
           Owner="tarik"
           Phone="6453222"
           PhotoAddress="C:\Users\tarik\Desktop\DSC_0018.jpg"
           PrevUsed="project1" />'''

xmlDoc = parseString(xmlStr)

def getNodeDict(doc, dd=None):
    '''Return a dictionary of node names and attribute names and values
    found in an XML document parseString instance. The keys are the node
    names and each value is a list of dictionaries (a list is used since an
    XML document can have multiple nodes with the same name). The keys of
    each dictionary in the list are the attribute names and the values are
    the attribute values. All keys and values are converted to regular
    strings.'''
    if doc is None: return None
    # Create the dictionary on each top-level call; a mutable default
    # argument (dd={}) would be shared between separate calls.
    if dd is None: dd = {}
    for child in doc.childNodes:
        # Skip all nodes except ELEMENT_NODE (nodeType 1)
        if child.nodeType == 1:
            # dict of attribute/values
            s = child.attributes
            if s:
                dd.setdefault(str(child.nodeName),
                              []).append(dict(zip([str(item) for item in s.keys()],
                                                   [str(item.value) for item in s.values()])))
            if child.hasChildNodes():
                # getNodeDict is a plain function, so it is called without "self"
                dd = getNodeDict(child, dd)
    return dd

print("")

for key, value in getNodeDict(xmlDoc).items():
    print("Node Name: %s" % (key))
    print("Attributes:")
    for item in value:
        for x in item:
            print('    %s: %s' % (x, item[x]))
Output:
>>> 
Node Name: amman
Attributes:
    PrevUsed: project1
    Name: third
    Hazard: weapons
    PhotoAddress: C:\Users\tarik\Desktop\DSC_0018.jpg
    Phone: 6453222
    Address: 3rd area
    Owner: tarik
>>>
This code has not been tested on anything other than your document.
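To show how a dictionary of this shape might be queried for a single value, here is a compact sketch. It is not the code above: `attrs_by_node` is a hypothetical helper built on the same idea, and the two-record document (including "5th area" and "sami") is made up for illustration.

```python
from xml.dom.minidom import parseString

def attrs_by_node(node, found=None):
    """Hypothetical helper: map each element name to a list of
    {attribute name: value} dictionaries, one per matching element."""
    if found is None:
        found = {}
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            if child.attributes:
                # NamedNodeMap.items() yields (name, value) pairs
                found.setdefault(str(child.nodeName), []).append(
                    dict((str(k), str(v)) for k, v in child.attributes.items()))
            attrs_by_node(child, found)   # descend into nested elements
    return found

# Made-up document with two records of the same node name.
doc = parseString('<db><amman Address="3rd area" Owner="tarik" />'
                  '<amman Address="5th area" Owner="sami" /></db>')
nodes = attrs_by_node(doc)

print(nodes["amman"][0]["Address"])                     # 3rd area
print([entry["Address"] for entry in nodes["amman"]])   # ['3rd area', '5th area']
```

Because each node name maps to a list, the first index picks the record and the second key picks the attribute.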
Member
 
Join Date: Jun 2007
Posts: 66
#4: Sep 15 '09

re: simple XML question


I can't thank you enough! I bow before you, master.

I'll try the code tomorrow, I hope, because it's darn late here.

Thanks again.

T
Member
 
Join Date: Jun 2007
Posts: 66
#5: Sep 15 '09

re: simple XML question


Couldn't sleep without trying it!

First of all, it works perfectly. I don't know how to thank you.

Now the boring part: can I ask a couple of questions about it?

if child.nodeType == 1: # Very clear, but where do I find a reference for what this might return? I've read the XML documentation and the minidom documentation, but there's a gap between them about what to use where. How do you recommend I learn that?


dd.setdefault(str(child.nodeName),[]).append(dict(zip([str(item) for item in s.keys()],
[str(item.value) for item in s.values()])))
# I get how it works, but (slightly off-topic) how do I go about understanding functions like "setdefault" and "zip"? The documentation only gets you so far, and then it loses you! I recently moved over from Python in Autodesk Maya and am now working on programming more OS stuff.

Thank you so much for your help!

T
bvdet
Moderator
 
Join Date: Oct 2006
Location: Nashville, TN
Posts: 1,563
#6: Sep 15 '09

re: simple XML question


To begin with, I like this website for learning the basics of XML.

There are several types of nodes, one type being ELEMENT_NODE.
>>> xmlDoc.ELEMENT_NODE
1
>>> xmlDoc.firstChild.nodeType
1
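One way to see which node types actually turn up is to translate the integer codes back into their constant names. The sketch below assumes a small made-up document and a hypothetical `describe` helper; it relies only on the `*_NODE` constants defined on `xml.dom.Node`.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Map each integer code back to its constant name, e.g. 1 -> 'ELEMENT_NODE'.
type_names = dict((getattr(Node, name), name)
                  for name in dir(Node) if name.endswith("_NODE"))

# Made-up document containing a comment, an element and some text.
doc = parseString('<root><!-- note --><item id="1">text</item></root>')

def describe(node, depth=0):
    """Return (depth, nodeName, type name) for every descendant of node,
    in document order."""
    rows = []
    for child in node.childNodes:
        rows.append((depth, child.nodeName, type_names[child.nodeType]))
        rows.extend(describe(child, depth + 1))
    return rows

rows = describe(doc)
for depth, name, kind in rows:
    print("%s%s: %s" % ("  " * depth, name, kind))
```

Comments and text show up as child nodes alongside elements, which is why the parser above filters on `nodeType == 1`. Attributes are not child nodes; they are reached through `.attributes`.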
To see a list of the available attributes of an XML document parseString instance:
>>> for item in dir(xmlDoc):
...     print item
...     
ATTRIBUTE_NODE
CDATA_SECTION_NODE
COMMENT_NODE
DOCUMENT_FRAGMENT_NODE
DOCUMENT_NODE................................
Built-in function dict() returns a dictionary. Built-in function zip() returns a list of tuples, where the i-th tuple contains the i-th element from each of the argument sequences. Example:
>>> s1 = [1,2,3]
>>> s2 = ['a','b','c']
>>> zip(s1, s2)
[(1, 'a'), (2, 'b'), (3, 'c')]
>>> dict(zip(s1,s2))
{1: 'a', 2: 'b', 3: 'c'}
>>> 
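A short sketch of the same pairing applied to attribute names and values, as in the parser above. The lists are typed in by hand for illustration. One thing to be aware of: on Python 3, `zip()` returns a lazy iterator rather than a list, so it is wrapped in `list()` here to display the tuples (harmless on Python 2).

```python
names = ["Address", "Owner", "Phone"]
values = ["3rd area", "tarik", "6453222"]

# Pair the i-th name with the i-th value.
pairs = list(zip(names, values))
print(pairs)    # [('Address', '3rd area'), ('Owner', 'tarik'), ('Phone', '6453222')]

# dict() accepts any sequence of (key, value) pairs, so zip feeds it directly.
record = dict(zip(names, values))
print(record["Address"])    # 3rd area

# zip stops at the shortest input; leftover names are silently dropped.
short = list(zip(names, ["only one"]))
print(short)    # [('Address', 'only one')]
```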
Dictionary method setdefault(key[, x]) returns the value of the dictionary key if the key exists, otherwise returns x and sets the dictionary key to x. Example:
>>> dd = dict(zip(s1,s2))
>>> dd.setdefault(3, [])
'c'
>>> dd.setdefault(4, [])
[]
>>> dd
{1: 'a', 2: 'b', 3: 'c', 4: []}
>>> dd.setdefault(4, []).append('d')
>>> dd
{1: 'a', 2: 'b', 3: 'c', 4: ['d']}
>>> 
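The `setdefault(key, []).append(...)` idiom is what lets the parser collect several nodes under one name. A minimal sketch with made-up (name, value) pairs, next to the longhand it replaces:

```python
# Made-up data: two entries share the key "amman".
pairs = [("amman", "3rd area"), ("zarqa", "1st area"), ("amman", "5th area")]

# setdefault returns the existing list for the key, or stores and returns
# a new empty list the first time the key is seen.
grouped = {}
for name, area in pairs:
    grouped.setdefault(name, []).append(area)

# The same grouping written out by hand.
longhand = {}
for name, area in pairs:
    if name not in longhand:
        longhand[name] = []
    longhand[name].append(area)

print(grouped["amman"])       # ['3rd area', '5th area']
print(grouped == longhand)    # True
```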
Member
 
Join Date: Jun 2007
Posts: 66
#7: Sep 16 '09

re: simple XML question


Could not have been clearer! Thank you, master.

Cheers

T
Tags
minidom, xml