reading xml data

heiro
56 New Member
I have an XML file that looks like this:
<xml>
  <process name="proc1">
    <mkdir>directory</mkdir>
    <copyfile>src,dst</copyfile>
  </process>

  <process name="proc2">
    <copyfile>src,dst</copyfile>
  </process>
</xml>
My problem is: how can I get the child nodes of the process elements named "proc1" and "proc2"?
I also need to get the values of those child nodes (for example, "src,dst" for <copyfile>).
I'm using the xml.dom.minidom module.

I'm hoping for your response, guys...
Dec 4 '07 #1
heiro
56 New Member
Does anyone know? Please help...
Dec 5 '07 #2
bvdet
2,851 Recognized Expert Moderator Specialist
You will need to create a parser, something like this:
from xml.dom.minidom import parse

fn = 'sample.xml'

dom1 = parse(fn)
# global variable required by handleData
nameList = ["proc1", "proc2"]

def getText(nodelist):
    # Concatenate the data of all text nodes in nodelist.
    rc = ""
    for node in nodelist:
        if node.nodeType == node.TEXT_NODE:
            rc = rc + node.data
    return rc

def handleData(nodelist, *args):
    # For each node whose "name" attribute is in nameList, collect the
    # first child element matching each tag name given in args.
    resList = []
    for node in nodelist:
        if str(node.attributes["name"].value) in nameList:
            for arg in args:
                resList.append(node.getElementsByTagName(arg))
    return [item[0] for item in resList if item]

# Print the text of every <copyfile> element in the document.
for item in dom1.getElementsByTagName("copyfile"):
    print(getText(item.childNodes))

process_elements = dom1.getElementsByTagName('process')
print(process_elements)

print(handleData(process_elements, "mkdir", "copyfile"))

for item in handleData(process_elements, "mkdir", "copyfile"):
    print(getText(item.childNodes))
Contents of sample.xml:
<xml>
  <process name="proc1">
    <mkdir>directory</mkdir>
    <copyfile>src,dst</copyfile>
  </process>

  <process name="proc2">
    <copyfile>src,dst</copyfile>
  </process>
  <process name="proc3">
    <mkdir>directory</mkdir>
    <copyfile>src,dst</copyfile>
  </process>
  <process name="proc4">
    <mkdir>directory</mkdir>
    <copyfile>src,dst</copyfile>
  </process>
</xml>
Output from above code:
>>> src,dst
src,dst
src,dst
src,dst
[<DOM Element: process at 0xed2670>, <DOM Element: process at 0xed2f58>, <DOM Element: process at 0xedb4b8>, <DOM Element: process at 0xedb788>]
[<DOM Element: mkdir at 0xed2e68>, <DOM Element: copyfile at 0xed2e90>, <DOM Element: copyfile at 0xedb0a8>]
directory
src,dst
src,dst
>>>
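
If the child tag names are not known in advance, another option is to walk childNodes of each matching process element and keep only the element nodes. The following is a minimal sketch, assuming Python 3 and an inlined copy of the XML from the question; the helper names element_text and children_by_process are made up for illustration.

```python
from xml.dom.minidom import parseString

# Inlined copy of the XML from the original question (assumed layout).
SAMPLE = """<xml>
  <process name="proc1">
    <mkdir>directory</mkdir>
    <copyfile>src,dst</copyfile>
  </process>
  <process name="proc2">
    <copyfile>src,dst</copyfile>
  </process>
</xml>"""


def element_text(element):
    """Concatenate the text nodes directly under an element."""
    return "".join(
        child.data for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    ).strip()


def children_by_process(dom, wanted_names):
    """Map each wanted process name to a list of (tag, text) pairs."""
    result = {}
    for process in dom.getElementsByTagName("process"):
        name = process.getAttribute("name")
        if name not in wanted_names:
            continue
        # childNodes also contains whitespace text nodes, so keep elements only.
        result[name] = [
            (child.tagName, element_text(child))
            for child in process.childNodes
            if child.nodeType == child.ELEMENT_NODE
        ]
    return result


dom = parseString(SAMPLE)
children = children_by_process(dom, ["proc1", "proc2"])
for name, pairs in children.items():
    print(name, pairs)
```

The element-node filter matters because minidom keeps the whitespace between tags as text nodes, so iterating childNodes without it would also visit those.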
Dec 5 '07 #3
heiro
56 New Member
Thanks, bvdet. I'll try this one. Thanks!
Dec 6 '07 #4
bvdet
2,851 Recognized Expert Moderator Specialist
You are welcome. I am learning about XML and DOM also.
Dec 6 '07 #5
heiro
56 New Member
I know this is too much :-).
I'd like to ask another favor. What if I want the output to look like this:

process name="proc1"
mkdir: directory
copyfile: src,dst


process name="proc2"
copyfile: src,dst


process name="proc3"
mkdir: directory
copyfile: src,dst

And how can I parse an XML child node which looks like this:
<download ='ftp' user='username' password='password'>

Thanks in advance, bvdet. I hope you can help me with this a second time...
Dec 7 '07 #6
bvdet
2,851 Recognized Expert Moderator Specialist
Create a function to format the data:
from xml.dom.minidom import parse

# global variables required by formatData
nameList = ["proc1", "proc2"]
nodeIDlist = ['name',]

# Note: getText is the helper function defined in the earlier post.
def formatData(nodelist, *args):
    resList = []
    for node in nodelist:
        for id in nodeIDlist:
            try:
                s = str(node.attributes[id].value)
                if s in nameList:
                    # node.nodeName is the tag name, e.g. 'process'
                    resList.append('%s name=%s' % (node.nodeName, s))
                    for arg in args:
                        try:
                            resList.append('  %s: %s' % (arg, getText(node.getElementsByTagName(arg)[0].childNodes)))
                        except IndexError:
                            # no child element with this tag name
                            pass
            except KeyError:
                # node has no attribute with this name
                pass
    return '\n'.join(resList)

dom1 = parse('sample.xml')
Example session:
>>> process_elements = dom1.getElementsByTagName('process')
>>> process_elements
[<DOM Element: process at 0xf8bb98>, <DOM Element: process at 0xf8b918>, <DOM Element: process at 0xf8b710>, <DOM Element: process at 0xf87490>]
>>> print(formatData(process_elements, "mkdir", "copyfile"))
process name=proc1
  mkdir: directory1
  copyfile: src1,dst1
process name=proc2
  copyfile: src2,dst2
>>> 
The string <download ='ftp' user='username' password='password'> does not appear to be valid XML. Shouldn't there be an attribute name to the left of the equal sign after 'download'?
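
A quick way to check this is to hand the tag to minidom and see whether the underlying expat parser rejects it. This is a minimal sketch, assuming Python 3; both strings are written as self-closing tags so that only the attribute syntax is being tested, and the attribute name 'server' in the second string is an assumption chosen for illustration.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError

# The tag as posted: no attribute name before the first '='.
BAD = "<download ='ftp' user='username' password='password'/>"
# A hypothetical corrected form; the attribute name 'server' is an assumption.
GOOD = "<download server='ftp' user='username' password='password'/>"


def is_well_formed(xml_text):
    """Return True if minidom can parse the text, False otherwise."""
    try:
        parseString(xml_text)
    except ExpatError:
        return False
    return True


print(is_well_formed(BAD))
print(is_well_formed(GOOD))
```

minidom reports malformed input by raising xml.parsers.expat.ExpatError, so wrapping the parse call in try/except is enough for this kind of check.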
Dec 7 '07 #7
heiro
56 New Member
It actually looks like this:

<process name='download'>
<download server='ftp' user='username' password='******'>
<destination>path</destination>
<unzip>*.jpg, *.doc, *.pdf</unzip>
</download>
</process>

Actually, I'm writing a program right now, and its output depends on the XML.
You've helped me a lot, bvdet. Thanks, man.
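
For a structure like this one, the attributes of <download> and the text of its child elements can be read separately. The following is only a sketch under some assumptions: Python 3, an inlined copy of the snippet with its spacing normalized, a placeholder password, and a helper name (read_download) invented for the example.

```python
from xml.dom.minidom import parseString

# Inlined copy of the snippet from this post; the password is a placeholder.
SAMPLE = """<process name='download'>
  <download server='ftp' user='username' password='******'>
    <destination>path</destination>
    <unzip>*.jpg, *.doc, *.pdf</unzip>
  </download>
</process>"""


def read_download(element):
    """Return (attributes, children) dictionaries for a <download> element."""
    # element.attributes is a NamedNodeMap; items() yields (name, value) pairs.
    attrs = dict(element.attributes.items())
    children = {}
    for child in element.childNodes:
        if child.nodeType != child.ELEMENT_NODE:
            continue  # skip whitespace text nodes between tags
        text = "".join(
            n.data for n in child.childNodes if n.nodeType == n.TEXT_NODE
        )
        children[child.tagName] = text.strip()
    return attrs, children


dom = parseString(SAMPLE)
download = dom.getElementsByTagName("download")[0]
attrs, children = read_download(download)
print(attrs["server"], attrs["user"])
# Split the comma-separated file patterns into a list.
patterns = [p.strip() for p in children["unzip"].split(",")]
print(children["destination"], patterns)
```

Returning plain dictionaries keeps the rest of the program independent of the DOM objects; a single attribute can also be read directly with element.getAttribute('server').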
Dec 8 '07 #8
bvdet
2,851 Recognized Expert Moderator Specialist
I have played around with XML parsing, and I made a new function. It is kind of ugly and does not work exactly the way I want, so maybe someone can improve it. Following is the complete code:
from xml.dom.minidom import parse

def getText(nodelist):
    # Collect the data of all non-blank text nodes in nodelist.
    rc = []
    for node in nodelist:
        if node.nodeType == node.TEXT_NODE:
            s = node.data.strip()
            if s:
                rc.append(node.data)
    return '\n'.join(rc)

def nodeName(node):
    # Tag name for element nodes, '' for anything else (e.g. text nodes).
    if node.nodeType == node.ELEMENT_NODE:
        return node.nodeName
    return ''

def getDataList(nodelist, **kargs):
    # kargs maps a tag name to {attribute name: list of accepted values};
    # an empty list accepts any value.
    resList = []
    for node in nodelist:
        node_name = nodeName(node)
        if node_name in kargs:
            # node is always an element here, because nodeName returns ''
            # for every other node type.
            for id in kargs[node_name].keys():
                try:
                    s = str(node.attributes[id].value)
                    v = kargs[node_name][id]
                    if not v or s in v:
                        resList.append('%s %s=%s' % (node_name, id, s))
                        print('DOM element = %s' % node_name)
                        lines = []
                        for elem in node.childNodes:
                            nm = nodeName(elem)
                            prefix = nm + ': ' if nm else ''
                            lines.append('  %s%s' % (prefix, getText(elem.childNodes)))
                        print('\n'.join([i for i in lines if i.strip()]))
                except KeyError as e:
                    print('Invalid node attribute: %s' % e)
    return resList

fn = r'H:\TEMP\temsys\sampleXML.txt'

dom1 = parse(fn)

process_elements = dom1.getElementsByTagName('process')
download_elements = dom1.getElementsByTagName('download')

elemDict = {'process': {'name': ["proc1", "proc2"]}, 'download': {'server': ['ftp', ]}}
x = getDataList(process_elements, **elemDict)
y = getDataList(download_elements, **elemDict)

print('')
print(x)
print(y)
Output:
>>> DOM element = process
mkdir: directory1
mkdir: directory11
mkdir: directory111
copyfile: src1,dst1
DOM element = process
copyfile: src2,dst2
DOM element = download
destination: path
unzip: *.jpg, *.doc, *.pdf

['process name=proc1', 'process name=proc2']
['download server=ftp']
>>>
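
One possible simplification, offered as a sketch rather than a drop-in replacement for the code above: a small recursive function that returns indented lines for an element, its attributes, and all nested elements, instead of printing while it filters. It assumes Python 3; the inlined sample document and the function name describe are made up for illustration, and no filtering by attribute value is done.

```python
from xml.dom.minidom import parseString

# Hypothetical sample combining the structures discussed in this thread.
SAMPLE = """<xml>
  <process name="proc1">
    <mkdir>directory1</mkdir>
    <copyfile>src1,dst1</copyfile>
  </process>
  <process name="download">
    <download server="ftp" user="username">
      <destination>path</destination>
    </download>
  </process>
</xml>"""


def describe(element, depth=0):
    """Return indented lines for an element, its attributes, and descendants."""
    attrs = "".join(
        ' %s="%s"' % (key, value)
        for key, value in sorted(element.attributes.items())
    )
    # Only text directly under this element; nested elements recurse below.
    text = "".join(
        n.data for n in element.childNodes if n.nodeType == n.TEXT_NODE
    ).strip()
    line = "  " * depth + element.tagName + attrs
    if text:
        line += ": " + text
    lines = [line]
    for child in element.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            lines.extend(describe(child, depth + 1))
    return lines


dom = parseString(SAMPLE)
lines = []
for process in dom.getElementsByTagName("process"):
    lines.extend(describe(process))
print("\n".join(lines))
```

Because the function returns a list of strings, the caller decides whether to print, filter, or store the result, which separates the formatting from the tree walk.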
Dec 12 '07 #9
