XML/encoding/prolog/python hell...

I am a beginning pythoner and I am having a terrible time trying to
figure out how to do something that (it would seem to me) should be
fairly simple.

I have a CSV file of unknown encoding and I need to parse that file to
get the fields <--- DONE
I need to create an xml document that has the proper prolog and
namespace information in it. <--- NOT DONE
I need it to be encoded properly <--- Looks right in IE, not right in
any other app.

I should say that I have googled my butt off, tried ElementTree,
CSV2XML, and various other things and cannot get any of them to work.

A sample of the output I am looking for is as follows:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns2:boxes xmlns:ns2="Boxes">
<ns2:box id="9" mac="333333" d_a="2006" hw_ver="v1.1" sw_ver="3"
pl_h="No Data" name="Lounge" address="here" phone="555-5555"
country="US" city="LA"/>
<ns2:box id="7" mac="444444" d_a="2005" hw_ver="v1.0" sw_ver="3"
pl_h="No Data" name="MyHouse" address="there" phone="555-5556"
country="US" city="New York"/>
</ns2:boxes>

Is there some fundamental thing I am not getting? I cannot get
'tostrings' to work in ElementTree and I cannot figure the prolog out.

I posted a similar message back in January, but haven't had much luck.

PS
No, I haven't been trying to do this since January; more important
things came up at work and I have just revived this. :)

Mar 28 '07 #1
8 Replies


On Mar 28, 12:40 pm, "fscked" <fsckedag...@gmail.com> wrote:
<---SNIP--->

I've never done this, but I found a recipe on the ActiveState website
that looks like it would be helpful:

http://aspn.activestate.com/ASPN/Coo.../Recipe/159100

I think you could modify it to make it work.

You could probably also use a combination of the csv module and the
pyxml module (links below).

http://pyxml.sourceforge.net/topics/
http://www.rexx.com/~dkuhlman/pyxmlfaq.html

I also found a Python XML book: http://www.oreilly.com/catalog/pytho...pter/ch01.html

I hope that helps. I've started my own adventure into XML with XRC and
wxPython.

Mike

Mar 28 '07 #2

ky******@gmail.com wrote:
<---SNIP--->
> I've never done this, but I found a recipe on the ActiveState website
> that looks like it would be helpful:
>
> http://aspn.activestate.com/ASPN/Coo.../Recipe/159100

I tried looking at that but couldn't figure out how to get the
property file working.

> I think you could modify it to make it work.
>
> You could probably also use a combination of the csv module and the
> pyxml module (links below).
>
> http://pyxml.sourceforge.net/topics/
> http://www.rexx.com/~dkuhlman/pyxmlfaq.html

These are a little too confusing for me. :)

> I also found a Python XML book: http://www.oreilly.com/catalog/pytho...pter/ch01.html
>
> I hope that helps. I've started my own adventure into XML with XRC and
> wxPython.
>
> Mike
Mar 28 '07 #3

On Wed, 28 Mar 2007 15:06:28 -0300, <ky******@gmail.com> wrote:
> You could probably also use a combination of the csv module and the
> pyxml module (links below).
>
> http://pyxml.sourceforge.net/topics/

Note the pyxml summary page on SourceForge: "PyXML is no longer
maintained."

--
Gabriel Genellina

Mar 28 '07 #4

fscked wrote:
> I am a beginning pythoner and I am having a terrible time trying to
> figure out how to do something that (it would seem to me) should be
> fairly simple.

Show us code. As concise as possible. Then we might be able to help you.

Diez
Mar 28 '07 #5

Here is what I currently have. Still missing prolog information and
namespace info. Encoding is irritating me also. :)

import os,sys
import csv
from elementtree.ElementTree import Element, SubElement, ElementTree, tostring

def indent(elem, level=0):
    i = "\n" + level*" "
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + " "
        for elem in elem:
            indent(elem, level+1)
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i

root = Element("boxes")
myfile = open('ClientsXMLUpdate.csv')
csvreader = csv.reader(myfile)

for row in csvreader:
    mainbox = SubElement(root, "box")
    r2 = csv.reader(myfile)
    b = r2.next()
    mainbox.attrib["city"] = b[10]
    mainbox.attrib["country"] = b[9]
    mainbox.attrib["phone"] = b[8]
    mainbox.attrib["address"] = b[7]
    mainbox.attrib["name"] = b[6]
    mainbox.attrib["pl_heartbeat"] = b[5]
    mainbox.attrib["sw_ver"] = b[4]
    mainbox.attrib["hw_ver"] = b[3]
    mainbox.attrib["date_activated"] = b[2]
    mainbox.attrib["mac_address"] = b[1]
    mainbox.attrib["boxid"] = b[0]

indent(root)
ElementTree(root).write('test.xml', "UTF-8")

Mar 29 '07 #6

Any ideas?

Apr 3 '07 #7

fscked wrote:
> I am a beginning pythoner and I am having a terrible time trying to
> figure out how to do something that (it would seem to me) should be
> fairly simple.
>
> I have a CSV file of unknown encoding and I need to parse that file to
> get the fields <--- DONE
> I need to create an xml document that has the proper prolog and
> namespace information in it. <--- NOT DONE
> I need it to be encoded properly <--- Looks right in IE, not right in
> any other app.

UTF-8 encoding is the default. No need for a prologue here.
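
A small check of that claim, as a sketch using the ElementTree that ships in today's standard library (`xml.etree.ElementTree`, an assumption here since the thread used the standalone `elementtree` package). The element and attribute value are made up for illustration:

```python
import xml.etree.ElementTree as ET

# UTF-8 bytes containing a non-ASCII character and no XML declaration at all.
data = '<box city="Z\u00fcrich"/>'.encode("utf-8")

# The parser assumes UTF-8 when no declaration is present.
elem = ET.fromstring(data)
city = elem.get("city")
print(city)
```

If a consumer mangles such a file, it is probably not treating the input as UTF-8, which an explicit declaration may or may not fix.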

ET 1.3 will have an xml_declaration keyword argument for write() that will
allow you to write the declaration even if unnecessary. lxml already has it
now (and is ET compatible, so your code should just straight work).

http://codespeak.net/lxml
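
For readers on a current Python, the standard-library `xml.etree.ElementTree` accepts the `xml_declaration` argument described above. The sketch below shows it, plus a hand-written prolog for the case where `standalone="yes"` is required, since the standard-library `write()` has no standalone parameter. The element content is placeholder data:

```python
import io
import xml.etree.ElementTree as ET

root = ET.Element("boxes")
ET.SubElement(root, "box", id="9", city="LA")

# Option 1: let write() emit the declaration.
buf = io.BytesIO()
ET.ElementTree(root).write(buf, encoding="UTF-8", xml_declaration=True)
with_declaration = buf.getvalue()

# Option 2: write the prolog by hand when standalone="yes" is wanted.
# tostring() with encoding="utf-8" returns bytes without a declaration,
# so the two pieces can simply be concatenated.
prolog = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
with_standalone = prolog + ET.tostring(root, encoding="utf-8")
```

Either byte string can be written to a file opened in binary mode.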

> I should say that I have googled my butt off, tried ElementTree,
> CSV2XML, and various other things and cannot get any of them to work.
>
> A sample of the output I am looking for is as follows:
>
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> <ns2:boxes xmlns:ns2="Boxes">
> <ns2:box id="9" mac="333333" d_a="2006" hw_ver="v1.1" sw_ver="3"
> pl_h="No Data" name="Lounge" address="here" phone="555-5555"
> country="US" city="LA"/>
> <ns2:box id="7" mac="444444" d_a="2005" hw_ver="v1.0" sw_ver="3"
> pl_h="No Data" name="MyHouse" address="there" phone="555-5556"
> country="US" city="New York"/>
> </ns2:boxes>

This should help you to get namespaces working:

http://effbot.org/zone/element.htm#xml-namespaces
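
As a rough illustration of the approach on that page, ElementTree names a namespaced element as `{uri}localname` and picks a prefix when serializing. The sketch below uses the standard-library `xml.etree.ElementTree`. The prefix `bx` is an arbitrary choice for this example: the standard library reserves prefixes of the form `ns` plus digits for its own use, so `ns2` cannot be registered there. The prefix has no meaning to a namespace-aware parser, only the URI does.

```python
import xml.etree.ElementTree as ET

NS = "Boxes"  # namespace URI taken from the sample output in the question

# register_namespace() is process-wide; "bx" is a hypothetical prefix.
ET.register_namespace("bx", NS)

# Element names are written as "{namespace-uri}localname".
root = ET.Element("{%s}boxes" % NS)
box = ET.SubElement(root, "{%s}box" % NS)
box.set("id", "9")
box.set("city", "LA")

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

If the exact `ns2` prefix matters to the consumer, lxml's `nsmap` argument to `Element` is probably the easier route.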

Hope it helps,
Stefan
Apr 14 '07 #8

with lxml (although untested):

fscked wrote:
import os,sys
import csv
from lxml.etree import Element, SubElement, ElementTree, tostring

root = Element("{Boxes}boxes")
myfile = open('ClientsXMLUpdate.csv')
csvreader = csv.reader(myfile)

for row in csvreader:
    mainbox = SubElement(root, "{Boxes}box")
    # each row yielded by the loop is one CSV record
    mainbox.set("city", row[10])
    [...]

# lxml's write() takes encoding as a keyword argument
ElementTree(root).write('test.xml', encoding="UTF-8", xml_declaration=True,
                        pretty_print=True)
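
A standard-library variant of the same idea for Python 3, as a sketch only. Several things here are assumptions rather than anything confirmed in the thread: the column order (taken from the posted script), the attribute names (taken from the sample output), the `bx` prefix, and the list of candidate encodings tried for the CSV.

```python
import csv
import io
import xml.etree.ElementTree as ET

NS = "Boxes"  # namespace URI from the sample output
# Column order assumed from the posted script; names from the sample output.
FIELDS = ["id", "mac", "d_a", "hw_ver", "sw_ver", "pl_h",
          "name", "address", "phone", "country", "city"]


def decode_csv(raw, encodings=("utf-8", "cp1252")):
    """Try each candidate encoding in turn; the candidate list is a guess."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte, so this cannot fail (but may mis-decode).
    return raw.decode("latin-1")


def csv_to_xml(raw):
    """Build the namespaced document and return it as UTF-8 bytes."""
    ET.register_namespace("bx", NS)  # "bx" is an arbitrary prefix
    root = ET.Element("{%s}boxes" % NS)
    # Iterate a single reader; each row becomes one <box> element.
    for row in csv.reader(io.StringIO(decode_csv(raw), newline="")):
        if len(row) < len(FIELDS):
            continue  # skip blank or short rows
        box = ET.SubElement(root, "{%s}box" % NS)
        for name, value in zip(FIELDS, row):
            box.set(name, value)
    prolog = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
    return prolog + ET.tostring(root, encoding="utf-8")


sample = "9,333333,2006,v1.1,3,No Data,Lounge,here,555-5555,US,LA\n".encode("utf-8")
document = csv_to_xml(sample)
```

Using one reader and consuming `row` directly avoids skipping records, which can happen when a second `csv.reader` is created on the same file inside the loop.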

Hope it helps,
Stefan
Apr 14 '07 #9
