xml aggregator

Hi all,

I am trying to write an XML aggregator, but so far I've been failing
miserably.

What I want to do:

I have entries in a list format: [[[key1,value],[key2,value],[key3,value]], value]

Example:
[[["route","23"],["equip","jr2"],["time","3pm"]],"my first value"]
[[["route","23"],["equip","jr1"],["time","3pm"]],"my second value"]
[[["route","23"],["equip","jr2"],["time","3pm"]],"my third value"]
[[["route","24"],["equip","jr2"],["time","3pm"]],"my fourth value"]
[[["route","25"],["equip","jr2"],["time","3pm"]],"my fifth value"]

The tree I want in the end would be:

<results>
  <route id="23">
    <equip id="jr2">
      <time id="3pm">
        <data>my first value</data>
        <data>my third value</data>
      </time>
    </equip>
    <equip id="jr1">
      <time id="3pm">
        <data>my second value</data>
      </time>
    </equip>
  </route>
  <route id="24">
    <equip id="jr2">
      <time id="3pm">
        <data>my fourth value</data>
      </time>
    </equip>
  </route>
  <route id="25">
    <equip id="jr2">
      <time id="3pm">
        <data>my fifth value</data>
      </time>
    </equip>
  </route>
</results>

If anyone has an idea for an implementation or any code (I was trying with
ElementTree)...

thank you so much

Jul 9 '06 #1
7 Replies


kepioo wrote:
> I am trying to write an xml aggregator, but so far, i've been failing
> miserably.
> [snip example data]
> If anyone has an idea of implemetation or any code ( i was trying with
> ElementTree...

(You should have posted the code you tried)

The code below might help (though you should test it more than I have).
The 'findall' function comes from here:

http://gflanagan.net/site/python/ele...ementfilter.py

it's not the elementtree one.

Gerard

----------------------------------

X = [[[["route","23"],["equip","jr2"],["time","3pm"]],"my first value"],
     [[["route","23"],["equip","jr1"],["time","3pm"]],"my second value"],
     [[["route","23"],["equip","jr2"],["time","3pm"]],"my third value"],
     [[["route","24"],["equip","jr2"],["time","3pm"]],"my fourth value"],
     [[["route","25"],["equip","jr2"],["time","3pm"]],"my fifth value"],
     [[["route","25"],["equip","jr2"],["time","4pm"]],"my sixth value"]]

# reshape the data
records = []
for info, data in X:
    record = []
    for attr, val in info:
        record.append(val)
    record.append( data )
    records.append( record )

for r in records:
    print r

from elementtree.ElementTree import Element, SubElement, tostring
from elementfilter import findall

results = Element('results')

for r in records:
    routeid, equipid, timeid, data = r
    route, equip, time = None, None, None
    existing_route = findall(results, "route[@id=='%s']" % routeid)
    if existing_route:
        route = existing_route[0]
        existing_equip = findall(route, "equip[@id=='%s']" % equipid)
        if existing_equip:
            equip = existing_equip[0]
            existing_time = findall(equip, "time[@id=='%s']" % timeid)
            if existing_time:
                time = existing_time[0]
    route = route or SubElement(results, 'route', id=routeid)
    equip = equip or SubElement(route, 'equip', id=equipid)
    time = time or SubElement(equip, 'time', id=timeid)
    data = SubElement(time,'data')
    data.text = item

print tostring(results)

-------------------------------------------
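For readers on a current Python, here is a minimal sketch of the same find-or-create idea using only the standard library's xml.etree.ElementTree, whose find() accepts simple [@attrib='value'] predicates. This is an assumption-laden variant, not the code posted above: the helper names get_or_create and aggregate are hypothetical, the records are assumed to be already flattened to (route, equip, time, value), and the id values are assumed to contain no single quote.

```python
import xml.etree.ElementTree as ET

def get_or_create(parent, tag, id_value):
    """Return the direct child <tag id="id_value"> of parent, creating it if absent."""
    # Assumption: id_value contains no single quote, since it is pasted into the path.
    child = parent.find("%s[@id='%s']" % (tag, id_value))
    if child is None:  # compare with None rather than relying on Element truthiness
        child = ET.SubElement(parent, tag, id=id_value)
    return child

def aggregate(records):
    """Build <results> from (routeid, equipid, timeid, text) records."""
    results = ET.Element("results")
    for routeid, equipid, timeid, text in records:
        route = get_or_create(results, "route", routeid)
        equip = get_or_create(route, "equip", equipid)
        time = get_or_create(equip, "time", timeid)
        ET.SubElement(time, "data").text = text
    return results

records = [
    ("23", "jr2", "3pm", "my first value"),
    ("23", "jr1", "3pm", "my second value"),
    ("23", "jr2", "3pm", "my third value"),
]
print(ET.tostring(aggregate(records), encoding="unicode"))
```

Because each level is looked up before it is created, entries that share a route, equip and time end up under the same element regardless of input order.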

Jul 9 '06 #2

Gerard Flanagan wrote:
> The code below might help (though you should test it more than I have).
> The 'findall' function comes from here:
>
> http://gflanagan.net/site/python/ele...ementfilter.py
>
> it's not the elementtree one.

Sorry, elementfilter.py was a bit broken - fixed now. Use the current
one and change the code I posted to:

[...]
    existing_route = findall(results, "route[@id==%s]" % routeid)  # changed line
    if existing_route:
        route = existing_route[0]
        existing_equip = findall(route, "equip[@id=='%s']" % equipid)
        if existing_equip:
[...]

ie. don't quote the route id since it's numeric.

Gerard

Jul 9 '06 #3

Thanks a lot for the code.

It did not work the first time (item and existing_time were not
recognized, so I replaced item with r[-1] and existing_time with
existing_equip).
However, it is not producing the result I expected: it doesn't group
elements of the same category together, it creates a new block of XML
for each entry:
<results>
  <LINK id="23">
    <EQUIPMENT id="jr2">
      <TIMESTAMP id="3pm">
        <data>my first value</data>
      </TIMESTAMP>
    </EQUIPMENT>
  </LINK>
  <LINK id="23">
    <EQUIPMENT id="jr1">
      <TIMESTAMP id="3pm">
        <data>my second value</data>
      </TIMESTAMP>
    </EQUIPMENT>
  </LINK>
  <LINK id="23">
    <EQUIPMENT id="jr2">
      <TIMESTAMP id="3pm">
        <data>my third value</data>
      </TIMESTAMP>
    </EQUIPMENT>
  </LINK>
  <LINK id="24">
    <EQUIPMENT id="jr2">
      <TIMESTAMP id="3pm">
        <data>my fourth value</data>
      </TIMESTAMP>
    </EQUIPMENT>
  </LINK>
  <LINK id="25">
    <EQUIPMENT id="jr2">
      <TIMESTAMP id="3pm">
        <data>my fifth value</data>
      </TIMESTAMP>
    </EQUIPMENT>
  </LINK>
  <LINK id="25">
    <EQUIPMENT id="jr2">
      <TIMESTAMP id="4pm">
        <data>my sixth value</data>
      </TIMESTAMP>
    </EQUIPMENT>
  </LINK>
</results>

The idea behind all that is:

I want to create an XML file that carries XSL instructions.

The XSL will sort the entries and display something like:

Route 23:
  *jr1
    *3pm
      value
      value
      value
    *5pm
      value
      value
  *jr2
    *3pm
      value
      value
      value
    *5pm
      value
      value
Route 29:
  *jr1
    *3pm
      value
      value
      value
    *5pm
      value
      value
  *jr2
    *3pm
      value
      value
      value
    *5pm
      value
      value

I know this is feasible with XSLT 2.0, but I need something compatible
with quite old browsers, and XSLT 2.0 is not even working on my computer (I
could upgrade, but I cannot ask all the users to do so). That's why I
thought rearranging the XML would do it.

Do you have any other ideas? Do you think this is the best choice?

More information about the application I am writing: I am parsing a
feed, extracting some data and producing reports. The application is
running on Spyce, so I don't have to produce an output file; I just
print the XML and it is automatically written to the HTML
page we view.

Thanks again for your help.
Jul 10 '06 #4

kepioo wrote:
> thanks a lot for the code.
>
> It was not working the first time (do not recognize item and
> existing_time --

Apologies, I ran the code from PythonWin, which remembers names that
were previously declared though deleted - I should have run it as a
script.

> i changed item by r[-1] and existing_time by
> existing_equip).

'item' was wrong but not the other two. (I'm assuming your data is
regular - ie. all the records have the same number of fields)

change the for loop to the following:

8<------------------------------------------------------

for routeid, equipid, timeid, data in records:
    route, equip, time = None, None, None
    existing_route = findall(results, "route[@id==%s]" % routeid)
    if existing_route:
        route = existing_route[0]
        existing_equip = findall(route, "equip[@id==%s]" % equipid)
        if existing_equip:
            equip = existing_equip[0]
            existing_time = findall(equip, "time[@id==%s]" % timeid)
            if existing_time:
                time = existing_time[0]
    route = route or SubElement(results, 'route', id=routeid)
    equip = equip or SubElement(route, 'equip', id=equipid)
    time = time or SubElement(equip, 'time', id=timeid)
    dataitem = SubElement(time,'data')
    dataitem.text = data

8<------------------------------------------------------

> however, it is not producing the result i expected, as in it doesn't
> group by same category the elements, it creates a new block of xml
> [...]

the changes above should give you what you want - remember, as I wrote
in the previous post, it should be:

"[@id==%s]"

not

"[@id=='%s']"

ie. no single quotes needed.

With the above amended code I get:

<results>
  <route id="23">
    <equip id="jr2">
      <time id="3pm">
        <data>my first value</data>
        <data>my third value</data>
      </time>
    </equip>
    <equip id="jr1">
      <time id="3pm">
        <data>my second value</data>
      </time>
    </equip>
  </route>
  <route id="24">
    <equip id="jr2">
      <time id="3pm">
        <data>my fourth value</data>
      </time>
    </equip>
  </route>
  <route id="25">
    <equip id="jr2">
      <time id="3pm">
        <data>my fifth value</data>
      </time>
      <time id="4pm">
        <data>my sixth value</data>
      </time>
    </equip>
  </route>
</results>
------------------------------------

all the best

Gerard

ps. this newsgroup prefers that you don't top-post.

Jul 10 '06 #5

Thank you so much, it works and it rocks!

The bad thing I need to figure out is why Mozilla cannot parse my XSL
sheet, although it works in IE (most of my users are using IE).

So the module you wrote tops up ElementTree with XPath
capabilities, is that right? Does the new ElementTree do that? Which one is
the most appropriate?

By the way, are you French?

Regards,

Nassim

Jul 10 '06 #6

kepioo wrote:
> Thank you so much, it works and it rocks !

Great! Glad I could help.

> bad thing i need ot figure out is why mozilla cannot parse my xsl
> sheet, but it works in IE ( most of my users are using IE)

you could try transforming the xml on the server and send straight HTML
to the client - if you were to use CherryPy (http://www.cherrypy.org),
there is a filter which does this called picket:

http://www.cherrypy.org/wiki/Picket

you would also need to install 4Suite (http://4suite.org)
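
If installing an XSLT stack is not an option, the server-side idea can also be sketched in plain Python: walk the aggregated tree and emit nested HTML lists directly. This is a swapped-in technique, not picket or 4Suite; the helper name to_html_list and the sample tree are hypothetical, and the standard library's xml.etree.ElementTree is assumed.

```python
import xml.etree.ElementTree as ET

def to_html_list(node):
    """Render the children of node as a nested <ul> (hypothetical helper)."""
    ul = ET.Element("ul")
    for child in node:
        li = ET.SubElement(ul, "li")
        if child.tag == "data":
            li.text = child.text                       # leaf: the stored value
        else:
            li.text = "%s %s" % (child.tag, child.get("id"))
            li.append(to_html_list(child))             # recurse into the group
    return ul

# Sample input in the shape produced earlier in the thread.
results = ET.fromstring(
    '<results><route id="23"><equip id="jr2"><time id="3pm">'
    '<data>my first value</data><data>my third value</data>'
    '</time></equip></route></results>'
)
html = ET.tostring(to_html_list(results), encoding="unicode", method="html")
print(html)
```

Since the browser then receives ordinary HTML, differences between the XSLT engines in Mozilla and IE no longer matter.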

> so the module u wrote is to top up element tree with Xpath
> capabilities, is it?

it minimally extends the functionality of elementtree's existing
'findall' function - and it hasn't been put to much use, so let me know
if you run into problems.

the idea came from reading about the 'Specification Pattern':

pdf - http://www.martinfowler.com/apsupp/spec.pdf
good luck!

Gerard

Jul 10 '06 #7

kepioo wrote:
> i have entries, in a list format :[[key1,value],[key2,value],[
> key3,value]], value]
>
> the tree i want in the end would be :

assuming that the actual order of the subelements doesn't matter, you
could simply sort the array, and use groupby to group related tags:

import elementtree.ElementTree as ET  # Python 2.5+: import xml.etree.ElementTree as ET
import itertools

data = [
    ([["route","23"],["equip","jr2"],["time","3pm"]],"my first value"),
    ([["route","23"],["equip","jr1"],["time","3pm"]],"my second value"),
    ([["route","23"],["equip","jr2"],["time","3pm"]],"my third value"),
    ([["route","24"],["equip","jr2"],["time","3pm"]],"my fourth value"),
    ([["route","25"],["equip","jr2"],["time","3pm"]],"my fifth value")
]

def group(data, index):
    # sort first: groupby only merges adjacent items with equal keys
    return itertools.groupby(sorted(data), lambda x: x[0][index])

root = ET.Element("results")

for key, items in group(data, 0):
    route = ET.SubElement(root, key[0], id=key[1])
    for key, items in group(items, 1):
        equip = ET.SubElement(route, key[0], id=key[1])
        for key, items in group(items, 2):
            time = ET.SubElement(equip, key[0], id=key[1])
            for data in items:
                ET.SubElement(time, "data").text = data[1]

ET.dump(root)
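
The three nested loops can also be folded into one recursive function, which is handy if the number of key levels may change. The following is only a sketch under assumptions: the function name build is hypothetical, every entry is assumed to carry the same number of [tag, id] pairs, and the standard library's xml.etree.ElementTree is used. Unlike the code above it sorts on one key level at a time, so (sorted being stable) values inside a group keep their input order.

```python
import itertools
import xml.etree.ElementTree as ET

def build(parent, rows, depth=0):
    """Attach rows to parent, grouping on the [tag, id] pair at position depth."""
    rows = list(rows)
    if not rows:
        return
    if depth == len(rows[0][0]):            # all key levels consumed: emit leaves
        for _, text in rows:
            ET.SubElement(parent, "data").text = text
        return
    level = lambda row: tuple(row[0][depth])    # e.g. ("route", "23")
    # groupby only merges adjacent rows, hence the sort on the same key
    for (tag, ident), rows_in_group in itertools.groupby(sorted(rows, key=level), key=level):
        build(ET.SubElement(parent, tag, id=ident), rows_in_group, depth + 1)

entries = [
    ([["route", "23"], ["equip", "jr2"], ["time", "3pm"]], "my first value"),
    ([["route", "23"], ["equip", "jr1"], ["time", "3pm"]], "my second value"),
    ([["route", "23"], ["equip", "jr2"], ["time", "3pm"]], "my third value"),
    ([["route", "24"], ["equip", "jr2"], ["time", "3pm"]], "my fourth value"),
    ([["route", "25"], ["equip", "jr2"], ["time", "3pm"]], "my fifth value"),
]
tree = ET.Element("results")
build(tree, entries)
print(ET.tostring(tree, encoding="unicode"))
```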

if you want prettyprinted output, use this function

http://effbot.org/zone/element-lib.htm#prettyprint

on the resulting tree.
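
On Python 3.9 or later, the standard library's xml.etree.ElementTree ships an indent() function that does a similar job in place. A minimal sketch, assuming that module rather than the standalone elementtree package used above; the tiny sample tree is made up for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.Element("results")
route = ET.SubElement(root, "route", id="23")
ET.SubElement(route, "data").text = "my first value"

ET.indent(root, space="  ")   # rewrites whitespace in place and returns None
pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```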

</F>

Jul 10 '06 #8
