Help with optimisation

Hello,
I know this might be a little cheeky, and if it is, please say, but I need a
little hand optimising some code. Because this is 'company' code and I have
no idea what I'm allowed to release, I've changed anything that could give
an indication of the company, if that makes any sense...

for the code below:
text_buffer is a single record from an XML stream. I can't read in the
entire XML at once because it isn't all available straight away, so I
capture line by line, and when a full message is available I use parseString
under the minidom API.
The SQL version is SQLite. It was recommended to me, and is adequate for the
uses I put it to.
The function doesn't return anything, but it's called often enough and
depending on the optimisation I'll be able to use the same style in other
areas of the program.

previous code:
def CreatePerson(text_buffer):
    dom = xml.dom.minidom.parseString(text_buffer)
    reflist = dom.getElementsByTagName('Country')
    Country = reflist[0].firstChild.nodeValue
    reflist = dom.getElementsByTagName('Age')
    Age = reflist[0].firstChild.nodeValue
    reflist = dom.getElementsByTagName('Surname')
    Surname = reflist[0].firstChild.nodeValue
    reflist = dom.getElementsByTagName('Forename')
    Forename = reflist[0].firstChild.nodeValue
    cursor.execute('INSERT INTO Person VALUES(?,?,?)',
                   (Forename + "-" + Surname, Age, Country))
    connection.commit()

I've changed it now to this:
def CreatePerson(text_buffer):
    dom = xml.dom.minidom.parseString(text_buffer)
    elements = ['Country', 'Age', 'Surname', 'Forename']
    Values = []
    for element in elements:
        reflist = dom.getElementsByTagName(element)
        Values.append(reflist[0].firstChild.nodeValue)
    # I can get away with the above because I know the structure of the XML
    cursor.execute('INSERT INTO Person VALUES(?,?,?)',
                   (Forename + "-" + Surname, Age, Country))
    connection.commit()

They both seem ugly IMO (read: longer than intuitively necessary), and so I
was wondering whether there was any way to combine Forename and Surname
together within the Values list (think merge cells with the '-' in between)
so I could use the unary(?) operator within the SQL?

I suppose if this is a cheeky request then I won't get any replies.
Thank you for any help
Dominic

Aug 13 '07 #1
special_dragonfly <Do*****@PLEASEASK.co.uk> wrote:
...
dom=xml.dom.minidom.parseString(text_buffer)
If you need to optimize code that parses XML, use ElementTree (some
other parsers are also fast, but minidom ISN'T).
Alex
Aug 13 '07 #2
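
For anyone following Alex's suggestion, here is a minimal sketch of the same function on top of ElementTree, written for current Python 3 (the thread itself predates it). The table layout, the in-memory database and the sample record are assumptions made up for illustration; the original post does not show the real schema or XML.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumed setup: an in-memory database with a three-column Person table,
# chosen only to match the three-value INSERT in the original post.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE Person (Name TEXT, Age TEXT, Country TEXT)")


def create_person(text_buffer):
    """Parse one XML record with ElementTree and insert it into Person."""
    root = ET.fromstring(text_buffer)
    # './/Tag' returns the text of the first matching descendant, which is
    # roughly what getElementsByTagName(tag)[0].firstChild.nodeValue did.
    forename = root.findtext(".//Forename")
    surname = root.findtext(".//Surname")
    age = root.findtext(".//Age")
    country = root.findtext(".//Country")
    cursor.execute(
        "INSERT INTO Person VALUES (?, ?, ?)",
        (forename + "-" + surname, age, country),
    )
    connection.commit()


# Hypothetical record; the real messages will look different.
sample = (
    "<Person><Forename>Ada</Forename><Surname>Lovelace</Surname>"
    "<Age>36</Age><Country>UK</Country></Person>"
)
create_person(sample)
```

Note that findtext returns None when a tag is missing, so real code would probably want to check for that before concatenating.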
special_dragonfly wrote:
Hello,
(snip)
The function doesn't return anything, but it's called often enough and
depending on the optimisation I'll be able to use the same style in other
areas of the program.

A common Python optimisation trick is to store local references to save
on attribute lookup time, i.e.:

    import xml.dom.minidom

    # reference to parseString, looked up once at import time
    dom_parseString = xml.dom.minidom.parseString

    def CreatePerson(text_buffer):
        dom = dom_parseString(text_buffer)
        elements = ['Country', 'Age', 'Surname', 'Forename']
        values = []
        # local reference to the bound method, looked up once per call
        getElementsByTagName = dom.getElementsByTagName
        for element in elements:
            reflist = getElementsByTagName(element)
            values.append(reflist[0].firstChild.nodeValue)

But as Alex already pointed out, you'd be better off using (c)ElementTree.

"They both seem ugly IMO (read: longer than intuitively necessary),"

I'd say this is a common problem with XML :-/
Aug 13 '07 #3
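
On the original question of merging Forename and Surname inside the values list: one possible approach is to join the two while building the list, then hand the list straight to execute. This is only a sketch in Python 3; the helper first_text, the table layout and the sample record are hypothetical. If the "unary operator" meant was the * argument-unpacking syntax (an assumption), it should not be needed here, because sqlite3's execute takes the parameters as a single sequence.

```python
import sqlite3
import xml.dom.minidom

# Assumed setup: in-memory database with a three-column Person table.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE Person (Name TEXT, Age TEXT, Country TEXT)")


def first_text(dom, tag):
    """Hypothetical helper: text of the first element named `tag`."""
    return dom.getElementsByTagName(tag)[0].firstChild.nodeValue


def create_person(text_buffer):
    dom = xml.dom.minidom.parseString(text_buffer)
    values = [
        # Merge the two name fields into one value with '-' in between.
        "-".join(first_text(dom, tag) for tag in ("Forename", "Surname")),
        first_text(dom, "Age"),
        first_text(dom, "Country"),
    ]
    # The list is passed as-is; each item fills one '?' placeholder in order.
    cursor.execute("INSERT INTO Person VALUES (?, ?, ?)", values)
    connection.commit()


# Hypothetical record; the real messages will look different.
sample = (
    "<Person><Country>FR</Country><Age>41</Age>"
    "<Surname>Curie</Surname><Forename>Marie</Forename></Person>"
)
create_person(sample)
```

The order of the list has to match the column order of the table, so the tag names are spelled out rather than looped over.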
