Bytes | Software Development & Data Engineering Community

Rookie Speaks

I'm a python rookie, anyone have any suggestions to streamline this
function? Thanks in advance.....
import urllib
from xml.dom import minidom

def getdata(myurl):
    sock = urllib.urlopen(myurl)
    xmlSrc = sock.read()
    sock.close()

    xmldoc = minidom.parseString(xmlSrc)

    def getattrs(weatherAttribute):
        a = xmldoc.getElementsByTagName(weatherAttribute)
        return a[0].firstChild.data

    currname = getattrs("name")
    currtemp = getattrs("fahrenheit")
    currwind = getattrs("wind")
    currdew = getattrs("dewpoint")
    currbarom = getattrs("relative_humidity")
    currhumid = getattrs("barometric_pressure")
    currcondi = getattrs("conditions")

    print "%13s\t%s\t%s\t%s\t%s\t%s\t%s" % (currname, currtemp,
        currwind, currbarom, currdew, currhumid, currcondi)

Jul 18 '05 #1
|Thus Spake William S. Perrin On the now historical date of Wed, 07 Jan
2004 17:37:57 -0600|
> I'm a python rookie, anyone have any suggestions to streamline this
> function? Thanks in advance.....


Please define "streamline" in this context.

Do you mean:
faster
smaller
easier to read
etc.

Sam Walters.

--
Never forget the halloween documents.
http://www.opensource.org/halloween/
""" Where will Microsoft try to drag you today?
Do you really want to go there?"""

Jul 18 '05 #2
Sorry, I guess I mean: is it efficient? That is, if I called it 1000 times.....

Samuel Walters wrote:
> |Thus Spake William S. Perrin On the now historical date of Wed, 07 Jan
> 2004 17:37:57 -0600|
>
>> I'm a python rookie, anyone have any suggestions to streamline this
>> function? Thanks in advance.....
>
> Please define "streamline" in this context.
>
> Do you mean:
> faster
> smaller
> easier to read
> etc.
>
> Sam Walters.


Jul 18 '05 #3
|Thus Spake William S. Perrin On the now historical date of Wed, 07 Jan
2004 17:45:38 -0600|
> Sorry, I guess I mean: is it efficient? That is, if I called it 1000
> times.....


Ponders... Even "efficient" has a loose meaning here. Since you describe
running it a thousand times, I'll assume you mean speed of execution.

There's a saying "Premature optimization is the root of all evil." Which
is another way of saying "Try it, and if it's too slow, figure out what
the hold up is. If it's not too slow, don't mess with it." So, try
running it in the context you need it in. Nothing about your code screams
"bad implementation." In fact, it's quite clearly written. Still, you
won't know if it's too slow until you try it.

There's a way to get a definitive answer on how fast it's running
through the profile module in python. Do some research on that module. If
you're confused about it, come back and ask more questions then.
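
For anyone following along, here is a minimal sketch of what a profiling run looks like. It is written for current Python 3 rather than the Python 2 used in this thread, and it uses cProfile, which offers the same interface as the profile module. The function slow_sum and its workload are hypothetical stand-ins, not anything from the original poster's code.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload: sum the squares of the integers below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under the profiler and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and show only the five costliest entries.
    stats.strip_dirs().sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


result, report = profile_call(slow_sum, 100_000)
print(report)
```

The report lists each function with its call count and cumulative time, which is usually enough to tell where the time goes.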

Take into consideration that it may not be your code that's slow, but
rather the way you're getting your information. This is called being "I/O
bound." The holdup might not be the program, but instead the disk or the
network. After all, you can't process information until you have the
information. The profile module will help you to see if this is the
problem.
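
One rough way to check for this, sketched in Python 3 under stated assumptions: time the fetch and the parse separately. The fake_fetch function below is a hypothetical stand-in that sleeps instead of touching the network, and SAMPLE_XML is an invented document, so the numbers only illustrate the method.

```python
import time
from xml.dom import minidom

# Invented sample document; the real feed's layout is not shown in the thread.
SAMPLE_XML = "<weather><name>Test</name><fahrenheit>41</fahrenheit></weather>"


def fake_fetch(delay=0.05):
    """Stand-in for a network read: sleep, then return a canned document."""
    time.sleep(delay)
    return SAMPLE_XML


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


xml_src, fetch_seconds = timed(fake_fetch)
doc, parse_seconds = timed(minidom.parseString, xml_src)
print("fetch: %.4fs  parse: %.4fs" % (fetch_seconds, parse_seconds))
```

If the fetch time dominates in the real program, speeding up the parsing code will not help much.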

One of the slicker solutions to slow code is the psyco module. It can
give an amazing speed boost to many processing-intensive functions, but it
can sometimes even slow down your program.

If you'd like to see an example of both the psyco and profile modules in
action, let me know and I'll give you some more understandable code that I
once wrote to see what types of things psyco is good at optimizing.

HTH

Sam Walters.

Jul 18 '05 #4
sdd
William S. Perrin wrote:
> I'm a python rookie, anyone have any suggestions to streamline this
> function? Thanks in advance.....
>
> ...
>     currname = getattrs("name")
>     currtemp = getattrs("fahrenheit")
>     currwind = getattrs("wind")
>     currdew = getattrs("dewpoint")
>     currbarom = getattrs("relative_humidity")
>     currhumid = getattrs("barometric_pressure")
>     currcondi = getattrs("conditions")
>
>     print "%13s\t%s\t%s\t%s\t%s\t%s\t%s" % (currname, currtemp, currwind,
>         currbarom, currdew, currhumid, currcondi)


How about:
    name, temp, wind, dew, barom, humid, condi = map(getattrs,
        "name fahrenheit wind dewpoint relative_humidity "
        " barometric_pressure conditions".split())

    print "%13s\t%s\t%s\t%s\t%s\t%s\t%s" % (name, temp, wind,
        barom, dew, humid, condi)

-Scott David Daniels
Sc***********@Acm.Org
Jul 18 '05 #5
Anytime you find yourself repeating the same pattern of
code (i.e. the getattrs bit), there's usually a more elegant
way of doing it.

import urllib
from xml.dom import minidom

def getdata(myurl):
    sock = urllib.urlopen(myurl)
    xmlSrc = sock.read()
    sock.close()

    xmldoc = minidom.parseString(xmlSrc)

    def getattrs(weatherAttribute):
        a = xmldoc.getElementsByTagName(weatherAttribute)
        return a[0].firstChild.data

    attributes = ['name', 'fahrenheit', 'wind',
                  'dewpoint', 'relative_humidity',
                  'barometric_pressure', 'conditions']

    current = {}

    for a in attributes:
        current[a] = getattrs(a)

    format_str = "%13s"+"\t%s"*(len(attributes)-1)
    print format_str % tuple([current[a] for a in attributes])

OR, if all you want is to print your numbers, skip the dictionary:

    attributes = ['name', 'fahrenheit', 'wind',
                  'dewpoint', 'relative_humidity',
                  'barometric_pressure', 'conditions']

    format_str = "%13s"+"\t%s"*(len(attributes)-1)
    print format_str % tuple([getattrs(a) for a in attributes])
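
The same list-driven idea can be tried without a network connection. The sketch below is Python 3, not the Python 2 of the posts above, and it parses an invented sample document (SAMPLE_XML, with made-up values) because the real feed's layout is not shown in this thread. The helper names read_fields and format_row are hypothetical.

```python
from xml.dom import minidom

# Invented sample document; the tag names are the ones used in the thread.
SAMPLE_XML = """<weather>
  <name>Springfield</name>
  <fahrenheit>41</fahrenheit>
  <wind>NW 5</wind>
  <dewpoint>30</dewpoint>
  <relative_humidity>65</relative_humidity>
  <barometric_pressure>30.1</barometric_pressure>
  <conditions>Cloudy</conditions>
</weather>"""

ATTRIBUTES = ["name", "fahrenheit", "wind", "dewpoint",
              "relative_humidity", "barometric_pressure", "conditions"]


def read_fields(xml_src, names):
    """Parse once, then take the text of the first element for each name."""
    doc = minidom.parseString(xml_src)
    return {name: doc.getElementsByTagName(name)[0].firstChild.data
            for name in names}


def format_row(fields, names):
    """Build a tab-separated row; the first column is padded to 13 chars."""
    format_str = "%13s" + "\t%s" * (len(names) - 1)
    return format_str % tuple(fields[name] for name in names)


current = read_fields(SAMPLE_XML, ATTRIBUTES)
print(format_row(current, ATTRIBUTES))
```

Adding or reordering a column then means editing only the ATTRIBUTES list.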
Jul 18 '05 #6
William S. Perrin wrote:

I think your function has a sane design :-) XML is slow by design, but in
your case it doesn't really matter, because it is probably I/O-bound, as
already pointed out by Samuel Walters.

Below is a slightly different approach that uses a class:

import urllib
from xml.dom import minidom

class Weather(object):
    def __init__(self, url=None, xml=None):
        """ Will accept either a URL or a xml string,
        preferably as a keyword argument """
        if url:
            if xml:
                # not sure what would be the right exception here
                # (ValueError?), so keep it generic for now
                raise Exception("Must provide either url or xml, not both")
            sock = urllib.urlopen(url)
            try:
                xml = sock.read()
            finally:
                sock.close()
        elif xml is None:
            raise Exception("Must provide either url or xml")
        self._dom = minidom.parseString(xml)

    def getAttrFromDom(self, weatherAttribute):
        a = self._dom.getElementsByTagName(weatherAttribute)
        return a[0].firstChild.data

    def asRow(self):
        # this will defeat lazy attribute lookup
        return "%13s\t%s\t%s\t%s\t%s\t%s\t%s" % (self.name,
            self.fahrenheit, self.wind, self.barometric_pressure,
            self.dewpoint, self.relative_humidity, self.conditions)

    def __getattr__(self, name):
        try:
            value = self.getAttrFromDom(name)
        except IndexError:
            raise AttributeError(
                "'%.50s' object has no attribute '%.400s'" %
                (self.__class__, name))
        # now set the attribute so it need not be looked up
        # in the dom next time
        setattr(self, name, value)
        return value

This has a slight advantage if you are interested only in a subset of the
attributes, say the temperature:

for url in listOfUrls:
    print Weather(url).fahrenheit

Here getAttrFromDom() - the equivalent of your getattrs() - is only called
once per URL. The possibility to print a tab-delimited row is still there,

print Weather(url).asRow()

but will of course defeat this optimization scheme.
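
The caching behaviour can be checked in isolation. This is a cut-down Python 3 sketch of the same __getattr__ idea, not Peter's class itself: LazyWeather, the _lookups counter and the sample document are all hypothetical additions for demonstration.

```python
from xml.dom import minidom

# Invented sample document for the demonstration.
SAMPLE_XML = "<weather><name>Springfield</name><fahrenheit>41</fahrenheit></weather>"


class LazyWeather:
    """Look up each field in the DOM on first access, then cache it."""

    def __init__(self, xml):
        self._dom = minidom.parseString(xml)
        self._lookups = 0  # counts DOM searches, for demonstration only

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails.
        if name.startswith("_"):
            # Never search the DOM for private names; avoids recursion
            # if _dom or _lookups has not been set yet.
            raise AttributeError(name)
        self._lookups += 1
        nodes = self._dom.getElementsByTagName(name)
        if not nodes:
            raise AttributeError(name)
        value = nodes[0].firstChild.data
        setattr(self, name, value)  # cache: later reads skip __getattr__
        return value


weather = LazyWeather(SAMPLE_XML)
print(weather.fahrenheit, weather.fahrenheit, weather._lookups)
```

Reading fahrenheit twice searches the DOM only once, because the second read finds the value stored on the instance and never reaches __getattr__.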

Peter
Jul 18 '05 #7
Samuel Walters <sw*************@yahoo.com> writes:
> If you'd like to see an example of both the psyco and profile modules in
> action, let me know and I'll give you some more understandable code that I
> once wrote to see what types of things psyco is good at optimizing.


I think this is generally interesting, and would be curious to see it,
if you'd care to share.
Jul 18 '05 #8
|Thus Spake Jacek Generowicz On the now historical date of Thu, 08 Jan
2004 11:43:01 +0100|
> Samuel Walters <sw*************@yahoo.com> writes:
>> If you'd like to see an example of both the psyco and profile modules
>> in action, let me know and I'll give you some more understandable code
>> that I once wrote to see what types of things psyco is good at
>> optimizing.
>
> I think this is generally interesting, and would be curious to see it,
> if you'd care to share.


Sure thing. The functions at the top are naive prime enumeration
algorithms. I chose them because they're each of a general "looping"
nature and I understand the complexity and methods of each of them. Some
use list-based (and hence linearly indexed) methods and some use
dictionary-based (and hence hash-bound) methods. One of them, sieve_list,
is commented out because it has such dog performance that I decided I
wasn't interested in how well it optimized.

These tests are by no means complete, nor is this probably a good example
of profiling or the manner in which psyco is useful. It's just from an
area where I understood the algorithmic bottlenecks to begin with.

Sam Walters.

from math import sqrt

def primes_list(Limits = 1,KnownPrimes = [ 2 ]):
    RetList = KnownPrimes
    for y in xrange(2,Limits + 1):
        w = y
        p, r = 0,0
        for x in RetList:
            if x*x > w:
                RetList.append(w)
                break
            p,r = divmod(y,x)
            if r == 0:
                w = p
    return RetList

def primes_dict(Limits = 1,KnownPrimes = [ 2 ]):
    RetList = KnownPrimes
    RetDict = {}
    for x in KnownPrimes:
        RetDict[x] = 1
        w = x + x
        n = 2
        while w <= Limits + 1:
            RetDict[w] = n
            w += x
            n += 1
    p, r = 0,0
    for y in xrange(2, Limits + 1):
        for x, z in RetDict.iteritems():
            if x*x > y:
                RetDict[y] = 1
                break
            p,r = divmod(y,x)
            if r == 0:
                RetDict[y] = p
                break
    return RetList

def sieve_list(Limits = 1, KnownPrimes = [ 2 ]):
    RetList = KnownPrimes
    CompList = [ ]
    for y in xrange(2, Limits + 1):
        if y not in CompList:
            w = y
            n = 1
            while w <= Limits:
                CompList.append(w)
                w += y
                n += 1
    return RetList

def sieve_list_2(Limits = 1, KnownPrimes = [ 2 ]):
    SieveList = [ 1 ]*(Limits )
    RetList = [ ]
    for y in xrange(2, Limits + 1):
        if SieveList[y-2] == 1:
            RetList.append(y)
            w = y + y
            n = 2
            while w <= Limits + 1:
                SieveList[w - 2] = n
                w += y
                n += 1
    return RetList

def sieve_dict(Limits = 1, KnownPrimes = [ 2 ]):
    SieveDict = { }
    RetList = KnownPrimes
    for x in KnownPrimes:
        SieveDict[x] = 1
        w = x + x
        n = 2
        while w <= Limits + 1:
            SieveDict[w] = n
            n += 1
            w += x

    for y in xrange(2, Limits + 1):
        if not SieveDict.has_key(y):
            RetList.append(y)
            w = y
            n = 1
            while w <= Limits + 1:
                SieveDict[w] = n
                w += y
                n += 1
    return RetList

if __name__ == "__main__":
    import sys
    import profile
    import pstats

    import psyco

    #this function wraps up all the calls that we wish to benchmark.
    def multipass(number, args):
        for x in xrange(1, number + 1):
            primes_list(args, [ 2 ])
            print ".",
            sys.stdout.flush()
            primes_dict(args, [ 2 ])
            print ".",
            sys.stdout.flush()
            #Do not uncomment this line unless you have a *very* long time to wait.
            #sieve_list(args)
            sieve_dict(args, [ 2 ])
            print ".",
            sys.stdout.flush()
            sieve_list_2(args, [ 2 ])
            print "\r \r%i/%i"%(x, number),
            sys.stdout.flush()
        print "\n"

    #number of times through the test
    passes = 5
    #find all primes up to maximum
    maximum = 1000000

    #create a profiling instance
    #adjust the argument based on your system.
    pr = profile.Profile( bias = 7.5e-06)

    #run the tests
    pr.run("multipass(%i, %i)"%(passes,maximum))
    #save them to a file.
    pr.dump_stats("primesprof")

    #remove the profiling instance so that we can get a clean comparison.
    del pr

    #create a profiling instance
    #adjust the argument based on your system.
    pr = profile.Profile( bias = 7.5e-06)

    #"recompile" each of the functions under consideration.
    psyco.bind(primes_list)
    psyco.bind(primes_dict)
    psyco.bind(sieve_list)
    psyco.bind(sieve_list_2)
    psyco.bind(sieve_dict)

    #run the tests
    pr.run("multipass(%i, %i)"%(passes,maximum))
    #save them to a file
    pr.dump_stats("psycoprimesprof")

    #clean up our mess
    del pr

    #load and display each of the run-statistics.
    pstats.Stats('primesprof').strip_dirs().sort_stats('cum').print_stats()
    pstats.Stats('psycoprimesprof').strip_dirs().sort_stats('cum').print_stats()
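
For readers who want to try the list-versus-dictionary comparison on a current interpreter, here is a much smaller Python 3 sketch. It is not a port of the functions above: both versions are a plain Sieve of Eratosthenes, the names sieve_with_list and sieve_with_dict are hypothetical, timing is done with timeit instead of the profile module, and psyco is left out because, as far as I know, it was never released for Python 3.

```python
import timeit


def sieve_with_list(limit):
    """Sieve of Eratosthenes using a list of flags indexed by number."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]  # 0 and 1 are not prime
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def sieve_with_dict(limit):
    """Same sieve, but composites are recorded as dictionary keys."""
    composite = {}
    primes = []
    for n in range(2, limit + 1):
        if n not in composite:
            primes.append(n)
            for multiple in range(n * n, limit + 1, n):
                composite[multiple] = n
    return primes


if __name__ == "__main__":
    # Time three runs of each variant on the same limit.
    for func in (sieve_with_list, sieve_with_dict):
        seconds = timeit.timeit(lambda: func(100_000), number=3)
        print("%-16s %.3fs for 3 runs" % (func.__name__, seconds))
```

Both functions return the same primes, so any difference in the timings comes from the container used for bookkeeping.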

Jul 18 '05 #9
On Fri, 2004-01-09 at 05:25, Samuel Walters wrote:
> |Thus Spake Jacek Generowicz On the now historical date of Thu, 08 Jan
> 2004 11:43:01 +0100|
>> Samuel Walters <sw*************@yahoo.com> writes:
>>> If you'd like to see an example of both the psyco and profile modules
>>> in action, let me know and I'll give you some more understandable code
>>> that I once wrote to see what types of things psyco is good at
>>> optimizing.
>>
>> I think this is generally interesting, and would be curious to see it,
>> if you'd care to share.
>
> Sure thing. The functions at the top are naive prime enumeration
> algorithms. I chose them because they're each of a general "looping"
> nature and I understand the complexity and methods of each of them. Some
> use list-based (and hence linearly indexed) methods and some use
> dictionary-based (and hence hash-bound) methods. One of them, sieve_list,
> is commented out because it has such dog performance that I decided I
> wasn't interested in how well it optimized.


Out of curiosity I ran your code, and obtained these results:

Fri Jan 9 08:30:25 2004 primesprof

23 function calls in 2122.530 CPU seconds

....

Fri Jan 9 08:43:24 2004 psycoprimesprof

23 function calls in -3537.828 CPU seconds

Does that mean that Armin Rigo has slipped some form of Einsteinian,
relativistic compiler into Psyco? I am reminded of the well-known
limerick:

There once was a lady called Bright,
Who could travel faster than light.
She went out one day,
In a relative way,
And came back the previous night.

--

Tim C

Jul 18 '05 #10
|Thus Spake Tim Churches On the now historical date of Fri, 09 Jan 2004
09:10:58 +1100|
> Does that mean that Armin Rigo has slipped some form of Einsteinian,
> relativistic compiler into Psyco?


No, no. It means one of two things: either you didn't adjust the constant
that tries to factor out the overhead of profiling, or the call took so
long that the timer actually overflowed.

This will help you set the proper constant:

-----
import profile
import pprint

tests = 20
cycles = 10000
pr = profile.Profile()
proflist = []
for x in xrange(1, tests + 1):
    proflist.append(pr.calibrate(cycles))

pprint.pprint(proflist)
-----

Increase cycles until your results don't exhibit much of a spread, then
take the lowest of those values. This is the constant you set when
instantiating a profiling object. It is specific to each individual
machine.
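
A Python 3 sketch of the same procedure, including the step of handing the constant back to the profiler. The helper name measure_bias and the small default counts are hypothetical choices to keep the run short; on a real machine you would raise cycles as described above.

```python
import profile


def measure_bias(tests=5, cycles=2000):
    """Calibrate several times and return every measured overhead."""
    pr = profile.Profile()
    return [pr.calibrate(cycles) for _ in range(tests)]


samples = measure_bias()
bias = min(samples)              # the advice above: take the lowest value
pr = profile.Profile(bias=bias)  # pass it when creating the profiler
print("bias estimate:", bias)
```

This applies only to the pure-Python profile module; the value measured on one machine should not be reused on another.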

If it *still* gives you negative times, then the timer is overflowing and
you need to adjust the original script so that you're not running through
such a big list of numbers.

Then your apparent problems with causality should be solved.

Sam Walters.

Jul 18 '05 #11
