urllib2 spinning CPU on read

Hello All,
I've run into this problem on several sites where urllib2 will hang
(using all the CPU) trying to read a page. I was able to reproduce it
for one particular site. I'm using Python 2.4.

import urllib2
url = 'http://www.wautomas.info'
request = urllib2.Request(url)
opener = urllib2.build_opener()
result = opener.open(request)
data = result.read()

It never returns from this read call.

I did some profiling to try and see what was going on and make sure it
wasn't my code. There was a huge number of calls to (and amount of
time spent in) socket.py:315(readline) and to recv. A large amount of
time was also spent in httplib.py:482(_read_chunked). Here's the
significant part of the statistics:

32564841 function calls (32563582 primitive calls) in 545.250 CPU seconds

Ordered by: internal time
List reduced from 416 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 10844775  233.920    0.000  447.440    0.000 socket.py:315(readline)
 10846078  152.430    0.000  152.430    0.000 :0(recv)
        3   97.330   32.443  544.730  181.577 httplib.py:482(_read_chunked)
 10844812   61.090    0.000   61.090    0.000 :0(join)

Also, where should I go to see if something like this has already been
reported as a bug?

Thanks for any help you can give me.
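
For readers following along: the profile shows about ten million readline and recv calls coming from only three calls to _read_chunked. One pattern that could produce this is a loop that waits for a '\r\n' terminator while the peer has already closed the connection, so every readline returns an empty string immediately. That is an assumption, not something confirmed in this thread. The sketch below imitates it with a hypothetical ClosedStream class; it is not the httplib source, and max_calls exists only so the demonstration terminates.

```python
class ClosedStream(object):
    # Hypothetical stand-in for a socket file object whose peer has closed.
    def __init__(self):
        self.calls = 0

    def readline(self):
        self.calls += 1
        return ''  # EOF: returns at once instead of blocking


def read_until_blank(fp, max_calls):
    # Waits for the CRLF terminator only. At EOF this never matches, so
    # without the max_calls cap the loop would spin forever.
    while fp.calls < max_calls:
        line = fp.readline()
        if line == '\r\n':
            return True
    return False


def read_until_blank_guarded(fp):
    # Same loop, but an empty read is treated as end of stream.
    while True:
        line = fp.readline()
        if not line:
            return False
        if line == '\r\n':
            return True


if __name__ == '__main__':
    spinning = ClosedStream()
    read_until_blank(spinning, 100000)
    print(spinning.calls)   # every iteration was an immediate empty read

    guarded = ClosedStream()
    read_until_blank_guarded(guarded)
    print(guarded.calls)    # stops after the first empty read
```

An empty string from readline means end of stream, so a loop that does not check for it burns CPU instead of blocking, which would match the "using all the CPU" symptom.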

Nov 26 '06 #1
3 Replies


"kdotsky" <kd*****@gmail.com> writes:
> Hello All,
> I've run into this problem on several sites where urllib2 will hang
> (using all the CPU) trying to read a page. I was able to reproduce it
> for one particular site. I'm using python 2.4
>
> import urllib2
> url = 'http://www.wautomas.info'
[...]
> Also, where should I go to see if something like this has already been
> reported as a bug?

I didn't try looking at your example, but I think it's likely a bug
both in that site's HTTP server and in httplib. If it's the same one
I saw, it's already been reported, but nobody has fixed it yet.

http://python.org/sf/1411097
John
Nov 28 '06 #2

> I didn't try looking at your example, but I think it's likely a bug
> both in that site's HTTP server and in httplib. If it's the same one
> I saw, it's already reported, but nobody fixed it yet.
>
> http://python.org/sf/1411097
> John

Thanks. I tried the example in the link you gave, and it appears to be
the same behavior.

Do you have any suggestions on how I could avoid this in the meantime?

Nov 28 '06 #3

"kdotsky" <kd*****@gmail.com> writes:
>> I didn't try looking at your example, but I think it's likely a bug
>> both in that site's HTTP server and in httplib. If it's the same one
>> I saw, it's already reported, but nobody fixed it yet.
>>
>> http://python.org/sf/1411097
>> John
>
> Thanks. I tried the example in the link you gave, and it appears to be
> the same behavior.
>
> Do you have any suggestions on how I could avoid this in the meantime?

Yes: read the recent messages on the tracker I linked to, and apply
the fix I suggest there.
John
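
For anyone who cannot patch httplib, a caller-side stopgap is to put a deadline around the read. This is not the fix from the tracker item, only a minimal sketch with two assumptions: a Unix platform (SIGALRM) and the call running in the main thread. The names ReadTimeout and call_with_deadline are made up for illustration.

```python
import signal


class ReadTimeout(Exception):
    """Raised when the wrapped call does not finish before the deadline."""


def _on_alarm(signum, frame):
    # Runs in the main thread between bytecodes, so it can break out of
    # a loop that is spinning in Python code.
    raise ReadTimeout("call did not finish in time")


def call_with_deadline(func, seconds):
    # Run func() and return its result, or raise ReadTimeout after
    # `seconds` whole seconds. Unix only, main thread only.
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        return func()
    finally:
        signal.alarm(0)                          # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler
```

With the code from the original post this would be used as `data = call_with_deadline(result.read, 30)`, catching ReadTimeout to skip the offending site. It only bounds the damage; the loop still spins until the alarm fires.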
Dec 1 '06 #4
