
socket timeout / m2crypto.urllib problems


I have a test script below which I use to fetch URLs into strings,
either over https or http. Over https I use M2Crypto's m2urllib, and
over http I use the standard urllib. Whenever I import socket and call
setdefaulttimeout, however, using m2urllib tends to cause an
httplib.BadStatusLine to be raised, even if the timeout is set to be
very large. All of the documents in the test script can be accessed
publicly.

Any ideas? Is there a better/easier way to get https docs in python?

Thanks,
JDH

import urllib, socket
from cStringIO import StringIO
from M2Crypto import Rand, SSL, m2urllib

#comment out this line and the script generally works, but without it
#my zope process, which is using this code, hangs.
socket.setdefaulttimeout(200)

def url_to_string(source):
    """
    get url as string, for https and http
    """
    if source.startswith('https:'):
        sh = StringIO()
        url = m2urllib.FancyURLopener()
        url.addheader('Connection', 'close')
        u = url.open(source)

        while 1:
            data = u.read()
            if not data: break
            sh.write(data)
        return sh.getvalue()
    else:
        return urllib.urlopen(source).read()

if __name__=='__main__':
    s1 = url_to_string('https://crcdocs.bsd.uchicago.edu/crcdocs/Files/informatics.doc')

    s2 = url_to_string('http://yahoo.com')

    s3 = url_to_string('https://crcdocs.bsd.uchicago.edu/crcdocs/Files/facepage.doc')
    print len(s1), len(s2), len(s3)

Jul 18 '05 #1


John Hunter <jd******@ace.bsd.uchicago.edu> writes:
[...]
Any ideas? Is there a better/easier way to get https docs in python?

[...]

Python 2.3 has https support built-in even on Windows.
John
Jul 18 '05 #2
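
To illustrate the reply's point that the standard library can fetch https on its own, here is a minimal sketch. It is written against a current Python 3 standard library (urllib.request), not the Python 2.3 discussed in the thread, and that is an assumption of this example. The name url_to_string mirrors the question's function; the timeout and opener parameters are hypothetical additions, so that each request carries its own timeout instead of relying on socket.setdefaulttimeout.

```python
from urllib.request import urlopen


def url_to_string(source, timeout=200, opener=urlopen):
    """Return the body at `source` (http or https) as bytes.

    `timeout` and `opener` are illustrative parameters, not part of the
    original script; `opener` exists only so the function can be
    exercised without network access.
    """
    # urlopen handles both http and https, so no scheme check is needed.
    # The timeout is passed per request rather than set process-wide.
    response = opener(source, timeout=timeout)
    try:
        return response.read()
    finally:
        response.close()
```

Called as url_to_string('http://yahoo.com'), this takes the place of both branches of the original function. Whether it also avoids the BadStatusLine seen with m2urllib is not something the thread confirms.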
