Bytes IT Community

urllib (and urllib2) read all data from page on open()?

The entire page is downloaded immediately, whether you want it or not, when
you make an HTTP request using urllib. This seems slightly broken to me.

Is there any way to turn this behaviour off and have the object's read method
actually read data from the socket when you ask it to?

Jul 18 '05 #1
2 Replies


Certainly under urllib2, handle.read(100) will read up to the next 100 bytes
from the handle, which is the same behaviour as the read method
for files.
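A rough way to check this yourself is sketched below. It assumes Python 3, where urllib.request is the successor to urllib2, and it uses a throwaway local server so the timing can be controlled. The names SlowBodyHandler, BODY, release and body_sent are made up for the illustration. The server sends the headers, then holds the body back until the client says it has returned from open(); after that the client pulls the body 100 bytes at a time.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"x" * 1000             # hypothetical 1000-byte "page"
release = threading.Event()    # set by the client once open() has returned
body_sent = threading.Event()  # set by the server just before it writes the body


class SlowBodyHandler(BaseHTTPRequestHandler):
    """Illustrative handler: headers first, body only after `release` is set."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()           # headers are written to the socket here
        release.wait(timeout=5)      # hold the body back for a moment
        body_sent.set()
        self.wfile.write(BODY)

    def log_message(self, *args):    # keep the demo quiet
        pass


server = HTTPServer(("127.0.0.1", 0), SlowBodyHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

# An empty ProxyHandler keeps any proxy environment variables out of the picture.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

response = opener.open(url, timeout=10)      # returns once the headers are parsed
opened_before_body = not body_sent.is_set()  # the body has not been sent yet
release.set()

first = response.read(100)                   # up to 100 bytes, like a file
chunks = [first]
while True:
    chunk = response.read(100)
    if not chunk:                            # b"" signals the end of the body
        break
    chunks.append(chunk)

response.close()
server.shutdown()
server.server_close()

print(opened_before_body, len(first), len(chunks))
```

In this setup open() comes back before the server has written any of the body, so at least in current Python the download is not done up front. This says nothing definite about the urllib2 release discussed in the thread; the source code of that version is the place to check.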

Regards,

Fuzzy
http://www.voidspace.org.uk/python/index.shtml

Jul 18 '05 #2


Alex Stapleton wrote:
> Except wouldn't it have already read the entire file when it opened, or does it occur on the first read()?

Don't know, sorry. Try looking at the source code - it should be
reasonably obvious.

> Also, will the data returned from handle.read(100) be raw HTTP? In which case, what if the encoding is chunked or gzipped?


No - you get HTML, with the HTTP stuff already handled (at least to
the best of my knowledge).
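The chunked and gzipped cases behave differently, and a small experiment makes that visible. The sketch below again assumes Python 3's urllib.request rather than the urllib2 of the thread, and a hypothetical local server (EncodingHandler, PAGE and the /chunked and /gzip paths are invented for the example). As far as I know, chunked transfer encoding is undone by the HTTP layer before read() returns, while a gzip Content-Encoding is passed through untouched and has to be decompressed by the caller. Note that urllib does not ask for gzip by default; the server here sends it regardless, purely to show what read() returns.

```python
import gzip
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>hello</body></html>"   # hypothetical page content


class EncodingHandler(BaseHTTPRequestHandler):
    """Illustrative handler: /chunked uses chunked transfer, anything else gzip."""

    protocol_version = "HTTP/1.1"            # chunked encoding needs HTTP/1.1

    def do_GET(self):
        self.send_response(200)
        self.send_header("Connection", "close")
        if self.path == "/chunked":
            self.send_header("Transfer-Encoding", "chunked")
            self.end_headers()
            # Each chunk is: hex length, CRLF, data, CRLF. A zero chunk ends the body.
            for piece in (PAGE[:10], PAGE[10:]):
                self.wfile.write(b"%x\r\n" % len(piece) + piece + b"\r\n")
            self.wfile.write(b"0\r\n\r\n")
        else:
            payload = gzip.compress(PAGE)
            self.send_header("Content-Encoding", "gzip")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    def log_message(self, *args):            # keep the demo quiet
        pass


server = HTTPServer(("127.0.0.1", 0), EncodingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

# An empty ProxyHandler keeps any proxy environment variables out of the picture.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

with opener.open(base + "/chunked", timeout=10) as r:
    chunked_body = r.read()                  # chunk framing already removed

with opener.open(base + "/gzip", timeout=10) as r:
    content_encoding = r.headers.get("Content-Encoding")
    raw_gzip_body = r.read()                 # still compressed bytes

# Decompressing is left to the caller.
if content_encoding == "gzip":
    decoded = gzip.decompress(raw_gzip_body)
else:
    decoded = raw_gzip_body

server.shutdown()
server.server_close()

print(chunked_body == PAGE, raw_gzip_body == PAGE, decoded == PAGE)
```

So "the HTTP stuff already handled" holds for headers and chunk framing in this experiment, but a gzipped body still arrives as gzip.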

Regards,
Fuzzy
http://www.voidspace.org.uk/python/index.shtml

Jul 18 '05 #3
