Bytes IT Community

HTTP request doesn't work!

Could someone please let me know what I'm doing wrong here:
#!/usr/bin/python

import httplib

WEB_SITE = 'adsl.internode.on.net'
#WEB_SITE = 'www.google.com'
#PAGE_PATH = '/about.html'
PAGE_PATH = '/htm/un-metered-sites-ip-list.htm'
http = httplib.HTTP(WEB_SITE)
http.putrequest('GET', PAGE_PATH)
http.putheader('Accept', 'text/html')
http.putheader('Accept', 'text/plain')
http.endheaders()
httpcode, httpmsg, headers = http.getreply()
print 'msg: ' + httpmsg
print httpcode
doc = http.getfile()
data = doc.read()
doc.close()
print data


The output of the above is different from what you see if you open the
following URL in your browser:
http://adsl.internode.on.net/htm/un-...es-ip-list.htm
What's my problem?!
Jul 18 '05 #1
2 Replies


ba**********@clinicdesignNOSPAM.com.au wrote:
> Could someone please let me know what i'm doing wrong here:


You're using the obsolete httplib.HTTP class.

<http://www.python.org/doc/lib/module-httplib.html>

Try using HTTPConnection and HTTPResponse objects:
import httplib

webhost = 'adsl.internode.on.net'
pagepath = '/htm/un-metered-sites-ip-list.htm'

http_conn = httplib.HTTPConnection(webhost)
http_conn.request('GET', pagepath)

http_resp = http_conn.getresponse()
print (http_resp.status, http_resp.reason)
# prints: (200, 'OK')
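To actually fetch the page body rather than just the status line, the same connection-style API can be wrapped in a small helper. This is a sketch, not the poster's code; the try/except import is only there so the snippet also runs under the Python 3 module name (http.client), and the real network call is left commented out:

```python
try:
    import httplib                    # Python 2 module name
except ImportError:
    import http.client as httplib     # Python 3 renamed it

def fetch(host, path, headers=None):
    """Return (status, reason, body) for a plain HTTP GET request."""
    conn = httplib.HTTPConnection(host)
    conn.request('GET', path, headers=headers or {})
    resp = conn.getresponse()
    body = resp.read()
    conn.close()
    return resp.status, resp.reason, body

# Example (performs a real network request):
# status, reason, body = fetch('adsl.internode.on.net',
#                              '/htm/un-metered-sites-ip-list.htm')
```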

--
\ "I stayed up all night playing poker with tarot cards. I got a |
`\ full house and four people died." -- Steven Wright |
_o__) |
Ben Finney <http://bignose.squidly.org/>
Jul 18 '05 #2

f29
> The output of the above is different if you go to the following in
> your browser:
> http://adsl.internode.on.net/htm/un-...es-ip-list.htm
> Whats my problem?!?!??!


Try adding a User-Agent header from some popular browser (e.g.
"Mozilla/5.0 (Windows; U; Windows NT; en-US; rv:1.6) Gecko") so that the
remote site cannot tell your script is a robot and block it.

Moreover, have a look at the urllib2 module; it is very powerful.
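For instance, a User-Agent header can be attached via a Request object. A minimal sketch: the try/except fallback to urllib.request is only there so the snippet also runs on modern interpreters, and the urlopen call is commented out because it performs a real network request:

```python
try:
    import urllib2                       # Python 2 module name
except ImportError:
    import urllib.request as urllib2     # Python 3 equivalent

url = 'http://adsl.internode.on.net/htm/un-metered-sites-ip-list.htm'
req = urllib2.Request(url, headers={
    'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT; en-US; rv:1.6) Gecko',
})

# html = urllib2.urlopen(req).read()   # uncomment to actually fetch the page
```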

rgrds,
f29
Jul 18 '05 #3
