urllib2.HTTPError: HTTP Error 204: NoContent

I am getting the following error trying to download an html page using
urllib2.

urllib2.HTTPError: HTTP Error 204: NoContent

The url is of this type:

http://www.amazon.com/gp/offer-listi...N%3DB000KJX3A0

I can open it in my browser without problems. Any ideas on a solution?
Oct 19 '08 #1

On Oct 19, 2008, at 6:13 AM, silk.odyssey wrote:
> I am getting the following error trying to download an html page using
> urllib2.
>
> urllib2.HTTPError: HTTP Error 204: NoContent
>
> The url is of this type:
>
> http://www.amazon.com/gp/offer-listi...N%3DB000KJX3A0
>
> I can open it in my browser without problems. Any ideas on a solution?

Are you changing the user-agent? Some sites sniff user agents and
return different results to browsers than to suspected bots.

I'd try it from here if you post a self-contained sample that
demonstrates the problem. Should only take a couple of lines.
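
One quick way to see what a site receives is to print the User-Agent that urllib attaches by default. This is only a sketch and assumes Python 3, where urllib2 was folded into urllib.request; it sends nothing over the network.

```python
import urllib.request

# build_opener() returns an OpenerDirector; its addheaders list holds the
# headers that the opener attaches to every request it sends.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)

# The value has the form "Python-urllib/<major>.<minor>".
print(default_headers["User-agent"])
```

A server that filters on this string can easily tell the script apart from a browser.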

Oct 19 '08 #2
On Oct 19, 9:49 am, Philip Semanchuk <phi...@semanchuk.com> wrote:
> On Oct 19, 2008, at 6:13 AM, silk.odyssey wrote:
>> I am getting the following error trying to download an html page using
>> urllib2.
>> urllib2.HTTPError: HTTP Error 204: NoContent
>> The url is of this type:
>> http://www.amazon.com/gp/offer-listi...scriptionId%3D...
>> I can open it in my browser without problems. Any ideas on a solution?
>
> Are you changing the user-agent? Some sites sniff user agents and
> return different results to browsers than to suspected bots.

I tried it.
>>> import urllib2
>>> url = 'http://www.amazon.com/gp/offer-listing/B000KJX3A0%3FSubscriptionId%3D183VXJS74KNQ89D0NRR2%26tag%3Dws%26linkCode%3Dxm2%26camp%3D2025%26creative%3D386001%26creativeASIN%3DB000KJX3A0'
>>> op = urllib2.urlopen(url)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/urllib2.py", line 121, in urlopen
    return _opener.open(url, data)
  File "/usr/lib/python2.5/urllib2.py", line 380, in open
    response = meth(req, response)
  File "/usr/lib/python2.5/urllib2.py", line 491, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.5/urllib2.py", line 418, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.5/urllib2.py", line 353, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.5/urllib2.py", line 499, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 204: NoContent
>>> headers = {}
>>> headers['User-Agent'] = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3'
>>> ro = urllib2.Request(url, None, headers)
>>> op = urllib2.urlopen(ro)
>>> page = op.read()
>>> page
(lots of HTML)

So the answer is as Philip suggests - amazon.com doesn't like
'Python-urllib/2.5' as a User-Agent. You have to give it something that
looks like a browser.
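
For anyone on Python 3, where urllib2 became urllib.request, the same idea might look roughly like the sketch below. The helper names build_request and fetch are made up for illustration, the User-Agent string is simply the one from the session above, and www.example.com is a placeholder URL. Nothing is fetched until fetch() is called.

```python
import urllib.error
import urllib.request

# Browser-like User-Agent copied from the session above; any string that
# the target site accepts would do (assumption: the site only checks this).
BROWSER_UA = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) "
              "Gecko/2008092417 Firefox/3.0.3")


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=BROWSER_UA):
    """Fetch url and return (status, body).

    Statuses that urllib raises as HTTPError are returned instead of raised.
    """
    try:
        with urllib.request.urlopen(build_request(url, user_agent)) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()


# No network access here: just confirm the header is set on the request.
# Request stores header names capitalized, hence "User-agent".
req = build_request("http://www.example.com/")
print(req.get_header("User-agent"))
```

Calling fetch(url) with the real URL would then return the status code and page body, provided the site accepts the header.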

--
(for email use this address please - you can figure it out)

Mark Sapiro mark at msapiro net
San Francisco Bay Area, California

Any clod can have the facts; having opinions is an art.
- C. McCabe, The Fearless Spectator
Oct 19 '08 #3
