Bytes IT Community

How to process multiple redirects?

I want to fetch some articles from nytimes.com for my Palm. I want a
clean, simple version of each article, because my Palm has only 8 MB of RAM.

With wget, I can fetch a page such as
"http://www.nytimes.com/2005/10/26/business/26fed.html?pagewanted=print",
and while wget runs, I can see that the URL is redirected many times.

When I run the code below with Python:
import urllib2
thing = urllib2.HTTPRedirectHandler()
opener = urllib2.build_opener(thing)
url = 'http://www.nytimes.com/2005/10/26/business/26fed.html?pagewanted=print'
page = opener.open(url)


I just get an error message: "HTTP Error 302: The HTTP server returned a
redirect error that would lead to an infinite loop. The last 30x error
message was: Moved Temporarily"

Why can't I fetch the page with Python when wget can?

Thanks for your help in advance!

--
Gonnasi

Oct 26 '05 #1
2 Replies


Hi,

Your problem is that you're not preserving cookies from one request to
the next. nytimes.com redirects you to an automatic login page which
sets a cookie; this cookie is required to view the original page.
Without it, the site keeps sending you back to the login page, which is
the loop urllib2 reports. This fixes the problem:
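import urllib2
# Follows 30x redirects (build_opener also installs this handler by default).
thing = urllib2.HTTPRedirectHandler()
# Stores cookies set by responses and sends them back on later requests.
thing2 = urllib2.HTTPCookieProcessor()
opener = urllib2.build_opener(thing, thing2)
url = 'http://www.nytimes.com/2005/10/26/business/26fed.html?pagewanted=print'
page = opener.open(url)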
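
For anyone reading this on Python 3, where urllib2 was split into urllib.request and http.cookiejar, the same idea might look roughly like the sketch below. The helper names build_cookie_opener and fetch and the 30-second timeout are my own choices for illustration, not part of any library, and I have not checked how nytimes.com responds today.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return (opener, jar); the opener follows redirects and keeps cookies."""
    jar = http.cookiejar.CookieJar()
    # build_opener() already installs HTTPRedirectHandler by default, so only
    # the cookie processor has to be added explicitly.
    processor = urllib.request.HTTPCookieProcessor(jar)
    opener = urllib.request.build_opener(processor)
    return opener, jar


def fetch(url, opener, timeout=30):
    """Fetch url through the opener and return the raw response body (bytes)."""
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

You would call opener, jar = build_cookie_opener() once and then fetch(url, opener) for each article. Keeping a reference to the jar lets you loop over it afterwards to see which cookies the site set along the redirect chain.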


Hope this helps,

-- David

Oct 26 '05 #2

Tons of thanks for your help!
Now I can fetch the page successfully.
Thanks again.

--
Gonnasi

Oct 27 '05 #3
