
Problem with urllib.py

Hey,

I want to open a list of URLs automatically with Python's urllib and the
function open(URL). It is important that the program opens ONLY
normal http sites and no https sites that ask for a user/password.
Is there a way to cancel all requests to sites that show a
user/password dialogue?

Thx

--

Volker

Jul 18 '05 #1
3 Replies


On Thu, 22 Jul 2004 10:43:38 +0200, Volker M. wrote:
Hey,

I want to open a list of URLs automatically with Python's urllib and the
function open(URL). It is important that the program opens ONLY
normal http sites and no https sites that ask for a user/password.
Is there a way to cancel all requests to sites that show a
user/password dialogue?


Hi,

urllib is not interactive. If you don't send a
login and password, you get a "not authorized" response
with the corresponding HTTP error code (401).
You can check this return code in your script.

By the way, the user/password request (the browser pop-up)
is HTTP Basic Authentication. It can be used with
either http or https.

HTH,
Thomas

--
Thomas Güttler, http://www.thomas-guettler.de/
Jul 18 '05 #2

Thanks :))

--
Volker
Jul 18 '05 #3

"Volker M." <sp********@gmx.de> writes:
I want to open a list of URLs automatically with Python's urllib and the
function open(URL). It is important that the program opens ONLY
normal http sites and no https sites that ask for a user/password.
Is there a way to cancel all requests to sites that show a
user/password dialogue?


Assuming you mean you don't want to handle Basic HTTP Authentication
(and you don't care whether it's http or https), you can use
urllib2.urlopen() instead of urllib.urlopen(). You will then get a
urllib2.HTTPError with a .code of 401 when a site wants Basic
Authentication.
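
For readers on Python 3, where urllib2 was merged into urllib.request and urllib.error, the 401 check described above might look roughly like the sketch below. The function name needs_login and its opener parameter are made up for illustration and do not come from the original posts.

```python
import urllib.error
import urllib.request


def needs_login(url, opener=urllib.request.urlopen):
    """Return True if the server answers 401, False if the page opens.

    `opener` is any callable that takes a URL and returns a response
    object; it defaults to urllib.request.urlopen.
    """
    try:
        response = opener(url)
    except urllib.error.HTTPError as err:
        if err.code == 401:
            return True
        raise  # other HTTP errors (404, 500, ...) are left to the caller
    response.close()
    return False
```

A robot could call needs_login(url) for each entry in its list and skip the ones that return True. Passing the opener as a parameter is only a convenience so the function can be tried without network access.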

If you do mean https, though, again with urllib2:

import urllib2

class NullHTTPSHandler(urllib2.HTTPSHandler):
    # Returning None declines the request, so https URLs are never fetched.
    def https_open(self, request):
        return None

o = urllib2.build_opener(NullHTTPSHandler())

response = o.open(url)

In general, urllib2 splits up the job of opening URLs into handlers,
so it's more 'turn-off-and-on-able' than urllib.
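
The same handler idea carries over to Python 3, where the classes live in urllib.request. The following is a minimal sketch under that assumption, not the code from the original post: because the subclass replaces the default HTTPS handler and declines every request, urllib falls through to its unknown-scheme handler, which raises URLError before any connection is made.

```python
import urllib.error
import urllib.request


class NullHTTPSHandler(urllib.request.HTTPSHandler):
    """Replaces the default HTTPS handler and declines every request."""

    def https_open(self, request):
        # None means "not handled"; no connection is attempted.
        return None


# build_opener() drops the default HTTPSHandler because a subclass of it
# was supplied, so https URLs have no handler that accepts them.
opener = urllib.request.build_opener(NullHTTPSHandler())


def open_unless_https(url):
    """Return a response for url, or None if the opener refused the scheme."""
    try:
        return opener.open(url)
    except urllib.error.URLError as err:
        if "unknown url type" in str(err.reason):
            return None
        raise  # network failures and HTTP errors are left to the caller
```

The helper open_unless_https is a hypothetical name. Matching on the error text is a shortcut for this sketch; checking the URL scheme before calling the opener would be a sturdier way to get the same result.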

Since you're writing a robot, one other thing: the alpha version of my
ClientCookie package (urllib2-replacement with addons) contains code
for obeying robots.txt files (albeit not yet well tested, IIRC):

import ClientCookie
o = ClientCookie.build_opener(ClientCookie.HTTPRobotRulesProcessor())

response = o.open(url)

Some time soon I'll have to make a distribution of this stuff that
works properly with 2.4 (which includes changes to urllib2 from
ClientCookie)...

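
ClientCookie's robots.txt processor is not reproduced here. As a rough stand-in, Python 3's standard library has urllib.robotparser, which is a different technique: an explicit check before fetching, not an opener handler. A minimal sketch follows; the rules, the bot name and the URLs are all hypothetical.

```python
import urllib.robotparser

# Hypothetical robots.txt content. A real robot would point the parser at
# the site's file with set_url() and read() instead of a hard-coded list.
RULES = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = urllib.robotparser.RobotFileParser()
parser.parse(RULES)


def allowed(url, agent="example-bot"):
    """Return True if the parsed rules permit this agent to fetch url."""
    return parser.can_fetch(agent, url)
```

A robot would call allowed(url) before opening each URL and skip the ones that come back False.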
John
Jul 18 '05 #4
