urllib2 through basic auth'ed proxy

I see from googling around that this is a popular topic, but I haven't seen
anyone saying "ah, yes, that works", so here it goes.

How does one connect through a proxy which requires basic authorisation?
The following code, stolen from somewhere, fails with a 407:

import urllib2

proxy_handler = urllib2.ProxyHandler({"http": "http://the.proxy.address:3128"})
proxy_auth_handler = urllib2.ProxyBasicAuthHandler()
proxy_auth_handler.add_password(
    "The name of the realm sniffed from telnetting to the proxy and doing a get",
    'the.proxy.address', 'theusername', 'thepassword')
opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
urllib2.install_opener(opener)
f = urllib2.urlopen('http://www.google.com/')

I still get a 407 if I set the realm to None, or if I change the host to the
'http://the.proxy.address/' form or even the 'http://the.proxy.address:3128'
form.

The proxy is squid. Python version is 2.3.4 (I read that this version has a
problem in that it introduces an extra return after the authorisation, but
it isn't even getting to that bit). And yes, going through firefox,
everything works fine.

Can anyone explain to me why this fails or, more importantly, post code that
would work?

Thanks,
alejandro

Mar 29 '06 #1
Alejandro Dubrovsky <du*******@physics.uq.edu.au> writes:
[...]
> How does one connect through a proxy which requires basic authorisation?
> The following code, stolen from somewhere, fails with a 407:
>
> [...code involving urllib2.ProxyBasicAuthHandler()...]
>
> Can anyone explain to me why this fails or, more importantly, post code that
> would work?


OK, I finally installed squid and had a look at the urllib2 proxy
basic auth support (which I've steered clear of for years despite
doing quite a bit with urllib2). Seems quite broken. Appears to have
been broken back in December 2004, with revision 38092 (note there's a
little revision number oddness in the Python SVN repo, BTW:
http://mail.python.org/pipermail/pyt.../058269.html):

--- urllib2.py (revision 38091)
+++ urllib2.py (revision 38092)
@@ -720,7 +720,10 @@
             return self.retry_http_basic_auth(host, req, realm)
 
     def retry_http_basic_auth(self, host, req, realm):
-        user,pw = self.passwd.find_user_password(realm, host)
+        # TODO(jhylton): Remove the host argument? It depends on whether
+        # retry_http_basic_auth() is consider part of the public API.
+        # It probably is.
+        user, pw = self.passwd.find_user_password(realm, req.get_full_url())
         if pw is not None:
             raw = "%s:%s" % (user, pw)
....

That can't be right, can it? With a proxy, you're always
authenticating yourself for the whole proxy, so you want to look up
the credentials by the proxy's host, not by the URL being fetched
(RFC 2617 section 3.2.1). The ProxyBasicAuthHandler subclass
dutifully passes in the right thing for the host argument, but
AbstractBasicAuthHandler ignores it, which means that it never finds
the password -- e.g. if you're trying to connect to python.org through
myproxy.com, it'll be looking for a username/password for python.org
instead of the needed myproxy.com.
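
The failed lookup can be reproduced without any proxy at all, by asking a
plain HTTPPasswordMgr the two different questions. This is only a sketch: the
realm string and the myproxy.com:3128 address are made-up values, and the
try/except import is there just so the snippet also runs on Python 3, where
urllib2 lives in urllib.request.

```python
# Runs on Python 2 (urllib2) and Python 3 (urllib.request).
try:
    import urllib2
except ImportError:
    import urllib.request as urllib2

# Hypothetical values, for illustration only.
REALM = "proxy realm"
PROXY = "myproxy.com:3128"

mgr = urllib2.HTTPPasswordMgr()
mgr.add_password(REALM, PROXY, "john", "blah")

# Lookup keyed by the proxy itself, which is what the host argument carries.
by_proxy = mgr.find_user_password(REALM, PROXY)

# Lookup keyed by the URL being fetched, which is what req.get_full_url() gives.
by_target = mgr.find_user_password(REALM, "http://python.org/")

print(by_proxy)   # the credentials registered for the proxy
print(by_target)  # nothing registered for python.org
```

The second lookup comes back empty, which matches the symptom: the handler
has nothing to retry with, so the 407 is what the caller ends up seeing.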

Obviously nobody else uses authenticating proxies either, or at least
nobody who can be bothered to fix urllib2 :-(

A workaround is to supply a stupid HTTPPasswordMgr that always returns
the proxy credentials regardless of what the handler asks it for (only
tested with a perhaps-broken 2.5 install, since I've broken my 2.4
install):

import urllib2

class DumbProxyPasswordMgr:
    """Password manager that ignores the realm and URI it is asked about."""
    def __init__(self):
        self.user = self.passwd = None
    def add_password(self, realm, uri, user, passwd):
        # Remember a single set of credentials, whatever realm/uri is given.
        self.user = user
        self.passwd = passwd
    def find_user_password(self, realm, authuri):
        # Always hand back the proxy credentials.
        return self.user, self.passwd

proxy_auth_handler = urllib2.ProxyBasicAuthHandler(DumbProxyPasswordMgr())
proxy_handler = urllib2.ProxyHandler({"http": "http://localhost:3128"})
proxy_auth_handler.add_password(None, None, 'john', 'blah')
opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
f = opener.open('http://python.org/')
print f.read()

Yuck, yuck, yuck! I had realised the auth/proxies code in urllib2 was
buggy, but... And all those hoops to jump through.
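
To see why the dumb manager sidesteps the lookup problem, it can be exercised
on its own, with no proxy or network involved. A small sketch, repeating the
class so the snippet is self-contained; the realm and URI values passed in are
arbitrary examples:

```python
class DumbProxyPasswordMgr:
    """Same idea as the workaround class: one set of credentials for everything."""
    def __init__(self):
        self.user = self.passwd = None

    def add_password(self, realm, uri, user, passwd):
        self.user = user
        self.passwd = passwd

    def find_user_password(self, realm, authuri):
        return self.user, self.passwd


mgr = DumbProxyPasswordMgr()
mgr.add_password(None, None, "john", "blah")

# Whatever realm or URI the handler asks about, the answer is the same.
answers = [
    mgr.find_user_password("some realm", "http://python.org/"),
    mgr.find_user_password(None, "localhost:3128"),
]
print(answers)
```

Because the answer never depends on the URI, it makes no difference whether
the handler asks about the proxy or about the page being fetched.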

Also, if you're using 2.5 SVN HEAD, it seems revision 42133 broke
ProxyHandler in an attempt to fix the URL host:port syntax!

I'll try to get some fixes in tomorrow so that 2.5 isn't broken (or at
least flag the issues to let somebody else fix them), but no promises
as usual...
John

Mar 30 '06 #2
Alejandro Dubrovsky <du*******@physics.uq.edu.au> writes:
[...]
> The proxy is squid. Python version is 2.3.4 (I read that this version has a
> problem in that it introduces an extra return after the authorisation, but
> it isn't even getting to that bit). And yes, going through firefox,
> everything works fine.

[...]

FWIW, at a glance, Python 2.3.4 has neither of the bugs I mentioned,
but the code I posted seems to work with 2.3.4. I'm not particularly
interested in what's wrong with 2.3.4's version or your usage of it
(probably both), since bugfix releases for 2.3 are no longer
happening, I believe.

I think the Examples section of the docs on this is wrong too, though
that's a bit of a moot point when the code is as broken as it seems...
John

Mar 31 '06 #3
John J. Lee wrote:
> FWIW, at a glance, Python 2.3.4 has neither of the bugs I mentioned,
> but the code I posted seems to work with 2.3.4. I'm not particularly
> interested in what's wrong with 2.3.4's version or your usage of it
> (probably both), since bugfix releases for 2.3 are no longer
> happening, I believe.

"ah, yes, that works" on 2.3.4. Excellent. (I don't see what's so ugly
about that code, but I'm mostly accustomed to my own)
Thanks,
alejandro
Mar 31 '06 #4
Alejandro Dubrovsky <du*******@physics.uq.edu.au> writes:
> John J. Lee wrote:
>> FWIW, at a glance, Python 2.3.4 has neither of the bugs I mentioned,
>> but the code I posted seems to work with 2.3.4. I'm not particularly
>> interested in what's wrong with 2.3.4's version or your usage of it
>> (probably both), since bugfix releases for 2.3 are no longer
>> happening, I believe.
>
> "ah, yes, that works" on 2.3.4. Excellent. (I don't see what's so ugly
> about that code, but I'm mostly accustomed to my own)

Supplying a password surely shouldn't be that complicated...
John

Mar 31 '06 #5
jj*@pobox.com (John J. Lee) writes:
> Alejandro Dubrovsky <du*******@physics.uq.edu.au> writes:
>> [...Alejandro complains about non-working HTTP proxy auth in urllib2...]
>
> [...John notes urllib2 bug...]
> A workaround is to supply a stupid HTTPPasswordMgr that always returns
> the proxy credentials regardless of what the handler asks it for (only
> tested with a perhaps-broken 2.5 install, since I've broken my 2.4
> install):
>
> [...snip ugly code]
> Yuck, yuck, yuck! I had realised the auth/proxies code in urllib2 was
> buggy, but... And all those hoops to jump through.
>
> Also, if you're using 2.5 SVN HEAD, it seems revision 42133 broke
> ProxyHandler in an attempt to fix the URL host:port syntax!

[...]

In fact the following also works with Python 2.3.4:

import urllib2
proxy_handler = urllib2.ProxyHandler({"http": "http://john:blah@localhost:3128"})
print urllib2.build_opener(proxy_handler).open('http://python.org/').read()

...but only just barely skirts around the bugs!-) :-(

(at least, the current bugs: I've no reason to work out what things
were like back in 2.3.4, but the above certainly works with that
version)
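
For reference, credentials embedded in the proxy URL like this have to reach
the proxy as a Basic Proxy-Authorization header, which per RFC 2617 is just
base64 of user:password. A minimal sketch of that encoding done by hand
rather than through urllib2, using the john/blah example credentials; the
helper name basic_credentials is made up for illustration and is not part of
urllib2:

```python
import base64


def basic_credentials(user, password):
    """Return a Basic (Proxy-)Authorization header value (hypothetical helper)."""
    raw = "%s:%s" % (user, password)
    # b64encode works on bytes; the encode/decode pair keeps this valid on
    # both Python 2 and Python 3.
    token = base64.b64encode(raw.encode("utf-8")).decode("ascii")
    return "Basic " + token


header_value = basic_credentials("john", "blah")
print("Proxy-Authorization: " + header_value)
# prints: Proxy-Authorization: Basic am9objpibGFo
```

Comparing this value against what a packet capture or the squid log shows is
one way to check whether the credentials are being sent at all.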
John

Apr 1 '06 #6
John J. Lee wrote:
> jj*@pobox.com (John J. Lee) writes:
>> Alejandro Dubrovsky <du*******@physics.uq.edu.au> writes:
>>> [...Alejandro complains about non-working HTTP proxy auth in urllib2...]
>>
>> [...John notes urllib2 bug...]
>> A workaround is to supply a stupid HTTPPasswordMgr that always returns
>> the proxy credentials regardless of what the handler asks it for (only
>> tested with a perhaps-broken 2.5 install, since I've broken my 2.4
>> install):
>>
>> [...snip ugly code]
>> Yuck, yuck, yuck! I had realised the auth/proxies code in urllib2 was
>> buggy, but... And all those hoops to jump through.
>>
>> Also, if you're using 2.5 SVN HEAD, it seems revision 42133 broke
>> ProxyHandler in an attempt to fix the URL host:port syntax!
>
> [...]
>
> In fact the following also works with Python 2.3.4:
>
> import urllib2
> proxy_handler = urllib2.ProxyHandler({"http": "http://john:blah@localhost:3128"})
> print urllib2.build_opener(proxy_handler).open('http://python.org/').read()

It does too. Thanks again. (I think this version is uglier, but easier to
insert into third party code)
Apr 3 '06 #7
