HTTP - basic authentication example.

#!/usr/bin/python -u
# 15-09-04
# v1.0.0

# auth_example.py
# A simple script manually demonstrating basic authentication.

# Copyright Michael Foord
# Free to use, modify and relicense.
# No warranty express or implied for the accuracy, fitness to purpose
# or otherwise for this code....
# Use at your own risk !!!

# E-mail or michael AT foord DOT me DOT uk
# Maintained at http://www.voidspace.org.uk/atlantib...thonutils.html

"""
There is a system for requiring a username/password before a client
can visit a webpage. This is called authentication and is implemented
by the server - it actually allows for a whole set of pages (called a
realm) to require authentication. The schemes are defined by the HTTP
spec rather than by python, and so whilst python supports
authentication it doesn't document it very well. The HTTP
documentation is in the form of RFCs
(http://www.faqs.org/rfcs/rfc2617.html for basic and digest
authentication) which are technical documents and so not the most
readable !!

When I searched the web for details on authentication with python I
found lots of people asking questions, but a lack of clear answers.
This document and example code shows how to manually do basic
authentication with python. It is an example for performing a simple
operation rather than a technical document.

I am doing it manually rather than using an auth handler because my
script is a CGI which runs once for each page access. I have to store
the username/passwords between each access. A 'manual' explanation
also shows more clearly what is happening.

I've seen references to three authentication schemes, BASIC, NTLM and
DIGEST. It's possible there are more - but BASIC authentication is
overwhelmingly the most common. This tutorial/example only covers
BASIC authentication although some of the details may be applicable to
the other schemes.

--

In all these examples we will be using the python standard library
urllib2 to fetch web pages.

A client is any program that makes requests over the internet. It
could be a browser - or it could be a python program. When a client
requests a web page it sends a request to the server. That request
consists of headers with certain information about the request. Here
we are calling these headers 'http request headers'. If the request
fails to reach a server (the server name doesn't exist or there is no
internet connection for example) then the request will just fail. If
the request is made by python then an exception will be raised. This
exception will have a 'reason' attribute that is a tuple describing
the error.

The example below shows us creating a urllib2 request object, adding a
fake 'User-Agent' request header and making a request. The resulting
error shows what happens if you try to fetch a webpage without an
internet connection. The User-Agent request header tells the server
what program is asking for the web page - some sites (e.g. google)
won't allow requests from anything other than a browser... so we might
have to pretend to be a browser. (Which is generally considered bad
client behaviour - so I might get my knuckles rapped for including it
here).

import urllib2
req = urllib2.Request('http://www.google.co.uk')
req.add_header('User-Agent', 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)')
try:
    handle = urllib2.urlopen(req)
except IOError, e:
    if hasattr(e, 'reason'):
        print 'Reason : '
        print e.reason
else:
    print handle.read()    # if we had a connection this would print the page

Reason :
(7, 'getaddrinfo failed')

The actual exception is an URLError which is a subclass of IOError -
the one tested for above in the try-except block. If you did a dir(e)
on the above example then you would see all the attributes of the
exception object.

If however the request reaches a server then the server will send a
response. Whether or not the request succeeds the response will
contain headers from the server (or CGI script!!). These we call here
'http response headers'. If there is a problem then the response will
include an error code - you are familiar with some of them: 404 : Page
not found, 500 : Internal Server Error etc. In this case an exception
will still be raised by urllib2, but instead of a 'reason' attribute
it will have a code attribute. The code attribute is an integer that
corresponds to the http error code. (see
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html for a *full*
list of codes).
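
For anyone reading this on Python 3, where urllib2 was split into urllib.request and urllib.error, the 'reason' versus 'code' distinction might be sketched roughly as below. The describe_failure helper and the example URL are made up for illustration, and the exceptions are built by hand so nothing touches the network.

```python
import urllib.error

def describe_failure(exc):
    # Hypothetical helper: report which kind of failure an exception represents.
    # An HTTPError (the server answered with an error status) carries 'code';
    # a plain URLError (the server was never reached) has 'reason' but no 'code'.
    if hasattr(exc, 'code'):
        return 'server replied with HTTP %d' % exc.code
    if hasattr(exc, 'reason'):
        return 'request never reached a server: %s' % (exc.reason,)
    return 'unexpected error: %r' % (exc,)

# Exceptions constructed by hand purely for demonstration.
unreachable = urllib.error.URLError('getaddrinfo failed')
missing = urllib.error.HTTPError('http://www.example.com/nopage.html',
                                 404, 'Not Found', {}, None)

print(describe_failure(unreachable))
print(describe_failure(missing))
```

The 'code' check comes first because in Python 3 an HTTPError also exposes a 'reason' attribute.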

If a page requires authentication then the error code is 401. Included
in the response headers is a 'WWW-authenticate' header that tells you
what authentication scheme the server is using for this page *and*
also something called a realm. As you know it is rarely just a single
page that is protected by authentication but a whole 'realm' of a
website. The name of the realm is included in this header line. If the
client *already knows* the username/password for this realm then it
can encode them into the request headers and try again. If the
username/password combination are correct, then the request will
succeed as normal. If the client doesn't know the username/password it
should ask the user. This means that if you enter a protected 'realm'
the client effectively has to request each page twice. The first time
it will get an error code and be told what realm it is attempting to
access - the client can then get the right username/password for that
realm (on that server) and repeat the request.
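
The 'request each page twice' behaviour can be sketched without a network at all. In this Python 3 sketch fake_server is a stand-in I have invented for a real server, and keying the known passwords on the whole challenge string is a simplification for the sake of a short example.

```python
import base64

def fake_server(headers):
    # Stand-in for a real server (an assumption for this sketch): it protects
    # one realm and accepts exactly one username/password pair.
    expected = 'Basic ' + base64.b64encode(b'johnny:XXXXXX').decode('ascii')
    if headers.get('Authorization') == expected:
        return 200, {}, 'the protected page'
    return 401, {'WWW-Authenticate': 'Basic realm="cPanel"'}, ''

def fetch(known_passwords):
    # First request: no credentials, so a protected page answers 401.
    status, resp_headers, body = fake_server({})
    attempts = 1
    if status == 401:
        challenge = resp_headers['WWW-Authenticate']
        creds = known_passwords.get(challenge)
        if creds is None:
            return status, body, attempts   # a real client would now ask the user
        token = base64.b64encode(('%s:%s' % creds).encode('utf-8')).decode('ascii')
        # Second request: same page, now with the Authorization header.
        status, resp_headers, body = fake_server({'Authorization': 'Basic ' + token})
        attempts += 1
    return status, body, attempts

print(fetch({'Basic realm="cPanel"': ('johnny', 'XXXXXX')}))
print(fetch({}))
```

With a known password the page arrives on the second attempt; without one the client stops after the first 401.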

Suppose we attempt to fetch a webpage protected by basic
authentication.

import urllib2
theurl = 'http://www.someserver.com/somepath/someprotectedpage.html'
try:
    handle = urllib2.urlopen(theurl)
except IOError, e:
    if hasattr(e, 'code'):
        if e.code != 401:
            print 'We got another error'
            print e.code
        else:
            print e.headers
            print e.headers['www-authenticate']
            # print e.headers.get('www-authenticate', '') might be a safer way of doing this

Note the following things. We accessed the page directly from the url
instead of creating a request object. If the exception has a 'code'
attribute it also has an attribute called 'headers'. This is a
dictionary like object with all the headers in - but you can also
print it to display all the headers. See the last line that displays
the 'www-authenticate' header line which ought to be present whenever
you get a 401 error.

WWW-Authenticate: Basic realm="cPanel"
Connection: close
Set-Cookie: cprelogin=no; path=/
Server: cpsrvd/9.4.2

Content-type: text/html

Basic realm="cPanel"

You can see the authentication scheme and the 'realm' part of the
'www-authenticate' header. Assuming you know the username and password
you can then navigate around that website - whenever you get a 401
error with *the same realm* you can just encode the username/password
into your request headers and your request should succeed.
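
Pulling the scheme and realm out of a header value like the one above could look something like this in Python 3. The pattern is my own assumption about what such headers usually look like (a scheme word followed by a quoted realm) and does not cover everything the RFC allows.

```python
import re

# Assumed pattern for this sketch: a scheme word, then realm="..." in quotes.
# Unlike \w+, the quoted part may contain spaces.
CHALLENGE = re.compile(r'\s*(\w+)\s+realm=(["\'])(.*?)\2', re.IGNORECASE)

def parse_challenge(value):
    # Return (scheme, realm) from a WWW-Authenticate value, or None.
    match = CHALLENGE.match(value)
    if match is None:
        return None
    return match.group(1).lower(), match.group(3)

print(parse_challenge('Basic realm="cPanel"'))
print(parse_challenge('Basic realm="Members Area"'))
print(parse_challenge('nonsense'))
```

Returning None for anything unrecognised lets the caller decide whether a badly formed header is fatal.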

Let's assume you need to access two pages which are likely to be in
the same realm. Let's also assume that you know the username and
password. You can save the realm information from when you make the
first access, and whenever you get a 401 with the same realm (assuming
the response is from the same server) you know you can use the same
username/password. So the only detail left is knowing how to encode
the username/password into the request header. This is done by
encoding 'username:password' as a base 64 string. It doesn't look like
clear text - but it is only the vaguest of 'encryption'. This means
basic authentication is just that - basic. Anyone sniffing your
traffic who sees an authentication request header will be able to
extract your username and password from it. Many websites (like yahoo
or ebay) may use javascript hashing/encryption to authenticate a
login, which is much harder to detect and mimic. You may need to use a
proxy client server and see what information your browser is actually
sending to the website (See http://groups.google.co.uk/groups?hl...gle.com&rnum=2
for a suggestion of several proxy servers that can do this).

There is a very simple recipe on the Activestate Python Cookbook
(It's actually in the comments of this page
http://aspn.activestate.com/ASPN/Coo.../Recipe/267197 )
showing how to encode a username/password into a request header. It
looks like this :

import base64
base64string = base64.encodestring('%s:%s' % (data['name'], data['pass']))[:-1]
req.add_header("Authorization", "Basic %s" % base64string)

Where req is our request object like in the first example.
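
For comparison, here is a rough Python 3 version of the same encoding step. base64.encodestring no longer exists there (it was removed in Python 3.9), so this sketch uses base64.b64encode, which adds no trailing newline and so needs no [:-1]. The basic_auth_value name is mine, not part of any library. The last two lines show why this is not encryption: the header decodes straight back to the username and password.

```python
import base64

def basic_auth_value(username, password):
    # Build the value of the Authorization header for basic authentication.
    raw = ('%s:%s' % (username, password)).encode('utf-8')
    return 'Basic ' + base64.b64encode(raw).decode('ascii')

value = basic_auth_value('johnny', 'XXXXXX')
print(value)

# Anyone who sees the header can reverse it without knowing any secret.
recovered = base64.b64decode(value.split(' ', 1)[1]).decode('utf-8')
print(recovered)
```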

Let's wrap all this up with an example that shows accessing a page,
doing the authentication and saving the realm. I use a regular
expression to pull the scheme and realm out of the authentication
response header. I use urlparse to get the server part of a url. If we
store the username/password (or the whole request header line) then we
can re-use that information automatically if we come across another
page in that realm.

Some websites may also use cookies with authentication. Luckily there
is a library that will allow you to have automatic cookie management
without thinking about it. This is ClientCookie
(http://wwwsearch.sourceforge.net/ClientCookie/ ). In python 2.4 it
becomes part of the python standard library as cookielib. See my
cookbook example of how to use it
(http://aspn.activestate.com/ASPN/Coo.../Recipe/302930 ).
I've done a bigger example that displays a lot of http information
including cookies, headers, environment variables (the CGI
environment) and all this authentication stuff. You can find it at
(http://aspn.activestate.com/ASPN/Coo.../Recipe/298336 ).
"""

import urllib2, sys, re, base64
from urlparse import urlparse

theurl = 'http://www.someserver.com/somepath/somepage.html'
# if you want to run this example you'll need to supply a protected page
# with your username and password
username = 'johnny'
password = 'XXXXXX'            # a very bad password

req = urllib2.Request(theurl)
try:
    handle = urllib2.urlopen(req)
except IOError, e:             # here we are assuming we fail
    pass
else:                          # if we don't fail then the page isn't protected
    print "This page isn't protected by authentication."
    sys.exit(1)

if not hasattr(e, 'code') or e.code != 401:
    # we got an error - but not a 401 error
    print "This page isn't protected by authentication."
    print 'But we failed for another reason.'
    sys.exit(1)

# this gets the www-authenticate line from the headers - which has the
# authentication scheme and realm in it
authline = e.headers.get('www-authenticate', '')
if not authline:
    print 'A 401 error without an authentication response header - very weird.'
    sys.exit(1)

# this regular expression is used to extract scheme and realm
authobj = re.compile(r'''(?:\s*www-authenticate\s*:)?\s*(\w*)\s+realm=['"](\w+)['"]''',
                     re.IGNORECASE)
matchobj = authobj.match(authline)
if not matchobj:
    # if the authline isn't matched by the regular expression then
    # something is wrong
    print 'The authentication line is badly formed.'
    sys.exit(1)
scheme = matchobj.group(1)
realm = matchobj.group(2)
if scheme.lower() != 'basic':
    print 'This example only works with BASIC authentication.'
    sys.exit(1)

base64string = base64.encodestring('%s:%s' % (username, password))[:-1]
authheader = "Basic %s" % base64string
req.add_header("Authorization", authheader)
try:
    handle = urllib2.urlopen(req)
except IOError, e:
    # here we shouldn't fail if the username/password is right
    print "It looks like the username or password is wrong."
    sys.exit(1)
thefirstpage = handle.read()

# server names are case insensitive, so we will convert to lower case
server = urlparse(theurl)[1].lower()
# remove the :port information if present, we're working on the principle
# that realm names per server are likely to be unique...
test = server.find(':')
if test != -1:
    server = server[:test]

# now if we get another 401 we can test for an entry in passdict before
# having to ask the user for a username/password
passdict = {(server, realm): authheader}

print 'Done successfully - information now stored in passdict.'
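
Once passdict exists, using it on a later 401 comes down to a dictionary lookup on (server, realm). A minimal Python 3 sketch of that lookup follows; the helper names and the placeholder header value are illustrative assumptions, not part of the script above.

```python
from urllib.parse import urlparse

def server_key(url):
    # Lower-cased host name without any :port suffix, as in the script above.
    return (urlparse(url).hostname or '').lower()

def stored_header(passdict, url, realm):
    # Look up a saved Authorization header for this (server, realm), if any.
    return passdict.get((server_key(url), realm))

# Illustrative data only: the header value is a placeholder, not real credentials.
passdict = {('www.someserver.com', 'cPanel'): 'Basic PLACEHOLDER'}

print(stored_header(passdict, 'http://WWW.SomeServer.com:8080/other/page.html', 'cPanel'))
print(stored_header(passdict, 'http://www.someserver.com/other/page.html', 'Admin'))
```

A None result means the user has to be asked, exactly as on the very first access.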

"""
ISSUES
CHANGELOG
15-09-04 Version 1.0.0
I think it's ok - a few references in the documentation to find.
"""
Jul 18 '05 #1
On Thu, 16 Sep 2004 10:27:22 +0100, Michael Foord <fu******@gmail .com> wrote:
Cool, that is helpful.
The difficulties I would have with that approach are two fold - first
I use ClientCookie and have to install that as the handler. I may be
able to use an auth handler *as well* (I *think* you can chain them
?).

Yes, you can chain them together, I believe, as long as they "handle"
different things. My version is a quick hack that specifically gets
me around the firewall here at work. I think the *correct* way to do
this Basic Authentication is (from urllib2.py):

# set up authentication info
authinfo = urllib2.HTTPBasicAuthHandler()
authinfo.add_password('realm', 'host', 'username', 'password')

proxy_support = urllib2.ProxyHandler({"http" : "http://ahad-haam:3128"})

# build a new opener that adds authentication and caching FTP handlers
opener = urllib2.build_opener(proxy_support, authinfo, urllib2.CacheFTPHandler)

# install it
urllib2.install_opener(opener)

f = urllib2.urlopen('http://www.python.org/')

I'm not sure what a ClientCookie is (didn't see it in my docs.)
Assuming that it is just a wrapper around the cookie, then you
probably only need the HTTP headers. You can grab the HTTP headers
from the object returned by urllib2.urlopen().

filelike_obj = urllib2.urlopen('http://www.python.org/')
headers = filelike_obj.info()
server_type = headers.getheader("SERVER")

Maybe you can feed the Cookie header to the ClientCookie ctor?

HTH,
jw
The second is that I think I need to take realm into account... I may
be handling multiple password/username combinations.

Anyway - I still find what you've sent useful - thanks.

Fuzzy


On Wed, 15 Sep 2004 11:21:21 -0500, Jaime Wyant <pr***********@ gmail.com> wrote:
FWIW, this is how I handle Basic Authentication:

import urllib2
import sys

class AuthenticateAllURIs:
    """This class authenticates all Basic Authentication using uname / pword."""
    def __init__(self, uname, pword):
        self.uname = uname
        self.pword = pword

    def find_user_password(self, realm, host):
        # Note that this class doesn't take `realm' into consideration.
        return self.uname, self.pword

    def add_password(self, realm, uri, user, password):
        pass

auth = urllib2.ProxyBasicAuthHandler(AuthenticateAllURIs('umjaw', 'fuse3'))
opener = urllib2.build_opener(auth)
urllib2.install_opener(opener)
wp = urllib2.urlopen("http://www.slashdot.org")
print wp.read()

HTH,
jw

On 15 Sep 2004 08:37:12 -0700, Michael Foord <fu******@gmail .com> wrote:
[ snip! ]

--
http://www.Voidspace.org.uk
The Place where headspace meets cyberspace. Online resource site -
covering science, technology, computing, cyberpunk, psychology,
spirituality, fiction and more.

---
http://www.Voidspace.org.uk/atlantib...thonutils.html
Python utilities, modules and apps.
Including Nanagram, Dirwatcher and more.
---
http://www.fuchsiashockz.co.uk
http://groups.yahoo.com/group/void-shockz
---

Everyone has talent. What is rare is the courage to follow talent
to the dark place where it leads. -Erica Jong
Ambition is a poor excuse for not having sense enough to be lazy.
-Milan Kundera

Jul 18 '05 #2

ClientCookie is an external library - it has only become part of the
standard library as cookielib in python 2.4 (which is why you won't
have found it in your docs). This means I have a ClientCookie handler
handling all my http requests.... I wonder if I can use an AuthHandler
as well ? There will be situations where I am likely to want to add an
Authorization header *and* handle cookies - ClientCookie manages all the
cookies in a way that I couldn't do manually.
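
On the chaining question, here is a small sketch of what combining the two might look like with the Python 3 standard library, where cookielib became http.cookiejar and urllib2 became urllib.request. The URL and credentials are placeholders, and nothing is actually fetched; the sketch only builds the opener and lists the handlers it ended up with.

```python
import http.cookiejar
import urllib.request

# Cookie handling: cookielib became http.cookiejar in Python 3.
cookies = http.cookiejar.CookieJar()
cookie_handler = urllib.request.HTTPCookieProcessor(cookies)

# Basic authentication: the URL and credentials are placeholders.
passwords = urllib.request.HTTPPasswordMgrWithDefaultRealm()
passwords.add_password(None, 'http://www.someserver.com/', 'johnny', 'XXXXXX')
auth_handler = urllib.request.HTTPBasicAuthHandler(passwords)

# build_opener accepts several handlers; each one deals with its own concern.
opener = urllib.request.build_opener(cookie_handler, auth_handler)

handler_names = sorted(type(h).__name__ for h in opener.handlers)
print(handler_names)
```

Both handlers show up in the list alongside the defaults that build_opener adds itself.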

Hmm... anyway - for situations where I may have saved passwords from
multiple realms, my code is only doing what an AuthHandler would do
anyway - without first having to feed it all the username/password
combinations.... I just store a dictionary of them.

In actual fact I think the *proper* way of doing it is *slightly* more
complicated - I think you ought to create an instance of an
HTTPPasswordMgr...

The example you gave works I think - HTTPBasicAuthHandler does have an
add_password method, but not the find_user_password that the
HTTPPasswordMgr has... so I can't easily check if it works properly.
In the urllib2 docs it says that passing a password manager in is
optional - but *nowhere* does it document that it has an add_password
method. It may be deducible from the fact that passing in a password
manager is optional - but surely explicit is better than implicit
(especially where documentation is concerned).
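
One way round the 'can't easily check' problem is to create the password manager yourself and pass it in, so you keep a reference you can query. A Python 3 sketch, assuming placeholder URLs and credentials, with no request being made:

```python
import urllib.request

manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
# A realm of None means: use these credentials for any realm under this URL.
manager.add_password(None, 'http://www.someserver.com/protected/', 'johnny', 'XXXXXX')

# Passing the manager in explicitly keeps a reference we can query later.
handler = urllib.request.HTTPBasicAuthHandler(manager)

inside = manager.find_user_password('cPanel', 'http://www.someserver.com/protected/page.html')
outside = manager.find_user_password('cPanel', 'http://www.otherserver.com/page.html')
print(inside)
print(outside)
```

The first lookup falls back to the default-realm entry; the second finds nothing because the server differs.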

Thanks

Fuzzy
Jul 18 '05 #3
"""This example is now updated to show an example at the end that uses
HTTPBasicAuthHandler *and* HTTPPasswordMgrWithDefaultRealm.
This is probably the 'right' way to do it... but is more of a pain
IMHO... (using a password manager either means *already* knowing the
realm... or *never* knowing the realm..)
"""

#!/usr/bin/python -u
# 16-09-04
# v1.0.0

# auth_example.py
# A simple CGI script manually demonstrating basic authentication.

# Copyright Michael Foord
# Free to use, modify and relicense.
# No warranty express or implied for the accuracy, fitness to purpose
# or otherwise for this code....
# Use at your own risk !!!

# E-mail or michael AT foord DOT me DOT uk
# Maintained at http://www.voidspace.org.uk/atlantib...thonutils.html

"""
There is a system for requiring a username/password before a client
can visit a webpage. This is called authentication and is implemented
by the server - it actually allows for a whole set of pages (called a
realm) to be protected by authentication. The schemes are defined by
the HTTP spec rather than by python, and so whilst python supports
authentication it doesn't document it very well. The HTTP
documentation is in the form of RFCs
(http://www.faqs.org/rfcs/rfc2617.html for basic and digest
authentication) which are technical documents and so not the most
readable !!

When I searched the web for details on authentication with python I
found lots of people asking questions, but a lack of clear answers.
This document and example code shows how to manually do basic
authentication with python. It is an example for performing a simple
operation rather than a technical document.

I am doing it manually rather than using an auth handler because my
script is a CGI which runs once for each page access. I have to store
the username/passwords between each access. A 'manual' explanation
also shows more clearly what is happening.

I've seen references to three authentication schemes, BASIC, NTLM and
DIGEST. It's possible there are more - but BASIC authentication is
overwhelmingly the most common. This tutorial/example only covers
BASIC authentication although some of the details may be applicable to
the other schemes.

--

In all these examples we will be using the python standard library
urllib2 to fetch web pages.

A client is any program that makes requests over the internet. It
could be a browser - or it could be a python program. When a client
requests a web page it sends a request to the server. That request
consists of headers with certain information about the request. Here
we are calling these headers 'http request headers'. If the request
fails to reach a server (the server name doesn't exist or there is no
internet connection for example) then the request will just fail. If
the request is made by python then an exception will be raised. This
exception will have a 'reason' attribute that is a tuple describing
the error.

The next example shows us creating a urllib2 request object, adding a
fake 'User-Agent' request header and making a request. The resulting
error shows what happens if you try to fetch a webpage without an
internet connection ! The User-Agent request header tells the server
what program is asking for the web page - some sites (e.g. google)
won't allow requests from anything other than a browser... so we might
have to pretend to be a browser. (Which is generally considered bad
client behaviour - so I might get my knuckles rapped for including it
here).

import urllib2
req = urllib2.Request('http://www.google.co.uk')
req.add_header('User-Agent', 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)')
try:
    handle = urllib2.urlopen(req)
except IOError, e:
    if hasattr(e, 'reason'):
        print 'Reason : '
        print e.reason
else:
    print handle.read()    # if we had a connection this would print the page

Reason :
(7, 'getaddrinfo failed')

The actual exception is an URLError which is a subclass of IOError -
the one tested for above in the try-except block. If you did a dir(e)
in the above example then you would see all the attributes of the
exception object.

If however the request reaches a server then the server will send a
response back. Whether or not the request succeeds the response will
still contain headers from the server (or CGI script!!). These we call
here 'http response headers'. If there is a problem then this response
will include an error code that describes the problem. You will
already be familiar with some of these codes - 404 : Page not found,
500 : Internal Server Error etc. If this happens an exception will
still be raised by urllib2, but instead of a 'reason' attribute it
will have a 'code' attribute. The code attribute is an integer that
corresponds to the http error code. (see
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html for a *full*
list of codes).

If a page requires authentication then the error code is 401. Included
in the response headers is a 'WWW-authenticate' header that tells you
what authentication scheme the server is using for this page *and*
also something called a realm. It is rarely just a single page that is
protected by authentication but a section - a 'realm' of a website.
The name of the realm is included in this header line. If the client
*already knows* the username/password for this realm then it can
encode them into the request headers and try again. If the
username/password combination are correct, then the request will
succeed as normal. If the client doesn't know the username/password it
should ask the user. This means that if you enter a protected 'realm'
the client effectively has to request each page twice. The first time
it will get an error code and be told what realm it is attempting to
access - the client can then get the right username/password for that
realm (on that server) and repeat the request.

HTTP is a 'stateless' protocol. This means that a server using basic
authentication won't 'remember' you are logged in and will need to be
sent the right header for every protected page you attempt to access.
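
In practice 'stateless' means every request object has to carry the header itself. A small Python 3 sketch of that idea, in which the authorised_request helper and the page URLs are invented for illustration and no request is actually sent:

```python
import base64
import urllib.request

def authorised_request(url, username, password):
    # Hypothetical helper: build a Request that already carries the header,
    # because the server will not remember an earlier successful request.
    token = base64.b64encode(('%s:%s' % (username, password)).encode('utf-8'))
    req = urllib.request.Request(url)
    req.add_header('Authorization', 'Basic ' + token.decode('ascii'))
    return req

pages = ['http://www.someserver.com/somepath/one.html',
         'http://www.someserver.com/somepath/two.html']
requests = [authorised_request(page, 'johnny', 'XXXXXX') for page in pages]

for req in requests:
    print(req.full_url, req.get_header('Authorization'))
```

Each Request gets an identical Authorization header; nothing is shared between them.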

Suppose we attempt to fetch a webpage protected by basic
authentication.

import urllib2
theurl = 'http://www.someserver.com/somepath/someprotectedpage.html'
try:
    handle = urllib2.urlopen(theurl)
except IOError, e:
    if hasattr(e, 'code'):
        if e.code != 401:
            print 'We got another error'
            print e.code
        else:
            print e.headers
            print e.headers['www-authenticate']
            # print e.headers.get('www-authenticate', '') might be a safer way of doing this

Note the following things. We accessed the page directly from the url
instead of creating a request object. If the exception has a 'code'
attribute it also has an attribute called 'headers'. This is a
dictionary like object with all the headers in - but you can also
print it to display all the headers. See the last line that displays
the 'www-authenticate' header line which ought to be present whenever
you get a 401 error.

Output from above example :

WWW-Authenticate: Basic realm="cPanel"
Connection: close
Set-Cookie: cprelogin=no; path=/
Server: cpsrvd/9.4.2

Content-type: text/html

Basic realm="cPanel"

You can see the authentication scheme and the 'realm' part of the
'www-authenticate' header. Assuming you know the username and password
you can then navigate around that website - whenever you get a 401
error with *the same realm* you can just encode the username/password
into your request headers and your request should succeed.

Let's assume you need to access pages which are all in the same realm.
Assuming you have got the username and password from the user, you
save the name of the realm from the first access. Then whenever you
get a 401 error in the same realm (from the same server !) you know
the username/password to use. So the only detail left, is knowing how
to encode the username/password into the request header. This is done
by encoding it as a base 64 string. It doesn't actually look like
clear text - but it is only the vaguest of 'encryption'. This
means basic authentication is just that - basic. Anyone sniffing your
traffic who sees an authentication request header will be able to
extract your username and password from it. Many websites like yahoo
or ebay, use javascript hashing/encryption and other tricks to
authenticate a login. This is much harder to detect and mimic from
python ! You may need to use a proxy client server and see what
information your browser is actually sending to the website (See
http://groups.google.co.uk/groups?hl...gle.com&rnum=2
for suggestions of several proxy servers that can do this).

There is a very simple recipe on the Activestate Python Cookbook
(It's actually in the comments of this page
http://aspn.activestate.com/ASPN/Coo.../Recipe/267197 )
showing how to encode a username/password into a request header. It
looks like this :

import base64
base64string = base64.encodestring('%s:%s' % (data['name'], data['pass']))[:-1]
req.add_header("Authorization", "Basic %s" % base64string)

Where req is our request object like in the first example.

Let's wrap all this up with an example that shows accessing a page,
doing the authentication and saving the realm. I use a regular
expression to pull the scheme and realm out of the authentication
response header. I use urlparse to get the server part of the url. If
we store the username/password (or the whole request header line) then
we can re-use that information automatically if we come across another
page in that realm. When the code has run the contents of the page
we've fetched is saved as a string in the variable 'thepage'. If you
are writing an http client of any sort that has to deal with basic
authentication then this example should have everything you need to
know - but see the comment below about cookies.

Some websites may also use cookies with authentication. Luckily there
is a library that will allow you to have automatic cookie management
without thinking about it. This is ClientCookie
(http://wwwsearch.sourceforge.net/ClientCookie/ ). In python 2.4 it
becomes part of the python standard library as cookielib. See my
cookbook example of how to use it
(http://aspn.activestate.com/ASPN/Coo.../Recipe/302930 ).
I've done a bigger example that displays a lot of http information
including cookies, headers, environment variables (the CGI
environment) and all this authentication stuff. You can find it at
http://www.voidspace.org.uk/atlantib...book.html#http - it's a
CGI and there's an online version to try. It shows lots of the
information about http state, the CGI environment etc..

--

In actual fact the 'proper' way to do BASIC authentication with Python
is to install an authentication handler as an 'opener' (along with a
password manager) in urllib2. It doesn't show as clearly what is
happening and is less suitable for my needs (within a CGI), where a
pickled dictionary is more useful. See at the bottom of this example
for an alternative example using an auth handler - this was sent to me
by Jaime Wyant (and amended by me)....
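
The pickled dictionary idea could be sketched like this in Python 3. The file location and the placeholder header value are assumptions for the example; a real CGI would need a private path, since a stored Authorization header is as good as the password itself.

```python
import os
import pickle
import tempfile

# Placeholder entry: (server, realm) -> Authorization header value.
passdict = {('www.someserver.com', 'cPanel'): 'Basic PLACEHOLDER'}

# The file location is an assumption; a real CGI would pick a private path.
path = os.path.join(tempfile.mkdtemp(), 'passdict.pkl')

# End of one CGI run: save what we know.
with open(path, 'wb') as handle:
    pickle.dump(passdict, handle)

# Start of the next run: load it back before making any request.
with open(path, 'rb') as handle:
    restored = pickle.load(handle)

print(restored == passdict)
```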
"""

import urllib2, sys, re, base64
from urlparse import urlparse
theurl = 'http://www.someserver. com/somepath/somepage.html'
# if you want to run this example you'll need to supply a protected
page with your username and password
username = 'johnny'
password = 'XXXXXX' # a very bad password

req = urllib2.Request (theurl)
try:
handle = urllib2.urlopen (req)
except IOError, e: # here we are assuming we fail
pass
else: # If we don't fail then the page
isn't protected
print "This page isn't protected by authentication. "
sys.exit(1)

if not hasattr(e, 'code') or e.code != 401: # we got
an error - but not a 401 error
print "This page isn't protected by authentication. "
print 'But we failed for another reason.'
sys.exit(1)

# this gets the www-authenticate line from the headers - which has the
# authentication scheme and realm in it
authline = e.headers.get('www-authenticate', '')
if not authline:
    print 'A 401 error without an authentication response header - very weird.'
    sys.exit(1)

# this regular expression is used to extract scheme and realm
# (note: \w+ only matches realm names made of letters, digits and underscores)
authobj = re.compile(
    r'''(?:\s*www-authenticate\s*:)?\s*(\w*)\s+realm=['"](\w+)['"]''',
    re.IGNORECASE)
matchobj = authobj.match(authline)
if not matchobj:
    # if the authline isn't matched by the regular expression
    # then something is wrong
    print 'The authentication line is badly formed.'
    sys.exit(1)
scheme = matchobj.group(1)
realm = matchobj.group(2)
if scheme.lower() != 'basic':
    print 'This example only works with BASIC authentication.'
    sys.exit(1)

base64string = base64.encodestring('%s:%s' % (username, password))[:-1]
authheader = "Basic %s" % base64string
req.add_header("Authorization", authheader)
try:
    handle = urllib2.urlopen(req)
except IOError, e:
    # here we shouldn't fail if the username/password is right
    print "It looks like the username or password is wrong."
    sys.exit(1)
thepage = handle.read()

# server names are case insensitive, so we will convert to lower case
server = urlparse(theurl)[1].lower()
test = server.find(':')
if test != -1:
    # remove the :port information if present; we're working on the
    # principle that realm names per server are likely to be unique...
    server = server[:test]

# now if we get another 401 we can test for an entry in passdict
# before having to ask the user for a username/password
passdict = {(server, realm): authheader}

print 'Done successfully - information now stored in passdict.'
print 'The webpage is stored in thepage.'
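
For anyone on Python 3 (where urllib2 was folded into urllib.request), the three manual steps above can be sketched without touching the network: parse the challenge, build the Authorization value, and key the cache on (server, realm). This is only a rough sketch. The names parse_challenge, basic_auth_header and cache_key are made up for illustration, the example URL and realm are placeholders, and the pattern is a loosened variant of the one above, not the same expression.

```python
import base64
import re
from urllib.parse import urlparse

# Hypothetical pattern: a scheme, then realm="..." in either quote style.
# Unlike \w+, the lazy (.*?) also accepts realms that contain spaces.
CHALLENGE_RE = re.compile(r'''\s*(\w+)\s+realm=(["'])(.*?)\2''', re.IGNORECASE)


def parse_challenge(authline):
    """Return (scheme, realm) from a WWW-Authenticate value, or None."""
    match = CHALLENGE_RE.match(authline)
    if match is None:
        return None
    return match.group(1), match.group(3)


def basic_auth_header(username, password):
    """Build the value of the Authorization header for BASIC auth."""
    raw = ('%s:%s' % (username, password)).encode('utf-8')
    return 'Basic ' + base64.b64encode(raw).decode('ascii')


def cache_key(url, realm):
    """Cache key: (lower-cased host name without the port, realm)."""
    return (urlparse(url).hostname, realm)


scheme, realm = parse_challenge('Basic realm="Members Area"')
header = basic_auth_header('johnny', 'XXXXXX')
passdict = {cache_key('http://WWW.Example.com:8080/private/', realm): header}
print(scheme, realm)
print(passdict)
```

urlparse(...).hostname already lower-cases the host and drops the port, so the find(':') step is not needed in this version.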

"""
The proper way of actually doing this is to use an
HTTPBasicAuthHandler along with an HTTPPasswordMgr.
The python documentation on this is actually pretty minimal - so below
is an example showing how to do this.
The main problem with HTTPPasswordMgr is that you must *already* know
the realm - so we're going to use the HTTPPasswordMgrWithDefaultRealm
instead !!

theurl = 'http://www.someserver.com/highestlevelprotectedpath/somepage.htm'
username = 'johnny'
password = 'XXXXXX'            # a great password

# this creates a password manager
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
# because we have put None at the start it will always use this
# username/password combination
passman.add_password(None, theurl, username, password)

# create the AuthHandler
authhandler = urllib2.HTTPBasicAuthHandler(passman)

# build an 'opener' using the handler we've created
# you can use the opener directly to open URLs
# *or* you can install it as the default opener so that all calls to
# urllib2.urlopen use this opener
opener = urllib2.build_opener(authhandler)
urllib2.install_opener(opener)

# All calls to urllib2.urlopen will now use our handler
ISSUES
CHANGELOG
16-09-04 Version 1.0.0
I think it's ok.
"""
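
A rough Python 3 version of the handler approach, for comparison. It assumes the same classes under their urllib.request names, uses placeholder URLs and credentials, and checks the password manager by hand so that nothing has to hit the network. The names top_level_url, inside and outside are mine.

```python
import urllib.request

# Placeholder URL and credentials, in the spirit of the example above.
top_level_url = 'http://www.someserver.com/highestlevelprotectedpath/'
username = 'johnny'
password = 'XXXXXX'

passman = urllib.request.HTTPPasswordMgrWithDefaultRealm()
# realm=None makes this the fallback entry for any realm under top_level_url
passman.add_password(None, top_level_url, username, password)

authhandler = urllib.request.HTTPBasicAuthHandler(passman)
opener = urllib.request.build_opener(authhandler)

# The lookup the handler performs after a 401, done by hand:
inside = passman.find_user_password('Some Realm', top_level_url + 'somepage.htm')
outside = passman.find_user_password('Some Realm', 'http://www.someserver.com/other/')
print(inside)
print(outside)

# opener.open(top_level_url + 'somepage.htm') would retry with the credentials
# after a 401; urllib.request.install_opener(opener) makes urlopen() do the same.
```

The second lookup comes back empty because the stored URL acts as a prefix: only requests at or below it get the credentials.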
Jul 18 '05 #4
Good stuff. I knew if we knocked our heads together something good
would come out ;-).

jw
On Fri, 17 Sep 2004 13:20:23 +0100, Michael Foord <fu******@gmail.com> wrote:
I've added this example to the python cookbook (which basically does
what your example does I think) -

The proper way of actually doing this is to use an
HTTPBasicAuthHandler along with an HTTPPasswordMgr.
The python documentation on this is actually pretty minimal - so below
is an example showing how to do this.
The main problem with HTTPPasswordMgr is that you must *already* know
the realm - so we're going to use the HTTPPasswordMgrWithDefaultRealm
instead !!

theurl = 'http://www.someserver.com/highestlevelprotectedpath/somepage.htm'
username = 'johnny'
password = 'XXXXXX'            # a great password

# this creates a password manager
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
# because we have put None at the start it will always use this
# username/password combination
passman.add_password(None, theurl, username, password)

# create the AuthHandler
authhandler = urllib2.HTTPBasicAuthHandler(passman)

# build an 'opener' using the handler we've created
# you can use the opener directly to open URLs
# *or* you can install it as the default opener so that all calls to
# urllib2.urlopen use this opener
opener = urllib2.build_opener(authhandler)
urllib2.install_opener(opener)

# All calls to urllib2.urlopen will now use our handler


On Wed, 15 Sep 2004 11:21:21 -0500, Jaime Wyant <pr***********@gmail.com> wrote:
FWIW, this is how I handle Basic Authentication:

import urllib2
import sys

class AuthenticateAllURIs:
    """This class authenticates all Basic Authentication using uname / pword."""
    def __init__(self, uname, pword):
        self.uname = uname
        self.pword = pword

    def find_user_password(self, realm, host):
        # Note that this class doesn't take `realm' into consideration.
        return self.uname, self.pword

    def add_password(self, realm, uri, user, password):
        pass

auth = urllib2.ProxyBasicAuthHandler(AuthenticateAllURIs('umjaw', 'fuse3'))
opener = urllib2.build_opener(auth)
urllib2.install_opener(opener)
wp = urllib2.urlopen("http://www.slashdot.org")
print wp.read()

HTH,
jw
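
The same duck-typed password manager idea, sketched for Python 3. This assumes the handler only needs find_user_password and add_password to exist on whatever object it is given. The credentials are placeholders, and I've used HTTPBasicAuthHandler here on the assumption that the 401 comes from the web server itself, not from a proxy.

```python
import urllib.request


class AuthenticateAllURIs:
    """Password manager stand-in: one username/password for every realm and URI."""

    def __init__(self, uname, pword):
        self.uname = uname
        self.pword = pword

    def find_user_password(self, realm, authuri):
        # realm and authuri are deliberately ignored
        return self.uname, self.pword

    def add_password(self, realm, uri, user, password):
        # nothing to store; the handler only needs the method to exist
        pass


manager = AuthenticateAllURIs('someuser', 'somepassword')  # placeholder credentials
# HTTPBasicAuthHandler answers 401 responses from the origin server;
# ProxyBasicAuthHandler answers 407 responses from a proxy.
auth = urllib.request.HTTPBasicAuthHandler(manager)
opener = urllib.request.build_opener(auth)
print(manager.find_user_password('any realm', 'http://example.com/'))
```

Bear in mind that a manager like this hands the same credentials to any server that issues a BASIC challenge, so it only makes sense when every URL the opener visits is trusted.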

On 15 Sep 2004 08:37:12 -0700, Michael Foord <fu******@gmail.com> wrote:
[ snip! ]


--
http://www.Voidspace.org.uk
The Place where headspace meets cyberspace. Online resource site -
covering science, technology, computing, cyberpunk, psychology,
spirituality, fiction and more.

---
http://www.Voidspace.org.uk/atlantib...thonutils.html
Python utilities, modules and apps.
Including Nanagram, Dirwatcher and more.
---
http://www.fuchsiashockz.co.uk
http://groups.yahoo.com/group/void-shockz
---

Everyone has talent. What is rare is the courage to follow talent
to the dark place where it leads. -Erica Jong
Ambition is a poor excuse for not having sense enough to be lazy.
-Milan Kundera

Jul 18 '05 #5
fu******@gmail.com (Michael Foord) writes:
Jaime Wyant <pr***********@gmail.com> wrote in message news:<ma**************************************@python.org>...
[...] have found it in your docs). This means I have a ClientCookie handler
handling all my http requests.... I wonder if I can use an AuthHandler
as well ? There will be situations where I am likely to want to add an
Authorization header *and* handle cookies - ClientCookie manages all the
cookies in a way that I couldn't do manually.

Sure, cookielib.HTTPCookieProcessor (or
ClientCookie.HTTPCookieProcessor) should work fine with all other
urllib2 handlers. Cookies and Basic HTTP Authentication are quite
distinct and separate in their implementation at the HTTP level.

Assuming Python 2.4 (UNTESTED -- I haven't recently had occasion to
use any auth.):

import urllib2
import cookielib
import ClientCookie # for some more urllib2 handlers, for good measure ;-)

def build_opener(realm, uri, user, password):
    ch = cookielib.HTTPCookieProcessor()

    mgr = urllib2.HTTPPasswordMgr()
    mgr.add_password(realm, uri, user, password)
    ah = urllib2.HTTPBasicAuthHandler(mgr)

    yet_more_handlers = [ClientCookie.HTTPRefreshProcessor(max_time=None),
                         ClientCookie.HTTPEquivProcessor(),
                         ClientCookie.HTTPRobotRulesProcessor(),
                         ]

    return urllib2.build_opener(ch, ah, *yet_more_handlers)

opener = build_opener('myrealm', 'http://example.com/', 'joe', 'joe')
opener.open('http://example.com/restricted.html')
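
For reference, a minimal Python 3 sketch of the same combination using only the standard library, where the cookie processor lives in urllib.request and the jar in http.cookiejar. The ClientCookie extras are left out, the realm, URL and credentials are the placeholders from above, and returning the jar alongside the opener is my own addition so it can be inspected later.

```python
import http.cookiejar
import urllib.request


def build_opener(realm, uri, user, password):
    """Return (opener, jar): an opener handling both cookies and BASIC auth."""
    jar = http.cookiejar.CookieJar()
    cookie_handler = urllib.request.HTTPCookieProcessor(jar)

    mgr = urllib.request.HTTPPasswordMgr()
    mgr.add_password(realm, uri, user, password)
    auth_handler = urllib.request.HTTPBasicAuthHandler(mgr)

    return urllib.request.build_opener(cookie_handler, auth_handler), jar


opener, jar = build_opener('myrealm', 'http://example.com/', 'joe', 'joe')
print([type(h).__name__ for h in opener.handlers])
# opener.open('http://example.com/restricted.html') would use both handlers
```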

[...] The example you gave works I think - HTTPBasicAuthHandler does have an
add_password method, but not the find_user_password that the
HTTPPasswordMgr has... so I can't easily check if it works properly.
In the urllib2 docs it says that passing a password manager in is
optional - but *nowhere* does it document that it has an add_password
method. It may be deducible from the fact that passing in a password
manager is optional - but surely explicit is better than implicit
(especially where documentation is concerned).

[...]

Tested doc patches posted to the Python sf.net patch tracker are
welcome :-)
John
Jul 18 '05 #6
jj*@pobox.com (John J. Lee) writes:
[...]
Assuming Python 2.4 (UNTESTED -- I haven't recently had occasion to
use any auth.):

import urllib2
import cookielib
import ClientCookie # for some more urllib2 handlers, for good measure ;-)

def build_opener(realm, uri, user, password):
    ch = cookielib.HTTPCookieProcessor()

[...]

Whoops, HTTPCookieProcessor is actually in urllib2, not cookielib.
John
Jul 18 '05 #7
jj*@pobox.com (John J. Lee) wrote in message news:<87************@pobox.com>...
fu******@gmail.com (Michael Foord) writes:
Jaime Wyant <pr***********@gmail.com> wrote in message news:<ma**************************************@python.org>... [...]
have found it in your docs). This means I have a ClientCookie handler
handling all my http requests.... I wonder if I can use an AuthHandler
as well ? There will be situations where I am likely to want to add an
Authorization header *and* handle cookies - ClientCookie manages all the
cookies in a way that I couldn't do manually.


Sure, cookielib.HTTPCookieProcessor (or
ClientCookie.HTTPCookieProcessor) should work fine with all other
urllib2 handlers. Cookies and Basic HTTP Authentication are quite
distinct and separate in their implementation at the HTTP level.


Really ? I can see situations where they both have to handle a
request.... but then I guess all cookielib has to do is add the
appropriate cookie header and then pass the request down the chain ?

(all is not meant as a denigration - merely a description ;-)
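
To make the 'pass the request down the chain' picture concrete, here is a toy Python 3 sketch. All three handler classes are hypothetical stand-ins written for this illustration, not the library's own: the real HTTPCookieProcessor does hook in through http_request/http_response, whereas the real HTTPBasicAuthHandler mainly reacts to a 401 response. The fake transport records what would have been sent, so no network is involved.

```python
import base64
import io
import urllib.request


class AddCookie(urllib.request.BaseHandler):
    """Hypothetical stand-in for a cookie processor: adds one fixed cookie."""

    def http_request(self, req):
        req.add_unredirected_header('Cookie', 'session=abc123')
        return req  # hand the request on to the next handler


class AddAuthorization(urllib.request.BaseHandler):
    """Hypothetical stand-in that adds a BASIC Authorization header up front."""

    def http_request(self, req):
        token = base64.b64encode(b'joe:joe').decode('ascii')
        req.add_header('Authorization', 'Basic ' + token)
        return req


class FakeTransport(urllib.request.BaseHandler):
    """Records the outgoing headers instead of touching the network."""

    def http_open(self, req):
        self.sent = dict(req.header_items())
        return io.BytesIO(b'fake page')


transport = FakeTransport()
opener = urllib.request.OpenerDirector()
for handler in (AddCookie(), AddAuthorization(), transport):
    opener.add_handler(handler)

page = opener.open('http://example.com/restricted.html').read()
print(transport.sent)
```

Each *_request hook receives the request, may add headers, and returns it, so the two concerns never need to know about each other.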

[snip..]

[...]
The example you gave works I think - HTTPBasicAuthHandler does have an
add_password method, but not the find_user_password that the
HTTPPasswordMgr has... so I can't easily check if it works properly.
In the urllib2 docs it says that passing a password manager in is
optional - but *nowhere* does it document that it has an add_password
method. It may be deducible from the fact that passing in a password
manager is optional - but surely explicit is better than implicit
(especially where documentation is concerned). [...]

Tested doc patches posted to the Python sf.net patch tracker are
welcome :-)


Hmm... maybe.
I only know what I've deduced and I'm not over confident that it's
100% cast iron right - I just think it's probably right.

I'm quite happy to submit it - but if it's inaccurate I don't do the
python community or myself any favours...

Regards,

Fuzzy
http://www.voidspace.org.uk/atlantib...thonutils.html

John

Jul 18 '05 #8
