urllib2/cookies - surely there's a better way ?

Hi - I'm writing a script which fetches a page from a web server and
takes note of any set-cookies which are served in the headers so that
when I next request a page I can send those cookies back to the
server. This is so that the usage analysis software on the server
(based on cookies) will take account of the script's activities.

Now the thing is I'm halfway through doing this but I'm thinking there
must be a more refined mechanism than the one I'm using (see below).

I'm not really asking if there's a way to smarten up the rather clunky
splits (although if necessary that would be welcome); I'm asking
whether there is a more refined interface to the whole area of cookies.

Strangely enough the doco says that f.info() "return the
meta-information of the page, as a dictionary-like object" - well as
far as I can see it's a string and the f.info().headers is a list. I
don't usually find errors in the doco so this makes me wonder if
there's something I'm doing fundamentally wrong ?
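
For what it's worth, the object that f.info() returns does appear to behave like a dictionary, even though printing it shows plain header text. Below is a minimal sketch of that behaviour, written for Python 3, where the same kind of message interface is available through the standard email package. The header values in RAW_HEADERS are made up for illustration and nothing here touches the network.

```python
import email

# Hypothetical response headers, standing in for what a server might send.
RAW_HEADERS = (
    "Content-Type: text/html\n"
    "Set-Cookie: session=abc123; Path=/\n"
    "Set-Cookie: visitor=42; Path=/a/b/\n"
    "\n"
)

# Parse the header block into a message object with dictionary-like access.
msg = email.message_from_string(RAW_HEADERS)

# Lookups are case-insensitive, as HTTP header names are.
content_type = msg["content-type"]

# A header that occurs more than once is fetched with get_all().
set_cookies = msg.get_all("Set-Cookie", [])

print(content_type)
for value in set_cookies:
    print(value)
```

In the Python 2 urllib2 used in this post, the closest equivalent is probably f.info().getheaders("Set-Cookie"), which avoids walking the raw .headers list by hand.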

Anyway any ideas would be welcome. Here goes with the work in progress
....
import urllib2

req_headers = {
    'User-Agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)',
    'Referer': ''
}

TARGETURL = 'http://www.somedomain.com/a/b/'

req = urllib2.Request(TARGETURL, None, req_headers)
f = urllib2.urlopen(req)
# raw header lines, each of the form "Name: value\r\n"
lstHeaders = f.info().headers

for h in lstHeaders:
    # split "Name: value" at the first colon only
    lstHeaderContents = h.split(":", 1)
    if lstHeaderContents[0].upper() == "SET-COOKIE":
        print lstHeaderContents[1]
        # get the keyword value pair to the left of the first ';'
        lstYetAnother = lstHeaderContents[1].split(";", 1)
        # put the keyword value pair into a list, dropping stray whitespace
        lstOneMore = lstYetAnother[0].strip().split("=", 1)
        if len(lstOneMore) == 2:
            print "Keyword=" + lstOneMore[0] + ". Value = " + lstOneMore[1] + "."


That's the script - thanks for reading this far.
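
If only the parsing is wanted, the standard library already has a cookie parser that can replace the chain of splits. This is a rough sketch in Python 3 using http.cookies.SimpleCookie (the Python 2 module was called Cookie); the helper name parse_set_cookie and the sample header value are hypothetical.

```python
from http.cookies import SimpleCookie


def parse_set_cookie(header_value):
    """Return (name, value, attributes) tuples for one Set-Cookie header value."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    result = []
    for name, morsel in cookie.items():
        # Keep only the attributes the server actually supplied.
        attributes = {key: val for key, val in morsel.items() if val}
        result.append((name, morsel.value, attributes))
    return result


# Hypothetical header value, as it would appear after "Set-Cookie:".
parsed = parse_set_cookie("session=abc123; Path=/a/b/")
print(parsed)
```

This only parses the header. It does not decide which cookies should be sent back on a later request, which is the harder half of the problem.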

regards

richard.
Jul 18 '05 #1
2 Replies


Richard Shea wrote:

(about urllib2 and cookies)

Search for "ClientCookie"...
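
For readers on a current Python: as far as I know, ClientCookie's approach was later adopted into the standard library (cookielib in Python 2.4, http.cookiejar in Python 3). The sketch below uses the Python 3 names and shows the idea of a cookie jar that records Set-Cookie headers and replays them on the next request. FakeResponse is a hypothetical stand-in for a real server response so that the example runs without a network; the cookie value is made up.

```python
import email
import urllib.request
from http.cookiejar import CookieJar


class FakeResponse:
    """Hypothetical stand-in for an HTTP response; only info() is needed."""

    def __init__(self, raw_headers):
        self._msg = email.message_from_string(raw_headers)

    def info(self):
        return self._msg


jar = CookieJar()

# First request: the (simulated) server sets a cookie.
first_request = urllib.request.Request("http://www.somedomain.com/a/b/")
response = FakeResponse("Set-Cookie: session=abc123; Path=/\n\n")
jar.extract_cookies(response, first_request)

# Second request: the jar adds the matching Cookie header by itself.
second_request = urllib.request.Request("http://www.somedomain.com/a/b/next")
jar.add_cookie_header(second_request)
cookie_header = second_request.get_header("Cookie")
print(cookie_header)

# Against a real server, an opener built like this does both steps on
# every request made through opener.open(url).
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
```

The point of the design is that the jar, not the script, applies the domain and path rules that decide which cookies go back to which URLs.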
Jul 18 '05 #2

Peter Hansen <pe***@engcorp.com> wrote in message news:<84********************@powergate.ca>...
> Richard Shea wrote:
>
> (about urllib2 and cookies)
>
> Search for "ClientCookie"...


That's great ! I actually laughed when I read the doco - I was only
looking for something to parse cookies with but this does the whole
thing ! I haven't yet used it but I've had a few wriggles getting to
where the 'import ClientCookie' works so I thought I might tell the
newsgroup what I did to make it work (although that might be fairly
obvious to many).

First of all I was installing on a W98 machine. I tried using the
install procedure "python setup.py build" but I got the message
"error: package directory 'ClientCookie' does not exist". I do have a
slightly weird setup so I wasn't all that surprised.

Anyway I then took the fallback option of copying the 'ClientCookie'
directory from the .ZIP manually. In order to make this work you need
to ensure that sys.path contains a path to ClientCookie before it
finds the standard libraries. I chose to do that by editing the
registry at

HKLM/SOFTWARE/Python/PythonCore/2.3/PythonPath

and adding a new entry there with a value which pointed at the
ClientCookie directory (ie C:/a/b/ClientCookie-0.4.18/ClientCookie).
However, although I got an extra entry at the 'right' place in sys.path,
this didn't allow me to "import ClientCookie". Eventually I modified the
Registry entry to read C:/a/b/ClientCookie-0.4.18 (the directory that
contains the ClientCookie package) and now everything seems to be fine.
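
The behaviour described above follows from how import searches sys.path: each entry must be the directory that contains the package directory, not the package directory itself. Here is a small self-contained sketch in Python 3 that reproduces both cases with a throwaway package; the package name demo_pkg_for_path_test and the helper try_import are hypothetical.

```python
import importlib
import os
import sys
import tempfile


def try_import(name, path_entry):
    """Return True if `name` imports once `path_entry` is put on sys.path."""
    sys.path.insert(0, path_entry)
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False
    finally:
        # Undo the changes so each attempt starts from the same state.
        sys.path.remove(path_entry)
        sys.modules.pop(name, None)


# Build <root>/demo_pkg_for_path_test/__init__.py in a temporary directory.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg_for_path_test")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as fh:
    fh.write("VALUE = 1\n")

# Pointing sys.path at the package directory itself does not work ...
inside_result = try_import("demo_pkg_for_path_test", pkg_dir)
# ... but pointing it at the parent directory does.
parent_result = try_import("demo_pkg_for_path_test", root)

print(inside_result, parent_result)
```

That matches the two registry values tried here: the entry ending in /ClientCookie is the package itself, while C:/a/b/ClientCookie-0.4.18 is its parent.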

This is probably pretty straightforward stuff for most people but I
still find some aspects of 'import' a dark art so I thought it was
worth sticking it into the archives.

Thanks again for the tip.

Regards

Richard.
Jul 18 '05 #3
