Handling cookies without urllib2 and cookielib

Standard disclaimer: read, googled, read some more. If you have a link,
please feel free to point me there.

I'm using httplib to construct some functional tests for a web app we're
writing. We're not using urllib2 because we need support for PUT and
DELETE methods, which urllib2 does not support.

We also need client-side cookie handling. So, I started reading about
cookielib and ran across a problem. Its cookie handling is tied quite
tightly to urllib2's request object. httplib has somewhat different
semantics in its request object. So, you can't use cookielib with httplib.
And cookielib has no simple function (that I could find) for passing in a
Set-Cookie header and getting back a CookieJar object (or even a list of
Cookie objects).

I'm sure I'm not the first to have to deal with httplib and cookies. Anyone
have suggestions or pointers?

j

Dec 15 '07 #1
On 14 Dec, 23:44, Joshua Kugler <jkug...@bigfoot.com> wrote:
> I'm using httplib to construct some functional tests for a web app we're
> writing. We're not using urllib2 because we need support for PUT and
> DELETE methods, which urllib2 does not support.
>
> We also need client-side cookie handling. So, I started reading about
> cookielib and ran across a problem. Its cookie handling is tied quite
> tightly to urllib2's request object. httplib has somewhat different
> semantics in its request object. So, you can't use cookielib with httplib.
> And cookielib has no simple function (that I could find) for passing in a
> Set-Cookie header and getting back a CookieJar object (or even a list of
> Cookie objects).

What about correcting the first thing, making urllib2 support
HEAD/PUT/DELETE?

import urllib2

class Request(urllib2.Request):

    def __init__(self, url, data=None, headers={},
                 origin_req_host=None, unverifiable=False,
                 method=None):
        urllib2.Request.__init__(self, url, data, headers,
                                 origin_req_host, unverifiable)
        # Explicit HTTP method; None means "let urllib2 decide"
        self.method = method

    def get_method(self):
        # Fall back to urllib2's default: POST with a body, GET without
        if self.method is None:
            if self.data is not None:
                return "POST"
            else:
                return "GET"
        return self.method

py> f = urllib2.urlopen(Request("http://www.python.org/",
...                             method="HEAD"))
py> print f.info()
Date: Sun, 16 Dec 2007 00:03:43 GMT
Server: Apache/2.2.3 (Debian) DAV/2 SVN/1.4.2 mod_ssl/2.2.3 OpenSSL/0.9.8c
Last-Modified: Sat, 15 Dec 2007 16:25:58 GMT
ETag: "60193-3e6a-a24fb180"
Accept-Ranges: bytes
Content-Length: 15978
Connection: close
Content-Type: text/html

py> print len(f.read())
0

Notes:
a) Instead of urlopen(url, ...) you must use urlopen(Request(url, ...))
b) Redirection is not handled correctly in HTTPRedirectHandler (the
   request method should be copied over)
c) I've not verified the PUT / DELETE methods
d) I'll try to make a proper patch later

--
Gabriel Genellina
Dec 16 '07 #2
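
Note (b) above could be addressed with a redirect handler that copies the method over. What follows is only a rough sketch under stated assumptions, not a tested patch: it uses the Python 3 module name (urllib.request is the renamed urllib2, and its Request has accepted a method= argument since Python 3.3), the class name MethodPreservingRedirectHandler is made up for illustration, and the choice to turn a 303 into a GET and drop the body is one possible policy rather than anything the thread prescribes.

```python
import urllib.request


class MethodPreservingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler: follow redirects with the original method."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        method = req.get_method()
        # Assumption: a 303 asks the client to fetch the new URL with GET.
        if code == 303 and method != "HEAD":
            method = "GET"
        keeps_body = method not in ("GET", "HEAD")
        # Drop body-related headers when the body itself is dropped.
        new_headers = dict(
            (name, value) for name, value in req.headers.items()
            if keeps_body or name.lower() not in ("content-length", "content-type")
        )
        return urllib.request.Request(
            newurl,
            data=req.data if keeps_body else None,
            headers=new_headers,
            origin_req_host=req.origin_req_host,
            unverifiable=True,
            method=method,  # the piece the stock handler does not copy over
        )
```

It would be installed with urllib.request.build_opener(MethodPreservingRedirectHandler()). Replaying a PUT or DELETE against a redirect target is probably acceptable in functional tests against your own app, but may not be what you want elsewhere.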
Gabriel Genellina wrote:
> On 14 Dec, 23:44, Joshua Kugler <jkug...@bigfoot.com> wrote:
>> I'm using httplib to construct some functional tests for a web app we're
>> writing. We're not using urllib2 because we need support for PUT and
>> DELETE methods, which urllib2 does not support.
>>
>> We also need client-side cookie handling. So, I started reading about
>> cookielib and ran across a problem. Its cookie handling is tied quite
>> tightly to urllib2's request object. httplib has somewhat different
>> semantics in its request object. So, you can't use cookielib with httplib.
>> And cookielib has no simple function (that I could find) for passing in
>> a Set-Cookie header and getting back a CookieJar object (or even a list
>> of Cookie objects).
>
> What about correcting the first thing, making urllib2 support
> HEAD/PUT/DELETE?
> <SNIP>

We may have to do that, and then hack on the redirect handler too so it will
properly keep the request method. But that's not our preference, for
obvious reasons. :)

I just find it hard to believe that no one has ever needed to do cookie
handling in a generic way (i.e. input: Set-Cookie header, output: cookie
objects) before. May have to write my own. Or subclass/extend cookielib.

j

Dec 18 '07 #3
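
For the generic case described above (input: Set-Cookie header, output: cookie objects), one possible approach is to hand cookielib a tiny stand-in response, since CookieJar.extract_cookies() appears to need only info() from the response and the URL from the request. A minimal sketch, written with the Python 3 module names (http.cookiejar is the renamed cookielib, urllib.request the renamed urllib2). FakeResponse, cookies_from_headers, and cookie_header_for are hypothetical helper names, not library functions, and www.example.com is a placeholder host.

```python
import email.message
import http.cookiejar
import urllib.request


class FakeResponse(object):
    """Hypothetical stand-in: exposes info() with the Set-Cookie headers."""

    def __init__(self, set_cookie_values):
        self._msg = email.message.Message()
        for value in set_cookie_values:
            # Assigning the same header name again appends another header
            self._msg["Set-Cookie"] = value

    def info(self):
        return self._msg


def cookies_from_headers(url, set_cookie_values, jar=None):
    """Parse raw Set-Cookie header values that were received from `url`."""
    if jar is None:
        jar = http.cookiejar.CookieJar()
    # The Request is only a holder for the URL; nothing is sent over the wire
    request = urllib.request.Request(url)
    jar.extract_cookies(FakeResponse(set_cookie_values), request)
    return jar


def cookie_header_for(url, jar):
    """Return the Cookie header value the jar would send to `url`, or None."""
    request = urllib.request.Request(url)
    jar.add_cookie_header(request)
    return request.get_header("Cookie")


jar = cookies_from_headers("http://www.example.com/login",
                           ["sid=abc123; Path=/"])
for cookie in jar:
    print(cookie.name, cookie.value, cookie.domain, cookie.path)
```

The string returned by cookie_header_for() could then be passed to httplib by hand, for example as {"Cookie": value} in the headers argument of HTTPConnection.request("PUT", path, body, headers). The jar still applies its normal cookie policy, so a cookie whose domain does not match the URL will be silently rejected.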
