How to share session with IE

zdp
Hello!

I need to process some pages of a forum that is powered by Discuz!.
When I log in, there are options for how long to keep the cookies:
forever, a month, a week, etc. If I choose forever, I don't need to
log in each time, and when I open Internet Explorer I can access any
page directly. The page URLs look like this:

http://www.somesite.com/bbs/viewthre...extra=page%3D1

However, now I need to process some pages with a Python program. When I
use urllib.urlopen(theurl), I only get a page telling me that I need to
log in. I think that's reasonable, because my program isn't in a
logged-in session the way IE is.

So how can I do this? I want to get the right page from the URL. I
have searched the newsgroups but didn't find a clear answer. Should
I use win32com or urllib? Any reply or information is appreciated. I
hope I have made myself clear.

Dapu

Oct 10 '06 #1
Hello Dapu,

You can do the same thing IE does on your forum using urllib2 and
cookielib. In short, you need to write a small web crawler. I can give
you my browser module if necessary.
If you don't have the time to fiddle with the coding part or with my
browser module, you can also use this particularly useful module:
http://wwwsearch.sourceforge.net/mechanize/
The documentation is pretty clear for an experienced Python programmer.
If that's not your case, I'd recommend reading some ebooks on the Python
language first to get used to it.

Bernard


Oct 10 '06 #2
zdp
It's exactly what I want. I'll try. Thanks!

Oct 10 '06 #3
In particular, if you're following the approach Bernard suggests, you
can either:

1. Log in every time your program runs, by going through the sequence
of clicks, pages, etc. that you would use in a browser to log in.

2. Once only (or once a month, or whatever), log in by hand using IE
with a "Remember me"-style feature (if the website offers that) --
where the webapp asks the browser to save the cookie rather than
just keeping it in memory until you close your browser. Then your
program can load the cookies from your real browser's cookie store
using this:

http://wwwsearch.sourceforge.net/mec....html#browsers
There are other alternatives too, but they depend on knowing a little
bit more about how cookies and web apps work, and may or may not work
depending on what exactly the server does. I'm thinking specifically
here of saving *session* cookies (the kind that usually go away when
you close your browser) in a file -- but the server may not like them
when you send them back the next time, depending how much time has
elapsed since the last run. Of course, you can always detect the
"need to login" condition, and react accordingly.
John
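
A minimal sketch of the session-cookie idea above, not a tested recipe for any particular forum. It uses the Python 3 names of the modules discussed in this thread (cookielib is now http.cookiejar). The file name, cookie name and domain are made up for illustration. The detail that matters is ignore_discard=True: without it, session cookies are neither written to the file nor read back.

```python
import os
import tempfile
from http.cookiejar import Cookie, LWPCookieJar


def make_session_cookie(name, value, domain):
    """Build a session cookie by hand (no expiry, discard=True).

    In a real run the server sets this through an opener built with
    HTTPCookieProcessor(jar); it is constructed here only so the sketch
    runs without a network connection.
    """
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def save_session(jar, path):
    # ignore_discard=True keeps session cookies, which save() skips by default
    jar.save(path, ignore_discard=True)


def load_session(path):
    # Return a jar filled from the file if it exists, otherwise an empty jar
    jar = LWPCookieJar()
    if os.path.exists(path):
        jar.load(path, ignore_discard=True)
    return jar


# Hypothetical file name, cookie name and domain, for illustration only.
cookie_file = os.path.join(tempfile.mkdtemp(), "forum_cookies.lwp")
jar = LWPCookieJar()
jar.set_cookie(make_session_cookie("sid", "abc123", "www.somesite.com"))
save_session(jar, cookie_file)

restored = load_session(cookie_file)
restored_names = [c.name for c in restored]
```

Whether the server still accepts a restored session cookie depends on how long it has been since the last run, so a program using this would still need to check for the "need to login" page.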

Oct 10 '06 #4

Another option instead of making your program run through a series of
clicks and text inputs, which is difficult to program, is to browse the
html source until you find the name of the script that processes the
login, and use python to request the page with the necessary form fields
encoded in the request. Request something like
http://www.targetsite.com/login.cgi?username=pyuser&password="fhqwhgads"
This format is not guaranteed to work, since the login script or server
might only support one of GET and POST. If this is the case, creating
the request is slightly more involved and to be honest I haven't looked
into how to do it.
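
For what it's worth, here is a small sketch of the difference, using the Python 3 module names (urllib.parse and urllib.request in place of urllib and urllib2). The URL and field names are the placeholders from the paragraph above, not a real login script. Passing the encoded fields as the data argument is what turns the request into a POST. Nothing is sent over the network here.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder URL and credentials taken from the example above.
login_url = "http://www.targetsite.com/login.cgi"
fields = {"username": "pyuser", "password": "fhqwhgads"}

# urlencode escapes the values and joins them as name=value&name=value
encoded = urlencode(fields)

# GET: the fields travel in the query string of the URL
get_request = Request(login_url + "?" + encoded)

# POST: the same fields travel in the request body, which must be bytes
post_request = Request(login_url, data=encoded.encode("ascii"))

methods = (get_request.get_method(), post_request.get_method())
```

Either request can then be passed to the open() method of an opener built with HTTPCookieProcessor, so that any cookie set by the login is kept for later requests.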

Thereafter, you will have to pass the environment to every page request
so the server can read the cookie. Which brings me to question whether
or not it is possible to do this manually once, export the environment
variable to a file, and reload this file each time the program is run.
Or to generate the cookie in the environment yourself. Quite frankly
any server application that allows the client to control whether or not
they have logged in sucks, but I've seen a fair few that do.[citation
required]

Cameron.
Oct 11 '06 #5
I just thought, your original question was whether or not it was
possible to share your browser session with IE. Unless you do this
explicitly, you may require a different login for your Python program
and for your IE user. If the Python program does not get the same
cookie as used by IE, or vice-versa, and tries to login, you may find
this resets the login in the other browser.

Oh and at risk of starting a flame war <a href="www.getfirefox.com"
reason="obligatory pro open source link">get a real browser.</a>
Oct 11 '06 #6
zdp
I found some similar topics in the newsgroup and got some ideas from
them.
http://groups.google.com/group/comp....e0be6c386adce4
http://groups.google.com/group/comp....1cec8747f64619

According to all your suggestions, there are at least two ways to get my
result.

1. Use IE's cookies, so I don't need to write code to log in. That means
I must use ClientCookie. I found some examples in the docs and the
newsgroup. Below is some code based on the ClientCookie docs. But
the page I get is still the one telling me I must log in (I CAN get the
right page in IE).

    import ClientCookie, urllib2

    # the page I want to get
    url_string = "http://www.targetsite.com/bbs/viewthread.php?tid=12345"

    cj = ClientCookie.MSIECookieJar(delayload=True)
    cj.load_from_registry()
    print cj  # I want to know what I get

    opener = ClientCookie.build_opener(ClientCookie.HTTPCookieProcessor(cj))
    ClientCookie.install_opener(opener)
    f = ClientCookie.urlopen(url_string)
    print f.read()  # NOT the right page html

2. Log in myself from Python. First, I access the login page and submit
the form with the username and password. The form has many fields other
than username and password, so the dict "data" has all the fields, even
the hidden ones. Then, if the login succeeds, I can get my page using
the opener with the CookieJar.

    import urllib, urllib2, cookielib

    # the page I want to get
    url_string = "http://www.targetsite.com/bbs/viewthread.php?tid=12345"
    # the login page
    url_login = "http://www.targetsite.com/bbs/logging.php?action=login"

    headers = {'User-agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}
    cj = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

    urllib2.install_opener(opener)
    data = {
        'formhash': '3bd8bc0a',
        "referer": "index.php",
        "loginfield": "username",
        'username': 'myname',
        'password': 'mypass',
        "questionid": 0,
        "answer": "",
        "cookietime": "315360000",
        "loginmode": "",
        "styleid": ""
    }
    # urlencode lives in urllib, so urllib has to be imported as well
    req = urllib2.Request(url_login, urllib.urlencode(data), headers)
    f = opener.open(req)
    print req.get_data()
    print req.header_items()
    print f.info()
    print f.read()

    ## if the login succeeds, I can get my page
    f = opener.open(url_string)

However, neither way worked for me. I don't know what's wrong. Is it
because the server checks the headers, or because the form submission
is wrong?

I haven't studied the mechanize module yet. I want a solution that is as
simple as possible, for distribution reasons.
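
One thing that may be worth checking, offered as a guess rather than a diagnosis: on some forums a hidden field such as formhash is generated per visit, in which case a hard-coded value copied from an earlier page could be rejected. Below is a sketch, in Python 3, that collects the hidden inputs from a login page's HTML so they can be merged into the data dict instead of being typed in by hand. The sample HTML is invented for illustration and only mimics the field names used in the code above.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.fields[attributes["name"]] = attributes.get("value") or ""


def hidden_fields(html):
    # Parse the page and return a dict of its hidden form fields
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


# Invented sample page; a real one would come from opener.open(url_login).
sample_html = """
<form method="post" action="logging.php?action=login">
  <input type="hidden" name="formhash" value="3bd8bc0a">
  <input type="hidden" name="referer" value="index.php">
  <input type="text" name="username">
  <input type="password" name="password">
</form>
"""

data = hidden_fields(sample_html)
data.update({"username": "myname", "password": "mypass"})
```

The login page would need to be fetched with the same cookie-aware opener that later submits the form, so that the hidden values and the cookies belong to the same visit.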

Oct 12 '06 #7
Cameron Walsh <ca***********@gmail.com> writes:
[...]
> Another option instead of making your program run through a series of
> clicks and text inputs, which is difficult to program, is to browse
> the html source until you find the name of the script that processes
> the login, and use python to request the page with the necessary form
> fields encoded in the request. Request something like
> http://www.targetsite.com/login.cgi?username=pyuser&password="fhqwhgads"
> This format is not guaranteed to work, since the login script or
> server might only support one of GET and POST. If this is the case,
> creating the request is slightly more involved and to be honest I
> haven't looked into how to do it.

Absolutely, that's often a great way to do things, since it's very
simple, and is not in conflict with handling cookies (where that's
required).

(But of course if you need to handle cookies, you still need to
arrange to actually handle the cookies somewhere.)

> Thereafter, you will have to pass the environment to every page
> request so the server can read the cookie. Which brings me to
> question whether or not it is possible to do this manually once,
> export the environment variable to a file, and reload this file each
> time the program is run. Or to generate the cookie in the environment
> yourself.
[...]

Standard library module cookielib (or mechanize, which is not part of
the stdlib, and does some more stuff automatically and provides some
extra features for page navigation and form handling) does all this
automatically:

http://docs.python.org/lib/cookielib-examples.html

    import cookielib, urllib2
    cj = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    r = opener.open("http://example.com/")

For loading and saving (including Firefox support):

http://docs.python.org/lib/file-cookie-jar-classes.html

http://docs.python.org/lib/cookie-jar-objects.html

For loading IE cookies, use mechanize.

http://wwwsearch.sourceforge.net/mechanize/
John
Oct 14 '06 #8
"zdp" <zh******@gmail.com> writes:
[...]
> 1. Use the cookie of IE, so I don't need to code to logon. That means I
> must use ClientCookie. I found some example in the docs and the
> newsgroup. Below is some code based on the docs of ClientCookie. But
> the page I get is still the page told me must login ( I CAN get the
> right page in IE).

Try mechanize (same website as ClientCookie -- though right now, that
part of sourceforge seems to be down for me). It supports more
browser features automatically.

[...]
> However, both ways didn't work for me. I don't know what's wrong. If
> it's because the server page check the header or the submit of the form
> is wrong?

Changing the HTTP headers you send may solve your problem, yes.
Either that, or the response body ;-)

> I didn't study Mechanize module yet. I want a solution as simple as
> possible for distribution reason.

OK, then you should compare the HTTP requests that a real browser
sends with the HTTP requests that your Python script sends. The
following pages give some help with that (from memory, since the site
is down right now):

http://wwwsearch.sf.net/mechanize/doc.html#debugging

http://wwwsearch.sf.net/bits/GeneralFAQ.html
John
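
A sketch of one way to see what the script sends, in Python 3 (urllib.request replaces urllib2, http.cookiejar replaces cookielib). It only builds the request and an opener with debug output switched on. The URL, form body and User-agent string are the placeholders used earlier in the thread, and no request is actually made here.

```python
from http.cookiejar import CookieJar
from urllib.request import (
    HTTPCookieProcessor,
    HTTPHandler,
    Request,
    build_opener,
)

# Placeholder values from earlier in the thread.
url_login = "http://www.targetsite.com/bbs/logging.php?action=login"
headers = {"User-agent": "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)"}

request = Request(
    url_login,
    data=b"username=myname&password=mypass",
    headers=headers,
)

# Headers set explicitly on the request; others such as Host and
# Content-type are only added when the request is actually opened.
explicit_headers = dict(request.header_items())

# debuglevel=1 makes the HTTP handler print the request it sends and the
# response headers it receives to stdout.
opener = build_opener(HTTPHandler(debuglevel=1), HTTPCookieProcessor(CookieJar()))

# response = opener.open(request)  # uncomment to send the request for real
```

The printed output can then be compared line by line with what the browser sends for the same page.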
Oct 14 '06 #9
