Hi,
I'm writing a web automation script using ClientForm and urlgrabber.
I use urlgrabber because I need the "http keepalive" which doesn't exist in urllib2.
My problem is that "form.click()" returns a urllib2.Request object, which I can't pass to urlgrabber.urlopen().
On the ClientForm web page I see "see HTMLForm.click.__doc__ if you don't have urllib2", but I have no idea what HTMLForm is.
I have tried searching for it on Google, but found nothing useful.
Thanks for any help.
The error message is as follows:

Traceback (most recent call last):
  File "wretch4.py", line 53, in ?
    response2 = urlopen(request2)
  File "/var/lib/python-support/python2.4/urlgrabber/grabber.py", line 605, in urlopen
    return default_grabber.urlopen(url, **kwargs)
  File "/var/lib/python-support/python2.4/urlgrabber/grabber.py", line 881, in urlopen
    (url,parts) = opts.urlparser.parse(url, opts)
  File "/var/lib/python-support/python2.4/urlgrabber/grabber.py", line 653, in parse
    parts = urlparse.urlparse(url)
  File "/usr/lib/python2.4/urlparse.py", line 50, in urlparse
    tuple = urlsplit(url, scheme, allow_fragments)
  File "/usr/lib/python2.4/urlparse.py", line 89, in urlsplit
    i = url.find(':')
  File "/usr/lib/python2.4/urllib2.py", line 207, in __getattr__
    raise AttributeError, attr

My code is as follows:

#!/usr/bin/python

from ClientForm import ParseResponse
from urlgrabber import urlopen

url = 'http://www.ggggg.com'

headers = (
    ('User-Agent','Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.10) Gecko/20071115 Iceweasel/2.0.0.10 (Debian-2.0.0.10-0etch1)'),
    ('Accept','text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5'),
    ('Accept-Language','en-us,en;q=0.5'),
    # ('Accept-Encoding','gzip,deflate'),
    ('Accept-Charset','ISO-8859-1,utf-8;q=0.7,*;q=0.7'),
    ('Keep-Alive','300'),
    ('Connection','keep-alive'),
    ('Referer',url),
    ('Content-Type','application/x-www-form-urlencoded'))

response = urlopen(url,http_headers = headers)
forms = ParseResponse(response, backwards_compat=False)

form = forms[0]
print form

form["passwd"] = "ggggg"

request2 = form.click()
response2 = urlopen(request2)

print response2

Hi,
I don't have an answer for you, sorry; I am not an expert Python programmer. But I have a problem that I think you may be able to help me with, if you don't mind.
I have used cookielib to log in to a website, but I am unable to connect to additional URLs within the website once I have logged in. I believe I have the cookie, headers, and so on set up properly. However, when I check the headers with the 'HTTPAnalyze' program, I see that it is using a KEEPALIVE connection. This is new to me. From your problem statement, it looks like urlgrabber may be my solution for handling KEEPALIVE.
My problem is that I have tried to use urlgrabber to handle the KEEPALIVE. I can access the website and read, open, and grab pages, but I can't figure out how to log in (with 'username' and 'passwd'), let alone access additional pages inside the site. I am stumped. After a lot of searching I have found no examples of this online. I was hoping you could show me how to do this with a simple example. I am a beginner Python programmer/user.
I am using Python 2.5 on Windows XP, with IE7 when browsing manually.
I appreciate it very much.
Hi Mcgrete,
First you have to figure out how the site's login works; I think it will be an HTTP GET or POST request.
You can search for HTTP GET and POST on Google, and the urlgrabber manual can help you too:
http://linux.duke.edu/projects/urlgrabber/help/urlgrabber.grabber.html
P.S. I have finally overcome my own problem by using the mechanize module. My problem had nothing to do with "KEEPALIVE".
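
For anyone who lands here with the same traceback: it shows urlparse calling url.find(':') on the Request object, so urlgrabber's urlopen() evidently expects a plain URL string rather than a urllib2.Request. Below is a minimal sketch of one way around that, pulling the URL, body, and headers back out of the Request. The helper name unpack_request and the example.com values are made up for illustration, and the Request is built by hand as a stand-in for what form.click() returns. It should run on Python 2.6+ or Python 3.

```python
try:
    from urllib2 import Request          # Python 2, as used in the thread
except ImportError:
    from urllib.request import Request   # Python 3


def unpack_request(request):
    """Return (url, data, headers) from a urllib2-style Request object.

    Hypothetical helper: it only reads what the Request already holds.
    """
    url = request.get_full_url()             # plain string URL
    data = request.data                      # POST body, or None for a GET
    headers = tuple(request.header_items())  # tuple of (name, value) pairs
    return url, data, headers


# Hand-built stand-in for the object that form.click() returned above.
request2 = Request(
    "http://www.example.com/login",
    data=b"passwd=secret",
    headers={"Referer": "http://www.example.com"},
)

url, data, headers = unpack_request(request2)
print(url)
```

With a string URL in hand, a call along the lines of urlopen(url, data=data, http_headers=headers) may then work with urlgrabber. The http_headers keyword is the one used in the original script; data as the keyword for the POST body is my reading of the urlgrabber manual and is untested here. I also believe ClientForm offers a click_request_data() method that returns this kind of (url, data, headers) tuple directly, which is probably what the "if you don't have urllib2" note on its web page refers to, but check HTMLForm.click.__doc__ to be sure.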