scripting browsers from Python

I would like to know what is available for scripting browsers from
Python. For instance, webbrowser.open lets me perform GET requests, but I
would like to do POST requests too. I don't want to use urllib to emulate
a browser; I am interested in checking that browser X really works as
intended with my application. Any suggestions?

Michele Simionato

Jul 19 '05 #1
On 31 May 2005 00:52:33 -0700, Michele Simionato
<mi***************@gmail.com> wrote:
I would like to know what is available for scripting browsers from
Python. For instance, webbrowser.open lets me perform GET requests, but I
would like to do POST requests too. I don't want to use urllib to emulate
a browser; I am interested in checking that browser X really works as
intended with my application. Any suggestions?


I don't know of anything cross platform, or even cross browser, but on
Windows, IE can be automated via COM - see
<http://www.mayukhbose.com/python/IEC/> for example - and other
browsers should be able to be automated either via COM or by driving
the GUI with WATSUP (<http://www.tizmoi.net/watsup/intro.html>).

--
Cheers,
Simon B,
si***@brunningonline.net,
http://www.brunningonline.net/simon/blog/
Jul 19 '05 #2
Michele Simionato wrote:
I would like to know what is available for scripting browsers from
Python.
For instance, webbrowser.open lets me perform GET requests, but I
would like to do POST requests too. I don't want to use urllib to
emulate a browser; I am interested in checking that browser X really
works as intended with my application. Any suggestions?


For Konqueror running on KDE, you can use DCOP to control the browser.
There are a couple of different, but related, Python modules that you
can use to do this. See the following page for more information:

http://developer.kde.org/language-bindings/python/

I believe this approach has been used quite successfully with other
KDE applications:

http://www.kde-apps.org/content/show.php?content=18638

You should still be able to automate the browser with just popen2 and
the "dcop" command line tool if you are really desperate. I once had
to resort to this ad-hoc approach in the distant past but, these days,
I'd recommend one of the above modules instead.

David

Jul 19 '05 #3
Simon Brunning wrote:
On 31 May 2005 00:52:33 -0700, Michele Simionato
<mi***************@gmail.com> wrote:
I would like to know what is available for scripting browsers from
Python.


I don't know of anything cross platform, or even cross browser, but on
Windows, IE can be automated via COM - see
<http://www.mayukhbose.com/python/IEC/> for example


Also http://pamie.sourceforge.net/

Kent
Jul 19 '05 #4
"Michele Simionato" <mi***************@gmail.com> writes:
I would like to know what is available for scripting browsers from
Python. For instance, webbrowser.open lets me perform GET requests, but I
would like to do POST requests too. I don't want to use urllib to emulate
a browser; I am interested in checking that browser X really works as
intended with my application. Any suggestions?


Yes:

http://selenium.thoughtworks.com/index.html

http://agiletesting.blogspot.com/200...on-part-2.html

http://products.actimind.com/actiWATE/
Unfortunately, there's still no free (as in speech) "macro recorder"
implemented as a browser plugin (nor even one implemented on the HTTP
level that can produce output in a form selenium understands, AFAIK).

For some other relevant links, see under "Misc Links" here (and for
that matter, the previous bullet point too):

http://wwwsearch.sourceforge.net/bits/GeneralFAQ.html
John

Jul 19 '05 #5
On Tue, 31 May 2005 00:52:33 -0700, Michele Simionato wrote:
I would like to know what is available for scripting browsers from
Python. For instance, webbrowser.open lets me perform GET requests, but I
would like to do POST requests too. I don't want to use urllib to emulate
a browser; I am interested in checking that browser X really works as
intended with my application. Any suggestions?

Michele Simionato

ClientForm http://wwwsearch.sourceforge.net/ClientForm/

I use it for automation of POSTs of entire image directories to
imagevenue.com/imagehigh.com/etc hosts.

It works on top of urllib2.

You access forms by name or index, then you access HTML elements as
dict items of the form.

It supports file upload within a POST.

The only drawbacks I've found are:
- It does not support nested forms (since forms are returned in a list).
- It does not like ill-formed HTML. It uses HTMLParser as the underlying
parser; you may pass a parser class as a parameter (say SGMLParser, for
greater acceptance of stupid HTML code), but it's tricky because there is
no well-defined parser interface.

Hope this helps.

Jul 19 '05 #6
This looks interesting, but I need an example here. What would be the
command to open Konqueror to a given page and to post a form with given
parameters? kde.org has tons of material, but I am getting lost and I
can't find anything relevant to my simple problem.

Michele Simionato

Jul 19 '05 #7
has
Simon Brunning wrote:
On 31 May 2005 00:52:33 -0700, Michele Simionato
<mi***************@gmail.com> wrote:
I would like to know what is available for scripting browsers from
Python.


I don't know of anything cross platform, or even cross browser, but on
Windows, IE can be automated via COM


On OS X you can use appscript
<http://freespace.virgin.net/hamish.sanderson/appscript.html>, either
directly via the application's scripting interface if it has one or
indirectly by manipulating its GUI via GUI Scripting.

HTH

Jul 19 '05 #8
I wanted to have a Python program make my browser do a POST. I am using
Firefox on Linux.

Here's what I did:
* Prepare an HTML page on the local disk that looks like this:
<html><body onload="document.forms[0].submit()">
<div style="display: none">
<form method=post accept-charset="utf-8" action="http://www.example.com/cgi-bin/example.cgi">
<input name=field1 value=value1>
<input name=field2 value=value2>
<textarea name=text>....</textarea>
<input type=submit name=blah>
</form>
</div>
Submitting form...
</body>
</html>

* Point the web browser at it. In my case, the webbrowser module didn't work
immediately, so I just used os.system() with a hardcoded browser name.
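
The two steps above can be sketched in Python roughly as follows. This is only an illustration, not the script described in this post: the function names build_autosubmit_page and post_via_browser are made up for the example, every field is emitted as a hidden input, and it relies on the standard webbrowser module (which, as noted, may not pick the right browser on every system, in which case a hardcoded browser command is the fallback).

```python
import html
import tempfile
import webbrowser
from pathlib import Path


def build_autosubmit_page(action, fields):
    """Return HTML for a form that submits itself via POST when loaded.

    Hypothetical helper: `action` is the target URL, `fields` maps
    form field names to string values.
    """
    inputs = "\n".join(
        '<input type="hidden" name="{}" value="{}">'.format(
            html.escape(name, quote=True), html.escape(value, quote=True)
        )
        for name, value in fields.items()
    )
    return (
        '<html><body onload="document.forms[0].submit()">\n'
        '<form method="post" accept-charset="utf-8" action="{}">\n'
        "{}\n"
        "</form>\n"
        "Submitting form...\n"
        "</body></html>\n"
    ).format(html.escape(action, quote=True), inputs)


def post_via_browser(action, fields):
    """Write the page to a temporary file and open it in the default browser."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".html", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(build_autosubmit_page(action, fields))
    # webbrowser.open returns False if no browser could be launched.
    return webbrowser.open(Path(handle.name).as_uri())
```

Calling post_via_browser("http://www.example.com/cgi-bin/example.cgi", {"field1": "value1"}) should make the real browser perform the POST, provided JavaScript is enabled there.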

Jeff

Jul 19 '05 #9
Olivier Favre-Simon <ol*****************@club-internet.fr> writes:
On Tue, 31 May 2005 00:52:33 -0700, Michele Simionato wrote:
I would like to know what is available for scripting browsers from
Python.
[...] ClientForm http://wwwsearch.sourceforge.net/ClientForm/

I use it for automation of POSTs of entire image directories to
imagevenue.com/imagehigh.com/etc hosts.
This doesn't actually address what the OP wanted: it's not a browser.

The only drawback I've found are:
- does not support nested forms (since forms are returned in a list)
Nested forms?? Good grief. Can you point me at a real life example
of such HTML? Can probably fix the parser to work around this.

- does not like ill-formed HTML (Uses HTMLParser as the underlying parser.
you may pass a parser class as parameter (say SGMLParser for greater
acceptance of stupid HTML code) but it's tricky because there is no well
defined parser interface)


Titus Brown says he's trying to fix sgmllib (to some extent, at least).

Also, you can always feed stuff through mxTidy.

I'd like to have a reimplementation of ClientForm on top of something
like BeautifulSoup...
John
Jul 19 '05 #10
Michele Simionato wrote:
This looks interesting, but I need an example here. What would be the
command to open Konqueror to a given page and to post a form with given
parameters?
Launch Konqueror, note the process ID (pid), and use the dcop command
line tool to open the page at a specified URL:

dcop konqueror-<pid> konqueror-mainwindow#1 openURL <URL>
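
From Python, the same command could be issued along these lines. This is a rough sketch assuming the dcop tool is on the PATH and Konqueror is already running; the helper names are invented for the example, and it uses the subprocess module in place of popen2, which is no longer available in current Python versions.

```python
import subprocess


def dcop_open_url_command(pid, url):
    """Build the argument list for the dcop call shown above."""
    return [
        "dcop",
        "konqueror-{}".format(pid),
        "konqueror-mainwindow#1",
        "openURL",
        url,
    ]


def open_in_konqueror(pid, url):
    """Run dcop; raises FileNotFoundError if dcop is not installed."""
    return subprocess.run(dcop_open_url_command(pid, url), check=True)
```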

Unfortunately, I don't think it's possible to manipulate the page
purely with DCOP, even with Python bindings, although I hope that
someone can prove me wrong.
kde.org has tons of material, but I am getting lost and I can't find
anything relevant to my simple problem.


A quick search revealed this discussion about using JavaScript with
DCOP:

http://lists.kde.org/?l=kfm-devel&m=103661664427286&w=2

This might be the best you can hope for with scripting outside the
browser. I've been trying to enable support for in-browser scripting
with Konqueror using KPart plugins, but this requires up to date
versions of sip, PyQt and PyKDE:

http://www.riverbankcomputing.co.uk

If you want to pursue that route, let me know and I'll try and tidy up
what I have.

David

Jul 19 '05 #11
On Wed, 01 Jun 2005 22:27:44 +0000, John J. Lee wrote:
Olivier Favre-Simon <ol*****************@club-internet.fr> writes:
On Tue, 31 May 2005 00:52:33 -0700, Michele Simionato wrote:
> I would like to know what is available for scripting browsers from
> Python.
[...]
ClientForm http://wwwsearch.sourceforge.net/ClientForm/

I use it for automation of POSTs of entire image directories to
imagevenue.com/imagehigh.com/etc hosts.


This doesn't actually address what the OP wanted: it's not a browser.


Yep. Didn't read with sufficient care. He really wants scripting not
webscraping.

The only drawback I've found are:
- does not support nested forms (since forms are returned in a list)
Nested forms?? Good grief. Can you point me at a real life example of
such HTML? Can probably fix the parser to work around this.


What I mean is: the parser does not detect a missing </form>, so it
thinks that there are nested forms, and raises a ParseError.

Browsers have an easier task at spotting non-matching form tags, because
they can use matching table or div tags around to imply that the form is
closed (DOM approach).

Not easy with a SAXish approach like HTMLParser.
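
To illustrate the problem, here is a small sketch of an event-based form collector. It is not ClientForm's actual code: the FormCollector class and count_forms function are made up for the example, it uses Python 3's html.parser, and it raises ValueError where ClientForm raises a ParseError.

```python
from html.parser import HTMLParser


class FormCollector(HTMLParser):
    """Collect forms in a flat list; fail on a <form> inside an open form."""

    def __init__(self):
        super().__init__()
        self.forms = []  # one dict of attributes per <form>
        self._form_open = False

    def handle_starttag(self, tag, attrs):
        if tag != "form":
            return
        if self._form_open:
            # An event-based parser sees only a stream of tags, so it cannot
            # tell a genuinely nested form from a missing </form>.
            raise ValueError("nested <form> (or missing </form>)")
        self._form_open = True
        self.forms.append(dict(attrs))

    def handle_endtag(self, tag):
        if tag == "form":
            self._form_open = False


def count_forms(markup):
    """Parse markup and return the number of forms found."""
    parser = FormCollector()
    parser.feed(markup)
    parser.close()
    return len(parser.forms)
```

With well-formed input such as two sibling forms this returns 2; if the first </form> is left out, the second <form> triggers the error even though a browser would recover.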

I don't mean nested forms should be supported, they are crap (is this even
legal code ?)

- does not like ill-formed HTML (Uses HTMLParser as the underlying
parser. you may pass a parser class as parameter (say SGMLParser for
greater acceptance of stupid HTML code) but it's tricky because there
is no well defined parser interface)


Titus Brown says he's trying to fix sgmllib (to some extent, at least).

Also, you can always feed stuff through mxTidy.

I'd like to have a reimplementation of ClientForm on top of something
like BeautifulSoup...
John


When taken separately, either ClientForm, HTMLParser or SGMLParser work
well.

But it would be cool if competent people in the HTML parsing domain joined
up and defined a base parser interface, the same way smart guys did with
WSGI for web servers.

So libs like ClientForm would not raise say an AttributeError if some
custom parser class does not implement a given attribute.

Adding an otherwise unused attribute to a parser just in case one day it
will interop with ClientForm sounds silly. And what if ClientForm changes
its attributes, etc.

No really, whatever the chosen codebase, a common parser interface would
be great.
Jul 19 '05 #12
On 31 May 2005 00:52:33 -0700, Michele Simionato
<mi***************@gmail.com> wrote:
I would like to know what is available for scripting browsers from
Python. For instance, webbrowser.open lets me perform GET requests, but I
would like to do POST requests too. I don't want to use urllib to emulate
a browser; I am interested in checking that browser X really works as
intended with my application. Any suggestions?

Michele Simionato


I use pbp, http://pbp.berlios.de/

It's essentially a Python command-line web browser suitable for testing
websites. It makes it easy to do things like:

go http://user:pass@mywebsite/secure/
follow Admin
follow Configure
formvalue config max_widgets 300
submit config

in a script, and then run that script at your leisure.

As it's designed for testing, everything you do is essentially an
assertion, so if anything fails it fails spectacularly with debug
messages and non-zero exit codes. You can also load Python code so
you can do arbitrary stuff.

--
Stephen Thorne
Development Engineer
Jul 19 '05 #13
Olivier Favre-Simon <ol*****************@club-internet.fr> writes:
[...]
I'd like to have a reimplementation of ClientForm on top of something
like BeautifulSoup...
John
When taken separately, either ClientForm, HTMLParser or SGMLParser work
well.

But it would be cool that competent people in the HTML parsing domain join
up, and define a base parser interface, the same way smart guys did with
WSGI for webservers.


Perhaps. Given a mythical fixed quantity of volunteer coding effort I
could assign to any HTML parsing project, I'd really prefer that
somebody separated out the HTML parsing, tree building and DOM code
from Mozilla and/or Konqueror.

So libs like ClientForm would not raise say an AttributeError if some
custom parser class does not implement a given attribute.

Adding an otherwise unused attribute to a parser just in case one day it
will interop with ClientForm sounds silly. And what if ClientForm changes
its attributes, etc.

[...]

I'm sorry, I didn't really follow that at all.

What I hoped to get from implementing the ClientForm interface on top
of something like BeautifulSoup was actually two things:

1. Better parsing

2. Access to a nice, and comprehensive, object model that lets you do
things with non-form elements, and the ability to move back and
forth between ClientForm and BeautifulSoup objects. I already did
this for the HTML DOM with DOMForm (unsupported), but for various
reasons the implementation is horrid, and since I no longer intend
to put in the effort to support JavaScript, I'd prefer a nicer tree
API than the DOM.
John
Jul 19 '05 #14
[Michele Simionato]
I would like to know what is available for scripting browsers from
Python. [...] to do POST requests too. I don't want to use urllib to
emulate a browser, I am interested in checking that browser X really
works as intended with my application. Any suggestion?
[...]
[Stephen Thorne] I use pbp, http://pbp.berlios.de/

[...]

Again, that doesn't do what Michele wants.
John
Jul 19 '05 #15
