using python to visit web sites and print the web sites' images to files

imx
Hi there,

I wonder whether python can be used to simulate a real user doing the
following:
1) open a web site in a browser;
2) press Print Screen to copy the image of the current active window
to the clipboard;
3) save the image to a real file

Any pointers will be appreciated!

Xiong

Mar 12 '07 #1
On Mar 12, 7:32 am, "imx" <xiong.xu...@gmail.com> wrote:
> I wonder whether python can be used to simulate a real user doing the
> following:
> 1) open a web site in a browser;
> 2) press Print Screen to copy the image of the current active window
> to the clipboard;
> 3) save the image to a real file
> Any pointers will be appreciated!
> Xiong
Google pywinauto.

HTH

Davy
Mar 12 '07 #2
> I wonder whether python can be used to simulate a real user doing the
> following:
> 1) open a web site in a browser;
> 2) press Print Screen to copy the image of the current active window
> to the clipboard;
> 3) save the image to a real file
>
> Any pointers will be appreciated!
Which OS?
Mar 12 '07 #3
You can definitely create a web bot with python. It doesn't require
that you "drive" a real web browser. There are libraries to open web
pages, scrape their contents, and download files. That would make your
bot platform neutral. Driving a GUI browser risks producing a brittle
script that might not handle different browsers, different platforms,
or even different versions of the same browser.
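
A minimal sketch of the "no browser needed" approach, using only the
standard library of current Python 3. The function name fetch_html is
made up for illustration; it is not part of any library mentioned in
this thread.

```python
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Download a page and return its body as text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        # Use the charset from the Content-Type header if the server sent one;
        # falling back to UTF-8 is an assumption, not a guarantee.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_html with a page URL returns the raw HTML only. Nothing is
rendered, so this covers crawling but not what the page looks like.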

I run a mediawiki web site, and found a handy python-based library
written to manage it called pywikipediabot at http://sourceforge.net/projects/pywikipediabot/.

Okay, this library won't do your leg work for you, but it has pieces
and parts that demonstrate how to use python to surf a web site. Then,
with an HTML parser, you can hunt down images.
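
A rough sketch of hunting down images with an HTML parser. It uses the
standard library's html.parser rather than pywikipediabot, and the names
ImageCollector and find_images are invented for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect the absolute URL of every <img src=...> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page's own URL.
                self.images.append(urljoin(self.base_url, src))


def find_images(html, base_url):
    """Return the image URLs found in html, in document order."""
    collector = ImageCollector(base_url)
    collector.feed(html)
    return collector.images
```

The resulting URLs can then be downloaded one by one with whatever
fetching code you already have.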

Greg

Mar 12 '07 #4
Goldfish wrote:
> I run a mediawiki web site, and found a handy python-based library
> written to manage it called pywikipediabot at
> http://sourceforge.net/projects/pywikipediabot/.
This sounds interesting. My daughter had a nightmare that a hacker
invaded her Orkut and blanked all 1500+ scraps. This is not impossible.
Maybe I should save the contents to a file...

Alberto Monteiro

Mar 12 '07 #5
Goldfish wrote:
> You can definitely create a web bot with python. It doesn't require
> that you "drive" a real web browser.
That's true, but if you want to print the page to a file, you need
something that can reproduce the intended layout. The Pyglet library
developers mention "XML/HTML+CSS" as something the layout engine can
deal with, which sounds quite impressive if its support of CSS is
comprehensive:

http://pyglet.org/

Paul

Mar 12 '07 #6
imx
On Mar 13, 4:26 am, "Paul Boddie" <p...@boddie..org.uk> wrote:
> Goldfish wrote:
> > You can definitely create a web bot with python. It doesn't require
> > that you "drive" a real web browser.
>
> That's true, but if you want to print the page to a file, you need
> something that can reproduce the intended layout. The Pyglet library
> developers mention "XML/HTML+CSS" as something the layout engine can
> deal with, which sounds quite impressive if its support of CSS is
> comprehensive:
>
> http://pyglet.org/
>
> Paul
Thanks for all the replies.
I will check pyglet to see if it can help.

The reason I want to simulate a user rather than just crawl is this: we
have to check the front pages of many web sites to see whether they
conform to our visual standard, e.g. that a search box appears in the
top part of the page. That is tedious work for a human. So I want to
'crawl and save the visual presentation of the web site automatically',
and check these image files later with human eyes.
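
As a possible pre-filter before the visual check, here is a sketch that
only tests whether the HTML contains something that looks like a search
box. The heuristic (an input of type "search", or "search" in its name,
id, or placeholder) is an assumption made for this example, and it says
nothing about where the box ends up on the rendered page.

```python
from html.parser import HTMLParser


class SearchBoxFinder(HTMLParser):
    """Set .found to True if any <input> looks like a search field."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = {name: (value or "").lower() for name, value in attrs}
        if attributes.get("type") == "search":
            self.found = True
        elif any("search" in attributes.get(key, "")
                 for key in ("name", "id", "placeholder")):
            self.found = True


def has_search_box(html):
    """Heuristic check only; pages that fail still need a human look."""
    finder = SearchBoxFinder()
    finder.feed(html)
    return finder.found
```

Pages that fail this check could be looked at first, but the screenshots
are still needed to judge the actual layout.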

-Xiong

Mar 13 '07 #7
imx
On Mar 13, 12:39 am, "daftspan...@gmail.com" <daftspan...@gmail.com> wrote:
> On Mar 12, 7:32 am, "imx" <xiong.xu...@gmail.com> wrote:
> > I wonder whether python can be used to simulate a real user doing the
> > following:
> > 1) open a web site in a browser;
> > 2) press Print Screen to copy the image of the current active window
> > to the clipboard;
> > 3) save the image to a real file
> > Any pointers will be appreciated!
> > Xiong
>
> Google pywinauto.
>
> HTH
>
> Davy
I checked pyglet; it's at an early stage of development. Since I'm using
Windows, I will try pywinauto.
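
The three steps from the original question can be sketched as the loop
below. This is not pywinauto code. By default it assumes the third-party
Pillow package is installed and uses its ImageGrab.grab(), which captures
the whole screen rather than only the active window. The names
capture_sites, screenshot_name and default_grab are made up for this
example, and the fixed wait is a crude stand-in for "the page has
finished loading".

```python
import time
import webbrowser
from pathlib import Path
from urllib.parse import urlparse


def default_grab():
    """Grab the full screen with Pillow (assumption: Pillow is installed)."""
    from PIL import ImageGrab  # third-party, imported only when needed
    return ImageGrab.grab()


def screenshot_name(url):
    """Derive a safe PNG file name from the URL's host part."""
    host = urlparse(url).netloc or url
    safe = "".join(c if c.isalnum() or c in ".-" else "_" for c in host)
    return safe + ".png"


def capture_sites(urls, out_dir, open_url=webbrowser.open,
                  grab=default_grab, wait=5.0):
    """Open each URL, wait, capture an image, and save it under out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        open_url(url)      # 1) open the site in the default browser
        time.sleep(wait)   # give the page time to load (a guess, not a check)
        image = grab()     # 2) capture the image
        path = out_dir / screenshot_name(url)
        image.save(str(path))  # 3) save it to a real file
        saved.append(path)
    return saved
```

Because open_url and grab are passed in as parameters, a window-level
capture from a tool such as pywinauto could replace the full-screen grab
without changing the loop.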

Thanks,
Xiong

Mar 13 '07 #8
> The reason I want to simulate a user rather than just crawl is this: we
> have to check the front pages of many web sites to see whether they
> conform to our visual standard, e.g. that a search box appears in the
> top part of the page. That is tedious work for a human. So I want to
> 'crawl and save the visual presentation of the web site automatically',
> and check these image files later with human eyes.
>
> -Xiong
Hi Xiong,

I have been working on a program that does something very similar: it
generates thumbnails of websites.

The code is in IronPython (which may put you off!) and would need to be
modified or scripted with pywinauto to deal with multiple images.

Let me know if it is of use to you and I will upload it.

Cheers,
Davy

Mar 13 '07 #9
imx
On Mar 14, 5:44 am, "daftspan...@gmail.com" <daftspan...@gmail.com> wrote:
> > The reason I want to simulate a user rather than just crawl is this: we
> > have to check the front pages of many web sites to see whether they
> > conform to our visual standard, e.g. that a search box appears in the
> > top part of the page. That is tedious work for a human. So I want to
> > 'crawl and save the visual presentation of the web site automatically',
> > and check these image files later with human eyes.
> > -Xiong
>
> Hi Xiong,
>
> I have been working on a program that does something very similar: it
> generates thumbnails of websites.
>
> The code is in IronPython (which may put you off!) and would need to be
> modified or scripted with pywinauto to deal with multiple images.
>
> Let me know if it is of use to you and I will upload it.
>
> Cheers,
> Davy
Cool, but does that mean I will need .NET to run the code?

Xiong

Mar 14 '07 #10
On Mar 14, 9:02 am, "imx" <xiong.xu...@gmail.com> wrote:
> Cool, but does that mean I will need .NET to run the code?
Yep, though the runtime is free, as is IronPython. My program is
released under the BSD license.

Cheers,
Davy

Mar 14 '07 #11
