Using Python to visit web sites and save each site's image to a file

imx
Hi there,

I wonder whether Python can be used to simulate a real user doing the
following:
1) open a web site in a browser;
2) press Print Screen, so that the image of the currently active
window is copied to the clipboard;
3) save that image to a file.

Any pointers will be appreciated!

Xiong

Mar 12 '07 #1
On Mar 12, 7:32 am, "imx" <xiong.xu...@gmail.com> wrote:
> I wonder whether Python can be used to simulate a real user doing the
> following:
> 1) open a web site in a browser;
> 2) press Print Screen, so that the image of the currently active
> window is copied to the clipboard;
> 3) save that image to a file.
> Any pointers will be appreciated!
> Xiong

Google pywinauto.

HTH

Davy
Mar 12 '07 #2
>
> I wonder whether Python can be used to simulate a real user doing the
> following:
> 1) open a web site in a browser;
> 2) press Print Screen, so that the image of the currently active
> window is copied to the clipboard;
> 3) save that image to a file.
>
> Any pointers will be appreciated!

Which OS?
Mar 12 '07 #3
You can definitely create a web bot with Python. It doesn't require
that you "drive" a real web browser. There are libraries to open web
pages, scrape their contents, and download files. That would make your
bot platform neutral. Driving a GUI browser risks producing a brittle
script that might not handle different browsers, different platforms,
or even different versions of the same browser.

I run a mediawiki web site, and found a handy python-based library
written to manage it called pywikipediabot at http://sourceforge.net/projects/pywikipediabot/.

Okay, this library won't do your legwork for you, but it has pieces
and parts that demonstrate how to use Python to surf a web site. Then,
with an HTML parser, you can hunt down images.
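
As a minimal sketch of that last step, assuming only the Python standard library, something along these lines would download a page and list the images it references. The names `ImageCollector`, `find_images`, and `fetch_images` are made up for illustration and are not part of pywikipediabot.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ImageCollector(HTMLParser):
    """Collect the absolute URL of every <img src=...> seen while parsing."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; <img> without src is skipped.
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths such as "/logo.png" against the page URL.
            self.images.append(urljoin(self.base_url, src))


def find_images(html, base_url):
    """Return the image URLs referenced by an HTML string, in document order."""
    collector = ImageCollector(base_url)
    collector.feed(html)
    return collector.images


def fetch_images(url):
    """Download a page and return the image URLs it references."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return find_images(html, url)
```

Note that this only finds the image files a page links to; it does not render the page, so it cannot by itself produce a picture of what the page looks like.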

Greg

Mar 12 '07 #4
Goldfish wrote:
> I run a mediawiki web site, and found a handy python-based library
> written to manage it called pywikipediabot at
> http://sourceforge.net/projects/pywikipediabot/.

This sounds interesting. My daughter had a nightmare that a hacker
invaded her Orkut and blanked all 1500+ scraps. This is not impossible.
Maybe I should save the contents to a file...

Alberto Monteiro

Mar 12 '07 #5
Goldfish wrote:
> You can definitely create a web bot with Python. It doesn't require
> that you "drive" a real web browser.

That's true, but if you want to print the page to a file, you need
something that can reproduce the intended layout. The Pyglet library
developers mention "XML/HTML+CSS" as something the layout engine can
deal with, which sounds quite impressive if its support of CSS is
comprehensive:

http://pyglet.org/

Paul

Mar 12 '07 #6
imx
On Mar 13, 4:26 am, "Paul Boddie" <p...@boddie..org.uk> wrote:
> Goldfish wrote:
> > You can definitely create a web bot with Python. It doesn't require
> > that you "drive" a real web browser.
>
> That's true, but if you want to print the page to a file, you need
> something that can reproduce the intended layout. The Pyglet library
> developers mention "XML/HTML+CSS" as something the layout engine can
> deal with, which sounds quite impressive if its support of CSS is
> comprehensive:
>
> http://pyglet.org/
>
> Paul

Thanks for all the replies.
I will check pyglet to see if it can help.

The reason I want to simulate a user rather than just crawl is that we
have to check the front pages of many web sites to see whether they
conform to our visual standard, e.g., that a search box appears in the
top part of the page. That is tedious work for a human. So I want to
"crawl and save the visual presentation of each web site
automatically" and check the image files later by eye.
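
A rough sketch of such a crawl-and-save loop might look like the following. It is only an illustration under several assumptions: it relies on the third-party Pillow library, whose `ImageGrab.grab()` captures the whole screen rather than only the active browser window; it uses a fixed sleep as a crude stand-in for "the page has finished loading"; and the names `screenshot_name` and `capture` are hypothetical.

```python
import re
import time
import webbrowser
from pathlib import Path
from urllib.parse import urlparse


def screenshot_name(url):
    """Turn a URL into a filesystem-safe PNG file name (query string ignored)."""
    parsed = urlparse(url)
    raw = parsed.netloc + parsed.path
    safe = re.sub(r"[^A-Za-z0-9.-]+", "_", raw).strip("_")
    return (safe or "page") + ".png"


def capture(url, out_dir, wait_seconds=5.0):
    """Open url in the default browser, grab the screen, and save it as a PNG."""
    # Imported here so that screenshot_name() works even without Pillow installed.
    from PIL import ImageGrab

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    webbrowser.open(url)       # step 1: open the site in a browser
    time.sleep(wait_seconds)   # assumption: the page renders within this time
    image = ImageGrab.grab()   # step 2: capture the full screen
    target = out_dir / screenshot_name(url)
    image.save(target)         # step 3: write the image to a file
    return target
```

Calling `capture("http://example.com/", "screenshots")` for each site in a list should leave one PNG per site in the `screenshots` folder for later review. Cropping to the browser window, or waiting until the page has really loaded, would need something like pywinauto on top of this.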

-Xiong

Mar 13 '07 #7
imx
On Mar 13, 12:39 am, "daftspan...@gmail.com" <daftspan...@gmail.com> wrote:
> On Mar 12, 7:32 am, "imx" <xiong.xu...@gmail.com> wrote:
> > I wonder whether Python can be used to simulate a real user doing the
> > following:
> > 1) open a web site in a browser;
> > 2) press Print Screen, so that the image of the currently active
> > window is copied to the clipboard;
> > 3) save that image to a file.
> > Any pointers will be appreciated!
> > Xiong
>
> Google pywinauto.
>
> HTH
>
> Davy

I checked pyglet; it's at an early stage of development. Since I'm
using Windows, I will try pywinauto.

Thanks,
Xiong

Mar 13 '07 #8
> The reason I want to simulate a user rather than just crawl is that we
> have to check the front pages of many web sites to see whether they
> conform to our visual standard, e.g., that a search box appears in the
> top part of the page. That is tedious work for a human. So I want to
> "crawl and save the visual presentation of each web site
> automatically" and check the image files later by eye.
>
> -Xiong

Hi Xiong,

I have been working on a program that does something very similar: it
generates thumbnails of websites.

The code is in IronPython (which may put you off!) and would need to be
modified, or scripted with pywinauto, to deal with multiple images.

Let me know if it is of use to you and I will upload it.

Cheers,
Davy

Mar 13 '07 #9
imx
On Mar 14, 5:44 am, "daftspan...@gmail.com" <daftspan...@gmail.com> wrote:
> > The reason I want to simulate a user rather than just crawl is that we
> > have to check the front pages of many web sites to see whether they
> > conform to our visual standard, e.g., that a search box appears in the
> > top part of the page. That is tedious work for a human. So I want to
> > "crawl and save the visual presentation of each web site
> > automatically" and check the image files later by eye.
> > -Xiong
>
> Hi Xiong,
>
> I have been working on a program that does something very similar: it
> generates thumbnails of websites.
>
> The code is in IronPython (which may put you off!) and would need to be
> modified, or scripted with pywinauto, to deal with multiple images.
>
> Let me know if it is of use to you and I will upload it.
>
> Cheers,
> Davy

Cool, but does that mean I will need .NET to run the code?

Xiong

Mar 14 '07 #10
On Mar 14, 9:02 am, "imx" <xiong.xu...@gmail.com> wrote:
> Cool, but does that mean I will need .NET to run the code?

Yep, although the runtime is free, as is IronPython. My program is
under the BSD license.

Cheers,
Davy

Mar 14 '07 #11
