
How to download a web page just like a web browser does?

Hi,
It has been about a month since I last posted to this list.
Yes, I use Python; I use it in my normal everyday work.
Whenever I need to write a script or do some other small-scale
job, Python is the first tool I consider. I must say it is a
powerful tool for me, and, more importantly, there is a
friendly and flourishing community here.
Oh, I must stop the appreciation and get to my question now.

Every day I receive a newsletter from the NYTimes, but I
don't want to read the news in the evening, when the letter
arrives. So I want to download the web page that contains
the news and read it the next morning! I have decided to
use Python to write a tool that is fed a URL and downloads
the page to my disk, just like the browser's "Save As..."
function. I know what I need to do to accomplish that:
parse the web page, download all the pages referenced in the
page, rewrite all the links so that they point to the
corresponding files on the local disk, and ...

So, has anyone already implemented a similar function?
Can anyone share some code with me? I really don't want to
write confusing code for things such as text searches and
substitutions.

Thanks in advance!
Bo

Aug 23 '06 #1
Why don't you just use the `wget` program? It's not written in Python, but
it's much easier to use than writing the functionality yourself.
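
For anyone who still wants the download wrapped in a Python script, here is a minimal sketch that simply hands the work to `wget`. It assumes GNU wget is installed and on the PATH; the function names `build_wget_command` and `save_page` and the `saved_pages` directory are made up for illustration and are not from this thread.

```python
import shutil
import subprocess


def build_wget_command(url, dest_dir="saved_pages"):
    """Return the wget argument list for saving one page with its assets."""
    return [
        "wget",
        "--page-requisites",   # also fetch the images, CSS, and scripts the page needs
        "--convert-links",     # rewrite links so they point at the local copies
        "--adjust-extension",  # add .html/.css suffixes where the URL lacks them
        "--span-hosts",        # allow assets that live on other hosts
        "--directory-prefix=" + dest_dir,  # hypothetical output directory
        url,
    ]


def save_page(url, dest_dir="saved_pages"):
    """Run wget; raises CalledProcessError if wget reports a failure."""
    if shutil.which("wget") is None:
        raise RuntimeError("wget is not installed or not on PATH")
    subprocess.run(build_wget_command(url, dest_dir), check=True)
```

Calling `save_page(url)` with the newsletter URL should leave a locally browsable copy under `saved_pages/`. Keeping the command construction in its own function makes it easy to inspect or adjust the flags without touching the network.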

Ciao,
Marc 'BlackJack' Rintsch
Aug 23 '06 #2
This tool already exists: wget

Gabriel Genellina
Softlab SRL


Aug 23 '06 #3
You can also try HarvestMan:

http://harvestman.freezope.org/

Aug 23 '06 #4

Thank you, Max!
I think HarvestMan is just what I need!
Thanks again!

Aug 24 '06 #5
Mechanize (http://wwwsearch.sourceforge.net/mechanize/) is another
option; it can even fill out forms!

--
mvh Björn
Aug 24 '06 #6
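
For readers who want to see what the steps in the original question (parse the page, download what it references, rewrite the links) look like without an external tool, here is a rough sketch using only the Python 3 standard library. It is not the approach of any tool named in this thread. All names (`LinkRewriter`, `rewrite`, `save_page`, `local_name`, the `assets` directory) are hypothetical, and it is deliberately limited: it only handles `img`, `script`, and `link` tags, ignores `url(...)` references inside CSS, drops processing instructions, and does not follow links to other pages.

```python
import hashlib
import html
import os
import posixpath
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit
from urllib.request import urlopen

# (tag, attribute) pairs that name resources a browser fetches to render a page.
ASSET_ATTRS = {("img", "src"), ("script", "src"), ("link", "href")}


def local_name(url):
    """Map an absolute URL to a flat local file name that keeps its extension."""
    ext = posixpath.splitext(urlsplit(url).path)[1]
    return hashlib.sha256(url.encode("utf-8")).hexdigest()[:12] + ext


class LinkRewriter(HTMLParser):
    """Re-emit the HTML it is fed, pointing asset links at local files."""

    def __init__(self, base_url, assets_dir):
        super().__init__(convert_charrefs=False)
        self.base_url = base_url
        self.assets_dir = assets_dir
        self.out = []     # pieces of the rewritten document
        self.assets = {}  # absolute URL -> local relative path

    def _emit_tag(self, tag, attrs, closer):
        parts = [tag]
        for name, value in attrs:
            if value is None:  # bare attribute such as "disabled"
                parts.append(name)
                continue
            if (tag, name) in ASSET_ATTRS:
                absolute = urljoin(self.base_url, value)
                value = self.assets.setdefault(
                    absolute, self.assets_dir + "/" + local_name(absolute))
            parts.append('%s="%s"' % (name, html.escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + closer)

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def rewrite(page_html, base_url, assets_dir="assets"):
    """Return (rewritten_html, {absolute_url: local_path}); does no I/O."""
    parser = LinkRewriter(base_url, assets_dir)
    parser.feed(page_html)
    parser.close()
    return "".join(parser.out), parser.assets


def save_page(url, out_file="page.html", assets_dir="assets"):
    """Download url and its assets, then write a locally browsable copy."""
    with urlopen(url) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        page_html = resp.read().decode(charset, errors="replace")
    rewritten, assets = rewrite(page_html, url, assets_dir)
    os.makedirs(assets_dir, exist_ok=True)
    for absolute, local in assets.items():
        try:
            with urlopen(absolute) as resp, open(local, "wb") as fh:
                fh.write(resp.read())
        except OSError as exc:  # URLError is a subclass of OSError
            print("skipped %s: %s" % (absolute, exc))
    with open(out_file, "w", encoding="utf-8") as fh:
        fh.write(rewritten)
```

Because the parser re-emits the document tag by tag, the link rewriting needs no regular-expression search and replace, which is what the original post wanted to avoid. `save_page` writes the asset paths relative to the current directory, so `out_file` is assumed to live there too. The tools suggested in the thread cover far more cases than this sketch does.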
