
How to prevent the script from stopping before it should

I have a script that downloads some web pages. The problem is that, sometimes, after downloading a few pages the script hangs (stops).
(But sometimes it runs all the way to the end and downloads all the pages I want.)
I think the script stops when the connection to the server I am downloading from is poor.
Is there a way to prevent the script from hanging before all the pages are downloaded?

Thanks for help
Lad.

Jul 18 '05 #1
7 Replies


import urllib, sys

pages = ['http://www.python.org', 'http://xxx']
for i in pages:
    try:
        u = urllib.urlopen(i)
        print u.geturl()
    except Exception, e:
        print >> sys.stderr, '%s: %s' % (e.__class__.__name__, e)

This prints an error if a page fails to open; the rest open fine.

Jul 18 '05 #2

wi******@hotmail.com wrote:
import urllib, sys

pages = ['http://www.python.org', 'http://xxx']
for i in pages:
    try:
        u = urllib.urlopen(i)
        print u.geturl()
    except Exception, e:
        print >> sys.stderr, '%s: %s' % (e.__class__.__name__, e)

This prints an error if a page fails to open; the rest open fine.

More generally you may wish to use the timeout features of TCP sockets.
These were introduced in Python 2.3, though Tim O'Malley's module
"timeoutsocket" (which was the inspiration for the 2.3 upgrade) was
available for earlier versions.

You will need to import the socket module and then call
socket.setdefaulttimeout() to ensure that communication with
non-responsive servers results in a socket exception that you can trap.
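For example, something along these lines, as a sketch only (the 30-second value and the URLs are just illustrations):

import socket
import urllib
import sys

socket.setdefaulttimeout(30)    # seconds; blocking socket operations that exceed this raise socket.timeout

pages = ['http://www.python.org', 'http://xxx']
for page in pages:
    try:
        data = urllib.urlopen(page).read()
        print page, len(data)
    except (IOError, socket.timeout), e:
        print >> sys.stderr, '%s failed: %s' % (page, e)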

regards
Steve
--
Steve Holden http://www.holdenweb.com/
Python Web Programming http://pydish.holdenweb.com/
Holden Web LLC +1 703 861 4237 +1 800 494 3119
Jul 18 '05 #3


Steve Holden wrote:
[wi******@hotmail.com's urllib example snipped]

More generally you may wish to use the timeout features of TCP sockets. These were introduced in Python 2.3, though Tim O'Malley's module "timeoutsocket" (which was the inspiration for the 2.3 upgrade) was available for earlier versions.

You will need to import the socket module and then call socket.setdefaulttimeout() to ensure that communication with non-responsive servers results in a socket exception that you can trap.


Thank you wi******@hotmail.com and Steve for the ideas. Detecting that the script has hung is not a big problem.
What I need, however, is a solution where I do not have to start the script again by hand; the script should restart itself. I am thinking of two threads: a main (master) thread that supervises a slave thread. The slave thread downloads the pages, and whenever there is a timeout the master thread restarts the slave thread.
Is that a good solution? Or is there a better one?
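Roughly what I have in mind, as a sketch only (the URLs and the 60-second limit are made up, and I realise Python cannot forcibly kill a thread that is truly stuck, so the slave would still need its own socket timeout):

import threading
import urllib

pages = ['http://www.python.org', 'http://xxx']
results = {}

def fetch(url):
    # slave thread: download one page
    try:
        results[url] = urllib.urlopen(url).read()
    except Exception:
        results[url] = None

for url in pages:
    worker = threading.Thread(target=fetch, args=(url,))
    worker.start()
    worker.join(60)              # master waits at most 60 seconds for the slave
    if worker.isAlive():
        print 'gave up on', url  # slave is stuck; retry this page later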
Thanks for help
Lad

Jul 18 '05 #4

Steve Holden wrote:
You will need to import the socket module and then call socket.setdefaulttimeout() to ensure that
communication with non-responsive servers results in a socket exception that you can trap.


Or you can use asynchronous sockets, so your program can keep processing the sites that do respond while it's waiting for the ones that don't. For one way to do that, see "Using HTTP to Download Files" here:

http://effbot.org/zone/effnews-1.htm

(make sure you read the second and third articles as well)
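a minimal sketch of the idea (not the code from those articles; the host, path, and loop timeout below are made up):

import asyncore
import socket

class HTTPClient(asyncore.dispatcher):
    # tiny asynchronous HTTP/1.0 GET client
    def __init__(self, host, path):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connect((host, 80))
        self.request = 'GET %s HTTP/1.0\r\nHost: %s\r\n\r\n' % (path, host)
        self.data = ''

    def handle_connect(self):
        pass

    def writable(self):
        return len(self.request) > 0

    def handle_write(self):
        sent = self.send(self.request)
        self.request = self.request[sent:]

    def handle_read(self):
        self.data = self.data + self.recv(8192)

    def handle_close(self):
        self.close()
        print 'got %d bytes' % len(self.data)

# several of these can be created before the loop starts
HTTPClient('www.python.org', '/')
asyncore.loop(timeout=1)   # 1 second is just the select() interval, not an overall limit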

</F>

Jul 18 '05 #5


Fredrik Lundh wrote:
[Steve Holden's timeout advice and the asynchronous-sockets suggestion snipped; see the previous post]

Dear Fredrik Lundh,
Thank you for the link. I checked it, but I have not found an answer to my question.
My problem is that sometimes I cannot finish downloading all the pages. Sometimes my script freezes and I can do nothing but restart it from the last successfully downloaded web page. There is no error message saying that something went wrong. I do not know why; maybe the server is programmed to limit the number of connections, or there may be other reasons. So my idea was two threads: a master supervising a slave thread that would do the downloading, and if the slave thread stopped, the master thread would start another slave. Is that a good solution? Or is there a better one?
Thanks for help
Lad

Jul 18 '05 #6


Steve Holden wrote:
[wi******@hotmail.com's urllib example snipped]

More generally you may wish to use the timeout features of TCP sockets. These were introduced in Python 2.3, though Tim O'Malley's module "timeoutsocket" (which was the inspiration for the 2.3 upgrade) was available for earlier versions.

You will need to import the socket module and then call socket.setdefaulttimeout() to ensure that communication with non-responsive servers results in a socket exception that you can trap.
So adding:

import socket
socket.setdefaulttimeout(30)   # timeout in seconds; the value is up to you

is that *necessary* in order to avoid hangs when using urllib2 to fetch web resources?
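The pattern I have in mind is roughly this, as a sketch only (the URL and the 30-second value are just examples):

import socket
import urllib2

socket.setdefaulttimeout(30)   # applies to the sockets urllib2 creates

try:
    data = urllib2.urlopen('http://www.python.org/').read()
except (urllib2.URLError, socket.timeout), e:
    print 'download failed:', e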

Regards,

Fuzzy
http://www.voidspace.org.uk/python/index.shtml


Jul 18 '05 #7


Fuzzyman wrote:
[Steve Holden's timeout advice and the socket.setdefaulttimeout()/urllib2 question snipped; see the previous post]


Fuzzy,
I use httplib with timeoutsocket, but no timeout is ever raised; the script just freezes sometimes. I suspect the server I download the pages from does that to limit traffic. I must restart my script by hand.
Do you think urllib2 would be better?
Or is there a better solution?
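What I would like to end up with is roughly this, as a sketch only (the host, path, timeout, and retry count are placeholders):

import socket
import httplib

socket.setdefaulttimeout(30)     # seconds; also covers httplib connections

def fetch(host, path, retries=3):
    # retry a few times instead of restarting the whole script by hand
    for attempt in range(retries):
        try:
            conn = httplib.HTTPConnection(host)
            conn.request('GET', path)
            return conn.getresponse().read()
        except (socket.error, socket.timeout), e:
            print 'attempt %d for %s failed: %s' % (attempt + 1, path, e)
    return None

page = fetch('www.python.org', '/')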
Regards,
Lad

Jul 18 '05 #8
