
urllib2 hangs "forever" where there is no network interface

I have written a script that uses the urllib2 module to download web
pages for parsing.

If there is no network interface, urllib2 hangs for a very long time
before it raises an exception. I have set the socket timeout with
socket.setdefaulttimeout(), however, where there is no network
interface, this seems to be ignored - presumably, because without a
network interface, there is nothing for the socket module to interact
with.

So, can someone point me in the right direction, so that I can catch
an exception where there is no network interface?

Feb 1 '07 #1
"dumbkiwi" <dm*****@gmail.com> writes:
> I have written a script that uses the urllib2 module to download web
> pages for parsing.
>
> If there is no network interface, urllib2 hangs for a very long time
> before it raises an exception. I have set the socket timeout with
> socket.setdefaulttimeout(), however, where there is no network
> interface, this seems to be ignored - presumably, because without a
> network interface, there is nothing for the socket module to interact
> with.
>
> So, can someone point me in the right direction, so that I can catch
> an exception where there is no network interface?

Are you on Windows or something Unixy?

Presumably Windows? (Unix systems almost always have at least a
loopback interface)

John

Feb 1 '07 #2
On Feb 2, 5:02 am, j...@pobox.com (John J. Lee) wrote:
> "dumbkiwi" <dmbk...@gmail.com> writes:
> > I have written a script that uses the urllib2 module to download web
> > pages for parsing.
> >
> > If there is no network interface, urllib2 hangs for a very long time
> > before it raises an exception. I have set the socket timeout with
> > socket.setdefaulttimeout(), however, where there is no network
> > interface, this seems to be ignored - presumably, because without a
> > network interface, there is nothing for the socket module to interact
> > with.
> >
> > So, can someone point me in the right direction, so that I can catch
> > an exception where there is no network interface?
>
> Are you on Windows or something Unixy?

Linux

> Presumably Windows? (Unix systems almost always have at least a
> loopback interface)
>
> John

Sorry, I should have been more specific. The network interfaces are
up - ie lo and eth1, it's where the wireless connection has dropped
out. Is the best solution to test for a wireless connection through
/proc before trying to download data?

Feb 1 '07 #3
(I'm having news trouble, sorry if anybody sees a similar reply three
times...)

"dumbkiwi" <dm*****@gmail.com> writes:
> On Feb 2, 5:02 am, j...@pobox.com (John J. Lee) wrote:
> > "dumbkiwi" <dmbk...@gmail.com> writes:
> [...]
> > > If there is no network interface, urllib2 hangs for a very long time
> > > before it raises an exception. I have set the socket timeout with
> > > socket.setdefaulttimeout(), however, where there is no network
> > > interface, this seems to be ignored - presumably, because without a
> > > network interface, there is nothing for the socket module to interact
> > > with.
> [...]
> > Presumably Windows? (Unix systems almost always have at least a
> > loopback interface)
> >
> > John
>
> Sorry, I should have been more specific. The network interfaces are
> up - ie lo and eth1, it's where the wireless connection has dropped
> out.

The underlying problem is that Python's socket timeout is implemented
using select() or poll(). Those system calls only allow timing out
activity on file descriptors (e.g. sockets). The problem you're
seeing is caused by getaddrinfo() blocking for a long time, and that
function doesn't involve file descriptors. The problem should really
be fixed at the C level (in Modules/socketmodule.c), using something
like alarm() or a thread to apply a timeout to getaddrinfo() calls.
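
A minimal Python-level sketch of the "thread with a timeout" idea, for illustration only. It does not fix socketmodule.c and it does not make urlopen() itself time out, because urllib2 still performs its own lookup. It only bounds how long the caller waits for a lookup it starts itself. The names resolve_with_timeout and ResolveTimeout are hypothetical, the `resolver` parameter exists only so the sketch can be exercised offline, and the code assumes Python 3 (where the `daemon` keyword is available).

```python
import socket
import threading


class ResolveTimeout(Exception):
    """Raised when name resolution does not finish within the deadline."""


def resolve_with_timeout(host, port, timeout, resolver=socket.getaddrinfo):
    """Run a blocking resolver in a daemon thread; wait at most `timeout` seconds."""
    outcome = {}

    def worker():
        try:
            outcome["value"] = resolver(host, port)
        except Exception as exc:  # e.g. socket.gaierror
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        # The lookup is still blocked inside the C library. We cannot cancel
        # it; we only stop waiting and leave the daemon thread behind.
        raise ResolveTimeout("lookup of %r exceeded %.1f s" % (host, timeout))
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

A caller could try `resolve_with_timeout("www.python.org", 80, 5.0)` before calling urlopen() and treat ResolveTimeout as "network unavailable". The abandoned thread keeps running until the C library gives up, so this trades a hang for a leaked thread.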

> Is the best solution to test for a wireless connection through
> /proc before trying to download data?

That may be a good practical solution.
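
A rough sketch of such a check, under stated assumptions: that the kernel exposes /proc/net/wireless in the usual wireless-extensions layout (two header lines, then one "iface: status link level noise ..." line per interface), and that a link quality of zero or a missing entry means the connection has dropped. Neither assumption is confirmed in this thread, and some drivers do not populate the file at all. The function names are hypothetical; "eth1" is the interface mentioned above.

```python
def parse_wireless(text):
    """Return {interface: link_quality} from /proc/net/wireless-style text."""
    links = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or len(fields) < 2:
            continue
        # fields[0] is the status word, fields[1] the link quality (e.g. "54.")
        try:
            links[name.strip()] = float(fields[1])
        except ValueError:
            continue
    return links


def wireless_link_up(iface, path="/proc/net/wireless"):
    """True if `iface` is listed with a non-zero link quality."""
    try:
        with open(path) as handle:
            links = parse_wireless(handle.read())
    except OSError:
        # File missing or unreadable: treat as "no wireless link".
        return False
    return links.get(iface, 0.0) > 0.0
```

The script would call `wireless_link_up("eth1")` before attempting a download. A positive result only says the radio link exists; DNS can still be slow or unreachable.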

Another workaround that might be useful is to do your DNS lookups only
once, then use only IP addresses.
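
One way this workaround might look, as a sketch rather than a recommendation. It assumes Python 3 (urllib2 became urllib.request), plain http URLs, and IPv4 (socket.gethostbyname); the helper names and the module-level cache are hypothetical. The original host name is sent in the Host header so that name-based virtual hosts still answer. It would not work unchanged for https, where the certificate is checked against the host name.

```python
import socket
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request

_ip_cache = {}  # host name -> IP address, filled on first use


def resolve_once(host, resolver=socket.gethostbyname):
    """Look a host up the first time only; later calls reuse the cached IP."""
    if host not in _ip_cache:
        _ip_cache[host] = resolver(host)
    return _ip_cache[host]


def request_by_ip(url, resolver=socket.gethostbyname):
    """Build a Request that targets the cached IP but keeps the Host header."""
    parts = urlsplit(url)
    ip = resolve_once(parts.hostname, resolver)
    if parts.port is None:
        netloc, host_header = ip, parts.hostname
    else:
        netloc = "%s:%d" % (ip, parts.port)
        host_header = "%s:%d" % (parts.hostname, parts.port)
    ip_url = urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    return Request(ip_url, headers={"Host": host_header})
```

The returned object can be passed to urlopen() in place of the URL string. The first lookup can still block, so it is best done at a moment when a delay is acceptable.
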
John
Feb 3 '07 #4
jj*@pobox.com (John J. Lee) writes:
> (I'm having news trouble, sorry if anybody sees a similar reply three
> times...)
>
> "dumbkiwi" <dm*****@gmail.com> writes:
> > On Feb 2, 5:02 am, j...@pobox.com (John J. Lee) wrote:
> > > "dumbkiwi" <dmbk...@gmail.com> writes:
> > [...]
> > > > If there is no network interface, urllib2 hangs for a very long time
> > > > before it raises an exception. I have set the socket timeout with
> > > > socket.setdefaulttimeout(), however, where there is no network
> > > > interface, this seems to be ignored - presumably, because without a
> > > > network interface, there is nothing for the socket module to interact
> > > > with.
> > [...]
> > > Presumably Windows? (Unix systems almost always have at least a
> > > loopback interface)
> > >
> > > John
> >
> > Sorry, I should have been more specific. The network interfaces are
> > up - ie lo and eth1, it's where the wireless connection has dropped
> > out.
>
> The underlying problem is that Python's socket timeout is implemented
> using select() or poll(). Those system calls only allow timing out
> activity on file descriptors (e.g. sockets). The problem you're
> seeing is caused by getaddrinfo() blocking for a long time, and that
> function doesn't involve file descriptors. The problem should really
> be fixed at the C level (in Modules/socketmodule.c), using something
> like alarm() or a thread to apply a timeout to getaddrinfo() calls.

Seems doing this portably with threads is a bit of a nightmare,
actually. You'd have to extend every one of CPython's thread
implementations (pthreads, Solaris threads, etc. etc. etc.) -- and I
don't even know if it's possible on all systems.

And since the GIL is released around the getaddrinfo() call in
socketmodule.c (and that can't be changed), one can't guarantee that a
Python thread won't set a different signal handler, so alarm() is not
good.

And of course Windows is a separate case.

> > Is the best solution to test for a wireless connection through
> > /proc before trying to download data?
>
> That may be a good practical solution.
>
> Another workaround that might be useful is to do your DNS lookups only
> once, then use only IP addresses.

The portable way to actually solve what I assume is your underlying
problem (latency in a GUI) is to have a Python thread or separate
process do your urlopen()s (this can be done at the Python level).
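
A minimal sketch of the thread variant, assuming Python 3 (urllib.request in place of urllib2) and a GUI that can poll a queue from a timer callback. The names fetch_in_background and poll are hypothetical, and the `fetch` parameter is there only so the sketch can be tried without a network. The 10-second timeout is an arbitrary choice and, as discussed above, bounds socket activity but not the name lookup.

```python
import queue
import threading
from urllib.request import urlopen


def fetch_in_background(url, results, fetch=None):
    """Start a daemon thread that puts (url, body, error) on `results`."""

    def default_fetch(target):
        with urlopen(target, timeout=10) as response:
            return response.read()

    fetch = fetch or default_fetch

    def worker():
        try:
            results.put((url, fetch(url), None))
        except Exception as exc:  # URLError, socket.timeout, ...
            results.put((url, None, exc))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def poll(results):
    """Non-blocking check, suitable for a GUI timer callback."""
    try:
        return results.get_nowait()
    except queue.Empty:
        return None
```

The GUI thread creates one `queue.Queue()`, calls fetch_in_background() for each page, and calls poll() periodically. If a lookup hangs, only the worker thread is stuck and the interface stays responsive.
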
John

Feb 10 '07 #5
