socket setdefaulttimeout

I'm doing DNS lookups on common spam blacklists (such as SpamCop and
others) in an email filtering script. Because the DNS server that
resolves the lookups can sometimes go down, it is important to make sure
that the socket doesn't just hang there waiting for a response.

After a recent system upgrade to Python 2.4.1 (from 2.2.2) I thought I
could take advantage of the setdefaulttimeout in the socket module, to
limit the amount of time the sockets take for a lookup.

As a test, I set the default timeout ridiculously low. But it doesn't
seem to be having any effect. The sockets will take several seconds to
make the connection and complete the lookups, even though I've set the
timeout to millionths of a second, which I thought would ensure a
timeout (sample script below).

Am I doing something wrong? Do I misunderstand something? Is what I
want to do simply not possible?

Thanks for any tips. Example code follows signature...

--
Sheila King
http://www.thinkspot.net/sheila/

#!/usr/local/bin/python2.4
import socket
import sys
from time import time, asctime, localtime

socket.setdefaulttimeout(.00001)
debugfile = "socketdebug.txt"

def debug(data):
    timestamp = str(asctime(localtime(time())))
    try:
        f = open(debugfile, 'a')
        f.write('\n*** %s ***\n' % timestamp)
        f.write('%s\n' % data)  # 24-Dec-2003 -ctm- removed one linefeed
        f.close()
    except IOError:
        pass
        # do nothing if the file cannot be opened

IPaddy = '220.108.204.114'

if IPaddy:
    IPquads = IPaddy.split('.')
    IPquads.reverse()
    reverseIP = '.'.join(IPquads)

    bl_list = {
        'bl.spamcop.net':
            'IP Address %s Rejected - see: http://spamcop.net/bl.shtml' % IPaddy,
        'relays.ordb.org':
            'IP Address %s Rejected - see: http://ordb.org/' % IPaddy,
        'list.dsbl.org':
            'IP Address %s Rejected - see: http://dsbl.org' % IPaddy,
    }

    timing_done = 0
    start_time = time()
    for host in bl_list.keys():
        if host in bl_list.keys():
            IPlookup = "%s.%s" % (reverseIP, host)
            try:
                debug(" IPlookup=%s=" % IPlookup)
                resolvesIP = socket.gethostbyname(IPlookup)
                debug(" resolvesIP=%s=" % resolvesIP)
                if resolvesIP.startswith('127.'):
                    end_time = time()
                    elapsed_time = end_time - start_time
                    timing_done = 1
                    debug("Time elapsed for rDNS on bl_list: %f secs" %
                          elapsed_time)
                    debug("exiting--SPAM! id'd by %s" % host)
                    print bl_list[host]
                    sys.exit(0)
            except socket.gaierror:
                pass
    if not timing_done:
        end_time = time()
        elapsed_time = end_time - start_time
        debug("2nd try:Time elapsed for rDNS on bl_list: %f secs" %
              elapsed_time)

Aug 13 '05 #1
I do note that the setdefaulttimeout is accomplishing something in my
full program.

I am testing some error handling in the code at the moment, and am
raising an exception to make the code go into the "except" blocks...

The part that sends an error email notice bombed due to socket timeout.
(well, until I raised the timeout to 3 seconds...)

The last time I asked about this topic in the newsgroups...was a while
back (like 2 or so years) and someone said that because the socket
function that I'm trying to use is coded in C, rather than Python, that
I could not use the timeoutsocket module (which was the only way prior
to Python 2.3 to set timeouts on sockets).

I wonder...is it possible this applies to the particular timeouts I'm
trying to enforce on the "gethostbyname" DNS lookups?

Insights appreciated....

Aug 13 '05 #2
Sheila King wrote:
I'm doing DNS lookups [...] it is important to make sure
that the socket doesn't just hang there waiting for a response.

After a recent system upgrade to Python 2.4.1 (from 2.2.2) I thought I
could take advantage of the setdefaulttimeout in the socket module, to
limit the amount of time the sockets take for a lookup.

As a test, I set the default timeout ridiculously low. But it doesn't
seem to be having any effect.


The timeout applies to network communication on that socket, but
not to calls such as socket.gethostbyname. The gethostbyname
function actually goes to the operating system, which can look
up the name in a cache, or a hosts file, or query DNS servers
on sockets of its own.
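
The distinction can be seen without touching the network. The sketch
below is written for Python 3 rather than the 2.4 used in this thread,
and the helper name is made up for the illustration. It shows that
setdefaulttimeout only sets the timeout that newly created socket
objects start with. gethostbyname is never handed such an object, so
nothing carries the setting into the resolver.

```python
import socket


def timeout_seen_by_new_socket(timeout_secs):
    """Set the module-wide default and report what a fresh socket inherits."""
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(timeout_secs)
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # The default is copied onto each socket object at creation time.
            return sock.gettimeout()
        finally:
            sock.close()
    finally:
        # Restore whatever default was in force before the demonstration.
        socket.setdefaulttimeout(previous)
```

A connect or recv on that socket would honour the inherited value; a
call to socket.gethostbyname would not consult it at all.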

Modern OSes generally have reasonable TCP/IP implementations,
and the OS will handle applying a reasonable timeout. Still,
gethostbyname and its brethren can be a pain for single-threaded
event-driven programs, because they can block for a significant
time.

Under some older threading systems, any system call would block
every thread in the process, and gethostbyname was notorious for
holding things up. Some systems offer an asynchronous
gethostbyname, but that doesn't help users of Python's library.
Some programmers would keep around a few extra processes to
handle their hosts lookups. Fortunately, threading systems are
now much better, and should only block the thread waiting for
gethostbyname.
--
--Bryan
Aug 13 '05 #3
On 08/12/2005 22:37:22 Bryan Olson <fa*********@nowhere.org> wrote:
Sheila King wrote:
I'm doing DNS lookups [...] it is important to make sure that the socket
doesn't just hang there waiting for a response.

After a recent system upgrade to Python 2.4.1 (from 2.2.2) I thought I
could take advantage of the setdefaulttimeout in the socket module, to
limit the amount of time the sockets take for a lookup.

As a test, I set the default timeout ridiculously low. But it doesn't
seem to be having any effect.

The timeout applies to network communication on that socket, but not to
calls such as socket.gethostbyname. The gethostbyname function actually
goes to the operating system, which can look up the name in a cache, or a
hosts file, or query DNS servers on sockets of its own.

Modern OSes generally have reasonable TCP/IP implementations, and the OS
will handle applying a reasonable timeout. Still, gethostbyname and its
brethren can be a pain for single-threaded event-driven programs, because
they can block for a significant time.

Under some older threading systems, any system call would block every
thread in the process, and gethostbyname was notorious for holding things
up. Some systems offer an asynchronous gethostbyname, but that doesn't
help users of Python's library. Some programmers would keep around a few
extra processes to handle their hosts lookups. Fortunately, threading
systems are now much better, and should only block the thread waiting for
gethostbyname.


Thanks, Bryan. I'm not doing any threading. But we are running this script on
incoming email as it arrives at the SMTP server, and scripts have a 16 second
max time of execution. Generally they run in much less time. However, we have
seen incidents where, due to issues with the DNS servers for the blacklists,
the script exceeded its max time to run and the process was killed by
the OS. This results in the email being placed back into the mail queue for
attempted re-delivery later. Of course, if this issue goes undetected, the
mail can eventually be "returned to sender". There's no effective way to check
from within the running filter script that the time is not exceeded if the
gethostbyname blocks and doesn't return. :(

As I said, normally this isn't a problem. But there have been a handful of
incidents where it did cause issues briefly over a few days. I was hoping to
address it. :/

Sounds like I'm out of luck.

--
Sheila King
http://www.thinkspot.net/sheila/

Aug 13 '05 #4
Sheila King wrote:
Bryan Olson wrote: [...]
Under some older threading systems, any system call would block every
thread in the process, and gethostbyname was notorious for holding things
up. Some systems offer an asynchronous gethostbyname, but that doesn't
help users of Python's library. Some programmers would keep around a few
extra processes to handle their hosts lookups. Fortunately, threading
systems are now much better, and should only block the thread waiting for
gethostbyname.

Thanks, Bryan. I'm not doing any threading. But we are running this
script on incoming email as it arrives at the SMTP server, and scripts
have a 16 second max time of execution. Generally they run in much less
time. However, we have seen incidents where, due to issues with the DNS
servers for the blacklists, the script exceeded its max time to run and
the process was killed by the OS. This results in the email being placed
back into the mail queue for attempted re-delivery later. Of course, if
this issue goes undetected, the mail can eventually be "returned to
sender". There's no effective way to check from within the running
filter script that the time is not exceeded if the gethostbyname blocks
and doesn't return. :(

As I said, normally this isn't a problem. But there have been a handful
of incidents where it did cause issues briefly over a few days. I was
hoping to address it. :/

Sounds like I'm out of luck.


The separate thread-or-process trick should work. Start a daemon
thread to do the gethostbyname, and have the main thread give up
on the check if the daemon thread doesn't report (via a lock or
another socket) within, say, 8 seconds.

If you have decent thread support, you might do it as
follows. (Obviously I didn't have time to test this well.)

from threading import Thread
from Queue import Queue, Empty
import socket


def start_daemon_thread(func):
    """ Run func -- a callable of zero args -- in a daemon thread.
    """
    thread = Thread(target = func)
    thread.setDaemon(True)
    thread.start()


def gethostbyname_or_timeout(hostname, timeout_secs = 8):
    """ Return IP address from gethostbyname, or None on timeout
        or lookup failure.
    """
    queue = Queue(1)

    def attempt_ghbn():
        try:
            queue.put(socket.gethostbyname(hostname))
        except (socket.gaierror, socket.herror):
            # Report a failed lookup at once, so the caller does not
            # wait out the whole timeout for a name that does not resolve.
            queue.put(None)

    start_daemon_thread(attempt_ghbn)
    try:
        result = queue.get(block = True, timeout = timeout_secs)
    except Empty:
        result = None
    return result
--
--Bryan
Aug 13 '05 #5
Bryan: Thanks for the tips/suggestion.

I will definitely look into that. (It will be my first foray into
coding with threads...I do appreciate that you've laid a great deal of
it out. I will certainly refer to my references and do substantial
testing on this...)

Thanks!

--
Sheila King
http://www.thinkspot.net/sheila/

Aug 13 '05 #6
On Sat, 13 Aug 2005 05:37:22 GMT, Bryan Olson <fa*********@nowhere.org>
declaimed the following in comp.lang.python:


The timeout applies to network communication on that socket, but
not to calls such as socket.gethostbyname. The gethostbyname
function actually goes to the operating system, which can look
up the name in a cache, or a hosts file, or query DNS servers
on sockets of its own.

Modern OSes generally have reasonable TCP/IP implementations,
and the OS will handle applying a reasonable timeout. Still,
gethostbyname and its brethren can be a pain for single-threaded
event-driven programs, because they can block for a significant
time.
I've got the opposite problem -- I'm on a dial-up (well, for a few
more weeks -- until the DSL gear arrives). For some reason DNS lookups
seem to be low priority and, if I'm downloading messages in Agent (from
three servers yet) and email (Eudora), Firefox often comes back with a
"host not found"; enter the same URL/bookmark again, and it finds the
page. It seems my system times out DNS requests when 1) I have a lot of
traffic on the dial-up connection, 2) the request may need to be
traversed (not cached on Earthlink)
--
wl*****@ix.netcom.com | Wulfraed Dennis Lee Bieber KD6MOG
wu******@dm.net | Bestiaria Support Staff
Home Page: <http://www.dm.net/~wulfraed/>
Overflow Page: <http://wlfraed.home.netcom.com/>

Aug 14 '05 #7
Dennis Lee Bieber wrote:

I've got the opposite problem -- I'm on a dial-up (well, for a few
more weeks -- until the DSL gear arrives). For some reason DNS lookups
seem to be low priority and, if I'm downloading messages in Agent (from
three servers yet) and email (Eudora), Firefox often comes back with a
"host not found"; enter the same URL/bookmark again, and it finds the
page. It seems my system times out DNS requests when 1) I have a lot of
traffic on the dial-up connection, 2) the request may need to be
traversed (not cached on Earthlink)


Dragging the thread off-topic a little:

I was having a similar DNS problem (host not found, try again
immediately, 2nd try successful) but even in a no-other-load situation,
and with this set of gear:

* ADSL with Netgear DG834G v2 "wireless ADSL firewall router"
* both hardwired and wireless LAN connection to router
* Firefox, Thunderbird [normally]
* IE6, Outlook Express [just to test if it was a Mozilla problem]
* Windows XP Professional SP2, Windows 2000 SP4

The problem stopped after I upgraded the router firmware from version
1.something to 2.something, but this may be yet another instance of the
"waved a dead chicken at the volcano" syndrome :-)
Aug 14 '05 #8
On 13/08/05 Bryan Olson said:
The separate thread-or-process trick should work. Start a daemon
thread to do the gethostbyname, and have the main thread give up
on the check if the daemon thread doesn't report (via a lock or
another socket) within, say, 8 seconds.


Wouldn't an alarm be much simpler than a whole thread just for this?

Mike

--
Michael P. Soulier <ms******@digitaltorque.ca>
"Those who would give up esential liberty for temporary safety deserve
neither liberty nor safety." --Benjamin Franklin

Aug 15 '05 #9

Michael P. Soulier wrote:
On 13/08/05 Bryan Olson said:
The separate thread-or-process trick should work. Start a daemon
thread to do the gethostbyname, and have the main thread give up
on the check if the daemon thread doesn't report (via a lock or
another socket) within, say, 8 seconds.


Wouldn't an alarm be much simpler than a whole thread just for this?


You mean a Unix-specific signal? If so that would be much less
portable. As for simpler, I'd have to see your code.
--
--Bryan

Aug 15 '05 #10
Sheila King wrote:
I'm doing DNS lookups on common spam blacklists [...]

As a test, I set the default timeout ridiculously low. But it doesn't
seem to be having any effect. [...]

I don't believe that gethostbyname()'s use of socket technology can be
expected to raise socket timeout exceptions, since in general it's a
call to a library that uses standard system calls. This would at least
explain the behaviour you were seeing.

It might just be easier to do the DNS work yourself using the rather
nifty "dnspython" module. This does allow you to easily implement
timeouts for specific interactions.
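
A rough sketch of what that could look like follows, in Python 3 with
the third-party dnspython package (version 2.x, where the lookup method
is called resolve; in 1.x it was query). The function names, the
3-second default and the three-way return value are assumptions made
for the illustration, not anything taken from the script above.

```python
def dnsbl_query_name(ip_address, zone):
    """Build the blacklist lookup name: reversed IPv4 quads, then the zone."""
    quads = ip_address.split(".")
    quads.reverse()
    return "%s.%s" % (".".join(quads), zone)


def is_listed(ip_address, zone, timeout_secs=3.0):
    """Return True if listed, False if not listed, None if undetermined."""
    # dnspython is third-party; it is imported here so that the name
    # helper above stays usable without it.
    import dns.exception
    import dns.resolver

    resolver = dns.resolver.Resolver()
    resolver.timeout = timeout_secs   # wait per nameserver attempt
    resolver.lifetime = timeout_secs  # total time allowed for the query
    try:
        answer = resolver.resolve(dnsbl_query_name(ip_address, zone), "A")
    except dns.resolver.NXDOMAIN:
        return False  # the blacklist has no record for this address
    except (dns.exception.Timeout,
            dns.resolver.NoAnswer,
            dns.resolver.NoNameservers):
        return None   # could not find out within the time allowed
    return any(rdata.address.startswith("127.") for rdata in answer)
```

Because the resolver's lifetime bounds the whole query, a script with a
16-second budget could give each of three blacklists a few seconds and
still finish in time.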

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Aug 17 '05 #11
