thread help

Howdy,

Below is a script that I'm using to try and count the number of HTTP
servers within a company's private network. There are 65,536 possible
hosts that may have HTTP servers on them. Anyway, I wrote this script
at first w/o threads. It works, but it takes days to run as it probes
one IP at a time... so I thought I'd try to make it threaded so it could
test several dozen IPs at once.

I'm no expert on threading, far from it. Could someone show me how I can
make this work correctly? I want to probe 64 unique IP addresses for HTTP
servers simultaneously, not the same IP addy 64 times (as I'm doing
now). Any tips would be much appreciated.

Bart

import urllib2, socket, threading, time

class trivialthread(threading.Thread):
    def run(self):
        socket.setdefaulttimeout(1)

        hosts = []
        networks = []

        # Add the network 192.168.0 possibility.
        networks.append("192.168.0.")
        n = 0
        while n < 255:
            n = n + 1
            # Generate and add networks 192.168.1-255 to the list of networks.
            networks.append("192.168.%s." %(n))

        for network in networks:
            h = 0
            # Add the n.n.n.0 host possibility
            hosts.append(network+str(h))
            while h < 255:
                h = h + 1
                # Add hosts 1 - 255 to each network.
                hosts.append(network+str(h))

        websites = file('websites.txt', 'w')
        for ip in hosts:
            try:
                f = urllib2.urlopen("http://%s" %ip)
                f.read()
                f.close()
                print >> websites, ip
            except urllib2.URLError:
                print ip
            except socket.timeout:
                print ip, "Timed Out..................."
            except socket.sslerror:
                print ip, "SSL Error..................."
        websites.close()

if __name__ == '__main__':
    threads = []
    for x in range(64):
        thread = trivialthread()
        threads.append(thread)
    for thread in threads:
        thread.start()
    while threading.activeCount() > 0:
        print str(threading.activeCount()), "threads running incl. main"
        time.sleep(1)
Jul 18 '05 #1
In article <ca**********@solaris.cc.vt.edu>,
Bart Nessux <ba*********@hotmail.com> wrote:

> I'm no expert on threading, far from it. Could someone show me how I can
> make this work correctly? I want to probe 64 unique IP addresses for HTTP
> servers simultaneously, not the same IP addy 64 times (as I'm doing
> now). Any tips would be much appreciated.


Create a threading.Thread subclass that takes one IP address and a list
of ports to scan. Start 64 instances of this class, each with a
different IP address.
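
A minimal sketch of this suggestion, written for Python 3 rather than the Python 2 used in the thread. The names `ProbeThread` and `probe_batch` are made up for illustration, and the probe is only a TCP connect, not a full HTTP request.

```python
import socket
import threading

class ProbeThread(threading.Thread):
    """Probe a single host on a list of TCP ports."""

    def __init__(self, ip, ports, timeout=1.0):
        super().__init__()
        self.ip = ip
        self.ports = ports
        self.timeout = timeout
        self.open_ports = []  # filled in by run(); read it after join()

    def run(self):
        for port in self.ports:
            try:
                # create_connection raises OSError on refusal or timeout
                with socket.create_connection((self.ip, port), self.timeout):
                    self.open_ports.append(port)
            except OSError:
                pass

def probe_batch(ips, ports):
    """Start one thread per address, wait for all, return {ip: open_ports}."""
    threads = [ProbeThread(ip, ports) for ip in ips]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return {t.ip: t.open_ports for t in threads}
```

To cover the whole address range this way, `probe_batch` would be called on successive slices of 64 addresses, so each batch is only as fast as its slowest host.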
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"as long as we like the same operating system, things are cool." --piranha
Jul 18 '05 #2
Aahz wrote:
> Bart Nessux <ba*********@hotmail.com> wrote:
>> Could someone show me how I can make this work correctly? I want to probe
>> 64 unique IP addresses for HTTP servers simultaneously, ...
> Create a threading.Thread subclass that takes one IP address and a list
> of ports to scan. Start 64 instances of this class, each with a
> different IP address.


An alternative is to create a queue into which you push IP addresses to
contact, and have each thread read addresses off the queue when they are
free to process them. This has the advantage of decoupling the number
of threads from the number of addresses you want to examine.
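
The queue approach could look roughly like this in Python 3, where the module is spelled `queue` instead of `Queue`. The function names `has_http_port` and `scan` are hypothetical, and the probe only checks that a TCP connection succeeds.

```python
import queue
import socket
import threading

def has_http_port(ip, port=80, timeout=1.0):
    """Return True if a TCP connection to ip:port succeeds."""
    try:
        with socket.create_connection((ip, port), timeout):
            return True
    except OSError:
        return False

def scan(addresses, probe=has_http_port, num_workers=64):
    """Probe every address using a fixed pool of worker threads."""
    work = queue.Queue()
    for ip in addresses:
        work.put(ip)  # the queue is filled before any worker starts

    found = []
    found_lock = threading.Lock()

    def worker():
        while True:
            try:
                ip = work.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            if probe(ip):
                with found_lock:
                    found.append(ip)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(found)
```

Because every worker pulls the next address as soon as it finishes the previous one, a slow host holds up only one thread rather than a whole batch.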

-Scott David Daniels
Sc***********@Acm.Org
Jul 18 '05 #3
In article <40********@nntp0.pdx.net>,
Scott David Daniels <Sc***********@Acm.Org> wrote:
> Aahz wrote:
>> Bart Nessux <ba*********@hotmail.com> wrote:
>>> Could someone show me how I can make this work correctly? I want to probe
>>> 64 unique IP addresses for HTTP servers simultaneously, ...
>>
>> Create a threading.Thread subclass that takes one IP address and a list
>> of ports to scan. Start 64 instances of this class, each with a
>> different IP address.
>
> An alternative is to create a queue into which you push IP addresses to
> contact, and have each thread read addresses off the queue when they are
> free to process them. This has the advantage of decoupling the number
> of threads from the number of addresses you want to examine.


Absolutely, but that requires a bit more work for someone who isn't
already familiar with threading.
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"as long as we like the same operating system, things are cool." --piranha
Jul 18 '05 #4
Scott David Daniels wrote:
> An alternative is to create a queue into which you push IP addresses to
> contact, and have each thread read addresses off the queue when they are
> free to process them. This has the advantage of decoupling the number
> of threads from the number of addresses you want to examine.


That is also the best general design pattern for threading.
Have a Queue.Queue object where you place work items and have
threads pull items off the queue and execute them. You can use
callbacks or another Queue to collect the results.

The Queue.Queue class has the nice property that it is very
thread safe, and you can do both blocking and timed waits
on it.

The other problem to deal with that is particular to Python is
how to stop threads when you want to shutdown or cancel actions.
At the very least you can have a 'shutdown' message you place
on the Queue that when any thread reads, it shuts down.
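
A small sketch of the shutdown-message idea together with a second queue for results, assuming Python 3. `STOP`, `worker`, and `run_pool` are illustrative names, and squaring a number stands in for real work. Each worker consumes exactly one sentinel, so one sentinel is queued per worker.

```python
import queue
import threading

STOP = object()  # sentinel meaning "no more work"; any unique object works

def worker(tasks, results):
    while True:
        item = tasks.get()  # blocks until something is available
        if item is STOP:
            return
        results.put((item, item * item))  # stand-in for real work

def run_pool(items, num_workers=4):
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(STOP)  # one sentinel per worker
    for t in threads:
        t.join()
    out = []
    while not results.empty():  # safe here: all workers have exited
        out.append(results.get())
    return sorted(out)
```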

Unfortunately Python doesn't allow interrupting a thread,
so any thread doing something will run to completion. You
can check a variable or something in lines of Python
code, but cannot do anything when in C code. For example
if you do some networking stuff and the C code (eg a DNS
lookup followed by a TCP connect) takes 2 minutes, then
you will have to wait at least that long.
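
Checking a variable between units of work might be sketched like this in Python 3, using a `threading.Event` as the flag. The names are made up, and a short `time.sleep` stands in for the real work. As described above, a long blocking call inside one unit of work would still delay the exit.

```python
import threading
import time

def worker(stop_requested, progress):
    # The flag is only looked at between units of work.
    while not stop_requested.is_set():
        progress.append(time.monotonic())
        time.sleep(0.01)  # stand-in for one small unit of work

stop_requested = threading.Event()
progress = []
t = threading.Thread(target=worker, args=(stop_requested, progress))
t.start()
time.sleep(0.05)
stop_requested.set()  # ask the worker to finish
t.join(timeout=1.0)
```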

In the simplest case you can just make all your threads be
daemon. Python will shutdown when there are no non-daemon
threads left, so you can just exit your main loop and all
will shutdown. However that means the worker threads just
get abruptly stopped in the middle of what they were
doing.

(IMHO it would be *really* nice if Python provided a way
to interrupt threads).

Roger
Jul 18 '05 #5
Roger Binns wrote:
> In the simplest case you can just make all your threads be
> daemon. Python will shutdown when there are no non-daemon
> threads left, so you can just exit your main loop and all
> will shutdown. However that means the worker threads just
> get abruptly stopped in the middle of what they were
> doing.
>
> (IMHO it would be *really* nice if Python provided a way
> to interrupt threads).


Sounds like you can't eat your cake and have it, too. If
you _could_ interrupt threads**, wouldn't that mean "the worker
threads just get abruptly stopped in the middle of what they
were doing"?

-Peter

** There is a way to interrupt threads in Python now, but
it requires an extension routine, or perhaps something with
ctypes. Findable in the archives for this newsgroup/list.
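
One way the ctypes route can look in present-day CPython is sketched below. It relies on the C API function `PyThreadState_SetAsyncExc`, which is specific to CPython; whether this is the exact routine meant in the footnote is an assumption. `Interrupted` and `interrupt` are made-up names. The exception is only delivered while the target thread is running Python bytecode, so it does not break into a blocking C call.

```python
import ctypes
import threading
import time

class Interrupted(Exception):
    """Illustrative exception type used to signal the worker."""

def interrupt(thread, exc_type=Interrupted):
    """Ask CPython to raise exc_type in the given thread; True if scheduled."""
    n = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread.ident), ctypes.py_object(exc_type))
    return n == 1

outcome = []

def worker():
    try:
        while True:
            time.sleep(0.01)  # the exception arrives between bytecodes
    except Interrupted:
        outcome.append("interrupted")  # the thread gets a chance to clean up

t = threading.Thread(target=worker)
t.start()
time.sleep(0.05)
interrupt(t)
t.join(timeout=2.0)
```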
Jul 18 '05 #6
Peter Hansen wrote:
> Sounds like you can't eat your cake and have it, too. If
> you _could_ interrupt threads**, wouldn't that mean "the worker
> threads just get abruptly stopped in the middle of what they
> were doing"?


I meant in the same way that you can in Java. In that case an
InterruptedException is thrown which the thread can catch
and do whatever it wants with.

As an example at the moment, socket.accept is a blocking
call and if a thread is executing that there is no way
of stopping it.

This would make shutdown and reconfigurations possible.
For example you could interrupt all relevant threads
and they could check a variable to see if they should
shutdown, bind to a different port, abandon the current
work item etc.
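
Short of real interruption, a common workaround is to give the listening socket a timeout so that `accept` returns control periodically and the thread can check a flag. A sketch in Python 3 follows; `serve`, `stop_requested`, and `accepted` are illustrative names, and the 0.1 second poll interval is arbitrary.

```python
import socket
import threading

def serve(listener, stop_requested, accepted):
    listener.settimeout(0.1)  # accept() now gives up every 0.1 s
    while not stop_requested.is_set():
        try:
            conn, addr = listener.accept()
        except socket.timeout:
            continue  # no client yet; re-check the flag
        accepted.append(addr)
        conn.close()
    listener.close()
```

The thread still cannot be stopped in the middle of a call, but the worst-case delay before it notices the flag is bounded by the timeout.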

Roger
Jul 18 '05 #7
Roger Binns wrote:
> Peter Hansen wrote:
>> Sounds like you can't eat your cake and have it, too. If
>> you _could_ interrupt threads**, wouldn't that mean "the worker
>> threads just get abruptly stopped in the middle of what they
>> were doing"?
>
> I meant in the same way that you can in Java. In that case an
> InterruptedException is thrown which the thread can catch
> and do whatever it wants with.


I didn't think things worked quite that way in Java. For
example, I thought InterruptedException was seen by a thread
only if it had actually been asleep at the time it was sent.

I also didn't know it would actually terminate certain
blocking calls, such as in socket stuff.

Oh well, it's been a while...

-Peter
Jul 18 '05 #8
Scott David Daniels wrote:
> Aahz wrote:
>> Bart Nessux <ba*********@hotmail.com> wrote:
>>> Could someone show me how I can make this work correctly? I want to
>>> probe 64 unique IP addresses for HTTP servers simultaneously, ...
>>
>> Create a threading.Thread subclass that takes one IP address and a list
>> of ports to scan. Start 64 instances of this class, each with a
>> different IP address.
>
> An alternative is to create a queue into which you push IP addresses to
> contact, and have each thread read addresses off the queue when they are
> free to process them. This has the advantage of decoupling the number
> of threads from the number of addresses you want to examine.
>
> -Scott David Daniels
> Sc***********@Acm.Org


I like this idea. I read up on the Queue and threading modules at
python.org and a few other sites around the Web and came up with the
code below. However, it doesn't work. I get these errors when it runs:

Exception in thread Thread-149:
Traceback (most recent call last):
  File "/usr/lib/python2.3/threading.py", line 434, in __bootstrap
    self.run()
  File "/usr/lib/python2.3/threading.py", line 414, in run
    self.__target(*self.__args, **self.__kwargs)
  File "www_reads_threaded_1.py", line 49, in sub_thread_proc
    f = urllib2.urlopen(url).read()
  File "/usr/lib/python2.3/urllib2.py", line 129, in urlopen
    return _opener.open(url, data)
  File "/usr/lib/python2.3/urllib2.py", line 324, in open
    type_ = req.get_type()
AttributeError: 'NoneType' object has no attribute 'get_type'

The problem I have is this: I know too little about thread programming.
If anyone thinks the code I have below could be made to work for my
tasks (probe 65,000 IPs for HTTP servers using threads to speed things
up), then please *show* me how I might change it in order for it to work.

Thanks again,
Bart

import Queue, threading, time, urllib2

networks = []
hosts = []
urls = []
#socket.setdefaulttimeout(30)
max_threads = 2
http_timeout = 30
start_time = time.time()

# Add the network 192.168.0 possibility.
networks.append("192.168.0.")

# Generate and add networks 192.168.1-255 to the list of networks.
n = 0
while n < 255:
    n = n + 1
    networks.append("192.168.%s." %(n))

# Generate and add hosts 1-255 to each network
for network in networks:
    h = 0
    # Add the n.n.n.0 host possibility
    hosts.append(network+str(h))
    while h < 255:
        h = h + 1
        hosts.append(network+str(h))

for ip in hosts:
    ip = "http://" + ip
    urls.append(ip)

urls = dict(zip(urls,urls))
# print urls

# Create a queue of urls to feed the threads
url_queue = Queue.Queue()
for url in urls:
    url_queue.put(url)
    # print url

def test_HTTP(url_queue):
    def sub_thread_proc(url, result):
        # try:
        f = urllib2.urlopen(url).read()
        # except Exception:
        #     print "Exception"
        # else:
        result.append(url)
    while 1:
        try:
            url = url_queue.get(0)
        except Queue.Empty:
            return
        result = []
        sub_thread = threading.Thread(target=sub_thread_proc,
                                      args=(url,result))
        sub_thread.setDaemon(True)
        sub_thread.start()
        sub_thread.join(http_timeout)
        print result

test_HTTP(urls)
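
The AttributeError in the traceback appears to come from the last line of the script: `test_HTTP(urls)` passes the dict built by `dict(zip(urls, urls))`, not `url_queue`. Inside the function, `url_queue.get(0)` is then a dict lookup for the key 0, which returns None, so `urllib2.urlopen(None)` fails and `Queue.Empty` is never raised. Passing `url_queue` would fix the error, but the loop would still probe one URL at a time because it joins each sub-thread right after starting it. A corrected sketch in Python 3 follows, where `Queue` is `queue` and `urllib2` is `urllib.request`; `probe_worker` and `probe_all` are illustrative names.

```python
import http.client
import queue
import threading
import urllib.request

def probe_worker(url_queue, found, timeout):
    """Pull URLs off the queue until it is empty; record the ones that answer."""
    while True:
        try:
            url = url_queue.get_nowait()
        except queue.Empty:
            return
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                response.read()
        except (OSError, http.client.HTTPException):
            continue  # refused, timed out, HTTP error status, or not HTTP
        found.append(url)  # list.append is atomic in CPython

def probe_all(urls, num_threads=64, timeout=1.0):
    """Probe all URLs with num_threads workers sharing one queue."""
    url_queue = queue.Queue()
    for url in urls:
        url_queue.put(url)
    found = []
    threads = [threading.Thread(target=probe_worker,
                                args=(url_queue, found, timeout))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(found)
```

The per-request `timeout` replaces the `join(http_timeout)` trick, so no extra sub-thread is needed for each URL.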
Jul 18 '05 #9
Bart Nessux <ba*********@hotmail.com> writes:

> The problem I have is this: I know too little about thread programming.
> If anyone thinks the code I have below could be made to work for my
> tasks (probe 65,000 IPs for HTTP servers using threads to speed things
> up), then please *show* me how I might change it in order for it to work.


I haven't been following this thread but if I was doing this I would want to
use asynchronous programming. It would finally force me to get to grips with
twisted.

Eddie
Jul 18 '05 #10
One word Eddie... WOW!

This async stuff is fabulous! It works and it's dead easy for my
application. There is no cpu limit with what I'm doing... only I/O
problems. At your suggestion, I looked at twisted, and then just used
the standard python asyncore module because it looked so darn easy, and
as it turned out, it was.
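
For anyone reading this later: asyncore was removed from the standard library in Python 3.12, and asyncio is its usual replacement. The sketch below is not the asyncore code described in this post, which was not shown; it is an asyncio equivalent under the assumption that a successful TCP connect on the HTTP port is a good enough probe. `probe` and `scan` are made-up names.

```python
import asyncio

async def probe(ip, port, timeout, limit):
    """Return ip if a TCP connection to ip:port succeeds in time, else None."""
    async with limit:  # cap the number of connections in flight
        try:
            reader, writer = await asyncio.wait_for(
                asyncio.open_connection(ip, port), timeout)
        except (OSError, asyncio.TimeoutError):
            return None
        writer.close()
        await writer.wait_closed()
        return ip

async def scan(ips, port=80, timeout=1.0, max_in_flight=64):
    """Probe all addresses concurrently from a single thread."""
    limit = asyncio.Semaphore(max_in_flight)
    results = await asyncio.gather(
        *(probe(ip, port, timeout, limit) for ip in ips))
    return [ip for ip in results if ip is not None]
```

A call such as `asyncio.run(scan(hosts))` runs the whole scan in one thread, which suits a job like this where the time goes into waiting on the network rather than the CPU.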

Thanks a million for the advice. I was looking in the *wrong* direction.

Eddie Corns wrote:
> Bart Nessux <ba*********@hotmail.com> writes:
>> The problem I have is this: I know too little about thread programming.
>> If anyone thinks the code I have below could be made to work for my
>> tasks (probe 65,000 IPs for HTTP servers using threads to speed things
>> up), then please *show* me how I might change it in order for it to work.
>
> I haven't been following this thread but if I was doing this I would want to
> use asynchronous programming. It would finally force me to get to grips with
> twisted.
>
> Eddie

Jul 18 '05 #11
Peter Hansen wrote:
> I didn't think things worked quite that way in Java. For
> example, I thought InterruptedException was seen by a thread
> only if it had actually been asleep at the time it was sent.
>
> I also didn't know it would actually terminate certain
> blocking calls, such as in socket stuff.


http://java.sun.com/j2se/1.4.2/docs/...tml#interrupt()

As is the Java way, they have different types of interrupted
exceptions and sockets etc would need to be InterruptibleChannels.
They also use checked exceptions which makes life a lot harder.

More on Java best practises for thread interruption:

http://java.sun.com/j2se/1.4.2/docs/...precation.html

I would just settle for a nice clean mechanism whereby you can
call Thread.interrupt() and have an InterruptException in that
thread (which ultimately terminates the thread if it isn't
handled anywhere).

Some threads won't be interruptible because they are deep in
extension libraries. Perhaps that can be returned to the
caller of Thread.interrupt().

Roger
Jul 18 '05 #12
