Bytes | Developer Community
How to force a thread to stop

Hi all,

Is there a way for the program that created and started a thread to also
stop it?
(My usage is a time-out.)

E.g.

thread = threading.Thread(target=Loop.testLoop)
thread.start()  # This thread is expected to finish within a second
thread.join(2)  # Or time.sleep(2) ?

if thread.isAlive():
    # thread has probably encountered a problem and hangs
    # What should be here to stop thread ??????

Note that I don't want to change the target (too much), as many possible
targets exist,
together thousands of lines of code.

Thanks,
Hans
Jul 22 '06 #1
51 replies · 52,921 views

Hans wrote:
Hi all,

Is there a way for the program that created and started a thread to also
stop it?
(My usage is a time-out.)

E.g.

thread = threading.Thread(target=Loop.testLoop)
thread.start() # This thread is expected to finish within a second
thread.join(2) # Or time.sleep(2) ?
No, Python has no threadicide method, and its absence is not an
oversight. Threads often have important business left to do, such
as releasing locks on shared data; killing them at arbitrary times
tends to leave the system in an inconsistent state.
if thread.isAlive():
    # thread has probably encountered a problem and hangs
    # What should be here to stop thread ??????
At this point, it's too late. Try to code so your threads don't hang.

Python does let you arrange for threads to die when you want to
terminate the program, with threading's Thread.setDaemon().
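A minimal sketch of the daemon-thread escape hatch Bryan mentions (modern spelling shown; `setDaemon(True)` was the method name in the Python of this era):

```python
import threading
import time

def worker():
    # Stands in for a task that might hang or loop forever.
    while True:
        time.sleep(0.1)

t = threading.Thread(target=worker)
t.daemon = True   # spelled t.setDaemon(True) in older Pythons
t.start()
# When the main thread exits, daemon threads are simply abandoned:
# no cleanup code in worker() ever runs, so reserve this for threads
# whose state does not matter at shutdown.
```

Note this only arranges for threads to die at program exit; it is not a way to kill one thread while the rest of the program keeps running.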
--
--Bryan

Jul 22 '06 #2
Hi!
>> "threadicide" method
I like this word...

Michel Claveau
Jul 23 '06 #3
In message <11**********************@m73g2000cwd.googlegroups.com>,
br***********************@yahoo.com wrote:
Python has no threadicide method, and its absence is not an
oversight. Threads often have important business left to do, such
as releasing locks on shared data; killing them at arbitrary times
tends to leave the system in an inconsistent state.
Perhaps another reason to avoid threads and use processes instead?
Jul 24 '06 #4
Lawrence D'Oliveiro <ld*@geek-central.gen.new_zealand> writes:
Python has no threadicide method, and its absence is not an
oversight. Threads often have important business left to do, such
as releasing locks on shared data; killing them at arbitrary times
tends to leave the system in an inconsistent state.

Perhaps another reason to avoid threads and use processes instead?
If the processes are sharing resources, the exact same problems arise.
Jul 24 '06 #5
Dennis Lee Bieber wrote:
On Sat, 22 Jul 2006 14:47:30 +0200, "Hans" <No****@Hccnet.nl> declaimed
the following in comp.lang.python:

>Hi all,

Is there a way for the program that created and started a thread to also
stop it?
(My usage is a time-out.)

Hasn't this subject become a FAQ entry yet? <G>

The only reliable way of stopping a thread is from inside the thread
itself. That is, the thread must, at some point, examine some flag
variable which, when set, says "stop".

Without using actual code:

class StoppableThread(...):
    def __init__(self, ...):
        # whatever is needed to initialize as a thread
        self.Stop = False

    def run(self, ...):
        while not self.Stop:
            # do one cycle of the computation
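Dennis's pseudocode above can be made concrete with `threading.Event`; the class and method names here are illustrative:

```python
import threading
import time

class StoppableThread(threading.Thread):
    """Cooperative version of the sketch above (names are illustrative)."""

    def __init__(self):
        threading.Thread.__init__(self)
        self._stop_request = threading.Event()

    def stop(self):
        # Called from the controlling thread; the worker notices the
        # flag at its next check, so response time is one work cycle.
        self._stop_request.set()

    def run(self):
        while not self._stop_request.is_set():
            time.sleep(0.01)   # do one cycle of the computation

t = StoppableThread()
t.start()
t.stop()       # ask the thread to finish
t.join(2.0)    # it exits at its next check of the flag
```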
My problem with the fact that Python doesn't have some type of "thread
killer" is that, again, the only solution involves some type of polling
loop, i.e. "if your thread of execution can be written so that it
periodically checks for a kill condition". This really sucks, not just
because polling is a ridiculous practice, but because it forces programmers
in many situations to go through a lengthy process of organizing operations
into a list. Say I have threads that share a bunch of common memory
(yes, I'm saying this exclusively to get the process users off my back)
and that execute a series of commands on remote nodes using rsh or
something. If I've constructed my system using threads, I need to
neatly dump all operations into some sort of list so that I can
implement a polling mechanism, i.e.

opList = [op1, op2, op3, op4]
for op in opList:
    checkMessageQueue()
    op()

That works if you can easily create an opList. If you want good
response time this can become quite ugly, especially if you have a lot
going on. Say I have a function I want to run in a thread:

# Just pretend for the sake of argument that 'op' actually means
# something and is a lengthy operation
def func_to_thread():
    os.system('op 1')
    os.system('op 2')
    os.system('op 3')

# In order to make this killable with reasonable response time we have
# to organize each of our ops into a function or something equally annoying

def op_1():
    os.system('op 1')

def op_2():
    os.system('op 2')

def op_3():
    os.system('op 3')

opList = [op_1, op_2, op_3]

def to_thread():
    for op in opList:
        checkMessageQueue()
        op()
So this whole "hey mr. nice thread, please die for me" concept gets
ugly quickly in complex situations and doesn't scale well at all.
Furthermore, say you have a complex system where users can write
pluggable modules. If a module gets stuck inside some screwed-up
loop and is unable to poll for messages, there's no way to kill the
module without killing the whole system. Have any of you guys thought of
a way around this scenario?
--

Carl J. Van Arsdall
cv*********@mvista.com
Build and Release
MontaVista Software

Jul 24 '06 #6
Carl J. Van Arsdall wrote:
[... rant ...]
So this whole "hey mr. nice thread, please die for me" concept gets
ugly quickly in complex situations and doesn't scale well at all.
Furthermore, say you have a complex system where users can write
pluggable modules. If a module gets stuck inside some screwed-up
loop and is unable to poll for messages, there's no way to kill the
module without killing the whole system. Have any of you guys thought of
a way around this scenario?

Communications through Queue.Queue objects can help. But if you research
the history of this design decision in the language you should discover
there are fairly sound reasons for not allowing arbitrary "threadicide".
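A minimal sketch of the Queue.Queue pattern Steve mentions, using a sentinel message to ask the worker to finish rather than forcing it (names illustrative; the module is spelled `queue` in current Pythons, `Queue` in the Python of this thread):

```python
import queue
import threading

tasks = queue.Queue()
done = []

def worker():
    while True:
        item = tasks.get()     # blocks until a message arrives
        if item is None:       # sentinel meaning "please finish"
            return
        done.append(item * 2)  # placeholder for real work

t = threading.Thread(target=worker)
t.start()
for n in (1, 2, 3):
    tasks.put(n)
tasks.put(None)                # ask, rather than force, the thread to stop
t.join(2.0)
```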

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Jul 24 '06 #7
Steve Holden wrote:
Carl J. Van Arsdall wrote:
[... rant ...]
>So this whole "hey mr. nice thread, please die for me" concept gets
ugly quickly in complex situations and doesn't scale well at all.
Furthermore, say you have a complex system where users can write
pluggable modules. If a module gets stuck inside some screwed-up
loop and is unable to poll for messages, there's no way to kill the
module without killing the whole system. Have any of you guys thought of
a way around this scenario?

Communications through Queue.Queue objects can help. But if you research
the history of this design decision in the language you should discover
there are fairly sound reasons for not allowing arbitrary "threadicide".
Right, I'm wondering if there is a way to make an interrupt-driven
communication mechanism for threads? Example: a thread receives a
message, stops everything, and processes the message.
--

Carl J. Van Arsdall
cv*********@mvista.com
Build and Release
MontaVista Software

Jul 24 '06 #8
"Carl J. Van Arsdall" <cv*********@mvista.com> writes:
Communications through Queue.Queue objects can help. But if you
research the history of this design decision in the language you
should discover there are fairly sound reasons for not allowing
arbitrary "threadicide".
Right, I'm wondering if there was a way to make an interrupt driven
communication mechanism for threads? Example: thread receives a
message, stops everything, and processes the message.
There is in fact some under-the-covers mechanism in CPython (i.e.
one you can call from C extensions but not from Python code) to
raise exceptions in other threads. I've forgotten the details.
There has been discussion at various times about how to expose
something like that to Python code, but it's been inconclusive. E.g.:

http://sf.net/tracker/?func=detail&a...&group_id=5470
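The under-the-covers mechanism Paul alludes to is reachable from pure Python via ctypes in CPython. This is a fragile, CPython-only sketch, and the caveat in the docstring is exactly why it still doesn't solve the hung-thread case:

```python
import ctypes
import threading
import time

class ThreadKilled(Exception):
    pass

def async_raise(tid, exc_type):
    """Raise exc_type in the thread with ident tid (CPython only).

    The exception is delivered only when the target thread next runs
    Python bytecode; a thread blocked inside a C call (a hung read(),
    say) will not see it -- which is exactly the hard case here.
    """
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(tid), ctypes.py_object(exc_type))
    if res == 0:
        raise ValueError("invalid thread id")
    if res > 1:
        # More than one thread state was affected: undo and bail out.
        ctypes.pythonapi.PyThreadState_SetAsyncExc(ctypes.c_ulong(tid), None)
        raise SystemError("PyThreadState_SetAsyncExc failed")

seen = []

def victim():
    try:
        while True:
            time.sleep(0.01)   # executes bytecode between sleeps
    except ThreadKilled:
        seen.append("killed")

t = threading.Thread(target=victim)
t.start()
time.sleep(0.1)
async_raise(t.ident, ThreadKilled)
t.join(2.0)
```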
Jul 25 '06 #9

Carl J. Van Arsdall wrote:
[...]
My problem with the fact that python doesn't have some type of "thread
killer" is that again, the only solution involves some type of polling
loop.
A polling loop is neither required nor helpful here.

[...]
#Just pretend for the sake of argument that 'op' actually means
#something and is a lengthy operation
def func_to_thread():
    os.system('op 1')
    os.system('op 2')
    os.system('op 3')
What good do you think killing that thread would do? The
process running 'op n' has no particular binding to the thread
that called os.system(). If 'op n' hangs, it stays hung.

The problem here is that os.system doesn't give you enough
control. It doesn't have a timeout and doesn't give you a
process ID or handle to the spawned process.

Running os.system() in multiple threads strikes me as
kind of whacked. Won't they all compete to read and write
stdin/stdout simultaneously?
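A sketch of the control Bryan says os.system() lacks, using the subprocess module: unlike os.system(), Popen hands back a handle to the child process, so the caller can wait with a deadline and kill the straggler. (Popen.wait() only grew its `timeout` parameter in later Pythons; at the time one would loop on `proc.poll()`. The command strings are placeholders.)

```python
import subprocess

def run_with_timeout(cmd, timeout):
    """Run a shell command, killing it after `timeout` seconds (a sketch)."""
    proc = subprocess.Popen(cmd, shell=True)
    try:
        return proc.wait(timeout=timeout)   # exit status, or...
    except subprocess.TimeoutExpired:
        proc.kill()                         # ...kill via the handle we kept
        proc.wait()
        return None

status = run_with_timeout("exit 0", 5)      # finishes normally
hung = run_with_timeout("sleep 10", 0.2)    # overruns and gets killed
```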
#In order to make this killable with reasonable response time we have to
#organize each of our ops into a function or something equally annoying

def op_1():
    os.system('op 1')

def op_2():
    os.system('op 2')

def op_3():
    os.system('op 3')

opList = [op_1, op_2, op_3]

def to_thread():
    for op in opList:
        checkMessageQueue()
        op()
Nonsense. If op() hangs, you never get to checkMessageQueue().

Now suppose op has a timeout. We could write

def opcheck(thing):
    result = op(thing)
    if result == there_was_a_timeout:
        raise some_timeout_exception

How is:

def func_to_thread():
    opcheck('op 1')
    opcheck('op 2')
    opcheck('op 3')

any less manageable than your version of func_to_thread?
So this whole "hey mr. nice thread, please die for me" concept gets
ugly quickly in complex situations and doesn't scale well at all.
Furthermore, say you have a complex system where users can write
pluggable modules. If a module gets stuck inside some screwed-up
loop and is unable to poll for messages, there's no way to kill the
module without killing the whole system. Have any of you guys thought of
a way around this scenario?
Threadicide would not solve the problems you actually have, and it
tends to create other problems. What is the condition that makes
you want to kill the thread? Make the victim thread respond to that
condition itself.
--
--Bryan

Jul 25 '06 #10
br***********************@yahoo.com writes:
Threadicide would not solve the problems you actually have, and it
tends to create other problems. What is the condition that makes
you want to kill the thread? Make the victim thread respond to that
condition itself.
If the condition is a timeout, one way to notice it is with sigalarm,
which raises an exception in the main thread. But then you need a way
to make something happen in the remote thread.
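Paul's sigalarm suggestion, sketched below. Note the two limitations he implies: it is Unix-only, and SIGALRM is delivered to the main thread, so it cannot interrupt a hung worker thread.

```python
import signal

class Timeout(Exception):
    pass

def on_alarm(signum, frame):
    raise Timeout()

signal.signal(signal.SIGALRM, on_alarm)
signal.alarm(1)               # raise Timeout in the main thread in 1 second
try:
    while True:               # stands in for a call that never returns
        pass
except Timeout:
    outcome = "timed out"
finally:
    signal.alarm(0)           # cancel any still-pending alarm
```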
Jul 25 '06 #11

Dennis Lee Bieber wrote:
On Mon, 24 Jul 2006 10:27:08 -0700, "Carl J. Van Arsdall"
My problem with the fact that python doesn't have some type of "thread
killer" is that again, the only solution involves some type of polling
loop. I.e. "if your thread of execution can be written so that it

And that is because the control of a thread, once started, is
dependent upon the underlying OS...
No; it's because killing a thread from another thread is fundamentally
sloppy.
The process of creating a thread can
be translated into something supplied by pretty much all operating
systems: an Amiga task, posix thread, etc.

But ending a thread is then also dependent upon the OS -- and not
all OSs have a way to do that that doesn't run the risk of leaking
memory, leaving things locked, etc. until the next reboot.
No operating system has a good way to do it, at least not for
the kind of threads Python offers.

The procedure for M$ Windows to end a task basically comes down to
"send the task a 'close window' event; if that doesn't work, escalate...
until in the end it throws its hands up and says -- go ahead and leave
memory in a mess, just stop running that thread".
The right procedure in MS Windows is the same as under POSIX:
let the thread terminate on its own.
module without killing the whole system. Any of you guys thought of a
way around this scenario?

Ask Bill Gates... The problem is part of the OS.
Or learn how to use threads properly. Linux is starting to get good
threading. Win32 has had it for quite a while.
--
--Bryan

Jul 25 '06 #12
br***********************@yahoo.com wrote:
Carl J. Van Arsdall wrote:
[...]
>My problem with the fact that python doesn't have some type of "thread
killer" is that again, the only solution involves some type of polling
loop.

A polling loop is neither required nor helpful here.

[...]
>#Just pretend for the sake of argument that 'op' actually means
#something and is a lengthy operation
def func_to_thread():
    os.system('op 1')
    os.system('op 2')
    os.system('op 3')

What good do you think killing that thread would do? The
process running 'op n' has no particular binding to the thread
that called os.system(). If 'op n' hangs, it stays hung.

The problem here is that os.system doesn't give you enough
control. It doesn't have a timeout and doesn't give you a
process ID or handle to the spawned process.

Running os.system() in multiple threads strikes me as
kind of whacked. Won't they all compete to read and write
stdin/stdout simultaneously?
Unfortunately this is due to the nature of the problem I am tasked with
solving. I have a large computing farm, these os.system calls are often
things like ssh that do work on locations remote from the initial python
task. I suppose eventually I'll end up using a framework like twisted
but, as with many projects, I got thrown into this thing and threading
is where we ended up. So now there's the rush to make things work
before we can really look at a proper solution.
>
>#In order to make this killable with reasonable response time we have to
#organize each of our ops into a function or something equally annoying

def op_1():
    os.system('op 1')

def op_2():
    os.system('op 2')

def op_3():
    os.system('op 3')

opList = [op_1, op_2, op_3]

def to_thread():
    for op in opList:
        checkMessageQueue()
        op()

Nonsense. If op() hangs, you never get to checkMessageQueue().
Yea, understood. At the same time, I can't use a timeout either; I
don't know how long op_1 or op_2 will take. This is why I want something
that is triggered on an event.

Now suppose op has a timeout. We could write

def opcheck(thing):
    result = op(thing)
    if result == there_was_a_timeout:
        raise some_timeout_exception

How is:

def func_to_thread():
    opcheck('op 1')
    opcheck('op 2')
    opcheck('op 3')

any less manageable than your version of func_to_thread?

Again, the problem I'm trying to solve doesn't work like this. I've
been working on a framework to be run across a large number of
distributed nodes (here's where you throw out the "duh, use a
distributed technology" in my face). The thing is, I'm only writing the
framework; the framework will work with modules, lots of them, which
will be written by other people. It's going to be impossible to get
people to write hundreds of modules that constantly check for status
messages. So, if I want my thread to "give itself up" I have to tell it
to give up. In order to tell it to give up I need some mechanism to
check messages that is not going to piss off a large team of
programmers. At the same time, do I really want to rely on other people
to make things work? Not really; I'd much rather let my framework
handle all control and not leave that up to programmers.

So the problem is, I have something that linearly executes a large list of
Python functions of various sizes, ranging from short to long. It's not
about killing the thread so much as how I make the thread listen to
control messages without polling.
>So this whole "hey mr. nice thread, please die for me" concept gets
ugly quickly in complex situations and doesn't scale well at all.
Furthermore, say you have a complex system where users can write
pluggable modules. If a module gets stuck inside some screwed-up
loop and is unable to poll for messages, there's no way to kill the
module without killing the whole system. Have any of you guys thought of
a way around this scenario?

Threadicide would not solve the problems you actually have, and it
tends to create other problems. What is the condition that makes
you want to kill the thread? Make the victim thread respond to that
condition itself.

I feel like this is something we've established multiple times. Yes, we
want the thread to kill itself. Alright, now that we agree on that,
what is the best way to do that? Right now people keep saying we must
send the thread a message. That's fine and I completely understand
that, but right now the only mechanism I see is some type of polling
loop (or diving into the C API to force exceptions). So far I've not
seen any other method, though. If you want to send a thread a control
message you must wait until that thread is able to check for a control
message. If something hangs in your thread you are totally screwed;
similarly, if your thread ends up in some excessively lengthy IO (IO
that could be interrupted or whatever) you have to wait for that IO to
finish before your thread can process any control messages.


--

Carl J. Van Arsdall
cv*********@mvista.com
Build and Release
MontaVista Software

Jul 25 '06 #13
On 2006-07-25 13:30:22, Carl J. Van Arsdall wrote:
>Running os.system() in multiple threads strikes me as kind of whacked.
Won't they all compete to read and write stdin/stdout simultaneously?
Unfortunately this is due to the nature of the problem I am tasked with
solving. I have a large computing farm, these os.system calls are often
things like ssh that do work on locations remote from the initial python
task.
[...]
Again, the problem I'm trying to solve doesn't work like this. I've been
working on a framework to be run across a large number of distributed
nodes (here's where you throw out the "duh, use a distributed
technology" in my face). The thing is, I'm only writing the framework,
the framework will work with modules, lots of them, which will be
written by other people. It's going to be impossible to get people to
write hundreds of modules that constantly check for status messages.
Doesn't this sound like a case for using processes instead of threads?
Where you don't have control over the thread, you can use a process and get
the separation you need to be able to kill this task.

Alternatively you could possibly provide a base class for the threads that
handles the things you need every thread to handle. They'd not have to
write it then; they'd not even have to know too much about it.

Gerhard

Jul 25 '06 #14
Gerhard Fiedler wrote:
On 2006-07-25 13:30:22, Carl J. Van Arsdall wrote:

>>Running os.system() in multiple threads strikes me as kind of whacked.
Won't they all compete to read and write stdin/stdout simultaneously?

Unfortunately this is due to the nature of the problem I am tasked with
solving. I have a large computing farm, these os.system calls are often
things like ssh that do work on locations remote from the initial python
task.

[...]

>Again, the problem I'm trying to solve doesn't work like this. I've been
working on a framework to be run across a large number of distributed
nodes (here's where you throw out the "duh, use a distributed
technology" in my face). The thing is, I'm only writing the framework,
the framework will work with modules, lots of them, which will be
written by other people. It's going to be impossible to get people to
write hundreds of modules that constantly check for status messages.

Doesn't this sound like a case for using processes instead of threads?
Where you don't have control over the thread, you can use a process and get
the separation you need to be able to kill this task.

Alternatively you could possibly provide a base class for the threads that
handles the things you need every thread to handle. They'd not have to
write it then; they'd not even have to know too much about it.

Gerhard

I'd be all for using processes, but setting up communication between
processes would be difficult, wouldn't it? I mean, threads have shared
memory, so making sure all threads know the current system state is an
easy thing. With processes, wouldn't I have to set up some type of
server/client design, where one process has the system state and then
the other processes constantly probe the host when they need the current
system state?


--

Carl J. Van Arsdall
cv*********@mvista.com
Build and Release
MontaVista Software

Jul 25 '06 #15
"Carl J. Van Arsdall" <cv*********@mvista.com> writes:
I'd be all for using processes but setting up communication between
processes would be difficult wouldn't it? I mean, threads have
shared memory so making sure all threads know the current system
state is an easy thing. With processes wouldn't I have to setup
some type of server/client design, where one process has the system
state and then the other processes constantly probe the host when
they need the current system state?
http://poshmodule.sf.net might be of interest.
Jul 25 '06 #16
On 2006-07-25 13:55:39, Carl J. Van Arsdall wrote:
I'd be all for using processes but setting up communication between
processes would be difficult wouldn't it? I mean, threads have shared
memory so making sure all threads know the current system state is an
easy thing.
I'm not sure about that. Sharing data between threads or processes is never
an easy thing, especially since you are saying you can't trust your module
coders to "play nice". If you can't trust them to terminate their threads
nicely when asked so, you also can't trust them to responsibly handle
shared memory. That's exactly the reason why I suggested processes.

With processes wouldn't I have to setup some type of server/client
design, where one process has the system state and then the other
processes constantly probe the host when they need the current system
state?
Anything else is bound to fail. You need to have safeguards around any
shared data. (A semaphore is a type of server/client thing...) At the very
least you need to prevent read access while it is updated; very rarely this
is an atomic action, so there are times where the system state is
inconsistent while it is being updated. (I don't think you can consider
many Python commands as atomic WRT threads, but I'm not sure about this.)
IMO, in the situation you are describing, it is an advantage that data is
not normally accessible -- this means that your module coders need to
access the data in the way you present it to them, and so you can control
that it is being accessed correctly.
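Gerhard's point that read-modify-write updates must be guarded as a whole, as a small sketch (the counter is a stand-in for "current system state"):

```python
import threading

state = {"builds_running": 0}
state_lock = threading.Lock()

def bump():
    # += on a dict entry is a read-modify-write, not one atomic step,
    # so the whole update must happen under the lock.
    with state_lock:
        state["builds_running"] += 1

threads = [threading.Thread(target=bump) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# state["builds_running"] is now exactly 50; without the lock,
# lost updates could leave it lower.
```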

Gerhard

Jul 25 '06 #17

Carl J. Van Arsdall wrote:
Unfortunately this is due to the nature of the problem I am tasked with
solving. I have a large computing farm, these os.system calls are often
things like ssh that do work on locations remote from the initial python
task. I suppose eventually I'll end up using a framework like twisted
but, as with many projects, I got thrown into this thing and threading
is where we ended up. So now there's the rush to make things work
before we can really look at a proper solution.
I don't get what threading and Twisted would do for
you. The problem you actually have is that you sometimes
need to terminate these other processes running other programs.
Use spawn, fork/exec* or maybe one of the popens.

Again, the problem I'm trying to solve doesn't work like this. I've
been working on a framework to be run across a large number of
distributed nodes (here's where you throw out the "duh, use a
distributed technology" in my face). The thing is, I'm only writing the
framework, the framework will work with modules, lots of them, which
will be written by other people. It's going to be impossible to get
people to write hundreds of modules that constantly check for status
messages. So, if I want my thread to "give itself up" I have to tell it
to give up.
Threads have little to do with what you say you need.

[...]
I feel like this is something we've established multiple times. Yes, we
want the thread to kill itself. Alright, now that we agree on that,
what is the best way to do that.
Wrong. In your examples, you want to kill other processes. You
can't run external programs such as ssh as Python threads. Ending
a Python thread has essentially nothing to do with it.
Right now people keep saying we must send the thread a message.
Not me. I'm saying work the problem you actually have.
--
--Bryan

Jul 26 '06 #18
br***********************@yahoo.com wrote:
Carl J. Van Arsdall wrote:

I don't get what threading and Twisted would do for
you. The problem you actually have is that you sometimes
need to terminate these other processes running other programs.
Use spawn, fork/exec* or maybe one of the popens.
I have a strong need for shared memory space in a large distributed
environment. How does spawn, fork/exec allow me to meet that need?
I'll look into it, but I was under the impression having shared memory
in this situation would be pretty hairy. For example, I could fork off
50 child processes, but then I would have to set up some kind of
communication mechanism between them where the server builds up a queue
of requests from child processes and then services them in a FIFO
fashion. Does that sound about right?
Threads have little to do with what you say you need.

[...]
>I feel like this is something we've established multiple times. Yes, we
want the thread to kill itself. Alright, now that we agree on that,
what is the best way to do that.

Wrong. In your examples, you want to kill other processes. You
can't run external programs such as ssh as Python threads. Ending
a Python thread has essentially nothing to do with it.
There's more going on than ssh here. Since I want to run multiple
processes to multiple devices at one time and still have mass shared
memory, I need to use threads. There's a mass distributed system that
needs to be controlled; that's the problem I'm trying to solve. You can
think of each ssh as a lengthy IO process that each gets its own
device. I use the threads to allow me to do IO to multiple devices at
once; ssh just happens to be the IO. The combination of threads and ssh
allowed us to have a *primitive* distributed system (and it works too,
so I *can* run external programs in python threads). I didn't say it
was the best or the correct solution, but it works and it's what I was
handed when I was thrown into this project. I'm hoping in fifteen years,
or when I get an army of monkeys to fix it, it will change. I'm not
worried about killing processes; that's easy, I could kill all the sshs
or whatever else I want without batting an eye. The threads that were
created in order to allow me to do all of this work simultaneously,
those are the issue. Granted, I'm heavily looking into a way of doing this
with processes; I still don't see how threads are the wrong choice in
my present situation.
>
Not me. I'm saying work the problem you actually have.
The problem I have is a large distributed system, that's the reality of
it. The short summary, I need to use and control 100+ machines in a
computing farm. They all need to share memory or to actively
communicate with each other via some other mechanism. Without giving
any other details, that's the problem I have to solve. Right now I'm
working with someone else's code. Without redesigning the system from
the ground up, I have to fix it.
--

Carl J. Van Arsdall
cv*********@mvista.com
Build and Release
MontaVista Software

Jul 26 '06 #19
"Carl J. Van Arsdall" <cv*********@mvista.com> writes:
The problem I have is a large distributed system, that's the reality
of it. The short summary, I need to use and control 100+ machines in
a computing farm. They all need to share memory or to actively
communicate with each other via some other mechanism. Without giving
any other details, that's the problem I have to solve.
Have you looked at POSH yet? http://poshmodule.sf.net

There's also an shm module that's older and maybe more reliable.
Or you might be able to just use mmap.
Jul 26 '06 #20
Paul Rubin wrote:
"Carl J. Van Arsdall" <cv*********@mvista.com> writes:
>The problem I have is a large distributed system, that's the reality
of it. The short summary, I need to use and control 100+ machines in
a computing farm. They all need to share memory or to actively
communicate with each other via some other mechanism. Without giving
any other details, that's the problem I have to solve.

Have you looked at POSH yet? http://poshmodule.sf.net

There's also an shm module that's older and maybe more reliable.
Or you might be able to just use mmap.
I'm looking at POSH, shm, and stackless right now! :-)

Thanks!

-carl

--

Carl J. Van Arsdall
cv*********@mvista.com
Build and Release
MontaVista Software

Jul 26 '06 #21

Carl J. Van Arsdall wrote:
br***********************@yahoo.com wrote:
Carl J. Van Arsdall wrote:

I don't get what threading and Twisted would do for
you. The problem you actually have is that you sometimes
need to terminate these other processes running other programs.
Use spawn, fork/exec* or maybe one of the popens.
I have a strong need for shared memory space in a large distributed
environment.
Distributed shared memory is a tough trick; only a few systems simulate
it.
How does spawn, fork/exec allow me to meet that need?
I have no idea why you think threads or fork/exec will give you
distributed shared memory.
I'll look into it, but I was under the impression having shared memory
in this situation would be pretty hairy. For example, I could fork off
50 child processes, but then I would have to set up some kind of
communication mechanism between them where the server builds up a queue
of requests from child processes and then services them in a FIFO
fashion. Does that sound about right?
That much is easy. What it has to do with what you say you require
remains a mystery.

Threads have little to do with what you say you need.

[...]
I feel like this is something we've established multiple times. Yes, we
want the thread to kill itself. Alright, now that we agree on that,
what is the best way to do that.
Wrong. In your examples, you want to kill other processes. You
can't run external programs such as ssh as Python threads. Ending
a Python thread has essentially nothing to do with it.
There's more going on than ssh here. Since I want to run multiple
processes to multiple devices at one time and still have mass shared
memory I need to use threads.
No, you would need to use something that implements shared
memory across multiple devices. Threads are multiple lines of
execution in the same address space.
There's a mass distributed system that
needs to be controlled, that's the problem I'm trying to solve. You can
think of each ssh as a lengthy IO process that each gets its own
device. I use the threads to allow me to do IO to multiple devices at
once, ssh just happens to be the IO. The combination of threads and ssh
allowed us to have a *primitive* distributed system (and it works too,
so I *can* run external programs in python threads).
No, you showed launching it from a Python thread using os.system().
It's not running in the thread; it's running in a separate process.
I didn't say is
was the best or the correct solution, but it works and its what I was
handed when I was thrown into this project. I'm hoping in fifteen years
or when I get an army of monkeys to fix it, it will change. I'm not
worried about killing processes, that's easy, I could kill all the sshs
or whatever else I want without batting an eye.
After launching it with os.system()? Can you show the code?
--
--Bryan

Jul 26 '06 #22
br***********************@yahoo.com wrote:
Carl J. Van Arsdall wrote:
>br***********************@yahoo.com wrote:
>>Carl J. Van Arsdall wrote:

I don't get what threading and Twisted would to do for
you. The problem you actually have is that you sometimes
need terminate these other process running other programs.
Use spawn, fork/exec* or maybe one of the popens.

I have a strong need for shared memory space in a large distributed
environment.

Distributed shared memory is a tough trick; only a few systems simulate
it.
Yea, this I understand, maybe I chose some poor words to describe what I
wanted. I think this conversation is getting hairy and confusing so I'm
going to try and paint a better picture of what's going on. Maybe this
will help you understand exactly what's going on or at least what I'm
trying to do, because I feel like we're just running in circles. After
the detailed explanation, if threads are the obvious choice or not, it
will be much easier to pick apart what I need and probably also easier
for me to see your point... so here goes... (sorry its long, but I keep
getting dinged for not being thorough enough).

So, I have a distributed build system. The system is tasked with
building a fairly complex set of packages that form a product. The
system needs to build these packages for 50 architectures using cross
compilation as well as support for 5 different hosts. Say there are
also different versions of this with tweaks for various configurations,
so in the end I might be trying to build 200+ different things at once.
I have a computing farm of 40 machines to do this for me.. That's the
high-level scenario without getting too detailed. There are also
subsystems that help us manage the machines and things, I don't want to
get into that, I'm going to try to focus on a scenario more abstract
than cluster/resource management stuff.

Alright, so manually running builds is going to be crazy and
unmanageable. So what the people who came before me did to manage this
scenario was to fork one thread per build. The threads invoke a series
of calls that look like

os.system("ssh <host> <command>")

or, for more complex operations, they would just spawn a process that ran
another python script:

os.system("ssh <host> <script>")

The purpose behind all this was for a couple things:

* The thread constantly needed information about the state of the
system (for example we don't want to end up building the same
architecture twice)
* We wanted a centralized point of control for an entire build
* We needed to be able to use as many machines as possible from a
central location.

Python threads worked very well for this. os.system behaves a lot like
many other IO operations in python and the interpreter gives up the
GIL. Each thread could run remote operations and we didn't really have
any problems. There wasn't much of a need to do fork, all it would have
done is increased the amount of memory used by the system.
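The thread-per-build design described above can be sketched roughly like this. The host names are purely illustrative, and the fake "build" just records an exit status so the sketch is runnable; the real system shells out with os.system over ssh instead:

```python
import threading

# Hypothetical worker hosts; the names are illustrative only.
HOSTS = ["build-host-1", "build-host-2"]

results = {}
results_lock = threading.Lock()   # threads share state, so guard it

def run_build(host):
    # The real system would run: os.system("ssh <host> <command>").
    # Here we just record a pretend exit status to keep the sketch runnable.
    status = 0
    with results_lock:
        results[host] = status

# One thread per build, all sharing the results dict above.
threads = [threading.Thread(target=run_build, args=(h,)) for h in HOSTS]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Since os.system releases the GIL while the child runs, the threads really do overlap their remote work even in CPython.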

Alright, so this scheme that was first put in place kind of worked.
There were some problems, for example when someone did something like

os.system("ssh <host> <script>") we had no good way of knowing what the
hell happened in the script. Now granted, they used shared files to do
some of it over nfs mounts, but I really hate that. It doesn't work
well, its clunky, and difficult to manage. There were other problems
too, but I just wanted to give a sample.

Alright, so things aren't working, I come on board, I have a boss who
wants things done immediately. What we did was created what we called a
"Python Execution Framework". The purpose of the framework was to
mitigate a number of problems we had as well as take the burden of
distribution away from the programmers by providing a few layers of
abstraction (i'm only going to focus on the distributed part of the
framework, the rest is irrelevant to the discussion). The framework
executes modules (or lists of modules) in threads. Since we had
limited time, we designed the framework with "distribution environment"
in mind but realized that if we shoot for the top right away it will
take years to get anything implemented.

Since we knew we eventually wanted a distributed system that could
execute framework modules entirely on remote machines we carefully
design and prepared the system for this. This involves some abstraction
and some simple mechanisms. However right now each ssh call will be
executed from a thread (as they will be done concurrently, just like
before). The threads still need to know about the state of the system,
but we'd also like to be able to issue some type of control that is more
event driven -- this can be sending the thread a terminate message or
sending the thread a message regarding the completion of a dependency
(we use conditions and events to do this synchronization right now). We
hoped that in the case of a catastrophic event or a user 'kill' signal
that the system could take control of all the threads (or at least,
ask them to go away), this is what started the conversation in the first
place. We don't want to use a polling loop for these threads to check
for messages, we wanted to use something event driven (I mistakenly used
the word interrupt in earlier posts, but I think it still illustrates my
point). Its not only important that the threads die, but that they die
with grace. There's lots of cleanup work that has to be done when
things exit or things end up in an indeterminable state.

So, I feel like I have a couple options,

1) try moving everything to a process oriented configuration - we think
this would be bad, from a resource standpoint as well as it would make
things more difficult to move to a fully distributed system later, when
I get my army of code monkeys.

2) Suck it up and go straight for the distributed system now - managers
don't like this, but maybe its easier than I think its going to be, I dunno

3) See if we can find some other way of getting the threads to terminate.

4) Kill it and clean it up by hand or helper scripts - we don't want to
do this either, its one of the major things we're trying to get away from.

Alright, that's still a fairly high-level description. After all that,
if threads are still stupid then I think I'll much more easily see it
but I hope this starts to clear up the confusion. I don't really need a
distributed shared memory environment, but right now I do need shared
memory and it needs to be used fairly efficiently. For a fully
distributed environment I was going to see what various technologies
offered to pass data around, I figured that they must have some
mechanism for doing it or at least accessing memory from a central
location (we're set up to do this now with threads, we just need to expand
the concept to allow nodes to do it remotely). Right now, based on what
I have to do I think threads are the right choice until I can look at a
better implementation (i hear twisted is good at what I ultimately want
to do, but I don't know a thing about it).

Alright, if you read all that, thanks, and thanks for your input.
Whether or not I've agreed with anything, me and a few colleagues
definitely discuss each idea as its passed to us. For that, thanks to
the python list!

-carl
--

Carl J. Van Arsdall
cv*********@mvista.com
Build and Release
MontaVista Software

Jul 26 '06 #23
"Carl J. Van Arsdall" <cv*********@mvista.com> writes:
Alright, so manually running builds is going to be crazy and
unmanageable. So what the people who came before me did to manage
this scenario was to fork on thread per build. The threads invoke a
series of calls that look like

os.system("ssh <host> <command>")
Instead of using os.system, maybe you want to use one of the popens or
the subprocess module. For each ssh, you'd spawn off a process that
does the ssh and communicates back to the control process through a
set of file descriptors (Unix pipe endpoints or whatever). The
control process could use either threads or polling/select to talk to
the pipes and keep track of what the subprocesses were doing.

I don't think you need anything as complex as shared memory for this.
You're just writing a special purpose chat server.
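A minimal sketch of that idea, assuming a POSIX select(): each build step runs as a child process (plain `echo` stands in for the real `ssh <host> <script>` commands) and the control process multiplexes their pipes:

```python
import select
import subprocess

# One child process per "build step"; echo stands in for ssh.
cmds = [["echo", "step-a done"], ["echo", "step-b done"]]
procs = [subprocess.Popen(c, stdout=subprocess.PIPE, text=True) for c in cmds]

output = []
pending = {p.stdout.fileno(): p for p in procs}   # fd -> process
while pending:
    # Sleep until at least one child has something to say.
    ready, _, _ = select.select(list(pending), [], [])
    for fd in ready:
        line = pending[fd].stdout.readline()
        if line:
            output.append(line.strip())           # track what each step reports
        else:                                     # EOF: that worker finished
            pending[fd].wait()
            del pending[fd]
```

The same loop is where you would notice a hung step (no output before a deadline) and call terminate() on just that child.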
Jul 26 '06 #24

Paul Rubin wrote:
Have you looked at POSH yet? http://poshmodule.sf.net
Paul, have you used POSH? Does it work well? Any major
gotchas?

I looked at the paper... well, not all 200+ pages, but I checked
how they handle a couple parts that I thought hard and they
seem to have good ideas. I didn't find the SourceForge project
so promising. The status is alpha, the ToDo's are a little scary,
and project looks stalled. Also it's *nix only.
--
--Bryan

Jul 26 '06 #25
On 2006-07-26 17:33:19, Carl J. Van Arsdall wrote:
Alright, if you read all that, thanks, and thanks for your input. Whether
or not I've agreed with anything, me and a few colleagues definitely
discuss each idea as its passed to us. For that, thanks to the python
list!
I think you should spend a few hours and read up on realtime OS features
and multitasking programming techniques. Get a bit away from the bottom
level, forget about the specific features of your OS and your language and
try to come up with a set of requirements and a structure that fits them.

Regarding communicating with a thread (or process, that's about the same,
only the techniques vary), for example -- there are not that many options.
Either the thread/process polls a message queue or it goes to sleep once it
has done whatever it needed to do until something comes in through a queue
or until a semaphore gets set. What is better suited for you depends on
your requirements and overall structure. Neither seems to be too clear yet.

If you have threads that take too long and need to be killed, then I'd say
fix the code that runs there...

Gerhard

Jul 26 '06 #26
br***********************@yahoo.com writes:
Have you looked at POSH yet? http://poshmodule.sf.net

Paul, have you used POSH? Does it work well? Any major gotchas?
I haven't used it. I've been wanting to try. I've heard it works ok
in Linux but I've heard of problems with it under Solaris.

Now that I understand what the OP is trying to do, I think POSH is
overkill, and just using pipes or sockets is fine. If he really wants
to use shared memory, hmmm, there used to be an shm module at

http://mambo.peabody.jhu.edu/omr/omi...ource/shm.html

but that site now hangs (and it's not on archive.org), and Python's
built-in mmap module doesn't support any type of locks.

I downloaded the above shm module quite a while ago, so if I can find
it I might upload it to my own site. It was a straightforward
interface to the Sys V shm calls (also *nix-only, I guess). I guess
he also could use mmap with no locks, but with separate memory regions
for reading and writing in each subprocess, using polling loops. I
sort of remember Apache's mod_mmap doing something like that if it has
to.

To really go off the deep end, there are a few different MPI libraries
with Python interfaces.
I looked at the paper... well, not all 200+ pages, but I checked
how they handle a couple parts that I thought hard and they
seem to have good ideas.
200 pages?? The paper I read was fairly short, and I looked at the
code (not too carefully) and it seemed fairly straightforward. Maybe
I missed something, or am not remembering; it's been a while.
I didn't find the SourceForge project
so promising. The status is alpha, the ToDo's are a little scary,
and project looks stalled. Also it's *nix only.
Yeah, using it for anything serious would involve being willing to fix
problems with it as they came up. But I think the delicate parts of
it are parts that aren't that important, so I'd just avoid using
those.
Jul 26 '06 #27
On 2006-07-26 19:08:44, Carl J. Van Arsdall wrote:
Also, threading's condition and event constructs are used a lot
(i talk about it somewhere in that thing I wrote). They are easy to use
and nice and ready for me, with a server wouldn't I have to have things
poll/wait for messages?
How would a thread receive a message, unless it polls some kind of queue or
waits for a message from a queue or at a semaphore? You can't just "push" a
message into a thread; the thread has to "pick it up", one way or another.
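For example, the queue-based way of "picking up" a message might look like this (the message strings are made up): the thread blocks on a Queue, and a None sentinel asks it to terminate gracefully.

```python
import queue
import threading

inbox = queue.Queue()
handled = []

def worker():
    while True:
        msg = inbox.get()      # the thread sleeps here until a message arrives
        if msg is None:        # sentinel: clean up and exit gracefully
            break
        handled.append(msg)    # "handle" the message

t = threading.Thread(target=worker)
t.start()
inbox.put("dependency-finished")
inbox.put(None)                # ask the thread to stop
t.join()
```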

Gerhard

Jul 27 '06 #28
Gerhard Fiedler wrote:
On 2006-07-26 19:08:44, Carl J. Van Arsdall wrote:

>Also, threading's condition and event constructs are used a lot
(i talk about it somewhere in that thing I wrote). They are easy to use
and nice and ready for me, with a server wouldn't I have to have things
poll/wait for messages?

How would a thread receive a message, unless it polls some kind of queue or
waits for a message from a queue or at a semaphore? You can't just "push" a
message into a thread; the thread has to "pick it up", one way or another.

Gerhard

Well, I guess I'm thinking of an event driven mechanism, kinda like
setting up signal handlers. I don't necessarily know how it works under
the hood, but I don't poll for a signal. I setup a handler, when the
signal comes, if it comes, the handler gets thrown into action. That's
what I'd be interesting in doing with threads.

-c

--

Carl J. Van Arsdall
cv*********@mvista.com
Build and Release
MontaVista Software

Jul 27 '06 #29
On 2006-07-26 21:38:06, Carl J. Van Arsdall wrote:
>>Also, threading's condition and event constructs are used a lot
(i talk about it somewhere in that thing I wrote). They are easy to use
and nice and ready for me, with a server wouldn't I have to have things
poll/wait for messages?

How would a thread receive a message, unless it polls some kind of queue or
waits for a message from a queue or at a semaphore? You can't just "push" a
message into a thread; the thread has to "pick it up", one way or another.

Well, I guess I'm thinking of an event driven mechanism, kinda like
setting up signal handlers. I don't necessarily know how it works under
the hood, but I don't poll for a signal. I setup a handler, when the
signal comes, if it comes, the handler gets thrown into action. That's
what I'd be interesting in doing with threads.
What you call an event handler is a routine that gets called from a message
queue polling routine. You said a few times that you don't want that.

The queue polling routine runs in the context of the thread. If any of the
actions in that thread takes too long, it will prevent the queue polling
routine from running, and therefore the event won't get handled. This is
exactly the scenario that you seem to want to avoid. Event handlers are not
anything multitask or multithread, they are simple polling mechanisms with
an event queue. It just seems that they act preemptively, when you can click
on one button and another button becomes disabled :)

There are of course also semaphores. But they also have to either get
polled like GUI events, or the thread just goes to sleep until the
semaphore wakes it up. You need to understand this basic limitation: a
processor can only execute statements. Either it is doing other things,
then it must, by programming, check the queue -- this is polling. Or it can
suspend itself (the thread or process) and tell the OS (or the thread
handling mechanism) to wake it up when a message arrives in a queue or a
semaphore gets active.

You need to look a bit under the hood, so to speak... That's why I said in
the other message that I think it would do you some good to read up a bit
on multitasking OS programming techniques in general. There are not that
many, in principle, but it helps to understand the basics.
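A sketch of the second option above, where the thread suspends itself on a threading.Event instead of spinning in a busy polling loop (the extra `ran_once` event only makes the example deterministic):

```python
import threading

stop = threading.Event()
ran_once = threading.Event()
ticks = []

def worker():
    while not stop.is_set():
        ticks.append("tick")   # one unit of work
        ran_once.set()
        # wait() sleeps in the OS until stop.set() is called (or the
        # timeout expires), so the idle thread burns no CPU.
        stop.wait(timeout=0.05)

t = threading.Thread(target=worker)
t.start()
ran_once.wait()                # make sure the worker did some work first
stop.set()                     # wake the thread up and ask it to exit
t.join()
```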

Gerhard

Jul 27 '06 #30

Gerhard Fiedler wrote:
Carl J. Van Arsdall wrote:
Well, I guess I'm thinking of an event driven mechanism, kinda like
setting up signal handlers. I don't necessarily know how it works under
the hood, but I don't poll for a signal. I setup a handler, when the
signal comes, if it comes, the handler gets thrown into action. That's
what I'd be interesting in doing with threads.

What you call an event handler is a routine that gets called from a message
queue polling routine. You said a few times that you don't want that.
I think he's refering to Unix signal handlers. These really are called
asynchronously. When the signal comes in, the system pushes some
registers on the stack, calls the signal handler, and when the signal
handler returns it pops the registers off the stack and resumes
execution where it left off, more or less. If the signal comes while the
process is in certain system calls, the call returns with a value or
errno setting that indicates it was interrupted by a signal.

Unix signals are an awkward low-level relic. They used to be the only
way to do non-blocking but non-polling I/O, but current systems offer
much better ways. Today the sensible things to do upon receiving a
signal are ignore it or terminate the process. My opinion, obviously.
--
--Bryan

Jul 27 '06 #31
"Carl J. Van Arsdall" <cv*********@mvista.com> wrote:

8<----------------------------------------------------------------
| point). Its not only important that the threads die, but that they die
| with grace. There's lots of cleanup work that has to be done when
| things exit or things end up in an indeterminable state.
|
| So, I feel like I have a couple options,
|
| 1) try moving everything to a process oriented configuration - we think
| this would be bad, from a resource standpoint as well as it would make
| things more difficult to move to a fully distributed system later, when
| I get my army of code monkeys.
|
| 2) Suck it up and go straight for the distributed system now - managers
| don't like this, but maybe its easier than I think its going to be, I dunno
|
| 3) See if we can find some other way of getting the threads to terminate.
|
| 4) Kill it and clean it up by hand or helper scripts - we don't want to
| do this either, its one of the major things we're trying to get away from.

8<-----------------------------------------------------------------------------

This may be a stupid suggestion - If I understand what you are doing, its
essentially running a bunch of compilers with different options on various
machines around the place - so there is a fifth option - namely to do nothing -
let them finish and just throw the output away - i.e. just automate the
cleanup...

- Hendrik
Jul 27 '06 #32
Dennis Lee Bieber <wl*****@ix.netcom.com> writes:
Ugh... Seems to me it would be better to find some Python library
for SSH, something similar to telnetlib, rather than doing an
os.system() per command line. EACH of those os.system() calls probably
causes a full fork() operation on Linux/UNIX, and the equivalent on
Windows (along with loading a command shell interpreter to handle the
actual statement).
I think Carl is using Linux, so the awful overhead of process creation
in Windows doesn't apply. Forking in Linux isn't that big a deal.
os.system() usually forks a shell, and the shell forks the actual
command, but even two forks per ssh is no big deal. The Apache web
server usually runs with a few hundred processes, etc. Carl, just how
many of these ssh's do you need active at once? If it's a few hundred
or less, I just wouldn't worry about these optimizations you're asking
about.
Jul 27 '06 #33

Carl J. Van Arsdall wrote:
br***********************@yahoo.com wrote:
Carl J. Van Arsdall wrote:
br***********************@yahoo.com wrote:

Carl J. Van Arsdall wrote:

I don't get what threading and Twisted would to do for
you. The problem you actually have is that you sometimes
need terminate these other process running other programs.
Use spawn, fork/exec* or maybe one of the popens.
I have a strong need for shared memory space in a large distributed
environment.
Distributed shared memory is a tough trick; only a few systems simulate
it.
Yea, this I understand, maybe I chose some poor words to describe what I
wanted.
Ya' think? Looks like you have no particular need for shared
memory, in your small distributed system.
I think this conversation is getting hairy and confusing so I'm
going to try and paint a better picture of what's going on. Maybe this
will help you understand exactly what's going on or at least what I'm
trying to do, because I feel like we're just running in circles.
[...]

So step out of the circles already. You don't have a Python thread
problem. You don't have a process overhead problem.

[...]
So, I have a distributed build system. [...]
Not a trivial problem, but let's not pretend we're pushing the
state of the art here.

Looks like the system you inherited already does some things
smartly: you have ssh set up so that a controller machine can
launch various build steps on a few dozen worker machines.

[...]
The threads invoke a series
of calls that look like

os.system("ssh <host> <command>")

or, for more complex operations, they would just spawn a process that ran
another python script:

os.system("ssh <host> <script>")
[...]
Alright, so this scheme that was first put in place kind of worked.
There were some problems, for example when someone did something like
os.system("ssh <host> <script>") we had no good way of knowing what the
hell happened in the script.
Yeah, that's one thing we've been telling you. The os.system()
function doesn't give you enough information nor enough control.
Use one of the alternatives we've suggested -- probably the
subprocess.Popen class.
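For instance, a subprocess.Popen call hands back the child's stdout, stderr and exit status, the information that made the os.system() approach so opaque. The `sh -c` command here stands in for a real `ssh <host> <script>`:

```python
import subprocess

# echo + exit 3 simulate a remote build script that prints something
# and then fails with a meaningful exit code.
proc = subprocess.Popen(
    ["sh", "-c", "echo built ok; exit 3"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)
out, err = proc.communicate()  # collect everything the script printed
status = proc.returncode       # the script's exit status
```

The controller can now log `out` and `err` per build step and branch on `status`, instead of scraping shared files over NFS.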

[...]
So, I feel like I have a couple options,

1) try moving everything to a process oriented configuration - we think
this would be bad, from a resource standpoint as well as it would make
things more difficult to move to a fully distributed system later, when
I get my army of code monkeys.

2) Suck it up and go straight for the distributed system now - managers
don't like this, but maybe its easier than I think its going to be, I dunno

3) See if we can find some other way of getting the threads to terminate.

4) Kill it and clean it up by hand or helper scripts - we don't want to
do this either, its one of the major things we're trying to get away from.
The more you explain, the sillier that feeling looks -- that those
are your options. Focus on the problems you actually have. Track
what build steps worked as expected; log what useful information
you have about the ones that did not.

That "resource standpoint" thing doesn't really make sense. Those
os.system() calls launch *at least* one more process. Some
implementations will launch a process to run a shell, and the
shell will launch another process to run the named command. Even
so, efficiency on the controller machine is not a problem given
the scale you have described.
--
--Bryan

Jul 27 '06 #34
br***********************@yahoo.com <br***********************@yahoo.com> wrote:
Hans wrote:
Is there a way that the program that created and started a thread also stops
it.
(My usage is a time-out).

E.g.

thread = threading.Thread(target=Loop.testLoop)
thread.start() # This thread is expected to finish within a second
thread.join(2) # Or time.sleep(2) ?

No, Python has no threadicide method
Actually it does in the C API, but it isn't exported to python.
ctypes can fix that though.
and its absence is not an oversight. Threads often have important
business left to do, such as releasing locks on shared data; killing
them at arbitrary times tends to leave the system in an inconsistent
state.
Here is a demo of how to kill threads in python in a cross platform
way. It requires ctypes. Not sure I'd use the code in production but
it does work...

"""
How to kill a thread demo
"""

import threading
import time
import ctypes

class ThreadKilledError(Exception): pass

_PyThreadState_SetAsyncExc = ctypes.pythonapi.PyThreadState_SetAsyncExc
# Declare argtypes so the thread id isn't silently truncated on 64-bit platforms
_PyThreadState_SetAsyncExc.argtypes = (ctypes.c_long, ctypes.py_object)
_c_ThreadKilledError = ctypes.py_object(ThreadKilledError)

def _do_stuff(t):
    """Busyish wait for t seconds. Just sleeping delays the exceptions in the example"""
    start = time.time()
    while time.time() - start < t:
        time.sleep(0.01)

class KillableThread(threading.Thread):
    """
    Show how to kill a thread
    """
    def __init__(self, name="thread", *args, **kwargs):
        threading.Thread.__init__(self, *args, **kwargs)
        self.name = name
        print "Starting %s" % self.name

    def kill(self):
        """Kill this thread"""
        print "Killing %s" % self.name
        _PyThreadState_SetAsyncExc(self.id, _c_ThreadKilledError)

    def run(self):
        self.id = threading._get_ident()
        while 1:
            print "Thread %s running" % self.name
            _do_stuff(1.0)

if __name__ == "__main__":
    thread1 = KillableThread(name="thread1")
    thread1.start()
    _do_stuff(0.5)
    thread2 = KillableThread(name="thread2")
    thread2.start()
    _do_stuff(2.0)
    thread1.kill()
    thread1.join()
    _do_stuff(2.0)
    thread2.kill()
    thread2.join()
    print "Done"

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Jul 27 '06 #35
Paul Rubin wrote:
>
Instead of using os.system, maybe you want to use one of the popens or
the subprocess module. For each ssh, you'd spawn off a process that
does the ssh and communicates back to the control process through a
set of file descriptors (Unix pipe endpoints or whatever). The
control process could use either threads or polling/select to talk to
the pipes and keep track of what the subprocesses were doing.
For some insight into what you might need to do to monitor asynchronous
communications, take a look at the parallel/pprocess module, which I
wrote as a convenience for spawning processes using a thread
module-style API whilst providing explicit channels for interprocess
communication:

http://www.python.org/pypi/parallel

Better examples can presumably be found in any asynchronous
communications framework, I'm sure.
I don't think you need anything as complex as shared memory for this.
You're just writing a special purpose chat server.
Indeed. The questioner might want to look at the py.execnet software
that has been presented now at two consecutive EuroPython conferences
(at the very least):

http://indico.cern.ch/contributionDi...d=41&confId=44

Whether this solves the questioner's problems remains to be seen, but
issues of handling SSH-based communications streams do seem to be
addressed.

Paul

Jul 27 '06 #36
Carl J. Van Arsdall wrote:
Well, I guess I'm thinking of an event driven mechanism, kinda like
setting up signal handlers. I don't necessarily know how it works under
the hood, but I don't poll for a signal. I setup a handler, when the
signal comes, if it comes, the handler gets thrown into action. That's
what I'd be interesting in doing with threads.
Note that you see many of the same problems with signal handlers
(including only being able to call reentrant functions from them).

Most advanced Unix programming books say you should treat signal
handlers in a manner similar to what people are advocating for remote
thread stoppage in this thread: unless you're doing something trivial,
your signal handler should just set a global variable. Then your
process can check that variable in the main loop and take more complex
action if it's set.
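On Unix, that recommended pattern looks roughly like this (SIGUSR1 is chosen arbitrarily, and the short wait loop just gives the interpreter a chance to run the handler):

```python
import os
import signal
import time

# The handler only sets a flag; the "main loop" checks the flag and
# takes the complex action outside signal context.
got_signal = []

def handler(signum, frame):
    got_signal.append(signum)    # keep the handler trivial and safe

signal.signal(signal.SIGUSR1, handler)
os.kill(os.getpid(), signal.SIGUSR1)   # deliver the signal to ourselves

for _ in range(100):             # allow the interpreter to run the handler
    if got_signal:
        break
    time.sleep(0.01)

cleaned_up = False
if got_signal:                   # the main loop notices the flag...
    cleaned_up = True            # ...and does the real cleanup here
```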

Jul 27 '06 #37
"Paul Boddie" <pa**@boddie.org.uk> writes:
Whether this solves the questioner's problems remains to be seen, but
issues of handling SSH-based communications streams do seem to be
addressed.
Actually I don't understand the need for SSH. This is traffic over a
LAN, right? Is all of the LAN traffic encrypted? That's unusual; SSH
is normally used to secure connections over the internet, but the
local network is usually trusted. Hopefully it's not wireless.
Jul 27 '06 #38
Paul Rubin wrote:
"Paul Boddie" <pa**@boddie.org.uk> writes:
>Whether this solves the questioner's problems remains to be seen, but
issues of handling SSH-based communications streams do seem to be
addressed.

Actually I don't understand the need for SSH. This is traffic over a
LAN, right? Is all of the LAN traffic encrypted? That's unusual; SSH
is normally used to secure connections over the internet, but the
local network is usually trusted. Hopefully it's not wireless.
The reason for ssh is legacy. I think the person who originally set
things up (it was an 8 node farm at the time) just used ssh to execute
commands on the remote machine. It was a quick and dirty approach I
believe, but at the time, it wasn't worth investing in anything better.
Just setup some keys for each node and use ssh (as opposed to rexec or
something else, I doubt they put much thought into it). Its not a need
as much as I was working with what was handed, and as in many projects,
it becomes difficult to change everything at once. So the goal was to
change what we had to and abstract the ssh calls away so that we could
do something better later.

-c

--

Carl J. Van Arsdall
cv*********@mvista.com
Build and Release
MontaVista Software

Jul 27 '06 #39
Paul Rubin wrote:
"Paul Boddie" <pa**@boddie.org.uk> writes:
Whether this solves the questioner's problems remains to be seen, but
issues of handling SSH-based communications streams do seem to be
addressed.

Actually I don't understand the need for SSH. This is traffic over a
LAN, right? Is all of the LAN traffic encrypted? That's unusual; SSH
is normally used to secure connections over the internet, but the
local network is usually trusted. Hopefully it's not wireless.
Most places I've worked do use a lot of encryption on the LAN. Not
everything on the LAN is encrypted (e.g outgoing http connections) but
a lot of things are. Trusting the whole network is a bad idea, since
it allows a compromise of one machine to turn into a compromise of the
whole LAN.

Jul 27 '06 #40

Paul Rubin wrote:
Actually I don't understand the need for SSH.
Who are you and what have you done with the real Paul Rubin?
This is traffic over a
LAN, right? Is all of the LAN traffic encrypted? That's unusual; SSH
is normally used to secure connections over the internet, but the
local network is usually trusted. Hopefully it's not wireless.
I think not running telnet and rsh daemons is a good policy anyway.
--
--Bryan

Jul 28 '06 #41
Paul Rubin wrote:
"Paul Boddie" <pa**@boddie.org.uk> writes:
Whether this solves the questioner's problems remains to be seen, but
issues of handling SSH-based communications streams do seem to be
addressed.

Actually I don't understand the need for SSH. This is traffic over a
LAN, right? Is all of the LAN traffic encrypted? That's unusual; SSH
is normally used to secure connections over the internet, but the
local network is usually trusted. Hopefully it's not wireless.
I don't run any wireless networks, but given the apparently poor state
of wireless network security (as far as the actual implemented
standards in commercially available products are concerned), I'd want
to be using as much encryption as possible if I did.

Anyway, the py.execnet thing is presumably designed to work over the
Internet and over local networks, with the benefit of SSH being that it
applies well to both domains. Whether it's a better solution for the
questioner's problem than established alternatives such as PVM (which
I've never had the need to look into, even though it seems
interesting), various distributed schedulers or anything else out
there, I can't really say.

Paul

Jul 28 '06 #42
"Paul Boddie" <pa**@boddie.org.uk> writes:
Anyway, the py.execnet thing is presumably designed to work over the
Internet and over local networks, with the benefit of SSH being that it
applies well to both domains. Whether it's a better solution for the
questioner's problem than established alternatives such as PVM (which
I've never had the need to look into, even though it seems
interesting), various distributed schedulers or anything else out
there, I can't really say.
You could use ssh's port forwarding features and just open a normal
TCP connection to a local port that the local ssh server listens to.
Then the ssh server forwards the traffic through an encrypted tunnel
to the other machine. Your application doesn't have to know anything
about ssh.

In fact there's a VPN function (using tun/tap) in recent versions of
ssh that should make it even simpler, but I hvean't tried it yet.
Jul 28 '06 #43
H J van Rooyen <ma**@microcorp.co.za> wrote:
"Paul Rubin" <http://ph****@NOSPAM.invalid> writes:

| "H J van Rooyen" <ma**@microcorp.co.za> writes:
| *grin* - Yes of course - if the WDT was enabled - its something that
| I have not seen on PC's yet...
|
| They are available for PC's, as plug-in cards, at least for the ISA
| bus in the old days, and almost certainly for the PCI bus today.

That is cool, I was not aware of this - added to a long running server it will
help to make the system more stable - a hardware solution to hard to find bugs
in Software - (or even stuff like soft errors in hardware - speak to the
Avionics boys about Neutrons) do you know who sells them and what they are
called? -
When you're talking about a bunch of (multiprocessing) machines on a
LAN, you can have a "watchdog machine" (or more than one, for
redundancy) periodically checking all others for signs of health -- and,
if needed, rebooting the sick machines via ssh (assuming the sickness is
in userland, of course -- to come back from a kernel panic _would_
require HW support)... so (in this setting) you _could_ do it in SW, and
save the $100+ per box that you'd have to spend at some shop such as
<http://www.pcwatchdog.com/> or the like...
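A minimal userland sketch of this watchdog-machine idea follows. The host inventory, the choice of port 22 as the sign-of-health probe, and the key-based `ssh root@... reboot` step are all assumptions for illustration, not a prescription:

```python
import socket
import subprocess

# Illustrative inventory: name -> (address, port to probe)
HOSTS = {"box1": ("10.0.0.1", 22), "box2": ("10.0.0.2", 22)}

def is_healthy(addr, port, timeout=3.0):
    """Crude sign-of-health probe: can we open a TCP connection at all?"""
    try:
        socket.create_connection((addr, port), timeout=timeout).close()
        return True
    except OSError:
        return False

def hosts_to_reboot(health):
    """Pure decision step: the hosts whose probe failed, in stable order."""
    return sorted(name for name, ok in health.items() if not ok)

def watchdog_pass():
    """One pass: probe every box, reboot the sick ones over ssh."""
    health = {name: is_healthy(*hp) for name, hp in HOSTS.items()}
    for name in hosts_to_reboot(health):
        # Assumes key-based root ssh; as noted above, this only helps
        # when the sickness is in userland -- a kernel panic needs HW.
        subprocess.call(["ssh", "root@" + HOSTS[name][0], "reboot"])

if __name__ == "__main__":
    # Demo of the decision step with simulated probe results (no network):
    print(hosts_to_reboot({"box1": True, "box2": False}))  # -> ['box2']
```

In a real deployment `watchdog_pass()` would run from a loop with a sleep between passes, ideally from more than one watchdog machine for redundancy.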
Alex
Aug 3 '06 #44

"Alex Martelli" <al***@mac.comWrote:
| H J van Rooyen <ma**@microcorp.co.zawrote:
|
| "Paul Rubin" <http://ph****@NOSPAM.invalidWrites:
| >
| | "H J van Rooyen" <ma**@microcorp.co.zawrites:
| | *grin* - Yes of course - if the WDT was enabled - its something that
| | I have not seen on PC's yet...
| |
| | They are available for PC's, as plug-in cards, at least for the ISA
| | bus in the old days, and almost certainly for the PCI bus today.
| >
| That is cool, I was not aware of this - added to a long running server it
will
| help to make the system more stable - a hardware solution to hard to find
bugs
| in Software - (or even stuff like soft errors in hardware - speak to the
| Avionics boys about Neutrons) do you know who sells them and what they are
| called? -
|
| When you're talking about a bunch of (multiprocessing) machines on a
| LAN, you can have a "watchdog machine" (or more than one, for
| redundancy) periodically checking all others for signs of health -- and,
| if needed, rebooting the sick machines via ssh (assuming the sickness is
| in userland, of course -- to come back from a kernel panic _would_
| require HW support)... so (in this setting) you _could_ do it in SW, and
| save the $100+ per box that you'd have to spend at some shop such as
| <http://www.pcwatchdog.com/> or the like...
|
|
| Alex

Thanks - will check it out - seems a lot of money for 555 functionality
though....

Especially if like I, you have to pay for it with Rand - I have started to call
the local currency Runt...

(Typical South African Knee Jerk Reaction - everything is too expensive here...
:- ) )

- Hendrik

Aug 3 '06 #45
On 2006-08-03 06:07:31, H J van Rooyen wrote:
Thanks - will check it out - seems a lot of money for 555 functionality
though....

Especially if like I, you have to pay for it with Rand - I have started
to call the local currency Runt...
Depending on what you're up to, you can make such a thing yourself
relatively easily. There are various possibilities, both for the
reset/restart part and for the kick-the-watchdog part.

Since you're talking about a "555" you know at least /some/ electronics :)

Two 555s (or similar):
- One wired as a retriggerable monostable and hooked up to a control line
of a serial port. It needs to be triggered regularly in order to not
trigger the second timer.
- The other wired as a monostable and hooked up to a relay that gets
activated for a certain time when it gets triggered. That relay controls
the computer power line (if you want to stay outside the case) or the reset
switch (if you want to build it into your computer).

I don't do such things with 555s... I'm more a digital guy. There are many
options to do that, and all a lot cheaper than those boards, if you have
more time than money :)
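The software half of this scheme, the kick-the-watchdog part, is trivially small in Python. Here is a sketch with the line-toggling injected as a callable, so the timing loop runs without hardware; with pyserial one might pass something like `lambda: setattr(port, "rts", not port.rts)` (that usage is an assumption to check against your pyserial version):

```python
import time

def kick_watchdog(toggle, kicks, interval=0.0):
    """Retrigger the external monostable `kicks` times, `interval` apart.

    `toggle` is whatever flips the serial control line; it is injected
    here so the loop itself needs no serial port to run.
    """
    for _ in range(kicks):
        toggle()             # pulse the control line; the first 555 retriggers
        time.sleep(interval)

if __name__ == "__main__":
    # Demo with a counting stand-in for the hardware toggle:
    pulses = []
    kick_watchdog(lambda: pulses.append(1), kicks=5)
    print(len(pulses))  # -> 5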

Gerhard

Aug 3 '06 #46
Alex Martelli wrote:
H J van Rooyen <ma**@microcorp.co.za> wrote:

>"Paul Rubin" <http://ph****@NOSPAM.invalid> writes:

| "H J van Rooyen" <ma**@microcorp.co.za> writes:
| *grin* - Yes of course - if the WDT was enabled - its something that
| I have not seen on PC's yet...
|
| They are available for PC's, as plug-in cards, at least for the ISA
| bus in the old days, and almost certainly for the PCI bus today.

That is cool, I was not aware of this - added to a long running server it will
help to make the system more stable - a hardware solution to hard to find bugs
in Software - (or even stuff like soft errors in hardware - speak to the
Avionics boys about Neutrons) do you know who sells them and what they are
called? -

When you're talking about a bunch of (multiprocessing) machines on a
LAN, you can have a "watchdog machine" (or more than one, for
redundancy) periodically checking all others for signs of health -- and,
if needed, rebooting the sick machines via ssh (assuming the sickness is
in userland, of course -- to come back from a kernel panic _would_
require HW support)... so (in this setting) you _could_ do it in SW, and
save the $100+ per box that you'd have to spend at some shop such as
<http://www.pcwatchdog.com/> or the like...
Yea, there are other free solutions you might want to check out, I've
been looking at ganglia and nagios. These require constant
communication with a server, however they are customizable in that you
can have the server take action on various events.

Cheers!

-c
--

Carl J. Van Arsdall
cv*********@mvista.com
Build and Release
MontaVista Software

Aug 3 '06 #47
"Carl J. Van Arsdall" <cv*********@mvista.comwrites:
Yea, there are other free solutions you might want to check out, I've
been looking at ganglia and nagios. These require constant
communication with a server, however they are customizable in that you
can have the server take action on various events. Cheers!
There are some pretty tricky issues with desktop-class PC hardware about
what to do if you need to reconfigure or reboot one remotely. Real
server hardware is better equipped for this but costs a lot more.

I remember something called "PC-Weasel" which was an ISA-bus plug-in
card that was basically a VGA card with an ethernet port. That let
you see the bootup screens remotely, adjust the cmos settings, etc. I
remember trying without success to find something like that for the
PCI bus. Without something like that, all you can really do if a PC
used as a server gets wedged is remote-reset or power-cycle it; even that of
course takes special hardware, but many colo places are already set up
for that.
Aug 3 '06 #48

"Carl J. Van Arsdall" <cv*********@mvista.comwrote:
| Alex Martelli wrote:
| H J van Rooyen <ma**@microcorp.co.zawrote:
| >
| >
| >"Paul Rubin" <http://ph****@NOSPAM.invalidWrites:
| >>
| >| "H J van Rooyen" <ma**@microcorp.co.zawrites:
| >| *grin* - Yes of course - if the WDT was enabled - its something that
| >| I have not seen on PC's yet...
| >|
| >| They are available for PC's, as plug-in cards, at least for the ISA
| >| bus in the old days, and almost certainly for the PCI bus today.
| >>
| >That is cool, I was not aware of this - added to a long running server it
will
| >help to make the system more stable - a hardware solution to hard to find
bugs
| >in Software - (or even stuff like soft errors in hardware - speak to the
| >Avionics boys about Neutrons) do you know who sells them and what they are
| >called? -
| >>
| >
| When you're talking about a bunch of (multiprocessing) machines on a
| LAN, you can have a "watchdog machine" (or more than one, for
| redundancy) periodically checking all others for signs of health -- and,
| if needed, rebooting the sick machines via ssh (assuming the sickness is
| in userland, of course -- to come back from a kernel panic _would_
| require HW support)... so (in this setting) you _could_ do it in SW, and
| save the $100+ per box that you'd have to spend at some shop such as
| <http://www.pcwatchdog.com/> or the like...
| >
| >
| >
| Yea, there are other free solutions you might want to check out, I've
| been looking at ganglia and nagios. These require constant
| communication with a server, however they are customizable in that you
| can have the server take action on various events.
|
| Cheers!
|
| -c
Thanks - will have a look - Hendrik

Aug 4 '06 #49

"Gerhard Fiedler" <ge*****@gmail.comwrote:

| On 2006-08-03 06:07:31, H J van Rooyen wrote:
|
| Thanks - will check it out - seems a lot of money for 555 functionality
| though....
| >
| Especially if like I, you have to pay for it with Rand - I have started
| to call the local currency Runt...
|
| Depending on what you're up to, you can make such a thing yourself
| relatively easily. There are various possibilities, both for the
| reset/restart part and for the kick-the-watchdog part.
|
| Since you're talking about a "555" you know at least /some/ electronics :)

*grin* You could say that - original degree was Physics and Maths ...

| Two 555s (or similar):
| - One wired as a retriggerable monostable and hooked up to a control line
| of a serial port. It needs to be triggered regularly in order to not
| trigger the second timer.
| - The other wired as a monostable and hooked up to a relay that gets
| activated for a certain time when it gets triggered. That relay controls
| the computer power line (if you want to stay outside the case) or the reset
| switch (if you want to build it into your computer).
|
| I don't do such things with 555s... I'm more a digital guy. There are many
| options to do that, and all a lot cheaper than those boards, if you have
| more time than money :)

Likewise - some 25 years of amongst other things designing hardware and
programming 8051 and DSP type processors in assembler...

The 555 came to mind because it has been around forever - and as someone once
said (Steve Ciarcia?) -
"My favourite programming language is solder"... - a dumb state machine
implemented in hardware beats a processor every time when it comes to
reliability - it's just a tad inflexible...

The next step above the 555 is a PIC... then you can steal power from the RS-232
line - and it's a small step from "PIC" to "PIG"...

Although this is getting a bit off-topic on a language group...

;-) Hendrik
Aug 4 '06 #50
