469,949 Members | 2,613 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,949 developers. It's quick & easy.

Threads modify "global" variable -- asking for trouble?

I have been experimenting with some thread programming, but as I'm
doing this on my own I am worried I might be making a major mistake.

Here's a brief rundown of what I am working on. Multiple threads, via
Queue, are used to perform RBL checks on an IP. The threads are passed
a defined class (ConnectionScore) in their init. This is used to
collect the results, including a method to add the result. If I run
the program, everything goes fine and once all the threads complete the
main program has all the data from their RBL checks in the
ConnectionScore class.

It seems to work, even with an obscene (and silly) number of WORKERS
and RBLs to check. But I was reading in the Python Cookbook this
evening and it raised the issue in my mind of what happens if multiple
threads try to add data to the class at the same time.

I've attached the code below. The ConnectionClass.addRBL method just
takes the result and adds it to a dictionary object attached to the
class.
#!/usr/bin/python
from UCE_weight import ConnectionScore
from rbl import *
import Queue
import threading
import time, random

starttime = time.time()

WORKERS=5

class Worker(threading.Thread):

def __init__(self,queue,score):
self.__queue = queue
threading.Thread.__init__(self)
self.score = score

def run(self):
while 1:
RBL = self.__queue.get()
# Are we at the end of the queue?
if RBL is None: break
self.result = checkRBL("127.0.0.2",RBL,True,5)
score.addRBL(RBL,self.result[0],self.result[1])

queue = Queue.Queue(0)
score = ConnectionScore()

f = open("rbl.list",'r')
MEGARBL = []
while 1:
line = f.readline().strip()
if not line: break
MEGARBL.append(line)
f.close()
for i in range(WORKERS):
Worker(queue,score).start() # start a worker

for i in MEGARBL:
queue.put(i)

for i in range(WORKERS):
queue.put(None) # add end of queue marker

while 1:
# wait for the queue???
if queue.empty():
break
else:
time.sleep(0.1)
elapsed = time.time() - starttime
rate = len(MEGARBL)/elapsed
rate = str(round(rate,2))
elapsed = str(round(elapsed,2))

score.showRBL()
print "Checked",str(len(MEGARBL)),"zones in",elapsed,"seconds.
(",rate.strip(),"per second )"

Mar 17 '06 #1
4 2332

My apologizes, I missed the newish FAQ entry on this. The addrbl()
method looks like this:

def addRBL(self, testname, result, info=""):
self.testresultsRBL[testname] = result, info

So according to the FAQ, D[x] = y, where D is a dictionary, is atomic
and therefore thread-safe. Right?

Mar 17 '06 #2
J Rice <ri**********@gmail.com> wrote:
My apologizes, I missed the newish FAQ entry on this. The addrbl()
method looks like this:

def addRBL(self, testname, result, info=""):
self.testresultsRBL[testname] = result, info

So according to the FAQ, D[x] = y, where D is a dictionary, is atomic
and therefore thread-safe. Right?


In the current implementation specifically, for CPython exclusively, if
the hashing of testname is itself somehow 'atomic', and so is the access
to attribute testresultsRBL of object self, I do believe you might luck
out and end up "atomic" (by accident of implementation, only) if
everything's just right. Of course, any tiny change (any different
implementation of Python, any type that's not a pure primitive, any
future version of Python, etc, etc) could break this extremely fragile
set of circumstances -- Python as a language makes no guarantees of
atomicity for anything except the synchronization primitives (of which
Queue is the most powerful). There was a rather heated exchange on the
subject quite recently on this group.

Just put a lock acquire/release around this assignment (and any _use_ of
the same dictionary) and you should be vastly safer, IMNSHO.
Alex
Mar 17 '06 #3

"J Rice" <ri**********@gmail.com> wrote in message
news:11********************@i40g2000cwc.googlegrou ps.com...

My apologizes, I missed the newish FAQ entry on this. The addrbl()
method looks like this:

def addRBL(self, testname, result, info=""):
self.testresultsRBL[testname] = result, info

So according to the FAQ, D[x] = y, where D is a dictionary, is atomic
and therefore thread-safe. Right?


I believe an alternative, mentioned elsewhere by Tim Peters, is for workers
to send results (here (testname,result,info) ) back in a results queue read
by the main thread.

tjr

tjr

Mar 17 '06 #4
Thank you. Implementing a results queue was much simpler than I
expected, and I think as I add this into the rest of the program it
will avoid a lot of potential problems later too.

Mar 18 '06 #5

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

11 posts views Thread by mrbog | last post: by
1 post views Thread by Chris Stromberger | last post: by
8 posts views Thread by Hans Greif | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.