Alban Hertroys wrote:
> Jeremy Jones wrote:
>> (not waiting, because it already did happen). What is it exactly
>> that you are trying to accomplish? I'm sure there is a better approach.
> I think I saw at least a bit of the light, reading up on readers and
> writers (A colleague showed up with a book called "Operating system
> concepts" that has a chapter on process synchronization).
> It looks like I should be writing and reading 3 Queues instead of
> trying to halt and pause the threads explicitly. That looks a lot
> easier...
> Thanks for pointing out the problem area.
That's actually along the lines of what I was going to recommend after
getting more detail on what you are doing. A couple of things that may
(or may not) help you are:
* the Queue class in the Python standard library has a "maxsize"
parameter. When you create a queue, you can specify how large you want
it to grow. You can have your three threads busily parsing XML and
extracting data from it and putting it into a queue and when there are a
total of "maxsize" items in the queue, the next put() call (to put data
into the queue) will block until the consumer thread has reduced the
number of items in the queue. I've never used
xml.parsers.xmlproc.xmlproc.Application, but looking at the data, it
seems to resemble a SAX parser, so you should have no problem putting
(potentially blocking) calls to the queue into your handler. The only
thing this really buys you is that you won't have to read the whole XML
file into memory.
* the get method on a queue object has a "block" flag. You can
effectively poll your queues something like this:
    # untested code
    # a_done, b_done and c_done are just checks to see if that particular
    # document is done
    import Queue  # this module is named "queue" in Python 3
    import time

    while not (a_done and b_done and c_done):
        got_a, got_b, got_c = False, False, False
        item_a, item_b, item_c = None, None, None
        while (not a_done) and (not got_a):
            try:
                # the 0 says don't block and raise an Empty exception
                # if there's nothing there
                item_a = queue_a.get(0)
                got_a = True
            except Queue.Empty:
                time.sleep(.3)
        while (not b_done) and (not got_b):
            try:
                item_b = queue_b.get(0)
                got_b = True
            except Queue.Empty:
                time.sleep(.3)
        while (not c_done) and (not got_c):
            try:
                item_c = queue_c.get(0)
                got_c = True
            except Queue.Empty:
                time.sleep(.3)
        put_into_database_or_whatever(item_a, item_b, item_c)
This will allow you to deal with one item at a time, and if the XML files
are different sizes, it should still work - you'll just pass None to
put_into_database_or_whatever for that particular file.
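For what it's worth, here is a rough, self-contained sketch of how the two ideas above (bounded queues and one item per document per round) might fit together. It is written for Python 3, where the module is named queue. The names DONE, producer, consume and run are made up for illustration, and the sentinel object is just one assumed way of implementing the "done" checks. The producer is a stand-in for a parser handler, not anything from xmlproc.

```python
import queue
import threading

# Hypothetical sentinel: a producer puts this once its document is finished.
DONE = object()


def producer(items, q):
    """Stand-in for a parser handler: push each extracted item onto q."""
    for item in items:
        q.put(item)  # blocks while q already holds maxsize items
    q.put(DONE)


def consume(queues):
    """Take one item per queue per round; use None for finished documents."""
    done = [False] * len(queues)
    rows = []
    while True:
        row = []
        for i, q in enumerate(queues):
            item = None
            if not done[i]:
                item = q.get()  # blocking get, so no polling or sleeping
                if item is DONE:
                    done[i] = True
                    item = None
            row.append(item)
        if all(done):
            break  # every queue reported DONE, nothing left to store
        rows.append(tuple(row))
    return rows


def run(documents, maxsize=2):
    """Start one producer thread per document and collect the rows."""
    queues = [queue.Queue(maxsize=maxsize) for _ in documents]
    threads = [threading.Thread(target=producer, args=(doc, q))
               for doc, q in zip(documents, queues)]
    for t in threads:
        t.start()
    rows = consume(queues)
    for t in threads:
        t.join()
    return rows
```

Calling run([[1, 2, 3], ["a"], ["x", "y"]]) should hand back one tuple per round, with None filling in for the shorter documents. In real code each tuple would go to put_into_database_or_whatever instead of a list. Using a blocking get() with a sentinel avoids the sleep loop, at the cost of requiring every producer to signal when it is finished.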
HTH.
Jeremy Jones