This has become a bit of a mixed reply, but I think it's better for
understanding the problem I'm trying to solve.
Istvan Albert wrote:
Alban Hertroys wrote:
my problem is that psycopg lets my threads end before the inserts
actually took place, resulting in my collecting thread finding no
Make sure you commit the inserts. Otherwise you might
simply end up selecting on the old view.
There is a commit when you close the db connection so the data is
there when you check it later. When migrating from dbs
without transaction support this can be very confusing.
But PostgreSQL does have transaction support... I rely on that. It is
one of the reasons I chose Python (+psycopg).
The script I'm working on does bulk inserts from multiple related XML
files (parsed using the SAX parser), with the threads taking turns
inserting small batches of XML records from those files. The collection
thread combines these into one XML record, which is why it's so
important that the inserts are done in time.
I can't commit until all the data has been inserted and combined. The
commit shouldn't happen until the end of the main thread is reached.
If the server goes down, or there's another reason it can't continue
parsing, the whole transaction should roll back. Committing in between
would be 'problematic' (where in the XML files were we interrupted? Hard
to tell).
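
To make the all-or-nothing requirement concrete, here is a minimal sketch.
It is not my actual script: it uses the standard library's sqlite3 module
as a stand-in for psycopg (both follow the DB-API 2.0 connection/cursor
interface), and the table `records`, the function `load_all` and the
sample batches are made up for illustration.

```python
import sqlite3


def load_all(conn, batches):
    """Insert every batch in one transaction.

    Commit once at the very end; roll back everything if any batch fails.
    """
    cur = conn.cursor()
    try:
        for batch in batches:
            cur.executemany(
                "INSERT INTO records (payload) VALUES (?)",
                [(item,) for item in batch],
            )
        conn.commit()    # the only commit, after all data is in
        return True
    except Exception:
        conn.rollback()  # discard every insert made since the last commit
        return False


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]


# Hypothetical in-memory database standing in for the PostgreSQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (payload TEXT NOT NULL)")
conn.commit()

ok = load_all(conn, [["a", "b"], ["c"]])
# A None payload violates NOT NULL, so this whole load is rolled back,
# including the "d" row that was inserted before the failure.
failed = load_all(conn, [["d"], [None]])
```

The point is that a failure halfway through leaves the table exactly as
it was after the last commit, so there is no need to work out where in
the XML files the run was interrupted.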
Also, I don't think I can join the threads (as someone else suggested),
as they are still working in an Application instance (part of the SAX
parser). The threads wait until they're allowed to continue, controlled
by a linked list of Events (so that I can remove events for threads that
have finished, which shouldn't happen, but might). Unless I misunderstand
thread joining, of course.
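
As far as I understand it, join() only returns once a thread has
terminated, which is why it doesn't fit threads that stay alive inside
the parser. Below is a rough sketch of the kind of hand-off I have in
mind instead. It uses plain lists of threading.Event objects rather than
my linked list; the names `worker`, `turn`, `done`, `shutdown` and
`inserted` are invented for this example, and appending to a list stands
in for the real database insert.

```python
import threading

NUM_WORKERS = 3
inserted = []   # stand-in for rows inserted into the database
turn = [threading.Event() for _ in range(NUM_WORKERS)]  # "you may continue"
done = [threading.Event() for _ in range(NUM_WORKERS)]  # "my batch is inserted"
shutdown = threading.Event()


def worker(i):
    while True:
        turn[i].wait()    # block until the collector lets this worker run
        turn[i].clear()
        if shutdown.is_set():
            return
        inserted.append("batch from worker %d" % i)  # real code would INSERT here
        done[i].set()     # tell the collector the insert has happened


threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_WORKERS)]
for t in threads:
    t.start()

combined = []
for round_no in range(2):
    for i in range(NUM_WORKERS):
        turn[i].set()     # let worker i insert one batch
        done[i].wait()    # wait for the insert, not for the thread to end
        done[i].clear()
    combined.append(list(inserted))  # read only after every insert is done
    del inserted[:]

# The threads only finish at shutdown, so only now can join() return.
shutdown.set()
for ev in turn:
    ev.set()
for t in threads:
    t.join()
```

Here the collector waits on a per-worker "done" event after each batch,
so it never reads before the insert has happened, while the worker
threads themselves stay alive between turns.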
I have to admit that I have very little experience with thread
programming, and this is (part of) my first Python program. It is
definitely a steep learning curve; I hope I don't fall back down too often.
So far, Python has been nice to me :)
Alban.