Bytes IT Community

locks

hi!

what would happen if i try to access a variable locked by another thread? i
am not trying to obtain a lock on it, just trying to access it.

cheers

Jul 18 '05 #1
14 Replies


Ajay wrote:
> what would happen if i try to access a variable locked by another thread?
> i am not trying to obtain a lock on it, just trying to access it.

If you first tell us how you actually lock a variable, we then might be able
to tell you what happens if you access it....

And in general: python has the PIL - Python Interpreter Lock - that
"brutally" serializes (hopefully) all accesses to python data-structures -
so e.g. running several threads, appending to the same list, won't result
in messing up the internal list structure causing segfaults or the like.
That makes programming pretty easy, at the cost of lots of waiting for the
individual threads.
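The serialization Diez describes is easy to see in a minimal sketch (written for modern CPython; the thread itself dates from Python 2, but the behaviour is the same): each individual list.append runs as a single C-level operation under the interpreter lock, so no appends are lost even with many threads sharing one list.

```python
import threading

def worker(shared, n):
    # Each list.append is implemented in C and executes under the GIL,
    # so no two threads can corrupt the list's internal structure.
    for i in range(n):
        shared.append(i)

shared = []
threads = [threading.Thread(target=worker, args=(shared, 10000))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(shared))  # 80000 -- every append survived, no segfaults
```

Note this only shows that a *single* append is safe; it says nothing about compound operations spanning several statements.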

--
Regards,

Diez B. Roggisch
Jul 18 '05 #2

On Wed, 2004-10-13 at 14:11 +0200, Diez B. Roggisch wrote:
> Ajay wrote:
> > what would happen if i try to access a variable locked by another thread?
> > i am not trying to obtain a lock on it, just trying to access it.
>
> If you first tell us how you actually lock a variable, we then might be able
> to tell you what happens if you access it....
>
> And in general: python has the PIL - Python Interpreter Lock - that

I think you mean the GIL (Global Interpreter Lock). PIL is the
excellent Python Imaging Library.
"brutally" serializes (hopefully) all accesses to python data-structures -
Nope. It doesn't do this. For access to items such as integers you are
probably fine, but for things like lists, dictionaries, class
attributes, etc, you're on your own. The GIL only ensures that two
threads won't be executing Python bytecode simultaneously. It locks the
Python *interpreter*, not your program or data structures.
so e.g. running several threads, appending to the same list, won't result
in messing up the internal list structure causing segfaults or the like.
True, you won't get segfaults. However, you may very well get a
traceback or mangled data.
That makes programming pretty easy, at the cost of lots of waiting for the
individual threads.


Threading in Python is pretty easy, but certainly not *that* easy. And
just to be certain, importing PIL won't help you here either <wink>.
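To make the distinction concrete: any operation spanning a check plus a mutation needs an explicit lock, because the GIL can hand control to another thread between the two steps. A minimal sketch (using the `with` statement of modern Python; in 2004 one would write explicit acquire/release):

```python
import threading

items = []
lock = threading.Lock()

def add_unique(item):
    # The membership test and the append must happen under one lock;
    # otherwise another thread could insert the same item in between.
    with lock:
        if item not in items:
            items.append(item)

threads = [threading.Thread(target=add_unique, args=(42,))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(items)  # [42] -- added exactly once despite eight racing threads
```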

Regards,
Cliff

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #3

Diez B. Roggisch wrote:
> And in general: python has the PIL - Python Interpreter Lock -


the lock is usually known as GIL (Global Interpreter Lock). see:

http://docs.python.org/api/threads.html

PIL is something else:

http://www.google.com/search?q=PIL

</F>

Jul 18 '05 #4

> I think you mean the GIL (Global Interpreter Lock). PIL is the
> excellent Python Imaging Library.

I certainly did - too little caffeine in the system yet...

> Nope. It doesn't do this. For access to items such as integers you are
> probably fine, but for things like lists, dictionaries, class
> attributes, etc, you're on your own. The GIL only ensures that two
> threads won't be executing Python bytecode simultaneously. It locks the
> Python *interpreter*, not your program or data structures.
> <snip>
> True, you won't get segfaults. However, you may very well get a
> traceback or mangled data.
I thought that e.g. list manipulations are single bytecodes, thus atomic.
So far, I never ran into serious problems giving me garbage lists or
stack traces.

Nevertheless, I of course used queues and locked access to certain
data structures when critical sections had to be entered - but in
comparison to java, I never had to ask for a specially thread-hardened
variant of a collection.

> Threading in Python is pretty easy, but certainly not *that* easy. And
> just to be certain, importing PIL won't help you here either <wink>.

Unless you plan to do some nifty image manipulation work multithreaded....
--
Regards,

Diez B. Roggisch
Jul 18 '05 #5

Cliff Wells wrote:
> On Wed, 2004-10-13 at 14:11 +0200, Diez B. Roggisch wrote:
> > "brutally" serializes (hopefully) all accesses to python data-structures -
>
> Nope. It doesn't do this. For access to items such as integers you are
> probably fine, but for things like lists, dictionaries, class
> attributes, etc, you're on your own. The GIL only ensures that two
> threads won't be executing Python bytecode simultaneously. It locks the
> Python *interpreter*, not your program or data structures.
>
> > so e.g. running several threads, appending to the same list, won't result
> > in messing up the internal list structure causing segfaults or the like.
>
> True, you won't get segfaults. However, you may very well get a
> traceback or mangled data.
>
> > That makes programming pretty easy, at the cost of lots of waiting for the
> > individual threads.
>
> Threading in Python is pretty easy, but certainly not *that* easy.


Cliff, do you have any references, or even personal experience to
relate about anything on which you comment above?

In my experience, and to my knowledge, Python threading *is*
that easy (ignoring higher level issues such as race conditions
and deadlocks and such), and the GIL *does* do exactly what Diez
suggests, and you will *not* get tracebacks nor (again, ignoring
higher level issues) mangled data.

You've tentatively upset my entire picture of the CPython (note,
CPython only) interpreter's structure and concept. Please tell
me you were going a little overboard to protect a possible
newbie from himself or something.

-Peter
Jul 18 '05 #6

On Wed, 2004-10-13 at 08:52 -0400, Peter L Hansen wrote:
> Cliff Wells wrote:
> > On Wed, 2004-10-13 at 14:11 +0200, Diez B. Roggisch wrote:
> > > "brutally" serializes (hopefully) all accesses to python data-structures -
> >
> > Nope. It doesn't do this. For access to items such as integers you are
> > probably fine, but for things like lists, dictionaries, class
> > attributes, etc, you're on your own. The GIL only ensures that two
> > threads won't be executing Python bytecode simultaneously. It locks the
> > Python *interpreter*, not your program or data structures.
> >
> > > so e.g. running several threads, appending to the same list, won't result
> > > in messing up the internal list structure causing segfaults or the like.
> >
> > True, you won't get segfaults. However, you may very well get a
> > traceback or mangled data.
> >
> > > That makes programming pretty easy, at the cost of lots of waiting for the
> > > individual threads.
> >
> > Threading in Python is pretty easy, but certainly not *that* easy.
>
> Cliff, do you have any references, or even personal experience to
> relate about anything on which you comment above?


I'm no expert on Python internals, but it seems clear that an operation
such as [].append() is going to span multiple bytecode instructions. It
seems to me that if those instructions cross the boundary defined by
sys.getcheckinterval(), the operation won't complete within a single
thread time slice (unless the interpreter has explicit code to keep
the entire operation within a single context).
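(A side note for readers on modern Python, since this thread predates it: the bytecode-count check interval discussed here was replaced in CPython 3.2 by a time-based switch interval. A quick sketch of the current API:)

```python
import sys

# CPython 2.x considered a thread switch every N bytecode instructions
# (sys.getcheckinterval()).  Since CPython 3.2 the interpreter instead
# offers to switch threads every few milliseconds:
print(sys.getswitchinterval())  # 0.005 seconds by default

# The interval is tunable, e.g. to make switches rarer:
sys.setswitchinterval(0.01)
```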

I'm no expert at dis nor Python bytecode, but I'll give it a shot :)

>>> l = []
>>> dis.dis(l.append(1))
134           0 LOAD_GLOBAL              0 (findlabels)
              3 LOAD_FAST                0 (code)
              6 CALL_FUNCTION            1
              9 STORE_FAST               5 (labels)
....
<snip dis spitting out over 500 lines of bytecode>
....

172     >>  503 PRINT_NEWLINE
            504 JUMP_ABSOLUTE           33
        >>  507 POP_TOP
            508 POP_BLOCK
        >>  509 LOAD_CONST               0 (None)
            512 RETURN_VALUE

It looks fairly non-atomic to me. It's certainly smaller than the
default value for sys.getcheckinterval() (which defaults to 100, iirc),
but that's hardly a guarantee that the operation won't cross the
boundary for a context switch (unless, as I mentioned above, the
interpreter has specific code to prevent the switch until the operation
is complete <shrug>).

I recall a similar discussion about three years ago on this list about
this very thing where people who know far more about it than I do flamed
it out a bit, but damned if I recall the outcome :P I do recall that it
didn't convince me to alter the approach I recommended to the OP.
> In my experience, and to my knowledge, Python threading *is*
> that easy (ignoring higher level issues such as race conditions
> and deadlocks and such), and the GIL *does* do exactly what Diez
> suggests, and you will *not* get tracebacks nor (again, ignoring
> higher level issues) mangled data.
Okay, to clarify, for the most part I *was* in fact referring to "higher
level issues". I doubt tracebacks or mangled data would occur simply
due to the operation's being non-atomic. However, if you have code that
say, checks for an item's existence in a list and then appends it if it
isn't there, it may cause the program to fail if another thread adds
that item between the time of the check and the time of the append.
This is what I was referring to by potential for mangled data and/or
tracebacks.
> You've tentatively upset my entire picture of the CPython (note,
> CPython only) interpreter's structure and concept. Please tell
> me you were going a little overboard to protect a possible
> newbie from himself or something.


Certainly protecting the newbie, but not going overboard, IMHO. I've
written quite a number of threaded Python apps and I religiously
acquire/release whenever dealing with mutable data structures (lists,
etc). To date this approach has served me well. I code fairly
conservatively when it comes to threads as I am *absolutely* certain
that debugging a broken threaded application is very near the bottom of
my list of favorite things ;)
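The conservative acquire/release discipline described above looks like this in a minimal sketch (explicit acquire/release was the idiom of the day; the try/finally guarantees the lock is released even if the update raises):

```python
import threading

lock = threading.Lock()
data = {}

def update(key, value):
    # Acquire before touching shared mutable state, release in finally
    # so an exception in the critical section can't deadlock other threads.
    lock.acquire()
    try:
        data[key] = value
    finally:
        lock.release()

update("answer", 42)
print(data)  # {'answer': 42}
```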

Regards,
Cliff

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #7

Cliff Wells wrote:
> I'm no expert at dis nor Python bytecode, but I'll give it a shot :)
>
> >>> l = []
> >>> dis.dis(l.append(1))
> 134           0 LOAD_GLOBAL              0 (findlabels)
>               3 LOAD_FAST                0 (code)
>               6 CALL_FUNCTION            1
>               9 STORE_FAST               5 (labels)
> ...
> <snip dis spitting out over 500 lines of bytecode>
> ...
>
> 172     >>  503 PRINT_NEWLINE
>             504 JUMP_ABSOLUTE           33
>         >>  507 POP_TOP
>             508 POP_BLOCK
>         >>  509 LOAD_CONST               0 (None)
>             512 RETURN_VALUE
>
> It looks fairly non-atomic to me.


The append method of a list returns None. dis.dis(None) disassembles the
code from the last traceback object, nothing at all to do with your
l.append(1) code.

Try this instead:
def f(): l.append(1)

dis.dis(f)

2 0 LOAD_GLOBAL 0 (l)
3 LOAD_ATTR 1 (append)
6 LOAD_CONST 1 (1)
9 CALL_FUNCTION 1
12 POP_TOP
13 LOAD_CONST 0 (None)
16 RETURN_VALUE
Jul 18 '05 #8

> Okay, to clarify, for the most part I *was* in fact referring to "higher
> level issues". I doubt tracebacks or mangled data would occur simply
> due to the operation's being non-atomic. However, if you have code that
> say, checks for an item's existence in a list and then appends it if it
> isn't there, it may cause the program to fail if another thread adds
> that item between the time of the check and the time of the append.
> This is what I was referring to by potential for mangled data and/or
> tracebacks.


_That_ of course I'm very well aware of - but in my experience, with several
dozen threads appending to one list I never encountered an interpreter
failure. That is in contrast to java, where you get a
"ConcurrentModificationException" unless you specifically ask for a
synchronized variant of your collection.
--
Regards,

Diez B. Roggisch
Jul 18 '05 #9

On Wed, 2004-10-13 at 14:03 +0000, Duncan Booth wrote:
> Cliff Wells wrote:
> > I'm no expert at dis nor Python bytecode, but I'll give it a shot :)
> >
> > >>> l = []
> > >>> dis.dis(l.append(1))
> > 134           0 LOAD_GLOBAL              0 (findlabels)
> >               3 LOAD_FAST                0 (code)
> >               6 CALL_FUNCTION            1
> >               9 STORE_FAST               5 (labels)
> > ...
> > <snip dis spitting out over 500 lines of bytecode>
> > ...
> >
> > 172     >>  503 PRINT_NEWLINE
> >             504 JUMP_ABSOLUTE           33
> >         >>  507 POP_TOP
> >             508 POP_BLOCK
> >         >>  509 LOAD_CONST               0 (None)
> >             512 RETURN_VALUE
> >
> > It looks fairly non-atomic to me.
>
> The append method of a list returns None. dis.dis(None) disassembles the
> code from the last traceback object, nothing at all to do with your
> l.append(1) code.


Ah, thanks. I thought 500+ lines of bytecode was a bit excessive for a
simple append(), but didn't see any reason why. I saw the comment in
the docs about dis returning the last traceback if no argument was
provided but didn't see how that applied here.

> Try this instead:
>
> >>> def f(): l.append(1)
> ...
> >>> dis.dis(f)
>   2           0 LOAD_GLOBAL              0 (l)
>               3 LOAD_ATTR                1 (append)
>               6 LOAD_CONST               1 (1)
>               9 CALL_FUNCTION            1
>              12 POP_TOP
>              13 LOAD_CONST               0 (None)
>              16 RETURN_VALUE


Much more reasonable. Still, I think my argument stands since this
appears non-atomic as well, although I do note this:

>>> l = []
>>> dis.dis(l.append)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.3/dis.py", line 46, in dis
    raise TypeError, \
TypeError: don't know how to disassemble builtin_function_or_method
objects


This suddenly gave me a smack on the head that list.append is
undoubtedly written in C and might, in fact, retain the GIL for the
duration of the function in which case the operation might, in fact, be
atomic (yes, I know that isn't necessarily what the above traceback was
saying, but it served as a clue-stick).

Still, despite the probability of being quite mistaken about the
low-level internals of the operation, I still stand by my assertion that
not using locks for mutable data is ill-advised at best, for the reasons
I outlined in my previous post (aside from the poorly executed
disassembly).

Regards,
Cliff

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #10

On Wed, 2004-10-13 at 16:10 +0200, Diez B. Roggisch wrote:
> > Okay, to clarify, for the most part I *was* in fact referring to "higher
> > level issues". I doubt tracebacks or mangled data would occur simply
> > due to the operation's being non-atomic. However, if you have code that
> > say, checks for an item's existence in a list and then appends it if it
> > isn't there, it may cause the program to fail if another thread adds
> > that item between the time of the check and the time of the append.
> > This is what I was referring to by potential for mangled data and/or
> > tracebacks.
>
> _That_ of course I'm very well aware of - but in my experience, with several
> dozen threads appending to one list I never encountered an interpreter
> failure. That is in contrast to java, where you get a
> "ConcurrentModificationException" unless you specifically ask for a
> synchronized variant of your collection.


Have you looked at the Queue module? It was explicitly designed for
this sort of thing and removes all doubt about thread-safety.
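A minimal producer/consumer sketch of the Queue module (shown with the Python 3 module name `queue`; in the Python of this thread it was `Queue`):

```python
import queue      # was "Queue" in Python 2
import threading

q = queue.Queue()

def producer():
    for i in range(5):
        q.put(i)      # put() does all the locking internally
    q.put(None)       # sentinel value telling the consumer to stop

results = []

def consumer():
    while True:
        item = q.get()    # blocks until an item is available
        if item is None:
            break
        results.append(item)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()

print(results)  # [0, 1, 2, 3, 4] -- FIFO order, no explicit locks needed
```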

Regards,
Cliff

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #11

> Have you looked at the Queue module? It was explicitly designed for
> this sort of thing and removes all doubt about thread-safety.

Sure, and I use it when appropriate, as said in response to another post of
yours.

But so far in my experience, explicit serialization of access to certain
data structures was only necessary when more complicated structural
modifications were under way - but the usual suspects, such as appending to
lists, insertion of values in dicts and the like, never needed this. And
that is what I wanted to point out.

--
Regards,

Diez B. Roggisch
Jul 18 '05 #12

Cliff Wells wrote:
> This suddenly gave me a smack on the head that list.append is
> undoubtedly written in C and might, in fact, retain the GIL for the
> duration of the function in which case the operation might, in fact, be
> atomic (yes, I know that isn't necessarily what the above traceback was
> saying, but it served as a clue-stick).


Roughly correct. list.append is written in C and therefore you might assume
it is atomic. However, it increases the size of a list which means it may
allocate memory which could cause the garbage collector to kick in which in
turn might free up cycles which could release references to objects with
__del__ methods which could execute other byte code at which point all bets
are off.

Jul 18 '05 #13

[Cliff Wells]
> This suddenly gave me a smack on the head that list.append is
> undoubtedly written in C and might, in fact, retain the GIL for the
> duration of the function in which case the operation might, in fact, be
> atomic (yes, I know that isn't necessarily what the above traceback was
> saying, but it served as a clue-stick).

[Duncan Booth]
> Roughly correct. list.append is written in C and therefore you might assume
> it is atomic. However, it increases the size of a list which means it may
> allocate memory which could cause the garbage collector to kick in which in
> turn might free up cycles which could release references to objects with
> __del__ methods which could execute other byte code at which point all bets
> are off.


Not in CPython, no. The only things that can trigger CPython's cyclic
gc are calling gc.collect() explicitly, or (from time to time)
creating a new container object (a new object that participates in
cyclic gc). If list.append() can't get enough memory "on the first
try" to extend the existing list, it raises MemoryError. And if a
thread is in the bowels of list.append, the GIL prevents any other
thread from triggering cyclic GC for the duration.
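The explicit-call trigger mentioned above is easy to demonstrate with the `gc` module (a small sketch; the `Node` class is purely illustrative):

```python
import gc

class Node:
    pass

# Build a reference cycle and drop every external reference to it:
# refcounting alone can never reclaim it.
a, b = Node(), Node()
a.partner, b.partner = b, a
del a, b

gc.disable()           # suppress the container-allocation trigger
found = gc.collect()   # the explicit trigger: run cyclic GC now
gc.enable()

print(found)  # number of unreachable objects found, includes both Nodes
```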
Jul 18 '05 #14

Diez B. Roggisch wrote:
> But so far in my experience, explicit serialization of access to certain
> data structures was only necessary when more complicated structural
> modifications were under way - but the usual suspects, such as appending to
> lists, insertion of values in dicts and the like, never needed this. And
> that is what I wanted to point out.


... provided, of course, that no thread expects to maintain internal
consistency. For example, a thread doing something like:

foo_list.append(foo)
assert(foo == foo_list[-1])

The assertion here is *not* guaranteed to be true, and if one is
modifying and then reading a mutable object, it can be somewhat tricky
to ensure that no such assumptions creep into code.

Similarly, in

if foo_list[-1] is not None:
foo = foo_list.pop()

Here, foo may indeed be None, because another thread may have appended
to the list in between the test and the call to pop() -- but this is
getting into the "more complicated structural modifications" that you
mention.

A slightly trickier example, though:

foo_list[-1] == foo_list[-1]

I believe that this can't be guaranteed to always evaluate to True in a
(non-locking) multithreaded case, because foo_list can be modified in
between the two lookups.

Thus, while it's pretty safe to assume that accessing shared objects
won't, in and of itself, cause an exception, the case about "mangled
data" is hazier and depends on how exactly you mean "mangled".

I expect that most of us know this and would never assume otherwise, I
just wanted to make that explicit for the benefit of the O.P. and others
who're unfamiliar with threading, as it seems to me that this point
might've gotten a bit confused for those who didn't already understand
it well. :)

Jeff Shannon
Technician/Programmer
Credit International

Jul 18 '05 #15
