469,275 Members | 1,850 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,275 developers. It's quick & easy.

PEP for nondeterminism.

I frequently find myself in a situation where I wish I could say "the
execution order of these two lines just doesn't matter."

Proposed here are two freshly drafted PEP's advocating for the addition
of nondeterminism as a spring board for future automatic parallelism.

The first calls for core language changes, the second calls for a minor
(and likely symbolic) change to list comprehension's semantics to better
support future parallelism.

PEP: XXX
Title: Nondeterminism
Version: $Revision$
Last-Modified: $Date$
Author: Adam DePrince <ad***********@gmail.com>
Status: Draft
Python-Version: 3.0
Type: Standards
Content-Type: text/plain
Created: 24-Mar-2006
Post-History:
Abstract

Proposed herein is are two new control flow structures that allow
the explicit specification of parallelism by defining the order of
expression evaluation as non-deterministic, thus removing the
requirement of sequential execution.
Background

A number of mechanisms have been proposed in Python that have the
side effect of permitting the automatic discovery of parallelism.
PEP 288 presents the notion of a generator, an object that
inherently lends itself to concurrent operation and possibility
automatic self organization of a pipeline.

Stackless Python in PEP 219/220 presents a mechanism by which a
large number micro threads could be efficiently supported, thus
allowing for a realistic manifestation of a deep generator based
software pipeline.

Additionally, this PEP's sibling, XXX, proposes that the
execution order of list comprehensions be made non-deterministic
to permit future parallel implementations.

DISCUSSION

Two new control flow structures are proposed. While as of yet no
name is proposed, for the sake of brevity we will refer to them as
any and first.

The any structure would call for the execution of all expressions
within the any block in an non-determinate order. The user would
warrant that no side effects exist between the expressions, and
assigned to the variable the any block would be a list
representing the return values of each of the expressions in the
order they appear in the code (but not necessarily in the order of
execution.)

The first structure would execute all enclosed expressions
concurrently, again, the order of execution being
non-deterministic, other than the promise that all would be
started. The return value of the first expression to finish would
be assigned to the variable associated with the first block and
the execution of the other operations terminated.

SPECIFICATION

The use of the word 'any' and 'first' is only a convenience. I do
not yet propose a keyword for these structures.

...
any retval:
1
2+3
my_function_that_returns_17()
...

retval would be assigned [1, 5, 17] Only the order of evaluation
is not deterministic.

first retval:
urllib.open( 'mirror1' ).read()
urllib.open( 'mirror2' ).read()
urllib.open( 'mirror3' ).read()
....

retval would contain the data from the fastest mirror, the other
threads would be disposed of.

IMPLEMENTATION
TBD - Incomplete
REFERENCES
TBD
COPYRIGHT

This document has been placed in the public domain.


PEP: XXX
Title: Nondeterministic List comprehensions
Version: $Revision$
Last-Modified: $Date$
Author: Adam DePrince <ad***********@gmail.com>
Status: Draft
Python-Version: 3.0
Type: Standards
Content-Type: text/plain
Created: 24-Mar-2006
Post-History:
Abstract

The semantics of the list comprehension offer the future benefit
of multi-threading and parallel execution, assuming that the
programmer treat the order of evaluation as non-deterministic and
avoids the introduction of dependencies and side-effects.

Background

A number of mechanisms have been proposed in Python that have the
side effect of permitting the automatic discovery of parallelism.
PEP 288 presents the notion of a generator, an object that
inherently lends itself to concurrent operation and possibility
of automatic self organization of pipelined, and thus parallel,
operation.

Stackless Python in PEP 219/220 presents a mechanism by which a
large number micro threads could be efficiently supported, thus
allowing for a realistic manifestation of a deep generator based
software pipeline.
DISCUSSION

Generators and generator comprehension are often regarded as
superior to list comprehension due to the reduced footprint of
in-flight data and the future potential of vertical parallelism via
pipelining.

List comprehensions, while having the drawback of requiring the
presence of the entire working set in-flight, places no inherent
restriction on the order of execution. This permits us to
distribute the execution effort horizontally across multiple
micro-threads, with a far higher potential for parallelism than
generators at the expense of requiring the user warrant the
absence of side effects of dependencies between each passing of
the list comprehension loop.

This minor semantic requirement that list comprehension possess
no interdependencies or side effects permits us to operate on the
list in an arbitrary order, including dispatching the execution
across multiple threads.

SPECIFICATION

We change the semantics of a the list comprehension to
explicitly state that the order of evaluation is
non-deterministic, and that there should exist no dependencies or
side effects. This leaves open the potential for future
parallelization.
PROBLEMS

There are a currently a number of problems. Generators provide an
intuitive and natural partitioning of the problem. The future
partitioning of a list comprehension is a more capricious;
generally, we choose a fixed number of threads to dovetail across,
but little guidance can be gleaned from the use.

The second drawback is the current state of python. With the
global interpreter lock firmly in place, we have no compelling
argument by which to parallelize our list comprehension, even if
they were partitioned. Currently the creation of said
micro-threads and the dovetailing of the workload would actually
increase the execution time in the current 2.4 implementation of
Python. Generators, even with the current lack of real
parallelism to the resulting pipeline, benefit from the drastic
reduction in the amount of in-flight data during their execution.

The third problem is in the near term, this PEP calls for no
implementation change, its only a request that users regard the
execution order as non-deterministic. There is a reasonable change
that this would be ignored.

Lastly, there has been discussion of recasting list comprehension
as generator comprehensions as an easy of normalizing the
semantics of the iterator variable with respect to the callee's
name-space. Such an effort would assign an explicit order of
execution to the list comprehension.
REFERENCES
TBD
COPYRIGHT

This document has been placed in the public domain.


Mar 25 '06 #1
0 968

This discussion thread is closed

Replies have been disabled for this discussion.

By using this site, you agree to our Privacy Policy and Terms of Use.