An idiom for code generation with exec

eliben

Hello,

In a Python program I'm writing I need to dynamically generate
functions[*] and store them in a dict. eval() can't work for me
because a function definition is a statement and not an expression, so
I'm using exec. At the moment I came up with the following to make it
work:

def build_func(args):
    code = """def foo(...)..."""
    d = {}
    exec code in globals(), d
    return d['foo']

My question is, considering that I really need code generation[*] -
"is there a cleaner way to do this ?" Also, what happens if I replace
globals() by None ?
Additionally, I've found indentation to be a problem in such
constructs. Is there a workable way to indent the code at the level of
build_func, and not on column 0 ?

Thanks in advance
Eli
[*] I know that each time a code generation question comes up people
suggest that there's a better way to achieve this, without using exec,
eval, etc. But in my case, for reasons too long to fully lay out, I
really need to generate non-trivial functions with a lot of hard-coded
actions for performance. And there's no problem of security
whatsoever. If someone is very interested in the application, I will
elaborate more.

Jun 27 '08 #1


Bruno Desthuilliers

eliben wrote:

Hello,

In a Python program I'm writing I need to dynamically generate
functions[*]

(snip)

[*] I know that each time a code generation question comes up people
suggest that there's a better way to achieve this, without using exec,
eval, etc.

Just to make things clear: you do know that you can dynamically build
functions without exec, don't you?

But in my case, for reasons too long to fully lay out, I
really need to generate non-trivial functions with a lot of hard-coded
actions for performance.

Just out of curiosity: could you tell a bit more about your use case
and what makes a simple closure not an option?

And there's no problem of security
whatsoever. If someone is very interested in the application, I will
elaborate more.

Jun 27 '08 #2

eliben

On Jun 20, 9:17 am, Bruno Desthuilliers <bruno.42.desthuilli...@websiteburo.invalid> wrote:

eliben wrote:

Hello,

In a Python program I'm writing I need to dynamically generate
functions[*]

(snip)

[*] I know that each time a code generation question comes up people
suggest that there's a better way to achieve this, without using exec,
eval, etc.

Just to make things clear: you do know that you can dynamically build
functions without exec, don't you?

Yes, but the other options for doing so are significantly less
flexible than exec.

But in my case, for reasons too long to fully lay out, I
really need to generate non-trivial functions with a lot of hard-coded
actions for performance.

Just out of curiosity: could you tell a bit more about your use case
and what makes a simple closure not an option?

Okay.

I work in the field of embedded programming, and one of the main uses
I have for Python (and previously Perl) is writing GUIs for
controlling embedded systems. The communication protocols are usually
ad-hoc messages (header, footer, data, crc) built on top of serial
communication (RS232).

The packets that arrive have a known format. For example (YAMLish
syntax):

packet_length: 10
fields:
  - name: header
    offset: 0
    length: 1
  - name: time_tag
    offset: 1
    length: 1
    transform: val * 2048
    units: ms
  - name: counter
    offset: 2
    length: 4
    bytes-msb-first: true
  - name: bitmask
    offset: 6
    length: 1
    bit_from: 0
    bit_to: 5
  ...

This is a partial capability display. Fields have defined offsets and
lengths, can be only several bits long, can have defined
transformations and units for convenient display.

I have a program that should receive such packets from the serial port
and display their contents in tabular form. I want the user to be able
to specify the format of his packets in a file similar to above.

Now, in previous versions of this code, written in Perl, I found out
that the procedure of extracting field values from packets is very
inefficient. I've rewritten it using a dynamically generated procedure
for each field, that does hard coded access to its data. For example:

def get_counter(packet):
    data = packet[2:6]
    data.reverse()
    return data

This gave me a huge speedup, because each field now had its specific
function sitting in a dict that quickly extracted the field's data
from a given packet.

Now I'm rewriting this program in Python and am wondering about the
idiomatic way to use exec (in Perl, eval() replaces both eval and exec
of Python).

Eli

Jun 27 '08 #3

Peter Otten

eliben wrote:

Additionally, I've found indentation to be a problem in such
constructs. Is there a workable way to indent the code at the level of
build_func, and not on column 0 ?

exec "if 1:" + code.rstrip()

Peter

Jun 27 '08 #4

George Sakkis

On Jun 20, 8:03 am, eliben <eli...@gmail.com> wrote:

Now, in previous versions of this code, written in Perl, I found out
that the procedure of extracting field values from packets is very
inefficient. I've rewritten it using a dynamically generated procedure
for each field, that does hard coded access to its data. For example:

def get_counter(packet):
    data = packet[2:6]
    data.reverse()
    return data

This gave me a huge speedup, because each field now had its specific
function sitting in a dict that quickly extracted the field's data
from a given packet.

It's still not clear why the generic version is so much slower, unless
you extract only a few selected fields, not all of them. Can you post a
sample of how you used to write it without exec to clarify where the
inefficiency comes from ?

George

Jun 27 '08 #5

Raymond Hettinger

On Jun 20, 5:03 am, eliben <eli...@gmail.com> wrote:

I've rewritten it using a dynamically generated procedure
for each field, that does hard coded access to its data. For example:

def get_counter(packet):
    data = packet[2:6]
    data.reverse()
    return data

This gave me a huge speedup, because each field now had its specific
function sitting in a dict that quickly extracted the field's data
from a given packet.

Now I'm rewriting this program in Python and am wondering about the
idiomatic way to use exec (in Perl, eval() replaces both eval and exec
of Python).

FWIW, when I had a similar challenge for dynamic coding, I just
generated a py file and then imported it. This technique was nice
because it can also work with Pyrex or Psyco.

Also, the code above can be simplified to:

get_counter = lambda packet: packet[5:1:-1]

Since function calls are expensive in python, you can also gain speed
by parsing multiple fields at a time:

header, timetag, counter = parse(packet)

Raymond
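
One way the "several fields per call" idea might look, as a rough sketch in Python 3 using the standard struct module. The layout (1-byte header, 1-byte time tag, 4-byte counter, 1-byte bitmask, 3 unused bytes) is only an assumption modelled on the example format earlier in the thread, the counter is assumed to be big-endian, and parse is just the placeholder name used above:

```python
import struct

# Assumed layout: header (1 byte), time_tag (1 byte), counter (4 bytes,
# big-endian), bitmask (1 byte), then 3 unused bytes to reach 10 bytes.
PACKET = struct.Struct(">BBIB3x")

def parse(packet):
    """Decode header, time tag (scaled to ms) and counter in one call."""
    header, time_tag, counter, _bitmask = PACKET.unpack(packet)
    return header, time_tag * 2048, counter

packet = bytes([0xAA, 3, 0, 0, 1, 2, 0x1F, 0, 0, 0])
header, timetag, counter = parse(packet)
```

A single unpack call replaces one Python-level function call per field, which is where the saving would come from; whether it matters in practice would have to be measured.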

Jun 27 '08 #6

Bruno Desthuilliers

eliben wrote:

On Jun 20, 9:17 am, Bruno Desthuilliers <bruno.42.desthuilli...@websiteburo.invalid> wrote:

eliben wrote:

Hello,

In a Python program I'm writing I need to dynamically generate
functions[*]

(snip)

[*] I know that each time a code generation question comes up people
suggest that there's a better way to achieve this, without using exec,
eval, etc.

Just to make things clear: you do know that you can dynamically build
functions without exec, don't you?

Yes, but the other options for doing so are significantly less
flexible than exec.

Let's see...

But in my case, for reasons too long to fully lay out, I
really need to generate non-trivial functions with a lot of hard-coded
actions for performance.

Just out of curiosity: could you tell a bit more about your use case
and what makes a simple closure not an option?

Okay.

Now, in previous versions of this code, written in Perl, I found out
that the procedure of extracting field values from packets is very
inefficient. I've rewritten it using a dynamically generated procedure
for each field, that does hard coded access to its data. For example:

def get_counter(packet):
    data = packet[2:6]
    data.reverse()
    return data

This gave me a huge speedup, because each field now had its specific
function sitting in a dict that quickly extracted the field's data
from a given packet.

ok. So if I get it right, you build the function's code as a string
based on the YAML specification.

If so, well, I can't think of anything really better[1] - at least *if*
dynamically generated procedures are really better performance wise,
which may *or not* be the case in Python.

[1] except using compile to build a code object with the function's
body, then instantiate a function object using this code, but I'm not
sure whether it will buy you much more performance-wise. I'd personally
prefer this because I find it more explicit and readable, but YMMV.

Now I'm rewriting this program in Python and am wondering about the
idiomatic way to use exec (in Perl, eval() replaces both eval and exec
of Python).

Well... So far, the most pythonic way to use exec is to avoid using it -
unless it's the right tool for the job !-)

Jun 27 '08 #7

eliben

FWIW, when I had a similar challenge for dynamic coding, I just
generated a py file and then imported it. This technique was nice
because it can also work with Pyrex or Psyco.

I guess this is not much different than using exec, at the conceptual
level. exec is perhaps more suitable when you really need just one
function at a time and not a whole file of related functions.

Also, the code above can be simplified to:

get_counter = lambda packet: packet[5:1:-1]

OK, but that was just a demonstration. The actual functions are
complex enough to not fit into a single expression.

Eli

Jun 27 '08 #8

eliben

[1] except using compile to build a code object with the function's
body, then instantiate a function object using this code, but I'm not
sure whether it will buy you much more performance-wise. I'd personally
prefer this because I find it more explicit and readable, but YMMV.

How is compiling more readable than exec - doesn't it require an extra
step ? You generate code dynamically anyway.

Eli

Jun 27 '08 #9

eliben

On Jun 20, 3:19 pm, George Sakkis <george.sak...@gmail.com> wrote:

It's still not clear why the generic version is so much slower, unless
you extract only a few selected fields, not all of them. Can you post a
sample of how you used to write it without exec to clarify where the
inefficiency comes from ?

George

The generic version has to make a lot of decisions at runtime, based
on the format specification.
Extract the offset from the spec, extract the length. Is it
msb-first ? Then reverse. Are specific bits required ? If so, do bit
operations. Should bits be reversed ? etc.

A dynamically generated function doesn't have to make any decisions -
everything is hard coded in it, because these decisions have been made
at compile time. This can save a lot of dict accesses and conditions,
and results in a speedup.

I guess this is not much different from Lisp macros - making decisions
at compile time instead of run time and saving performance.
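
To make the "decisions at generation time" point concrete, here is a minimal sketch in Python 3 (where exec is a function rather than a statement). The field dicts and the names make_getter and getters are hypothetical, chosen only to mirror the format spec shown earlier; this is not the code from the actual application:

```python
def make_getter(field):
    """Generate an extractor whose offsets are hard-coded in its source.

    `field` is a hypothetical dict mirroring one entry of the format spec.
    """
    start = field["offset"]
    stop = start + field["length"]
    lines = ["def getter(packet):",
             "    data = packet[%d:%d]" % (start, stop)]
    # This decision is taken once, while generating the source,
    # not on every packet that arrives.
    if field.get("msb_first"):
        lines.append("    data = data[::-1]")
    lines.append("    return data")
    namespace = {}
    exec("\n".join(lines), namespace)  # defines 'getter' in namespace
    return namespace["getter"]

spec = [{"name": "header", "offset": 0, "length": 1},
        {"name": "counter", "offset": 2, "length": 4, "msb_first": True}]
getters = {field["name"]: make_getter(field) for field in spec}
```

Each generated function contains no lookups into the spec and no conditionals; those were resolved when the source text was assembled.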

Eli

Jun 27 '08 #10

George Sakkis

On Jun 20, 3:44 pm, eliben <eli...@gmail.com> wrote:

The generic version has to make a lot of decisions at runtime, based
on the format specification.
Extract the offset from the spec, extract the length. Is it
msb-first ? Then reverse. Are specific bits required ? If so, do bit
operations. Should bits be reversed ? etc.

So you are saying that for example "if do_reverse: data.reverse()" is
*much* slower than "data.reverse()" ? I would expect that checking the
truthiness of a boolean would be negligible compared to the reverse
itself. Did you try converting all checks to identity comparisons with
None ? I mean replacing every "if compile_time_condition:" in a loop
with

compile_time_condition = compile_time_condition or None
for i in some_loop:
    if compile_time_condition is None:
        ...

It's hard to believe that the overhead of identity checks is
comparable (let alone much higher) to the body of the loop for
anything more complex than "pass".

George

Jun 27 '08 #11

bruno.desthuilliers

On Jun 20, 21:41, eliben <eli...@gmail.com> wrote:

[1] except using compile to build a code object with the function's
body, then instantiate a function object using this code, but I'm not
sure whether it will buy you much more performance-wise. I'd personally
prefer this because I find it more explicit and readable, but YMMV.

How is compiling more readable than exec -

Using compile and function(), you explicitly instantiate a new
function object, while using exec you're relying on a side effect.

doesn't it require an extra
step ?

Well... Your way:

d = {}
exec code in globals(), d
return d['foo']

My way:

return function(compile(code, '<string>', 'exec'), globals())

As far as I'm concerned, it's two steps less - but YMMV, of course !-)

You generate code dynamically anyway.

Yes, indeed. Which may or not be the right thing to do here, but this
is a different question (and one I can't actually answer).

Jun 27 '08 #12

bruno.desthuilliers

On Jun 20, 21:44, eliben <eli...@gmail.com> wrote:

On Jun 20, 3:19 pm, George Sakkis <george.sak...@gmail.com> wrote:

(snip)

It's still not clear why the generic version is so much slower, unless
you extract only a few selected fields, not all of them. Can you post a
sample of how you used to write it without exec to clarify where the
inefficiency comes from ?

George

The generic version has to make a lot of decisions at runtime, based
on the format specification.
Extract the offset from the spec, extract the length.

import operator

transformers = []
transformers.append(operator.itemgetter(slice(format.offset,
                                              format.offset + format.length)))

Is it msb-first ? Then reverse.

if format.msb_first:
    transformers.append(reverse)

Are specific bits required ? If so, do bit
operations.

etc.... Python functions are objects, you can define your own callable
(ie: function like) types, you can define anonymous single-expression
functions using lambda, functions are closures too so they can carry
the environment they were defined in, implementing partial application
(using either closures or callable objects) is trivial (and is in the
stdlib functools module since 2.5 FWIW), well... Defining a sequence
of transformer functions is not a problem either. And applying it to
your data bytestring is just trivial:

def apply_transformers(data, transformers):
    for transformer in transformers:
        data = transformer(data)
    return data

.... and is not necessarily that bad performance-wise (here you'd have
to benchmark both solutions to know for sure).
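
Put together, the transformer-list approach might look roughly like this in Python 3. The field dict and the build_transformers helper are hypothetical names introduced for illustration, and the lambda stands in for the undefined reverse above:

```python
import operator

def build_transformers(field):
    """Turn one (hypothetical) field description into a list of callables."""
    start = field["offset"]
    stop = start + field["length"]
    transformers = [operator.itemgetter(slice(start, stop))]
    # The decision is taken here, once, with the spec at hand.
    if field.get("msb_first"):
        transformers.append(lambda data: data[::-1])
    return transformers

def apply_transformers(data, transformers):
    for transformer in transformers:
        data = transformer(data)
    return data

counter = build_transformers({"offset": 2, "length": 4, "msb_first": True})
```

No source text is generated: the spec is consulted only inside build_transformers, and the per-packet work is a plain loop over prebuilt callables.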

A dynamically generated function doesn't have to make any decisions -

No, but neither does a sequence of callable objects. The decisions are
taken where you have the necessary context, and applied somewhere
else. Dynamically generating/compiling code is one possible solution,
but not the only one.

I guess this is not much different from Lisp macros

The main difference is that Lisp macros are not built as raw strings,
but as first class objects. I've found this approach more flexible
and way easier to maintain, but here again, YMMV.

Anyway, even while (as you may have noticed by now) I'm one of these
"there's-a-better-way-than-eval-exec" people, I'd think you may
(depending on benchmarks with both solutions and real-life data) have
a valid use case here - and if you encapsulate this part correctly,
you can always start with your current solution (so you make it work),
then eventually switch implementation later if it's worth the extra
effort...
Just my 2 cents. Truth is that as long as it works and is
maintainable, then who cares...

Jun 27 '08 #13

eliben

So you are saying that for example "if do_reverse: data.reverse()" is
*much* slower than "data.reverse()" ? I would expect that checking the
truthiness of a boolean would be negligible compared to the reverse
itself. Did you try converting all checks to identity comparisons with
None ? I mean replacing every "if compile_time_condition:" in a loop
with

compile_time_condition = compile_time_condition or None
for i in some_loop:
    if compile_time_condition is None:
        ...

It's hard to believe that the overhead of identity checks is
comparable (let alone much higher) to the body of the loop for
anything more complex than "pass".

The generic version also makes dict accesses into the format (to
extract parameters such as lengths and offsets), which are absent from
the generated code. Besides, the fields are usually small, so reverse
is relatively cheap.

Eli

Jun 27 '08 #14

eliben

d = {}
exec code in globals(), d
return d['foo']

My way:

return function(compile(code, '<string>', 'exec'), globals())

With some help from the guys at IRC I came to realize your way doesn't
do the same. It creates a function that, when called, creates 'foo' on
globals(). This is not exactly what I need.
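
One possible way to get the function itself through compile, sketched for Python 3: compile the source as a module, then pull the nested code object for the def out of the module's constants and wrap that one in a function object. The name function_from_source is hypothetical, types.FunctionType plays the role of function() above, and this sketch ignores default arguments and closures, so it only suits simple generated functions:

```python
import types

def function_from_source(source, global_vars=None):
    """Build a function from a single 'def' without executing the module code."""
    module_code = compile(source, "<generated>", "exec")
    # The body of the def is stored as a nested code object among the
    # constants of the compiled module; take the first one found.
    func_code = next(const for const in module_code.co_consts
                     if isinstance(const, types.CodeType))
    if global_vars is None:
        global_vars = globals()
    return types.FunctionType(func_code, global_vars)

foo = function_from_source(
    "def foo(packet):\n"
    "    return packet[3] + 256 * packet[4]\n")
```

Compared with exec into a scratch dict this is arguably more fragile, since it depends on how the compiler lays out constants, so the exec version may well remain the more practical choice.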

Eli

Jun 27 '08 #15

eliben

On Jun 20, 2:44 pm, Peter Otten <__pete...@web.de> wrote:

eliben wrote:

Additionally, I've found indentation to be a problem in such
constructs. Is there a workable way to indent the code at the level of
build_func, and not on column 0 ?

exec "if 1:" + code.rstrip()

Peter

Why is the 'if' needed here ? I had .strip work for me:

def make_func():
    code = """
        def foo(packet):
            return ord(packet[3]) + 256 * ord(packet[4])
        """

    d = {}
    exec code.strip() in globals(), d
    return d['foo']

Without .strip this doesn't work:

Traceback (most recent call last):
  File "exec_code_generation.py", line 25, in <module>
    foo = make_func()
  File "exec_code_generation.py", line 20, in make_func
    exec code in globals(), d
  File "<string>", line 2
    def foo(packet):
    ^
IndentationError: unexpected indent

Jun 27 '08 #16

Peter Otten

eliben wrote:

On Jun 20, 2:44 pm, Peter Otten <__pete...@web.de> wrote:

eliben wrote:

Additionally, I've found indentation to be a problem in such
constructs. Is there a workable way to indent the code at the level of
build_func, and not on column 0 ?

exec "if 1:" + code.rstrip()

Peter

Why is the 'if' needed here ? I had .strip work for me:

A simple .strip() doesn't work if the code comprises multiple lines:

>>> def f():
...     return """
...     x = 42
...     if x > 0:
...             print x
...     """
...
>>> exec "if 1:\n" + f().rstrip()
42
>>> exec f().strip()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 2
    if x > 0:
    ^
IndentationError: unexpected indent

You can of course split the code into lines, calculate the indentation of
the first non-white line, remove that indentation from all lines and then
rejoin.
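
The standard library's textwrap.dedent does essentially this split, measure and rejoin step. A small sketch in Python 3 syntax, reusing the make_func example from earlier in the thread (packet is assumed to be a bytes object here, so ord() is not needed):

```python
import textwrap

def make_func():
    code = """
        def foo(packet):
            return packet[3] + 256 * packet[4]
        """
    namespace = {}
    # dedent() removes the leading whitespace common to all lines, so the
    # generated source can be indented along with the surrounding function.
    exec(textwrap.dedent(code), namespace)
    return namespace["foo"]

foo = make_func()
```

Unlike .strip(), this keeps the relative indentation of every line, so it works for multi-line code that is not a single function definition as well.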

Peter

Jun 27 '08 #17

eliben

On Jun 21, 8:52 am, Peter Otten <__pete...@web.de> wrote:

A simple .strip() doesn't work if the code comprises multiple lines:

>>> def f():
...     return """
...     x = 42
...     if x > 0:
...             print x
...     """
...
>>> exec "if 1:\n" + f().rstrip()
42
>>> exec f().strip()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 2
    if x > 0:
    ^
IndentationError: unexpected indent

I see. In my case I only evaluate function definitions with 'exec', so
I only need to de-indent the first line, and the others can be
indented because they're in a new scope anyway. What you suggest works
for arbitrary code and not only function definitions. It's a nice
trick with the "if 1:" :-)

Jun 27 '08 #18

Lie

On Jun 21, 2:02 pm, eliben <eli...@gmail.com> wrote:

I see. In my case I only evaluate function definitions with 'exec', so
I only need to de-indent the first line, and the others can be
indented because they're in a new scope anyway. What you suggest works
for arbitrary code and not only function definitions. It's a nice
trick with the "if 1:" :-)

Have you actually profiled your code? Or are you just basing these
assumptions on guesses?

Jun 27 '08 #19

eliben

I see. In my case I only evaluate function definitions with 'exec', so
I only need to de-indent the first line, and the others can be
indented because they're in a new scope anyway. What you suggest works
for arbitrary code and not only function definitions. It's a nice
trick with the "if 1:" :-)

Have you actually profiled your code? Or are you just basing these
assumptions on guesses?

First of all, I see absolutely no connection between your question and
the text you quote. Is there? Or did you pick one post randomly to
post your question on?

Second, yes - I have profiled my code.

Third, this is a very typical torture path one has to go through when
asking about code generation. It is true of almost all communities,
except Lisp, perhaps. You have to convince everyone that you have a
real reason to do what you do. The simple norm of getting a reply to
your question doesn't work when you get to code generation. I wonder
why that is. How many people have actually been "burned" by bad code
generation techniques, and how many are just parroting "goto is evil"
because it's the accepted thing to say? This is an interesting point
to ponder.

Eli

Jun 27 '08 #20

George Sakkis

On Jun 21, 9:40 am, eliben <eli...@gmail.com> wrote:

Third, this is a very typical torture path one has to go through when
asking about code generation. It is true of almost all communities,
except Lisp, perhaps. You have to convince everyone that you have a
real reason to do what you do. The simple norm of getting a reply to
your question doesn't work when you get to code generation. I wonder
why is it so. How many people have been actually "burned" by bad code
generation techniques, and how many are just parroting "goto is evil"
because it's the accepted thing to say. This is an interesting point
to ponder.

It's not as much that many people have been burned but that, like
goto, 99% of the time there are better alternatives. Off the top of my
head, two recurring threads in c.l.py related to dynamic code
generation and evaluation are:
- Asking how to dynamically generate variable names ("for i in
xrange(10): exec 'x%d = %d' % (i,i)") instead of using a regular
dictionary.
- Using function names instead of the actual function objects and
calling eval(), not knowing that functions are first-class objects (or
not even familiar with what that means).

So even if your use case belongs to the exceptional 1% where dynamic
code generation is justified, you should expect people to question it
by default.
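
For reference, the two usual alternatives to those recurring patterns can be sketched as follows (Python 3; start, stop and commands are made-up names used only for illustration):

```python
# Instead of exec-ing 'x0 = 0', 'x1 = 1', ... keep the values in a dict.
values = {"x%d" % i: i for i in range(10)}

def start():
    return "started"

def stop():
    return "stopped"

# Instead of eval("stop")(), store the function objects themselves and
# look them up by name.
commands = {"start": start, "stop": stop}
result = commands["stop"]()
```

Neither case needs exec or eval, which is why questions of this kind tend to get the "there's a better way" answer.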

George

Jun 27 '08 #21

Scott David Daniels

br*****************@gmail.com wrote:

On Jun 20, 21:44, eliben <eli...@gmail.com> wrote:

....

The generic version has to make a lot of decisions at runtime, based
on the format specification.
Extract the offset from the spec, extract the length.

.... <example with lists of operations>...

Just my 2 cents. Truth is that as long as it works and is
maintainable, then who cares...

To chime in on "non-exec" operations alternatives. The extraction
piece-by-piece looks like it might be slow. You could figure out
where the majority of endian-ness is (typically it will be all one way),
and treat specially any others (making the string fields with an
operation applied). The best optimizations come from understanding
the regularities in your specific case; seldom do they come from
generating code that hides the regularity and depending on a compiler
to deduce that regularity.

Have you tried something like this (a sketch of a solution)?:

import struct
from functools import partial
import operator

class Decoder(object):
    def __init__(self, unpack, processors, finals):
        '''Of course this might simply take the yaml in and go'''
        self.unpack = unpack
        self.processors = processors
        self.length = struct.calcsize(unpack)
        self.finals = finals

    def packet(self, data):
        parts = list(struct.unpack(self.unpack, data))
        for n, action, result in self.processors:
            if result is None:
                # No target given: append the derived value as a new part.
                parts.append(action(parts[n]))
            else:
                # Otherwise overwrite part n in place.
                parts[n] = action(parts[n])
        return tuple(parts[n] for n in self.finals) # or NamedTuple ...

def _bits(from_bit, mask, v):
    return (v >> from_bit) & mask

Your example extended a bit:

fmt = Decoder('<cBiBIi', [(1, partial(operator.mul, 2048), '-'),
                          (3, partial(_bits, 5, 0x3), None),
                          (3, partial(operator.iand, 0x80), None),
                          (3, partial(operator.iand, 0x1F), '-'),
                          (6, bool, '-')],
              [0, 1, 2, 3, 6, 7, 4, 5])
print fmt.packet(source.read(fmt.length))

--Scott David Daniels
Sc***********@Acm.Org

Jun 27 '08 #22

eliben

Thanks for all the replies in this post. Just to conclude, I want to
post a piece of code I wrote to encapsulate function creation in this
way:

def create_function(code):
    """ Create and return the function defined in code.
    """
    m = re.match('\s*def\s+([a-zA-Z_]\w*)\s*\(', code)
    if m:
        func_name = m.group(1)
    else:
        return None

    d = {}
    exec code.strip() in globals(), d
    return d[func_name]

Although the 'def' matching in the beginning looks a bit shoddy at
first, it should work in all cases.

Eli

Jun 27 '08 #23

Bruno Desthuilliers

eliben wrote:

d = {}
exec code in globals(), d
return d['foo']

My way:

return function(compile(code, '<string>', 'exec'), globals())

With some help from the guys at IRC I came to realize your way doesn't
do the same. It creates a function that, when called, creates 'foo' on
globals(). This is not exactly what I need.

I possibly messed up a couple things in the arguments, flags etc - I
very seldom use compile() and function(). The point was that it didn't
require any extra step.

Jun 27 '08 #24

Maric Michaud

On Monday 23 June 2008 09:22:29, Bruno Desthuilliers wrote:

With some help from the guys at IRC I came to realize your way doesn't
do the same. It creates a function that, when called, creates 'foo' on
globals(). This is not exactly what I need.

I possibly messed up a couple things in the arguments, flags etc - I
very seldom use compile() and function(). The point was that it didn't
require any extra step.

In the argument list of the function type, the code object in first place is
expected to be created directly (no exec or eval) with the Python type 'code'
(found as either types.CodeType or new.code).
In [24]: types.CodeType?
Type: type
Base Class: <type 'type'>
String Form: <type 'code'>
Namespace: Interactive
Docstring:
code(argcount, nlocals, stacksize, flags, codestring, constants, names,
varnames, filename, name, firstlineno, lnotab[, freevars[,
cellvars]])

Create a code object. Not for the faint of heart.

^^^^^^^^^^^^^^^

Even if it looks more "object oriented", I'm not sure it's actually the right
solution for the original problem. I think these interfaces are not a
replacement for the quick eval-exec idiom but are intended more to make massive
code generation programs object oriented and closer to Python internals.

AFAIK, the only use case where I see code generation (eval - exec, playing with
code objects) as legitimate in Python is in programs that actually do code
generation, that is, parse and compile code from textual inputs (application
builders).

If code generation is not the best option, and I fail to see any performance
issue that could explain such a choice, except a misunderstanding of
what "compilation" means in Python, just don't use it. Use closures or
callable instances; there are many ways to achieve this.
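
As an illustration of the closure alternative mentioned above, here is a minimal sketch in Python 3 syntax. The factory name make_field_reader, the format string and the scale parameter are assumptions made up for the example, not the original poster's actual packet format.

```python
import struct

def make_field_reader(fmt, index, scale=1):
    """Hypothetical factory: return a function that reads one field."""
    # Everything that depends only on the format is decided once, here,
    # so the returned function does no per-call decision making.
    unpack = struct.Struct(fmt).unpack

    def read_field(data):
        return unpack(data)[index] * scale

    return read_field

# Build the specialised reader once, then call it for every packet.
read_length = make_field_reader('<BH', 1, scale=2)
sample = struct.pack('<BH', 7, 21)
```

Calling read_length(sample) unpacks the second field and scales it; whether this is fast enough compared with generated code is something only profiling on the real data can tell.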

--
_____________

Maric Michaud

Jun 27 '08 #25

Bruno Desthuilliers

Maric Michaud wrote:

On Monday 23 June 2008 09:22:29, Bruno Desthuilliers wrote:

With some help from the guys at IRC I came to realize your way doesn't
do the same. It creates a function that, when called, creates 'foo' on
globals(). This is not exactly what I need.
I possibly messed up a couple things in the arguments, flags etc - I
very seldom use compile() and function(). The point was that it didn't
require any extra step.

In the argument list of the function type, the code object in first place is
expected to be created directly (no exec or eval) with the Python type 'code'

Which is what compile returns. But indeed, re-reading compile's doc more
carefully, I'm afraid that the code object it returns may not be usable
the way I thought. My bad. <OP>sorry</OP>
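
For what it's worth, one way to make the compile() route work is to pull the nested code object out of the compiled module's constants. The following is only a sketch in Python 3 syntax: it relies on a CPython implementation detail (co_consts holding the function's code object), it drops default argument values, and function_from_source is a hypothetical helper name.

```python
import types

def function_from_source(source, namespace):
    """Hypothetical helper: build a function from 'def' source without exec."""
    module_code = compile(source, '<string>', 'exec')
    # compile() returns the code of the whole module; the code object of the
    # function itself is stored among the module's constants (CPython detail).
    func_code = next(c for c in module_code.co_consts
                     if isinstance(c, types.CodeType))
    # Default argument values and closures are not carried over here.
    return types.FunctionType(func_code, namespace)

double = function_from_source("def double(x):\n    return x * 2\n", {})
```

This shows why the one-liner did not behave as hoped: the object returned by compile() is the module-level code, and the function's own code object is one level down.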

(snip)

Jun 27 '08 #26

Fuzzyman

On Jun 21, 7:52 am, Peter Otten <__pete...@web.de> wrote:

eliben wrote:
On Jun 20, 2:44 pm, Peter Otten <__pete...@web.de> wrote:
eliben wrote:
Additionally, I've found indentation to be a problem in such
constructs. Is there a workable way to indent the code at the level of
build_func, and not on column 0 ?

exec "if 1:" + code.rstrip()

Peter

Why is the 'if' needed here ? I had .strip work for me:

A simple .strip() doesn't work if the code comprises multiple lines:

>>> def f():
...     return """
...     x = 42
...     if x > 0:
...             print x
...     """
...
>>> exec "if 1:\n" + f().rstrip()
42

>>> exec f().strip()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 2
    if x > 0:
    ^
IndentationError: unexpected indent

You can of course split the code into lines, calculate the indentation of
the first non-white line, remove that indentation from all lines and then
rejoin.

textwrap.dedent will do all that for you...

Michael Foord
http://www.ironpythoninaction.com/

Peter

Jun 27 '08 #27

eliben

If code generation is not the best option, and I fail to see any performance
issue that could explain such a choice, except a misunderstanding of
what "compilation" means in Python, just don't use it. Use closures or
callable instances; there are many ways to achieve this.

And while we're on the topic of what compilation means in Python, I'm
not sure I fully understand the difference between compiled (.pyc)
code and exec-ed code. Is the exec-ed code turned to bytecode too,
i.e. it will be as efficient as compile-d code ?

Eli

Jun 27 '08 #28

Terry Reedy

eliben wrote:

And while we're on the topic of what compilation means in Python,

It depends on the implementation.

I'm not sure I fully understand the difference between compiled (.pyc)
code and exec-ed code. Is the exec-ed code turned to bytecode too,
i.e. it will be as efficient as compile-d code ?

CPython always compiles to bytecode before executing. There is no
alternative execution path.

Jun 27 '08 #29

Maric Michaud

On Tuesday 24 June 2008 07:18:47, eliben wrote:

If code generation is not the best option, and I fail to see any performance
issue that could explain such a choice, except a misunderstanding of
what "compilation" means in Python, just don't use it. Use closures or
callable instances; there are many ways to achieve this.

And while we're on the topic of what compilation means in Python, I'm
not sure I fully understand the difference between compiled (.pyc)
code and exec-ed code. Is the exec-ed code turned to bytecode too,
i.e. it will be as efficient as compile-d code ?

Yes, exactly the same. CPython always interprets compiled code: when a script
is executed, for example, it is parsed and compiled to bytecode by the
interpreter before any execution. The .pyc/.pyo files are just a cache created
at import time to avoid the rather time-consuming parsing stage.
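
A small sketch of this point, written in Python 3 syntax (exec as a function); the names double, plain_double and the '<generated>' filename are made up for the example.

```python
import types

source = "def double(x):\n    return x * 2\n"

# exec() on a string parses and compiles it to bytecode first, then runs it.
namespace = {}
exec(source, namespace)
exec_double = namespace['double']

def plain_double(x):
    return x * 2

# Both functions carry an ordinary code object; there is no slower
# "interpreted string" form for the exec-ed one.
same_kind = type(exec_double.__code__) is type(plain_double.__code__)

# If the same source is executed repeatedly, compile() it once and
# exec the resulting code object to skip the repeated parsing step.
code_obj = compile(source, '<generated>', 'exec')
other_namespace = {}
exec(code_obj, other_namespace)
```

So the cost of exec is paid when the function is created, not each time the generated function is called.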

--
_____________

Maric Michaud

Jun 27 '08 #30

jhermann

Since nobody mentioned textwrap.dedent yet as an alternative to the
old "if 1:" trick, I thought I should do so. :)
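
For reference, a minimal sketch of the textwrap.dedent approach, in Python 3 syntax (exec as a function). The function foo and its body are placeholders.

```python
import textwrap

def build_func():
    # The generated source can be indented to match build_func itself;
    # dedent() removes the common leading whitespace before exec sees it.
    code = """
        def foo(x):
            if x > 0:
                return x * 2
            return 0
    """
    namespace = {}
    exec(textwrap.dedent(code), {}, namespace)
    return namespace['foo']

foo = build_func()
```

Unlike a plain .strip(), this keeps the relative indentation of every line intact, so multi-line bodies work.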

Jun 27 '08 #31

eliben

On Jun 23, 6:44 am, eliben <eli...@gmail.com> wrote:

Thanks for all the replies in this post. Just to conclude, I want to
post a piece of code I wrote to encapsulate function creation in this
way:

def create_function(code):
    """ Create and return the function defined in code.
    """
    m = re.match('\s*def\s+([a-zA-Z_]\w*)\s*\(', code)
    if m:
        func_name = m.group(1)
    else:
        return None

    d = {}
    exec code.strip() in globals(), d
    return d[func_name]

Actually this won't work, because globals() returns the scope in which
create_function is defined, not called. So if I want to place
create_function in some utils file, the code it generates won't be
able to use auxiliary functions from the file that uses
create_function.
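
One possible way around this, sketched in Python 3 syntax, is to let the caller hand in the namespace explicitly instead of calling globals() inside the utility. The namespace parameter, the helper function and bump_twice are assumptions added for the example.

```python
import re
import textwrap

def create_function(code, namespace):
    """Create and return the function defined in code.

    namespace is the dict the generated function uses as its globals;
    a caller in another module would typically pass its own globals().
    """
    code = textwrap.dedent(code).strip()
    m = re.match(r'def\s+([a-zA-Z_]\w*)\s*\(', code)
    if m is None:
        return None
    local_names = {}
    exec(code, namespace, local_names)
    return local_names[m.group(1)]

# In the calling code: an auxiliary function the generated code relies on.
def helper(x):
    return x + 1

bump_twice = create_function("""
    def bump_twice(x):
        return helper(helper(x))
""", {'helper': helper})
```

Because the generated function looks up helper in the dict that was passed in, create_function can live in a separate utils file without knowing anything about its callers.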

Eli

Jun 27 '08 #32
