Embedding Python in Python

Robey Holderith

Anyone know a good way to embed python within python?

Now before you tell me that's silly, let me explain
what I'd like to do.

I'd like to allow user-defined scriptable objects. I'd
like to give them access to modify pieces of my classes.
I'd like to disallow access to pretty much the rest of
the modules.

Any ideas/examples?

-Robey

Jul 18 '05 #1

Phil Frost

You probably want something like this:

globalDict = {}
exec(stringOfPythonCodeFromUser, globalDict)

globalDict is now the global namespace of whatever was in
stringOfPythonCodeFromUser, so you can grab values from that and
selectively import them into your namespace.
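
A minimal sketch of that idea in Python 3 syntax (the thread itself uses Python 2). The names `user_source`, `wanted` and `imported` are made up for illustration. This only keeps the user's names in a separate dictionary; it does not restrict what the user code is able to do.

```python
# Hypothetical user-supplied source; a real host would read it from a file or form.
user_source = """
speed = 3

def move(position):
    return position + speed
"""

namespace = {}                 # becomes the global namespace of the user code
exec(user_source, namespace)   # the user code still runs with full builtins

# Selectively copy only the names the host expects; everything else,
# including the automatically added '__builtins__' entry, stays behind.
wanted = ("speed", "move")
imported = {name: namespace[name] for name in wanted if name in namespace}
```

Functions taken out this way keep `namespace` as their globals, so `imported["move"]` still sees `speed`.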

Jul 18 '05 #2

Paul Rubin

Robey Holderith <robey@slash_dev_slash_random.org> writes:

> Anyone know a good way to embed python within python?

No.

> I'd like to allow user-defined scriptable objects. I'd
> like to give them access to modify pieces of my classes.
> I'd like to disallow access to pretty much the rest of
> the modules.

There was a feature called rexec/Bastion for that purpose in older
versions of Python, but it was removed because it was insecure.

> Any ideas/examples?

Run your sensitive stuff in a separate process (or separate computer)
and allow the hostile clients to communicate through sockets.
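
One way to picture the separate-process suggestion, as a rough Python 3 sketch. It talks to a child interpreter over a pipe instead of sockets, and `run_untrusted` is a hypothetical helper name. A child process is not a sandbox by itself; the operating system still has to confine it (separate user, restricted filesystem, resource limits).

```python
import subprocess
import sys

def run_untrusted(source, timeout=5):
    """Run source in a separate interpreter; return (exit code, captured stdout)."""
    completed = subprocess.run(
        # -I is isolated mode: PYTHON* environment variables and the
        # user site directory are ignored by the child interpreter.
        [sys.executable, "-I", "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,  # subprocess.TimeoutExpired is raised if the child runs too long
    )
    return completed.returncode, completed.stdout
```

The host only ever sees the exit code and the text the child printed, so nothing in the host's namespace is reachable from the user code.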

Jul 18 '05 #3

Robey Holderith

On Wed, 18 Aug 2004 14:35:21 -0400, Phil Frost wrote:

You probably want something like this:

globalDict = {}
exec(stringOfPythonCodeFromUser, globalDict)

globalDict is now the global namespace of whatever was in
stringOfPythonCodeFromUser, so you can grab values from that and
selectively import them into your namespace.

So using this (with a little additional reading) it looks like I
can do this:

globalDict = {'__builtins__': <my modules here>}
exec(<pythonCodeFromUser>, globalDict)

And that this will disallow both importing of new modules and direct
access to my namespace. It will however allow access to the

Would this be secure?

Paul, what's your take on this?

-Robey

Jul 18 '05 #4

JCM

Paul Rubin <http://ph****@nospam.invalid> wrote:
....

There was a feature called rexec/Bastion for that purpose in older
versions of Python, but it was removed because it was insecure.
Any ideas/examples?

Run your sensitive stuff in a separate process (or separate computer)
and allow the hostile clients to communicate through sockets.

If you're concerned about security, another possibility is to parse
the user's code and look for anything potentially dangerous. You'll
need to be aggressive, but I believe it's possible. For example,
disallow exec statements, the identifier "eval", any identifier of
__this__ form, import statements, etc. This is overly restrictive,
but it will provide security.
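
A rough Python 3 sketch of such a source checker, using the standard `ast` module as a stand-in for whatever parser would have been used at the time. `BANNED_NAMES`, `is_dunder` and `violations` are hypothetical names, and the banned list is almost certainly incomplete.

```python
import ast

# Illustrative list only; a real policy would need far more thought.
BANNED_NAMES = {"eval", "exec", "compile", "input", "vars",
                "getattr", "globals", "locals", "open"}

def is_dunder(name):
    """True for identifiers of the __this__ form."""
    return name.startswith("__") and name.endswith("__")

def violations(source):
    """Return a list of reasons the source breaks the rules (empty if none)."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append("import statement")
        elif isinstance(node, ast.Name):
            if node.id in BANNED_NAMES or is_dunder(node.id):
                problems.append("name " + node.id)
        elif isinstance(node, ast.Attribute):
            if node.attr in BANNED_NAMES or is_dunder(node.attr):
                problems.append("attribute " + node.attr)
    return problems
```

The checker only looks at names and attributes that appear literally in the source; string constants are not inspected at all.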

Jul 18 '05 #5

Phil Frost

No. An easy way to escape that is to start one's code with
'del __builtins__', then python will add the default __builtins__ back
to the namespace. Restricting what arbitrary code can do has been
discussed many, many times, and it seems there is no way to do it short
of reimplementing a python interpreter.

On Wed, Aug 18, 2004 at 02:56:04PM -0500, Robey Holderith wrote:

So using this (with a little additional reading) it looks like I
can do this:

globalDict = {'__builtins__': <my modules here>}
exec(<pythonCodeFromUser>, globalDict)

And that this will disallow both importing of new modules and direct
access to my namespace. It will however allow access to the

Would this be secure?

Paul, what's your take on this?

-Robey

Jul 18 '05 #6

Paul Rubin

Robey Holderith <robey@slash_dev_slash_random.org> writes:

> Would this be secure?

No.

> Paul, what's your take on this?

Don't count on it.

Jul 18 '05 #7

Paul Rubin

JCM <jo*****@myway.com> writes:

If you're concerned about security, another possibility is to parse
the user's code and look for anything potentially dangerous. You'll
need to be aggressive, but I believe it's possible. For example,
disallow exec statements, the identifier "eval", any identifier of
__this__ form, import statements, etc. This is overly restrictive,
but it will provide security.

By the time you're done with all that, you may as well design a new
restricted language and interpret just that.

Hint:
e = vars()['__builtins__'].eval
print e('2+2')

Even Java keeps getting new holes found, and Python is not anywhere
near Java when it comes to this kind of thing.

Jul 18 '05 #8

JCM

Paul Rubin <http://ph****@nospam.invalid> wrote:

> JCM <jo******************@myway.com> writes:
>> If you're concerned about security, another possibility is to parse
>> the user's code and look for anything potentially dangerous. You'll
>> need to be aggressive, but I believe it's possible. For example,
>> disallow exec statements, the identifier "eval", any identifier of
>> __this__ form, import statements, etc. This is overly restrictive,
>> but it will provide security.
>
> By the time you're done with all that, you may as well design a new
> restricted language and interpret just that.
>
> Hint:
> e = vars()['__builtins__'].eval
> print e('2+2')
>
> Even Java keeps getting new holes found, and Python is not anywhere
> near Java when it comes to this kind of thing.

I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars,
etc.

Jul 18 '05 #9

Jack Diederich

On Wed, Aug 18, 2004 at 07:44:47PM +0000, JCM wrote:

Paul Rubin <http://ph****@nospam.invalid> wrote:
JCM <jo******************@myway.com> writes:
If you're concerned about security, another possibility is to parse
the user's code and look for anything potentially dangerous. You'll
need to be aggressive, but I believe it's possible. For example,
disallow exec statements, the identifier "eval", any identifier of
__this__ form, import statements, etc. This is overly restrictive,
but it will provide security.

By the time you're done with all that, you may as well design a new
restricted language and interpret just that.

Hint:
e = vars()['__builtins__'].eval
print e('2+2')

Even Java keeps getting new holes found, and Python is not anywhere
near Java when it comes to this kind of thing.

I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars,
etc.

foo = "ev" + "al"
e = vars()['__builtins__'].__dict__[foo]
print e('2+2')

This is a job for the operating system and not python.
Google groups for rexec and Bastion if you want to read ten lengthy
discussions of why this is the OS's job.

-Jack

Jul 18 '05 #10

Robey Holderith

On Wed, 18 Aug 2004 19:44:47 +0000, JCM wrote:

Paul Rubin <http://ph****@nospam.invalid> wrote:
JCM <jo******************@myway.com> writes:
If you're concerned about security, another possibility is to parse
the user's code and look for anything potentially dangerous. You'll
need to be aggressive, but I believe it's possible. For example,
disallow exec statements, the identifier "eval", any identifier of
__this__ form, import statements, etc. This is overly restrictive,
but it will provide security.

By the time you're done with all that, you may as well design a new
restricted language and interpret just that.

Hint:
e = vars()['__builtins__'].eval
print e('2+2')

Even Java keeps getting new holes found, and Python is not anywhere
near Java when it comes to this kind of thing.

I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars,
etc.

I'm going to have to agree with Paul on this one. I do not feel up to
the task of thinking of every possible variant of malicious code. There
are far too many ways of writing the exact same thing. I think it would
be much easier to write my own interpreter.

-Robey

Jul 18 '05 #11

JCM

Jack Diederich <ja**@performancedrivers.com> wrote:

On Wed, Aug 18, 2004 at 07:44:47PM +0000, JCM wrote:

....

I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars,
etc.

foo = "ev" + "al"
e = vars()['__builtins__'].__dict__[foo]
print e('2+2')

Also would be rejected by my original set of rules (can't use
__dict__). But I'd disallow vars too.

Jul 18 '05 #12

JCM

Robey Holderith <robey@slash_dev_slash_random.org> wrote:

On Wed, 18 Aug 2004 19:44:47 +0000, JCM wrote: ....
I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars,
etc.

I'm going to have to agree with Paul on this one. I do not feel up to
the task of thinking of every possible variant of malicious code. There
are far too many ways of writing the exact same thing. I think it would
be much easier to write my own interpreter.

Well it certainly isn't easier to write your own interpreter if you're
talking about the effort you'd need to put into it. And I'm not
convinced it's that tricky to come up with a set of syntax rules to
decide whether a piece of code is simple/safe enough to run. It
basically comes down to disallowing certain statements and certain
identifiers. Of course you'll end up rejecting a lot of code that
isn't malicious.

If you're interested enough, I'll try to throw a safety-checker
together. You'd have to be pretty interested though (I'm lazy).

Jul 18 '05 #13

Robey Holderith

On Wed, 18 Aug 2004 15:27:50 -0400, Phil Frost wrote:

No. An easy way to escape that is to start one's code with
'del __builtins__', then python will add the default __builtins__ back
to the namespace. Restricting what arbitrary code can do has been
discussed many, many times, and it seems there is no way to do it short
of reimplementing a python interpreter.

Out of curiosity I tried the following in 2.3.4:

#------Begin Code

import random

globalDict = {'__builtins__':random}
localDict = {}
execfile("test2.py", globalDict, localDict)

print globalDict
print localDict

localDict['move']()

#------- End Code

Where test2.py looked like this:

#---------Begin Code

print __builtins__

try:
    del __builtins__
    print 'del worked'
except:
    pass

try:
    exec('del __builtins__')
    print('exec del worked')
except:
    pass

try:
    import sys
    print 'Import Worked'
except:
    pass

try:
    f = file('out.tmp','w')
    f.write('asdfasdf')
    f.close()
    print 'File Access Worked'
except:
    pass

seed()

def move():
    print __builtins__

#------ End Code

I'm sure it has a crack in it somewhere, but it doesn't
seem to be del __builtins__ .

-Robey

Jul 18 '05 #14

Robey Holderith

I've found the crack in the armor. See additions below.

-Robey

On Wed, 18 Aug 2004 16:48:26 -0500, Robey Holderith wrote:

Where test2.py looked like this:

#---------Begin Code

print __builtins__

try:
    del __builtins__
    print 'del worked'
except:
    pass

try:
    exec('del __builtins__')
    print('exec del worked')
except:
    pass

try:
    import sys
    print 'Import Worked'
except:
    pass

try:
    f = file('out.tmp','w')
    f.write('asdfasdf')
    f.close()
    print 'File Access Worked'
except:
    pass

seed()

def move():
    #Add the following for a nice security hole
    global __builtins__
    del __builtins__
    print __builtins__

#------ End Code

I'm sure it has a crack in it somewhere, but it doesn't
seem to be del __builtins__ .
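
For comparison, a Python 3 sketch of why swapping out `__builtins__` is not a barrier. It shows a different, widely known route from the `del __builtins__` trick above: plain attribute access on a literal reaches `object`, and from there every class loaded in the interpreter. `restricted`, `blocked` and `probe` are illustrative names.

```python
restricted = {"__builtins__": {}}   # no builtin functions at all

# Direct use of a builtin fails, as intended.
try:
    exec("open('out.tmp', 'w')", restricted)
    blocked = False
except NameError:
    blocked = True

# Attribute access needs no builtins: walk from a tuple literal up to
# object, then list every class that currently derives from it.
probe = "found = ().__class__.__base__.__subclasses__()"
exec(probe, restricted)
classes = restricted["found"]
```

Once arbitrary classes are reachable, the user code can go looking for one that leads back to files, modules or the real builtins.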

-Robey

Jul 18 '05 #15

Robey Holderith

On Wed, 18 Aug 2004 20:33:09 +0000, JCM wrote:

Robey Holderith <robey@slash_dev_slash_random.org> wrote:
On Wed, 18 Aug 2004 19:44:47 +0000, JCM wrote:

...
I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars,
etc.

I'm going to have to agree with Paul on this one. I do not feel up to
the task of thinking of every possible variant of malicious code. There
are far too many ways of writing the exact same thing. I think it would
be much easier to write my own interpreter.

Well it certainly isn't easier to write your own interpreter if you're
talking about the effort you'd need to put into it. And I'm not
convinced it's that tricky to come up with a set of syntax rules to
decide whether a piece of code is simple/safe enough to run. It
basically comes down to disallowing certain statements and certain
identifiers. Of course you'll end up rejecting a lot of code that
isn't malicious.

If you're interested enough, I'll try to throw a safety-checker
together. You'd have to be pretty interested though (I'm lazy).

Don't do it on my behalf. I started far too many projects doing something
similar before I realized that the only effective way to do security was
from the bottom up. The problem looks something like this (assuming each
function has 10 places where it is implemented):

Level | Malicious Variation Count
-----------------------------------------
0 | 10^0
1 | 10^1
2 | 10^2
x | 10^x

Suffice to say that in simple code... it is doable. In a
mature interpreter... near impossible.

-Robey

Jul 18 '05 #16

Jack Diederich

On Wed, Aug 18, 2004 at 08:25:04PM +0000, JCM wrote:

Jack Diederich <ja**@performancedrivers.com> wrote:
On Wed, Aug 18, 2004 at 07:44:47PM +0000, JCM wrote:

...
I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars,
etc.

foo = "ev" + "al"
e = vars()['__builtins__'].__dict__[foo]
print e('2+2')

Also would be rejected by my original set of rules (can't use
__dict__). But I'd disallow vars too.

Google groups for this topic, it's been dead horse kicked.
You would have to eliminate getattr too and any C func that can
result in an infinite loop.

Not-python's-job-ly,

-Jack

Jul 18 '05 #17

JCM

Jack Diederich <ja**@performancedrivers.com> wrote:

On Wed, Aug 18, 2004 at 08:25:04PM +0000, JCM wrote:
Jack Diederich <ja**@performancedrivers.com> wrote:
> On Wed, Aug 18, 2004 at 07:44:47PM +0000, JCM wrote: ...
>> I don't think it's as difficult as you think. Your snippet of code
>> would be rejected by the rules I suggested. You'd also want to
>> prohibit other builtins like compile, execfile, input, reload, vars,
>> etc.
>>
> foo = "ev" + "al"
> e = vars()['__builtins__'].__dict__[foo]
> print e('2+2')

Also would be rejected by my original set of rules (can't use
__dict__). But I'd disallow vars too.

Google groups for this topic, it's been dead horse kicked.
You would have to eliminate getattr too and any C func that can
result in an infinite loop.

Infinite loops (and other resource use) are a different story, not
addressed by source code inspection. I worked on a project which
needed to run untrusted code, and we dealt with the infinite-loop
situation by always running untrusted code on the main thread and
signalling it if it took too long to execute (this worked on unix--I
don't know what you'd do on Windows). I realize this could leave data
in a bad state. Infinite loops are harder to deal with.
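
A guess at what such a signal-based time limit could look like, written for Python 3. It assumes a Unix system and must be called from the main thread; `TimedOut`, `_on_alarm` and `run_with_time_limit` are hypothetical names, not code from the project mentioned above.

```python
import signal

class TimedOut(Exception):
    """Raised inside the running code when the time limit expires."""

def _on_alarm(signum, frame):
    raise TimedOut()

def run_with_time_limit(func, seconds):
    """Call func(); raise TimedOut if it runs longer than the given whole seconds."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)          # ask the OS to deliver SIGALRM later
    try:
        return func()
    finally:
        signal.alarm(0)            # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)
```

The exception lands wherever the code happens to be executing, which is exactly why data can be left in a bad state. A loop inside a C function that never returns to the interpreter would not be interrupted this way.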

Jul 18 '05 #18

Paul Rubin

JCM <jo******************@myway.com> writes:

need to be aggressive, but I believe it's possible. For example,
disallow exec statements, the identifier "eval", any identifier of
__this__ form, import statements, etc. This is overly restrictive,
but it will provide security.

Hint:
e = vars()['__builtins__'].eval
print e('2+2')

I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars, etc.

I don't see how. Your rules were to disallow:

1) exec statements. My example doesn't use it.

2) eval identifier. My example uses eval as an attribute and not an
identifier. You can eliminate the use of eval as an attribute with
e = getattr(vars()['__builtins__'], 'ev'+'al').
Now not even the string 'eval' appears in one piece.

3) identifiers like __this__. My example doesn't use any. It
uses a constant string of that form, not an identifier. The
string could be computed instead, like the eval example above.

4) import statements. My example doesn't use them.

Conclusion, my example gets past your suggested rules. I also didn't
use compile, execfile, input, or reload. I did use vars but there are
probably other ways to do the same thing. You can't take something
full of holes and start plugging holes until you think you found them
all. You have to start with something that has no holes. The Python
crowd has been through this many times already; do some searches for
rexec/Bastion security.
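
The argument can be tried out in Python 3 with a small stand-in checker. `passes_rules` and `sneaky` are illustrative names, and the checker is deliberately generous in that it treats attributes as identifiers too. One Python 3 detail: the `__builtins__` entry that `exec` adds to a fresh namespace is a plain dict, so the sketch uses a subscript where the Python 2 examples use attribute access.

```python
import ast

def passes_rules(source, banned=("eval", "exec")):
    """True if source has no imports, no banned names, no __dunder__ identifiers."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            return False
        if isinstance(node, ast.Name):
            ident = node.id
        elif isinstance(node, ast.Attribute):
            ident = node.attr
        else:
            continue
        if ident in banned or (ident.startswith("__") and ident.endswith("__")):
            return False
    return True

# Every sensitive name is assembled from string pieces at run time.
sneaky = "result = vars()['__buil' + 'tins__']['ev' + 'al']('2 + 2')"

namespace = {}
if passes_rules(sneaky):
    exec(sneaky, namespace)   # the checker saw nothing wrong, yet eval gets called
```

Adding `vars` to the banned names rejects this particular string, which is the hole-plugging cycle described above: each fix closes one route without showing that none are left.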

Jul 18 '05 #19

Michael J. Fromberger

In article <pan.2004.08.18.19.25.59.519570@slash_dev_slash_random.org>,
Robey Holderith <robey@slash_dev_slash_random.org> wrote:

Anyone know a good way to embed python within python?

help(eval)

;)

-M

--
Michael J. Fromberger | Lecturer, Dept. of Computer Science
http://www.dartmouth.edu/~sting/ | Dartmouth College, Hanover, NH, USA

Jul 18 '05 #20

Benjamin Niemann

Well it seems that this is impossible to do with the current Python. But
it is a feature that would be important for certain applications.
Actually I've been searching for this, too - and only found
abandoned/deprecated modules.

If you want to use the current Python interpreter to execute the code,
you'd have to remove many language features, because they could provide
a backdoor for malicious code. This could be done by defining a grammar
for a subset of Python (perhaps with some semantic checks), and verify
that the code satisfies the grammar before you feed it into eval(). This
could either be easy (resulting in a small subset of Python that is
probably too small for real use...), or difficult (resulting in a usable
subset, but with a large amount of complex grammar rules - with at least
one rule that introduces a security leak...).

A good solution has to be implemented in the Python interpreter. Are
there any plans for future versions of Python? I've seen the phrase
"security initiative" on this list. Was that a "there is a ..." or
"there should be a ..."? I couldn't find anything on the web (but didn't
search very deep).

My first idea:

- extend the C-API (alternative to Py_Initialize??) for embedding Python
to provide a 'stripped down' interpreter: no builtins with side effects
(like open()...), ...
I don't know anything about Python's internals or embedding Python, so I
can't say whether this is easy or possible at all.

- communication of the embedded script to the outside world (file or
network I/O...) must be provided by the hosting application that is
responsible for enforcing the desired security limitations.

- wrap it into a Python module. Then you can start the isolated embedded
Python from 'real' Python code.

The interesting (and most difficult) question is which parts of Python's
standard library rely on "dangerous" features. This could drastically
reduce the usability of this approach (until you build your own 'secure'
library).
Using this model, the secure interpreter is running in the same process
context as the insecure host. A bug in Python could result in unchecked
access to resources of the host. For higher security a separate process
should be started.

Jul 18 '05 #21

JCM

Paul Rubin <http://ph****@nospam.invalid> wrote:
....

>>> Hint:
>>> e = vars()['__builtins__'].eval
>>> print e('2+2')

>> I don't think it's as difficult as you think. Your snippet of code
>> would be rejected by the rules I suggested. You'd also want to
>> prohibit other builtins like compile, execfile, input, reload, vars, etc.

> I don't see how. Your rules were to disallow:
> 1) exec statements. My example doesn't use it.
> 2) eval identifier. My example uses eval as an attribute and not an
> identifier. You can eliminate the use of eval as an attribute with
> e = getattr(vars()['__builtins__'], 'ev'+'al').
> Now not even the string 'eval' appears in one piece.

You've used eval as an identifier (at least by the terminology to
which I'm accustomed), just not as a variable.

> 3) identifiers like __this__. My example doesn't use any. It
> uses a constant string of that form, not an identifier. The
> string could be computed instead, like the eval example above.
> 4) import statements. My example doesn't use them.
>
> Conclusion, my example gets past your suggested rules. I also
> didn't use compile, execfile, input, or reload. I did use vars but
> there are probably other ways to do the same thing. You can't take
> something full of holes and start plugging holes until you think you
> found them all. You have to start with something that has no holes.

It's fine to look at it that way. Start with a subset of Python that
you know to be safe, for example only integer literal expressions.
Keep adding more safe features until you're satisfied with the
expressiveness of your subset.

> The Python crowd has been through this many times already; do some
> searches for rexec/Bastion security.

I did do a [quick] search, and saw a lot of articles about how rexec
and Bastion were insecure; but I didn't find any arguments about how
it's (too) difficult to come up with a safe subset of Python, for some
definition of "safe".

Jul 18 '05 #22

510046470588-0001

Robey Holderith <robey@slash_dev_slash_random.org> writes:

Anyone know a good way to embed python within python?

Now before you tell me that's silly, let me explain
what I'd like to do.

I'd like to allow user-defined scriptable objects. I'd
like to give them access to modify pieces of my classes.
I'd like to disallow access to pretty much the rest of
the modules.

Any ideas/examples?

use the rexec module, or see how Zope does it
Klaus Schilling

Jul 18 '05 #23

Paul Rubin

JCM <jo******************@myway.com> writes:

It's fine to look at it that way. Start with a subset of Python that
you know to be safe, for example only integer literal expressions.
Keep adding more safe features until you're satisfied with the
expressiveness of your subset.

Well ok, but then you haven't got Python, you've got some subset, with
a completely different implementation than the Python that it's
embedded in.
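
To make the point concrete, here is what the smallest such subset might look like as a Python 3 sketch: integer literals combined with `+`, `-` and `*`, evaluated by walking the syntax tree instead of handing anything to `eval`. `evaluate`, `_eval` and `_BINARY` are made-up names, and the sketch sets no limits on expression size or nesting depth.

```python
import ast
import operator

# Whitelist of binary operators the subset understands.
_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def evaluate(source):
    """Evaluate an expression made only of integer literals and + - *."""
    return _eval(ast.parse(source, mode="eval").body)

def _eval(node):
    # Integer literals only; bool is excluded by the exact type check.
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    # Anything not explicitly allowed is rejected.
    raise ValueError("unsupported syntax: " + type(node).__name__)
```

Everything outside the whitelist is refused, so names, calls, attributes and subscripts never run. Each feature added back means more evaluator code, which is the "completely different implementation" cost raised above.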

Jul 18 '05 #24
