eval() == evil? --- How to use it safely?

Fett

I am creating a program that requires some data that must be kept up
to date. What I plan is to put this data up on a web-site then have
the program periodically pull the data off the web-site.

My problem is that when I pull the data (currently stored as a
dictionary on the site) off the site, it is a string. I can use eval()
to make that string into a dictionary, and everything is great.
However, this means that I am using eval() on some string on a
web-site, which seems pretty unsafe.

I read that by using eval(code,{"__builtins__":None},{}) I can prevent
them from using pretty much anything, and my nested dictionary of
strings is still allowable. What I want to know is:

What are the dangers of eval?
- I originally was using exec() but switched to eval() because I
didn't want some hacker to be able to delete/steal files off my
clients' computers. I assume this is not an issue with eval(), since
eval won't execute commands.
- What exactly can someone do by modifying my code string in a command
like: thing = eval(code, {"__builtins__":None}, {}), anything other than
assign their own values to the object thing?

Aug 28 '08 #1

Guilherme Polo

On Thu, Aug 28, 2008 at 6:51 PM, Fett <Fe********@gmail.com> wrote:

I read that by using eval(code,{"__builtins__":None},{}) I can prevent
them from using pretty much anything, and my nested dictionary of
strings is still allowable. What I want to know is:

- What exactly can someone do by modifying my code string in a command
like: thing = eval(code, {"__builtins__":None}, {}), anything other than
assign their own values to the object thing?

By "disabling" __builtins__ you indeed cut some obvious tricks, but
someone still could send you a string like "10 ** 10 ** 10".

--
http://mail.python.org/mailman/listinfo/python-list

--
-- Guilherme H. Polo Goncalves

Aug 28 '08 #2

James Mills

Hi,

If you cannot use a simple data structure/format
like JSON, or CSV, or similar, _don't_
use eval or exec, but use the pickle
libraries instead. This is much safer.

cheers
James

-- "Problems are solved by method"

Aug 28 '08 #3

castironpi

May I suggest PyYAML?

Aug 28 '08 #4

Matimus

On Aug 28, 3:09 pm, "Guilherme Polo" <ggp...@gmail.com> wrote:

By "disabling" __builtins__ you indeed cut some obvious tricks, but
someone still could send you a string like "10 ** 10 ** 10".

Or, they could pass in something like this:

(t for t in 42 .__class__.__base__.__subclasses__() if t.__name__ ==
'LibraryLoader').next()((t for t in
__class__.__base__.__subclasses__() if t.__name__ ==
'CDLL').next()).msvcrt.system("SOMETHING MALICIOUS")

This can be used to execute pretty much anything on a Windows system
through a "safe" eval. The same exploit exists in some form on *nix.
The above assumes that ctypes has already been loaded, but it can be
modified to call code in any other module that has been loaded.
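
The walk above works because attribute access needs no builtins at all. A minimal, harmless sketch of just that first step, assuming Python 3 (the variable names are illustrative and not from the original post); it only lists what is reachable and calls nothing:

```python
# Restricted eval in the style discussed in this thread: no builtins, no locals.
restricted_globals = {"__builtins__": None}

# Start from a literal, climb to `object`, then enumerate every class the
# interpreter has loaded so far. None of these steps needs a name lookup.
expression = "().__class__.__base__.__subclasses__()"
reachable = eval(expression, restricted_globals, {})

print(type(reachable).__name__, len(reachable))
```

Every class in that list is a possible starting point for an attacker, which is why the contents depend on which modules the host program happens to have imported.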

Matt

Aug 29 '08 #5

Steven D'Aprano

On Thu, 28 Aug 2008 14:51:57 -0700, Fett wrote:

I read that by using eval(code,{"__builtins__":None},{}) I can prevent
them from using pretty much anything,

No, it can prevent them from some obvious dangers, but not all obvious
dangers and possibly not unobvious ones.

and my nested dictionary of
strings is still allowable. What I want to know is:

What are the dangers of eval?

You're executing code on your server that was written by arbitrary and
untrusted people over the Internet.

- I originally was using exec() but switched to eval() because I didn't
want some hacker to be able to delete/steal files off my clients'
computers. I assume this is not an issue with eval(), since eval won't
execute commands.

Bare eval() certainly can:

eval('__import__("os").system("ls *")') # or worse...

eval() with the extra arguments given makes that sort of thing harder,
but does it make it impossible? Are you willing to bet your server on it?
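
A small sketch of what the extra arguments do buy, assuming Python 3 (`try_eval` is a hypothetical helper name, and a module import stands in for the os.system call). The direct `__import__` route fails because the name can no longer be resolved. The exact exception type differs between interpreter versions, so the sketch catches `Exception` broadly:

```python
def try_eval(source, restricted):
    """Evaluate source and return the exception name, or None if it ran."""
    env = {"__builtins__": None} if restricted else {}
    try:
        eval(source, env, {})
    except Exception as exc:  # exact type varies by Python version
        return type(exc).__name__
    return None

# Harmless stand-in for a shell command: import a module and ask for the cwd.
probe = '__import__("os").getcwd()'

print("bare eval raised:", try_eval(probe, restricted=False))
print("restricted eval raised:", try_eval(probe, restricted=True))
```

This only shows that one obvious route is closed; it says nothing about the less obvious ones.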

- What exactly can someone do by modifying my code string in a command
like: thing = eval(code, {"__builtins__":None}, {}), anything other than
assign their own values to the object thing?

They can cause an exception:

code = '0.0/0.0'
thing = eval(code, {"__builtins__": None}, {})

They can cause a denial of service attack:

code = '10**10**10'

They can feed you bad data:

code = r"{ 'akey': 'Something You Don\'t Expect' }"

You have to deal with bad data no matter what you do, but why make it
easy for them to cause exceptions?

BTW, in case you think that you only have to deal with malicious attacks,
you also have to deal with accidents caused by incompetent users.
--
Steven

Aug 29 '08 #6

Paul Rubin

Fett <Fe********@gmail.com> writes:

However, this means that I am using eval() on some string on a web-
site, which seems pretty un-safe.

Don't even think of doing that.

I read that by using eval(code,{"__builtins__":None},{})

It is not reliable enough. Don't use eval for this AT ALL.

- I originally was using exec() but switched to eval()

For this purpose there is no difference between exec and eval.

Use something like simplejson or cjson instead.
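
A minimal sketch of that approach, assuming the data is republished as JSON and using the standard-library `json` module (which grew out of simplejson). The function names and the sample payload are hypothetical, not from the thread:

```python
import json

def parse_legend(text):
    """Parse text as JSON and accept only a (possibly nested) dict of strings."""
    data = json.loads(text)  # raises ValueError on malformed input
    _check(data)
    return data

def _check(node):
    # Reject anything that is not a dict whose keys are strings and whose
    # values are strings or further dicts of the same shape.
    if not isinstance(node, dict):
        raise ValueError("expected a JSON object")
    for key, value in node.items():
        if not isinstance(key, str):
            raise ValueError("keys must be strings")
        if isinstance(value, dict):
            _check(value)
        elif not isinstance(value, str):
            raise ValueError("values must be strings or nested objects")

# Hypothetical payload as it might be fetched from the web site.
payload = '{"DB": "database", "net": {"HTTP": "Hypertext Transfer Protocol"}}'
legend = parse_legend(payload)
print(legend["net"]["HTTP"])
```

The parser only ever builds data, so a hostile string can at worst be rejected or contain wrong values; the shape check afterwards covers the "bad data" case raised earlier in the thread.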

Aug 29 '08 #7

Paul Rubin

"James Mills" <pr******@shortcircuit.net.au> writes:

If you cannot use a simple data structure/format
like JSON, or CSV, or similar, _don't_
use eval or exec, but use the pickle
libraries instead. This is much safer.

Pickle uses eval and should also be considered unsafe, as its
documentation describes.

Aug 29 '08 #8

Fett

On Aug 28, 7:57 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:

So long story short: if I am expecting a dictionary of strings, I
should make a parser that only accepts a dictionary of strings then.
There is no safe way to use an existing construct.
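
One existing construct that may fit, although nobody in this thread mentions it, is `ast.literal_eval` from the standard library (available from Python 2.6 on). A hedged sketch for a flat dictionary of strings follows; the function name and the size cap are assumptions made for illustration, and the Python documentation warns that very large or deeply nested input can still exhaust memory or the stack:

```python
import ast

def parse_dict_of_strings(text, max_length=100000):
    """Parse a Python-literal dict without executing any code."""
    if len(text) > max_length:  # arbitrary cap chosen for this sketch
        raise ValueError("input too large")
    # literal_eval accepts only literals (strings, numbers, tuples, lists,
    # dicts, sets, booleans, None); names, calls and attribute access are
    # rejected with an exception.
    data = ast.literal_eval(text)
    if not isinstance(data, dict):
        raise ValueError("expected a dict")
    for key, value in data.items():
        if not (isinstance(key, str) and isinstance(value, str)):
            raise ValueError("expected string keys and string values")
    return data

print(parse_dict_of_strings("{'SQL': 'Structured Query Language'}"))

for hostile in ('__import__("os").system("ls")', "().__class__", "[1, 2, 3]"):
    try:
        parse_dict_of_strings(hostile)
    except (ValueError, SyntaxError) as exc:
        print("rejected:", type(exc).__name__)
```

A nested dictionary would need the type check to recurse; the flat version is kept here for brevity.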

That is what I was afraid of. I know I will have to deal with the
possibility of bad data, but considering my use (an acronym legend for
a database), and the fact that the site I plan to use should be
secure, these issues should be minimal. The users should be able to
spot any obvious false data, and restoring it should be simple.

Many thanks to all of you for your alarmist remarks. I certainly don't
want to, in any way, put my clients' computers at risk by providing
unsafe code.

Aug 29 '08 #9

Fett

On a related note, what if I encrypted and signed the data, then only
ran eval() on the string after it was decrypted and the signature
verified?

It has occurred to me that posting this data on a site might not be
the best idea unless I can be sure that it is not read by anyone who
shouldn't read it. So I figure encryption is needed, and as long as I
can sign it as well, then only people with my private signing key
could pass bad data, much less harmful strings.
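
A rough sketch of the verify-then-parse order described here. It swaps the private-key signature for a shared-secret HMAC, because the Python standard library has no public-key signing; anyone who holds the shared key (including a shipped client program) could forge a tag, so this is a stand-in for the order of operations only. `SECRET_KEY`, `sign` and `load_verified` are hypothetical names, and the payload is parsed with `json` rather than eval():

```python
import hashlib
import hmac
import json

# Hypothetical shared secret; a real deployment would not hard-code it.
SECRET_KEY = b"replace-with-a-real-secret"

def sign(payload: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for payload."""
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()

def load_verified(payload: bytes, tag: str):
    """Check the tag first, then parse with a data-only parser (not eval)."""
    expected = sign(payload)
    if not hmac.compare_digest(expected, tag):  # constant-time comparison
        raise ValueError("signature mismatch")
    return json.loads(payload.decode("utf-8"))

payload = b'{"ACL": "access control list"}'
tag = sign(payload)
print(load_verified(payload, tag))

try:
    load_verified(b'{"ACL": "tampered"}', tag)
except ValueError as exc:
    print("rejected:", exc)
```

Verification narrows down who can supply data; it does not make eval() on that data any safer if the key or the signing machine is ever compromised.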

Aug 29 '08 #10

Bruno Desthuilliers

Fett wrote:

On Aug 28, 7:57 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:

So long story short: if I am expecting a dictionary of strings, I
should make a parser that only accepts a dictionary of strings then.

or use an existing parser for an existing and documented format, as many
posters (including myself) already suggested.

There is no safe way to use an existing construct.

Nothing coming from the outside world is safe.

That is what I was afraid of. I know I will have to deal with the
possibility of bad data, but considering my use (an acronym legend for
a database), and the fact that the site I plan to use should be
secure, these issues should be minimal.

If you feel like opening the door to any script-kiddie, then please
proceed. It's *your* computer, anyway...

Else, use a known format with a known working parser (xml, json, yaml,
csv, etc...), and possibly https if your data are to be protected.

Aug 29 '08 #11

Steven D'Aprano

On Fri, 29 Aug 2008 05:42:46 -0700, Fett wrote:

On Aug 28, 7:57 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:

So long story short: if I am expecting a dictionary of strings, I should
make a parser that only accepts a dictionary of strings then. There is
no safe way to use an existing construct.

You may find the code here useful:

http://effbot.org/zone/simple-iterator-parser.htm
--
Steven

Aug 29 '08 #12

Lie

Your way of thinking is similar to Microsoft's. Encrypting and signing
is a kludge; a real fix should address the underlying cause. Anyway, using
data parsers isn't that much harder than using eval/exec.

Aug 29 '08 #13

Fett

Your way of thinking is similar to Microsoft's. Encrypting and Signing
is a kludge, a real fix should fix the underlying cause. Anyway using
data parsers isn't that much harder than using eval/exec.

While I agree that in this situation I should do both, what would you
propose for cases where the data being sent is supposed to be
executable code?

I happen to know that for enterprise disk drives (like what Google
uses to store everything) the firmware is protected by exactly what I
describe. Since the firmware has to be able to run, the kind of fix
you propose is not possible. I would assume that if this kind of data
transfer was deemed poor, that Google and others would be demanding
something better (can you imagine if Google's database stopped working
because someone overwrote the firmware on their hard-drive?).

Again, I suppose that in this case writing a parser is a better option
(parsing a dict with strings by hand is faster than reading
documentation on someone else's parser anyway), but both is the best
option by far.

Again, thank you all for your help.

Aug 29 '08 #14

castironpi

I as a fan of biological structures tend to favor the 'many-small'
strategy: expose your servers, but only a fraction to any given
source. If one of them crashes, blacklist their recent sources.
Distribute and decentralize ("redundantfy"). Compare I guess to a jet
plane with 1,000 engines, of which a few can fail no problem.
Resources can be expendable in small proportions.

More generally, think of a minimalist operating system that can
tolerate malicious code execution and just crash and reboot a lot.
If 'foreign code' execution is fundamental to the project, you might
even look at custom hardware. Otherwise, if it's a lower priority,
just run a custom Python install and remove modules like os,
os.path, and maybe even sys. Either remove their corresponding
libraries, or create a wrapper that gets Admin approval for calls like
'subprocess.Popen' and 'os.remove'.

You notice Windows now obtains User approval for internet access by a
new program it doesn't recognize.

Aug 29 '08 #15

mario

If you would like to look at a specific attempt at making eval() safe(r),
take a look at how the **eval-based** Evoque Templating engine does
it, for which a short overview is here:
http://evoque.gizmojo.org/usage/restricted/

While it does not provide protection against DoS-type attacks, it
should be safe against code that tries to pirate tangible resources
off your system, such as files and disk. Actually, reports of any
problems anyone may find are greatly appreciated...

Sep 3 '08 #16

rustom

On Aug 29, 4:42 am, castironpi <castiro...@gmail.com> wrote:

May I suggest PyYAML?

I second that.

YAML is very pythonic (being indentation-based) and PyYAML is sweet.

Only make sure you use safe_load, not load, and you will get only
default construction for standard Python objects (lists,
dictionaries and 'atomic' things), so no arbitrary code can be executed.

Someone else suggested JSON, which is about the same as YAML if there
are no objects. And by using safe_load you are not using objects.

Sep 3 '08 #17
