Simple eval

Hi,

A while ago I asked a question on the list about a simple eval
function, capable of eval'ing simple python constructs (tuples, dicts,
lists, strings, numbers etc) in a secure manner:
http://groups.google.com/group/comp....01273441d445f/
From the answers I got I chose to use simplejson... However, I was
also pointed to a simple eval function by Fredrik Lundh:
http://effbot.org/zone/simple-iterator-parser.htm. His solution, using
module tokenize, was short and elegant. So I used his code as a
starting point for simple evaluation of dicts, tuples, lists, strings,
unicode strings, integers, floats, None, True, and False. I've
included the code below, together with some basic tests, and
profiling... On my computer (winXP, python 2.5), simple eval is about
5 times slower than builtin eval...

Comments, speedups, improvements in general, etc are appreciated. As
this is a contribution to the community I suggest that any
improvements are posted in this thread...

-Tor Erik

Code (tested on 2.5, but should work for versions >= 2.3):

'''
Recursive evaluation of:
tuples, lists, dicts, strings, unicode strings, ints, floats,
True, False, and None
'''

import cStringIO, tokenize, itertools

KEYWORDS = {'None': None, 'False': False, 'True': True}

def atom(next, token):
    # token is a tokenize tuple: token[0] is the type, token[1] the text
    if token[1] == '(':
        out = []
        token = next()
        while token[1] != ')':
            out.append(atom(next, token))
            token = next()
            if token[1] == ',':
                token = next()
        return tuple(out)
    elif token[1] == '[':
        out = []
        token = next()
        while token[1] != ']':
            out.append(atom(next, token))
            token = next()
            if token[1] == ',':
                token = next()
        return out
    elif token[1] == '{':
        out = {}
        token = next()
        while token[1] != '}':
            key = atom(next, token)
            next() # Skip key-value delimiter
            token = next()
            out[key] = atom(next, token)
            token = next()
            if token[1] == ',':
                token = next()
        return out
    elif token[0] == tokenize.STRING and token[1][0] in 'uU':
        # Unicode literal: strip the u prefix and the quotes.
        # (Checking the token type keeps names like "undefined" out.)
        return token[1][2:-1].decode('unicode-escape')
    elif token[0] == tokenize.STRING:
        return token[1][1:-1].decode('string-escape')
    elif token[0] == tokenize.NUMBER:
        try:
            return int(token[1], 0)
        except ValueError:
            return float(token[1])
    elif token[1] in KEYWORDS:
        return KEYWORDS[token[1]]
    raise SyntaxError('malformed expression (%r)' % token[1])

def simple_eval(source):
    src = cStringIO.StringIO(source).readline
    src = tokenize.generate_tokens(src)
    # Drop non-logical newlines so brackets may span several lines
    src = itertools.ifilter(lambda x: x[0] != tokenize.NL, src)
    res = atom(src.next, src.next())
    if src.next()[0] != tokenize.ENDMARKER:
        raise SyntaxError("bogus data after expression")
    return res

if __name__ == '__main__':
    expr = (1, 2.3, u'h\xf8h\n', 'h\xc3\xa6', ['a', 1],
            {'list': [], 'tuple': (), 'dict': {}}, False, True, None)
    rexpr = repr(expr)

    a = simple_eval(rexpr)
    b = eval(rexpr)
    assert a == b

    import timeit
    print timeit.Timer('eval(rexpr)',
                       'from __main__ import rexpr').repeat(number=1000)
    print timeit.Timer('simple_eval(rexpr)',
                       'from __main__ import rexpr, simple_eval').repeat(number=1000)
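
To see what atom() actually consumes, it can help to look at the token stream that tokenize produces for a small literal. The following is only a sketch written for Python 3 (the code in the post targets Python 2); the helper name literal_tokens and the LAYOUT set are made up for this illustration. It also filters NEWLINE, on the assumption that recent Python 3 tokenizers add one even when the input has no trailing newline.

```python
import io
import tokenize

# Token types that carry no value for an evaluator (layout and end marker).
# Hypothetical helper constant, not part of the code in the post.
LAYOUT = {tokenize.NL, tokenize.NEWLINE, tokenize.ENDMARKER}

def literal_tokens(source):
    """Return (type name, text) pairs for the meaningful tokens in source."""
    readline = io.StringIO(source).readline
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(readline)
            if tok.type not in LAYOUT]

if __name__ == '__main__':
    # Brackets, commas and colons arrive as OP tokens; None/True/False
    # arrive as NAME tokens, which is why a KEYWORDS lookup is needed.
    for name, text in literal_tokens("{'a': [1, 2.5], 'b': None}"):
        print(name, text)
```

Each branch of atom() corresponds to one of these token kinds: an opening bracket starts a recursive collection loop, STRING and NUMBER are converted directly, and a NAME is accepted only if it is one of the three keywords.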
Nov 18 '07 #1
On Sun, 18 Nov 2007 22:24:39 -0300, greg <gr**@cosc.canterbury.ac.nz>
wrote:
> Importing the names from tokenize that you use repeatedly
> should save some time, too.
>
>     from tokenize import STRING, NUMBER
>
> If you were willing to indulge in some default-argument abuse, you
> could also do
>
>     def atom(next, token, STRING=tokenize.STRING, NUMBER=tokenize.NUMBER):
>         ...
>
> A more disciplined way would be to wrap it in a closure:
>
>     def make_atom():
>         from tokenize import STRING, NUMBER
>         def atom(next, token):
>             ...
>         return atom

...but unfortunately it's the slowest alternative, so it wouldn't count as a
speed optimization.
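
The three lookup styles discussed above can be compared with a small timing harness. This is a sketch for Python 3, not a measurement from the thread: the is_string_* function names are invented for the example, and the relative results depend on the interpreter version, so no ranking is claimed here.

```python
import timeit
import tokenize

def is_string_global(kind):
    # Looks up the global name tokenize, then its attribute STRING, per call.
    return kind == tokenize.STRING

def is_string_default(kind, STRING=tokenize.STRING):
    # STRING is bound once at definition time and read as a local variable.
    return kind == STRING

def make_is_string():
    from tokenize import STRING
    def is_string_closure(kind):
        # STRING is a free variable captured from the enclosing function.
        return kind == STRING
    return is_string_closure

is_string_closure = make_is_string()

if __name__ == '__main__':
    for fn in (is_string_global, is_string_default, is_string_closure):
        timings = timeit.repeat('fn(kind)',
                                globals={'fn': fn, 'kind': tokenize.NUMBER},
                                number=100000, repeat=3)
        print(fn.__name__, min(timings))
```

Taking the minimum of several repeats reduces noise from other processes; the per-call differences are small, so they only matter inside a hot loop such as atom().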

I would investigate a mixed approach: using a parser to ensure the
expression is "safe", then calling eval.
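
One possible reading of this mixed approach, sketched for Python 3.8 or later: parse the source with the standard ast module, reject any node that is not plain literal syntax, and only then hand the compiled tree to eval. The function name checked_eval and the SAFE_NODES whitelist are assumptions made for this illustration, and the sketch is not a hardened sandbox (very deeply nested input can still exhaust the parser). The standard library's ast.literal_eval, added in Python 2.6, covers a similar set of literals.

```python
import ast

# Node types that can only describe literal data. This whitelist is an
# assumption of the sketch; it relies on Python 3.8+, where numbers,
# strings, None, True and False all parse to ast.Constant.
SAFE_NODES = (ast.Expression, ast.Constant, ast.Tuple, ast.List, ast.Dict,
              ast.Load, ast.UnaryOp, ast.USub, ast.UAdd)

def checked_eval(source):
    """Evaluate source with eval() only if it parses to plain literals."""
    tree = ast.parse(source, mode='eval')
    for node in ast.walk(tree):
        if not isinstance(node, SAFE_NODES):
            raise ValueError('disallowed syntax: %s' % type(node).__name__)
    code = compile(tree, '<checked_eval>', 'eval')
    # Empty builtins as an extra precaution; the whitelist already
    # rules out names and calls.
    return eval(code, {'__builtins__': {}}, {})

if __name__ == '__main__':
    print(checked_eval("(1, -2.3, 'h', ['a', 1], {'k': ()}, False, None)"))
```

Names, calls, attribute access and binary operators are all absent from the whitelist, so something like `__import__('os')` is rejected before eval ever sees it.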

--
Gabriel Genellina

Nov 19 '07 #2
