Secure Pickle-like module

Hi all,

I'm currently working on a secure Pickle-like module, Cerealizer:
http://home.gna.org/oomadness/en/cerealizer/index.html
Cerealizer has a pickle-like interface (load, dump, __getstate__,
__setstate__, ...); however, it requires you to register each class you
want to "cerealize" by calling cerealizer.register(YourClass).
Cerealizer doesn't import other modules (unlike pickle), and the
only methods it may call are YourClass.__new__, YourClass.__getstate__
and YourClass.__setstate__. (Cerealizer keeps its own reference to these
three methods, so something like YourClass.__setstate__ = cracked_method
is harmless.)
Thus, as long as __new__, __getstate__ and __setstate__ are not
dangerous, Cerealizer should be secure.
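
To make the registration and early-binding idea concrete, here is a minimal sketch. It is not Cerealizer's actual code: the Registry class, its get_state and restore methods, and the Point example are all hypothetical names chosen for illustration, and it is written for Python 3.

```python
class Registry(object):
    """Whitelist of restorable classes (hypothetical, not Cerealizer's code)."""

    def __init__(self):
        self._handlers = {}

    def register(self, cls):
        # Early binding: capture the three callables once, at registration time.
        self._handlers[cls.__name__] = (
            cls,
            cls.__new__,
            cls.__getstate__,
            cls.__setstate__,
        )

    def get_state(self, obj):
        cls, _new, getstate, _setstate = self._handlers[type(obj).__name__]
        return cls.__name__, getstate(obj)

    def restore(self, name, state):
        # Unregistered names are rejected; nothing is ever imported.
        if name not in self._handlers:
            raise ValueError("class %r is not registered" % name)
        cls, new, _getstate, setstate = self._handlers[name]
        obj = new(cls)          # create the instance without calling __init__
        setstate(obj, state)    # uses the reference captured at registration
        return obj


class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __getstate__(self):
        return {"x": self.x, "y": self.y}

    def __setstate__(self, state):
        self.__dict__.update(state)


registry = Registry()
registry.register(Point)


def cracked_method(self, state):
    raise RuntimeError("should never run")


# Rebinding the method after registration does not affect the registry.
Point.__setstate__ = cracked_method
name, state = registry.get_state(Point(1, 2))
restored = registry.restore(name, state)
```

Because restore() only ever calls the callables stored at registration time, the later assignment to Point.__setstate__ has no effect on it.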

The performance is quite good: with Psyco, it is about as fast as
cPickle, even though Cerealizer is written in fewer than 300 lines of
pure-Python code.

I would appreciate any comments, especially if there are some security
gurus here :-)

Jiba

May 25 '06 #1
> There are a couple factual inaccuracies on the site that I'd like to clear up first:
> Trivial benchmarks put cerealizer and banana/jelly on the same level as far as performance goes:
> $ python -m timeit -s 'from cereal import dumps; L = ["Hello", " ", ("w", "o", "r", "l", "d", ".")]' 'dumps(L)'
> 10000 loops, best of 3: 84.1 usec per loop
> $ python -m timeit -s 'from twisted.spread import banana, jelly; dumps = lambda o: banana.encode(jelly.jelly(o)); L = ["Hello", " ", ("w", "o", "r", "l", "d", ".")]' 'dumps(L)'
> 10000 loops, best of 3: 89.7 usec per loop
>
> This is with cBanana though, which has to be explicitly enabled and, of course, is written in C. So Cerealizer looks like it has the potential to do pretty well, performance-wise.

My personal benchmark was different; it used a list of 2000
objects defined as follows:

class O(object):
    def __init__(self):
        self.x = 1
        self.s = "jiba"
        self.o = None

with self.o referring to another O object. I think my benchmark,
although still very limited, is more representative since it involves
objects, strings, numbers and lists.
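
A rough sketch of a benchmark of this shape, written for Python 3 with the standard pickle module, could look as follows. It is not the author's script: the build_objects and time_roundtrip helpers are hypothetical, and linking each object to the previous one in the list is an assumption, since the post only says that self.o refers to another O object.

```python
import pickle
import time


class O(object):
    def __init__(self):
        self.x = 1
        self.s = "jiba"
        self.o = None


def build_objects(n=2000):
    objs = [O() for _ in range(n)]
    # Assumption: each object refers to the previous one in the list.
    for prev, cur in zip(objs, objs[1:]):
        cur.o = prev
    return objs


def time_roundtrip(dumps, loads, data):
    # Time serialization and deserialization separately.
    t0 = time.perf_counter()
    blob = dumps(data)
    t1 = time.perf_counter()
    copy = loads(blob)
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1, len(blob), copy


objs = build_objects()
dump_s, load_s, size, copy = time_roundtrip(pickle.dumps, pickle.loads, objs)
print("pickle")
print("dumps in %s s, %d bytes length" % (dump_s, size))
print("loads in %s s" % load_s)
```

Any other serializer exposing dumps/loads functions can be passed to time_roundtrip in the same way to compare timings and output size.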

The benchmark script is here:
http://svn.gna.org/viewcvs/*checkout...2Fplain&rev=31

The results are (using Psyco):
With old-style classes:
cerealizer
dumps in 0.0619530677795 s, 114914 bytes length
loads in 0.0313038825989 s

cPickle
dumps in 0.0301840305328 s, 116356 bytes length
loads in 0.023097038269 s

jelly + banana
dumps in 0.168012142181 s, 169729 bytes length
loads in 1.82081913948 s

jelly + cBanana
dumps in 0.082946062088 s, 169729 bytes length
loads in 0.156159877777 s

With new-style classes:
cerealizer
dumps in 0.0575239658356 s, 114914 bytes length
loads in 0.028165102005 s

cPickle
dumps in 0.07634806633 s, 116428 bytes length
loads in 0.0278959274292 s

jelly + banana
dumps in 0.156242132187 s, 169729 bytes length
loads: TypeError (I haven't investigated this problem yet, although it
is surely solvable)

jelly + cBanana
dumps in 0.10772895813 s, 169729 bytes length
loads: TypeError (I haven't investigated this problem yet, although it
is surely solvable)

As you can see, cPickle dumps old-style classes about 2 times faster
than cerealizer, but cerealizer dumps new-style classes faster than
cPickle and loads them at about the same speed (which makes sense, since
I have optimized it for new-style classes).
Jelly, however, is far behind, even with cBanana, especially for
loading.

> You talked about _Tuple and _Dereference on the website as well. These are internal implementation details. jelly also supports extension types, by way of setUnjellyableForClass and similar functions.

The problem arises only when the extension type expects an attribute of
a specific class, e.g. (in Pyrex):

cdef class MyClass:
    cdef MyClass other

The "other" attribute of MyClass can only contain a reference to an
instance of MyClass (or None). Thus it cannot be set to an instance of
_Dereference or _Tuple, even temporarily; doing other =
_Dereference(...) raises an exception.

I solve this problem in Cerealizer by doing a 2-pass object creation:
step 1, create all the objects; step 2, set all objects' states.
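
The two-pass idea can be sketched in plain Python as follows. This is an illustration, not Cerealizer's code: Node is a hypothetical stand-in for a Pyrex type whose attribute only accepts Node or None, and dump_nodes / load_nodes are made-up helpers that encode references as list indexes (assuming every referenced node is in the list).

```python
class Node(object):
    """Stand-in for a type whose 'other' attribute only accepts Node or None."""

    def __init__(self):
        self._other = None

    @property
    def other(self):
        return self._other

    @other.setter
    def other(self, value):
        if value is not None and not isinstance(value, Node):
            raise TypeError("other must be a Node or None")
        self._other = value


def dump_nodes(nodes):
    # Represent each node by the index of the node it refers to (or None).
    index = {id(n): i for i, n in enumerate(nodes)}
    return [None if n.other is None else index[id(n.other)] for n in nodes]


def load_nodes(refs):
    # Pass 1: create every object, with no state yet.
    nodes = [Node.__new__(Node) for _ in refs]
    # Pass 2: set the states. Every target already exists, so no
    # placeholder object ever has to be stored in a typed attribute.
    for node, ref in zip(nodes, refs):
        node.other = None if ref is None else nodes[ref]
    return nodes


a, b = Node(), Node()
a.other, b.other = b, a  # a reference cycle
restored = load_nodes(dump_nodes([a, b]))
```

Since all instances exist before any state is assigned, even the reference cycle between a and b is restored without a temporary placeholder.
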
> As far as security goes, no obvious problems jump out at me, either
> from the API or from skimming the code. I think early-binding
> __new__, __getstate__, and __setstate__ may be going further than
> is necessary. If someone can find code to set attributes on classes
> in your process space, they can probably already do anything they
> want to your program and don't need to exploit security problems in
> your serializer.


I agree with that; however, I would rather be "over-secure" than "just
as secure as necessary" :-)

Thank you for your opinion!
I'm going to update my website.
Jiba

May 25 '06 #2