
Sending binary pickled data through TCP

I have a pair of programs which trade python data back and forth by
pickling up lists of objects on one side (using
pickle.HIGHEST_PROTOCOL), and sending that data over a TCP socket
connection to the receiver, who unpickles the data and uses it.

So far this has been working fine, but I now need a way of separating
multiple chunks of pickled binary data in the stream being sent back and
forth.

Questions:

Is it safe to do what I'm doing? I didn't think there was anything
fundamentally wrong with sending binary pickled data, especially in the
closed, safe environment these programs operate under...but maybe I'm
making a poor assumption?

I was going to separate the chunks of pickled data with some well-formed
string, but couldn't that string potentially randomly appear in the
pickled data? Do I just pick an extremely
unlikely-to-be-randomly-generated string as the separator? Is there some
string that will definitely NEVER show up in pickled binary data?

I thought about base64 encoding the data and then decoding it on the
opposite side (like what xmlrpclib does), but that turns out to be a
very expensive operation, which I want to avoid; speed is of the essence
in this situation.

Is there a reliable way to determine the byte count of some pickled
binary data? Can I rely on len(<pickled data>) == bytes?

Thanks for all responses,
-David

--
Presenting:
mediocre nebula.

Oct 13 '06 #1
4 Replies


David Hirschfield <da****@ilm.com> writes:
> Is there a reliable way to determine the byte count of some pickled
> binary data? Can I rely on len(<pickled data>) == bytes?

Huh? Yes, of course len gives you the length.

As for the network representation, DJB proposes this format:
http://cr.yp.to/proto/netstrings.txt
Oct 13 '06 #2

Paul Rubin wrote:
> As for the network representation, DJB proposes this format:
> http://cr.yp.to/proto/netstrings.txt

Netstrings are cool and you'll find some python implementations if you
search.

But it is basically "number:string,", i.e. "12:hello world!,"
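
A minimal sketch of that framing in Python 3, applied to bytes. The function names encode_netstring and decode_netstring are made up for illustration and are not taken from any existing netstring library; the decoder assumes it is handed whatever has been read from the socket so far.

```python
def encode_netstring(payload: bytes) -> bytes:
    """Frame payload as b"<length>:<payload>,"."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(buffer: bytes):
    """Split the first netstring off the front of buffer.

    Returns (payload, rest), or (None, buffer) if the buffer does not
    yet contain a complete netstring.
    """
    colon = buffer.find(b":")
    if colon == -1:
        return None, buffer
    length = int(buffer[:colon].decode("ascii"))
    end = colon + 1 + length          # index of the trailing comma
    if len(buffer) < end + 1:
        return None, buffer           # wait for more data
    if buffer[end:end + 1] != b",":
        raise ValueError("netstring is missing its trailing comma")
    return buffer[colon + 1:end], buffer[end + 1:]
```

Because the length comes first, the payload can contain any bytes at all, which is why this works for pickled data without escaping.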

Or you could use escaping, which is what I usually do. This has the
advantage that you don't need to know how long the data is in advance.

E.g., these are from a scheme which uses \t to separate arguments and
\r or \n to separate transactions. These are then escaped in the
actual data using these functions:

import re

def escape(s):
    """This escapes the string passed in, changing CR, LF, TAB and \\ into
    \\r, \\n, \\t and \\\\"""
    s = s.replace("\\", "\\\\")  # backslash first, so the escapes added below are not doubled
    s = s.replace("\r", "\\r")
    s = s.replace("\n", "\\n")
    s = s.replace("\t", "\\t")
    return s

# Python 3 spelling: str.maketrans replaces the old string.maketrans
def unescape(s, _unescape_mapping=str.maketrans('tnr', '\t\n\r'),
             _unescape_re=re.compile(r'\\([rnt\\])')):
    """This unescapes the string passed in, changing \\r, \\n, \\t and \\\\ into
    CR, LF, TAB and \\"""
    def _translate(m):
        # group(1) is the character after the backslash; a backslash maps to itself
        return m.group(1).translate(_unescape_mapping)
    return _unescape_re.sub(_translate, s)

(These functions have been through the optimisation mill which is why
they may not look immediately like how you might first think of
writing them!)
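
The functions above work on text. A rough sketch of how the same idea might be applied to binary pickles in Python 3 follows; it escapes only the backslash and the newline and uses a bare newline as the record separator. All names here (DELIMITER, escape_bytes, unescape_bytes, frame, unframe) are hypothetical and are not part of the scheme described above.

```python
import pickle

DELIMITER = b"\n"  # assumed record separator for this sketch


def escape_bytes(data: bytes) -> bytes:
    """Escape backslash and newline so the result contains no raw newline."""
    data = data.replace(b"\\", b"\\\\")  # backslash first
    return data.replace(b"\n", b"\\n")


def unescape_bytes(data: bytes) -> bytes:
    """Undo escape_bytes by scanning left to right."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i:i + 1] == b"\\":
            following = data[i + 1:i + 2]
            out += b"\n" if following == b"n" else following
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def frame(objects) -> bytes:
    """Pickle each object, escape it, and terminate it with the delimiter."""
    return b"".join(
        escape_bytes(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)) + DELIMITER
        for obj in objects
    )


def unframe(stream: bytes) -> list:
    """Split on the delimiter and unpickle each record (trusted peers only)."""
    return [pickle.loads(unescape_bytes(rec)) for rec in stream.split(DELIMITER) if rec]
```

The cost is a pass over every byte on both sides, so with speed being the main concern a length prefix is probably cheaper than escaping.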

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Oct 13 '06 #3

David Hirschfield wrote:
> I have a pair of programs which trade python data back and forth by
> pickling up lists of objects on one side (using
> pickle.HIGHEST_PROTOCOL), and sending that data over a TCP socket
> connection to the receiver, who unpickles the data and uses it.
>
> So far this has been working fine, but I now need a way of separating
> multiple chunks of pickled binary data in the stream being sent back and
> forth.
>
> Questions:
>
> Is it safe to do what I'm doing? I didn't think there was anything
> fundamentally wrong with sending binary pickled data, especially in the
> closed, safe environment these programs operate under...but maybe I'm
> making a poor assumption?
>
> I was going to separate the chunks of pickled data with some well-formed
> string, but couldn't that string potentially randomly appear in the
> pickled data? Do I just pick an extremely
> unlikely-to-be-randomly-generated string as the separator? Is there some
> string that will definitely NEVER show up in pickled binary data?
>
> I thought about base64 encoding the data and then decoding it on the
> opposite side (like what xmlrpclib does), but that turns out to be a
> very expensive operation, which I want to avoid; speed is of the essence
> in this situation.
>
> Is there a reliable way to determine the byte count of some pickled
> binary data? Can I rely on len(<pickled data>) == bytes?

Instead of communicating directly with the TCP socket, you could talk
to it via an object which precedes each chunk with a byte count, and if
you're working with multiple streams of pickled data, then each chunk
could also have an identifier which specifies which stream it belongs
to.
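
One way this could look in Python 3, as a sketch only. The header layout (a 2-byte stream identifier followed by a 4-byte payload length, both in network byte order) and the names HEADER, send_chunk, recv_exact and recv_chunk are assumptions made for the example, not an established protocol.

```python
import pickle
import socket
import struct

# Assumed header: 2-byte stream id, 4-byte payload length, network byte order.
HEADER = struct.Struct("!HI")


def send_chunk(sock, stream_id, obj):
    """Pickle obj and send it preceded by its stream id and byte count."""
    payload = pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)
    sock.sendall(HEADER.pack(stream_id, len(payload)) + payload)


def recv_exact(sock, count):
    """Read exactly count bytes; recv may return fewer than requested."""
    pieces = []
    while count > 0:
        data = sock.recv(count)
        if not data:
            raise ConnectionError("socket closed in the middle of a chunk")
        pieces.append(data)
        count -= len(data)
    return b"".join(pieces)


def recv_chunk(sock):
    """Return (stream_id, obj) for the next chunk on the socket."""
    stream_id, length = HEADER.unpack(recv_exact(sock, HEADER.size))
    # Only unpickle data that comes from a peer you trust.
    return stream_id, pickle.loads(recv_exact(sock, length))
```

The receiver never has to search the data for a separator: it reads the fixed-size header, then exactly as many bytes as the header announces.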

Oct 13 '06 #4

David Hirschfield wrote:
> I have a pair of programs which trade python data back and forth by
> pickling up lists of objects on one side (using
> pickle.HIGHEST_PROTOCOL), and sending that data over a TCP socket
> connection to the receiver, who unpickles the data and uses it.
>
> So far this has been working fine, but I now need a way of separating
> multiple chunks of pickled binary data in the stream being sent back and
> forth.
> [...]

Save yourself the trouble of implementing some sort of IPC mechanism
over sockets, and give Pyro a swing: http://pyro.sourceforge.net

In Pyro almost all of the nastiness that is usually associated with socket
programming is shielded from you, and you'll get much more as well
(a complete pythonic IPC library).

It may be a bit heavy for what you are trying to do but it may
be the right choice to avoid troubles later when your requirements
get more complex and/or you discover problems with your networking code.

Hth,
---Irmen de Jong
Oct 13 '06 #5
