
Receive data from socket stream

P: n/a
I wanted to ask for standard ways to receive data from a socket stream
(with socket.socket.recv()). It's simple when you know the amount of
data that you're going to receive, or when you'll receive data until
the remote peer closes the connection. But I'm not sure which is the
best way to receive a message with undetermined length from a stream
in a connection that you expect to remain open. Until now, I've been
doing this little trick:

data = client.recv(256)
new = data
while len(new) == 256:
    new = client.recv(256)
    data += new

That works well in most cases. But it's obviously error-prone. What if
the client sent *exactly* two hundred and fifty six bytes? It would
keep waiting for data inside the loop. Is there really a better and
standard way, or is this as good as it gets?

Sorry if this is a little off-topic and more related to networking,
but I'm using Python anyway.

Thanks,
Sebastian
Jun 27 '08 #1
16 Replies


P: n/a
s0****@gmail.com wrote:
> I wanted to ask for standard ways to receive data from a socket stream
> (with socket.socket.recv()). It's simple when you know the amount of
> data that you're going to receive, or when you'll receive data until
> the remote peer closes the connection. But I'm not sure which is the
> best way to receive a message with undetermined length from a stream
> in a connection that you expect to remain open. Until now, I've been
> doing this little trick:
>
> data = client.recv(256)
> new = data
> while len(new) == 256:
>     new = client.recv(256)
>     data += new
>
> That works well in most cases. But it's obviously error-prone. What if
> the client sent *exactly* two hundred and fifty six bytes? It would
> keep waiting for data inside the loop. Is there really a better and
> standard way, or is this as best as it gets?
>
> Sorry if this is a little off-topic and more related to networking,
> but I'm using Python anyway.

You solve this by having a protocol that the client and server both
agree on, so that the client knows how much to read from the server.
There are any number of ways of doing this, all of which depend on the
kind of data you want to transfer and for what purpose.
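
One common shape for such a protocol is a length prefix: each message is preceded by a fixed-size header saying how many bytes follow. What follows is only a minimal sketch, written for Python 3 rather than the Python 2 used elsewhere in this thread. The names HEADER, recv_exact, send_msg and recv_msg are made up for illustration, and the 4-byte big-endian header is an assumption, not something the posts above prescribe.

```python
import socket
import struct

# Assumed wire format: 4-byte unsigned length in network byte order,
# followed by exactly that many payload bytes.
HEADER = struct.Struct("!I")


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_msg(sock, payload):
    """Send one message: length header followed by the payload."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_msg(sock):
    """Receive one complete message, however it was split in transit."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

With this framing a payload of exactly 256 bytes is no longer a special case, because the receiver never has to guess where a message ends.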

--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 18 N 121 57 W && AIM, Y!M erikmaxfrancis
In the final choice a soldier's pack is not so heavy a burden as a
prisoner's chains. -- Dwight D. Eisenhower, 1890-1969
Jun 27 '08 #2

P: n/a
On Apr 25, 5:52 pm, Erik Max Francis <m...@alcyone.com> wrote:
> s0s...@gmail.com wrote:
>> I wanted to ask for standard ways to receive data from a socket stream
>> (with socket.socket.recv()). It's simple when you know the amount of
>> data that you're going to receive, or when you'll receive data until
>> the remote peer closes the connection. But I'm not sure which is the
>> best way to receive a message with undetermined length from a stream
>> in a connection that you expect to remain open. Until now, I've been
>> doing this little trick:
>>
>> data = client.recv(256)
>> new = data
>> while len(new) == 256:
>>     new = client.recv(256)
>>     data += new
>>
>> That works well in most cases. But it's obviously error-prone. What if
>> the client sent *exactly* two hundred and fifty six bytes? It would
>> keep waiting for data inside the loop. Is there really a better and
>> standard way, or is this as best as it gets?
>>
>> Sorry if this is a little off-topic and more related to networking,
>> but I'm using Python anyway.
>
> You solve this by having a protocol that the client and server both
> agree on, so that the client knows how much to read from the server.
> There are any number of ways of doing this, all of which depend on the
> kind of data you want to transfer and for what purpose.
>
> --
> Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
>  San Jose, CA, USA && 37 18 N 121 57 W && AIM, Y!M erikmaxfrancis
>   In the final choice a soldier's pack is not so heavy a burden as a
>    prisoner's chains. -- Dwight D. Eisenhower, 1890-1969

So, in an HTTP client/server, I'd have to look at the Content-Length
header?
Jun 27 '08 #3

P: n/a
On Apr 25, 7:39 pm, s0s...@gmail.com wrote:
> I wanted to ask for standard ways to receive data from a socket stream
> (with socket.socket.recv()). It's simple when you know the amount of
> data that you're going to receive, or when you'll receive data until
> the remote peer closes the connection. But I'm not sure which is the
> best way to receive a message with undetermined length from a stream
> in a connection that you expect to remain open. Until now, I've been
> doing this little trick:
>
> data = client.recv(256)
> new = data
> while len(new) == 256:
>     new = client.recv(256)
>     data += new
>
> That works well in most cases. But it's obviously error-prone. What if
> the client sent *exactly* two hundred and fifty six bytes? It would
> keep waiting for data inside the loop. Is there really a better and
> standard way, or is this as best as it gets?
>
> Sorry if this is a little off-topic and more related to networking,
> but I'm using Python anyway.
>
> Thanks,
> Sebastian

done = False
remaining = ''
while not done:
    data = client.recv(256)
    # process() is your own parser: it consumes what it can and returns
    # (done, unconsumed data)
    done, remaining = process(remaining + data)
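
The post leaves process() undefined, so here is one hypothetical way it could look, sketched in Python 3. It assumes newline-terminated messages and a b"QUIT" line marking the end of the conversation; both are invented for the example, as is the handle callback. A real receive loop should also stop when recv() returns an empty result, since that means the peer has closed the connection.

```python
def process(buffer, handle=print):
    """Consume every complete line in buffer.

    Returns (done, leftover): done is True once a b"QUIT" line is seen,
    leftover is the unconsumed tail (a partial line, or b"").
    """
    while b"\n" in buffer:
        line, buffer = buffer.split(b"\n", 1)
        if line == b"QUIT":
            return True, buffer
        handle(line)
    return False, buffer
```

The key point is that the parser, not the size of the recv() result, decides when a message is complete; whatever it cannot use yet is carried over to the next call.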

PS: are you sure you shouldn't be using RPC or SOAP ?
Jun 27 '08 #4

P: n/a
s0****@gmail.com wrote:
> Until now, I've been
> doing this little trick:
>
> data = client.recv(256)
> new = data
> while len(new) == 256:
>     new = client.recv(256)
>     data += new

Are you aware that recv() will not always return the number of bytes asked for?
(send() is similar: it doesn't guarantee that the full buffer you pass to it will
be sent at once.)

I suggest reading this: http://www.amk.ca/python/howto/sockets/sockets.html
--irmen
Jun 27 '08 #5

P: n/a
On Apr 26, 7:25 am, Irmen de Jong <irmen.NOS...@xs4all.nl> wrote:
> s0s...@gmail.com wrote:
>> Until now, I've been
>> doing this little trick:
>>
>> data = client.recv(256)
>> new = data
>> while len(new) == 256:
>>     new = client.recv(256)
>>     data += new
>
> Are you aware that recv() will not always return the amount of bytes asked for?
> (send() is similar; it doesn't guarantee that the full buffer you pass to it will be
> sent at once)
>
> I suggest reading this: http://www.amk.ca/python/howto/sockets/sockets.html
>
> --irmen

So every time I want to send something, I must use

totalsent = 0
while sent < len(data):
    sent = sock.send(data[totalsent:])
    totalsent += sent

instead of a simple sock.send(data)? That's kind of nasty. Also, is it
better then to use sockets as file objects? Maybe unbuffered?
Jun 27 '08 #6

P: n/a

<s0****@gmail.com> wrote in message
news:4c**********************************@56g2000hsm.googlegroups.com...
> On Apr 26, 7:25 am, Irmen de Jong <irmen.NOS...@xs4all.nl> wrote:
>> s0s...@gmail.com wrote:
>>> Until now, I've been
>>> doing this little trick:
>>>
>>> data = client.recv(256)
>>> new = data
>>> while len(new) == 256:
>>>     new = client.recv(256)
>>>     data += new
>>
>> Are you aware that recv() will not always return the amount of bytes
>> asked for? (send() is similar; it doesn't guarantee that the full
>> buffer you pass to it will be sent at once)
>>
>> I suggest reading this: http://www.amk.ca/python/howto/sockets/sockets.html
>>
>> --irmen
>
> So every time I use I want to send some thing, I must use
>
> totalsent = 0
> while sent < len(data):
>     sent = sock.send(data[totalsent:])
>     totalsent += sent
>
> instead of a simple sock.send(data)? That's kind of nasty. Also, is it
> better then to use sockets as file objects? Maybe unbuffered?

I think you meant:

while totalsent < len(data):

Python also has the sendall() function.
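
For reference, the corrected loop could be wrapped up roughly as below. This is a Python 3 sketch; send_all is a made-up helper name, and the check for a zero return is an extra safeguard that was not in the snippet being corrected. In practice socket.sendall() does the same job, so the helper is only useful for seeing what happens underneath.

```python
import socket


def send_all(sock, data):
    """Keep calling send() until every byte of data has been accepted."""
    totalsent = 0
    while totalsent < len(data):
        # send() reports how many bytes it actually took, possibly fewer
        # than were offered, so resume from where it stopped.
        sent = sock.send(data[totalsent:])
        if sent == 0:
            raise ConnectionError("socket connection broken")
        totalsent += sent
    return totalsent
```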

-Mark

Jun 27 '08 #7

P: n/a
s0****@gmail.com <s0****@gmail.com> wrote:
> I wanted to ask for standard ways to receive data from a socket stream
> (with socket.socket.recv()). It's simple when you know the amount of
> data that you're going to receive, or when you'll receive data until
> the remote peer closes the connection. But I'm not sure which is the
> best way to receive a message with undetermined length from a stream
> in a connection that you expect to remain open. Until now, I've been
> doing this little trick:
>
> data = client.recv(256)
> new = data
> while len(new) == 256:
>     new = client.recv(256)
>     data += new
>
> That works well in most cases. But it's obviously error-prone. What if
> the client sent *exactly* two hundred and fifty six bytes? It would
> keep waiting for data inside the loop. Is there really a better and
> standard way, or is this as best as it gets?

What you are missing is that if the recv ever returns no bytes at all
then the other end has closed the connection. So something like this
is the correct thing to write :-

data = ""
while True:
    new = client.recv(256)
    if not new:
        break
    data += new

From the man page for recv

RETURN VALUE

These calls return the number of bytes received, or -1 if an
error occurred. The return value will be 0 when the peer has
performed an orderly shutdown.

In the -1 case Python will raise a socket.error.

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Jun 27 '08 #8

P: n/a
Mark Tolonen <M8********@mailinator.com> wrote:
>> So every time I use I want to send some thing, I must use
>>
>> totalsent = 0
>> while sent < len(data):
>>     sent = sock.send(data[totalsent:])
>>     totalsent += sent
>>
>> instead of a simple sock.send(data)? That's kind of nasty. Also, is it
>> better then to use sockets as file objects? Maybe unbuffered?
>
> I think you meant:
>
> while totalsent < len(data):
>
> Python also has the sendall() function.

Just to elaborate, sendall does exactly the above for you.

sendall(self, *args)
sendall(data[, flags])

Send a data string to the socket. For the optional flags
argument, see the Unix manual. This calls send() repeatedly
until all data is sent. If an error occurs, it's impossible
to tell how much data has been sent.

There should really be a recvall for symmetry, but I don't think it
would get much use!

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Jun 27 '08 #9

P: n/a
Nick Craig-Wood <ni**@craig-wood.com> writes:
> What you are missing is that if the recv ever returns no bytes at all
> then the other end has closed the connection. So something like this
> is the correct thing to write :-
>
> data = ""
> while True:
>     new = client.recv(256)
>     if not new:
>         break
>     data += new

This is a good case for the iter() function:

import cStringIO
from functools import partial

buf = cStringIO.StringIO()
for new in iter(partial(client.recv, 256), ''):
    buf.write(new)
data = buf.getvalue()

Note that appending to a string is almost never a good idea, since it
can result in quadratic allocation.
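
For anyone reading this with Python 3, the same idea might be sketched as follows. Here recv() returns bytes, so the sentinel passed to iter() has to be b"" rather than ''. The helper name recv_until_closed is invented for the example, and joining the chunks once at the end stands in for the cStringIO buffer.

```python
import socket
from functools import partial


def recv_until_closed(sock, bufsize=256):
    """Collect chunks until recv() returns b"" (peer closed), then join once."""
    # iter(callable, sentinel) keeps calling sock.recv(bufsize) and stops
    # as soon as a call returns the sentinel b"".
    return b"".join(iter(partial(sock.recv, bufsize), b""))
```

Like the loop it replaces, this only terminates when the other end closes (or shuts down) its side of the connection.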
Jun 27 '08 #10

P: n/a
Hrvoje Niksic <hn*****@xemacs.org> wrote:
> Nick Craig-Wood <ni**@craig-wood.com> writes:
>> What you are missing is that if the recv ever returns no bytes at all
>> then the other end has closed the connection. So something like this
>> is the correct thing to write :-
>>
>> data = ""
>> while True:
>>     new = client.recv(256)
>>     if not new:
>>         break
>>     data += new
>
> This is a good case for the iter() function:
>
> buf = cStringIO.StringIO()
> for new in iter(partial(client.recv, 256), ''):
>     buf.write(new)
> data = buf.getvalue()
>
> Note that appending to a string is almost never a good idea, since it
> can result in quadratic allocation.

My aim was clear exposition rather than the ultimate performance!

Anyway str += was optimised in python 2.4 or 2.5 (forget which) wasn't
it? I'm not convinced it will be any worse performing than
cStringIO.StringIO.write() which effectively appends to a string in
exactly the same way.

This test agrees with me!

$ python -m timeit -s 's = ""' 'for i in xrange(100000): s+="x"'
10 loops, best of 3: 23.8 msec per loop

$ python -m timeit -s 'from cStringIO import StringIO; s=StringIO()' 'for i in xrange(100000): s.write("x")'
10 loops, best of 3: 56 msec per loop

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Jun 27 '08 #11

P: n/a
On Apr 28, 4:30 am, Nick Craig-Wood <n...@craig-wood.com> wrote:
> s0s...@gmail.com <s0s...@gmail.com> wrote:
>> I wanted to ask for standard ways to receive data from a socket stream
>> (with socket.socket.recv()). It's simple when you know the amount of
>> data that you're going to receive, or when you'll receive data until
>> the remote peer closes the connection. But I'm not sure which is the
>> best way to receive a message with undetermined length from a stream
>> in a connection that you expect to remain open. Until now, I've been
>> doing this little trick:
>>
>> data = client.recv(256)
>> new = data
>> while len(new) == 256:
>>     new = client.recv(256)
>>     data += new
>>
>> That works well in most cases. But it's obviously error-prone. What if
>> the client sent *exactly* two hundred and fifty six bytes? It would
>> keep waiting for data inside the loop. Is there really a better and
>> standard way, or is this as best as it gets?
>
> What you are missing is that if the recv ever returns no bytes at all
> then the other end has closed the connection. So something like this
> is the correct thing to write :-
>
> data = ""
> while True:
>     new = client.recv(256)
>     if not new:
>         break
>     data += new
>
> From the man page for recv
>
> RETURN VALUE
>
> These calls return the number of bytes received, or -1 if an
> error occurred. The return value will be 0 when the peer has
> performed an orderly shutdown.
>
> In the -1 case python will raise a socket.error.
>
> --
> Nick Craig-Wood <n...@craig-wood.com> -- http://www.craig-wood.com/nick

But as I said in my first post, it's simple when you know the amount
of data that you're going to receive, or when you'll receive data
until the remote peer closes the connection. But what about receiving
a message with undetermined length in a connection that you don't want
to close? I already figured it out: in the case of an HTTP server/
client (which is what I'm doing), you have to look for an empty line
in the message, which signals the end of the message headers. As for
the body, you have to look at the Content-Length header, or, if the
message uses the "chunked" transfer-coding, you have to decode the
chunks as they arrive. There are other cases, but those are the most
important.
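
The two steps described above (find the empty line, then honour Content-Length) could be sketched like this in Python 3. It is deliberately incomplete: the function name read_http_message is made up, chunked transfer-coding is not handled at all, and a missing Content-Length is simply treated as "no body", which is an assumption rather than full HTTP behaviour.

```python
import socket


def read_http_message(sock, bufsize=4096):
    """Return (header_lines, body, leftover) for one message.

    Chunked bodies are not handled in this sketch.
    """
    buf = b""
    while b"\r\n\r\n" not in buf:           # the empty line ends the headers
        chunk = sock.recv(bufsize)
        if not chunk:
            raise ConnectionError("connection closed inside the headers")
        buf += chunk
    head, _, rest = buf.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    length = 0                               # assumed: no header, no body
    for line in lines[1:]:                   # lines[0] is the start line
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-length":
            length = int(value.strip())

    while len(rest) < length:                # the body may arrive in pieces
        chunk = sock.recv(bufsize)
        if not chunk:
            raise ConnectionError("connection closed inside the body")
        rest += chunk
    # Anything past `length` belongs to the next message on the connection.
    return lines, rest[:length], rest[length:]
```

The leftover bytes matter on a connection that stays open: they are the start of the next message and should be fed back into the buffer for the next call.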

BTW, has anybody used sockets as file-like objects
(client.makefile())? Is it more secure? More efficient?

Sebastian
Jun 27 '08 #12

P: n/a
Nick Craig-Wood <ni**@craig-wood.com> writes:
>> Note that appending to a string is almost never a good idea, since it
>> can result in quadratic allocation.
>
> My aim was clear exposition rather than the ultimate performance!

That would normally be fine. My post wasn't supposed to pick
performance nits, but to point out potentially quadratic behavior.

> Anyway str += was optimised in python 2.4 or 2.5 (forget which) wasn't
> it?

That optimization works only in certain cases, when working with
uninterned strings with a reference count of 1, and then only when the
strings are stored in local variables, rather than in global vars or
in slots. And then, it only works in CPython, not in other
implementations. The optimization works by "cheating" -- breaking the
immutable string abstraction in the specific cases in which it is
provably safe to do so.
http://utcc.utoronto.ca/~cks/space/b...tringConcatOpt
examines it in some detail.

Guido was reluctant to accept the patch that implements the
optimization because he thought it would "change the way people write
code", a sentiment expressed in
http://mail.python.org/pipermail/pyt...st/046702.html
This discussion shows that he was quite right in retrospect. (I'm not
saying that the optimization is a bad thing, just that it is changing
the "recommended" way of writing Python in a way that other
implementations cannot follow.)
Jun 27 '08 #13

P: n/a
On Apr 28, 4:42 am, Hrvoje Niksic <hnik...@xemacs.org> wrote:
> Nick Craig-Wood <n...@craig-wood.com> writes:
>> What you are missing is that if the recv ever returns no bytes at all
>> then the other end has closed the connection. So something like this
>> is the correct thing to write :-
>>
>> data = ""
>> while True:
>>     new = client.recv(256)
>>     if not new:
>>         break
>>     data += new
>
> This is a good case for the iter() function:
>
> buf = cStringIO.StringIO()
> for new in iter(partial(client.recv, 256), ''):
>     buf.write(new)
> data = buf.getvalue()
>
> Note that appending to a string is almost never a good idea, since it
> can result in quadratic allocation.

A question regarding cStringIO.StringIO(): is there a way to get
getvalue() to return all the bytes after the current file position
(not before)? For example:

buf = cStringIO.StringIO()
buf.write("foo bar")
buf.seek(3)
buf.getvalue(True) # the True argument means
# to return the bytes up
# to the current file position

That returns 'foo'. Is there a way to get it to return ' bar'?
Jun 27 '08 #14

P: n/a
On Mon, 28 Apr 2008 19:29:33 -0300, <s0****@gmail.com> wrote:
> A question regarding cStringIO.StringIO(): is there a way to do get
> getvalue() to return all the bytes after the current file position
> (not before)? For example
>
> buf = cStringIO.StringIO()
> buf.write("foo bar")
> buf.seek(3)
> buf.getvalue(True) # the True argument means
>                    # to return the bytes up
>                    # to the current file position
>
> That returns 'foo'. Is there a way to get it to return ' bar'?

buf.read() - the obvious answer, once you know it :)
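
A small illustration of that answer, written against Python 3's io.BytesIO as a stand-in for cStringIO (an assumption made so the snippet runs on current interpreters); the variable names tail and whole are only for the example:

```python
import io

buf = io.BytesIO()
buf.write(b"foo bar")
buf.seek(3)

tail = buf.read()        # everything from the current position to the end
whole = buf.getvalue()   # the entire contents, regardless of position
```

Note that read() also moves the file position to the end, so a second read() returns nothing until you seek() back.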

--
Gabriel Genellina

Jun 27 '08 #15

P: n/a
Hrvoje Niksic <hn*****@xemacs.org> wrote:
> Nick Craig-Wood <ni**@craig-wood.com> writes:
>>> Note that appending to a string is almost never a good idea, since it
>>> can result in quadratic allocation.
>>
>> My aim was clear exposition rather than the ultimate performance!
>
> That would normally be fine. My post wasn't supposed to pick
> performance nits, but to point out potentially quadratic behavior.
>
>> Anyway str += was optimised in python 2.4 or 2.5 (forget which) wasn't
>> it?
>
> That optimization works only in certain cases, when working with
> uninterned strings with a reference count of 1, and then only when the
> strings are in stored local variables, rather than in global vars or
> in slots. And then, it only works in CPython, not in other
> implementations. The optimization works by "cheating" -- breaking the
> immutable string abstraction in the specific cases in which it is
> provably safe to do so.
> http://utcc.utoronto.ca/~cks/space/b...tringConcatOpt
> examines it in some detail.

Ah, I didn't realise that - thanks for the interesting link.

For the example I gave, with just a simple local variable, the
optimisation kicks in. I can see how you could easily migrate that to
an instance variable and the optimisation would no longer work, e.g.

$ python -m timeit -s 's=""' 'for i in xrange(10000): s+="x"'
1000 loops, best of 3: 1.04 msec per loop

$ python -m timeit -s 'class A: pass' -s 'a=A(); a.s=""' 'for i in xrange(10000): a.s+="x"'
10 loops, best of 3: 160 msec per loop

> Guido was reluctant to accept the patch that implements the
> optimization because he thought it would "change the way people write
> code", a sentiment expressed in
> http://mail.python.org/pipermail/pyt...st/046702.html
> This discussion shows that he was quite right in retrospect. (I'm not
> saying that the optimization is a bad thing, just that it is changing
> the "recommended" way of writing Python in a way that other
> implementations cannot follow.)

Certainly something I wasn't aware of before - thanks!

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Jun 27 '08 #16

P: n/a
s0****@gmail.com <s0****@gmail.com> wrote:
> But as I said in my first post, it's simple when you know the amount
> of data that you're going to receive, or when you'll receive data
> until the remote peer closes the connection. But what about receiving
> a message with undetermined length in a connection that you don't want
> to close?

You obviously need some sort of protocol. Here is some code (taken
from a real project and modified a bit) which returns \r\n separated
lines from a socket as they arrive, which is a very simple (but
widespread) protocol.

self.rx_buf is set to "" in the initialisation
self.sock is the socket

def rx_line(self):
    message = None
    while 1:
        pos = self.rx_buf.find("\r\n")
        if pos >= 0:
            message = self.rx_buf[:pos]
            self.rx_buf = self.rx_buf[pos+2:]
            break
        try:
            rx = self.sock.recv(4096)
        except socket.error, e:
            self.sock = None
            raise ServerNetworkException(e)
        if len(rx) == 0:
            self.sock = None
            raise ServerDisconnectedException()
        self.rx_buf += rx
    return message
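
Since the method above depends on a surrounding class and project-specific exceptions that are not shown, here is a self-contained Python 3 sketch of the same technique. The class name LineReader is invented, and the built-in ConnectionError stands in for the ServerDisconnectedException of the original project.

```python
import socket


class LineReader:
    """Return CRLF-terminated lines from a stream socket, one per call."""

    def __init__(self, sock):
        self.sock = sock
        self.rx_buf = b""

    def rx_line(self):
        while True:
            line, sep, rest = self.rx_buf.partition(b"\r\n")
            if sep:                      # a full line is already buffered
                self.rx_buf = rest
                return line
            rx = self.sock.recv(4096)
            if not rx:                   # orderly shutdown by the peer
                raise ConnectionError("peer disconnected")
            self.rx_buf += rx
```

Because the leftover bytes stay in rx_buf between calls, it does not matter whether a line arrives in one recv(), is split across several, or shares a recv() with the next line.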

Sorry I misunderstood your original post!

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Jun 27 '08 #17
