TCPServer and xdrlib


Hi All,

I'm trying to write a multithreaded TCP server. I have used xmlrpc
before for many purposes, but in this case this would not be efficient:

- I have to send larger amounts of data, the overhead of converting to
XML and parsing XML back would be too much pain
- I have no clue how to do keep-alive with SimpleXMLRPCServer and it is
slow to open a new connection for each RPC
- I would like to do session management (authentication, then store
session info on the server side) which is also hard with xmlrpc.

I have looked at various solutions including:

- PyOrbit - too heavy weight
- Pyro - uses pickle, I do not trust it

BTW I do not care about the clients - they must trust the server side.
In contrast, the server should not receive anything from the clients
that is dangerous. I would like to use something that is fast, and can
only transfer data, not code. For this reason I think I cannot use the
marshal module because it is able to marshal code objects. I think I'm
going to implement my own "pickler" over xdrlib, that will only
pack/unpack data, NOT code. (It would also have the advantage that
others could write clients in different languages.)
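
To make the "data only, NOT code" idea concrete, here is a minimal sketch of such a packer. It is only an illustration under stated assumptions: it borrows XDR's conventions (big-endian fields, padding to 4-byte boundaries) but uses the `struct` module, because `xdrlib` was removed from the standard library in Python 3.13. The type tags and the names `pack` and `unpack` are hypothetical, not part of XDR or of any existing library.

```python
import struct

# Hypothetical type tags for this sketch; they are not part of any standard.
T_NONE, T_BOOL, T_INT, T_FLOAT, T_BYTES, T_STR, T_LIST, T_DICT = range(8)


def _tag(t):
    return struct.pack(">I", t)


def _opaque(data):
    # XDR-style variable-length field: 4-byte length, data, zero padding to a multiple of 4.
    return struct.pack(">I", len(data)) + data + b"\0" * (-len(data) % 4)


def pack(obj):
    """Serialize plain data only; any other type raises TypeError."""
    if obj is None:
        return _tag(T_NONE)
    if isinstance(obj, bool):  # check bool before int: bool is a subclass of int
        return _tag(T_BOOL) + struct.pack(">I", int(obj))
    if isinstance(obj, int):
        return _tag(T_INT) + struct.pack(">q", obj)  # 64-bit signed integer
    if isinstance(obj, float):
        return _tag(T_FLOAT) + struct.pack(">d", obj)
    if isinstance(obj, bytes):
        return _tag(T_BYTES) + _opaque(obj)
    if isinstance(obj, str):
        return _tag(T_STR) + _opaque(obj.encode("utf-8"))
    if isinstance(obj, (list, tuple)):  # tuples come back as lists
        return _tag(T_LIST) + struct.pack(">I", len(obj)) + b"".join(pack(x) for x in obj)
    if isinstance(obj, dict):
        body = b"".join(pack(k) + pack(v) for k, v in obj.items())
        return _tag(T_DICT) + struct.pack(">I", len(obj)) + body
    raise TypeError("refusing to serialize %r" % type(obj))


def unpack(buf, pos=0):
    """Return (value, next_position); unknown tags raise ValueError."""
    (t,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    if t == T_NONE:
        return None, pos
    if t == T_BOOL:
        return bool(struct.unpack_from(">I", buf, pos)[0]), pos + 4
    if t == T_INT:
        return struct.unpack_from(">q", buf, pos)[0], pos + 8
    if t == T_FLOAT:
        return struct.unpack_from(">d", buf, pos)[0], pos + 8
    if t in (T_BYTES, T_STR):
        (n,) = struct.unpack_from(">I", buf, pos)
        data = bytes(buf[pos + 4:pos + 4 + n])
        if len(data) != n:
            raise ValueError("truncated message")
        pos += 4 + n + (-n % 4)
        return (data.decode("utf-8") if t == T_STR else data), pos
    if t in (T_LIST, T_DICT):
        (n,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        items = []
        for _ in range(n * (2 if t == T_DICT else 1)):
            item, pos = unpack(buf, pos)
            items.append(item)
        if t == T_LIST:
            return items, pos
        return dict(zip(items[0::2], items[1::2])), pos
    raise ValueError("unknown type tag %d" % t)
```

Because the decoder only ever builds `None`, numbers, strings, bytes, lists and dicts, a malicious client cannot make it construct code objects or arbitrary classes. A real server would still need limits on message size and nesting depth, which this sketch leaves out.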

Before I start re-inventing the wheel:

- Is there another (already existing) higher level framework that I can
try? It should be safe and fast, that is the only restriction.
- Do you think that it is a good idea to use xdrlib? I haven't seen
projects using it directly. For me it is like the rotor module was - it
was there but almost nobody used it. There might be a better lower level
library which I don't know of.

Thank you,

Laszlo

Jun 27 '08 #1
On May 16, 7:16 am, Laszlo Nagy <gand...@shopzeus.com> wrote:
> Hi All,

Hello, :-)

> I'm trying to write a multithreaded TCP server. I have used xmlrpc

How exactly did you come to the conclusion that your server must be
multithreaded?

> - I have to send larger amounts of data, the overhead of converting to
> XML and parsing XML back would be too much pain

- What's the expected amount of data you have to transfer?
- What's the expected network bandwidth?
- What's the expected acceptable transfer time?
- How many users are expected to be transferring data at the same
time?

> I have looked at various solutions including:
>
> - PyOrbit - too heavy weight
> - Pyro - uses pickle, I do not trust it

Did you consider gzipping your XML (or YAML) packets? Would the
transfer time be acceptable in this case?

> BTW I do not care about the clients - they must trust the server side.

Oh, he said he _doesn't care about the clients_! ;-)

> In contrast, the server should not receive anything from the clients
> that is dangerous. I would like to use something that is fast, and can
> only transfer data, not code. For this reason I think I cannot use the
> marshal module because it is able to marshal code objects. I think I'm
> going to implement my own "pickler" over xdrlib, that will only
> pack/unpack data, NOT code. (It would also have the advantage that
> others could write clients in different languages.)

In general I would avoid that. Try to better estimate the speed
requirements, to see if you really need to do this.

> Before I start re-inventing the wheel:
>
> - Is there another (already existing) higher level framework that I can
> try? It should be safe and fast, that is the only restriction.

There's "Twisted".
http://twistedmatrix.com/projects/co...o/servers.html

> - Do you think that it is a good idea to use xdrlib? I haven't seen
> projects using it directly. For me it is like the rotor module was - it
> was there but almost nobody used it. There might be a better lower level
> library which I don't know of.

It's probably the best way to send binary stuff over the network.
But, again, I would avoid doing so.

> Thank you,
>
> Laszlo
Jun 27 '08 #2
> Did you consider gzipping your XML (or YAML) packets? Would the
> transfer time be acceptable in this case?

That would add even more to the overhead of transcoding the
transport layer. Switching from XML-RPC to a JSON-based protocol
in a project of mine reduced the overhead 10-20 fold, mainly because of
reduced size and parsing effort.
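
A quick way to check the size part of that claim on your own data is to serialize the same payload both ways and compare lengths. The sketch below uses only the standard library; the `rows` payload is invented for illustration, and the ratio will vary with the shape of the data, so it does not reproduce the 10-20 fold figure above, which was an overall overhead measurement rather than a size measurement.

```python
import json
import xmlrpc.client

# Hypothetical payload: 1000 small records, as an RPC result might carry.
rows = [{"id": i, "name": "item%d" % i, "price": i * 1.5} for i in range(1000)]

# The same data as an XML-RPC method response and as compact JSON.
xml_bytes = xmlrpc.client.dumps((rows,), methodresponse=True).encode("utf-8")
json_bytes = json.dumps(rows, separators=(",", ":")).encode("utf-8")

print("XML-RPC bytes:", len(xml_bytes))
print("JSON bytes:   ", len(json_bytes))
print("ratio:        ", round(len(xml_bytes) / len(json_bytes), 1))
```

This measures message size only. Parsing cost would need a separate timing run, for example with `timeit`.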

Diez
Jun 27 '08 #3
Laszlo Nagy <ga*****@shopzeus.com> wrote:
> I'm trying to write a multithreaded TCP server. I have used xmlrpc
> before for many purposes, but in this case this would not be efficient:
>
> - I have to send larger amounts of data, the overhead of converting to
> XML and parsing XML back would be too much pain
> - I have no clue how to do keep-alive with SimpleXMLRPCServer and it is
> slow to open a new connection for each RPC
> - I would like to do session management (authentication, then store
> session info on the server side) which is also hard with xmlrpc.
>
> I have looked at various solutions including:
>
> - PyOrbit - too heavy weight
> - Pyro - uses pickle, I do not trust it

It is possible to change the serialization used by Pyro

http://pyro.sourceforge.net/manual/9...ty.html#pickle

to the 'gnosis' XML Pickler.

--
Nick Craig-Wood <ni**@craig-wood.com-- http://www.craig-wood.com/nick
Jun 27 '08 #4
On May 16, 9:26 am, "Diez B. Roggisch" <de...@nospam.web.de> wrote:
>> Did you consider gzipping your XML (or YAML) packets? Would the
>> transfer time be acceptable in this case?
>
> That would add even more to the overhead of transcoding the
> transport layer. Switching from XML-RPC to a JSON-based protocol

Yes, that's why I suggested YAML.

> in a project of mine reduced the overhead 10-20 fold, mainly because of
> reduced size and parsing effort.

I don't think so. It is probably just the reduced size (check if the
JSON file is around 10 times smaller).

I believe the server will be mostly I/O-bound, i.e., most overhead will
be in the data link/physical layers. The compression/parsing time (a
few microseconds) should be a small fraction of the total transfer
time (a few milliseconds). Even if the service is not I/O-bound,
(considering the YouTube example) if there's significant traffic in
the server, the database access time should be the most significant.

I have used compression for SOAP messages over a GPRS (~20 kbps) link
and got similar performance improvements (the web server was set to
automatically compress the data).

> Diez
Jun 27 '08 #5
Henrique Dante de Almeida wrote:
> On May 16, 9:26 am, "Diez B. Roggisch" <de...@nospam.web.de> wrote:
>>> Did you consider gzipping your XML (or YAML) packets? Would the
>>> transfer time be acceptable in this case?
>>
>> That would add even more to the overhead of transcoding the
>> transport layer. Switching from XML-RPC to a JSON-based protocol
>
> Yes, that's why I suggested YAML.
>
>> in a project of mine reduced the overhead 10-20 fold, mainly because of
>> reduced size and parsing effort.
>
> I don't think so. It is probably just the reduced size (check if the
> JSON file is around 10 times smaller).
>
> I believe the server will be mostly I/O-bound, i.e., most overhead will
> be in the data link/physical layers. The compression/parsing time (a
> few microseconds) should be a small fraction of the total transfer
> time (a few milliseconds). Even if the service is not I/O-bound,
> (considering the YouTube example) if there's significant traffic in
> the server, the database access time should be the most significant.
>
> I have used compression for SOAP messages over a GPRS (~20 kbps) link
> and got similar performance improvements (the web server was set to
> automatically compress the data).

I'm sorry, yes - I forgot that the main problem was the pure message
size due to some quadratic behaviour, which made things CPU-bound.

Still, XML parsing is much more expensive, and packing/unpacking will of
course add to that.

Diez
Jun 27 '08 #6
> It is possible to change the serialization used by Pyro
>
> http://pyro.sourceforge.net/manual/9...ty.html#pickle
>
> to the 'gnosis' XML Pickler.

As I said earlier, I would not use XML. Just an example - I need to be
able to transfer image files, Word and Excel documents. How silly it
would be to base64-encode a binary file, then put it into XML.
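
The cost of that detour is easy to quantify: Base64 turns every 3 input bytes into 4 output characters, so the payload grows by about a third before any XML markup is added. A small sketch, with random bytes standing in for an image file:

```python
import base64
import os

raw = os.urandom(300_000)        # stand-in for an image or document
encoded = base64.b64encode(raw)

print("raw bytes:    ", len(raw))
print("base64 bytes: ", len(encoded))
print("growth factor:", len(encoded) / len(raw))
```

Compressing afterwards can win some of this back, but only at the price of extra CPU work on both ends.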

L

Jun 27 '08 #7
>> I'm trying to write a multithreaded TCP server. I have used xmlrpc
>
> How exactly did you come to the conclusion that your server must be
> multithreaded?

I don't think that it is important. But if you are interested:

- yes, the server will probably be I/O bound, not CPU bound
- I have experience with thread programming, but not with Twisted

>> - I have to send larger amounts of data, the overhead of converting to
>> XML and parsing XML back would be too much pain
>
> - What's the expected amount of data you have to transfer?

I cannot predict. But I will be transferring image files, which would be
silly to do with XML.

> - What's the expected network bandwidth?

It cannot be determined in advance.

> - What's the expected acceptable transfer time?

Not known.

> - How many users are expected to be transferring data at the same time?

The server should be scalable up to hundreds of users. (I'm just trying
to answer your questions, if that helps to answer mine.)

> Did you consider gzipping your XML (or YAML) packets? Would the
> transfer time be acceptable in this case?

No. "Image binary data -> base64encode -> XML -> gzip" looks very
silly. It cannot be efficient. Do you have better ideas?

>> BTW I do not care about the clients - they must trust the server side.
>
> Oh, he said he _doesn't care about the clients_! ;-)

I meant *safety* here: clients are going to download program updates
from the server. So if they do not trust the server then they should not
use it. The server is different: it must be safe against external
attacks. Maybe it was my bad English? Sorry for the misunderstanding.

> In general I would avoid that. Try to better estimate the speed
> requirements, to see if you really need to do this.

I cannot predict "acceptable speed" requirements, but I can tell that
there will be some clients downloading 100MB report files from the
server, so I presume that I will need a progress bar. I think that I
need to develop my own protocol for this, and probably the underlying
layer should use binary representation.
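
A progress bar needs only two things from the protocol: the total size up front and the data delivered in pieces. One possible shape is sketched below, assuming an 8-byte big-endian length header followed by the raw bytes. The names `send_stream` and `recv_stream`, the header layout and the chunk size are all made up for this illustration.

```python
import io
import struct

CHUNK = 64 * 1024  # assumed chunk size; tune for the link


def _read_exact(read, n):
    """Call read() until exactly n bytes have arrived."""
    buf = b""
    while len(buf) < n:
        part = read(n - len(buf))
        if not part:
            raise EOFError("connection closed early")
        buf += part
    return buf


def send_stream(src, total, write, progress=None):
    """Send total bytes from file-like src: 8-byte length header, then raw chunks."""
    write(struct.pack(">Q", total))
    sent = 0
    while sent < total:
        chunk = src.read(min(CHUNK, total - sent))
        if not chunk:
            raise EOFError("source ended early")
        write(chunk)
        sent += len(chunk)
        if progress:
            progress(sent, total)


def recv_stream(read, dst, progress=None):
    """Read the header, then copy exactly that many bytes into file-like dst."""
    (total,) = struct.unpack(">Q", _read_exact(read, 8))
    got = 0
    while got < total:
        chunk = read(min(CHUNK, total - got))
        if not chunk:
            raise EOFError("connection closed early")
        dst.write(chunk)
        got += len(chunk)
        if progress:
            progress(got, total)
    return total


if __name__ == "__main__":
    # In-memory demonstration; with a socket, pass sock.sendall and sock.recv instead.
    payload = bytes(range(256)) * 1000
    wire = io.BytesIO()
    send_stream(io.BytesIO(payload), len(payload), wire.write)
    wire.seek(0)
    out = io.BytesIO()
    recv_stream(wire.read, out, lambda done, total: print("%d/%d bytes" % (done, total)))
```

The file is never loaded into memory as a whole and never re-encoded, so the overhead per transfer is the 8 header bytes.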
>> Before I start re-inventing the wheel:
>>
>> - Is there another (already existing) higher level framework that I can
>> try? It should be safe and fast, that is the only restriction.
>
> There's "Twisted".
> http://twistedmatrix.com/projects/co...o/servers.html

Yes, I tried Twisted before and I did not like it. It forces me to do
things that I do not want to do. (I cannot tell what it was - it was two
years ago.)

>> - Do you think that it is a good idea to use xdrlib? I haven't seen
>> projects using it directly. For me it is like the rotor module was - it
>
> It's probably the best way to send binary stuff over the network.
> But, again, I would avoid doing so.

It is NOT the best way. Just to give one example: big endian / little
endian integers. I definitely need some encoding.

(But if you are right and this is the best way, why would you avoid it?)

L
Jun 27 '08 #8
On May 19, 10:28 am, Laszlo Nagy <gand...@shopzeus.com> wrote:
> I don't think that it is important. But if you are interested:
>
> - yes, the server will probably be I/O bound, not CPU bound
> - I have experience with thread programming, but not with Twisted

That part was just to show you that being multithreaded is not really
a requirement. The server could be single-threaded, for example.
Surely, if you are comfortable with writing a threaded server, there's
no problem with that.

> No. "Image binary data -> base64encode -> XML -> gzip" looks very
> silly. It cannot be efficient. Do you have better ideas?
>> BTW I do not care about the

Okay, that would be silly. The questions above assumed that you
would be sending typical unserialized objects that would be serialized
to XML, not pre-serialized binary data.

> use it. The server is different: it must be safe against external
> attacks. Maybe it was my bad English? Sorry for the misunderstanding.
>> In general I

That part was a joke. You didn't have to answer that. :-P

> I cannot predict "acceptable speed" requirements, but I can tell that
> there will be some clients downloading 100MB report files from the
> server, so I presume that I will need a progress bar. I think that I
> need to develop my own protocol for this, and probably the underlying
> layer should use binary representation.

Okay, so you need to wrap large binary files in some kind of message,
without preprocessing them. I think developing your own protocol
using XDR is a safe bet.

>> Before I start re-inventing the wheel:

If you are worried about reinventing the wheel, there are a
couple of solutions I can think of. None of them seem to fully support
your security and session management requirements, so you should
estimate the required project/development time for them.

- You may create a standard web application for that (with Django?).
Binary transfers simply use HTTP and have trivial overhead. You have
to implement security and session management (cookies?) on top of
that.

- Subclass BaseHTTPServer to implement a stateful and secure
protocol. Again, binary transfers have trivial overhead.

- If the goal of the project is to provide versioned file support,
you could use a dpkg/apt (or RPM?) based installation system (it uses
HTTP and FTP for file transfers). Write trivial front-ends in the
client and the server and choose a secure HTTP server.

- Finally, if you want to create your own protocol, but don't want to
use XDR, you could do a similar thing using MIME-based messages (the
first message part is an XML message that references the binary
attachments).

> It is NOT the best way. Just to tell one example: big endian / little
> endian integers. Definitely I need some encoding.
Huh ? XDR does exactly that.
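
To spell that out: XDR fixes the byte order on the wire to big-endian, so sender and receiver agree no matter what their CPUs use natively. The same convention can be demonstrated with the standard `struct` module, where `>` selects big-endian and `=` selects the machine's native order:

```python
import struct
import sys

value = 0x01020304

print("this machine is", sys.byteorder, "endian")
print("native order:  ", struct.pack("=I", value).hex())  # differs between machines
print("network order: ", struct.pack(">I", value).hex())  # same bytes everywhere

# Decoding with the same explicit format recovers the value on any machine.
(decoded,) = struct.unpack(">I", struct.pack(">I", value))
```

As long as both sides use an explicit byte order, endianness stops being a portability problem.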

> (But if you are right and this is the best way, why would you avoid?)

I would avoid packing/unpacking objects by hand.

Jun 27 '08 #9

Laszlo Nagy wrote:
>> It is possible to change the serialization used by Pyro
>>
>> http://pyro.sourceforge.net/manual/9...ty.html#pickle
>>
>> to the 'gnosis' XML Pickler.
>
> As I said earlier, I would not use XML. Just an example - I need to be
> able to transfer image files, Word and Excel documents. How silly it
> would be to base64-encode a binary file, then put it into XML.
>
> L

Fair enough.

In that case, here are 5 suggestions:

- use simple file copying from a mounted network drive
- use http (web server)
- use ftp
- use scp
- use rsync

Why wouldn't one of these work for you? Did I miss something in your original
requirements? All of the above high level protocols are very efficient in concurrently
transferring files from a server to multiple clients.

--irmen
Jun 27 '08 #10
Henrique Dante de Almeida <hd****@gmail.com> wrote:
> On May 19, 10:28 am, Laszlo Nagy <gand...@shopzeus.com> wrote:
>> I cannot predict "acceptable speed" requirements, but I can tell that
>> there will be some clients downloading 100MB report files from the
>> server, so I presume that I will need a progress bar. I think that I
>> need to develop my own protocol for this, and probably the underlying
>
> Okay, so you need to wrap large binary files in some kind of message,
> without preprocessing them. I think developing your own protocol
> using XDR is a safe bet.

You might want to consider using netstrings rather than XDR

http://cr.yp.to/proto/netstrings.txt

They are very simple and would be minimal overhead if all you are
passing is a file and a bit of metadata.

You'll find several modules for python with a bit of searching. Also
I believe twisted supports them directly or you could easily roll your
own.
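
For reference, rolling your own is only a few lines. A netstring is the payload length in ASCII decimal, a colon, the payload bytes, and a trailing comma. The function names and the `max_len` limit below are choices made for this sketch, not part of the netstring description:

```python
def encode_netstring(data):
    """Wrap bytes as <length>:<data>,"""
    return str(len(data)).encode("ascii") + b":" + data + b","


def decode_netstring(buf, pos=0, max_len=1 << 20):
    """Decode one netstring starting at pos; return (payload, next_position)."""
    colon = buf.index(b":", pos)      # raises ValueError if there is no colon
    digits = buf[pos:colon]
    if not digits.isdigit():
        raise ValueError("bad length field")
    n = int(digits)
    if n > max_len:
        raise ValueError("netstring too long")
    start = colon + 1
    end = start + n
    if buf[end:end + 1] != b",":
        raise ValueError("missing trailing comma or truncated data")
    return buf[start:end], end + 1


if __name__ == "__main__":
    wire = encode_netstring(b"report.pdf") + encode_netstring(b"\x00\x01binary\xff")
    name, pos = decode_netstring(wire)
    body, pos = decode_netstring(wire, pos)
    print(name, body)
```

The payload is passed through untouched, so binary files need no escaping. The length limit matters on a server, since a client could otherwise announce an arbitrarily large message.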

--
Nick Craig-Wood <ni**@craig-wood.com-- http://www.craig-wood.com/nick
Jun 27 '08 #11
> - use simple file copying from a mounted network drive

Untrusted clients should not mount anything from my server. (Also,
it is not a protocol. I need to communicate with a real program, not
just copy files.)

> - use http (web server)

I mentioned this before - I don't know how to do keep-alive with
SimpleHTTPServer. Similar solutions, e.g. Apache + mod_python, are too
heavyweight. Too many dependencies etc.

> - use ftp
> - use scp
> - use rsync
>
> Why wouldn't one of these work for you? Did I miss something in your
> original requirements?

Yes. I also need business logic on the server, not just file copying. It
happens that some of the messages will contain images.

Thank you for all your efforts. I think I'll go with TCPServer + xdrlib.
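
For readers who land here with the same question, a minimal sketch of that approach follows. It is not the code that was eventually written. It assumes Python 3's `socketserver` module (the Python 2 name was `SocketServer`), a 4-byte big-endian length prefix per message, and a toy handler that upper-cases each request. The names `send_msg`, `recv_msg`, `Handler` and `Server` are invented for the example.

```python
import socket
import socketserver
import struct
import threading


def recv_exact(sock, n):
    """Read exactly n bytes, or return None if the peer closed the connection."""
    buf = bytearray()
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:
            return None
        buf += part
    return bytes(buf)


def send_msg(sock, payload):
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_msg(sock, max_len=16 * 1024 * 1024):
    header = recv_exact(sock, 4)
    if header is None:
        return None
    (n,) = struct.unpack(">I", header)
    if n > max_len:  # never trust a client-supplied length
        raise ValueError("message too large")
    return recv_exact(sock, n)


class Handler(socketserver.BaseRequestHandler):
    def handle(self):
        # One handler instance per connection, so session state can live here
        # for as long as the client keeps the connection open.
        session = {"count": 0}
        while True:
            msg = recv_msg(self.request)
            if msg is None:
                break
            session["count"] += 1
            send_msg(self.request, b"%d:" % session["count"] + msg.upper())


class Server(socketserver.ThreadingMixIn, socketserver.TCPServer):
    allow_reuse_address = True
    daemon_threads = True


if __name__ == "__main__":
    with Server(("127.0.0.1", 0), Handler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        with socket.create_connection(server.server_address) as client:
            for text in (b"login", b"get report"):
                send_msg(client, text)
                print(recv_msg(client))
        server.shutdown()
```

The loop inside `handle` is what provides keep-alive: many requests travel over one connection, and per-session data such as authentication state can sit in the handler between them. The message body could be encoded with XDR, netstrings or any other data-only format.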
Laszlo

Jun 27 '08 #12
