Slow network reading?

I have a simple network protocol client (it's a part of this:
http://sqlcached.sourceforge.net) implemented in Python, PHP and C.
Everything's fine, except that the Python implementation is the slowest
- up to 30% slower than the PHP version (which implements exactly the
same logic, in a class).

In typical usage (also in the benchmark), an object is created and
.query is called repeatedly. Typical numbers for the benchmark are:

For Python version:

Timing 100000 INSERTs...
5964.4 qps
Timing 100000 SELECTs...
7491.0 qps

For PHP version:

Timing 100000 inserts...
7820.2 qps
Timing 100000 selects...
9926.2 qps
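
The benchmark driver itself is nothing special - essentially just a timing
loop around .query(), roughly like the sketch below (the table and SQL
statements here are placeholders, not the exact ones actually used):

import time

db = SQLCacheD()
n = 100000
print "Timing %d INSERTs..." % n
t0 = time.time()
for i in xrange(n):
    # placeholder statement; the real benchmark uses its own table and values
    db.query("INSERT INTO t (k, v) VALUES (%d, 'x')" % i)
print "%.1f qps" % (n / (time.time() - t0))
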
The main part of the client class is:

----

import os, socket, re

class SQLCacheD_Exception(Exception):
    pass

class SQLCacheD:

    DEFAULT_UNIX_SOCKET = '/tmp/sqlcached.sock'
    SC_VER_SIG = 'sqlcached-1'
    SOCK_UNIX = 'unix'
    SOCK_TCP = 'tcp'

    re_rec = re.compile(r"\+REC (\d+), (\d+)")
    re_ok = re.compile(r"\+OK (.+)")
    re_ver = re.compile(r"\+VER (.+)")

    def __init__(self, host = '/tmp/sqlcached.sock', type = 'unix'):
        if type != SQLCacheD.SOCK_UNIX:
            raise SQLCacheD_Exception("Only Unix domain sockets are supported")

        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(host)
        self.sf = self.sock.makefile('U', 4000)

        self.sf.write("VER %s\r\n" % SQLCacheD.SC_VER_SIG)
        self.sf.flush()
        if self.sf.readline().rstrip() != '+VER %s' % SQLCacheD.SC_VER_SIG:
            raise SQLCacheD_Exception("Handshake failure (invalid version signature?)")

    def query(self, sql):
        self.sf.write("SQL %s\r\n" % sql)
        self.sf.flush()
        resp = self.sf.readline().rstrip()
        m = SQLCacheD.re_rec.match(resp)
        if m != None: # only if some rows are returned (SELECT)
            n_rows = int(m.group(1))
            n_cols = int(m.group(2))
            cols = []
            for c in xrange(n_cols):
                cols.append(self.sf.readline().rstrip())
            rs = []
            for r in xrange(n_rows):
                row = {}
                for c in cols:
                    row[c] = self.sf.readline().rstrip()
                rs.append(row)
            return rs
        m = SQLCacheD.re_ok.match(resp)
        if m != None: # no rows returned (e.g. INSERT/UPDATE/DELETE)
            return True
        raise SQLCacheD_Exception(resp)

----

My question is: Am I missing something obvious? The C implementation is
(as expected) the fastest, at roughly 10000/15000 qps (INSERTs/SELECTs), but
somehow I expected the Python one to be closer to, or even faster than, PHP.

I tried using 'r' mode for .makefile() but it had no significant effect.
May 11 '06 #1
Ivan Voras wrote:
> def query(self, sql):
>     self.sf.write("SQL %s\r\n" % sql)
>     self.sf.flush()
>     resp = self.sf.readline().rstrip()
>     m = SQLCacheD.re_rec.match(resp)
>     if m != None: # only if some rows are returned (SELECT)
>         n_rows = int(m.group(1))
>         n_cols = int(m.group(2))
>         cols = []
>         for c in xrange(n_cols):
>             cols.append(self.sf.readline().rstrip())
>         rs = []
>         for r in xrange(n_rows):
>             row = {}
>             for c in cols:
>                 row[c] = self.sf.readline().rstrip()
>             rs.append(row)
>         return rs
>     m = SQLCacheD.re_ok.match(resp)
>     if m != None: # no rows returned (e.g. INSERT/UPDATE/DELETE)
>         return True
>     raise SQLCacheD_Exception(resp)


Comparative CPU & memory utilisation statistics, not to mention platform
and version of Python, would be useful hints...

Note that the file-like object returned by makefile() has significant
portions of heavy lifting code in Python rather than C which can be a
drag on ultimate performance... If on a Unix platform, it may be worth
experimenting with os.fdopen() on the socket's fileno() to see whether
the core Python file object (implemented in C) can be used in place of
the lookalike returned from the makefile method.

Even without that, you are specifying a buffer size smaller than the
default (8k - see Lib/socket.py). 16k might be even better.
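
For what it's worth, the fdopen() experiment amounts to no more than the
following untested sketch (Unix only; the 'r+' mode and the dup() of the
descriptor are my assumptions - dup() just keeps the file object's close()
from closing the socket - and 16384 is the larger buffer size mentioned
above):

import os, socket

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.connect('/tmp/sqlcached.sock')
# a real (C-implemented) file object wrapping the same descriptor
sf = os.fdopen(os.dup(sock.fileno()), 'r+', 16384)
sf.write("VER sqlcached-1\r\n")
sf.flush()
print sf.readline().rstrip()    # expect the +VER reply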

Although they're only micro-optimisations, I'd be interested in the
relative performance of the query method re-written as:

def query(self, sql):
    self.sf.write("SQL %s\r\n" % sql)
    self.sf.flush()
    sf_readline = self.sf.readline
    resp = sf_readline().rstrip()
    m = self.re_rec.match(resp)
    if m is not None:
        # some rows are returned (SELECT)
        rows = range(int(m.group(1)))
        cols = range(int(m.group(2)))
        for c in cols:
            cols[c] = sf_readline().rstrip()
        for r in rows:
            row = {}
            for c in cols:
                row[c] = sf_readline().rstrip()
            rows[r] = row
        return rows
    elif self.re_ok.match(resp) is not None:
        # no rows returned (e.g. INSERT/UPDATE/DELETE)
        return True
    raise SQLCacheD_Exception(resp)

This implementation is based on 2 strategies for better performance:
- minimise name lookups by hoisting references from outside the method
to local references;
- pre-allocate lists when the required sizes are known, to avoid the
costs associated with growing them.

Both strategies can pay fair dividends when the repetition counts are
large enough; whether this is the case for your tests I can't say.
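
If you want to gauge the name-lookup hoisting effect on your machine in
isolation, the standard timeit module makes that easy; in the untested
sketch below cStringIO merely stands in for the socket file object (its
readline is C code, so the numbers only indicate the lookup overhead, not
your real workload):

from timeit import Timer

setup = "import cStringIO; sf = cStringIO.StringIO('row\\r\\n' * 1000)"
plain   = Timer("sf.seek(0)\nfor i in xrange(1000): sf.readline()", setup)
hoisted = Timer("sf.seek(0)\nrl = sf.readline\nfor i in xrange(1000): rl()", setup)
print "attribute lookup per call:", min(plain.repeat(3, 200))
print "hoisted to a local name:  ", min(hoisted.repeat(3, 200))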

--
-------------------------------------------------------------------------
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: an*****@bullseye.apana.org.au (pref) | Snail: PO Box 370
an*****@pcug.org.au (alt) | Belconnen ACT 2616
Web: http://www.andymac.org/ | Australia
May 11 '06 #2
Andrew MacIntyre wrote:
> Comparative CPU & memory utilisation statistics, not to mention platform
> and version of Python, would be useful hints...

During benchmarking, all versions cause all CPU to be used, but the Python
version has ~1.5x more CPU time allocated to it than PHP. Python is 2.4.1.

> Note that the file-like object returned by makefile() has significant
> portions of heavy lifting code in Python rather than C which can be a
> drag on ultimate performance... If on a Unix platform, it may be worth
> experimenting with os.fdopen() on the socket's fileno() to see whether
> the core Python file object (implemented in C) can be used in place of
> the lookalike returned from the makefile method.

That's only because I need the .readline() function. In C, I'm using
fgets() (with the expectation that iostream will buffer data).

> Even without that, you are specifying a buffer size smaller than the
> default (8k - see Lib/socket.py). 16k might be even better.

The benchmark is such that all of the data is < 200 bytes. I estimate that
in production almost all protocol data will be < 4KB.

> Although they're only micro-optimisations, I'd be interested in the
> relative performance of the query method re-written as:

The change (for the better) is minor (3-5%).
May 12 '06 #3
Ivan Voras wrote:
> Andrew MacIntyre wrote:
>> Comparative CPU & memory utilisation statistics, not to mention platform
>> and version of Python, would be useful hints...
>
> During benchmarking, all versions cause all CPU to be used, but the Python
> version has ~1.5x more CPU time allocated to it than PHP. Python is 2.4.1.

A pretty fair indication of the Python interpreter doing a lot more work...

>> Note that the file-like object returned by makefile() has significant
>> portions of heavy lifting code in Python rather than C which can be a
>> drag on ultimate performance... If on a Unix platform, it may be worth
>> experimenting with os.fdopen() on the socket's fileno() to see whether
>> the core Python file object (implemented in C) can be used in place of
>> the lookalike returned from the makefile method.
>
> That's only because I need the .readline() function. In C, I'm using
> fgets() (with the expectation that iostream will buffer data).

The readline method of the file object lookalike returned by makefile
implements all of the line splitting logic in Python code, which is very
likely where the extra process CPU time is going. Note that this code is
in Python for portability reasons, as Windows socket handles cannot be
used as file handles the way socket handles on Unix systems can be.

If you are running on Windows, a fair bit of work will be required to
improve performance, as the line splitting logic needs to be moved to
native code - I wonder whether psyco could do anything with this?
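
Purely speculative, but if psyco is installed (x86 only) the experiment is
about as cheap as the sketch below - I haven't tried it, and whether psyco
can do anything useful with the socket module's pure-Python _fileobject
class I don't know:

try:
    import psyco, socket
    psyco.bind(socket._fileobject)   # the pure-Python makefile() lookalike
    psyco.bind(SQLCacheD.query)      # plus the client's hot method
except ImportError:
    pass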
>> Even without that, you are specifying a buffer size smaller than the
>> default (8k - see Lib/socket.py). 16k might be even better.
>
> The benchmark is such that all of the data is < 200 bytes. I estimate that
> in production almost all protocol data will be < 4KB.

A matter of taste perhaps, but that seems to me like another reason not
to bother with a non-default buffer size.

>> Although they're only micro-optimisations, I'd be interested in the
>> relative performance of the query method re-written as:
>
> The change (for the better) is minor (3-5%).

Given your comments above about how much data is actually involved, I'm
a bit surprised that the tweaked version actually produced a measurable
gain.

-------------------------------------------------------------------------
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: an*****@bullseye.apana.org.au (pref) | Snail: PO Box 370
an*****@pcug.org.au (alt) | Belconnen ACT 2616
Web: http://www.andymac.org/ | Australia
May 13 '06 #4
Andrew MacIntyre wrote:
>> That's only because I need the .readline() function. In C, I'm using
>> fgets() (with the expectation that iostream will buffer data).
>
> The readline method of the file object lookalike returned by makefile
> implements all of the line splitting logic in Python code, which is very
> likely where the extra process CPU time is going. Note that this code is

Heh, I didn't know that - you're probably right about this being a
possible bottleneck.
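
When I get a chance I'll confirm it by profiling a batch of queries -
something as simple as the sketch below should do (the SELECT here is made
up, and the plain profile module is enough for this):

import profile

db = SQLCacheD()
profile.run('for i in xrange(10000): db.query("SELECT 1")')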
> in Python for portability reasons, as Windows socket handles cannot be
> used as file handles the way socket handles on Unix systems can be.

I think they actually can in NT and above... but no, I'm doing it on Unix.

> Given your comments above about how much data is actually involved, I'm
> a bit surprised that the tweaked version actually produced a measurable
> gain.

I didn't do statistical analysis of the results so the difference
actually could be negligible IRL.

Anyway, thanks for the advice - I'll leave it as it is, as the Python
client is not used currently.

--
Things Mr Welch Cannot Do During An RPG:
274. I cannot commune with the Gods during peak hours.
May 13 '06 #5
