Slow network reading?

I have a simple network protocol client (it's a part of this:
http://sqlcached.sourceforge.net) implemented in Python, PHP and C.
Everything's fine, except that the Python implementation is the slowest
- up to 30% slower than the PHP version (which implements exactly the
same logic, in a class).

In typical usage (also in the benchmark), an object is created and
.query() is called repeatedly. Typical numbers for the benchmark are:

For Python version:

Timing 100000 INSERTs...
5964.4 qps
Timing 100000 SELECTs...
7491.0 qps

For PHP version:

Timing 100000 inserts...
7820.2 qps
Timing 100000 selects...
9926.2 qps

The main part of the client class is:

----

import os, socket, re

class SQLCacheD_Exception(Exception):
    pass

class SQLCacheD:

    DEFAULT_UNIX_SOCKET = '/tmp/sqlcached.sock'
    SC_VER_SIG = 'sqlcached-1'
    SOCK_UNIX = 'unix'
    SOCK_TCP = 'tcp'

    re_rec = re.compile(r"\+REC (\d+), (\d+)")
    re_ok = re.compile(r"\+OK (.+)")
    re_ver = re.compile(r"\+VER (.+)")

    def __init__(self, host = '/tmp/sqlcached.sock', type = 'unix'):
        if type != SQLCacheD.SOCK_UNIX:
            # only Unix domain sockets are handled by this client
            raise SQLCacheD_Exception("Unsupported socket type: %s" % type)

        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(host)
        self.sf = self.sock.makefile('U', 4000)

        # handshake: send our version signature and expect it echoed back
        self.sf.write("VER %s\r\n" % SQLCacheD.SC_VER_SIG)
        self.sf.flush()
        if self.sf.readline().rstrip() != '+VER %s' % SQLCacheD.SC_VER_SIG:
            raise SQLCacheD_Exception("Handshake failure (invalid version signature?)")

    def query(self, sql):
        self.sf.write("SQL %s\r\n" % sql)
        self.sf.flush()
        resp = self.sf.readline().rstrip()
        m = SQLCacheD.re_rec.match(resp)
        if m != None: # only if some rows are returned (SELECT)
            n_rows = int(m.group(1))
            n_cols = int(m.group(2))
            cols = []
            for c in xrange(n_cols):
                cols.append(self.sf.readline().rstrip())
            rs = []
            for r in xrange(n_rows):
                row = {}
                for c in cols:
                    row[c] = self.sf.readline().rstrip()
                rs.append(row)
            return rs
        m = SQLCacheD.re_ok.match(resp)
        if m != None: # no rows returned (e.g. INSERT/UPDATE/DELETE)
            return True
        raise SQLCacheD_Exception(resp)

----

My question is: Am I missing something obvious? The C implementation is
(as expected) the fastest with result of 10000:15000, but somehow I
expected the Python one to be closer to, or even faster than PHP.

I tried using 'r' mode for .makefile() but it had no significant effect.
May 11 '06 #1
Ivan Voras wrote:
> def query(self, sql):
>     self.sf.write("SQL %s\r\n" % sql)
>     self.sf.flush()
>     resp = self.sf.readline().rstrip()
>     m = SQLCacheD.re_rec.match(resp)
>     if m != None: # only if some rows are returned (SELECT)
>         n_rows = int(m.group(1))
>         n_cols = int(m.group(2))
>         cols = []
>         for c in xrange(n_cols):
>             cols.append(self.sf.readline().rstrip())
>         rs = []
>         for r in xrange(n_rows):
>             row = {}
>             for c in cols:
>                 row[c] = self.sf.readline().rstrip()
>             rs.append(row)
>         return rs
>     m = SQLCacheD.re_ok.match(resp)
>     if m != None: # no rows returned (e.g. INSERT/UPDATE/DELETE)
>         return True
>     raise SQLCacheD_Exception(resp)


Comparative CPU & memory utilisation statistics, not to mention platform
and version of Python, would be useful hints...

Note that the file-like object returned by makefile() has significant
portions of heavy lifting code in Python rather than C which can be a
drag on ultimate performance... If on a Unix platform, it may be worth
experimenting with os.fdopen() on the socket's fileno() to see whether
the core Python file object (implemented in C) can be used in place of
the lookalike returned from the makefile method.
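
The os.fdopen() idea could look roughly like the following. This is only a sketch under stated assumptions: it is written for Python 3 on a Unix platform, the helper name open_line_reader is made up for illustration, and socket.socketpair() stands in for a real connection to the sqlcached server. (In Python 3, makefile() already returns objects from the C-implemented io module, so the concern above applies mainly to Python 2.x.)

```python
import os
import socket

def open_line_reader(sock, bufsize=8192):
    """Hypothetical helper: wrap a connected socket's descriptor in an
    ordinary buffered file object (Unix only)."""
    # dup() the descriptor so closing the file object leaves the socket usable
    fd = os.dup(sock.fileno())
    return os.fdopen(fd, "rb", bufsize)

# A local socket pair stands in for the client/server connection
client, server = socket.socketpair()
server.sendall(b"+OK done\r\n")

reader = open_line_reader(client)
line = reader.readline().rstrip()   # line splitting happens in the file object

reader.close()
client.close()
server.close()
```

Duplicating the descriptor is a design choice here: without it, the file object and the socket object would both own the same descriptor and close it twice.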

Even without that, you are specifying a buffer size smaller than the
default (8k - see Lib/socket.py). 16k might be even better.

Although they're only micro-optimisations, I'd be interested in the
relative performance of the query method re-written as:

def query(self, sql):
    self.sf.write("SQL %s\r\n" % sql)
    self.sf.flush()
    sf_readline = self.sf.readline
    resp = sf_readline().rstrip()
    m = self.re_rec.match(resp)
    if m is not None:
        # some rows are returned (SELECT)
        # range() returns a list in Python 2, so these are pre-sized lists
        rows = range(int(m.group(1)))
        cols = range(int(m.group(2)))
        for c in cols:
            cols[c] = sf_readline().rstrip()
        for r in rows:
            row = {}
            for c in cols:
                row[c] = sf_readline().rstrip()
            rows[r] = row
        return rows
    elif self.re_ok.match(resp) is not None:
        # no rows returned (e.g. INSERT/UPDATE/DELETE)
        return True
    raise SQLCacheD_Exception(resp)

This implementation is based on 2 strategies for better performance:
- minimise name lookups by hoisting references from outside the method
to local references;
- pre-allocate lists when the required sizes are known, to avoid the
costs associated with growing them.

Both strategies can pay fair dividends when the repetition counts are
large enough; whether this is the case for your tests I can't say.
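
The two strategies can be shown in isolation with a small sketch. It is written for Python 3 (where range() is no longer a list, so the pre-allocation is spelled differently), the function name read_rows is hypothetical, and an in-memory io.BytesIO stands in for the socket file object.

```python
import io

def read_rows(stream, n_rows, cols):
    """Hypothetical helper: read n_rows records, one field per line."""
    readline = stream.readline      # hoisted: attribute looked up once, not per call
    rows = [None] * n_rows          # pre-allocated: the final size is known up front
    for r in range(n_rows):
        row = {}
        for c in cols:
            row[c] = readline().rstrip()
        rows[r] = row
    return rows

# Two rows with two columns each, as the server might send them line by line
stream = io.BytesIO(b"1\r\nalice\r\n2\r\nbob\r\n")
rows = read_rows(stream, 2, ["id", "name"])
```

Whether either change is measurable depends on how many rows and columns come back per query.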

--
-------------------------------------------------------------------------
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: an*****@bullseye.apana.org.au (pref) | Snail: PO Box 370
        an*****@pcug.org.au (alt)            |        Belconnen ACT 2616
Web:    http://www.andymac.org/              |        Australia
May 11 '06 #2
Andrew MacIntyre wrote:
> Comparative CPU & memory utilisation statistics, not to mention platform
> and version of Python, would be useful hints...

During benchmarking, all versions cause all CPU to be used, but the Python
version has ~1.5x more CPU time allocated to it than PHP. Python is 2.4.1.

> Note that the file-like object returned by makefile() has significant
> portions of heavy lifting code in Python rather than C which can be a
> drag on ultimate performance... If on a Unix platform, it may be worth
> experimenting with os.fdopen() on the socket's fileno() to see whether
> the core Python file object (implemented in C) can be used in place of
> the lookalike returned from the makefile method.

That's only because I need the .readline() function. In C, I'm using
fgets() (with the expectation that iostream will buffer data).

> Even without that, you are specifying a buffer size smaller than the
> default (8k - see Lib/socket.py). 16k might be even better.

The benchmark is such that all of the data is < 200 bytes. I estimate that
in production almost all protocol data will be < 4KB.

> Although they're only micro-optimisations, I'd be interested in the
> relative performance of the query method re-written as:

The change (for the better) is minor (3-5%).
May 12 '06 #3
Ivan Voras wrote:
> Andrew MacIntyre wrote:
>> Comparative CPU & memory utilisation statistics, not to mention platform
>> and version of Python, would be useful hints...
>
> During benchmarking, all versions cause all CPU to be used, but Python
> version has ~1.5x more CPU time allocated to it than PHP. Python is 2.4.1

A pretty fair indication of the Python interpreter doing a lot more work...

>> Note that the file-like object returned by makefile() has significant
>> portions of heavy lifting code in Python rather than C which can be a
>> drag on ultimate performance... If on a Unix platform, it may be worth
>> experimenting with os.fdopen() on the socket's fileno() to see whether
>> the core Python file object (implemented in C) can be used in place of
>> the lookalike returned from the makefile method.
>
> That's only because I need the .readline() function. In C, I'm using
> fgets() (with the expectation that iostream will buffer data).


The readline method of the file object lookalike returned by makefile
implements all of the line splitting logic in Python code, which is very
likely where the extra process CPU time is going. Note that this code is
in Python for portability reasons, as Windows socket handles cannot be
used as file handles the way socket handles on Unix systems can be.
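
To make that cost concrete, here is a simplified sketch of line splitting done in interpreted code on top of recv(). It is not the code from Lib/socket.py; the LineReader class is hypothetical, it is written for Python 3, and socket.socketpair() stands in for the server connection.

```python
import socket

class LineReader:
    """Hypothetical helper: buffers recv() data and splits lines in Python."""

    def __init__(self, sock, bufsize=8192):
        self.sock = sock
        self.bufsize = bufsize
        self.buf = b""

    def readline(self):
        # Keep receiving until a full line is buffered or the peer closes
        while b"\n" not in self.buf:
            chunk = self.sock.recv(self.bufsize)
            if not chunk:
                # connection closed: hand back whatever is left
                line, self.buf = self.buf, b""
                return line
            self.buf += chunk
        line, _, self.buf = self.buf.partition(b"\n")
        return line + b"\n"

client, server = socket.socketpair()
server.sendall(b"+REC 1, 2\r\nid\r\nname\r\n")

reader = LineReader(client)
first = reader.readline().rstrip()
second = reader.readline().rstrip()

client.close()
server.close()
```

Every readline() call here runs a search, a split and a buffer reassignment in bytecode, which is the kind of per-line overhead a C-level file object avoids.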

If you are running on Windows, a fair bit of work will be required to
improve performance as the line splitting logic needs to be moved to
native code - I wonder whether psyco could do anything with this?

>> Even without that, you are specifying a buffer size smaller than the
>> default (8k - see Lib/socket.py). 16k might be even better.
>
> The benchmark is such that all of the data is < 200 bytes. I estimate that
> in production almost all protocol data will be < 4KB.

A matter of taste perhaps, but that seems to me like another reason not
to bother with a non-default buffer size.

>> Although they're only micro-optimisations, I'd be interested in the
>> relative performance of the query method re-written as:
>
> The change (for the better) is minor (3-5%).

Given your comments above about how much data is actually involved, I'm
a bit surprised that the tweaked version actually produced a measurable
gain.

May 13 '06 #4
Andrew MacIntyre wrote:
>> That's only because I need the .readline() function. In C, I'm using
>> fgets() (with the expectation that iostream will buffer data).
>
> The readline method of the file object lookalike returned by makefile
> implements all of the line splitting logic in Python code, which is very
> likely where the extra process CPU time is going. Note that this code is

Heh, I didn't know that - you're probably right about this being a
possible bottleneck.

> in Python for portability reasons, as Windows socket handles cannot be
> used as file handles the way socket handles on Unix systems can be.

I think they actually can in NT and above... but no, I'm doing it on Unix.

> Given your comments above about how much data is actually involved, I'm
> a bit surprised that the tweaked version actually produced a measurable
> gain.


I didn't do statistical analysis of the results so the difference
could actually be negligible IRL.
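
One way to check whether a 3-5% difference is real is to repeat the timing several times and look at the spread. The following is a rough sketch for Python 3; the bench helper and the fake_query stand-in are hypothetical and would be replaced by calls to the real client.

```python
import statistics
import timeit

def bench(func, number=1000, repeat=5):
    """Return one throughput sample (calls per second) per repeated run."""
    timings = timeit.repeat(func, number=number, repeat=repeat)
    return [number / t for t in timings]

def fake_query():
    # Hypothetical stand-in for client.query(...); swap in the real call
    return "+OK 1".rstrip()

samples = bench(fake_query)
mean_qps = statistics.mean(samples)
stdev_qps = statistics.stdev(samples)
print("%.1f qps +/- %.1f over %d runs" % (mean_qps, stdev_qps, len(samples)))
```

If the standard deviation across runs is as large as the gap between two versions, the gap is probably noise.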

Anyway, thanks for the advice - I'll leave it as it is, as the Python
client is not used currently.

--
Things Mr Welch Cannot Do During An RPG:
274. I cannot commune with the Gods during peak hours.
May 13 '06 #5
