Pyrex speed

Has anyone found a good link on exactly how to speed up code using
pyrex? I found various info but the focus is usually not on code
speedup.

May 27 '06 #1
Jim Lewis wrote:
> Has anyone found a good link on exactly how to speed up code using
> pyrex? I found various info but the focus is usually not on code
> speedup.


The speedup comes from Pyrex itself compiling to C, and from using it as
a thin layer over C functions that already exist or that you write for
that purpose.

Diez
May 27 '06 #2
I'm not planning to write C functions. My understanding is that by
using cdefs in the python code one can gain substantial speed. I'm
trying to find a description of how to modify python code in more
detail so it runs fast under pyrex.

May 27 '06 #3
You can gain substantial speed-ups in certain specific cases, but the
main point of Pyrex is ease of wrapping, not of speeding-up.

Depending on what you're doing, rewriting in Pyrex or even in C, using
the Python/C API directly, might not gain you much.

May 27 '06 #4
> main point of Pyrex is ease of wrapping, not of speeding-up.

Supposedly the primes example is 50 times faster.

May 27 '06 #5
Jim Lewis wrote:
>> main point of Pyrex is ease of wrapping, not of speeding-up.
>
> Supposedly the primes example is 50 times faster.


How often do you perform prime calculations in your programs? In more
than 10 years of writing business software professionally, I have never
had occasion to do any math more sophisticated than simple addition,
multiplication, subtraction and division.

--
Jarek Zgoda
http://jpa.berlios.de/
May 27 '06 #6
> I never had an opportunity to do any more sophisticated math than simple adding,
> multiplying, subtracting and dividing.

Neither is the primes example doing anything more sophisticated than
basic arithmetic but it's 50 times faster.

May 27 '06 #7
Jim Lewis wrote:
> I'm not planning to write C functions. My understanding is that by
> using cdefs in the python code one can gain substantial speed. I'm
> trying to find a description of how to modify python code in more
> detail so it runs fast under pyrex.


I've used pyrex to speed up my code. It worked. While it isn't
intended as a tutorial on pyrex you can have a look at it here:

http://www.microtonal.co.uk/temper.html

The trick is to write C functions using pyrex. That's not much easier
than writing C functions in C. But I still found it convenient enough
to be worth doing that way. Some tips:

- declare functions with cdef

- declare the type of every variable you use

- don't use Python builtins, or other libraries

The point of these rules is that generated C code using Python
variables will still be slow. You want Pyrex to write C code using C
variables only. To check this is happening you can look at the
automatically generated source code to make sure there are no reference
counting functions where there shouldn't be.
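
One rough way to do that check without reading the whole generated file by eye is to list the lines that touch reference counts. This is only a sketch: find_refcount_lines and SAMPLE_C are made-up names for illustration, not part of Pyrex, and the only thing assumed about the generated code is that it uses the standard CPython macros Py_INCREF, Py_DECREF, Py_XINCREF and Py_XDECREF.

```python
import re

# Standard CPython reference-counting macros; seeing these inside a hot
# loop suggests Python objects are still being handled there.
REFCOUNT_RE = re.compile(r"\bPy_X?(?:INCREF|DECREF)\b")


def find_refcount_lines(c_source):
    """Return (line_number, stripped_line) pairs that touch refcounts."""
    hits = []
    for number, line in enumerate(c_source.splitlines(), start=1):
        if REFCOUNT_RE.search(line):
            hits.append((number, line.strip()))
    return hits


# Made-up fragment standing in for generated C; real output looks different.
SAMPLE_C = """\
static int fast_add(int a, int b) {
  return a + b;
}
static PyObject *slow_add(PyObject *a, PyObject *b) {
  PyObject *r = PyNumber_Add(a, b);
  Py_INCREF(r);
  Py_DECREF(a);
  return r;
}
"""

if __name__ == "__main__":
    for number, line in find_refcount_lines(SAMPLE_C):
        print(number, line)
```

In practice you would read the text of the generated .c file and look at whether any of the reported lines fall inside the function you are trying to speed up.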

The usual rule for C optimization applies -- rewrite the code that
you're spending most time in. But if that innermost function's being
called from a loop it can be worth changing the loop as well so that
you pass in and out C variables.
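
To find the code you're spending most time in, the standard library profiler is usually enough. A minimal sketch, where inner and outer are placeholder functions standing in for your own code and profile_report is a hypothetical helper name:

```python
import cProfile
import io
import pstats


def inner(n):
    # Deliberately slow arithmetic loop standing in for the real hot spot.
    total = 0
    for i in range(n):
        total += i * i
    return total


def outer(repeats, n):
    # Calls the inner function from a loop, as described above.
    return sum(inner(n) for _ in range(repeats))


def profile_report(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


if __name__ == "__main__":
    _, report = profile_report(outer, 50, 10000)
    print(report)
```

The function that dominates the report is the first candidate for a cdef rewrite; if it is called from a Python loop, that loop is the next one to look at.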

HTH,

Graham

May 27 '06 #8
Hi Jim,

It depends a lot on what you're doing. You will get a speedup from Pyrex
or from wrapping C code if you understand how it works internally. If
you try to speed up your application by coding *only* in Pyrex (I mean
implementing in Pyrex rather than using it to wrap C), that greatly
limits what you can expect to get faster. For some kinds of things you
can even get better performance from straight Python than from Pyrex
converted to C and compiled. I think you need to know how Python works
on the C side to understand this fully.

I attach some examples of code where C is a lot faster, or only a
little faster. (I compare against C counterparts, not Pyrex ones; Pyrex
is only used for wrapping in these examples.) They should give you an
idea of why it depends so much on what you're doing. If you plan to use
only cdefs to speed up Python code, you're very limited in what can be
sped up. Try some experiments and examine the C code that Pyrex
generates, and you will see why: Pyrex makes Python C API calls to
convert Python objects to C values every time you use such a variable,
and that's not a great gain. Some operations can even get worse, because
Python does a better job than the Pyrex-generated C for certain
operations or value conversions (e.g. I remember reading in some paper
that for operations on some kinds of iterable objects Pyrex does not
translate to the faster C approach).

Some days ago I posted some timing results for a function coded in
Python, and coded in C and wrapped by Pyrex. The C approach was more
than 80 times faster. And I attach below another one, where C isn't much
of a gain (1 time faster).

Example A:
This code is more than 80 times faster than an "easy" Python
implementation. For every call, it does some bitwise operations and an
array lookup for every character of the string argument. It's a lot
faster because the Python approach does a list lookup, and a C array
lookup is much faster. Note that these C loops need no Python type
conversions; if they did, the C approach would not be so much faster
than Python. I don't know how array-based Python code would perform,
but I expect it to be a lot faster than using a list, so Python code
can be sped up a lot if you know how to do it.

// C code:
int CRC16Table[256]; // Filled elsewhere
int CalcCRC16(char *str)
{
    int crc;

    for(crc = 0xFFFF; *str != 0; str++) {
        crc = CRC16Table [(( crc >> 8 ) & 255 )] ^ ( crc << 8 ) ^ *str;
    }

    return crc;
}

# Python code
gCRC16Table = [] # Filled elsewhere
def CalcCRC16(astr):
    crc = 0xFFFFL
    for c in astr:
        crc = gCRC16Table[((crc >> 8) & 255)] ^ ((crc & 0xFFFFFF) << 8) ^ ord(c)
    return crc

-------------------------------------------------------------------------
Example B:
If we compare the functions below, the Python approach is only a bit
slower than the C implementation. I know neither is the fastest approach
in its language, but that's a different issue. C here is only about 1
time faster:

// C code. gTS type is struct { int m; int s; }
gTS gTS_diff(gTS t0, gTS t1) {
    gTS retval;

    retval.s = (t1.s-t0.s);
    if ((t0.m>t1.m)) {
        retval.m = (t1.m-t0.m);

        while((retval.m<0)) {
            retval.s = (retval.s-1);
            retval.m = (retval.m+1000);
        }
    } else {
        retval.m = (t1.m-t0.m);
    }

    while((retval.m>999)) {
        retval.m = (retval.m-1000);
        retval.s = (retval.s+1);
    }
    return retval;
}

# Python code (t0 and t1 are tuples)
def gts_diff(t0,t1):
    s = t1[0] - t0[0]
    if (t0[1] > t1[1]):
        m = t1[1] - t0[1]

        while m < 0:
            s = s - 1
            m = m + 1000
    else:
        m = t1[1] - t0[1]

    while m > 999:
        m = m - 1000
        s = s + 1
    return s, m
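
On the remark that neither version is the fastest in its language: in Python, one possible alternative (a different technique from the loops above, not something timed in this thread) is to let divmod do the normalisation. The sketch below assumes the tuples are (s, m) with m counted in units of 1/1000 of s, which is what the code implies but is not stated; the function names are made up for the comparison.

```python
def gts_diff_loops(t0, t1):
    """Loop-based normalisation, same logic as the Python code above."""
    s = t1[0] - t0[0]
    m = t1[1] - t0[1]
    while m < 0:
        s -= 1
        m += 1000
    while m > 999:
        m -= 1000
        s += 1
    return s, m


def gts_diff_divmod(t0, t1):
    """Same result with divmod; floor division handles negative m."""
    carry, m = divmod(t1[1] - t0[1], 1000)
    return t1[0] - t0[0] + carry, m


if __name__ == "__main__":
    print(gts_diff_loops((5, 900), (7, 100)))
    print(gts_diff_divmod((5, 900), (7, 100)))
```

Whether this is actually faster would have to be timed; it mainly removes the Python-level loops.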

I encourage you to google for some Pyrex papers on the net; they explain
the dos and don'ts of Pyrex. Sorry, but I don't have the URLs.

Regards,
Gonzalo

Jim Lewis wrote:
>> I never had an opportunity to do any more sophisticated math than simple adding,
>> multiplying, subtracting and dividing.
>
> Neither is the primes example doing anything more sophisticated than
> basic arithmetic but it's 50 times faster.


May 27 '06 #9
On 28/05/2006 12:10 AM, Gonzalo Monzón wrote:

[good advice snipped]

> Example A:
> This code is more than 80 times faster than an "easy" Python
> implementation. For every call, it does some bitwise operations and an
> array lookup for every character of the string argument. It's a lot
> faster because the Python approach does a list lookup, and a C array
> lookup is much faster. Note that these C loops need no Python type
> conversions; if they did, the C approach would not be so much faster
> than Python. I don't know how array-based Python code would perform,
> but I expect it to be a lot faster than using a list, so Python code
> can be sped up a lot if you know how to do it.
>
> // C code:
> int CRC16Table[256]; // Filled elsewhere
> int CalcCRC16(char *str)
> {
>     int crc;
>     for(crc = 0xFFFF; *str != 0; str++) {
>         crc = CRC16Table [(( crc >> 8 ) & 255 )] ^ ( crc << 8 ) ^ *str;

Gonzalo, just in case there are any C compilers out there which need to
be told:

    for(crc = 0xFFFF; *str != 0;) {
        crc = CRC16Table [(( crc >> 8 ) & 255 )] ^ ( crc << 8 ) ^ *str++;

>     }
>     return crc;
> }
>
> # Python code
> gCRC16Table = [] # Filled elsewhere
> def CalcCRC16(astr):
>     crc = 0xFFFFL

Having that L on the end (plus the fact that you are pointlessly
maintaining "crc" as an *unsigned* 32-bit quantity) will be slowing the
calculation down -- Python will be doing it in long integers. You are
calculating a *sixteen bit* CRC! The whole algorithm can be written
simply so as to not need more than 16-bit registers, and not to pollute
high-order bits in 17-or-more-bit registers.

>     for c in astr:
>         crc = gCRC16Table[((crc >> 8) & 255)] ^ ((crc & 0xFFFFFF) << 8) ^ ord(c)


Note that *both* the C and Python routines still produce a 32-bit result
with 16 bits of high-order rubbish -- I got the impression from the
previous thread that you were going to fix that.

This Python routine never strays outside 16 bits, so avoiding your "&
255" and a final "& 0xFFFF" (which you don't have).

def CalcCRC16(astr):
    crc = 0xFFFF
    for c in astr:
        crc = gCRC16Table[crc >> 8] ^ ((crc & 0xFF) << 8) ^ ord(c)
    return crc
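
For anyone who wants to check that claim, here is a small sketch that runs both update steps side by side. It is written for Python 3 and takes bytes, so no ord() is needed. The table contents are an assumption: the real table is "filled elsewhere", so make_table builds a stand-in from the polynomial 0x1021, and the function names are made up. All the comparison relies on is that every table entry fits in 16 bits.

```python
def make_table(poly=0x1021):
    """Build a 256-entry table of 16-bit values (polynomial is assumed)."""
    table = []
    for i in range(256):
        crc = i << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
        table.append(crc)
    return table


TABLE = make_table()


def crc_wide(data):
    """Original update step; the state can grow past 16 bits."""
    crc = 0xFFFF
    for byte in data:
        crc = TABLE[(crc >> 8) & 255] ^ ((crc & 0xFFFFFF) << 8) ^ byte
    return crc


def crc_16bit(data):
    """Update step from the routine above; the state stays within 16 bits."""
    crc = 0xFFFF
    for byte in data:
        crc = TABLE[crc >> 8] ^ ((crc & 0xFF) << 8) ^ byte
    return crc


if __name__ == "__main__":
    sample = b"123456789"
    print(hex(crc_wide(sample)), hex(crc_16bit(sample)))
```

The low 16 bits of crc_wide should always equal crc_16bit, because the table index and the shifted term only ever depend on the low 16 bits of the state.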

==============
To the OP:

I'd just like to point out that C code and Pyrex code can gain
significantly (as the above example does) by not having to use ord() and
chr().

As Gonzalo says, read the generated C code. Look for other cases of
using Python built-ins that could be much faster with a minor bit of
effort in Pyrex e.g. "max(a, b)" -> "(a) > (b) ? (a) : (b) " or if you
don't like that, a cdef function to get the max of 2 ints will be *way*
faster than calling Python's max()
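
The cost of calling a built-in can be seen even in plain Python, before anything is moved to Pyrex. The sketch below only illustrates the substitution, using a conditional expression in place of max(); it is not a Pyrex benchmark, the names are made up, and the timings it prints will vary by machine and Python version.

```python
import timeit


def max_builtin(pairs):
    # Calls the built-in max() for every pair.
    return [max(a, b) for a, b in pairs]


def max_inline(pairs):
    # Same values from a conditional expression, with no function call.
    return [a if a > b else b for a, b in pairs]


PAIRS = [(i, 1000 - i) for i in range(1000)]

if __name__ == "__main__":
    for func in (max_builtin, max_inline):
        seconds = timeit.timeit(lambda: func(PAIRS), number=200)
        print(func.__name__, round(seconds, 4))
```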
May 28 '06 #10
Hi John,

John Machin wrote:
> On 28/05/2006 12:10 AM, Gonzalo Monzón wrote:
>
> [good advice snipped]
>
>> Example A:
>> This code is more than 80 times faster than an "easy" Python
>> implementation. For every call, it does some bitwise operations and an
>> array lookup for every character of the string argument. It's a lot
>> faster because the Python approach does a list lookup, and a C array
>> lookup is much faster. Note that these C loops need no Python type
>> conversions; if they did, the C approach would not be so much faster
>> than Python. I don't know how array-based Python code would perform,
>> but I expect it to be a lot faster than using a list, so Python code
>> can be sped up a lot if you know how to do it.
>>
>> // C code:
>> int CRC16Table[256]; // Filled elsewhere
>> int CalcCRC16(char *str)
>> {
>>     int crc;
>>     for(crc = 0xFFFF; *str != 0; str++) {
>>         crc = CRC16Table [(( crc >> 8 ) & 255 )] ^ ( crc << 8 ) ^ *str;
>
> Gonzalo, just in case there are any C compilers out there which need to
> be told:
>
>     for(crc = 0xFFFF; *str != 0;) {
>         crc = CRC16Table [(( crc >> 8 ) & 255 )] ^ ( crc << 8 ) ^ *str++;

Thank you for the advice! I didn't know that some compilers won't let
you advance the pointer in the for statement...


>>     }
>>     return crc;
>> }
>>
>> # Python code
>> gCRC16Table = [] # Filled elsewhere
>> def CalcCRC16(astr):
>>     crc = 0xFFFFL
>
> Having that L on the end (plus the fact that you are pointlessly
> maintaining "crc" as an *unsigned* 32-bit quantity) will be slowing the
> calculation down -- Python will be doing it in long integers. You are
> calculating a *sixteen bit* CRC! The whole algorithm can be written
> simply so as to not need more than 16-bit registers, and not to pollute
> high-order bits in 17-or-more-bit registers.

Yes, I know, but I just wanted to post a quick example for Jim and
grabbed the first file from several versions... :-) The point was for
Jim to understand how some code can be sped up a lot and other code
cannot, and that this is not a trivial question.

>>     for c in astr:
>>         crc = gCRC16Table[((crc >> 8) & 255)] ^ ((crc & 0xFFFFFF) << 8) ^ ord(c)
>
> Note that *both* the C and Python routines still produce a 32-bit result
> with 16 bits of high-order rubbish -- I got the impression from the
> previous thread that you were going to fix that.

Yes, of course! I plan to spend some time on this issue. Last week I
didn't have much time to work on it, but I thought it was worth the pain
to set up a compiling environment (ms.evc++, obviously), and I
successfully compiled Python and some of my own custom Pyrex extensions
for the PocketPC. It was easy: I only had to add the C files to the
makefile, as the Pyrex glue code compiles well on ARM. So I have to do
some timings and decide which version to use for the code that is
unlikely to change for a long time. I still have to test the latest
improved Python array-based approach and do some timings on the PDA.

> This Python routine never strays outside 16 bits, so avoiding your "&
> 255" and a final "& 0xFFFF" (which you don't have).
>
> def CalcCRC16(astr):
>     crc = 0xFFFF
>     for c in astr:
>         crc = gCRC16Table[crc >> 8] ^ ((crc & 0xFF) << 8) ^ ord(c)
>     return crc

Thank you again for your thoughts John! :-)

Regards,
Gonzalo

> ==============
> To the OP:
>
> I'd just like to point out that C code and Pyrex code can gain
> significantly (as the above example does) by not having to use ord() and
> chr().
>
> As Gonzalo says, read the generated C code. Look for other cases of
> using Python built-ins that could be much faster with a minor bit of
> effort in Pyrex e.g. "max(a, b)" -> "(a) > (b) ? (a) : (b) " or if you
> don't like that, a cdef function to get the max of 2 ints will be *way*
> faster than calling Python's max()


May 28 '06 #11
What you do is not representative of 100% of the programming done in the
world. Not even 90%, and probably not even 50%, of programming work is
similar to what you do.
The fact that you never use sophisticated math doesn't mean this guy
doesn't either.
Personally, I've used pyrex a lot. And it was never for wrapping -
always for speeding up.

May 28 '06 #12
