473,387 Members | 1,510 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,387 software developers and data experts.

make faster Richards benchmark

I'd appreciate any suggestions on how to make faster Python
implementations of Richards benchmark. Perhaps there are obvious
problems that can be corrected?

http://www.lissett.com/ben/bench1.htm
Jul 18 '05 #1
15 1841
Duncan Lissett wrote:
I'd appreciate any suggestions on how to make faster Python
implementations of Richards benchmark. Perhaps there are obvious
problems that can be corrected?


The most obvious problem: how does the Richards benchmark work?
Care to post the source code?

Mit freundlichen Gruessen,

Peter Maas

--
-------------------------------------------------------------------
Peter Maas, M+R Infosysteme, D-52070 Aachen, Hubert-Wienen-Str. 24
Tel +49-241-93878-0 Fax +49-241-93878-20 eMail pe********@mplusr.de
-------------------------------------------------------------------
Jul 18 '05 #2
Hi !

On the site, click on the word "Python #92"


Jul 18 '05 #3
Duncan Lissett wrote:
I'd appreciate any suggestions on how to make faster Python
implementations of Richards benchmark. Perhaps there are obvious
problems that can be corrected?

http://www.lissett.com/ben/bench1.htm

What's about including a second python implementation of the Richards
benchmark using psyco? You don't have to modify your code, you only have
to add two lines. It would be also interesting to see the differences
between both source codes.

Regards,
Josef
Jul 18 '05 #4
dl*******@yahoo.com (Duncan Lissett) writes:
I'd appreciate any suggestions on how to make faster Python
implementations of Richards benchmark. Perhaps there are obvious
problems that can be corrected?

http://www.lissett.com/ben/bench1.htm


import psyco
psyco.full()

2 times faster :-)

--
Wilk - http://flibuste.net
Jul 18 '05 #5
dl*******@yahoo.com (Duncan Lissett) wrote in
news:67**************************@posting.google.c om:
I'd appreciate any suggestions on how to make faster Python
implementations of Richards benchmark. Perhaps there are obvious
problems that can be corrected?
That should say "the Richards benchmark", named for Martin Richards.

http://www.lissett.com/ben/bench1.htm


Well, if I was going to do a Python implementation of this I would want to
look at what the benchmark is trying to achieve, and then program it fresh
from the ground up instead of trying to ape some other language. In
particular I expect that the classes with 'run' methods would probably
become generators with most or all of their state held as local variables.

For example, imagining that all the 'run' methods have first been renamed
'next', IdleTask becomes (untested):

def IdleTask(scheduler, v1, v2):
for i in range(v2):
if v1 & 1:
v1 = (v1 >> 1) ^ 0xD008
yield scheduler.release(DEVICEB)
else:
v1 = v1 >> 1
yield scheduler.release(DEVICEA)
yield scheduler.holdCurrent()

Next most obvious thing is to junk the linked lists and replace them with
ordinary Python builtin lists. Pass the relevant list around instead of the
id constants 'DEVICEA', 'DEVICEB' and you can get rid of all that lookup
overhead. This involves quite a major restructuring of the scheduler
though: hence my comment about understanding what the code is trying to
achieve and starting from the ground up.

Packet.addTo should look more like:

def addTo(self, queue):
queue.append(self)

and then vapourise entirely.
Minor points:

Remove the pointless use of the 'global' keyword in various places. Replace
the traceOn variable with __debug__ so you get the same benefits as
compiled languages by optimising out the test for the trace statements.

Remove the pointless set/get methods and just access the members directly.
If you are writing a benchmark they will cripple performance.

Avoiding multiple accesses to the same instance variable, or assigning to
instance variables until you are about to return from a method: use a local
during the execution of the method.
Jul 18 '05 #6
Josef Meile wrote:
Duncan Lissett wrote:
I'd appreciate any suggestions on how to make faster Python
implementations of Richards benchmark. Perhaps there are obvious
problems that can be corrected?

http://www.lissett.com/ben/bench1.htm


What's about including a second python implementation of the Richards
benchmark using psyco? You don't have to modify your code, you only have
to add two lines. It would be also interesting to see the differences
between both source codes.


I get an immediate 38% speedup by doing "import psyco; psyco.full()"
at the start of the benchmark.

-Peter
Jul 18 '05 #7
Michel Claveau/Hamster wrote:
On the site, click on the word "Python #92"


Thanks. Oh, how silly of me. I didn't recognize these as links,
still tuned to underscore links.

Mit freundlichen Gruessen,

Peter Maas

--
-------------------------------------------------------------------
Peter Maas, M+R Infosysteme, D-52070 Aachen, Hubert-Wienen-Str. 24
Tel +49-241-93878-0 Fax +49-241-93878-20 eMail pe********@mplusr.de
-------------------------------------------------------------------
Jul 18 '05 #8
Duncan Booth <me@privacy.net> wrote in message news:<Xn***************************@127.0.0.1>...

-snip-
Well, if I was going to do a Python implementation of this I would want to
look at what the benchmark is trying to achieve, and then program it fresh
from the ground up instead of trying to ape some other language. In
particular I expect that the classes with 'run' methods would probably
become generators with most or all of their state held as local variables.

For example, imagining that all the 'run' methods have first been renamed
'next', IdleTask becomes (untested):

def IdleTask(scheduler, v1, v2):
for i in range(v2):
if v1 & 1:
v1 = (v1 >> 1) ^ 0xD008
yield scheduler.release(DEVICEB)
else:
v1 = v1 >> 1
yield scheduler.release(DEVICEA)
yield scheduler.holdCurrent()

Next most obvious thing is to junk the linked lists and replace them with
ordinary Python builtin lists. Pass the relevant list around instead of the
id constants 'DEVICEA', 'DEVICEB' and you can get rid of all that lookup
overhead. This involves quite a major restructuring of the scheduler
though: hence my comment about understanding what the code is trying to
achieve and starting from the ground up.

Packet.addTo should look more like:

def addTo(self, queue):
queue.append(self)

and then vapourise entirely.
Minor points:

Remove the pointless use of the 'global' keyword in various places. Replace
the traceOn variable with __debug__ so you get the same benefits as
compiled languages by optimising out the test for the trace statements.
Thanks, these are all interesting suggestions.

Remove the pointless set/get methods and just access the members directly.
If you are writing a benchmark they will cripple performance.

Avoiding multiple accesses to the same instance variable, or assigning to
instance variables until you are about to return from a method: use a local
during the execution of the method.


The pointless set/get methods are pointless in the other language
implementations as well - that's the point. (And I'll cripple that
Oberon-2 implementation real soon by enforcing privacy with modules
and set/get procedures.)

OTOH there's definitely a place for a truly Pythonic implementation
here:
http://www.lissett.com/ben/bench3.htm

best wishes, Duncan
Jul 18 '05 #9
>>> Mit freundlichen Gruessen,

Désolé, je ne comprend pas l'allemand. Et pourtant, en octobre, je vais
aller là :
http://www.babstsoft.com/Paradox/Con...n04/Info_1.htm

Je présenterai PONX (Python for Paradox) (cf http://ponx.org)
@-salutations
--
Michel Claveau
mél : http://cerbermail.com/?6J1TthIa8B
site : http://mclaveau.com


Jul 18 '05 #10
Wilk <wi******@OUTflibuste.net> wrote in message news:<87************@blakie.riol>...
dl*******@yahoo.com (Duncan Lissett) writes:
I'd appreciate any suggestions on how to make faster Python
implementations of Richards benchmark. Perhaps there are obvious
problems that can be corrected?

http://www.lissett.com/ben/bench1.htm


import psyco
psyco.full()

2 times faster :-)


And Simon: "I just tried the benchmark with Psyco, and cut the run
time for input=10000 from 8.1 seconds to 3.4"

Joseph, Simon, Wilk: thanks for the Psyco suggestion, and special
thanks for going ahead and trying it - we get slightly better than x2.

Given how little I know about Python, my assumption is that there's
scope for x10 improvement by writing better code... a more Pythonic
version for http://www.lissett.com/ben/bench3.htm

Suggestions appreciated.

best wishes, Duncan
Jul 18 '05 #11
Duncan Lissett wrote:
Wilk <wi******@OUTflibuste.net> wrote in message news:<87************@blakie.riol>...
dl*******@yahoo.com (Duncan Lissett) writes:

I'd appreciate any suggestions on how to make faster Python
implementations of Richards benchmark. Perhaps there are obvious
problems that can be corrected?

http://www.lissett.com/ben/bench1.htm


import psyco
psyco.full()

2 times faster :-)


And Simon: "I just tried the benchmark with Psyco, and cut the run
time for input=10000 from 8.1 seconds to 3.4"

Joseph, Simon, Wilk: thanks for the Psyco suggestion, and special
thanks for going ahead and trying it - we get slightly better than x2.


What platform is this on? On WinXP, with an AMD 2500+ chip, and lots
of memory etc., I'm still getting only 38% speedup, nowhere near 100%...

-Peter
Jul 18 '05 #12
On Thu, May 13, 2004 at 09:15:58AM -0700, Duncan Lissett wrote:
Duncan Booth <me@privacy.net> wrote in message news:<Xn***************************@127.0.0.1>...
Remove the pointless set/get methods and just access the members directly.
If you are writing a benchmark they will cripple performance.

Avoiding multiple accesses to the same instance variable, or assigning to
instance variables until you are about to return from a method: use a local
during the execution of the method.


The pointless set/get methods are pointless in the other language
implementations as well - that's the point. (And I'll cripple that
Oberon-2 implementation real soon by enforcing privacy with modules
and set/get procedures.)


I would leave python and some other languages out of the comparison.
You can do near line-for-line translations for languages in the same class
- say C++ and Java, or Assembly and C. Requiring the python/perl/ruby
versions to look like a C++ program just measures how badly C++ maps to
Python, and not much else.

I've even seen some line-for-line translations in production use that
use python list-of-strings where the orignal version used char arrarys.
You can imagine what that does for performance *shudder*.

Writing benchmarks is just hard, if you allow people to solve the problem
in whatever way they like you end up measuring how good a coder the
language A guy is compared to the language B submitter.

-jackdied


Jul 18 '05 #13
Jack Diederich wrote:
On Thu, May 13, 2004 at 09:15:58AM -0700, Duncan Lissett wrote:
The pointless set/get methods are pointless in the other language
implementations as well - that's the point.


I would leave python and some other languages out of the comparison.
You can do near line-for-line translations for languages in the same class
- say C++ and Java, or Assembly and C. Requiring the python/perl/ruby
versions to look like a C++ program just measures how badly C++ maps to
Python, and not much else.

I've even seen some line-for-line translations in production use that
use python list-of-strings where the orignal version used char arrarys.
You can imagine what that does for performance *shudder*.

Writing benchmarks is just hard, if you allow people to solve the problem
in whatever way they like you end up measuring how good a coder the
language A guy is compared to the language B submitter.


In the case of the Richards benchmark, it's better to say "specifying
benchmarks is hard". The problem that I saw with it is that it
describes the problem in terms that are inappropriate for many of the
languages it has been translated to, effectively constraining the
implementation in ways that are more suitable to certain languages
than others.

One key example is the requirement for "global registers" which act
similarly to how the low-level registers in the CPU have to be handled,
specifically by saving and restoring them in a task-local context
whenever a task switch occurs. This is clearly what Richards really
wanted, but I'd argue that the approach is ill-conceived, especially
in this century...

On the other hand, if the specification had been more thoroughly
thought out, or perhaps modernized is a more appropriate way of looking
at it, then it would describe the expected *use and behaviour* of the
tasks in terms of black box testability... in other words, it shouldn't
matter that "global registers" are used, but merely that the
benchmark produces certain results.

Without those tests, it's fairly open to interpretation just how
much deviation from the original BCPL implementation is appropriate.
Duncan L's ideas for a generator-based approach are interesting,
though there are probably those who would argue that it wouldn't
follow the benchmark specs exactly. (Especially if it showed
performance much closer to C. <wink>)

-Peter
Jul 18 '05 #14
Duncan Booth <me@privacy.net> wrote in message news:<Xn***************************@127.0.0.1>...

-snip-
Next most obvious thing is to junk the linked lists and replace them with
ordinary Python builtin lists.
After the Packet links were replace with a Python list the code was
~1% slower...
Replace the traceOn variable with __debug__ so you get the same benefits as
compiled languages by optimising out the test for the trace statements.
Using __debug__ made the code ~1% faster
Remove the pointless set/get methods and just access the members directly.
As a special dispensation to Python, directly accessed Packet and Tcb
fields, ~10% faster

Avoiding multiple accesses to the same instance variable, or assigning to
instance variables until you are about to return from a method: use a local
during the execution of the method.


Seems we're pushing into code optimization rather than not coding
badly.

Being careful about the tests in conditionals, and reordering some
conditionals sped things up some more. (I should roll the same changes
into the other langauge implementations.)

The OO style implementation took 104.4s and now takes 87.1s

I took the OO style implementation, made the methods of the scheduler
class into functions, made the fields into globals, and made the Task
class run methods into top-level functions.

The C style implementation took 114.9s and the OO conversion takes
80.3s
http://www.lissett.com/ben/bench3.htm

Further suggestions are welcome (yes, I will try pysco later)
Jul 18 '05 #15
Jack Diederich <ja**@performancedrivers.com> wrote in message news:<ma**************************************@pyt hon.org>...
I would leave python and some other languages out of the comparison.
You can do near line-for-line translations for languages in the same class
- say C++ and Java, or Assembly and C. Requiring the python/perl/ruby
versions to look like a C++ program just measures how badly C++ maps to
Python, and not much else.
There's no requirement that Python versions look like a C++ program.
Peter Hansen chose to write a Python version closely based on the C
implementation. I've now replaced that with a faster version initially
based on a Smalltalk OO implementation.

-snip- Writing benchmarks is just hard, if you allow people to solve the problem
in whatever way they like you end up measuring how good a coder the
language A guy is compared to the language B submitter.


Maybe the hardest thing about benchmarks is keeping some perspective
on what they might and might not mean ;-)

Yes, the fine performance of the BCPL interpreter implementation (and
the C implementation) has a lot to do with programmer skill. But I'm
not that sure about the other versions ;-)
Jul 18 '05 #16

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

16
by: Duncan Lissett | last post by:
Is there a Python implementation of Martin Richards benchmark Bench? Be interesting to add it to these www.lissett.com/ben/exp/bench1.htm
6
by: maricel | last post by:
Is there anybody out there who have any idea why EXPORT is relatively slower when putting the output file on a network drive - map drive from onother PC compared to putting it on my local PC drive...
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.