473,396 Members | 1,771 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

Some simple performace tests (long)

TPJ
"The advantage of xrange() over range() is minimal (since xrange()
still has to create the values when asked for them) except when a very
large range is used on a memory-starved machine or when all of the
range's elements are never used (such as when the loop is usually
terminated with break)." - from Python Library Reference.

I decided to measure the performance of range and xrange. I did it with
the following functions:

def rprint( n ):
a = time.time()
for i in range(n): print i
print time.time() - a

def xrprint( n ):
a = time.time()
for i in xrange(n): print i
print time.time() - a

def rpass( n ):
a = time.time()
for i in range(n): pass
print time.time() - a

def xrpass( n ):
a = time.time()
for i in xrange(n): pass
print time.time() - a
The results were as follows:

n rprint xrprint

10^4 0.37 s 0.34 s <- (1)
10^5 4.26 s 4.25 s
10^6 42.57 s 42.57 s
10^7 431.94 s 438.32 s <- (2)

n rpass xpass

10^4 0.0012 s 0.0011 s
10^5 0.0220 s 0.0139 s
10^6 0.1463 s 0.1298 s
10^7 1.4818 s 1.1807 s

The values are the average times printed by tested functions.

Conclusions:

1) According to (1) I could say that xrange might be somewhat faster
than range with lower numbers of iterations.
2) According to (2) I could say that xrange might be slower than range
with higher number of iterations.

The test with pass is not so important as the test with print (because
we usually do something inside of loops). So despite xpass has beaten
rpass, I would say that range is not slower than xrange (especially for
higher numbers of iterations). The final conclusion is : if you want
speed, you should use xrange privided that there aren't many
iterations. If you want less memory usage, you should use xrange.
I've also made more tests. The code was as follows:

-----------------------------
import array, random, time

def test1( n, size ):
a = time.time()
for i in xrange(n):
l = []
for i in xrange(size):
l.append( random.randint( 1,10 ) )
e = sum(l) / float(size) # just for taking some time
print time.time() - a
def test2( n, size ):
a = time.time()
l = []
for i in xrange(n):
del l[:]
for i in xrange(size):
l.append( random.randint( 1,10 ) )
e = sum(l) / float(size)
print time.time() - a
def test3( n, size ):
a = time.time()
l = range(size)
for i in xrange(n):
for i in xrange(size):
l[i] = random.randint( 1,10 )
e = sum(l) / float(size)
print time.time() - a
def test4( n, size ):
a = time.time()
l = array.array( 'L', xrange(size) )
for i in xrange(n):
for i in xrange(size):
l[i] = random.randint( 1,10 )
e = sum(l) / float(size)
print time.time() - a
def test5( n, size ):
a = time.time()
ind1 = range(size)
ind2 = range(size)
for i in xrange(n):
des1 = []
des2 = []
point = random.randint( 1, size-1 )
des1 = ind1[:point] + ind2[point:]
des2 = ind2[:point] + ind1[point:]
print time.time() - a
def test6( n, size ):
a = time.time()
ind1 = range(size)
ind2 = range(size)
des1 = []
des2 = []
for i in xrange(n):
del des1[:]
del des2[:]
point = random.randint( 1, size-1 )
des1 = ind1[:point] + ind2[point:]
des2 = ind2[:point] + ind1[point:]
print time.time() - a
def test7( n, size ):
a = time.time()
ind1 = range(size)
ind2 = range(size)
des1 = range(size)
des2 = range(size)
for i in xrange(n):
point = random.randint( 1, size-1 )
des1[:point] = ind1[:point]
des1[point:] = ind2[point:]
des2[:point] = ind2[:point]
des2[point:] = ind1[point:]
print time.time() - a
def test8( n, size ):
a = time.time()
ind1 = array.array( 'L', xrange(size) )
ind2 = array.array( 'L', xrange(size) )
des1 = array.array( 'L', xrange(size) )
des2 = array.array( 'L', xrange(size) )
for i in xrange(n):
point = random.randint( 1, size-1 )
des1[:point] = ind1[:point]
des1[point:] = ind2[point:]
des2[:point] = ind2[:point]
des2[point:] = ind1[point:]
print time.time() - a
-----------------------------

And this is my session with Python 2.4.1:
for i in xrange(5): test.test1( 10000, 10 ) ....
2.27345108986
2.51863479614
2.49968791008
2.68024802208
2.28194379807 for i in xrange(5): test.test2( 10000, 10 ) ....
2.54866194725
2.36415600777
2.71178197861
2.32558512688
2.71971893311 for i in xrange(5): test.test3( 10000, 10 ) ....
2.29083013535
2.5563249588
2.32064318657
1.90063691139
2.30613899231 for i in xrange(5): test.test4( 10000, 10 ) ....
2.55809211731
2.42571187019
2.59921813011
2.19631099701
2.16659498215 for i in xrange(5): test.test5( 10000, 10 ) ....
0.318142175674
0.442049980164
0.367480039597
0.327154874802
0.322648048401 for i in xrange(5): test.test6( 10000, 10 ) ....
0.356222867966
0.471677780151
0.332046031952
0.339803934097
0.48833990097 for i in xrange(5): test.test7( 10000, 10 ) ....
0.467595815659
0.317886829376
0.311239004135
0.312664031982
0.49030995369 for i in xrange(5): test.test8( 10000, 10 ) ....
0.499684095383
0.330184936523
0.332714080811
0.329524040222
0.50562787056 for i in xrange(5): test.test5( 10000, 100 ) ....
0.387717962265
0.45348906517
0.507198095322
0.402877807617
0.526827096939 for i in xrange(5): test.test6( 10000, 100 ) ....
0.525599002838
0.41659784317
0.443000078201
0.403271913528
0.591446876526 for i in xrange(5): test.test7( 10000, 100 ) ....
0.399652957916
0.416820049286
0.400202035904
0.404708862305
0.57714009285 for i in xrange(5): test.test8( 10000, 100 ) ....
0.357075929642
0.540817022324
0.378996133804
0.372053146362
0.554198980331 for i in xrange(5): test.test5( 10000, 250 ) ....
0.630347967148
0.497437000275
0.687075138092
0.497366905212
0.935706853867 for i in xrange(5): test.test6( 10000, 250 ) ....
0.493726015091
0.683156013489
0.512520074844
0.697488069534
0.746694803238 for i in xrange(5): test.test7( 10000, 250 ) ....
0.519948005676
0.583598136902
0.624222993851
0.528346061707
0.948079824448 for i in xrange(5): test.test8( 10000, 250 ) ....
0.553761005402
0.401547908783
0.389595985413
0.578064918518
0.394165039062 for i in xrange(5): test.test1( 10000, 1000 ) ....
233.676990032
229.95272994
228.739851952
228.541095018
226.404256105 for i in xrange(5): test.test2( 10000, 1000 ) ....
223.505224943
225.172422886
223.084803104
223.407966137
224.717788935 for i in xrange(5): test.test3( 10000, 1000 ) ....
210.81110096
211.956163168
212.362264156
211.730306149
209.519776106 for i in xrange(5): test.test4( 10000, 1000 ) ....
241.220864773
248.316150904
247.426213026
239.199230909
242.972666025 for i in xrange(5): test.test5( 10000, 1000 ) ....
1.32021999359
1.29506993294
1.14080190659
1.50338101387
1.30436086655 for i in xrange(5): test.test6( 10000, 1000 ) ....
1.28036403656
1.09035301208
1.07259607315
1.0751209259
1.07368779182 for i in xrange(5): test.test7( 10000, 1000 ) ....
1.38129281998
1.36377501488
1.40786099434
1.35044002533
1.37256002426 for i in xrange(5): test.test8( 10000, 1000 )

....
0.524824142456
0.696274995804
0.544312000275
0.719218969345
0.77623295784
Tests with size equal to 10 were for testing a "small" size case. The
sizes 100 and 250 are more adequate in my case. The size 1000 is a
"big" size case. There are "random" tests (test1 ... test4) and "slice"
tests (test5 ... test8).

Conclusions:

1) Small size tests: there is no one winner of the random tests. The
differences are rather small and might be accidental. There is also no
winner of the slice tests.

2) Big size tests: as I expected, test3 is better than test2, and test
2 is better than test1. As I definitelly hadn't expected is the fact
that test4 was the worst (shouldn't arrays be more efficient than
lists?). So the winner of random tests is test3. And the winner in
slicing is test8.

I'm going to implement genetic algorothm, so I think that slicing tests
are more important than random tests. And the final conclusion is I
should use arrays instead of lists.

I'm going also to write tests that use Numeric.

Aug 6 '05 #1
3 1507
TPJ wrote:
I decided to measure the performance of range and xrange...

The results were as follows:

n rprint xrprint

10^4 0.37 s 0.34 s <- (1)
10^5 4.26 s 4.25 s
10^6 42.57 s 42.57 s
10^7 431.94 s 438.32 s <- (2)

n rpass xpass

10^4 0.0012 s 0.0011 s
10^5 0.0220 s 0.0139 s
10^6 0.1463 s 0.1298 s
10^7 1.4818 s 1.1807 s

The values are the average times printed by tested functions.


The number of replicates and standard deviations would be useful in
analyzing these results.
--
Michael Hoffman
Aug 6 '05 #2

"TPJ" <tp*****@interia.pl> wrote in message
news:11**********************@z14g2000cwz.googlegr oups.com...
"The advantage of xrange() over range() is minimal (since xrange()
still has to create the values when asked for them) except when a very
large range is used on a memory-starved machine or when all of the
range's elements are never used (such as when the loop is usually
terminated with break)." - from Python Library Reference.
Nothing in your results, properly interpreted, contradicts this.
I decided to measure the performance of range and xrange.
On one machine, with one binary. Relative performance on different systems
can vary by, say, 10%. But of course, if you are optimizing a program to
run on just one machine, then the results on that machine are all that
matters.
I did it with the following functions:

def rprint( n ):
a = time.time()
for i in range(n): print i
print time.time() - a

def xrprint( n ):
a = time.time()
for i in xrange(n): print i
print time.time() - a
To compare two things, one wants to remove all possible noise factors. If
one wants relative performance (ratios, percentages) then constant factors
should also be removed, when possible, to make the apparent difference as
close to the real difference as possible. Print is a large constant +
large noise factor that you *DO NOT* want for the stated comparison. It
requires a call through lots of OS code with variable execution times.
def rpass( n ):
a = time.time()
for i in range(n): pass
print time.time() - a

def xrpass( n ):
a = time.time()
for i in xrange(n): pass
print time.time() - a
This is the right comparison for the reasons noted.
The results were as follows:

n rprint xrprint

10^4 0.37 s 0.34 s <- (1)
10^5 4.26 s 4.25 s
10^6 42.57 s 42.57 s
10^7 431.94 s 438.32 s <- (2)
These times with print are about 300x the results below without print.
They are useless for comparing range and xrange. The differences above are
differences in printing (display) times and not in range versus xrange.

Think about it. When the differences between runs with print are several
times as large as the total time without, then those large differences
cannot have anything to do with the minor difference in looping method.
n rpass xpass

10^4 0.0012 s 0.0011 s
10^5 0.0220 s 0.0139 s
10^6 0.1463 s 0.1298 s
10^7 1.4818 s 1.1807 s
The looks more coherent.
The values are the average times printed by tested functions.

Conclusions:

1) According to (1) I could say that xrange might be somewhat faster
than range with lower numbers of iterations.
2) According to (2) I could say that xrange might be slower than range
with higher number of iterations.
You could, but you would be wrong. The second table shows that xrange is
slightly faster on your machine over the range tested.
The test with pass is not so important as the test with print (because
we usually do something inside of loops).
This is completely backwards for the reasons already given. The point
about doing something inside loops in relevant in so far as it says the
that the minor difference between range and xrange, whichever way it goes
on a particular system, will matter relatively even less in application.
So the choice hardly matters unless space is a considerations. Which is
what the quoted docs more or less said.
import array, random, time
To reduce random noise from random, the generator should be reinitialized
with the seed(someint) at the beginning of each function. This is the
purpose of seed(x). But for the comparisons you are making, randomness is
irrelevance and the time for calls to random a nuisance.
def test1( n, size ):
a = time.time()
for i in xrange(n):
l = []
for i in xrange(size):
l.append( random.randint( 1,10 ) )
e = sum(l) / float(size) # just for taking some time
This is exactly what you DO NOT WANT TO DO for comparing anything else.
print time.time() - a def test2( n, size ):
a = time.time()
l = []
for i in xrange(n):
del l[:]
for i in xrange(size):
l.append( random.randint( 1,10 ) )
e = sum(l) / float(size)
print time.time() - a
This is trivially different. To test 'l = []' versus 'del l[:]', test just
that.
def test3( n, size ):
a = time.time()
l = range(size)
for i in xrange(n):
for i in xrange(size):
l[i] = random.randint( 1,10 )
e = sum(l) / float(size)
print time.time() - a


It is not surprising that allocating a list just once and filling it in is
faster than starting empty and expanding and copying multiple times. In
any case, if you want to test that, test just that without all the noise
and masking of other stuff.

I did not look at the slicing stuff.

Terry J. Reedy

Aug 6 '05 #3
How much ram does your machine have?
the main point is "except when a very large range is used on a
memory-starved machine"
run
x = range(10 ** 6)
and look at the memory usage of python..

what happens when you run this program:

import time

def t(func, num):
s = time.time()
for x in func(num):
pass
return time.time() - s

def run(func, num):
times = []
for x in range(5):
times.append(t(func,num))
return min(times), max(times), sum(times)/5

def main():
x = 10 ** 6
while 1:
print "trying", x
for s, f in ('xr', xrange), (' r', range):
print s + " %.3f %.3f %.3f" % run(f, x)
x *= 1.5
x = int(x)
if __name__ == "__main__":
main()
I get (columns are mix/max/average):

trying 1000000
xr 0.110 0.115 0.111
r 0.101 0.186 0.119
trying 1500000
xr 0.082 0.087 0.083
r 0.152 0.158 0.154
trying 2250000
xr 0.124 0.138 0.128
r 0.228 0.235 0.230
trying 3375000
xr 0.184 0.189 0.186
r 0.344 0.352 0.346
trying 5062500
xr 0.276 0.284 0.279
r 0.515 0.528 0.519
trying 7593750
xr 0.415 0.421 0.416
r 0.774 0.795 0.779
trying 11390625
xr 0.623 0.634 0.626
r 1.163 1.246 1.180
trying 17085937
xr 0.934 0.941 0.937
Killed

The "Killed" is from the linux OOM killing the python process.. notice
that the xrange for that number worked fine.

Aug 7 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

12
by: Paul Moore | last post by:
One of the things I really dislike about Unittest (compared, say, to a number of adhoc testing tricks I've used in the past, and to Perl's "standard" testing framework) is that the...
1
by: engsolnom | last post by:
Knowing absolutely zero about benchmarks, I was interested in the posting referencing the paper by Dr. Cowell-Shah. I was curious how long Python takes just to read and display time.clock(), soI...
193
by: Michael B. | last post by:
I was just thinking about this, specifically wondering if there's any features that the C specification currently lacks, and which may be included in some future standardization. Of course, I...
2
by: thisyr4leafs | last post by:
Hello All I'm stumped as to why my aspx page takes so long to load. When I run this is a development environment (with minimal data in the database) all is fine. I ran the stored procs (in a...
6
by: TPJ | last post by:
Help me please, because I really don't get it. I think it's some stupid mistake I make, but I just can't find it. I have been thinking about it for three days so far and I still haven't found any...
176
by: nw | last post by:
Hi, I previously asked for suggestions on teaching testing in C++. Based on some of the replies I received I decided that best way to proceed would be to teach the students how they might write...
0
by: Jordan S. | last post by:
Okay so I've finally "seen the Light" about writing automated unit tests ahead of time. Question: What is a very simple approach that I can use to setting up automated unit tests, considering...
55
by: copx | last post by:
Can anyone point me to a simple, fast RRNG function to generate random ints within a specified range? It is important that each value within the range has the same probability (uniform...
0
by: tarun | last post by:
Hello All, Please find the code for a simple wx.TreeCtrl code. Whenever I right click on any item in the Tree, I see the function, 'OnSelChanged' gets called twice. What I've just done is...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.