473,722 Members | 2,484 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Why is indexing into an numpy array that slow?

Hello!

Out of curiosity and to learn a little bit about the numpy package i've
tryed to implement
a vectorised version of the 'Sieve of Zakiya'.

While the code itself works fine it is astounding for me that the numpy
Version is almost 7 times slower than
the pure python version. I tryed to find out if i am doing something wrong
but wasn't able to find any answer.

Some experimentation showed that indexing into an numpy array is the
bottleneck that slows down my code.
However i don't understand why.

So i am looking for any explaination why the numpy version is that slow
(i expected it to be at least as fast as the pure python version).

Thank you in advance

>pythonw -u "PrimeSieve .py"
5 Erat: 0.000000 ZakijaP5: 0.000000 VZakijaP5: 0.000000
1000005 Erat: 0.219000 ZakijaP5: 0.219000 VZakijaP5: 1.578000
2000005 Erat: 0.468000 ZakijaP5: 0.469000 VZakijaP5: 3.297000
3000005 Erat: 0.719000 ZakijaP5: 0.703000 VZakijaP5: 5.000000
4000005 Erat: 0.937000 ZakijaP5: 0.985000 VZakijaP5: 6.734000
5000005 Erat: 1.219000 ZakijaP5: 1.218000 VZakijaP5: 8.172000
6000005 Erat: 1.500000 ZakijaP5: 1.531000 VZakijaP5: 9.829000
7000005 Erat: 1.781000 ZakijaP5: 1.813000 VZakijaP5: 11.500000
8000005 Erat: 2.047000 ZakijaP5: 2.094000 VZakijaP5: 13.187000
9000005 Erat: 2.312000 ZakijaP5: 2.391000 VZakijaP5: 14.891000
10000005 Erat: 2.578000 ZakijaP5: 2.672000 VZakijaP5: 16.641000
11000005 Erat: 2.891000 ZakijaP5: 2.953000 VZakijaP5: 18.203000
12000005 Erat: 3.110000 ZakijaP5: 3.297000 VZakijaP5: 19.937000

#------------------------------
Prime_Sieves.py--------------------------------------
from math import sqrt
def sieveOfErat(end ):
if end < 2: return []
lng = (end-1)>>1
sieve = [True]*(lng+1)
for i in xrange(int(sqrt (end)) >1):
if not sieve[i]: continue
for j in xrange( (i*(i + 3) << 1) + 3, lng, (i << 1) + 3):
sieve[j] = False
primes = [2]
primes.extend([(i << 1) + 3 for i in xrange(lng) if sieve[i]])
return primes

def SoZP5(val):
if val < 3 : return [2]
if val < 5 : return [2,3]
lndx = (val-1)/2
sieve = [0]*(lndx+15)
sieve[0:14] = 0 , 1 , 2, 0, 4, 5, 0, 7, 8, 0, 10, 0, 0, 13
for _i in xrange(14, lndx , 15):
sieve[_i:_i+15] = _i -1, 0, 0, _i + 2 , 0, _i + 4, _i + 5, 0, _i +7,
_i + 8, 0, _i + 10, 0, 0, _i + 13

for i in xrange(2, (int(sqrt(val))-3/2)+1):
if not sieve[i]: continue
k = 30*i+45
p1 = 7*i+9 # 7 (2i+3)
p2 = 11*i+15 # 11 (2i+3)
p3 = 13*i+18 # 13 (2i+3)
p4 = 17*i+24 # 17 (2i+3)
p5 = 19*i+27 # 19 (2i+3)
p6 = 23*i+33 # 23 (2i+3)
p7 = 29*i+42 # 29 (2i+3)
p8 = 31*i+45 # 31 (2i+3)
for _i in xrange(0, lndx, k):
try:
sieve[_i+p1] = 0
sieve[_i+p2] = 0
sieve[_i+p3] = 0
sieve[_i+p4] = 0
sieve[_i+p5] = 0
sieve[_i+p6] = 0
sieve[_i+p7] = 0
sieve[_i+p8] = 0
except (IndexError, ):
break

primes = [(i*2)+3 for i in xrange(2, lndx) if sieve[i]]
primes[0:0] = [2,3,5]
return primes

from numpy import array, zeros
def VSoZP5(val):
""" Vectorised Version of Sieve of Zakia"""
if val < 3 : return [2]
if val < 5 : return [2,3]
lndx = (val-1)/2
sieve = zeros(lndx+15,d type="int32") #<-- Offending line: Indexing a
numpy array is slow
sieve[0:14] = 0 , 1 , 2, 0, 4, 5, 0, 7, 8, 0, 10, 0, 0, 13
for _i in xrange(14, lndx , 15):
sieve[_i:_i+15] = _i -1, 0, 0, _i + 2 , 0, _i + 4, _i + 5, 0, _i +7,
_i + 8, 0, _i + 10, 0, 0, _i + 13

om = array([9, 15, 18, 24, 27, 33, 42, 45], dtype="int32")
bm = array([7, 11, 13, 17, 19, 23, 29, 31], dtype="int32")
for i in xrange(2, (int(sqrt(val))-3/2)+1):
if not sieve[i]: continue
k = 30*i+45
rm = (bm * i) + om
for _i in xrange(0, lndx/k + 1):
try:
sieve[rm] = 0 #<-- Offending line: Indexing a numpy array
is slow
rm += k
except (IndexError,):
break
#
primes = [(i*2)+3 for i in xrange(2, lndx) if sieve[i]]
primes[0:0] = [2,3,5]
return primes

if __name__ == '__main__':
import time

for j in range(5, 20000007, 1000000):
print j,

a = time.time()
soe = sieveOfErat(j)
print "Erat: %2f" % (time.time() -a,),

a = time.time()
soz5 = SoZP5(j)
print "ZakijaP5: %2f" % (time.time() -a),

a = time.time()
sovz5 = VSoZP5(j)
print "VZakijaP5: %2f" % (time.time() -a)

assert soe == soz5 == sovz5
#------------------------------
Prime_Sieves.py--------------------------------------



Nov 8 '08 #1
3 6837
Rüdiger Werner wrote:
Hello!
Hi!

numpy questions are best answered on the numpy mailing list.

http://www.scipy.org/Mailing_Lists
Out of curiosity and to learn a little bit about the numpy package i've
tryed to implement
a vectorised version of the 'Sieve of Zakiya'.

While the code itself works fine it is astounding for me that the numpy
Version is almost 7 times slower than
the pure python version. I tryed to find out if i am doing something wrong
but wasn't able to find any answer.
The result of indexing into a numpy array is a numpy scalar object. We do this
instead of returning a float or an int because numpy supports many more data
types than just a C double or long, respectively. If I have a uint16 array,
indexing into it gives me a uint16 numpy scalar. These are a little more
complicated to set up than a regular Python float or int, so they take more time
to create.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Nov 10 '08 #2
Rüdiger Werner:
So i am looking for any explaination why the numpy version is that slow
(i expected it to be at least as fast as the pure python version).
For speed use Psyco with array.array. Psyco is available for Python
2.6 too now.

Bye,
bearophile
Nov 10 '08 #3
Robert Kern wrote:
Rüdiger Werner wrote:
>Hello!

Hi!

numpy questions are best answered on the numpy mailing list.

http://www.scipy.org/Mailing_Lists
>Out of curiosity and to learn a little bit about the numpy package
i've tryed to implement
a vectorised version of the 'Sieve of Zakiya'.

While the code itself works fine it is astounding for me that the
numpy Version is almost 7 times slower than
the pure python version. I tryed to find out if i am doing something
wrong but wasn't able to find any answer.

The result of indexing into a numpy array is a numpy scalar object. We
do this instead of returning a float or an int because numpy supports
many more data types than just a C double or long, respectively. If I
have a uint16 array, indexing into it gives me a uint16 numpy scalar.
These are a little more complicated to set up than a regular Python
float or int, so they take more time to create.
Taking a closer look, that's only part of the problem. The larger problem is
that the numpy code wasn't vectorized enough to actually make use of numpy's
capabilities. You were paying for the overhead numpy imposes without coding in
such a way to make use of numpy's efficiencies. Here is my version that is more
than two times faster than the others. I didn't know exactly which details were
critical to the algorithm and which weren't (e.g. lndx+15), so there's some
gorpiness that can probably be removed. If you don't actually need to convert
back to a list, then you can be about 3 times faster than the others.

from numpy import arange, array, newaxis, zeros
def VSoZP5(val):
""" Vectorised Version of Sieve of Zakia"""
if val < 3 : return [2]
if val < 5 : return [2,3]
lndx = (val-1)/2
sieve = zeros(lndx+15,d type="int32")
sieve[0:14] = 0 , 1 , 2, 0, 4, 5, 0, 7, 8, 0, 10, 0, 0, 13
_i = arange(14, lndx, 15)
if lndx 14:
sieve[14:-14:15] = _i - 1
sieve[17:-11:15] = _i + 2
sieve[19:-9:15] = _i + 4
sieve[20:-8:15] = _i + 5
sieve[22:-6:15] = _i + 7
sieve[23:-5:15] = _i + 8
sieve[25:-3:15] = _i + 10
sieve[28::15] = _i + 13
om = array([9, 15, 18, 24, 27, 33, 42, 45], dtype="int32")
bm = array([7, 11, 13, 17, 19, 23, 29, 31], dtype="int32")
for i in xrange(2, (int(sqrt(val))-3/2)+1):
if not sieve[i]: continue
k = 30*i+45
rm = (bm * i) + om
rm2 = (rm + k * arange(0, lndx/k + 1)[:,newaxis]).flat
rm3 = rm2[rm2 < len(sieve)]
sieve[rm3] = 0
sieve = sieve[2:lndx]
primes = (2*arange(2, lndx) + 3)[sieve 0].tolist()
primes[0:0] = [2,3,5]
return primes

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Nov 11 '08 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

7
2640
by: 2mc | last post by:
I am finding out all kinds of ways to do things in NumPy through the many suggestions I have received. It's exciting. Thanks to all who have replied on my other threads. I'm still having trouble with one thing. Let me set a scenario and see if anyone has any ideas. Assume a multidimensional array (2d). This would be like a spreadsheet of rows and columns. Further, assume hundreds of 'rows' and 3 columns. Suppose I want a running...
2
2792
by: robert | last post by:
in Gnuplot (Gnuplot.utils) the input array will be converted to a Numeric float array as shown below. When I insert a numpy array into Gnuplot like that below, numbers 7.44 are cast to 7.0 Why is this and what should I do ? Is this bug in numpy or in Numeric? >>m #numpy array array(, , , ..., ,
2
3968
by: Chris Smith | last post by:
Howdy, I'm a college student and for one of we are writing programs to numerically compute the parameters of antenna arrays. I decided to use Python to code up my programs. Up to now I haven't had a problem, however we have a problem set where we are creating a large matrix and finding it's inverse to solve the problem. To invert the matrix I've tried using numpy.numarray.linear_algebra.inverse and...
5
4862
by: robert | last post by:
Turning algs for old NumPy modules into numpy code I suffer from this: Upon further processing of returns of numpy calculations, lots of data in an apps object tree will become elementary numpy types. First there is some inefficiency in calculations. And then you get data inflation and questionable dependencies - e.g. with pickle,ZODB,mpi's ... : 0.0...
4
1458
by: Grace Fang | last post by:
Hi, I am writing code to sort the columns according to the sum of each column. The dataset is huge (50k rows x 300k cols), so i need to read line by line and do the summation to avoid the out-of-memory problem. But I don't know why it runs very slow, and part of the code is as follows. I suspect it's because of array index, but not sure. Can anyone point out what needs to be modified to make it run fast? thanks in advance!
4
11605
by: Emin | last post by:
Dear Experts, How much slower is dict indexing vs. list indexing (or indexing into a numpy array)? I realize that looking up a value in a dict should be constant time, but does anyone have a sense of what the overhead will be in doing a dict lookup vs. indexing into a list? Some ad hoc tests I've done indicate that the overhead is less than 15% (i.e., dict lookups seem to take no more than 15% longer than indexing into a list and there...
2
1852
by: John [H2O] | last post by:
I'm having trouble slicing arrays: I thought I could do the following: Traceback (most recent call last): File "<string>", line 1, in <string> ValueError: shape mismatch: objects cannot be broadcast to a single shape It's strange, because I can do this:
0
896
by: Terry Reedy | last post by:
John wrote: If these are numpy arrays, as appears, rather that array module arrays, then the numpy list might be a better place. In any case, using 'numpy' would have gotten the attention of someone scanning for posts about numpy.
2
1695
by: Rick Giuly | last post by:
Hello All, Case 1 This generates an error, which makes sense because the argument should be a list of numbers: numpy.array(10,10) Case 2 This does not generate an error and the result is an array with a single element:
0
8739
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
9384
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
9238
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
8052
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
5995
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
4502
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4762
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
3207
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
3
2147
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.