Loopless syntax for 2d in NumPy (or Numarray)

2mc
I am finding out all kinds of ways to do things in NumPy through the
many suggestions I have received. It's exciting. Thanks to all who
have replied on my other threads.

I'm still having trouble with one thing. Let me set a scenario and
see if anyone has any ideas.

Assume a multidimensional array (2d). This would be like a
spreadsheet of rows and columns. Further, assume hundreds of 'rows'
and 3 columns. Suppose I want a running list of the highest value in a
single column for 20 'rows'. So, starting at 'row' 19, the answer
would be the highest value from 'row' 0 to 'row' 19. Then, at 'row'
20, the answer would be the highest value from 'row' 1 to 'row' 20.
And, so on. Further, suppose I want this value for each 'column'.
The result would be a 3 'column' array with 19 fewer rows than the
source array and would contain a running list of highest values of
each column for the last 20 rows.

How would this be done without loops? Or, at least without looping
through every row.

If it can't be done, is this something that numarray could do?

Thanks a million.

Matt
Jul 18 '05 #1
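
For reference, here is a minimal sketch of the computation described in the question, assuming a present-day NumPy (version 1.20 or later for `sliding_window_view`, which did not exist when this thread was written). The function names `running_max_shifted` and `running_max_view` are made up for illustration. The first loops over the 20 window offsets rather than over every row; the second has no explicit Python loop at all.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def running_max_shifted(a, window):
    """Running column-wise max: loops over window offsets, not over rows."""
    a = np.asarray(a)
    n_out = a.shape[0] - window + 1      # window - 1 fewer rows than the source
    out = a[:n_out].copy()               # offset 0 of every window
    for k in range(1, window):           # offsets 1 .. window-1
        np.maximum(out, a[k:k + n_out], out=out)
    return out


def running_max_view(a, window):
    """Same result through a strided view; no explicit Python loop."""
    windows = sliding_window_view(np.asarray(a), window, axis=0)
    # windows has shape (rows - window + 1, columns, window)
    return windows.max(axis=-1)


if __name__ == "__main__":
    data = np.random.default_rng(0).random((100, 3))
    result = running_max_view(data, 20)
    print(result.shape)                  # (81, 3)
```

Row i of the result is the maximum over source rows i through i + 19, so result row 0 corresponds to 'row' 19 in the question's wording.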

"2mc" <mc*****@bigfoo t.com> wrote in message
news:50******** *************** ***@posting.goo gle.com...
Assume a multidimensional array (2d). This would be like a
spreadsheet of rows and columns. Further, assume hundreds of 'rows'
and 3 columns. Suppose I want a running list of the highest value in a
single column for 20 'rows'. So, starting at 'row' 19, the answer
would be the highest value from 'row' 0 to 'row' 19. Then, at 'row'
20, the answer would be the highest value from 'row' 1 to 'row' 20.
And, so on. Further, suppose I want this value for each 'column'.
The result would be a 3 'column' array with 19 less rows than the
source array and would contain a running list of highest values of
each column for the last 20 rows.

How would this be done without loops? Or, at least without looping
through every row.


Just curious: is this a real problem, or one you made up to stump
NumPy? Keep in mind that NumPy was written to do typical array
operations, linear algebra/analysis, ffts, and even some things not so
typical. A moving maximum is highly nonlinear, unusual, and likely to
need explicit looping to be done efficiently.

I think the way to avoid redoing the max from scratch with each move
of the window is to use a heap augmented by a circular index that
enables easy replacement of the departing number with the incoming
number. After the replacement, re-establish the heap property and
record the max. (For more on heaps, see heapq.py in the lib or
various algorithm books.)

Terry J. Reedy
Jul 18 '05 #2
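
A minimal sketch of a heap-based running maximum for one column, using the standard `heapq` module. Note that this is a variant, not the exact scheme described in the post: instead of replacing the departing number in place, it leaves stale entries in the heap and discards them lazily once they reach the top. The function name `running_max_lazy` is hypothetical.

```python
import heapq


def running_max_lazy(seq, window):
    """Running max over a sliding window, using a heap with lazy deletion."""
    heap = []   # entries are (-value, index); heapq is a min-heap, so negate
    out = []
    for i, x in enumerate(seq):
        heapq.heappush(heap, (-x, i))
        # Discard entries whose index has already left the window.
        while heap[0][1] <= i - window:
            heapq.heappop(heap)
        if i >= window - 1:              # the first full window ends here
            out.append(-heap[0][0])
    return out


if __name__ == "__main__":
    print(running_max_lazy([1, 3, 2, 5, 4, 0, 1], 3))   # [3, 5, 5, 5, 4]
```

The heap can temporarily hold more than `window` entries, but each value is pushed and popped at most once, so the total work stays O(n log n) for n values.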
2mc
"Terry Reedy" <tj*****@udel.e du> wrote in message news:<Vf******* *************@c omcast.com>...
"2mc" <mc*****@bigfoo t.com> wrote in message
news:50******** *************** ***@posting.goo gle.com...
Assume a multidimensional array (2d). This would be like a
spreadsheet of rows and columns. Further, assume hundreds of 'rows'
and 3 columns. Suppose I want a running list of the highest value in a
single column for 20 'rows'. So, starting at 'row' 19, the answer
would be the highest value from 'row' 0 to 'row' 19. Then, at 'row'
20, the answer would be the highest value from 'row' 1 to 'row' 20.
And, so on. Further, suppose I want this value for each 'column'.
The result would be a 3 'column' array with 19 less rows than the
source array and would contain a running list of highest values of
each column for the last 20 rows.

How would this be done without loops? Or, at least without looping
through every row.


Just curious: is this a real problem, or one you made up to stump
NumPy?


Yes, this is a real problem. Of course, it involves much more than
this. But, yes, I would like to get a running list of highest values
within a range of values.

Keep in mind that NumPy was written to do typical array operations, linear
algebra/analysis, ffts, and even some things not so typical. A moving
maximum is highly nonlinear, unusual, and likely to need explicit looping to
be done efficiently.

I think the way to avoid redoing the max from scratch with each move
of the window is to use a heap augmented by a circular index that
enables easy replacement of the departing number with the incoming
number. After the replacement, re-establish the heap property and
record the max. (For more on heaps, see heapq.py in the lib or
various algorithm books.)


I didn't know there was a function called heapq.py. I have 3 books:
Learning Python, Python Essential Reference, and Practical Python. I
also have read the Numerical Python manual on the numerical Python
website. None of these mentioned it.

I have a program that was written in a programming language called SPL
(Smart Programming Language). It is a Pascal-like scripting language
that worked with an older suite of programs called "Smartware 2000."
It is like Visual Basic for Applications except that it isn't OO. In
this language I use multidimensional arrays of quite large sizes and
explicitly declare loops to accomplish a lot of what I'm doing. I
keep the arrays entirely in RAM for speed purposes. But, because it
was a program designed to work with a suite of applications (though it
does not have to use any of them - just like Visual Basic) it has a
lot of overhead that slows the program down.

So, I want to port this program to something else that will speed the
process up. In investigating Python one of the comments I read about
it was that it was slower than other languages but easier to program -
that is, it was truly a rapid application developer. Because it was
slow, an extension module was created to provide true multidimensional
arrays with a computational speed only slightly slower than C++. So,
I thought to myself that I would sacrifice some speed through the
normal parts of the program for the gain of computational speed using
this multidimensional array extension in those parts needing it and
also for the ease of application design. My concern is that I don't
inadvertently write the program in such a way as to slow it down
rather than taking advantage of these modules that provide assistance
with regard to speed. The language is just enough different that I'm
having trouble thinking in NumPy.

Your suggestion to use heapq was interesting. Basically, in my
current program I use something similar. I sort, enter new data into
oldest data's spot, resort, find the max, and repeat.

I have a couple of questions about heapq. The website says that it is
an array but that it works on lists. So, heapq would not work on
NumPy arrays, right? And, it is then a normal Python array rather
than a NumPy array, right? Is there a comparable speed increase using
heapq over explicit Python loops as there is in NumPy over explicit
Python loops?

I guess I'm looking for the fastest way in Python or in any Python
module to accomplish what I want to do.

Thanks for your response. I appreciate it.

Matt
Jul 18 '05 #3

"2mc" <mc*****@bigfoo t.com> wrote in message
news:50******** *************** **@posting.goog le.com...
"Terry Reedy" <tj*****@udel.e du> wrote in message news:<Vf******* *************@c omcast.com>...
Just curious: is this a real problem, or one you made up to stump
NumPy?


Yes, this is a real problem. Of course, it involves much more than
this. But, yes, I would like to get a running list of highest
values within a range of values.

I think the way to avoid redoing the max from scratch with each move
of the window is to use a heap augmented by a circular index that
enables easy replacement of the departing number with the incoming
number. After the replacement, re-establish the heap property and
record the max. (For more on heaps, see heapq.py in the lib or
various algorithm books.)


I didn't know there was a function called heapq.py.


It is a module added to the standard library perhaps in 2.2. Not
surprised if not in books.

Your suggestion to use heapq was interesting. Basically, in my
current program I use something similar. I sort, enter new data into
oldest data's spot, resort, find the max, and repeat.

Same idea. I think best resort method should be binary search for
insertion spot followed by shift-1 movement of off-by-1 block toward
empty spot and then insertion of new item in proper place.

The same operation for heaps would be O(logN) instead of O(N), but
with a higher constant, so it might or might not be faster for N=19.
(But I suspect it would be.)
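
The sorted-window idea above can be sketched with the standard `bisect` module: a binary search locates the departing value, the list deletion and insertion do the block shifting, and the maximum is always the last element. This is only an illustration for one column; `running_max_sorted` is a hypothetical name, and the input is assumed to hold at least `window` comparable values.

```python
from bisect import bisect_left, insort


def running_max_sorted(seq, window):
    """Running max that keeps the current window as a sorted list."""
    win = sorted(seq[:window])           # sort once for the first window
    out = [win[-1]]                      # the max is the last element
    for k in range(window, len(seq)):
        # Binary search for the departing value, then remove it (O(window) shift).
        del win[bisect_left(win, seq[k - window])]
        # Binary search for the insertion spot and insert the incoming value.
        insort(win, seq[k])
        out.append(win[-1])
    return out


if __name__ == "__main__":
    print(running_max_sorted([1, 3, 2, 5, 4, 0, 1], 3))   # [3, 5, 5, 5, 4]
```

A side benefit of this layout is that the running minimum (`win[0]`) and median come for free from the same sorted window.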
I have a couple of questions about heapq. The website says that it is
an array but that it works on lists. So, heapq would not work on
NumPy arrays, right? And, it is then a normal Python array rather
than a NumPy array, right?

The algorithms, with adjustment, can use any type of random access
array, but there would not be any obvious benefit of shifting. heapq,
as implied by the q for queue, is intended for heap use as priority
queue. It has a heapreplace function, which replaces the smallest item, but
not a replace-arbitrary-element operation. Regardless of what array type
you use, you would have to write your own replace function. But it
should be a straightforward adjustment of the current replace
function -- once you understand heaps.

Is there a comparable speed increase using
heapq over explicit Python loops as there is in NumPy over explicit
Python loops?

No, heapq is written in Python and uses normal looping. But do keep
in mind that pure Python code, especially when using ints, floats, and
loops, can often run several times faster when psyco-ized.

Thanks for your response. I appreciate it.


Welcome,

Terry J. Reedy
Jul 18 '05 #4
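
One possible shape for the hand-written replace function mentioned above, as a sketch only: a max-heap of window slots plus a position table, so that the value in any slot can be replaced and the heap property restored in O(log N). The class name `SlotMaxHeap` and its methods are invented for this example and are not part of `heapq`. The circular index is simply `k % window`, which always points at the oldest slot.

```python
class SlotMaxHeap:
    """Max-heap over a fixed set of slots; any slot's value can be replaced."""

    def __init__(self, values):
        self.val = list(values)                  # val[slot] -> current value
        self.heap = list(range(len(self.val)))   # heap of slot numbers
        self.pos = list(range(len(self.val)))    # pos[slot] -> index in heap
        for i in reversed(range(len(self.heap) // 2)):
            self._sift_down(i)                   # standard bottom-up heapify

    def _swap(self, i, j):
        h = self.heap
        h[i], h[j] = h[j], h[i]
        self.pos[h[i]] = i
        self.pos[h[j]] = j

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.val[self.heap[i]] <= self.val[self.heap[parent]]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            largest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.val[self.heap[child]] > self.val[self.heap[largest]]:
                    largest = child
            if largest == i:
                break
            self._swap(i, largest)
            i = largest

    def replace(self, slot, value):
        """Overwrite the value in `slot` and re-establish the heap property."""
        self.val[slot] = value
        self._sift_up(self.pos[slot])
        self._sift_down(self.pos[slot])

    def max(self):
        return self.val[self.heap[0]]


def running_max_heap(seq, window):
    """Running max; assumes len(seq) >= window."""
    heap = SlotMaxHeap(seq[:window])
    out = [heap.max()]
    for k in range(window, len(seq)):
        heap.replace(k % window, seq[k])   # circular index: overwrite oldest slot
        out.append(heap.max())
    return out
```

Unlike a lazy-deletion heap, this structure never holds more than `window` entries, at the cost of maintaining the position table on every swap.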
In article <5t************ ********@comcas t.com>,
"Terry Reedy" <tj*****@udel.e du> wrote:
I didn't know there was a function called heapq.py.


It is a module added to the standard library perhaps in 2.2. Not
surprised if not in books.


It was added in 2.3.
http://www.python.org/doc/2.3/whatsnew/node18.html

--
David Eppstein http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science
Jul 18 '05 #5
2mc
"Terry Reedy" <tj*****@udel.e du> wrote in message news:<5t******* *************@c omcast.com>...
"2mc" <mc*****@bigfoo t.com> wrote in message
news:50******** *************** **@posting.goog le.com...
"Terry Reedy" <tj*****@udel.e du> wrote in message news:<Vf******* *************@c omcast.com>...
Just curious: is this a real problem, or one you made up to stump
NumPy?


Yes, this is a real problem. Of course, it involves much more than
this. But, yes, I would like to get a running list of highest
values within a range of values.

I think the way to avoid redoing the max from scratch with each move
of the window is to use a heap augmented by a circular index that
enables easy replacement of the departing number with the incoming
number. After the replacement, re-establish the heap property and
record the max. (For more on heaps, see heapq.py in the lib or
various algorithm books.)


I didn't know there was a function called heapq.py.


It is a module added to the standard library perhaps in 2.2. Not
surprised if not in books.
Your suggestion to use heapq was interesting. Basically, in my
current program I use something similar. I sort, enter new data into
oldest data's spot, resort, find the max, and repeat.


Same idea. I think best resort method should be binary search for
insertion spot followed by shift-1 movement of off-by-1 block toward
empty spot and then insertion of new item in proper place.

The same operation for heaps would be O(logN) instead of O(N), but
with a higher constant, so it might or might not be faster for N=19.
(But I suspect it would be.)
I have a couple of questions about heapq. The website says that it is
an array but that it works on lists. So, heapq would not work on
NumPy arrays, right? And, it is then a normal Python array rather
than a NumPy array, right?


The algorithms, with adjustment, can use any type of random access
array, but there would not be any obvious benefit of shifting. heapq,
as implied by the q for queue, is intended for heap use as priority
queue. It has a heapreplace function, which replaces the smallest item, but
not a replace-arbitrary-element operation. Regardless of what array type
you use, you would have to write your own replace function. But it
should be a straight forward adjustment of the current replace
function -- once you understand heaps.
Is there a comparable speed increase using
heapq over explicit Python loops as there is in NumPy over explicit
Python loops?


No, heapq is written in Python and uses normal looping. But do keep
in mind that pure Python code, especially when using ints, floats, and
loops, can often run several times faster when psyco-ized.


Thanks for your responses. They have been most helpful. Thank you
for your kindness.

I've looked a little bit at Psyco. I have a couple of questions -
what else is new? :-)

1. Would a psyco-ized normal Python program that performs calculations
be slower, faster, or as fast as the equivalent in NumPy?

2. Can Numerical Python be psyco-ized? I'm supposing it cannot.

Thanks again. I appreciate it.

Matt

Jul 18 '05 #6
2mc wrote:
How would this be done without loops? Or, at least without looping
through every row.


For those cases when you have no option but looping, the easiest approach out
there is probably weave.inline(). Sorry I can't give you too many more
details right now, but if you go here:

http://windom.colorado.edu/~fperez/python/python-c/

you should be able to get started quickly. For weave itself, go to

http://scipy.org/

weave.inline() is one amazing piece of work.

Cheers,

f
Jul 18 '05 #7

"2mc" <mc*****@bigfoo t.com> wrote in message
news:50******** *************** ***@posting.goo gle.com...
1. Would a psyco-ized normal Python program that performs calculations
be slower, faster, or as fast as the equivalent in NumPy?

This is not an either/or choice. Psyco is best used on a few
bottleneck functions. (Currently, it greatly expands the size of what
it works on.) If you did a running max with heaps, that would be a
candidate. Large array calcs done in NumPy as it is meant to be used
should be fastest.

If speed is a problem, also look at Weave as Fernando suggested. That
might be an alternative for all or part of the running max.

For specific answers for your problem on your system, you will have to
run timing tests yourself. The new timeit module makes simple stuff
pretty easy. We have given you about as much of a general answer as is
possible.

2. Can Numerical Python be psyco-ized? I'm supposing it cannot.


Correct, only the Python code calling it, but there would be little
point to that probably.

Terry J. Reedy
Jul 18 '05 #8
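
As a rough illustration of the "time it yourself" advice, the sketch below uses the standard `timeit` module to compare a pure-Python loop with the equivalent whole-array NumPy expression. The workload (`v * 2.0 + 1.0` over 100,000 floats) and the function names are arbitrary choices for the example, and a present-day Python and NumPy are assumed. The numbers it prints depend entirely on the machine, so none are shown here.

```python
import timeit

import numpy as np


def py_scale(values):
    """Pure-Python version: an explicit loop over every element."""
    return [v * 2.0 + 1.0 for v in values]


def np_scale(arr):
    """NumPy version: one whole-array expression, looping done in C."""
    return arr * 2.0 + 1.0


data = [float(i) for i in range(100_000)]
arr = np.array(data)

# timeit.timeit accepts a callable; number is how many times it is run.
t_py = timeit.timeit(lambda: py_scale(data), number=20)
t_np = timeit.timeit(lambda: np_scale(arr), number=20)

print(f"pure Python: {t_py:.4f} s for 20 runs")
print(f"NumPy:       {t_np:.4f} s for 20 runs")
```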
