Wow, Python much faster than MatLab

hi All,

instead of questions,
my first success story:

I converted my first MatLab algorithm into Python (using SciPy),
and it not only works perfectly,
but also runs much faster:

MatLab: 14 msec
Python: 2 msec

After taking the first difficult steps into Python,
with all kinds of small problems, as you already know,
it now seems a piece of cake to convert from MatLab to Python.
(the final programs in MatLab and Python can almost only be
distinguished by the comment character ;-)

Especially I like:
- more relaxed behavior when exceeding the upper limit of a (1-dimensional)
array
- many more functions available, like a simple "mean"
- reducing the datatype where it's allowed (booleans of 1 byte)

thanks for all your help,
probably need some more in the future,
cheers,
Stef Mientki
Dec 29 '06 #1

Stef Mientki wrote:
> hi All,
>
> instead of questions,
> my first success story:
>
> I converted my first MatLab algorithm into Python (using SciPy),
> and it not only works perfectly,
> but also runs much faster:
>
> MatLab: 14 msec
> Python: 2 msec

For times this small, I wonder whether timing comparisons are valid. I do
NOT think SciPy is in general an order of magnitude faster than Matlab
for the tasks typically performed with Matlab.
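
A minimal sketch of how a millisecond-scale timing can be made more trustworthy on the Python side: run the call many times per measurement, repeat the measurement, and keep the best result. The function `analyse` and its input are hypothetical stand-ins, since the thread never shows the converted algorithm.

```python
import timeit


# Hypothetical stand-in for the converted algorithm (not shown in the thread).
def analyse(samples):
    # Mean absolute difference between neighbouring samples.
    return sum(abs(b - a) for a, b in zip(samples, samples[1:])) / (len(samples) - 1)


# Deterministic dummy input so every run does the same work.
samples = [((i * 37) % 101) / 101.0 for i in range(2_000)]

runs = 50  # calls per measurement
timings = timeit.repeat(lambda: analyse(samples), number=runs, repeat=5)

# The minimum is the measurement least disturbed by other processes.
best_ms = min(timings) / runs * 1e3
print(f"best of 5 repeats: {best_ms:.4f} ms per call")
```

A single 2 msec versus 14 msec reading can be dominated by start-up effects or other processes, so a figure obtained this way is a fairer basis for comparison.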

> After taking the first difficult steps into Python,
> with all kinds of small problems, as you already know,
> it now seems a piece of cake to convert from MatLab to Python.
> (the final programs in MatLab and Python can almost only be
> distinguished by the comment character ;-)
>
> Especially I like:
> - more relaxed behavior when exceeding the upper limit of a (1-dimensional)
> array

Could you explain what this means? In general, I don't want a
programming language to be "relaxed" about exceeding array bounds.

Dec 30 '06 #2
On Fri, 29 Dec 2006 19:35:22 -0800, Beliavsky wrote:
>> Especially I like:
>> - more relaxed behavior when exceeding the upper limit of a (1-dimensional)
>> array
>
> Could you explain what this means? In general, I don't want a
> programming language to be "relaxed" about exceeding array bounds.

I'm not sure about SciPy, but lists in standard Python allow this:

>>> array = [1, 2, 3, 4]
>>> array[2:50000]
[3, 4]

That's generally a good thing.
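
To make the distinction explicit, here is a small sketch using only built-in lists: the relaxed behavior applies to slices, while indexing a single element out of range still raises an error. The variable names are illustrative only.

```python
values = [1, 2, 3, 4]

# A slice whose end lies past the last element is clamped to the list length.
tail = values[2:50000]
print(tail)

# A slice that starts past the end is simply empty.
empty = values[10:20]
print(empty)

# Indexing a single element out of range is still an error.
try:
    values[50000]
    index_ok = True
except IndexError:
    index_ok = False
print(index_ok)
```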


--
Steven.

Dec 30 '06 #3
>> MatLab: 14 msec
>> Python: 2 msec
>
> For times this small, I wonder if timing comparisons are valid. I do
> NOT think SciPy is in general an order of magnitude faster than Matlab
> for the task typically performed with Matlab.

The algorithm is meant for real-time analysis,
where these kinds of differences count a lot.
I'm also a typical "surface programmer"
(don't need/want to know what's going on inside),
just want to get my analysis done,
and the fact that Python has many more functions available
means I have to write far fewer explicit or implicit for loops,
and thus I expect it to "look" faster for me always.

>> After taking the first difficult steps into Python,
>> with all kinds of small problems, as you already know,
>> it now seems a piece of cake to convert from MatLab to Python.
>> (the final programs in MatLab and Python can almost only be
>> distinguished by the comment character ;-)
>>
>> Especially I like:
>> - more relaxed behavior when exceeding the upper limit of a (1-dimensional)
>> array
>
> Could you explain what this means? In general, I don't want a
> programming language to be "relaxed" about exceeding array bounds.

Well, I have to admit, that wasn't a very tactful remark; "noise" is still
an unwanted issue in software.
But in the meantime I've been reading further, and I should replace that by
some other great things:
- the very efficient way comments are turned into help information
- the (at first sight) very easy, yet quite powerful OOP implementation.
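
The first point presumably refers to docstrings (an assumption; the post does not name the feature). A minimal sketch, with `moving_average` as a purely illustrative function:

```python
import inspect


def moving_average(samples, width):
    """Return the averages of each run of `width` consecutive samples."""
    return [sum(samples[i:i + width]) / width
            for i in range(len(samples) - width + 1)]


# The string literal at the top of the function body is stored on the
# function object and can be read back programmatically ...
doc = inspect.getdoc(moving_average)
print(doc)

# ... and help(moving_average) shows the same text at the interactive prompt.
```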

cheers,
Stef Mientki
Dec 30 '06 #4
> I'm not sure about SciPy,

Yes, SciPy allows it too!

> but lists in standard Python allow this:
>
> >>> array = [1, 2, 3, 4]
> >>> array[2:50000]
> [3, 4]
>
> That's generally a good thing.

You're perhaps not an analog engineer by origin ;-)

cheers,
Stef Mientki
Dec 30 '06 #5
Another great thing: with rpy you have R bindings for Python.
So you have the power of R and the easy syntax and big standard lib of Python! :)
Dec 30 '06 #6
Mathias Panzenboeck wrote:
> Another great thing: with rpy you have R bindings for Python.

Forgive my ignorance, what are R and rpy?
Or are they only relevant for Linux users?

cheers
Stef

> So you have the power of R and the easy syntax and big standard lib of Python! :)
Dec 30 '06 #7
Stef Mientki <S.**************@mailbox.kun.nl> writes:
> Mathias Panzenboeck wrote:
>> Another great thing: with rpy you have R bindings for Python.
>
> Forgive my ignorance, what are R and rpy?
> Or are they only relevant for Linux users?
[...]

R is a language / environment for statistical programming. RPy is a
Python interface to let you use R from Python. I think they both run
on both Windows and Linux.

http://www.r-project.org/

http://rpy.sourceforge.net/
John
Dec 30 '06 #8

Stef Mientki wrote:
> MatLab: 14 msec
> Python: 2 msec

I have the same experience. NumPy is usually faster than Matlab. But it
very much depends on how the code is structured.

I wonder if it is possible to improve the performance of NumPy by
having its fundamental types in the language, instead of depending on
operator overloading. For example, in NumPy, a statement like

array3[:] = array1[:] + array2[:]

allocates an intermediate array that is not needed. This is because the
operator overloading cannot know if it's evaluating a part of a larger
statement like

array1[:] = (array1[:] + array2[:]) * (array3[:] + array4[:])

If arrays had been a part of the language, as they are in Matlab and
Fortran 95, the compiler could see this and avoid intermediate storage,
as well as loop over the data only once. This is one of the main
reasons why Fortran is better than C++ for scientific computing. I.e.,
with an array class built on operator overloading, instead of

    for (i = 0; i < n; i++)
        array1[i] = (array1[i] + array2[i]) * (array3[i] + array4[i]);

one actually gets something like three intermediates and four loops:

    /* tmp1 = array1 + array2 */
    tmp1 = malloc(n * sizeof *tmp1);
    for (i = 0; i < n; i++)
        tmp1[i] = array1[i] + array2[i];
    /* tmp2 = array3 + array4 */
    tmp2 = malloc(n * sizeof *tmp2);
    for (i = 0; i < n; i++)
        tmp2[i] = array3[i] + array4[i];
    /* tmp3 = tmp1 * tmp2 */
    tmp3 = malloc(n * sizeof *tmp3);
    for (i = 0; i < n; i++)
        tmp3[i] = tmp1[i] * tmp2[i];
    free(tmp1);
    free(tmp2);
    /* copy the result back into array1 */
    for (i = 0; i < n; i++)
        array1[i] = tmp3[i];
    free(tmp3);

In C++ this is actually further bloated by constructor, destructor and
copy constructor calls.

Why one should use Fortran over C++ is obvious. But it also applies to
NumPy, and also to the issue of NumPy vs. Matlab, as Matlab knows about
arrays and has a compiler that can deal with this, whilst NumPy depends
on bloated operator overloading. On the other hand, Matlab is
fundamentally impaired on function calls and array slicing compared
with NumPy (basically copies are created instead of views). Thus, which
is faster - Matlab or NumPy - very much depends on how the code is
written.
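
As an illustration of how much the structure of the code matters, here is a hedged sketch of one way to avoid the hidden allocations in NumPy: the `out=` argument of the ufuncs `np.add` and `np.multiply` writes results into arrays that already exist. The array names follow the example above; the sizes and values are made up. Note that this still makes three passes over the data, so it removes the allocations but not the extra loops.

```python
import numpy as np

n = 1_000
array1 = np.arange(n, dtype=float)
array2 = np.ones(n)
array3 = np.full(n, 2.0)
array4 = np.full(n, 3.0)

# Reference result; this expression allocates temporaries behind the scenes.
expected = (array1 + array2) * (array3 + array4)

# One scratch buffer, allocated once and reusable across calls.
scratch = np.empty(n)
np.add(array3, array4, out=scratch)        # scratch = array3 + array4
np.add(array1, array2, out=array1)         # array1 = array1 + array2, in place
np.multiply(array1, scratch, out=array1)   # array1 = array1 * scratch, in place

print(np.allclose(array1, expected))
```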

Now for my question: operator overloading is (as shown) not the
solution to efficient scientific computing. It creates serious bloat
where it is undesired. Can NumPy's performance be improved by adding
the array types to the Python language itself? Or is the dynamic
nature of Python preventing this?

Sturla Molden

Dec 31 '06 #9
sturlamolden wrote:
> array3[:] = array1[:] + array2[:]

OT, but why are you slicing array1 and array2? All that does is create new array
objects pointing to the same data.
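
A short sketch of that point: a full slice of a NumPy array is a new array object that shares the original's data, whereas a full slice of a plain Python list is a shallow copy. The variable names are illustrative.

```python
import numpy as np

array1 = np.arange(5)
view = array1[:]            # new array object, same underlying buffer

print(view is array1)
print(np.shares_memory(view, array1))

view[0] = 99                # writing through the view changes array1 too
print(array1[0])

# A plain Python list behaves differently: its full slice is a shallow copy.
items = [0, 1, 2]
items_copy = items[:]
items_copy[0] = 99
print(items[0])
```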

> Now for my question: operator overloading is (as shown) not the
> solution to efficient scientific computing. It creates serious bloat
> where it is undesired. Can NumPy's performance be improved by adding
> the array types to the Python language itself? Or is the dynamic
> nature of Python preventing this?

Pretty much. Making the array types builtin rather than from a third party
module doesn't really change anything. However, if type inferencing tools like
psyco are taught about numpy arrays like they are already taught about ints,
then one could make them avoid temporaries.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Dec 31 '06 #10

sturlamolden wrote:
> as well as looping over the data only once. This is one of the main
> reasons why Fortran is better than C++ for scientific computing. I.e.
> instead of
>
>     for (i = 0; i < n; i++)
>         array1[i] = (array1[i] + array2[i]) * (array3[i] + array4[i]);
>
> one actually gets something like three intermediates and four loops:
>
>     tmp1 = malloc(n * sizeof *tmp1);
>     for (i = 0; i < n; i++)
>         tmp1[i] = array1[i] + array2[i];
>     tmp2 = malloc(n * sizeof *tmp2);
>     for (i = 0; i < n; i++)
>         tmp2[i] = array3[i] + array4[i];
>     tmp3 = malloc(n * sizeof *tmp3);
>     for (i = 0; i < n; i++)
>         tmp3[i] = tmp1[i] * tmp2[i];
>     free(tmp1);
>     free(tmp2);
>     for (i = 0; i < n; i++)
>         array1[i] = tmp3[i];
>     free(tmp3);

C/C++ do not allocate extra arrays. What you posted _might_ bear a
small resemblance to what NumPy might produce (if using vectorized
code, not explicit loop code). This is entirely unrelated to the
reasons why Fortran can be faster than C.

-Mike

Dec 31 '06 #11

Klaas wrote:
> C/C++ do not allocate extra arrays. What you posted _might_ bear a
> small resemblance to what numpy might produce (if using vectorized
> code, not explicit loop code). This is entirely unrelated to the
> reasons why fortran can be faster than c.

Array libraries in C++ that use operator overloading produce
intermediate arrays for the same reason as NumPy. There is a C++
library that is sometimes able to avoid intermediates (Blitz++), but
it can only do so for small arrays for which bounds are known at
compile time.

Operator overloading is sometimes portrayed as required for scientific
computing (e.g. in Java vs. C# flame wars), but the cure can be worse
than the disease.

C does not have operator overloading and is an entirely different case.
You can of course avoid intermediates in C++ if you use C++ as C. You
can do that in Python as well.

Jan 1 '07 #12
