# Vectorized computation in C++ such as that in Matlab (Matlab to C++)?

 Dear all,

Can C++/STL/Boost do the vectorized calculation as those in Matlab?
For example, in the following code, what I really want to do is to
send in a vector of u's. All other parameters such as t, l1, l2, l3,
etc. are scalars... But u is a vector. Thus, t6 becomes a vector. t9
is an element-wise multiplication... The following code was actually
converted from Matlab. If vectorized computation is not facilitated,
then I have to call this function millions of times. But if
vectorized computation is okay, then I can send in just a u vector
with a batch of elements at a time. I have a lot of such Matlab code
that needs to be converted into C++ with vectorization. Any thoughts?
Thank you!

    double t5, t6, t7, t9, t11, t13, t16, t20, t23, t27, t32, t34, t36, t37,
           t38, t42, t44, t47, t48, t51, t52, t54, t59, t60, t61, t66, t67,
           t69, t74, t75, t76, t81, t82, t84, t87, t105, t106, t110, t112;
    t5 = exp(-t * l1 - t * l2 - t * l3);
    t6 = t * u;
    t7 = mu1 * mu1;
    t9 = u * u;
    t11 = kappa * kappa;
    t13 = 0.1e1 / (t9 * t7 + t11);

Jul 22 '08 #1

Hi. I think Matlab provides a C++ API. Have you checked it out?
There's also the Matrix Template Library for general algebra
computations. You might find it useful.

--
Leandro T. C. Melo

Jul 22 '08 #2

 I don't think Matlab's C++ API can do that. I think it is just a C
interface. It does not have STL, Boost etc. Also, we are not talking
about things as complicated as high speed matrix computation, it's
just vectorized computation...

On Jul 22, 11:00 am, Leandro Melo

 On Tue, 22 Jul 2008 07:08:38 -0700, Luna Moon wrote:

Dear all,

Can C++/STL/Boost do the vectorized calculation as those in Matlab?

What exactly do you mean by "vectorized calculation as those in Matlab"?
Do you just mean that Matlab has a native vector type and does
calculations with it, or were you suggesting that Matlab processes
vectors in some special way that C++ cannot?

Matlab, AFAIK, does a lot of its matrix/vector arithmetic, such as dot
products and matrix-matrix or matrix-vector multiplication, using a BLAS
library - that is highly optimised linear algebra code (generally written
in Fortran) - which is accessible via C++, since there is a well-defined
interface for C++ (C, really) and Fortran. There is a good chance you
will already have a BLAS library on your system; if not, there are open
source (e.g. the ATLAS project) as well as vendor-supplied versions
(e.g. Intel, AMD, etc. supply BLAS libraries).

It is possible that Matlab will also make use of very machine-specific
optimisations such as SSE/MMX for floating point computation. You can use
these too from C++ if you can persuade your compiler to play ball.

The bottom line is that there's nothing Matlab can do that you can't do
in C++, equally (if not more) efficiently. It's more a question of
convenience: Matlab is designed specifically for vector/matrix
manipulation - C++ is a general-purpose programming language.

For example, in the following code, what I really want to do is to send
in a vector of u's. All other parameters such as t, l1, l2, l3, etc. are
scalars... But u is a vector. Thus, t6 becomes a vector. t9 is an
element-wise multiplication... The following code was actually converted
from Matlab. If vectorized computation is not facilitated, then I have to
call this function millions of times. But if vectorized computation is
okay, then I can send in just a u vector with a batch of elements at a time.

I'm really not quite sure what you mean here. The closest thing in C++ to
a Matlab vector is probably the std::valarray

 Luna Moon wrote:

Dear all,

Can C++/STL/Boost do the vectorized calculation as those in Matlab?

I don't know what Boost has in the field of matrix & vector
computations, but standard C++ does not have anything even remotely
resembling the capabilities of Matlab.

The closest you can get with standard C++ is to use std::valarray<>,
which was intended to facilitate computations that can potentially be
executed in parallel.
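
As a rough sketch of what that could look like for the first few lines of the original code: the function name `compute_t13` and its parameter list are made up for illustration and are not part of the original program. `std::valarray` supplies the element-wise `*`, `+` and `/` operators, so the body reads much like the Matlab version. Whether the compiler actually runs anything in parallel is up to the implementation.

```cpp
#include <valarray>

// Hypothetical helper (not from the original post): evaluates
// t13 = 1 / (u*u * mu1^2 + kappa^2) for a whole batch of u values.
std::valarray<double> compute_t13(const std::valarray<double>& u,
                                  double mu1, double kappa)
{
    const double t7 = mu1 * mu1;      // scalar, computed once
    const double t11 = kappa * kappa; // scalar, computed once

    // Element-wise product, like Matlab's u .* u
    const std::valarray<double> t9 = u * u;

    // valarray-by-scalar multiply and add are also element-wise
    const std::valarray<double> denom = t9 * t7 + t11;

    // scalar divided by valarray, like Matlab's 1 ./ denom
    std::valarray<double> t13 = 1.0 / denom;
    return t13;
}
```

Calling it with `std::valarray<double> u = {0.0, 1.0, 2.0};` and `mu1 = kappa = 1.0` gives one result per element of `u`, so the function is called once per batch rather than once per value.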

Bart v Ingen Schenau
--
a.c.l.l.c-c++ FAQ: http://www.comeaucomputing.com/learn/faq
c.l.c FAQ: http://c-faq.com/
c.l.c++ FAQ: http://www.parashift.com/c++-faq-lite/

Jul 22 '08 #5

 On 22 Jul, 18:37, Lionel B

 On 23 Jul, 11:26, Lionel B

 On 22 Jul., 16:08, Luna Moon

 On Jul 22, 10:08 pm, Luna Moon

 Luna Moon wrote:

If vectorized computation is not facilitated, then I have to call this
function millions of times. But if vectorized computation is okay, then
I can send in just a u vector with a batch of elements at a time. I have
a lot of such Matlab code that needs to be converted into C++ with
vectorization. Any thoughts? Thank you!

double t5, t6, t7, t9, t11, t13, t16, t20, t23, t27, t32, t34, t36,
t37, t38, t42,
t44, t47, t48, t51, t52, t54, t59, t60, t61, t66, t67, t69, t74, t75,
t76, t81,
t82, t84, t87, t105, t106, t110, t112;
t5 = exp(-t * l1 - t * l2 - t * l3);
t6 = t * u;
t7 = mu1 * mu1;
t9 = u * u;
t11 = kappa * kappa;
t13 = 0.1e1 / (t9 * t7 + t11);

This is out of my element, but I have written some vector code in GCC,
just for my own edification. I'm not sure how you would translate the
above code... I don't exactly see the vectors, unless I'm
misinterpreting. Anyhow:

http://gcc.gnu.org/onlinedocs/gcc-4....tor-Extensions

It's exceedingly simple. ICC and MSC support similar, though
incompatible, syntax. With GCC, make sure to manually specify -march,
otherwise the code generator won't have access to SSE instructions (or
whatever your platform has).

The biggest initial syntax gotcha I encountered was initializing the
vector; the vector can be treated like an array, IIRC, but I
encountered some hangups. Second, GCC has to load all the SSE
registers, and other sorcery I wasn't acquainted with. It just
seems like there'd be lots of headaches keeping the pipeline chugging
along, depending on your data set, and where it comes from.

Also, I have no idea if the syntax carries over to C++. And perhaps
GCC can already accomplish similar optimizations with valarrays.
Either way, you'll definitely want to use the latest GCC 4.3 version,
which AFAIK is at the moment king of the hill regarding
auto-vectorization.
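
For what it's worth, here is a minimal sketch of the vector extension applied to the t13 line of the original code. The names `v2d` and `compute_t13_x2` are invented for this example. The `vector_size` attribute is a GCC extension (Clang accepts it as well, as far as I know), so this is not portable standard C++, and the two-double width is just an assumption that matches a 16-byte SSE register.

```cpp
// GCC/Clang extension: a vector of two doubles (16 bytes).
// Not standard C++; other compilers use different syntax.
typedef double v2d __attribute__((vector_size(16)));

// Hypothetical helper: t13 = 1 / (u*u * mu1^2 + kappa^2)
// evaluated for two u values at once.
v2d compute_t13_x2(v2d u, double mu1, double kappa)
{
    const double t7 = mu1 * mu1;
    const double t11 = kappa * kappa;

    // Broadcast the scalars into vectors by hand.
    const v2d t7v = {t7, t7};
    const v2d t11v = {t11, t11};
    const v2d one = {1.0, 1.0};

    // The arithmetic operators work element-wise on vector types.
    const v2d t9 = u * u;
    return one / (t9 * t7v + t11v);
}
```

A vector is initialized with braces, e.g. `v2d u = {0.0, 2.0};`, and the individual results can be read back with array-style indexing (`r[0]`, `r[1]`). A real batch would still need a loop over the data in chunks of two, plus handling for a leftover element.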

Jul 27 '08 #10

 In article <6d**********************************@x41g2000hsb.googlegroups.com>,
Rune Allnor wrote:

On 6 Aug, 14:54, Giovanni Gherdovich wrote:

I'm aware of Matlab "vectorization" techniques; I use them to avoid
for-loops.

That's a *matlab* problem. 'Vectorization' is a concept exclusive to
matlab, which historically was caused by what I consider to be bugs in
the matlab interpreter.

To describe this property of matlab as "buggy" is inordinately harsh.
It's an inherent property of interpreters: If you're interpreting a
loop, you have to look at the loop condition code, and the loop
bookkeeping code, and the code inside the loop, every time through.
Unless you go out of your way to make this fast, you end up having to
do a lookup-decode-process for each of those steps. Compiling to
native code lets you do the lookup-decode at compile time, and for
typical loops only generates a few machine-code instructions for the
loop bookkeeping and condition checking, which substantially reduces
the total amount of work the processor is doing. But making an
interpreter clever enough to do interpreted loops that fast is a Much
Harder Problem.

(So, the answer to the OP's question is (as already noted): Don't
worry about vectorizing, write loops and ask the compiler to optimize
them, and you'll probably come close enough to Matlab's performance that
you won't be able to tell the difference.)
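
A sketch of that approach for the first few lines of the OP's code, under some assumptions: the `Batch` struct, the `compute_batch` name and the parameter list are invented here, and only t5, t6, t9 and t13 are shown. The scalars are computed once outside the loop, and the per-u work is a plain loop that an optimizing compiler is free to vectorize.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for the results of one batch.
struct Batch {
    double t5;                    // depends only on scalars
    std::vector<double> t6, t9, t13; // one entry per element of u
};

// Hypothetical batch version of the original straight-line code.
Batch compute_batch(const std::vector<double>& u, double t,
                    double l1, double l2, double l3,
                    double mu1, double kappa)
{
    const std::size_t n = u.size();

    Batch out;
    out.t5 = std::exp(-t * l1 - t * l2 - t * l3);
    out.t6.resize(n);
    out.t9.resize(n);
    out.t13.resize(n);

    // Scalars hoisted out of the loop.
    const double t7 = mu1 * mu1;
    const double t11 = kappa * kappa;

    // One pass over the batch instead of millions of function calls.
    for (std::size_t i = 0; i < n; ++i) {
        out.t6[i] = t * u[i];
        out.t9[i] = u[i] * u[i];
        out.t13[i] = 1.0 / (out.t9[i] * t7 + t11);
    }
    return out;
}
```

Compiled with optimization turned on (for example `g++ -O3 -march=native`), a loop like this is the kind of code the compiler's auto-vectorizer is meant to handle, though it is worth checking the generated code or timing it rather than assuming.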

Since Matlab is targeting numerical work with large arrays anyways,
there's not much benefit to speeding up this part of the interpreter;
if the program is spending most of its time inside the large-matrix
code (which is compiled to native code, aggressively optimized by the
compiler, and probably hand-tuned for speed), then speeding up the
interpreter's handling of the loop won't gain you any noticeable
speedup anyways. If you're writing loopy code to do things Matlab has
primitives for, you're probably better off vectorizing it anyways,
since that will make it both clearer and faster. So (unlike with
general-purpose interpreted languages that don't have primitives that
replace common loop idioms) there's no real benefit to speeding up the
Matlab interpreter's loop handling, and there are obvious costs
(development time, increased complexity, more potential for bugs), so
there are good reasons not to bother.

If you do have code that doesn't fit Matlab's vectorization model, you
can always write it in C or Fortran and wrap it up in a Matlab FFI
wrapper; Matlab's FFI is not hard to use on the compiled-to-native-code
side, and looks exactly like a Matlab function on the Matlab code side,
so it's almost always the Right Tool For The Job in that case. (At my
day job, I've been asked to do this for the Matlab programmers a few
times, and for hard-to-vectorize loopy code getting a speedup of two or
three orders of magnitude just by doing a reasonably direct translation
into C and compiling to native code with an optimizing compiler is
pretty much expected.)

dave
--
Dave Vandervies dj3vande at eskimo dot com
Erm... wouldn't clock(), used with Bill Godfrey's follow-up, ignoring my
follow-up to him (as suggested in your follow-up to me), do the trick
quite nicely? --Joona I Palaste in comp.lang.c

Aug 6 '08 #11 