P: n/a

Hi,
I've developed a C program which contains a large number of vectors
and matrices operations. Throughout my code, I used the template from
the Numerical Recipes book to define vectors and matrices and access
their elements. For example, to define a vector I used the function:
my_vector=vector(0,n1);
Which actually allocate the memory as follows:
my_vector = (float *) malloc ( n*sizeof(float) );
and then to access its elements, I used my_vector[0],.......,
my_vector[n]
The execution time of the program is extremely important for me and
it should be made as short as possible. I was wondering if the method
that I used for vector and matrix notation in my code is the most
efficient one. Does anybody know the best method (in terms of
efficiency) to define matrices and perform operations on their
elements in C? Even a small speedup in my code would be valuable for
me.
Thanks for your help  
Share this Question
P: n/a

"Babak" <b.*******@gmail.comwrote in message
news:e7**********************************@l28g2000 prd.googlegroups.com...
Hi,
I've developed a C program which contains a large number of vectors
and matrices operations. Throughout my code, I used the template from
the Numerical Recipes book to define vectors and matrices and access
their elements. For example, to define a vector I used the function:
my_vector=vector(0,n1);
Which actually allocate the memory as follows:
my_vector = (float *) malloc ( n*sizeof(float) );
and then to access its elements, I used my_vector[0],.......,
my_vector[n]
The execution time of the program is extremely important for me and
it should be made as short as possible. I was wondering if the method
that I used for vector and matrix notation in my code is the most
efficient one. Does anybody know the best method (in terms of
efficiency) to define matrices and perform operations on their
elements in C? Even a small speedup in my code would be valuable for
me.
That looks pretty efficient to me, assuming my_vector is a pointer to float.
What does the matrix code look like?

Bartc  
P: n/a

On May 24, 6:28*pm, "Bartc" <b...@freeuk.comwrote:
"Babak" <b.asgh...@gmail.comwrote in message
news:e7**********************************@l28g2000 prd.googlegroups.com...
Hi,
I've developed a C program which contains a large number of vectors
and matrices operations. Throughout my code, I used the template from
the Numerical Recipes book to define vectors and matrices and access
their elements. For example, to define a vector I used the function:
my_vector=vector(0,n1);
Which actually allocate the memory as follows:
my_vector = (float *) malloc ( n*sizeof(float) );
and then to access its elements, I used my_vector[0],.......,
my_vector[n]
The execution time of the program is extremely important for me and
it should be made as short as possible. I was wondering if the method
that I used for vector and matrix notation in my code is the most
efficient one. Does anybody know the best method (in terms of
efficiency) to define matrices and perform operations on their
elements in C? Even a small speedup in my code would be valuable for
me.
That looks pretty efficient to me, assuming my_vector is a pointer to float.
What does the matrix code look like?

Bartc Hide quoted text 
 Show quoted text 
The matrix code is like:
my_matrix=matrix(0,n1,0,n1);
which is equivalent to:
/* allocate pointers to rows */
my_matrix=(float **) malloc(n*sizeof(float*));
/* allocate rows and set pointers to them */
my_matrix[i]=(float *) malloc(n*sizeof(float)); i=0,...,n1
and I access its elements like my_matrix[0][0],...........
I'm not sure if working with pointers instead of array indices will
make any difference in the speed of the code.  
P: n/a

Babak wrote:
On May 24, 6:28 pm, "Bartc" <b...@freeuk.comwrote:
>"Babak" <b.asgh...@gmail.comwrote in message news:e7**********************************@l28g200 0prd.googlegroups.com...
>>I've developed a C program which contains a large number of vectors and matrices operations. Throughout my code, I used the template from the Numerical Recipes book to define vectors and matrices and access their elements. For example, to define a vector I used the function:
>>my_vector=vector(0,n1);
>>Which actually allocate the memory as follows:
>>my_vector = (float *) malloc ( n*sizeof(float) );
>>and then to access its elements, I used my_vector[0],......., my_vector[n]
>>The execution time of the program is extremely important for me and it should be made as short as possible. I was wondering if the method that I used for vector and matrix notation in my code is the most efficient one. Does anybody know the best method (in terms of efficiency) to define matrices and perform operations on their elements in C? Even a small speedup in my code would be valuable for me.
That looks pretty efficient to me, assuming my_vector is a pointer to float.
What does the matrix code look like?
The matrix code is like:
my_matrix=matrix(0,n1,0,n1);
which is equivalent to:
/* allocate pointers to rows */
my_matrix=(float **) malloc(n*sizeof(float*));
/* allocate rows and set pointers to them */
my_matrix[i]=(float *) malloc(n*sizeof(float)); i=0,...,n1
and I access its elements like my_matrix[0][0],...........
That looks fairly standard too. Your compiler will take care of lowlevel
optimisation such as converting between indexing and pointer operations.
What sort of speedup are you looking for? Have you actually measured the
execution time yet?
The math(s) calculations in your code are lilely to have a much bigger
impact on speed than the mode of array access.

Bartc  
P: n/a

Babak <b.*******@gmail.comwrote:
I'm not sure if working with pointers instead of array indices will
make any difference in the speed of the code.
There's no difference at all between using
a[ i ]
and
*( a + i )
The index notation is just a bit easier to read for humans, but
the compiler will rewrite index to pointer notattion in a first
step. That's why you can even write 'i[a]' instead of 'a[i]',
both get translated to '*(a+i)'.
Regards, Jens

\ Jens Thoms Toerring ___ jt@toerring.de
\__________________________ http://toerring.de  
P: n/a

Babak <b.*******@gmail.comwrites:
On May 24, 6:28Â*pm, "Bartc" <b...@freeuk.comwrote:
>"Babak" <b.asgh...@gmail.comwrote in message news:e7**********************************@l28g200 0prd.googlegroups.com...
<snip>
I've developed a C program which contains a large number of vectors
and matrices operations.
<snip>
>What does the matrix code look like?
 Bartc Hide quoted text 
 Show quoted text 
When replying through the Google interface, it helps if you remove
this sort of thing.
The matrix code is like:
my_matrix=matrix(0,n1,0,n1);
which is equivalent to:
/* allocate pointers to rows */
my_matrix=(float **) malloc(n*sizeof(float*));
/* allocate rows and set pointers to them */
my_matrix[i]=(float *) malloc(n*sizeof(float)); i=0,...,n1
and I access its elements like my_matrix[0][0],...........
I'm not sure if working with pointers instead of array indices will
make any difference in the speed of the code.
Bartc already said this but I will repeat it: measure. In particular
profile the code. There is no point in doing anything to the array
parts if it is, say, some trigonometric functions that are taking the
time.
Some general observations: (1) This array of pointers method can be
either faster or slower than the direct 2D array method. It all
depends on the sizes, the access pattern, and the processor. (2)
double can be faster than float. (3) If your array sizes are
compiletime constants it may pay to declare the arrays rather than
allocating them. (4) If the sizes are not constant, C99 can still let
you declare the arrays. You need a compiler that does variable
length arrays (and you need to not mind loosing some portability).
The problem here is that you need a program to measure to find out what
is and is not costly, but you want to write using the fastest method
(for your type of problem) from the start. It might help to find
someone who has done something similar and has measured the
performance of different techniques on a similar system.

Ben.  
P: n/a

On Sun, 25 May 2008 11:55:05 +0000, Jens Thoms Toerring wrote:
Babak <b.*******@gmail.comwrote:
> I'm not sure if working with pointers instead of array indices will make any difference in the speed of the code.
There's no difference at all between using
a[ i ]
and
*( a + i )
Correct, as far as the language is concerned, but...
The index notation is just a bit easier to read for humans, but the
compiler will rewrite index to pointer notattion in a first step.
....at least one common compiler (and probably more) does not simply
translate [] to * and +, or vice versa. It's not required; all that
matters is that you, the programmer, can freely mix them.
Whether the compiler translates should not normally be visible to its
users, but in this case a minor bug exists that affects [], but not * and
+.  
P: n/a

Babak wrote:
Hi,
I've developed a C program which contains a large number of vectors
and matrices operations. Throughout my code, I used the template from
the Numerical Recipes book to define vectors and matrices and access
their elements. For example, to define a vector I used the function:
my_vector=vector(0,n1);
Which actually allocate the memory as follows:
my_vector = (float *) malloc ( n*sizeof(float) );
and then to access its elements, I used my_vector[0],.......,
my_vector[n]
Did you really go all the way to element [n], inclusive?
That's unfortunate, because no memory has been allocated for
that element: You've requested n elements altogether, and the
sequence 0,1,...,n1,n has n+1 values.
The execution time of the program is extremely important for me and
it should be made as short as possible. I was wondering if the method
that I used for vector and matrix notation in my code is the most
efficient one. Does anybody know the best method (in terms of
efficiency) to define matrices and perform operations on their
elements in C? Even a small speedup in my code would be valuable for
me.
What you've shown in this posting is likely to be as fast
as anything else. Or to turn it around, there's no reason to
expect it to be any slower. However, I see from the followup
messages that what you've shown here is *not* actually what
you're trying to do, which is to use a *two*dimensional array.
There are three main approaches:
1) If the row and column count are known at compile time,
just declare the array as matrix[nrows][ncols] and be done
with it. (Variation: use matrix[ncols][nrows] instead; if the
operations you're interested in are heavily roworiented or
columnoriented, interchanging subscripts *might* make a
difference. This should be investigated for the other
approaches, too.)
2) Allocate a onedimensional array of pointers to rows
(columns), and for each of them allocate a onedimensional
row (column). I gather this is the arrangement the macros
you're now using will produce.
3) Allocate one big nrows*ncols block of memory, and do
the indexing "by hand:" matrix[i][j] <==block[i*ncols+j]
(or block[i+j*nrows]). A macro can might make the index
calculation more readable; be lavish with parentheses. The
multiplications may look scarily slow, but the compiler will
probably find ways to perform "strength reduction" on them.
Which of these three (six) will be fastest on your machine
with your compiler and your algorithms? Would it be worth while
to make the rows (columns) a little longer than necessary and
use just the first ncols (nrows) positions, in hopes of getting
better behavior from memory caches? The only way to tell for
sure is to measure, measure, measure.

Eric Sosman es*****@ieeedotorg.invalid  
P: n/a

On Sat, May 24, 2008 at 06:39:32PM 0700, Babak wrote:
>
The matrix code is like:
my_matrix=matrix(0,n1,0,n1);
which is equivalent to:
/* allocate pointers to rows */
my_matrix=(float **) malloc(n*sizeof(float*));
/* allocate rows and set pointers to them */
my_matrix[i]=(float *) malloc(n*sizeof(float)); i=0,...,n1
I think the numerical recipes matrix routine does _not_ work like this.
my_matrix has to be declared as float**, then it rather does:
int i;
my_matrix= malloc(n*sizeof(float*)); /* as you wrote */
my_matrix[0]=malloc(n*n*sizeof(float*)); /* as you wrote */
for(i=1;i<n;i++) my_matrix[i]=my_matrix[i1]+n;
So there is only one chunk of memory.
In this sense the layout of data the same as in the array
float my_matrix[n][n];
There is though an important diffierence between a 2d array and float**,
that is the type. You cannot pass my_matrix, declared as an array to
the other numerical recipes routines, since they all expect a float **.
So unless you rewrite them (with variadic arguments in the c99 sense)
you have no choice. If my_matrix is float**, then
my_matrix[i][j] involves three memory operations (my_matrix,*(my_matrix+i),
*(*(my_matrix+i)+j), whereas if my_matrix is a 2d array, stored on the
stack (let's assume an ordinary system), the address of my_matrix is
likely to be in a register. So there will be only one memory read and
an integer multiplication (the array size is likely to be in a register, too).
But all these you'll find in the generated assembly code, which you might
have a look at, when you optimise.
An other advantage of my_matrix being an array, is that there is no need
for frequent calls to malloc() and free().
Finally some tips that usually work for my numerical problems:
1) Carefully apply the restrict and const qualifier to the pointers in the
function arguments. (Wherever they apply)
2) If you can afford, use array sizes that are known at compile time.
You define enum constants for the array sizes and recompile for each size.
In some cases numerical programs run for years for a given array size and
compile in a minute.
3) Be bold to pass (fixed size) multidimensional arrays to your functions.
4) Numerical recipes is not the best choice for speed
5) Prefer array sizes that are powers of two
.... there are some more hints that one can tell if the actual numerical
task is known.
I'm not sure if working with pointers instead of array indices will
make any difference in the speed of the code.
There are cases when it does, but I am not sure if it is the case for you.
Remember, the compiler can be very very clever. Sometimes a little fiddling
with the qualifiers will open new dimensions for the compiler's optimisation.
Szabolcs  
P: n/a
 jt@toerring.de (Jens Thoms Toerring) writes:
Babak <b.*******@gmail.comwrote:
> I'm not sure if working with pointers instead of array indices will make any difference in the speed of the code.
There's no difference at all between using
a[ i ]
and
*( a + i )
The index notation is just a bit easier to read for humans, but
the compiler will rewrite index to pointer notattion in a first
step. That's why you can even write 'i[a]' instead of 'a[i]',
both get translated to '*(a+i)'.
Right, but there is a difference, at least potentially, between a[i]
and *p.
For example, consider the two equivalent loops in this program:
#include <stdio.h>
int main(void)
{
#define LEN 4
double arr[LEN] = { 1.2, 3.4, 5.6, 7.8 };
int i;
double *p;
for (i = 0; i < LEN; i ++) {
printf("%g ", arr[i]);
}
putchar('\n');
for (p = arr; p < arr+LEN; p ++) {
printf("%g ", *p);
}
putchar('\n');
return 0;
}
In the first, you're implicitly doing a multiplication and an addition
to compute arr[i]. In the second, you're just doing a dereference;
the more expensive multiplication has been "strengthreduced" into
repeated addtion in "p ++".
*But* this strengthreduction optimization is something that modern
compilers are often able to do for you. In the old days, it made
sense to use the pointer form because it could be substantially
faster. Today, such microoptimizations are just as likely to
inferfere with the optimizer and give you worse code.

Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
 Antony Jay and Jonathan Lynn, "Yes Minister"  
P: n/a

On May 25, 8:23*am, Babak <b.asgh...@gmail.comwrote:
Throughout my code, I used the template from
the Numerical Recipes book to define vectors and matrices ...
...
*The execution time of the program is extremely important
... Even a small speedup in my code would be valuable
No responder, except perhaps Eric, emphasized what I think
is a key point. Assuming, for example, that you will take
both left and rightproduct of a matrix
A x B .... and later .... B x C
then you *definitely* do *not* want the pointers
that, IIRC, Numerical Recipes will give you.
Note that, because of the *beautiful* way C combines
the treatments of arrays and pointers, you will probably
be able to use most of your code as is, even after you
change the declarations and allocations to use 2D
arrays. To use such declarations (all but one of) the
array dimensions must be known at compile time, but
this is how you'd want to do it if maximum speed is
your goal. If instead, you know only a maximum
dimension, there'll be little problem: use the maximum
for allocation/definition and the smaller actual
dimension for operations.
IIRC, Recipes uses indexes of 1 to N, rather than
the more usual (in C) range of 0 to N1.
Be sure you clear up any confusion from this.
James Dow Allen   This discussion thread is closed Replies have been disabled for this discussion.   Question stats  viewed: 4800
 replies: 10
 date asked: Jun 27 '08
