In article <11*********************@f14g2000cwb.googlegroups. com>,
Tiza Naziri <wa****@yahoo.com> wrote:
How to represent this matrix equation in C code:
[ 8 x 8 matrix multiplication shown ]
Depends -- do you need to do the calculation symbolically
or numerically? Are b0 thru b7 integers or floating point?
Are the matrix coefficients always going to be 0's and 1's with
no other possible values? Are you always going to be doing 8x8
or will the size be variable? Is the maximum size and density
enough that sparse representations seem like a good idea?
Are you solving for a[*] (matrix multiplication) or b[*] (matrix
division)? Is stability of the matrix inverse a significant concern?
Is there a good reason to impliment the manipulation yourself instead
of using one of the well-established libraries such as BLAS
or SCSL?
One hint: If you are doing the math yourself, then watch out for
the memory layout that C uses for multidimensional arrays, which
is different than the memory layout traditionally used by Fortran.
Pay attention to the memory layout when you are chosing which
index to loop over most quickly.
And if you get into very big matrices that might not fit in primary
cache, and you are implimenting by yourself, read up on matrix blocking
techniques.
Another hint: especially when your matrices have 2^N elements, watch
out for cache line aliasing -- if the array positions in memory are
seperated by a multiple of the cache line size, then something
seemingly simple such as
X[i][J] = Y[i][J];
might require a cache load to fetch the value from Y, and then might
require that the -same- cache line be dumped and loaded with the
corresponding position in X in order to do the store; whereas if
the starting positions of the arrays are carefully positioned to be
on different cache lines, then a cache-line's worth of data from Y
could be transfered to X before having to load cache again.
[Having a large primary cache doesn't help if the program's memory
access keeps thrashing over cache-line access...]
Anything to do with caches is outside the scope of the C89 standard.
Generally speaking pre-written libraries such as BLAS have been written
to take such matters into account, and so can sometimes end up being orders
of magnitude faster than naive code. It doesn't usually pay to
redevelop the mistakes of the past: it's usually best to find a
reputable matrix library that already does what you need.
A reference book for such matters: "Numerical Algorithms in C".
--
Any sufficiently old bug becomes a feature.