473,586 Members | 2,839 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

how to speed up the matrix multiplication in C

Hi all,

Is there any method which can implememt the matrix multiplication
faster than using the formula as we often do by hand?
I am writing the following code and my matrice: one is 3x40000 and the
other one 40000x3.
the program runs too slow:

/*multiply two matrice: xyz_trans * xyz , the output is w 3x3*/

for (i=0;i<3;i++)
{
for (j=0;j<3;j++)
{
w[i][j]=0;
for (k=0;k<40000;k+ +)
{
w[i][j]+=xyz_trans[i][k]*xyz[k][j];
}

}
}

What is your code when you multiply 2 big matrice? Is there any faster
algorithm? I see MATLAB handles this task so quick.

Thanks

Mar 16 '07 #1
7 7573
Vi*******@gmail .com wrote:
Is there any method which can implememt the matrix multiplication
faster than using the formula as we often do by hand?
Yes, but AFAIK its rather hairy and only useful for large (no,
larger than that) matricies. IMBW.
I am writing the following code and my matrice: one is 3x40000 and the
other one 40000x3.
the program runs too slow:

/*multiply two matrice: xyz_trans * xyz , the output is w 3x3*/

for (i=0;i<3;i++)
{
for (j=0;j<3;j++)
{
w[i][j]=0;
for (k=0;k<40000;k+ +)
{
w[i][j]+=xyz_trans[i][k]*xyz[k][j];
}

}
}
How slow is "too slow"? It's pretty much instant on the-here machine.
(The inner loop is only executed 360000 times. I'm assuming that
the arrays are float/double.) But if you /really/ want more speed:

Um. Well. You're not exactly /helping/ the compiler. For example, the
expression `w[i][j]` is evaluated /every time round the inner loop/.
So is the expression `xyz_trans[i]`. You can have a temp for w[i][j],
and you can declare a local for xyz_trans[i].

--
Chris "electric hedgehog" Dollin
"What I don't understand is this ..." Trevor Chaplin, /The Beiderbeck Affair/

Mar 16 '07 #2
I am writing the following code and my matrice: one is 3x40000 and the
other one 40000x3.
the program runs too slow:

/*multiply two matrice: xyz_trans * xyz , the output is w 3x3*/

for (i=0;i<3;i++)
{
for (j=0;j<3;j++)
{
w[i][j]=0;
for (k=0;k<40000;k+ +)
{
w[i][j]+=xyz_trans[i][k]*xyz[k][j];
}

}
}

What is your code when you multiply 2 big matrice? Is there any faster
algorithm? I see MATLAB handles this task so quick.
I'm going to risk blowing smoke (i.e., I haven't tested what I'm
saying so take it with a huge grain of salt). With that in mind, here
are some ideas, which you can try and see if they help.

1) Split into two parts (as the other poster mentioned), one for
initialization and one for multiplication.
2) In the second part, switch the order of k and j. This might
potentially increase your cache hit rate, which can have a significant
speed increase. (Because in this case you'd always be accessing
consecutive memory locations, i.e., only changing the last index.)

The resulting code would be:

for (i=0;i<3;i++) {
for (j=0;j<3;j++) {
w[i][j]=0;
}
}
for (i=0;i<3;i++) {
for (k=0;k<40000;k+ +) {
for (j=0;j<3;j++) {
w[i][j] += xyz_trans[i][k] * xyz[k][j];
}
}
}

3) In the off chance you haven't already, set your compiler's flags to
maximum optimization.
4) Maybe introduce temporary variables for some things (hopefully
compiler optimization would catch those). I probably wouldn't do
this, but the code would look something like this (second part only):
for (i=0;i<3;i++) {
double* wp = w[i];
for (k=0;k<40000;k+ +) {
double* xyzp = xyz[k];
double xyz_trans_val = xyz_trans[i][k];
for (j=0;j<3;j++) {
wp[j] += xyz_trans_val * xyzp[j];
}
}
}
As I said, I wouldn't do it because I expect my compiler to do this
kind of thing for me. If I were really desparate, I might look at the
generated assembly to figure out if my compiler is actually doing
that.

Finally, as mentioned, I haven't tested this, so it might actually
make the code slower.

And it looks like xyz_trans might be the same as xyz transposed, in
which case you could rewrite the whole thing with only xyz and without
creating a separate array.

Michael

P.S. Matrix multiplication as written is N^3 if you multiply NxN and
NxN matrices. There's a better algorithm that's N^2.5 or something
like that, but since one of your dimensions is 3, it's not likely to
help much (and it is a significant pain to implement).

Mar 16 '07 #3
On Mar 16, 9:37 am, VijaKh...@gmail .com wrote:
Hi all,

Is there any method which can implememt the matrix multiplication
faster than using the formula as we often do by hand?
I am writing the following code and my matrice: one is 3x40000 and the
other one 40000x3.
the program runs too slow:

/*multiply two matrice: xyz_trans * xyz , the output is w 3x3*/

for (i=0;i<3;i++)
{
for (j=0;j<3;j++)
{
w[i][j]=0;
for (k=0;k<40000;k+ +)
{
w[i][j]+=xyz_trans[i][k]*xyz[k][j];
}

}
}

What is your code when you multiply 2 big matrice? Is there any faster
algorithm? I see MATLAB handles this task so quick.

Thanks
I posted a C Strassen matrix multiply in this thread:
http://groups.google.com/group/comp....be9b87e8427d4b

There is a link to a C++ version in this thread:
http://groups.google.com/group/comp....54b01120cc246b

Mar 16 '07 #4
On 16 Mar 2007 09:37:36 -0700, Vi*******@gmail .com wrote:
>Hi all,

Is there any method which can implememt the matrix multiplication
faster than using the formula as we often do by hand?
I am writing the following code and my matrice: one is 3x40000 and the
other one 40000x3.
the program runs too slow:

/*multiply two matrice: xyz_trans * xyz , the output is w 3x3*/

for (i=0;i<3;i++)
{
for (j=0;j<3;j++)
{
w[i][j]=0;
for (k=0;k<40000;k+ +)
{
w[i][j]+=xyz_trans[i][k]*xyz[k][j];
}

}
}

What is your code when you multiply 2 big matrice? Is there any faster
algorithm? I see MATLAB handles this task so quick.
Your problem is that you have lots of caches faults and perhaps even
page fault because the stride in one of the terms in the dot product
isn't equal to one. This being C code, the stride in xyz[k][j] is 3
rather than 1. This is a well known issue in matrix multiplication. If
feasible the solution is to transpose the matrix causing trouble first.

BTW this really isn't a C problem. Comp.programmin g or
sci.math.num-analysis might be more appropriate groups.

Mar 16 '07 #5
On Mar 16, 4:37 pm, VijaKh...@gmail .com wrote:
Hi all,

Is there any method which can implememt the matrix multiplication
faster than using the formula as we often do by hand?
I am writing the following code and my matrice: one is 3x40000 and the
other one 40000x3.
the program runs too slow:

/*multiply two matrice: xyz_trans * xyz , the output is w 3x3*/

for (i=0;i<3;i++)
{
for (j=0;j<3;j++)
{
w[i][j]=0;
for (k=0;k<40000;k+ +)
{
w[i][j]+=xyz_trans[i][k]*xyz[k][j];
}

}
}

What is your code when you multiply 2 big matrice? Is there any faster
algorithm? I see MATLAB handles this task so quick.

Thanks
Your post is missing some important things: How exactly are i, j, k,
xyz, xyz_trans and w declared? How fast or slow exactly is the code,
on which processor? That information is needed to decide if there is
something significantly wrong with your code, or whether this is just
a bit slow because your code is quite clumsy, or whether this code is
actually as good as you could expect. I suggest you replace your code
with (I left out some declarations because you left out stuff that I
would need to know)
for (i=0;i<3;i++)
{
s0 = s1 = s2 = 0;
for (k=0;k<40000;k+ +)
{
t = xyz_trans[i][k];
p = &xyz[k][0];
s0 += t * p [0]; s1 _= t * p [1]; s2 += t * p [2];
}
w [i][0] = s0; w [i][1] = s1; w [i][2] = s2;
}
Measure the times and post it here, plus post the _complete_ source,
then we can have another look. About a millisecond on a modern desktop
computer seems reasonable.
Mar 17 '07 #6
In article <11************ *********@l75g2 000hse.googlegr oups.comVi*******@gmail .com writes:
Is there any method which can implememt the matrix multiplication
faster than using the formula as we often do by hand?
I am writing the following code and my matrice: one is 3x40000 and the
other one 40000x3.
the program runs too slow:

/*multiply two matrice: xyz_trans * xyz , the output is w 3x3*/
for (i=0;i<3;i++)
{
for (j=0;j<3;j++)
{
w[i][j]=0;
for (k=0;k<40000;k+ +)
{
w[i][j]+=xyz_trans[i][k]*xyz[k][j];
}

}
}
It depends on the processor, and I think also on caching. But you might
try the following:
for(i = 0; i < 3; i++) for(j = 0; j < 3; j++) w[i][j] = 0;
for(k = 0; k < 40000; k++) {
for(i = 0; i < 3; i++) {
xyzt = xyz_trans[i][k];
for(j = 0; j < 3; j++) {
w[i][j] += xyzt * xyz[k][j];
}
}
}

Under some circumstances it works better.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Mar 17 '07 #7
In article <11************ **********@y66g 2000hsf.googleg roups.com"Micha el" <mc******@aol.c omwrites:
....
P.S. Matrix multiplication as written is N^3 if you multiply NxN and
NxN matrices. There's a better algorithm that's N^2.5 or something
like that, but since one of your dimensions is 3, it's not likely to
help much (and it is a significant pain to implement).
It will not help at all. In order to benefit from it the matrices need
a reasonable size in all dimensions, otherwise it will be slower.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Mar 17 '07 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

15
13256
by: christopher diggins | last post by:
Here is some code I wrote for Matrix multiplication for arbitrary dimensionality known at compile-time. I am curious how practical it is. For instance, is it common to know the dimensionality of matricies at compile-time? Any help would be appreciated. Hopefully this code comes in useful for someone, let me know if you find it useful, or if...
3
12887
by: robix | last post by:
Hi again. I'm now asking your help because of a smal problem i'm getting with my multiplication matrix code. I'll try to give you as many details as possible. My matrix structure: typedef struct { int col; /* number of colowns */ int lin; /* number of lines*/
14
4933
by: amitnanda | last post by:
Hi Guys, I have a matrix multiplication program in C that multiplies two matrices. When their size is 3*3 or 800*800, the program runs fine. But above that size, I get a "segmentation fault". I need this huge size as part of my assignment.
20
5205
by: Frank-O | last post by:
Hi , Recently I have been commited to the task of "translating" some complex statistical algorithms from Matlab to C++. The goal is to be three times as fast as matlab ( the latest) . I've used various techniques ( loop unrolling, loop jamming...) and tried some matrix libraries : newmat (slow for large matrix) , STL (fast but ..not...
0
4196
by: lituncse | last post by:
dear friends, i have come across a problem which is difficult to solve for me.it's about starssen's matrix multiplication.in general matrix multiplication we need 8 multiplications and 4 additions but for the starssen's matrix multiplication we need 7 multiplication and 14 additions which is very important from time complexity point of...
6
16428
by: amitsoni.1984 | last post by:
Hi, Is there any direct function for matrix multiplication in Python or any of its packages? or do we have to multiply element by element? Thank you, Amit
3
3875
by: crazygrey | last post by:
Hello, I'm a newbie to C++ so excuse me if my question was trivial but it is important to me. I'm implementing a simple code to find the forward kinematics of a robot: #include "stdafx.h" #include<iostream> #include<iomanip> #include<fstream> #include"math.h"
1
9142
by: Sozos | last post by:
Hi guys. I have a problem with writing the base case for the following matrix multiplication function I have implemented. Please help. #define index(i,j,power) (((i)<<(power))+(j)) void recMultiply(int i, int j, float a, int k, int l, float b, int x, int y, float c, int s); int i, j, k, s, matrixsize, blocksize, jj, kk, power, bsize;...
8
7877
by: joegao1 | last post by:
can some one give me a hint? I want to program the code for matrix multiplication with as less arithmetical / multiplication operations as possible. my task is to calculate the matrix multiplication A*B*A' , where - A' refers to the transpose of matrix A - matrix B is symmetric matrix. - (optional) it will be much better if the...
0
7911
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main...
0
7839
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language...
0
8200
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. ...
1
7954
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For...
1
5710
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes...
0
5390
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert...
0
3836
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in...
0
3864
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
2345
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.