speeding up C code

I have C code which computes the row sums of a matrix, divides each
element of each row by the row sum and then computes the column sums of
the resulting matrix. Is there a way I can speed up the code in C:

/* Here is the code */
// Table is "wij"

int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[i,j];
}
for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)];
}
}

Jul 23 '05 #1
dv*****@hotmail.com wrote:
I have C code which computes the row sums of a matrix, divide each
element of each row with the row sum and then compute the column sum of
the resulting matrix. Is there a way I can speed up the code in C:

/* Here is the code */
// Table is "wij"

int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[i,j];
By "wij[i,j]" you probably mean "wij[i][j]", don't you?
}
for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)];
}
}


There are some loop unrolling techniques that might help. You can also
avoid indexing all the time by using pointers and advancing those using
the built-in ++ operator or += operator.
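
For readers who have not seen the two techniques mentioned here, the following is a minimal sketch of a row sum that walks the row with a pointer and is unrolled by four. The function name row_sum_unrolled and the unroll factor are assumptions made for illustration and do not come from the original post. An optimizing compiler may well produce similar code from the plain indexed loop, and the separate accumulators change the order of the additions, so the result can differ from the simple loop in the last bits.

```c
#include <stddef.h>

/* Hypothetical helper: sums n doubles starting at row.
   Unrolled by four, advancing a pointer instead of using an index. */
double row_sum_unrolled(const double *row, size_t n)
{
    const double *p = row;
    const double *end4 = row + (n - n % 4); /* end of the full groups of four */
    const double *end = row + n;
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;

    while (p != end4) {
        s0 += p[0];
        s1 += p[1];
        s2 += p[2];
        s3 += p[3];
        p += 4;
    }
    while (p != end) {      /* remaining 0 to 3 elements */
        s0 += *p++;
    }
    return (s0 + s1) + (s2 + s3);
}
```

Whether this beats the straightforward loop depends on the compiler and the processor, so it is worth measuring before keeping it.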

Next time, please refrain from typing your code into the message and
instead use the "copy-and-paste" capability offered by all modern OSes
and applications.

V
Jul 23 '05 #2
dvumani wrote:
I have C code which computes the row sums of a matrix, divide each
element of each row with the row sum and then compute the column sum of
the resulting matrix. Is there a way I can speed up the code in C:

sum_over_j_wij[i] += wij[i,j];


Is wij[i,j] doing what you think it does?

Have you time-tested this code to see if it's slow?

Have you time-tested your entire application, to see if the speed of this
code is relevant?
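
One way to answer both questions is to time the routine in isolation first. Below is a minimal sketch using the standard clock() function. The names time_call and count_call, and the callback shape, are assumptions made for illustration. clock() reports CPU time with limited resolution, so the call is repeated and averaged.

```c
#include <time.h>

/* Hypothetical helper: runs fn(arg) `repeats` times and returns the
   average CPU time per call in seconds, as measured by clock(). */
double time_call(void (*fn)(void *), void *arg, int repeats)
{
    clock_t start, stop;
    int r;

    if (repeats <= 0)
        return 0.0;

    start = clock();
    for (r = 0; r < repeats; ++r)
        fn(arg);
    stop = clock();

    return (double)(stop - start) / CLOCKS_PER_SEC / repeats;
}

/* Example workload: counts how many times it was called.
   Replace it with a wrapper around the matrix code to be measured. */
void count_call(void *arg)
{
    ++*(int *)arg;
}
```

A profiler gives a better picture of the whole application, but a harness like this is usually enough to tell whether one loop is worth tuning.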

After all that research, switch to C++ and look up "expression
templates". You will probably find several examples with matrices.

--
Phlip
http://www.c2.com/cgi/wiki?ZeekLand

Jul 23 '05 #3
dv*****@hotmail.com wrote:
I have C code which computes the row sums of a matrix, divide each
element of each row with the row sum and then compute the column sum of
the resulting matrix. Is there a way I can speed up the code in C:

/* Here is the code */
// Table is "wij"

int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[i,j];
}
for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)];
}
}


1. If you're talking C, why are you posting to c.l.c++?

2. Your code won't compile; x[i,j] may be a syntax error

2.a. Your code might compile, but it may only provide x[j] (or x[i], I
don't remember whether the comma operator returns the right or left
operand in C).
Jul 23 '05 #4
dv*****@hotmail.com wrote:
I have C code which computes the row sums of a matrix, divide each
element of each row with the row sum and then compute the column sum of
the resulting matrix. Is there a way I can speed up the code in C:

You have some errors in your code.

Besides the error, perhaps you could see something by
expanding your loop into a series.

/* Here is the code */
// Table is "wij"

int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[i,j];

Here is the problem: wij[i,j]
The comma operator is not a subscript operator.
In this expression, it evaluates 'i', then 'j'
and returns the value of 'j'. So this is equivalent
to: wij[j]

The syntax for a multiple dimension array is:
wij[i][j]

}
for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)];

See note above about "wij[i,j]".
Syntax error: "sum_over_j_wij[i)];"
}
}
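
The point about the comma operator can be checked directly. This is a small sketch with a one-dimensional array; the function names index_with_comma and index_2d and the fixed column count of 3 are assumptions made for illustration. The (void) cast only silences the "unused value" warning that most compilers give for the discarded left operand.

```c
/* Reads row[i, j]: the comma operator evaluates i, discards it,
   and yields j, so this is row[j] regardless of i. */
int index_with_comma(const int *row, int i, int j)
{
    return row[(void)i, j];
}

/* Proper two-dimensional access for a matrix with 3 columns. */
int index_2d(int m[][3], int i, int j)
{
    return m[i][j];
}
```

Calling index_with_comma(row, 0, 2) and index_with_comma(row, 1, 2) reads the same element. If wij is a true two-dimensional array of double, wij[i,j] names the whole row wij[j] rather than a number, so a statement like sum += wij[i,j] would normally be rejected by the compiler.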


For N = 1:
sum_over_j_wij[0] = wij[0][0];
sum_over_i[0] = wij[0][0] / (wij[0][0]); /* substitution */

For N = 2:
sum_over_j_wij[0] = wij[0][0] + wij[0][1];
sum_over_j_wij[1] = wij[1][0] + wij[1][1];

sum_over_i[0] = wij[0][0] / (wij[0][0] + wij[0][1])
+ wij[1][0] / (wij[1][0] + wij[1][1]);
sum_over_i[1] = wij[0][1] / (wij[0][0] + wij[0][1])
+ wij[1][1] / (wij[1][0] + wij[1][1]);

Follow this through N = 5.
See if there are any commonalities or if the terms
can be re-arranged so that the operation can be
distributed (such as one pass is addition, another
division, etc.)

See also loop unrolling and also repetitive subtraction
rather than division.

Think about trying to "pipeline" the operations.
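
For general N, the quantity the original loops are meant to compute can be written as follows. The symbols r_i and c_j are shorthand introduced here for sum_over_j_wij[i] and sum_over_i[j], and every row sum is assumed to be nonzero. The second line is a sanity check obtained by swapping the order of summation.

```latex
r_i = \sum_{k=0}^{N-1} w_{ik}, \qquad
c_j = \sum_{i=0}^{N-1} \frac{w_{ij}}{r_i}
    = \sum_{i=0}^{N-1} w_{ij} \cdot \frac{1}{r_i}

\sum_{j=0}^{N-1} c_j
  = \sum_{i=0}^{N-1} \frac{1}{r_i} \sum_{j=0}^{N-1} w_{ij}
  = \sum_{i=0}^{N-1} \frac{r_i}{r_i}
  = N
```

The first line shows that only N reciprocals are needed rather than N*N divisions, and the second gives a cheap test for any rewritten version: the column sums should add up to N, up to rounding.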

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.comeaucomputing.com/learn/faq/
Other sites:
http://www.josuttis.com -- C++ STL Library book
http://www.sgi.com/tech/stl -- Standard Template Library
Jul 23 '05 #5
red floyd wrote:
dv*****@hotmail.com wrote:
I have C code which computes the row sums of a matrix, divide each
element of each row with the row sum and then compute the column sum of
the resulting matrix. Is there a way I can speed up the code in C:

/* Here is the code */
// Table is "wij"

int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[i,j];
}
for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)];
}
}

1. If you're talking C, why are you posting to c.l.c++?


Techniques to speed up some code are common to both languages.
2. Your code won't compile; x[i,j] may be a syntax error
Why? The comma operator exists in both languages. It may not be doing
what's intended, but that's not the point.
2.a. Your code might compile, but it may only provide x[j] (or x[i], I
don't remember whether the comma operator returns the right or left
operand in C).


Then you might consider studying before attempting to reply to both
newsgroups. Just a thought...

V
Jul 23 '05 #6
dv*****@hotmail.com wrote:
I have C code which computes the row sums of a matrix, divide each
element of each row with the row sum and then compute the column sum of
the resulting matrix. Is there a way I can speed up the code in C:
Try this:
cat f.c
#include <math.h>
#include <stdlib.h>
#include <stdbool.h>

bool f( // false if any row sum is zero.
const size_t m, // number of rows
const size_t n, // number of columns
const double w[m][n], // input matrix
double_t* restrict sum // output (column sums)
) {
for (size_t j = 0; j < n; ++j) {
sum[j] = 0.0;
}
for (size_t i = 0; i < m; ++i) {
double_t t = 0.0; // row sum accumulator
for (size_t j = 0; j < n; ++j) {
t += w[i][j];
}
if (0.0 == t)
return false;
for (size_t j = 0; j < n; ++j) {
sum[j] += w[i][j]/t;
}
}
return true;
}
gcc -Wall -std=c99 -pedantic -O3 -c f.c


Performance will degrade when rows of array w are too large
to keep in level 1 cache along with the column sums.
Jul 23 '05 #7
On Fri, 06 May 2005 11:55:47 -0400, Victor Bazarov wrote:
dv*****@hotmail.com wrote:
I have C code which computes the row sums of a matrix, divide each
element of each row with the row sum and then compute the column sum of
the resulting matrix. Is there a way I can speed up the code in C:

/* Here is the code */
// Table is "wij"

int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[i,j];
sum_over_j_wij[i] hasn't visibly been initialised anywhere. I assume these
array elements should be initialised to 0 before the loop.
By "wij[i,j]" you probably mean "wij[i][j]", don't you?
}
for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)];

Similarly sum_over_i[j] hasn't visibly been initialised.
}
}


There are some loop unrolling techniques that might help.


The biggest time consumer here is likely to be the division. Division is
usually a lot slower than other arithmetic operations. Loop unrolling
probably won't help much in comparison, and is something the compiler
might do for you anyway.
You can also
avoid indexing all the time by using pointers and advancing those using
the built-in ++ operator or += operator.


Indexing is usually not an expensive operation, and compilers tend to be
good at optimising it. It isn't uncommon for an indexed version of code
to turn out faster than one using pointer arithmetic.

You could experiment with the following (untested) code.

int i, j;

/* Initialise sum_over_i elements if necessary here */

for(j = 0; j < N; ++j)
{
sum_over_i[j] = 0.0;
}

for(i = 0; i < N; ++i)
{
/* Assuming elements are of type double */
const double *const wij_row = wij[i];
double sum = 0.0;
double sum_reciprocal;

for(j = 0; j < N; ++j)
{
sum += wij_row[j];
}

sum_over_j_wij[i] = sum;
sum_reciprocal = 1.0 / sum;

for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij_row[j] * sum_reciprocal;
}
}

wij_row may or may not help, the compiler could well optimise in a similar
or perhaps better way. Multiplying by the reciprocal may be marginally
less accurate than dividing.
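
The accuracy remark can be illustrated with a small sketch that computes the same quotient both ways. The function names by_division and by_reciprocal are assumptions made for illustration. The reciprocal version rounds twice (once for 1.0 / d, once for the product), while the division rounds once.

```c
/* Quotient computed directly: one rounding. */
double by_division(double x, double d)
{
    return x / d;
}

/* Same quotient via a precomputed reciprocal: two roundings. */
double by_reciprocal(double x, double d)
{
    double r = 1.0 / d;
    return x * r;
}
```

With IEEE 754 doubles the two results may differ in the last bit for some inputs (49.0 divided by 49.0 is a commonly cited case), which rarely matters for sums like this one but is worth knowing when results are compared exactly.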

Lawrence

Jul 23 '05 #8
dvum...@hotmail.com wrote:
I have C code which computes the row sums of a matrix, divide each
element of each row with the row sum and then compute the column sum of the resulting matrix. Is there a way I can speed up the code in C:

/* Here is the code */
// Table is "wij"

int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[i,j];
}
for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)];
}
}


As others have pointed out, you are using Pascal-style subscripts
rather than C or C++ style. And you have a strange syntax error in
there too. I'm going to assume all these arrays are "doubles".

Remember, division, modulo and square root are all roughly the same
speed and are very slow (as a rule of thumb you should think of them as
about 10 times slower than addition). So reducing their number in
your inner loops is of prime importance in this case. So, if you can
live with slight accuracy issues, the key "speeding up" consideration
is conversion of division to reciprocal multiplication:

for (i=0; i < N; i++) {
for (s=0.0,j=0; j < N; j++) s += wij[i][j];
s = 1.0 / s; /* if s == 0, you are SOL. */
for (j=0; j < N; j++) sum_over_i[j] += s*wij[i][j];
}

Some compilers are not able to hoist out the wij[i] calculation, so it
might be useful to precalculate this as double *wptr = wij[i]; and
replace the instances of wij[i] with wptr.

Certainly, using a good vectorizing compiler, such as Intel's compiler
will likely make a *huge* difference on code like this. Microsoft
claims that their latest compilers have vectorization capabilities, but
I have not verified this myself. In any event, the benefits of using
the SIMD instruction set basically go straight to the bottom line,
especially in cases like this.

If you are on a processor which has a "multiply accumulate" (PowerPC,
Itanium, PA-RISC) instead of SIMD, you can invert the loops:

for (i=0; i < N; i++) {
for (s=0.0,j=0; j < N; j++) s += wij[i][j];
recip_sum_over_j[i] = 1.0 / s; /* if s == 0, you are SOL. */
}

for (j=0; j < N; j++) {
for (s=0.0,i=0; i < N; i++) s += recip_sum_over_j[i]*wij[i][j];
sum_over_i[j] = s;
}

So you can try each possibility, and check your compiler settings
(w.r.t SIMD and "Multiply-Accumulate") to see which one works better
for your platform.
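
Before timing the two loop orders against each other, it helps to confirm that they produce the same column sums. This is a minimal sketch that wraps each variant in a function. The function names, the flat row-major layout of w, and the explicit zeroing of col are assumptions made here so that the block is self-contained. Like the snippets above, it assumes no row sum is zero.

```c
#include <stddef.h>

/* Variant 1: one pass over the rows. Each row is summed, then scaled
   by its reciprocal and accumulated into the column sums. */
void colsums_row_order(const double *w, size_t n, double *col)
{
    size_t i, j;
    for (j = 0; j < n; j++) col[j] = 0.0;
    for (i = 0; i < n; i++) {
        const double *row = w + i * n;
        double s = 0.0;
        for (j = 0; j < n; j++) s += row[j];
        s = 1.0 / s;                    /* assumes the row sum is nonzero */
        for (j = 0; j < n; j++) col[j] += s * row[j];
    }
}

/* Variant 2: all reciprocals first, then one multiply-accumulate
   chain per column. recip is scratch space for n values. */
void colsums_col_order(const double *w, size_t n, double *recip, double *col)
{
    size_t i, j;
    for (i = 0; i < n; i++) {
        double s = 0.0;
        for (j = 0; j < n; j++) s += w[i * n + j];
        recip[i] = 1.0 / s;             /* assumes the row sum is nonzero */
    }
    for (j = 0; j < n; j++) {
        double s = 0.0;
        for (i = 0; i < n; i++) s += recip[i] * w[i * n + j];
        col[j] = s;
    }
}
```

The second variant walks each column with a stride of n elements, so on large matrices its memory access pattern may cost more than the multiply-accumulate saves. That is another reason to measure both.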

---
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jul 23 '05 #9
In article <11**********************@f14g2000cwb.googlegroups .com>,
dv*****@hotmail.com wrote:
I have C code which computes the row sums of a matrix, divide each
element of each row with the row sum and then compute the column sum of
the resulting matrix. Is there a way I can speed up the code in C:

/* Here is the code */
// Table is "wij"

int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[i,j];
}
for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)];
}
}


Easy. Note that wij[i,j] is exactly the same as wij[j], so we change
this to

int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[j];
}
for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij[j]/sum_over_j_wij[i)];
}
}

Now the two inner loops are independent and can be split; then i and j
can be exchanged in the second loop, so we get:

int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[j];
}
}
for(j = 0; j < N; ++j)
{
for(i = 0; i < N; ++i)
{
sum_over_i[j] += wij[j]/sum_over_j_wij[i)];
}
}

In the first nested loop, we always add the same values to
sum_over_j_wij, so we calculate that sum only once:

int i, j;
double s = 0.0;
for (j = 0; j < N; ++j) s += wij[j];
for(i = 0; i < N; ++i)
{
sum_over_j_wij[i] += s;
}
for(j = 0; j < N; ++j)
{
for(i = 0; i < N; ++i)
{
sum_over_i[j] += wij[j]/sum_over_j_wij[i)];
}
}

In the second nested loop, we add wij[j] multiplied by the sum over 1 /
sum_over_j_wij, so we change this to:

int i, j;
double s = 0.0, t = 0.0;

for (j = 0; j < N; ++j) s += wij[j];
for (i = 0; i < N; ++i) sum_over_j_wij[i] += s;
for (i = 0; i < N; ++i) t += 1.0 / sum_over_j_wij[i];
for (j = 0; j < N; ++j) sum_over_i[j] += wij[j] * t;

That should make it a bit faster.
Jul 23 '05 #10
Hi,

If you want to speed up your code beyond a certain level of optimization,
it is better to study your data. The code written below may not perform
well if the matrix contains binary data. Studying the data pattern may
not always be useful. Just an idea.

int i, j;
for(i = 0; i++ < N; )
{
for(j = 0; j++ < N;)
sum_over_j_wij[i] += wij[i,j];

for(j = 0; j++ < N;)
sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)];

}

Thanks
Bharat
dvum...@hotmail.com wrote:
I have C code which computes the row sums of a matrix, divide each
element of each row with the row sum and then compute the column sum of the resulting matrix. Is there a way I can speed up the code in C:

/* Here is the code */
// Table is "wij"

int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[i,j];
}
for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)];
}
}


Jul 23 '05 #11
/* Here is the code */
// Table is "wij"

int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[i,j];
}
for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)];
}
}


wij[i,j] is probably wrong, maybe you meant wij[i][j]?

With a modern C or C++ compiler, I don't think you can make any significant
speedups here. Unrolling the loops will not help because optimizers do
that, especially well if N is a compile-time constant. Using pointers instead
of indices is also not going to help: the optimizer will create the same code
from such simple loops whether you use array indexing or pointers.

Though, if you use an older compiler, these techniques may help a lot.

cheers,
M.
Jul 23 '05 #12
In article <42***************@spamcop.net>,
Kenneth Brody <ke******@spamcop.net> wrote:
Lawrence Kirby wrote:
[...]
for(i = 0; i < N; ++i)
{
/* Assuming elements are of type double */
const double *const wij_row = wij[i];
double sum = 0.0;
double sum_reciprocal;

for(j = 0; j < N; ++j)
{
sum += wij_row[j];
}


And that can be further improved by incrementing and dereferencing a
pointer, rather than subscripting the array each time:


With a good compiler (that is with a compiler that is not completely
brain-damaged) that won't make the slightest difference, except that it
obfuscates the code. Strength reduction is an optimisation that was used
in the seventies.

And processors that can access wij_row[j] with a single instruction that
executes just as fast as dereferencing a pointer are not exactly rare.
Jul 23 '05 #13
