I have C code which computes the row sums of a matrix, divide each
element of each row with the row sum and then compute the column sum of
the resulting matrix. Is there a way I can speed up the code in C:
/* Here is the code */
// Table is "wij"
int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[i,j];
}
for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)];
}
} 12 2236 dv*****@hotmail .com wrote: I have C code which computes the row sums of a matrix, divide each element of each row with the row sum and then compute the column sum of the resulting matrix. Is there a way I can speed up the code in C:
/* Here is the code */ // Table is "wij"
int i, j; for(i = 0; i < N; ++i) { for(j = 0; j < N; ++j) { sum_over_j_wij[i] += wij[i,j];
By "win[i,j]" you probably mean "win[i][j]", don't you?
} for(j = 0; j < N; ++j) { sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)]; } }
There are some loop unrolling techniques that might help. You can also
avoid indexing all the time by using pointers and advancing those using
the builtin ++ operator or += operator.
Next time, please refrain from typing your code into the message and
instead use the "copyandpaste" capability offered by all modern Otes
and applications.
V
dvumani wrote: I have C code which computes the row sums of a matrix, divide each element of each row with the row sum and then compute the column sum of the resulting matrix. Is there a way I can speed up the code in C:
sum_over_j_wij[i] += wij[i,j];
Is wij[i,j] doing what you think it does?
Have you timetested this code to see if it's slow?
Have you timetested your entire application, to see if the speed of this
code is relevant?
After all that research, switch to C++ and look up "expression
metatemplates". You will probably find several examples with matrices.

Phlip http://www.c2.com/cgi/wiki?ZeekLand dv*****@hotmail .com wrote: I have C code which computes the row sums of a matrix, divide each element of each row with the row sum and then compute the column sum of the resulting matrix. Is there a way I can speed up the code in C:
/* Here is the code */ // Table is "wij"
int i, j; for(i = 0; i < N; ++i) { for(j = 0; j < N; ++j) { sum_over_j_wij[i] += wij[i,j]; } for(j = 0; j < N; ++j) { sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)]; } }
1. If you're talking C, why are you posting to c.l.c++?
2. Your code won't compile; x[i,j] may be a syntax error
2.a.Your code might compile, but it may only provide x[j] ( or x[i], I
don't remember whether the comma operator return the right or left
operand in C). dv*****@hotmail .com wrote: I have C code which computes the row sums of a matrix, divide each element of each row with the row sum and then compute the column sum of the resulting matrix. Is there a way I can speed up the code in C:
You have some errors in your code.
Besides the error, perhaps you could see something by
expanding your loop into a series.
/* Here is the code */ // Table is "wij"
int i, j; for(i = 0; i < N; ++i) { for(j = 0; j < N; ++j) { sum_over_j_wij[i] += wij[i,j];
Here is the problem: wij[i,j]
The comma operator is not a subscript operator.
In this expression, it evaluates 'i', then 'j'
and returns the value of 'j'. So this is equivalent
to: wij[j]
The syntax for a multiple dimension array is:
wij[i][j]
} for(j = 0; j < N; ++j) { sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)];
See note above about "wij[i,j]".
Syntax error: "sum_overj_ wij[i)];"
} }
For N = 1:
sum_over_j_wij[0] = wij[0][0];
sum_over_i[0] = wij[0][0] / (wij[0][0]); /* substitution */
For N = 2:
sum_over_j_wij[0] = wij[0][0] + wij[0][1];
sum_over_i[0] = wij[0][0] / (wij[0][0] + wij[0][1])
+ wij[0][1] / (wij[0][0] + wij[0][1]);
sum_over_j_wij[1] = wij[1][0] + wij[1][1];
sum_over_i[1] = wij[1][0] / (wij[1][0] + wij[1][1])
+ wij[1][1] / (wij[1][1] + wij[1][1]);
Follow this through N = 5.
See if there are any commonalities or if the terms
can be rearranged so that the operation can be
distributed (such as one pass is addition, another
division, etc.)
See also loop unrolling and also repetitive subtraction
rather than division.
Think about trying to "pipeline" the operations.

Thomas Matthews
C++ newsgroup welcome message: http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++faqlite
C Faq: http://www.eskimo.com/~scs/cfaq/top.html
alt.comp.lang.l earn.cc++ faq: http://www.comeaucomputing.com/learn/faq/
Other sites: http://www.josuttis.com  C++ STL Library book http://www.sgi.com/tech/stl  Standard Template Library
red floyd wrote: dv*****@hotmail .com wrote:
I have C code which computes the row sums of a matrix, divide each element of each row with the row sum and then compute the column sum of the resulting matrix. Is there a way I can speed up the code in C:
/* Here is the code */ // Table is "wij"
int i, j; for(i = 0; i < N; ++i) { for(j = 0; j < N; ++j) { sum_over_j_wij[i] += wij[i,j]; } for(j = 0; j < N; ++j) { sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)]; } }
1. If you're talking C, why are you posting to c.l.c++?
Techniques to speed up some code are common to both languages.
2. Your code won't compile; x[i,j] may be a syntax error
Why? The comma operator exists in both languages. It may not be doing
what's intended, but that's not the point.
2.a.Your code might compile, but it may only provide x[j] ( or x[i], I don't remember whether the comma operator return the right or left operand in C).
Then you might consider studying before attempting to reply to both
newsgroups. Just a thought...
V dv*****@hotmail .com wrote: I have C code which computes the row sums of a matrix, divide each element of each row with the row sum and then compute the column sum of the resulting matrix. Is there a way I can speed up the code in C:
Try this:
cat f.c
#include <math.h>
#include <stdlib.h>
#include <stdbool.h>
bool f( // false if any row sum is zero.
const size_t m, // number of rows
const size_t n, // number of columns
const double w[m][n], // input matrix
double_t* restrict sum // output (columm sums)
) {
for (size_t i = 0; i < m; ++i) {
sum[i] = 0.0;
}
for (size_t i = 0; i < m; ++i) {
double_t t = 0.0; // row sum accumulator
for (size_t j = 0; j < n; ++j) {
t += w[i][j];
}
if (0.0 == t)
return false;
for (size_t j = 0; j < n; ++j) {
sum[j] += w[i][j]/t;
}
}
return true;
}
gcc Wall std=c99 pedantic O3 c f.c
Performance will degrade when rows of array w are too large
to keep in level 1 cache along with the column sums.
On Fri, 06 May 2005 11:55:47 0400, Victor Bazarov wrote: dv*****@hotmail .com wrote: I have C code which computes the row sums of a matrix, divide each element of each row with the row sum and then compute the column sum of the resulting matrix. Is there a way I can speed up the code in C:
/* Here is the code */ // Table is "wij"
int i, j; for(i = 0; i < N; ++i) { for(j = 0; j < N; ++j) { sum_over_j_wij[i] += wij[i,j];
sum_over_j_wij[i] hasn't visibly been initialised anywhere. I assume these
array elements should be initialised to 0 before the loop. By "win[i,j]" you probably mean "win[i][j]", don't you?
} for(j = 0; j < N; ++j) { sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)];
Similarly sum_over_i[j] hasn't visibly been initialised.
} }
There are some loop unrolling techniques that might help.
The biggest time consumer here is likely to be the division. Division is
usually a lot slower than other arithmetic operations. Loop unrolling
probably won't help much in comparison, and is something the compiler
might do for you anyway.
You can also avoid indexing all the time by using pointers and advancing those using the builtin ++ operator or += operator.
Indexing is usually not an expensive operation, and compilers tend to be
good at optimising it. It isn't uncommon for an indexed version of code
to turn out faster than one using pointer arithmetic.
You could experimet with the following (untested) code.
int i, j;
/* Initialise sum_over_i elements if necessary here */
for(j = 0; j < N; ++j)
{
sum_over_i[j] = 0.0;
}
for(i = 0; i < N; ++i)
{
/* Assuming elements are of type double */
const double *const wij_row = wij[i];
double sum = 0.0;
double sum_reciprocal;
for(j = 0; j < N; ++j)
{
sum += wij_row[j];
}
sum_over_j_wij[i] = sum;
sum_reciprocal = 1.0 / sum;
for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij_row[j] * sum_reciprocal;
}
}
wij_row may or may not help, the compiler could well optimise in a similar
or perhaps better way. Multiplying by the reciprocal may be marginally
less accurate than dividing.
Lawrence
dvum...@hotmail .com wrote: I have C code which computes the row sums of a matrix, divide each element of each row with the row sum and then compute the column sum
of the resulting matrix. Is there a way I can speed up the code in C:
/* Here is the code */ // Table is "wij"
int i, j; for(i = 0; i < N; ++i) { for(j = 0; j < N; ++j) { sum_over_j_wij[i] += wij[i,j]; } for(j = 0; j < N; ++j) { sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)]; } }
As others have pointed out, you are using Pascalstyle subscripts
rather than C or C++ style. And you have a strange syntax error in
there too. I'm going to assume all these arrays are "doubles".
Remember, division, module and square root are all roughly the same
speed are and very slow (as a rule of thumb you should think of them as
about 10 times slower than addition.) So reducing their quantity in
your inner loops is of prime importance in this case. So, if you can
live with slight accuracy issues, the key "speeding up" consideration
is conversion of division to reciprocal multiplication:
for (i=0; i < N; i++) {
for (s=0.0,j=0; j < N; j++) s += wij[i][j];
s = 1.0 / s; /* if s == 0, you are SOL. */
for (j=0; j < N; j++) sum_over_i[j] += s*wij[i][j];
}
Some compilers are not able to hoist out the wij[i] calculation, so it
might be useful to precalculate this as double *wptr = wij[i]; and
replace the instances of wij[i] with wptr.
Certainly, using a good vectorizing compiler, such as Intel's compiler
will likely make a *huge* difference on code like this. Microsoft
claims that their latest compilers have vectorization capabilities, but
I have not verified this myself. In any event, the benefits of using
the SIMD instruction set basically goes straight to the bottom line,
especially in cases like this.
If you are on a processor which has a "multiply accumulate" (PowerPC,
Itanium, PARISC) instead of SIMD, you can invert the loops:
for (i=0; i < N; i++) {
for (s=0.0,j=0; j < N; j++) s += wij[i][j];
recip_sum_over_ j[i] = 1.0 / s; /* if s == 0, you are SOL. */
}
for (j=0; j < N; j++) {
for (s=0.0,i=0; i < N; i++) s += recip_sum_over_ j[i]*wij[i][j];
sum_over_i[j] = s;
}
So you can try each possibility, and check your compiler settings
(w.r.t SIMD and "MultiplyAccumulate") to see which one works better
for your platform.

Paul Hsieh http://www.pobox.com/~qed/ http://bstring.sf.net/
In article <11************ **********@f14g 2000cwb.googleg roups.com>, dv*****@hotmail .com wrote: I have C code which computes the row sums of a matrix, divide each element of each row with the row sum and then compute the column sum of the resulting matrix. Is there a way I can speed up the code in C:
/* Here is the code */ // Table is "wij"
int i, j; for(i = 0; i < N; ++i) { for(j = 0; j < N; ++j) { sum_over_j_wij[i] += wij[i,j]; } for(j = 0; j < N; ++j) { sum_over_i[j] += wij[i,j]/sum_over_j_wij[i)]; } }
Easy. Note that wij[i,j] is exactly the same as wij[j], so we change
this to
int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[j];
}
for(j = 0; j < N; ++j)
{
sum_over_i[j] += wij[j]/sum_over_j_wij[i)];
}
}
Now the two inner loops are independent and can be split; then i and j
can be exchanged in the second loop, so we get:
int i, j;
for(i = 0; i < N; ++i)
{
for(j = 0; j < N; ++j)
{
sum_over_j_wij[i] += wij[j];
}
}
for(j = 0; j < N; ++j)
{
for(i = 0; i < N; ++i)
{
sum_over_i[j] += wij[j]/sum_over_j_wij[i)];
}
}
In the first nested loop, we always add the same values to
sum_over_j_wij, so we calculate that sum only once:
int i, j;
double s = 0.0;
for (j = 0; j < N; ++j) s += wij[j];
for(i = 0; i < N; ++i)
{
sum_over_j_wij[i] += s;
}
for(j = 0; j < N; ++j)
{
for(i = 0; i < N; ++i)
{
sum_over_i[j] += wij[j]/sum_over_j_wij[i)];
}
}
In the second nested loop, we add wij[j] multiplied by the sum over 1 /
sum_over_j_wij, so we change this to:
int i, j;
double s = 0.0, t = 0.0;
for (j = 0; j < N; ++j) s += wij[j];
for (i = 0; i < N; ++i) sum_over_j_wij[i] += s;
for (i = 0; i < N; ++i) t += 1.0 / sum_over_j_wij[i];
for (j = 0; j < N; ++j) sum_over_i[j] += wij[j] / t;
That should make it a bit faster. This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics 
by: Snyke 
last post by:
Hi.
I have a command line script which works really fine, the only problem
is that it take *really* long for the first output to be printed on
screen.
Since I also get some HTTP headers I'm suspecting that some sort of
output buffering is used.
How can I tell PHP to flush the buffer automatically (without using
flush(); after every print or echo) and to remove the headers?

by: mfyahya 
last post by:
Hi,
I'm new to databases :) I need help speeding up select queries on my
data which are currently taking 45 seconds. I set up a single large
table of coordinates data with an index on the fields I use most
frequently in select queries. The data is about 100MB and index is
80MB. The table has the following structure:
CREATE TABLE `ptimes` (
`id` INT UNSIGNED NOT NULL ,
`rc` CHAR ( 1 ) UNSIGNED NOT NULL ,

by: dvumani 
last post by:
I have C code which computes the row sums of a matrix, divide each
element of each row with the row sum and then compute the column sum of
the resulting matrix. Is there a way I can speed up the code in C:
/* Here is the code */
// Table is "wij"
int i, j;
for(i = 0; i < N; ++i)
{

by: Timothy Graves 
last post by:
I have a quick (pun intended) question for the guru's out there.
I have a piece of code where I am validating the input of chancters
into a cell in a datagrid. I am using the keypressed event to get the
charcter that the user typed and then allowing it to be passed to the
cell. Now here is the tricked part, I am also validating the values
(max and min) so that the user does not input a invalid number (out of
range ex. byte != 256).
...

by: Robert Wilkens 
last post by:
Ok... This may be the wrong forum, but it's the first place I'm trying.
I'm new to C# and just implemented the 3tier Distributed application from
Chapter 1 (the first walkthrough) in the "Walkthrough" book that comes with
Visual Studio .NET 2003 Enterprise Architect.
My first observation is  woah, is this thing slow. From the time I
clicked "load" to the time I had a populated data set on the windowsbased
app was almost 510...
 
by: OHM 
last post by:
I was wondering about this topic and although I accept that different
situations call for different solutions, but wondered are there any other
solutions and whether has anyone carried out a comparison of the different
methods for avoiding JIT. Further more, is there anything I should be
considering before using NGEN? Better methods etc >
TIA

by: RobinAG 
last post by:
Hello,
I just split my database into front and back end. My front end users are experiencing really slow opening of forms. I've searched online for help speeding up forms, but I'm not sure what the best way is with my current setup. I've inherited the database from a previous programmer, and he set things up a little uniquely.
Here's the deal:
I have a main form that opens, listing in a subform all of the projects, with some toplevel...

by: ags5406 
last post by:
I've created an application that downloads data daily from a secure
web site and stores that data in an Access database. Then there are
different options for allowing the user to do keyword and date
searches on the database and have the information displayed for them.
Everything looks and functions great, my only real dissatisfaction
with my application is the update time, which in my last test took
about 4546 minutes for 9800 records....

by: Hystou 
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it.
First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...

by: Oralloy 
last post by:
Hello folks,
I am unable to find appropriate documentation on the type promotion of bitfields when using the generalised comparison operator "<=>".
The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed.
This is as boiled down as I can make it.
Here is my compilation command:
g++12 std=c++20 Wnarrowing bit_field.cpp
Here is the code in...

by: jinu1996 
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth.
The Art of Business Website Design
Your website is...
 
by: Hystou 
last post by:
Overview:
Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...

by: tracyyun 
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, ZWave, WiFi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...

by: conductexam 
last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one.
At the time of converting from word file to html my equations which are in the word document file was convert into image.
Globals.ThisAddIn.Application.ActiveDocument.Select();...

by: TSSRALBI 
last post by:
Hello
I'm a network technician in training and I need your help.
I am currently learning how to create and manage the different types of VPNs and I have a question about LANtoLAN VPNs.
The last exercise I practiced was to create a LANtoLAN VPN between two Pfsense firewalls, by using IPSEC protocols.
I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...

by: adsilva 
last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?

by: muto222 
last post by:
How can i add a mobile payment intergratation into php mysql website.
 