Unwinding a loop: why can't .NET add 1 + 2 + 3 ... ?

Mountain Bikn' Guy

Take some standard code such as shown below. It simply loops to add up a
series of terms and it produces the correct result.

// sum numbers with a loop
public int DoSumLooping(int iterations)
{
    int result = 0;
    for (int i = 1; i <= iterations; i++)
    {
        result += i;
    }
    return result;
}

Now translate this into a specific solution that doesn't use looping (and
use the same value for the number of iterations the loop performs). This
code returns an incorrect result. The method consists entirely of a very
straightforward code statement, but in this case .NET adds the numbers
incorrectly.
public double ComputeSum()
{
    // Brute force sum method
    // For iterations == 10000
    double sum = 0+ 1+ 2+ 3+ 4+ 5+ 6+ 7+ 8+ 9+ 10+ 11+ 12+ 13+ 14+ 15+ ...
        + 9997+ 9998+ 9999+ 10000;
    return sum;
}
The above method returns an incorrect result with any number of terms above
about 200. It will correctly add 1 + 2 + ... + 200, but it will NOT
correctly add 1 + 2 + ... + 1000.
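
For anyone trying to reproduce this, it helps to know what the right answer is before comparing. The sketch below is not from the original post; the class and method names are made up for illustration. It computes the expected totals two independent ways: with a loop, and with the closed form n * (n + 1) / 2.

```csharp
using System;

static class SumCheck
{
    // Same idea as DoSumLooping, but accumulating in a long.
    static long SumWithLoop(int n)
    {
        long result = 0;
        for (int i = 1; i <= n; i++)
        {
            result += i;
        }
        return result;
    }

    // Closed form for 1 + 2 + ... + n.
    static long SumClosedForm(int n)
    {
        return (long)n * (n + 1) / 2;
    }

    static void Main()
    {
        foreach (int n in new[] { 200, 1000, 10000 })
        {
            Console.WriteLine("{0}: loop={1}, formula={2}",
                n, SumWithLoop(n), SumClosedForm(n));
        }
    }
}
```

For 200, 1000 and 10000 terms the totals are 20100, 500500 and 50005000, so a generated ComputeSum that returns anything else has gone wrong somewhere between the source file and the running code.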

I have just run across this, and I have not yet researched the possible
reasons for this behavior. It may be a known issue related to either stack
size or the length of a code line, but to my knowledge it hasn't been
discussed in any of the "popular" literature on C# and .NET. I need to write
code like this, so if anyone has already encountered this issue, please
advise me.
Here's another example that also creates problems, but of a somewhat
different nature. Take the following code and translate it into a specific,
non-looping method and try to execute it using reflection. It fails.

public double LoopToCompute()
{
double sumOfProducts = 0;
double grandTotal = 0;
for (int i = 0; i < maxRows; ++i)
{
for (int j = 0; j < maxCols; ++j)
{
sumOfProducts += coeff[j] * table[i][j];
}
a_point[i] = sumOfProducts;
grandTotal += sumOfProducts;
sumOfProducts = 0;
}
return grandTotal;
}//LoopToCompute

The above code works, but its equivalent with the loops unrolled (shown
below) doesn't work unless maxRows is set very small. For small values,
the two methods (above and below) produce identical results, so there is
nothing "wrong" with the code in that sense. It's similar to the situation
above: if the "size" of a code statement or the number of code statements is
too large, .NET fails. In this case (using reflection) it doesn't return an
incorrect result, as the first example did. Instead, reflection calls
it an invalid program and refuses to run it (but only when the value of
maxRows is above about 250). The reason for this is probably
straightforward. However, I need to write statements like this for
performance reasons, so I need a workaround. Any suggestions or comments
are appreciated!

public double DoBruteForceCompute()
{
double bruteForceSum = 0;

point1=coeff1*table[0][0] +coeff2*table[0][1] +coeff3*table[0][2]
+coeff4*table[0][3] +coeff5*table[0][4] +coeff6*table[0][5]
+coeff7*table[0][6] +coeff8*table[0][7] +coeff9*table[0][8]
+coeff10*table[0][9] +coeff11*table[0][10] +coeff12*table[0][11]
+coeff13*table[0][12] +coeff14*table[0][13] +coeff15*table[0][14]
+coeff16*table[0][15] +coeff17*table[0][16] +coeff18*table[0][17]
+coeff19*table[0][18] +coeff20*table[0][19] +coeff21*table[0][20]
+coeff22*table[0][21] +coeff23*table[0][22] +coeff24*table[0][23]
+coeff25*table[0][24] +coeff26*table[0][25] +coeff27*table[0][26]
+coeff28*table[0][27] +coeff29*table[0][28] +coeff30*table[0][29]
+coeff31*table[0][30] +coeff32*table[0][31] +coeff33*table[0][32]
+coeff34*table[0][33] +coeff35*table[0][34] ;

point2=coeff1*table[1][0] +coeff2*table[1][1] +coeff3*table[1][2]
+coeff4*table[1][3] +coeff5*table[1][4] +coeff6*table[1][5]
+coeff7*table[1][6] +coeff8*table[1][7] +coeff9*table[1][8]
+coeff10*table[1][9] +coeff11*table[1][10] +coeff12*table[1][11]
+coeff13*table[1][12] +coeff14*table[1][13] +coeff15*table[1][14]
+coeff16*table[1][15] +coeff17*table[1][16] +coeff18*table[1][17]
+coeff19*table[1][18] +coeff20*table[1][19] +coeff21*table[1][20]
+coeff22*table[1][21] +coeff23*table[1][22] +coeff24*table[1][23]
+coeff25*table[1][24] +coeff26*table[1][25] +coeff27*table[1][26]
+coeff28*table[1][27] +coeff29*table[1][28] +coeff30*table[1][29]
+coeff31*table[1][30] +coeff32*table[1][31] +coeff33*table[1][32]
+coeff34*table[1][33] +coeff35*table[1][34] ;
[...]

point500=coeff1*table[499][0] +coeff2*table[499][1] +coeff3*table[499][2]
+coeff4*table[499][3] +coeff5*table[499][4] +coeff6*table[499][5]
+coeff7*table[499][6] +coeff8*table[499][7] +coeff9*table[499][8]
+coeff10*table[499][9] +coeff11*table[499][10] +coeff12*table[499][11]
+coeff13*table[499][12] +coeff14*table[499][13] +coeff15*table[499][14]
+coeff16*table[499][15] +coeff17*table[499][16] +coeff18*table[499][17]
+coeff19*table[499][18] +coeff20*table[499][19] +coeff21*table[499][20]
+coeff22*table[499][21] +coeff23*table[499][22] +coeff24*table[499][23]
+coeff25*table[499][24] +coeff26*table[499][25] +coeff27*table[499][26]
+coeff28*table[499][27] +coeff29*table[499][28] +coeff30*table[499][29]
+coeff31*table[499][30] +coeff32*table[499][31] +coeff33*table[499][32]
+coeff34*table[499][33] +coeff35*table[499][34] ;

bruteForceSum =
point1 +
point2 + ... +

point499 +
point500
;

return bruteForceSum;

}//DoBruteForceCompute

Jul 21 '05


Mountain Bikn' Guy

Jon,
I'm becoming convinced this issue is a compiler bug. I'm curious what you or
others find:
can you reproduce it using my code?
is there a work around (with equivalent performance advantages)?
is the bug present in both VS.NET 2002 and 2003?
Regards,
Dave

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...

Mountain Bikn' Guy <vc@attbi.com> wrote:
Anyone have any ideas why the following code "is not a valid program". It compiles fine, but the runtime refuses to run it. I apologize for it being a bit long (but I did cut out 497 of the 500 statement lines in the method). FYI, this is part 2 of my initial question in this thread (and in this case, the statement lines are not longer than the limit).

<snip>

Presumably you do actually still *have* the full version around? I
appreciate it would be too long to post, but could you mail it to me or
put it on a website? It's easier to test that way :)

If you could make it a full program that we could just compile as-is
and then run, that would be great.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too

Jul 21 '05 #31

Jon Skeet [C# MVP]

Mountain Bikn' Guy <vc@attbi.com> wrote:

I'm becoming convinced this issue is a compiler bug. I'm curious what you or
others find:
can you reproduce it using my code?
Yup.
is there a work around (with equivalent performance advantages)?
is the bug present in both VS.NET 2002 and 2003?

I suspect it's actually in the framework itself (which is where the
guts of the compiler is) rather than VS.NET itself.

Is the simple looping code definitely too slow for your needs?

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too

Jul 21 '05 #32

Mountain Bikn' Guy

In the event anyone wishes to review this thread now or later, I thought I
would provide some conclusions that resulted from everyone's input and from
my reading:

1. unrolling loops often doesn't provide a performance gain on large (real
life size?) problems. In my experience, the unrolled loop was 5-10 times
SLOWER than the simple nested loop that I used as a reference. I found the
following quote in Eric Gunnerson's book (2nd ed). He said (about unrolled
loops), a "function is so big it doesn't fit into cache and therefore gets
slower performance."

Jon added the following info:
I suspect he means the processor's instruction
cache - if it's a small loop, the processor (not JIT, note!) can decode
the instructions once and keep the microcode available for further
iterations.

2. VS.NET has a 2048 char line limit in code files. In my experience, when
using runtime code generation and compiling, this can cause unusual trouble
if one isn't aware of it. I was able to compile and run an application that
was created from a code file with a line longer than this limit. The only
error I encountered was that the mathematical expression in the executing
program returned incorrect results! (The fix was simply inserting line
breaks into the code file.)

3. VS.NET has some compiler bugs that will result in the following error
message:
Additional information: Common Language Runtime detected an invalid program.

This appears to happen on complex mathematical expressions. However, in my
case the expression itself was not that complex. It was just long (but each
line wasn't too long). When I reached about 1000 subexpressions in a simple
sum of products expression, I encountered this compiler bug. Search on
Google Groups using the error message, and you'll see that others have
encountered it (frequently) when dealing with complex mathematical
expressions.

To my knowledge, there is no solution for this. Anyone from Microsoft care
to comment?
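
For the line-length problem in point 2, the fix of inserting line breaks can be built into the code generator itself. What follows is only a rough sketch under assumptions: the helper name, the sample statement, and the limit of 200 characters are all invented for illustration, and the real limit should be whatever your compiler enforces.

```csharp
using System;
using System.Text;

static class WrappedSumGenerator
{
    // Hypothetical helper: emits "double sum = 0 + 1 + ... + n;" and starts
    // a new line before any term that would bring the current line up to
    // maxLineLength characters.
    public static string BuildSumStatement(int n, int maxLineLength)
    {
        StringBuilder sb = new StringBuilder("double sum = 0");
        int lineLength = sb.Length;
        for (int i = 1; i <= n; i++)
        {
            string term = " + " + i;
            if (lineLength + term.Length >= maxLineLength)
            {
                sb.Append(Environment.NewLine);
                lineLength = 0;
            }
            sb.Append(term);
            lineLength += term.Length;
        }
        sb.Append(';');
        return sb.ToString();
    }

    static void Main()
    {
        string code = BuildSumStatement(10000, 200);

        // Report the longest generated line so it can be compared
        // against the compiler's limit.
        int longest = 0;
        foreach (string line in code.Split('\n'))
        {
            longest = Math.Max(longest, line.TrimEnd('\r').Length);
        }
        Console.WriteLine("longest line: " + longest);
    }
}
```

Tracking the length of the current line rather than counting terms keeps the wrapping correct even when the terms have different widths, as they would with array indexers of varying size.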

That about sums up (no pun intended) what I've learned so far. HTH.
Dave

Jul 21 '05 #33

Mountain Bikn' Guy

I think I found the right balance:
Instead of using a nested loop, I unroll the inner loop but leave the outer
loop. This has resulted in the best performance so far (and it doesn't cause
any of the issues/bugs that have been discussed in this thread).
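
To make the shape of that concrete, here is a small sketch. It is not the actual generated code: the class name, the four-column width, and the test data are assumptions chosen to keep the example short. It keeps the outer loop over rows, writes the inner sum of products out by hand, and checks the result against the plain nested loop.

```csharp
using System;

static class PartialUnroll
{
    // Reference version: plain nested loops, as in LoopToCompute.
    static double Nested(double[] coeff, double[][] table)
    {
        double grandTotal = 0;
        for (int i = 0; i < table.Length; ++i)
        {
            double sumOfProducts = 0;
            for (int j = 0; j < coeff.Length; ++j)
            {
                sumOfProducts += coeff[j] * table[i][j];
            }
            grandTotal += sumOfProducts;
        }
        return grandTotal;
    }

    // Outer loop kept; inner loop unrolled for exactly 4 columns
    // (an assumed width, the real table in this thread has 35).
    static double InnerUnrolled(double[] coeff, double[][] table)
    {
        double grandTotal = 0;
        for (int i = 0; i < table.Length; ++i)
        {
            double[] row = table[i];
            grandTotal += coeff[0] * row[0] + coeff[1] * row[1]
                        + coeff[2] * row[2] + coeff[3] * row[3];
        }
        return grandTotal;
    }

    static void Main()
    {
        double[] coeff = { 1, 2, 3, 4 };
        double[][] table = new double[500][];
        for (int i = 0; i < table.Length; ++i)
        {
            table[i] = new double[] { i, i + 1, i + 2, i + 3 };
        }

        Console.WriteLine("nested:   " + Nested(coeff, table));
        Console.WriteLine("unrolled: " + InnerUnrolled(coeff, table));
    }
}
```

The generated method stays the same size no matter how many rows there are, which is presumably why this form avoids both the long-line and the invalid-program problems.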

Thanks for all your replies.

Dave

Jul 21 '05 #34

anonymouse

Did you make these unrolled loops manually or did you use CodeDom to
generate them?

Curious.

Why do it the hard way :D

Jul 21 '05 #35

Mountain Bikn' Guy

Since you asked...

I could use some help with CodeDOM. Here is how I am generating these loops:

//
//TODO: generate this entire expression using CodeDOM classes:
StringBuilder formulaString = new StringBuilder();
formulaString.Append("dataPoints[row] = 0");
if (columns != null)
{
    formulaString.Capacity = 40 * (columns.Count + 2);
    for (int j = 0; j < columns.Count; ++j)
    {
        if (((MyColumnType)columns[j]).Role == RoleType.Independent)
        {
            formulaString.Append(System.Environment.NewLine);
            formulaString.Append("+ coefficients[" + j + "] * table[row][" + j + "] ");
        }
    }
}
formulaString.Append(";");
//

The rest of the stuff (namespace, class, outer for loop, return statement,
etc.) is all generated purely without code snippet expressions (with 1
exception -- see bug below). Anyone want to show me how to do the above
expression without code snippets? (Note the need to break the lines to keep
them below the 2046 char limit.)
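
One possible way to build that expression from CodeDOM objects instead of a snippet is sketched below. Treat it as a starting point under assumptions, not a tested replacement: FormulaBuilder and the independentColumns array are hypothetical stand-ins for the column filtering above, and the variable names (dataPoints, coefficients, table, row) are taken from the snippet. On current .NET the System.CodeDom types come from a separate package; on .NET Framework they are built in.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using Microsoft.CSharp;

static class FormulaBuilder
{
    static CodeExpression Var(string name)
    {
        return new CodeVariableReferenceExpression(name);
    }

    // Builds: dataPoints[row] = 0 + coefficients[j] * table[row][j] + ...
    // for each column index in independentColumns (hypothetical input).
    public static CodeStatement BuildRowFormula(int[] independentColumns)
    {
        CodeExpression sum = new CodePrimitiveExpression(0);
        foreach (int j in independentColumns)
        {
            CodeExpression coeff = new CodeArrayIndexerExpression(
                Var("coefficients"), new CodePrimitiveExpression(j));
            CodeExpression cell = new CodeArrayIndexerExpression(
                new CodeArrayIndexerExpression(Var("table"), Var("row")),
                new CodePrimitiveExpression(j));
            CodeExpression product = new CodeBinaryOperatorExpression(
                coeff, CodeBinaryOperatorType.Multiply, cell);
            sum = new CodeBinaryOperatorExpression(
                sum, CodeBinaryOperatorType.Add, product);
        }

        CodeExpression target = new CodeArrayIndexerExpression(
            Var("dataPoints"), Var("row"));
        return new CodeAssignStatement(target, sum);
    }

    static void Main()
    {
        CodeStatement statement = BuildRowFormula(new int[] { 0, 1, 2 });
        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        {
            // Write the generated C# to the console for inspection.
            provider.GenerateCodeFromStatement(
                statement, Console.Out, new CodeGeneratorOptions());
        }
    }
}
```

How the C# generator lays out a long chain of binary operators is up to the provider, so it is still worth checking the longest line of the generated source against the compiler's limit rather than assuming it wraps.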

The one other issue I am having trouble with is this:

compute.ReturnType = new CodeTypeReference("public float[]"); //suggested workaround for bug below
compute.Attributes |= MemberAttributes.Public;//has a bug

Is there an easy, language-neutral way to work around the
MemberAttributes.Public bug when I return an array?
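
A possible explanation, offered as a guess since the failing output isn't shown: as far as I know, a new CodeMemberMethod starts with Attributes set to Private and Final, so OR-ing in Public combines two different access values rather than switching from one to the other. The sketch below replaces the access bits instead, and passes the array type as a Type so that no modifier has to be smuggled into the type name. The method name "Compute" is made up.

```csharp
using System;
using System.CodeDom;

static class AttributeDemo
{
    static void Main()
    {
        CodeMemberMethod compute = new CodeMemberMethod();
        compute.Name = "Compute"; // hypothetical method name

        // Describe the return type with a Type object, not a string
        // that has "public" embedded in it.
        compute.ReturnType = new CodeTypeReference(typeof(float[]));

        // Clear whatever access value is set, then apply Public,
        // leaving the non-access bits (such as Final) untouched.
        compute.Attributes =
            (compute.Attributes & ~MemberAttributes.AccessMask)
            | MemberAttributes.Public;

        bool isPublic =
            (compute.Attributes & MemberAttributes.AccessMask)
            == MemberAttributes.Public;
        Console.WriteLine("access is Public: " + isPublic);
    }
}
```

Nothing here is specific to C#, so the same member should come out public from the VB provider as well, though I have not checked that.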

TIA,
Dave

"anonymouse " <an********@dis cussions.micros oft.com> wrote in message
news:Ol******** ********@TK2MSF TNGP12.phx.gbl. ..

Did you make these unrolled loops manually or did you use a CodeDom to
generate them?

Curious.

Why do it the hard way:D
"Mountain Bikn' Guy" <vc@attbi.com > wrote in message
news:RAblb.1964 35$%h1.190498@s ccrnsc02...
In the event anyone wishes to review this thread now or later, I thought I
would provide some conclusions that resulted from everyone's input and

from
my reading:

1. unrolling loops often doesn't provide a performance gain on large (real life size?) problems. In my experience, the unrolled loop was 5-10 times
SLOWER than the simple nested loop that I used as a reference. I found the following quote in Eric Gunnerson's book (2nd ed). He said (about unrolled loops), a "function is so big it doesn't fit into cache and therefore gets slower performance."

Jon added the following info:
I suspect he means the processor's instruction
cache - if it's a small loop, the processor (not JIT, note!) can decode
the instructions once and keep the microcode available for further
iterations.

2. VS.NET has a 2048 char line limit in code files. In my experience, when using runtime code generation and compiling, this can cause unusual

trouble
if one isn't aware of it. I was able to compile and run an application

that
was created from a code file with a line longer than this limit. The only error I encountered was that the mathematical expression in the executing program returned incorrect results! (The fix was simply inserting line
breaks into the code file.)

3. VS.NET has some compiler bugs that will result in the following error
message:
Additional information: Common Language Runtime detected an invalid

program.

This appears to happen on complex mathematical expressions. However, in my case the expression itself was not that complex. It was just long (but

each
line wasn't too long). When I reached about 1000 subexpressions in a

simple
sum of products expression, I encountered this compiler bug. Search on
Google Groups using the error message, and you'll see that others have
encountered it (frequently) when dealing with complex mathematical
expressions.

To my knowledge, there is no solution for this. Anyone from Microsoft care to comment?

That about sums up (no pun intended) what I've learned so far. HTH.
Dave
"Mountain Bikn' Guy" <vc@attbi.com > wrote in message
news:NZUjb.7959 42$YN5.793228@s ccrnsc01...
Take some standard code such as shown below. It simply loops to add up a series of terms and it produces the correct result.

// sum numbers with a loop
public int DoSumLooping(in t iterations)
{
int result = 0;
for(int i = 1;i <=iterations;i+ +)
{
result += i;
}
return result;
}

Now translate this into a specific solution that doesn't use looping (and use the same value for the number of iterations the loop performs). This code returns an incorrect result. The method consists entirely of a very straightforward code statement, but in this case .NET adds the numbers
incorrectly.
public double ComputeSum( )
{
// Brute force sum method
// For iterations == 10000
double sum = 0+ 1+ 2+ 3+ 4+ 5+ 6+ 7+ 8+ 9+ 10+ 11+ 12+ 13+ 14+ 15+ ...
+ 9997+ 9998+ 9999+ 10000;
return sum;
}
The above method returns an incorrect result with any number of terms

above
about 200. It will correctly add 1 + 2 + ... + 200, but it will NOT
correctly add 1 + 2 + ... + 1000.

I have just run across this, and I have not yet researched the
possible reasons for this behavior. It may be a known issue related to either

stack size or the length of a code line, but to my knowledge it hasn't been
discussed in any of the "popular" literature on C# and .NET. I need to

write
code like this, so if anyone has already encountered this issue, please advise me.
Here's another example that also creates problems, but of a somewhat
different nature. Take the following code and translate it into a

specific,
non-looping method and try to execute it using reflection. It fails.

public double LoopToCompute()
{
double sumOfProducts = 0;
double grandTotal = 0;
for (int i = 0; i < maxRows; ++i)
{
for (int j = 0; j < maxCols; ++j)
{
sumOfProducts += coeff[j] * table[i][j];
}
a_point[i] = sumOfProducts;
grandTotal += sumOfProducts;
sumOfProducts = 0;
}
return grandTotal;
}//LoopToCompute

The above code works -- but it's equivalent code with loops unrolled

(shown
below) doesn't work unless the maxRows is set very small. For small

values,
the 2 methods (above and below) produce identical results. There is

nothing
"wrong" with the code in that sense. It's similar to the above situation.
If
the "size" of the code statement or the number of code statements is too large, .NET fails. In this case (using reflection) it doesn't return the incorrect result, as the first example did. In this case, reflection

calls it an invalid program and refuses to run it (but only when the value of maxRows is above about 250). The reason for this is probably
straightforward . However, I have the need to make statements like this for performance reasons so I need a work-around. Any suggestions are
appreciated! All comments are appreciated.

public double DoBruteForceCom pute()
{
double bruteForceSum = 0;

point1=coeff1*t able[0][0] +coeff2*table[0][1] +coeff3*table[0][2]
+coeff4*table[0][3] +coeff5*table[0][4] +coeff6*table[0][5]
+coeff7*table[0][6] +coeff8*table[0][7] +coeff9*table[0][8]
+coeff10*table[0][9] +coeff11*table[0][10] +coeff12*table[0][11]
+coeff13*table[0][12] +coeff14*table[0][13] +coeff15*table[0][14]
+coeff16*table[0][15] +coeff17*table[0][16] +coeff18*table[0][17]
+coeff19*table[0][18] +coeff20*table[0][19] +coeff21*table[0][20]
+coeff22*table[0][21] +coeff23*table[0][22] +coeff24*table[0][23]
+coeff25*table[0][24] +coeff26*table[0][25] +coeff27*table[0][26]
+coeff28*table[0][27] +coeff29*table[0][28] +coeff30*table[0][29]
+coeff31*table[0][30] +coeff32*table[0][31] +coeff33*table[0][32]
+coeff34*table[0][33] +coeff35*table[0][34] ;

point2=coeff1*t able[1][0] +coeff2*table[1][1] +coeff3*table[1][2]
+coeff4*table[1][3] +coeff5*table[1][4] +coeff6*table[1][5]
+coeff7*table[1][6] +coeff8*table[1][7] +coeff9*table[1][8]
+coeff10*table[1][9] +coeff11*table[1][10] +coeff12*table[1][11]
+coeff13*table[1][12] +coeff14*table[1][13] +coeff15*table[1][14]
+coeff16*table[1][15] +coeff17*table[1][16] +coeff18*table[1][17]
+coeff19*table[1][18] +coeff20*table[1][19] +coeff21*table[1][20]
+coeff22*table[1][21] +coeff23*table[1][22] +coeff24*table[1][23]
+coeff25*table[1][24] +coeff26*table[1][25] +coeff27*table[1][26]
+coeff28*table[1][27] +coeff29*table[1][28] +coeff30*table[1][29]
+coeff31*table[1][30] +coeff32*table[1][31] +coeff33*table[1][32]
+coeff34*table[1][33] +coeff35*table[1][34] ;
[...]

point500=coeff1 *table[499][0] +coeff2*table[499][1] +coeff3*table[499][2] +coeff4*table[499][3] +coeff5*table[499][4] +coeff6*table[499][5]
+coeff7*table[499][6] +coeff8*table[499][7] +coeff9*table[499][8]
+coeff10*table[499][9] +coeff11*table[499][10] +coeff12*table[499][11]
+coeff13*table[499][12] +coeff14*table[499][13] +coeff15*table[499][14] +coeff16*table[499][15] +coeff17*table[499][16] +coeff18*table[499][17] +coeff19*table[499][18] +coeff20*table[499][19] +coeff21*table[499][20] +coeff22*table[499][21] +coeff23*table[499][22] +coeff24*table[499][23] +coeff25*table[499][24] +coeff26*table[499][25] +coeff27*table[499][26] +coeff28*table[499][27] +coeff29*table[499][28] +coeff30*table[499][29] +coeff31*table[499][30] +coeff32*table[499][31] +coeff33*table[499][32] +coeff34*table[499][33] +coeff35*table[499][34] ;

bruteForceSum =
point1 +
point2 + ... +

point499 +
point500
;

return bruteForceSum;

}//DoBruteForceCompute

Jul 21 '05 #36

Eric Gunnerson [MS]

Comments inline

--
Eric Gunnerson

Visit the C# product team at http://www.csharp.net
Eric's blog is at http://blogs.gotdotnet.com/ericgu/

This posting is provided "AS IS" with no warranties, and confers no rights.

"Mountain Bikn' Guy" <vc@attbi.com> wrote in message
news:RAblb.196435$%h1.190498@sccrnsc02...

> In the event anyone wishes to review this thread now or later, I thought I
> would provide some conclusions that resulted from everyone's input and from
> my reading:
>
> 1. unrolling loops often doesn't provide a performance gain on large (real
> life size?) problems. In my experience, the unrolled loop was 5-10 times
> SLOWER than the simple nested loop that I used as a reference. I found the
> following quote in Eric Gunnerson's book (2nd ed). He said (about unrolled
> loops), a "function is so big it doesn't fit into cache and therefore gets
> slower performance."
>
> Jon added the following info:
> I suspect he means the processor's instruction
> cache - if it's a small loop, the processor (not JIT, note!) can decode
> the instructions once and keep the microcode available for further
> iterations.

I do mean the processor cache, but it's not necessarily just an instruction
effect.

Jan Gray's excellent article on managed code performance has a lot more
detail on this:

http://msdn.microsoft.com/library/?u...anagedcode.asp
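
For anyone who wants to repeat the comparison in point 1, here is a minimal timing sketch. It is not the code from this thread: it compares a plain loop with a four-way hand-unrolled loop, which is a much milder form of unrolling than the fully expanded method quoted above. The names UnrollTiming, DotLoop, DotUnrolled4 and TimeIt are made up for illustration, and it assumes a current .NET runtime where Stopwatch and Func are available.

```csharp
using System;
using System.Diagnostics;

public static class UnrollTiming
{
    // Reference version: plain loop over one row.
    public static double DotLoop(double[] coeff, double[] row)
    {
        double sum = 0;
        for (int j = 0; j < coeff.Length; ++j)
        {
            sum += coeff[j] * row[j];
        }
        return sum;
    }

    // Hand-unrolled by four. Assumes coeff.Length is a multiple of 4.
    public static double DotUnrolled4(double[] coeff, double[] row)
    {
        double sum = 0;
        for (int j = 0; j < coeff.Length; j += 4)
        {
            sum += coeff[j] * row[j];
            sum += coeff[j + 1] * row[j + 1];
            sum += coeff[j + 2] * row[j + 2];
            sum += coeff[j + 3] * row[j + 3];
        }
        return sum;
    }

    // Returns the elapsed milliseconds for 'repeats' calls of f.
    public static long TimeIt(Func<double[], double[], double> f,
                              double[] coeff, double[] row, int repeats)
    {
        f(coeff, row); // warm-up call so JIT compilation is not timed
        double sink = 0;
        Stopwatch sw = Stopwatch.StartNew();
        for (int r = 0; r < repeats; ++r)
        {
            sink += f(coeff, row);
        }
        sw.Stop();
        GC.KeepAlive(sink); // keep the results observable
        return sw.ElapsedMilliseconds;
    }
}
```

Calling TimeIt once with DotLoop and once with DotUnrolled4 on the same arrays gives two numbers to compare. Which one wins depends on the machine, the runtime and the array size, so treat any single run as a rough indication only.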

> 2. VS.NET has a 2048 char line limit in code files. In my experience, when
> using runtime code generation and compiling, this can cause unusual trouble
> if one isn't aware of it. I was able to compile and run an application that
> was created from a code file with a line longer than this limit. The only
> error I encountered was that the mathematical expression in the executing
> program returned incorrect results! (The fix was simply inserting line
> breaks into the code file.)
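
Given the line limit described in point 2, a generator can keep every emitted line short by breaking between terms. A minimal sketch, assuming the terms are already formatted as strings; ExpressionWriter and JoinTerms are hypothetical names, and maxLineLength is whatever limit you want to stay under:

```csharp
using System.Collections.Generic;
using System.Text;

public static class ExpressionWriter
{
    // Joins terms with "+", starting a new line before any term that
    // would push the current line past maxLineLength.
    public static string JoinTerms(IList<string> terms, int maxLineLength)
    {
        StringBuilder sb = new StringBuilder();
        int lineLength = 0;
        for (int i = 0; i < terms.Count; ++i)
        {
            string piece = (i == 0) ? terms[i] : "+ " + terms[i];

            // Break only if the line already has content and the piece
            // (plus one separating space) would not fit.
            if (lineLength > 0 && lineLength + 1 + piece.Length > maxLineLength)
            {
                sb.Append('\n');
                lineLength = 0;
            }
            if (lineLength > 0)
            {
                sb.Append(' ');
                lineLength += 1;
            }
            sb.Append(piece);
            lineLength += piece.Length;
        }
        return sb.ToString();
    }
}
```

A single term that is longer than the limit is still emitted on a line of its own, so the helper only guarantees short lines when each individual term fits.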

> 3. VS.NET has some compiler bugs that will result in the following error
> message:
> Additional information: Common Language Runtime detected an invalid program.
> This appears to happen on complex mathematical expressions. However, in my
> case the expression itself was not that complex. It was just long (but each
> line wasn't too long). When I reached about 1000 subexpressions in a simple
> sum of products expression, I encountered this compiler bug. Search on
> Google Groups using the error message, and you'll see that others have
> encountered it (frequently) when dealing with complex mathematical
> expressions.
>
> To my knowledge, there is no solution for this. Anyone from Microsoft care
> to comment?

I haven't heard of this before. Can you send me a program that demonstrates
the problem?
Er****@microsoft.com

> That about sums up (no pun intended) what I've learned so far. HTH.
> Dave
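
Nothing in this thread establishes what actually triggers the "invalid program" error, so the following is only a guess at a workaround: instead of one enormous method, generate one small method per row plus a driver that adds the rows up. A sketch under that assumption; RowMethodGenerator, the RowN method names and ComputeAll are hypothetical, and the emitted code expects coeff and table fields to exist in the generated class.

```csharp
using System.Text;

public static class RowMethodGenerator
{
    // Emits C# source text: one small method per row, then a driver
    // method that sums the per-row results.
    public static string Generate(int rows, int cols)
    {
        StringBuilder sb = new StringBuilder();

        for (int i = 0; i < rows; ++i)
        {
            sb.Append("private double Row").Append(i).Append("()\n{\n    return 0");
            for (int j = 0; j < cols; ++j)
            {
                // One term per line keeps every generated line short.
                sb.Append("\n        + coeff[").Append(j)
                  .Append("] * table[").Append(i)
                  .Append("][").Append(j).Append("]");
            }
            sb.Append(";\n}\n\n");
        }

        sb.Append("public double ComputeAll()\n{\n    double total = 0;\n");
        for (int i = 0; i < rows; ++i)
        {
            sb.Append("    total += Row").Append(i).Append("();\n");
        }
        sb.Append("    return total;\n}\n");

        return sb.ToString();
    }
}
```

Whether this avoids the error on any given compiler or runtime version would need to be tested. It also adds a method call per row, which works against the original performance goal.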

Jul 21 '05 #37

Chris R. Timmons

"Mountain Bikn' Guy" <vc@attbi.com> wrote in
news:zcelb.838183$Ho3.244900@sccrnsc03:

> Since you asked...
>
> I could use some help with CodeDOM. Here is how I am generating
> these loops:
>
> //
> //TODO: generate this entire expression using CodeDOM classes:
> StringBuilder formulaString = new StringBuilder();
> formulaString.Append("dataPoints[row] = 0");
> if (columns != null)
> {
>     formulaString.Capacity = 40 * (columns.Count + 2);
>     for (int j = 0; j < columns.Count; ++j)
>     {
>         if (((MyColumnType)columns[j]).Role == RoleType.Independent)
>         {
>             formulaString.Append(System.Environment.NewLine);
>             formulaString.Append("+ coefficients[" + j + "] * table[row][" +
>                 j + "] ");
>         }
>     }
> }
> formulaString.Append(";");
> //
>
> The rest of the stuff (namespace, class, outer for loop, return
> statement, etc.) is all generated purely without code snippet
> expressions (with 1 exception -- see bug below). Anyone want to
> show me how to do the above expression without code snippets?
> (Note the need to break the lines to keep them below the 2046
> char limit.)
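
One way the expression above might be built without snippets is to chain CodeBinaryOperatorExpression nodes. This is a sketch, not a tested replacement for the poster's generator. It assumes the System.CodeDom types (part of the .NET Framework; a separate System.CodeDom package on newer runtimes). FormulaBuilder, BuildRowFormula and the independentColumns parameter are hypothetical, and the Role filter from the quoted loop is assumed to have been applied by the caller.

```csharp
using System.CodeDom;

public static class FormulaBuilder
{
    // Fresh variable reference each time, so no node is shared in the tree.
    private static CodeExpression Var(string name)
    {
        return new CodeVariableReferenceExpression(name);
    }

    // Builds: dataPoints[row] = 0 + coefficients[j] * table[row][j] + ...
    // for each column index j in independentColumns.
    public static CodeAssignStatement BuildRowFormula(int[] independentColumns)
    {
        CodeExpression sum = new CodePrimitiveExpression(0);

        foreach (int j in independentColumns)
        {
            // coefficients[j]
            CodeExpression coeff = new CodeArrayIndexerExpression(
                Var("coefficients"), new CodePrimitiveExpression(j));

            // table[row][j]: index the jagged array one level at a time
            CodeExpression cell = new CodeArrayIndexerExpression(
                new CodeArrayIndexerExpression(Var("table"), Var("row")),
                new CodePrimitiveExpression(j));

            CodeExpression product = new CodeBinaryOperatorExpression(
                coeff, CodeBinaryOperatorType.Multiply, cell);

            // Chain left to right: sum = (sum + product)
            sum = new CodeBinaryOperatorExpression(
                sum, CodeBinaryOperatorType.Add, product);
        }

        CodeExpression target = new CodeArrayIndexerExpression(
            Var("dataPoints"), Var("row"));
        return new CodeAssignStatement(target, sum);
    }
}
```

The returned statement can be added to the Statements collection of the generated loop body. How the text is laid out is up to the code provider, so it is still worth checking the line lengths in the generated file.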

> The one other issue I am having trouble with is this:
>
> compute.ReturnType = new CodeTypeReference("public float[]");
> //suggested workaround for bug below
> compute.Attributes |= MemberAttributes.Public;//has a bug
>
> Is there an easy, language neutral way to work around the
> MemberAttributes.Public bug when I return an array?
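
The thread never says what the "bug" actually is. One plausible reading, and it is only an assumption here, is that a new member's Attributes already hold a non-public access value (as far as I know the documented default is Private | Final), so |= combines two access values instead of replacing one. A sketch of a language-neutral alternative under that assumption; MethodDeclarations and DeclareCompute are hypothetical names:

```csharp
using System.CodeDom;

public static class MethodDeclarations
{
    // Declares the equivalent of: public float[] Compute()
    public static CodeMemberMethod DeclareCompute()
    {
        CodeMemberMethod compute = new CodeMemberMethod();
        compute.Name = "Compute";

        // Describe the array type itself rather than passing a
        // C#-specific string such as "public float[]".
        compute.ReturnType = new CodeTypeReference(typeof(float[]));

        // Clear the existing access bits before setting Public, so the
        // old access value is replaced instead of OR-ed with the new one.
        compute.Attributes =
            (compute.Attributes & ~MemberAttributes.AccessMask)
            | MemberAttributes.Public;

        return compute;
    }
}
```

If the real cause turns out to be something else, the masking step is harmless but will not fix it.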

Dave,

[I don't mean to come across as arrogant in the post below. This is
just my long-winded way of saying that there may be a much simpler
solution to your problem.]

I've read most of the messages in the threads that deal with the
problems you're running into. I'm not going to pretend that I know
what all of your requirements are, or the reason why you're writing
this code (learning exercise or production code?). But I get the
feeling that you may be trying to pound a round peg through a square
hole.

Lately I've become a big fan of Occam's Razor
(http://pespmc1.vub.ac.be/OCCAMRAZ.html). Dynamically generating and
compiling code is usually the slowest and most complex way to
implement a solution. It's almost always my last choice. When
designing a solution to a complex and/or large problem, my solution
generally comes from a list like this (in order of simplicity):

1. structured (or non-polymorphic OOP) code w/ a fixed data
structure
2. same code as #1, but w/ a dynamic data structure
3. polymorphic OOP and/or interfaced code w/ a fixed data structure
4. same code as #3, but w/ a dynamic data structure
5. database solution (SQL)
6. potentially very complex code and data (writing a custom
compiler, lengthy regular expressions, using CodeDOM, etc.)

Another way of looking at this list:

1. simple code and data.
2. simple code and moderately complex data.
3. moderately complex code and simple data.
4. moderately complex code and data.
5. range of simple-to-complex code and data, but has an external
dependency on a database engine. May also have performance
drawbacks.
6. very complex code and data.

Too often I've learned (the hard way) that a complex solution to a
seemingly simple problem is often an indicator of a design flaw.
Using lists like this has helped me catch design errors like that and
fix them before they infect the rest of the app.

Just my $0.02...

Chris.
-------------
C.R. Timmons Consulting, Inc.
http://www.crtimmonsinc.com/

Jul 21 '05 #38

Mountain Bikn' Guy

sample program sent!
Thanks for your interest!
Dave

Jul 21 '05 #39

Santiago

I really don't want to make this thread any longer than it already is but I
found Chris' comments (above) interesting. And I thought I would add my own
$0.02.

If the nature of this app is such that every bit of performance is needed,
to the extent that even loops have to be unrolled (something that's usually
found in game engines but not business apps), I would question the use of
.NET. Using C++ (non-managed) you will shave a whole layer which will yield
further performance advantages if well done.

- Santiago

"anonymouse" <an********@discussions.microsoft.com> wrote in message
news:Ol****************@TK2MSFTNGP12.phx.gbl...
Did you make these unrolled loops manually or did you use a CodeDom to
generate them?

Curious.

Why do it the hard way:D

Jul 21 '05 #40
