Poor array performance

I had a customer call about some C# code they had put together that was
handling some large arrays. The performance was rather poor. The C# code
runs in about 22 seconds and the equivalent C++.Net code runs in 0.3
seconds. Can someone help me understand why the C# code performance is so
poor? I rewrote the C# code to use a single-dimensional array and the time
went down to about 3 seconds, but that's still no explanation as to why the
two-dimensional array performance is so bad. I tried this on both C# 1.1 and
C# 2.0.
The original code was:

public void TestLoop4OldMethod()
{
    double[,] emisbase = new double[1000,8784];
    double[,] vombase = new double[1000,8784];
    int Iteration_Index;
    int Hourly_Index;
    double differenceInSeconds;
    DateTime myDateTime1;
    DateTime myDateTime2;

    myDateTime1 = DateTime.UtcNow;
    for (int i1 = 0; i1 < 10; )
    {
        for (Hourly_Index = 0; Hourly_Index < 8784; )
        {
            for (Iteration_Index = 0; Iteration_Index < 1000; )
            {
                emisbase[Iteration_Index, Hourly_Index] = Iteration_Index * Iteration_Index;
                vombase[Iteration_Index, Hourly_Index] = emisbase[Iteration_Index, Iteration_Index];
                Iteration_Index++;
            }
            Hourly_Index++;
        }
        //Console.WriteLine("Here we are - Loop 4: {0}", i1);
        i1++;
    }
    myDateTime2 = DateTime.UtcNow;
    TimeSpan ts = myDateTime2 - myDateTime1;
    differenceInSeconds = ts.TotalMilliseconds / 1000;

    Console.WriteLine("RunTime in Seconds - Left Most Index + array reference: {0} ", differenceInSeconds);
    Console.WriteLine(" ");
}
It runs in about 22 seconds. Here I rewrote the code in C++ and it runs in
0.3 seconds:
// -----------------------------------------------------------------------------
int Iteration_Index;
int Hourly_Index;
double differenceInSeconds;
DateTime myDateTime1;
DateTime myDateTime2;

double * emisbase = new double[1000,8784];
double * vombase = new double[1000,8784];

myDateTime1 = DateTime::UtcNow;
for (int i1 = 0; i1 < 10; )
{
    for (Hourly_Index = 0; Hourly_Index < 8784; Hourly_Index++)
    {
        for (Iteration_Index = 0; Iteration_Index < 1000; Iteration_Index++)
        {
            emisbase[Iteration_Index, Hourly_Index] = Iteration_Index * Iteration_Index;
            vombase[Iteration_Index, Hourly_Index] = emisbase[Iteration_Index, Iteration_Index];
            //Iteration_Index++;
        }
        //Hourly_Index++;
    }
    //Console::Write(S"Here we are - Loop 4: ");
    //Console::WriteLine(Convert::ToString(i1));
    i1++;
}
myDateTime2 = DateTime::UtcNow;
TimeSpan ts = myDateTime2 - myDateTime1;
differenceInSeconds = ts.TotalMilliseconds / 1000;

Console::Write(S"RunTime in Seconds - Left Most Index + array reference: ");
Console::WriteLine(Convert::ToString(differenceInSeconds));

// -----------------------------------------------------------------------------
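
For readers wondering what the single-dimensional rewrite mentioned at the top might look like: the rewritten C# was never posted, so the following is only a minimal C++ sketch of the general idea. The `FlatGrid` name and the row-major layout are assumptions for illustration, not the customer's actual code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: stores a rows x cols grid in one contiguous buffer.
class FlatGrid {
public:
    FlatGrid(std::size_t rows, std::size_t cols)
        : cols_(cols), data_(rows * cols, 0.0) {}

    // Row-major mapping: element (row, col) lives at row * cols + col.
    double& at(std::size_t row, std::size_t col) {
        return data_[row * cols_ + col];
    }

    // Total number of elements held in the buffer.
    std::size_t size() const { return data_.size(); }

private:
    std::size_t cols_;
    std::vector<double> data_;
};
```

With this layout, `FlatGrid grid(1000, 8784);` holds all 8,784,000 values in one block, and `grid.at(i, h)` plays the role of `emisbase[i, h]`.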

Feb 8 '06 #1
For a loop that runs almost 9 million times, 3 seconds is not that bad! ;-)

I don't know the internals of C# well enough to say something intelligent
about it, but why not leave in the managed C++ part? It sounds like this is
a critical section that should have the heck optimized out of it anyhow...
Feb 8 '06 #2
How many times did you run the loop? The first time the code is
executed it must be JIT compiled and that can take some of the time.

Just a thought.

Feb 8 '06 #3
Fast, right, but correct? One important thing when running this kind of
benchmark is that the results are correct and repeatable (and I mean both the
timing and the results of the operations performed).
Did you inspect the values of the array elements (say the first 10 locations)
after the run?
IMO they aren't correct in the case of C++.
I will take a look at the code when I find some spare time, but 0.3 sec for
80 million iterations smells like "broken optimization".


Willy.
Feb 8 '06 #4
JIT compiling such a small piece of code takes less than a millisecond.

Willy.

"Chris Dunaway" <du******@gmail.com> wrote in message
news:11**********************@f14g2000cwb.googlegroups.com...
| How many times did you run the loop? The first time the code is
| executed it must be JIT compiled and that can take some of the time.
|
| Just a thought.
|
Feb 8 '06 #5
Hi John,
I ran your original 2D code in a loop 10 times and the average I get is
about 6.36 seconds (my computer is a 3 GHz Pentium 4 with 512 MB of RAM) to loop
87,840,000 times, which works out to roughly 7.2 x 10^-8 seconds per
iteration, which is pretty fast :-). If I am looping 87 million times I think I
would be happy with the 6 second range.
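
The per-iteration figure is just the total time divided by the loop count. A small C++ sketch of that arithmetic follows; the function names are made up for illustration and the inputs are the numbers quoted in this thread.

```cpp
// Total inner-loop iterations for the nested loops in the posted test.
long long total_iterations(long long passes, long long hours, long long rows) {
    return passes * hours * rows;
}

// Average wall-clock seconds spent per loop iteration.
double seconds_per_iteration(double total_seconds, long long iterations) {
    return total_seconds / static_cast<double>(iterations);
}
```

For example, `total_iterations(10, 8784, 1000)` gives 87,840,000, and dividing 6.36 seconds by that count comes out a little above 7.2 x 10^-8 seconds per iteration.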

Not sure why the C++ is so much faster but something smells fishy when
performing so much processing :-)

Mark
http://www.markdawson.org
Feb 9 '06 #6
Only once, but the timer is in the code, not external so the IL is JIT'ed by
the time I get the begin time.

"Chris Dunaway" <du******@gmail.com> wrote in message
news:11**********************@f14g2000cwb.googlegroups.com...
How many times did you run the loop? The first time the code is
executed it must be JIT compiled and that can take some of the time.

Just a thought.

Feb 9 '06 #7
That's an option I suggested to them. They have code already in Fortran and
were considering migrating to C# for business reasons. Does anyone know how
well the FORTRAN.Net from Intel or Lahey perform?

"Gabriel Magaña" <no*****@no-spam.com> wrote in message
news:eQ**************@TK2MSFTNGP10.phx.gbl...
For a loop that runs almost 9 million times, 3 seconds is not that bad!
;-)

I don't know the internals of C# well enough to say something intelligent
about it, but why not leave in the managed C++ part? It sounds like this
is a critical section that should have the heck optimized out of it
anyhow...


Feb 9 '06 #8
"broken optimization"? Do you mean that it should take longer or shorter?

The client is looking at a piece of Intel FORTRAN code that is running on a
Windows box in 0.04 seconds as a basis. They are considering migrating to
C# for business reasons. Has anyone looked at the performance of either the
Intel or Lahey FORTRAN.Net products?
"Willy Denoyette [MVP]" <wi*************@telenet.be> wrote in message
news:ud*************@TK2MSFTNGP14.phx.gbl...
Fast right, but correct? One important thing when running such kind of
benchmark is that the results are correct and repeatable (and I mean the
timing and the results of the operations performed).
Did you inspect the value of the array elements (say the first 10
locations)
after the run?
IMO they aren't correct in case of C++.
I will take a look at the code when I find some spare time, but 0.3 sec
for
80 Million iterations smell like "broken optimization".


Willy.

Feb 9 '06 #9
MarkD you may be on to something. I only ran the loop once. C# may have
some type of internal tuning that takes more than one pass. I'll have to
adjust my test and re-run it multiple times to see.
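
One way to structure such a repeated test is to time every run separately, so a slow first run or a later jump is visible instead of being averaged away. The following is only a C++ sketch of that idea; `time_runs` is a hypothetical helper, and it uses `std::chrono::steady_clock` where the posted code uses `DateTime.UtcNow`.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Runs `work` `runs` times and returns the duration of each run in seconds.
std::vector<double> time_runs(const std::function<void()>& work, int runs) {
    std::vector<double> seconds;
    if (runs > 0) {
        seconds.reserve(static_cast<std::size_t>(runs));
    }
    for (int r = 0; r < runs; ++r) {
        auto start = std::chrono::steady_clock::now();
        work();  // the loop under test goes here
        auto stop = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }
    return seconds;
}
```

Calling `time_runs(test, 10)` with the array loop wrapped in `test` would give ten separate timings to compare.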

--
John Mark Howell
blog: http://jmarkhowell.blogspot.com/
Feb 9 '06 #10
"John Mark Howell" <jm*********@hotmail.com> wrote in message
news:%2****************@TK2MSFTNGP12.phx.gbl...
<snip>
double * emisbase = new double[1000,8784];
double * vombase = new double[1000,8784];


I think these are native arrays.
Feb 9 '06 #11
What is 'native'?

--
John Mark Howell
blog: http://jmarkhowell.blogspot.com/
"James Park" <so*****@hotmail.com> wrote in message
news:um*************@TK2MSFTNGP14.phx.gbl...
"John Mark Howell" <jm*********@hotmail.com> wrote in message
news:%2****************@TK2MSFTNGP12.phx.gbl...
<snip>
double * emisbase = new double[1000,8784];
double * vombase = new double[1000,8784];


I think these are native arrays.

Feb 9 '06 #12
Here are some tests that I have done with the code included below. I
compiled it in Release mode and ran it outside of VStudio.

Empty loop
0.01 seconds

count++ only
0.15 seconds

count++ and vals1[x,y] = x*y+1
1.5 seconds

count++ and vals1[x,y] = x*y+1 and vals2[x,y] = x*y+2;

4.5 for the first few runs then it jumped up to about 10 seconds.

It may not give any answers to why (we may just have to accept it as a fact)
but it does give you some ideas as to how much array access and
multiplication decreases performance.
Code I used:
=====================
using System;
namespace SpeedTest
{
    class Class1
    {
        [STAThread]
        static void Main(string[] args)
        {
            for(int x=1; x<=10; x++)
                RunTest();
            Console.ReadLine();
        }
        public static void RunTest()
        {
            DateTime myDateTime1;
            DateTime myDateTime2;
            double differenceInSeconds;
            long count = 0;
            double [,]vals1 = new double[1001,8001];
            double [,]vals2 = new double[1001,8001];
            myDateTime1 = DateTime.UtcNow;
            for(int x=1; x<=1000; x++)
            {
                for(int y=1; y<=8000; y++)
                {
                    for(int z=1; z<=10; z++)
                    {
                        count++;
                        // vals1[x,y] = x*y+1;
                        // vals2[x,y] = x*y+2;
                    }
                }
            }
            myDateTime2 = DateTime.UtcNow;
            TimeSpan ts = myDateTime2 - myDateTime1;
            differenceInSeconds = ts.TotalMilliseconds / 1000;
            Console.WriteLine("{0} loops in {1}Seconds", count, differenceInSeconds);
        }
    }
}
=====================
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Charles Cox
VC/VB/C# Developer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Feb 9 '06 #13
"John Mark Howell" <jm*********@hotmail.com> wrote in message
news:%2****************@tk2msftngp13.phx.gbl...
What is 'native'?


In Managed C++, the statements should be:
Double emisbase[,] = new Double[1000, 8784];
Double vombase[,] = new Double[1000, 8784];

In C++/CLI, the statements should be:
array<double, 2>^ emisbase = gcnew array<double, 2>(1000,8784);
array<double, 2>^ vombase = gcnew array<double, 2>(1000,8784);

What you are really doing right now is:
double * emisbase = new double[8784];
double * vombase = new double[8784];

And statements like:
emisbase[Iteration_Index, Hourly_Index]

turn into:
emisbase[Hourly_Index]

because emisbase doesn't refer to a .NET array. It's just a regular C++
pointer to a regular C++ array.

Once you run apples to apples, it takes just about as long.
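
The reason is the built-in comma operator: it evaluates its left operand, throws the result away, and yields the right operand. A small C++ sketch of the effect follows. It writes the parentheses explicitly, as in `a[(i, h)]`, so that the comma operator is unambiguous; `comma_sized` and `run_buggy_loop` are illustrative names, and a `std::vector` stands in for the raw pointer.

```cpp
#include <cstddef>
#include <vector>

// Size that `new double[(rows, cols)]` asks for: the comma operator evaluates
// and discards `rows`, then yields `cols`.
std::size_t comma_sized(std::size_t rows, std::size_t cols) {
    return (static_cast<void>(rows), cols);
}

// Mimics the posted loop on a buffer sized the same way and returns the
// buffer, so the caller can see how few elements exist and what they hold.
std::vector<double> run_buggy_loop(std::size_t rows, std::size_t cols) {
    std::vector<double> emisbase(comma_sized(rows, cols), 0.0);
    for (std::size_t h = 0; h < cols; ++h) {
        for (std::size_t i = 0; i < rows; ++i) {
            // a[(i, h)] is a[h]: every value of i overwrites the same slot.
            emisbase[(static_cast<void>(i), h)] = static_cast<double>(i * i);
        }
    }
    return emisbase;
}
```

`run_buggy_loop(1000, 8784)` returns only 8784 elements, each holding 998001 (999 squared), rather than a 1000 x 8784 grid.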
Feb 9 '06 #14
Thanks James. That does clarify a lot of questions about the difference.
But do you know of any way to coerce C# to get the performance of a native
array as in C++ or is that just asking too much? Would it be worth it to
have sections of unsafe code?
Feb 9 '06 #15
No, I mean that the C++ figures are wrong because of "broken" optimization.
If I run the code (managed C++) you posted, it takes 0.210 secs to complete;
impressive, but I don't trust it.
Now if I watch the memory consumption when running (1000 iterations to make
it possible to measure), the "private bytes" counter stays at ~9 MB and the
working set at ~6.5 MB, which is the minimum for a ".NET" process; this
smells like aggressive optimization. Why? Well, the program should create two
arrays of doubles (well, the optimizer could optimize away the first array),
each of them 8784*1000*sizeof(double) = 70,272,000 bytes. That means that
the final private bytes count should reach at least 70,272,000 (or
140,544,000) bytes.
So it's clear that NO arrays are created; that would mean both array
creations are optimized away (which is possible because they aren't used
outside the loop, right?).
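
The expected footprint is easy to check. A minimal C++ sketch of the size calculation follows; `array_bytes` is an illustrative name, and the quoted totals assume an 8-byte `double`.

```cpp
#include <cstddef>

// Bytes needed to hold a rows x cols array of double.
std::size_t array_bytes(std::size_t rows, std::size_t cols) {
    return rows * cols * sizeof(double);
}
```

With 8-byte doubles, `array_bytes(1000, 8784)` is 70,272,000, so two such arrays need 140,544,000 bytes, far more than the ~9 MB observed. A single row of 8784 doubles is only 70,272 bytes.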
So I decided to look at the results by including this code after the loop:

for (Hourly_Index = 0; Hourly_Index < 1; Hourly_Index++)
{
    for (Iteration_Index = 0; Iteration_Index < 1000; Iteration_Index++)
    {
        Console::WriteLine("{0}", Convert::ToString(vombase[Iteration_Index, Hourly_Index]));
    }
}
See, I'm only interested in the first 1000 elements, and hey! They are all
0, thus wrong! Then I watched the last elements' values; they are all
998001, which is 999^2, the largest value calculated in the loop. Finally I
checked all the values and guess what: there are only 0 (the first 1,000,000)
and 998001 (the remaining), but still no arrays; the memory consumption
remains the same.
So, this is what I call "broken optimization": the results are wrong, so the
benchmark time is bogus.

Note that I did the same with the C# code, and there the results are correct,
memory consumption is > 140 MB and the benchmark time is correct (5.2 secs
on my box).

Actually I didn't look into the generated code; I will do so if I find some
spare time.

Willy.
Feb 9 '06 #16
"John Mark Howell" <jm*********@hotmail.com> wrote in message
news:ud**************@TK2MSFTNGP11.phx.gbl...
Thanks James. That does clarify a lot of questions about the difference.
But do you know of any way to coerce C# to get the performance of a native
array as in C++ or is that just asking too much?


The implementation given is broken. If you fix it by declaring the arrays:

double (*emisbase)[8784];
double (*vombase)[8784];

and switching all [a,b] to [a][b], it takes as long. Even if you use
straight native C++, it doesn't really help (I just tested it).
Feb 9 '06 #17
This is how the code should look:
...

// changed array declarations
double (*emisbase)[8784];
double (*vombase)[8784];
emisbase = new double[1000][8784]; // allocate native array on the heap
vombase = new double[1000][8784]; // "
...
for (int i1 = 0; i1 < 10; i1++)
{
    for (Hourly_Index = 0; Hourly_Index < 8784; Hourly_Index++)
    {
        for (Iteration_Index = 0; Iteration_Index < 1000; Iteration_Index++)
        {
            emisbase[Iteration_Index][Hourly_Index] = Iteration_Index*Iteration_Index;
            vombase[Iteration_Index][Hourly_Index] = emisbase[Iteration_Index][Iteration_Index];
        }
    }
}

Note that the code you posted only creates arrays of 8784 entries (as per
the C++ standard), which leads to broken MSIL code as explained in my previous
post.
Running the above takes the same time to complete as the C#; note also that
allocation of the managed arrays is somewhat faster in C# than the
native array allocation in C++ (managed code).
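
For anyone who wants to check the results rather than just the timing, here is a self-contained C++ sketch built on the same pointer-to-row declaration. It is only an illustration: `run_pass` and `kHours` are made-up names, the row count is a parameter so a small case can be verified quickly, and it runs a single pass instead of ten.

```cpp
#include <cstddef>

constexpr std::size_t kHours = 8784;  // second dimension, fixed at compile time

// Runs one pass of the thread's loop over rows x kHours native arrays and
// returns vombase[row][hour] so the caller can check the stored value.
// Assumes rows <= kHours, row < rows and hour < kHours.
double run_pass(std::size_t rows, std::size_t row, std::size_t hour) {
    // Pointer to rows of kHours doubles; the trailing () zero-initializes.
    double (*emisbase)[kHours] = new double[rows][kHours]();
    double (*vombase)[kHours] = new double[rows][kHours]();

    for (std::size_t h = 0; h < kHours; ++h) {
        for (std::size_t i = 0; i < rows; ++i) {
            emisbase[i][h] = static_cast<double>(i * i);
            // On a first pass emisbase[i][i] is still 0 until h reaches i.
            vombase[i][h] = emisbase[i][i];
        }
    }

    double result = vombase[row][hour];
    delete[] emisbase;
    delete[] vombase;
    return result;
}
```

On this single pass, `run_pass(10, 3, 5)` yields 9 because column 5 is filled after `emisbase[3][3]` has been written, while `run_pass(10, 7, 2)` yields 0 because column 2 is filled before `emisbase[7][7]` exists.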

Willy.
Feb 9 '06 #18
Ouch. I should have looked closer at the code and tested the results before
posting. That would have avoided some confusion. Thanks for setting me
straight on what was actually happening. After I fixed my code here, you
are correct, I'm getting equivalent speed to the C# code.
"Willy Denoyette [MVP]" <wi*************@telenet.be> wrote in message
news:er**************@TK2MSFTNGP09.phx.gbl...
No, I mean that the C++ figures are wrong because of "broken" optimization.
If I run the (managed C++) code you posted, it takes 0.210 seconds to complete. Impressive, but I don't trust it.
Now if I watch the memory consumption while it runs (with 1000 iterations, to make it possible to measure), the "private bytes" counter stays at ~9 MB and the working set at ~6.5 MB, which is the minimum for a .NET process. This smells like aggressive optimization. Why? Well, the program should create two arrays of doubles (though the optimizer could optimize away the first array), each of them 8784 * 1000 * sizeof(double) = 70,272,000 bytes. That means the final private bytes count should reach at least 70,272,000 bytes (or 140,544,000 bytes for both arrays).
So it's clear that NO arrays are created, which would mean that both array creations are optimized away (which is possible because they aren't used outside the loop, right?).
So I decided to look at the results, by including this code after the loop:
for (Hourly_Index = 0; Hourly_Index < 1; Hourly_Index++)
{
    for (Iteration_Index = 0; Iteration_Index < 1000; Iteration_Index++)
    {
        Console::WriteLine("{0}", Convert::ToString(vombase[Iteration_Index, Hourly_Index]));
    }
}
See, I'm only interested in the first 1000 elements, and hey, they are all 0, thus wrong! Then I looked at the values of the last elements: they are all 998001, which is 999^2, the largest value calculated in the loop. Finally I checked all the values and, guess what, there are only 0 (the first 1,000,000) and 998001 (the remainder), but still no arrays; the memory consumption remains the same.
So this is what I call "broken optimization": the results are wrong, so the benchmark time is bogus.

Note that I did the same with the C# code, and there the results are correct, memory consumption is > 140 MB, and the benchmark time is correct (5.2 seconds on my box).

Actually I didn't look into the generated code; I will do so if I find some spare time.

Willy.
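
As a sanity check on the size estimate in the message above, the arithmetic can be written out as a compile-time calculation. This is only a sketch: the names array_bytes, kRows, kCols, kOneArray and kTwoArrays are made up for illustration, and it assumes an 8-byte double.

```cpp
#include <cstdint>

// Sketch only: all names here are hypothetical, not from the posted code.
static_assert(sizeof(double) == 8, "this sketch assumes an 8-byte double");

// Bytes needed for a dense rows x cols array of double.
constexpr std::uint64_t array_bytes(std::uint64_t rows, std::uint64_t cols) {
    return rows * cols * sizeof(double);
}

// Dimensions taken from the posted benchmark.
constexpr std::uint64_t kRows = 1000;
constexpr std::uint64_t kCols = 8784;

// Expected footprint of one array, and of emisbase plus vombase together.
constexpr std::uint64_t kOneArray = array_bytes(kRows, kCols);
constexpr std::uint64_t kTwoArrays = 2 * kOneArray;
```

Under those assumptions kOneArray is 70,272,000 and kTwoArrays is 140,544,000, so a process whose private bytes stay near 9 MB cannot be holding either array at full size.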

Feb 9 '06 #19
No problem. I was confused because I would have expected the compiler to
flag the array declarations as illegal. Further investigation on my part
showed the 'tiny' 8784-element arrays being created, which rang a bell.

Willy.
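
The 'tiny' arrays follow from how C++ treats a comma inside square brackets: for built-in types, `a, b` is the comma operator, which evaluates `a`, discards it, and yields `b`. So `new double[1000,8784]` asks for 8784 elements, and `x[i, j]` indexes element `j`. The sketch below replays the posted loop nest under that rule. The helper names (comma, elements_as_written, replay_vombase) are hypothetical, the comma expression is wrapped in a function so it compiles without warnings, and the buffers are assumed to start zeroed, which plain `new double[n]` does not guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Result of the built-in comma expression (a, b): a is evaluated and
// discarded, b is the value. Hypothetical helper for illustration.
constexpr std::size_t comma(std::size_t a, std::size_t b) {
    return static_cast<void>(a), b;
}

// Element count actually requested by `new double[1000,8784]`.
constexpr std::size_t elements_as_written() {
    return comma(1000, 8784);
}

// Replays the posted loops on arrays of that size and returns vombase.
// Assumption: both buffers start at zero.
std::vector<double> replay_vombase(int passes) {
    const std::size_t n = elements_as_written();
    assert(n > 999);  // the loop reads emisbase[iter] for iter up to 999
    std::vector<double> emisbase(n, 0.0);
    std::vector<double> vombase(n, 0.0);
    for (int pass = 0; pass < passes; ++pass) {
        for (std::size_t hour = 0; hour < 8784; ++hour) {
            for (std::size_t iter = 0; iter < 1000; ++iter) {
                // emisbase[iter, hour] really addresses emisbase[hour]
                emisbase[comma(iter, hour)] = static_cast<double>(iter * iter);
                // emisbase[iter, iter] really addresses emisbase[iter]
                vombase[comma(iter, hour)] = emisbase[comma(iter, iter)];
            }
        }
    }
    return vombase;
}
```

With these assumptions, replay_vombase(1) holds 0 in its first 999 slots and 998001 from index 999 onward, so only the two values 0 and 998001 ever appear, and no 1000 x 8784 array is ever allocated.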
Feb 9 '06 #20
Hi John,

A couple of things: you might want to use System.Diagnostics.Stopwatch to
do benchmarks. It's more accurate than subtracting two DateTime.UtcNow values.

If you need some really powerful and fast arrays and collections, I
suggest you look at Wintellect's Power Collections. (Free, to boot!)

Paul
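
Stopwatch is a .NET class; for the native C++ side of a comparison like this one, a rough analogue is std::chrono::steady_clock, which is monotonic and intended for measuring intervals. The sketch below is an assumption about how one might time a block of work that way, not something from the thread; mean_seconds and its parameters are hypothetical names. Repeating the work and averaging is one way to address the earlier point that benchmark results should be repeatable.

```cpp
#include <cassert>
#include <chrono>

// Runs `work` `repeats` times and returns the mean wall-clock seconds per
// run, measured with a monotonic clock. Hypothetical helper for illustration.
template <typename F>
double mean_seconds(F&& work, int repeats = 1) {
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double> elapsed = stop - start;
    return elapsed.count() / repeats;
}
```

A call might look like `double t = mean_seconds([&] { run_benchmark(); }, 10);`, where run_benchmark stands in for whatever loop is being measured. As the thread shows, the timing is only meaningful once the results of the work have been checked as well.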

Feb 9 '06 #21
