jo*********@yahoo.com wrote:
I need to copy about 30000 bytes within a single thread but there are
about 9 threads that need to run simultaneously, so the net effect is
270000 bytes. This must all happen well under a second. You're probably
right about having to use unsafe code but that isn't an issue. Anything
that accomplishes the job. Iterating 270000 times sounds kind of
horrifying for performance.
You have to measure.
I've just run a few benchmarks, copying bytes from a byte[270000] to an
int[270, 1000] array using three methods. In all cases, the source and
destination arrays were preallocated to avoid garbage collections
during the benchmark:
[Benchmark]
public static void TestSimple()
{
    for (int i = 0; i < src.Length; i++)
        dest[i / 1000, i % 1000] = src[i];
}

[Benchmark]
public static void TestDouble()
{
    int psrc = 0;
    for (int y = 0; y < 270; y++)
    {
        for (int x = 0; x < 1000; x++)
        {
            dest[y, x] = src[psrc++];
        }
    }
}

[Benchmark]
public unsafe static void TestUnsafe()
{
    fixed (int* fd = dest)
    {
        int* pd = fd;
        fixed (byte* fs = src)
        {
            byte* ps = fs;
            for (int i = 0; i < 270000; i++)
                *pd++ = *ps++;
        }
    }
}
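The methods above rely on static src and dest fields and on a benchmark runner that are not shown. For anyone who wants to reproduce the measurement without that runner, here is a minimal sketch using System.Diagnostics.Stopwatch. The class name ArrayCopyHarness, the TimeMs helper, and the random fill are my own assumptions for illustration, not the code behind the numbers in this post.

```csharp
using System;
using System.Diagnostics;

public static class ArrayCopyHarness
{
    // Hypothetical stand-ins for the preallocated src and dest fields.
    public static readonly byte[] src = new byte[270 * 1000];
    public static readonly int[,] dest = new int[270, 1000];

    // Row-by-row copy, same shape as TestDouble.
    public static void CopyDouble()
    {
        int psrc = 0;
        for (int y = 0; y < 270; y++)
            for (int x = 0; x < 1000; x++)
                dest[y, x] = src[psrc++];
    }

    // Runs the action `iterations` times and returns the mean
    // time per call in milliseconds.
    public static double TimeMs(Action action, int iterations)
    {
        action(); // warm-up call so JIT compilation is not measured
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds / iterations;
    }

    public static void Main()
    {
        new Random(42).NextBytes(src); // arbitrary test data
        double ms = TimeMs(CopyDouble, 1000);
        Console.WriteLine("CopyDouble: {0:F3} ms per call", ms);
    }
}
```

Allocating both arrays once, outside the timed loop, is what keeps garbage collections out of the measurement.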
And the results were as follows:
-------------------------------------------------
Benchmarking 'BenchmarkFX.ArrayTest - Void TestSimple()'
Generating code...
Running benchmark...
Result: Performed 1000 iterations in 8,33078142615884 sec
Single call took: 8,331 ms
-------------------------------------------------
Benchmarking 'BenchmarkFX.ArrayTest - Void TestDouble()'
Generating code...
Running benchmark...
Result: Performed 1000 iterations in 3,05974032866275 sec
Single call took: 3,060 ms
-------------------------------------------------
Benchmarking 'BenchmarkFX.ArrayTest - Void TestUnsafe()'
Generating code...
Running benchmark...
Result: Performed 3000 iterations in 2,56928532893266 sec
Single call took: 856,428 us
Done
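To make the comparison explicit, the per-call times and speedup ratios can be recomputed from the totals and iteration counts in the output above. This is only arithmetic on the reported numbers; the class and method names are hypothetical.

```csharp
using System;

public static class BenchmarkMath
{
    // Mean seconds per call = total seconds / number of iterations.
    public static double PerCallSeconds(double totalSeconds, int iterations)
    {
        return totalSeconds / iterations;
    }

    public static void Main()
    {
        // Totals and iteration counts as reported in the benchmark output.
        double simple = PerCallSeconds(8.33078142615884, 1000);
        double nested = PerCallSeconds(3.05974032866275, 1000);
        double unsafeCopy = PerCallSeconds(2.56928532893266, 3000);

        // Ratios of per-call times: roughly 9.7, 3.6 and 2.7.
        Console.WriteLine("TestSimple / TestUnsafe: {0:F1}", simple / unsafeCopy);
        Console.WriteLine("TestDouble / TestUnsafe: {0:F1}", nested / unsafeCopy);
        Console.WriteLine("TestSimple / TestDouble: {0:F1}", simple / nested);
    }
}
```

Note that TestUnsafe ran 3000 iterations rather than 1000, so the raw totals cannot be compared directly.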
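As you can see, the biggest performance killer appears to be the div /
mod in the simple case: replacing it with nested loops cuts the time
from about 8.3 ms to about 3.1 ms per call. Unsafe code brings that
down to about 0.86 ms, roughly an order of magnitude faster than the
simplest version. Still, even the slowest version copies all 270000
bytes in under 10 ms, so I think the gain from optimizing is negligible
as long as the requirement is just "well under a second".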
All tests were run on a P4C at 2.6 GHz.
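The benchmarks copy all 270000 bytes on one thread, while the question describes about 9 threads copying about 30000 bytes each. A rough sketch of that setup follows, so it can be measured the same way. The layout is an assumption on my part: each thread gets its own contiguous slice of the source and its own block of 30 rows in the destination, and all names are hypothetical. No timing for it is included here.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class ThreadedCopy
{
    const int Threads = 9;            // from the question
    const int BytesPerThread = 30000; // from the question
    const int RowLength = 1000;
    const int RowsPerThread = BytesPerThread / RowLength; // 30 rows

    public static readonly byte[] src = new byte[Threads * BytesPerThread];
    public static readonly int[,] dest = new int[Threads * RowsPerThread, RowLength];

    // Copies one thread's 30000-byte slice into its own block of rows.
    // The slices do not overlap, so no locking is needed.
    public static void CopySlice(int threadIndex)
    {
        int psrc = threadIndex * BytesPerThread;
        int firstRow = threadIndex * RowsPerThread;
        for (int y = firstRow; y < firstRow + RowsPerThread; y++)
            for (int x = 0; x < RowLength; x++)
                dest[y, x] = src[psrc++];
    }

    // Starts all threads, waits for them to finish, and returns the
    // elapsed wall-clock time in milliseconds (thread startup included).
    public static double RunAll()
    {
        Thread[] workers = new Thread[Threads];
        Stopwatch sw = Stopwatch.StartNew();
        for (int t = 0; t < Threads; t++)
        {
            int index = t; // per-iteration copy for the lambda to capture
            workers[t] = new Thread(() => CopySlice(index));
            workers[t].Start();
        }
        foreach (Thread w in workers)
            w.Join();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        new Random(1).NextBytes(src); // arbitrary test data
        Console.WriteLine("9 threads, 270000 bytes: {0:F3} ms", RunAll());
    }
}
```

For a copy this small, creating the threads may well cost more than the copy itself, which is one more reason to measure before optimizing.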
HTH,
Stefan