rayreeves wrote:
This self-appointed project is to produce the world's fastest prime number
generator, where speed is the only criterion.
For example, on my 1300M processor my C++ code found the 5 000 000 primes
less than 100 000 000 in 14 secs. Unfortunately, I lost my hard drive and
all my source so I have to rewrite it, and it seems like a good time to try
in C#.
I can see that I have been under some misapprehensions about arrays and
references but I now understand that if I declare a class PRIMEVALUE with
two int fields p2 and pn and then declare an array pq = new PRIMEVALUE[some
size] it will take a suitable space on the heap and I can then populate the
elements with pq[i] = new PRIMEVALUE(p2Value, pnValue) and those element
values will also be on the heap. Now I assume that when I interchange
elements of pq
with swap(pq[i], pq[j]) the references will be exchanged but the values
remain where they are.
I also assume that references are the size of one int, so moving them should
be faster than moving two ints.
Not necessarily. Consider the pros and the cons of the two scenarios.
1. Make PrimeValue a struct. It is therefore a value type and lives
directly within the array, which is allocated as one block on the heap.
o You must swap PrimeValue values, which are 64 bytes each, rather than
pointers, which are 32 bytes each (until you switch to 64-bit, at which
point they will be the same size as PrimeValue, unless you want
PrimeValue of 64+64 = 128 bits...).
o Single pointer dereference to get at any element of the array: array
address plus offset gets you the value.
o Values are close together, and so chunks of the array will fit in the
on-processor cache. If you are exchanging adjacent elements of part of
a sort, this will be FAST. (Of course, it depends upon which sort
you're using and how the processor manages the cache....)
o No garbage collection (apart from the array as a whole). (See below.)
2. Make PrimeValue a class. It is therefore a reference type, and each
PrimeValue will be allocated separately on the heap. The array will be
an array of pointers, and only pointers will be exchanged.
o You can swap pointers rather than PrimeValue values.
o Double pointer dereference to get at any element of the array: array
address plus offset gets you the address, which you must then
dereference again to get the value.
o Values are scattered about in memory, and are (by definition) no kept
with any part of the array (except perhaps the start and the end). In
order to follow the double-reference noted above, the processor must
fetch a chunk of the array and then a chunk of memory where the value
is stored. Poor use of on-board cache, probably resulting in
significant performance degradation.
o Garbage collection needed for each array element. (See below.)
Every msec counts! Garbage collection, I think, is irrelevant.
It may well be, depending upon how much computation is done for each
PrimeValue creation / destruction. If the PrimeValues are allocated on
start-up, and deallocated only when the application terminates, and
there's a lot of computation in between, then GC is an overhead cost
and really doesn't contribute much. If you're constantly creating new
PrimeValues and destroying them (or, if they're value types, boxing
them and then abandoning the boxed copy) it could come with a
significant penalty. It all depends upon your algorithm, and how you
design the PrimeValue type.
In C++ I understood perfectly well what was going on, but C# conceals so
much that I have some doubts that it is suitable for this purpose.
I don't know that C# conceals all that much, really. You just have to
understand value versus reference types and what is happening in each
case. It all relates back to what's going on under the covers. It's
just unfamiliar.
In your case, I believe that the benefits of having the 64 (or 128) bit
values directly in an array outweight the benefits of swapping only
pointers. Particularly the superior use of on-board cache should give
you a big performance boost. At the very least, keep in mind that all
memory accesses are not created equal, and that if you want raw speed,
how you lay things out in memory has a significant effect, so it's not
just how many bits you're moving around, it's also which memory you're
moving them about in.
As well, don't forget the cost of the double pointer dereference.
That's four fetches for every value comparison, not just two. Since in
any sort there are typically more comparisons than swaps, you will lose
performance there.
So it's not cut-and-dried: pointers mean moving less data, but at the
cost of an extra dereference and poorer use of on-board cache; values
mean moving more data, but save you a pointer dereference and make
better use of on-board cache. You may have to write both and benchmark
the two programs. :-)