Gennaro Prota wrote:
In 2003, libstdc++ switched from the table-based SGI implementation to
using builtins. I'd be curious to know how does dynamic_bitset< unsigned
long >::count() compares, on your machine; I'd expect it to be close to
the old libstdc++ (at least if you have CHAR_BIT == 8 and no padding
bits in unsigned longs :-)).
My computer is a 3.4GHz Pentium4, and I'm using gcc 4.1.2 on a linux
system. I created a bitset with 2100000000 elements, with approximately
10% of the bits set (more precisely, discarding all the even numbers,
all the primes between 2 and 4200000000 are set as a set bit; the total
amount of set bits is 198996103).
I compiled the program with "-O3 -march=pentium4".
std::bitset::count() took approximately 590 ms.
boost::dynamic_bitset::count() took approximately 250 ms.
The latter seems to be significantly faster.
If my calculations are correct, this means that the former uses in
average approximately 0.95 clock cycles per bit, while the latter uses
in average only 0.4 clock cycles per bit.
That's *fast* bit counting...