473,403 Members | 2,293 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,403 software developers and data experts.

Shootout: the benchmarks game.

Next:

fannkuch benchmark
(c++)
http://shootout.alioth.debian.org/gp...&lang=gpp&id=2
(java)

g++ -pipe -O3 -fomit-frame-pointer -finline-functions fannkuch.cpp -o
fannkuch.exe -march=native

jc -inline+ fannkuch.class

Time in secs as N increases
N 9 10 11
g++ 0.047s 0.392s 5.013s
JVM 0.180s 0.689s 7.057s (-server -Xms64m)
JET 0.278s 0.820s 7.792s

C++ version is faster; however, (given JIT compilation takes time),
the difference decreases with n = 11. With n = 12 (not included in
shootout site), the difference is even smaller

1m34.648s (-server -Xms64m)
1m7.465s (g++)

In any case, JVM is again (slightly) faster than JET native compiler
in this case.
Jun 27 '08 #1
31 1635
On Sun, 20 Apr 2008 18:40:34 -0500, Razii <wh**********@hotmail.com>
wrote:
>fannkuch benchmark
(c++)
http://shootout.alioth.debian.org/gp...&lang=gpp&id=2
(java)
http://shootout.alioth.debian.org/gp...lang=java&id=2
Jun 27 '08 #2
"Razii" <wh**********@hotmail.comwrote in message
news:pg********************************@4ax.com...
Next:

fannkuch benchmark
(c++)
http://shootout.alioth.debian.org/gp...&lang=gpp&id=2
(java)

g++ -pipe -O3 -fomit-frame-pointer -finline-functions fannkuch.cpp -o
fannkuch.exe -march=native

jc -inline+ fannkuch.class

Time in secs as N increases
N 9 10 11
g++ 0.047s 0.392s 5.013s
JVM 0.180s 0.689s 7.057s (-server -Xms64m)
JET 0.278s 0.820s 7.792s

C++ version is faster; however, (given JIT compilation takes time),
the difference decreases with n = 11. With n = 12 (not included in
shootout site), the difference is even smaller

1m34.648s (-server -Xms64m)
1m7.465s (g++)

In any case, JVM is again (slightly) faster than JET native compiler
in this case.
Try these with N = 9, 10, 11 Please!

C++: http://pastebin.com/m6db54f31
Java: http://pastebin.com/m15714c90
Thank you. Here is results I get on a P4 3.06ghz 512mb system:


Build Flags
__________________________________________________ _____
C++: /O2 /Ob2 /Oi /Ot /Oy /GL /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D
"_UNICODE" /D "UNICODE" /FD /EHsc /MT /Fo"Release\\" /Fd"Release\vc80.pdb"
/W3 /nologo /c /Wp64 /Zi /Gr /TP /errorReport:prompt

Java: -server -Xms64m


Test Results
P4 3.06ghz 512mb
__________________________________________________ _____
C++: N = 12 / Time: 47 ms
Java: N = 12 / Time: 94 ms
-------------------------------------------------------
C++: N = 13 / Time: 79 ms
Java: N = 13 / Time: 140 ms
-------------------------------------------------------
C++: N = 14 / Time: 172 ms
Java: N = 14 / Time: 250 ms
-------------------------------------------------------
C++: N = 15 / Time: 344 ms
Java: N = 15 / Time: 485 ms
-------------------------------------------------------
C++: N = 16 / Time: 828 ms
Java: N = 16 / Time: 1156 ms
-------------------------------------------------------
C++: N = 17 / Time: 1954 ms
Java: N = 17 / Time: 2359 ms
-------------------------------------------------------
C++: N = 18 / Time: 4485 ms
Java: N = 18 / Time: 7468 ms
-------------------------------------------------------
C++: N = 19 / Time: 11609 ms
Java: N = 19 / Time: 15047 ms
-------------------------------------------------------
C++: N = 20 / Time: 58891 ms
Java: N = 20 / CRASHED! OutOfMemoryError: Java Heap space
-------------------------------------------------------
C++: N = 21 / BOOOOOOM!
Java: N = 21 / CRASHED! OutOfMemoryError: Java Heap space
-------------------------------------------------------

Any thoughts?

Jun 27 '08 #3
"Chris Thomasson" <cr*****@comcast.netwrote in message
news:Dv******************************@comcast.com. ..
[...]
Try these with N = 9, 10, 11 Please!

C++: http://pastebin.com/m6db54f31
Java: http://pastebin.com/m15714c90
Thank you. Here is results I get on a P4 3.06ghz 512mb system:


Build Flags
__________________________________________________ _____
C++: /O2 /Ob2 /Oi /Ot /Oy /GL /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D
[...]

C++ == VC8

Jun 27 '08 #4

"Chris Thomasson" <cr*****@comcast.netwrote in message
news:Dv******************************@comcast.com. ..
"Razii" <wh**********@hotmail.comwrote in message
news:pg********************************@4ax.com...
>Next:

fannkuch benchmark
(c++)
http://shootout.alioth.debian.org/gp...&lang=gpp&id=2
(java)

g++ -pipe -O3 -fomit-frame-pointer -finline-functions fannkuch.cpp -o
fannkuch.exe -march=native

jc -inline+ fannkuch.class

Time in secs as N increases
N 9 10 11
g++ 0.047s 0.392s 5.013s
JVM 0.180s 0.689s 7.057s (-server -Xms64m)
JET 0.278s 0.820s 7.792s

C++ version is faster; however, (given JIT compilation takes time),
the difference decreases with n = 11. With n = 12 (not included in
shootout site), the difference is even smaller

1m34.648s (-server -Xms64m)
1m7.465s (g++)

In any case, JVM is again (slightly) faster than JET native compiler
in this case.

Try these with N = 9, 10, 11 Please!

C++: http://pastebin.com/m6db54f31
Java: http://pastebin.com/m15714c90

This took 15 minutes of my time.

Jun 27 '08 #5
Lew
Chris Thomasson wrote:
This took 15 minutes of my time.
Probably fifteen more minutes than it merited.

--
Lew
Jun 27 '08 #6
"Lew" <le*@lewscanon.comwrote in message
news:48******************************@comcast.com. ..
Chris Thomasson wrote:
>This took 15 minutes of my time.

Probably fifteen more minutes than it merited.
:^o

Jun 27 '08 #7

"Razii" <wh**********@hotmail.comwrote in message
news:pg********************************@4ax.com...
Next:

fannkuch benchmark
(c++)
http://shootout.alioth.debian.org/gp...&lang=gpp&id=2
(java)

g++ -pipe -O3 -fomit-frame-pointer -finline-functions fannkuch.cpp -o
fannkuch.exe -march=native

jc -inline+ fannkuch.class

Time in secs as N increases
N 9 10 11
g++ 0.047s 0.392s 5.013s
JVM 0.180s 0.689s 7.057s (-server -Xms64m)
JET 0.278s 0.820s 7.792s

C++ version is faster; however, (given JIT compilation takes time),
the difference decreases with n = 11. With n = 12 (not included in
shootout site), the difference is even smaller

1m34.648s (-server -Xms64m)
1m7.465s (g++)

In any case, JVM is again (slightly) faster than JET native compiler
in this case.
Strange... I would always assume that native code compiled with JET would
outperform JIT compilation. Odd. I guess its a QOI issue wrt the nature of
the JET compiler.

Jun 27 '08 #8
On Sun, 20 Apr 2008 21:13:30 -0700, "Chris Thomasson"
<cr*****@comcast.netwrote:
>Try these with N = 9, 10, 11 Please!

C++: http://pastebin.com/m6db54f31
Java: http://pastebin.com/m15714c90
What does this have to do with fannkuch benchmark benchmark? You are
responding to the wrong post. I am giong to start a new thread with
the third benchmark because once again I don't follow what is going
again.

The output of our C++ version is not correct (at least not with g++ I
haven't tried VC++ yet)

binarytrees2.12
Static Region Reset
Static Region De-Activate
Static Region Activate
Static Region Reset
Static Region Reset
Static Region Reset
Static Region Reset
Static Region Reset
Static Region Reset
Time: 187 ms

The output should look like this for n = 12

Java -server -Xms64m binarytrees 12
stretch tree of depth 13 check: -1
8192 trees of depth 4 check: -8192
2048 trees of depth 6 check: -2048
512 trees of depth 8 check: -512
128 trees of depth 10 check: -128
32 trees of depth 12 check: -32
long lived tree of depth 12 check: -1
Jun 27 '08 #9
On Mon, 21 Apr 2008 01:03:08 -0500, Razii <wh**********@hotmail.com>
wrote:
>binarytrees2.12
Static Region Reset
Static Region De-Activate
Static Region Activate
Static Region Reset
Static Region Reset
Static Region Reset
Static Region Reset
Static Region Reset
Static Region Reset
Time: 187 ms
Ops never mind. That was caused by not including /NDEBUG "DEBUG" ..
yuck.

Jun 27 '08 #10
>This took 15 minutes of my time.
The C++ version crashes on my comp with n = 20

"binarytrees2.exe has encountered problem and needs to be closed
Please tell Microsoft about this problem."

The c++ version compiled with the follwoing flags is slower than java
on my comp

cl /O2 /GL /D "NDEBUG" binarytrees2.cpp /link /ltcg

time binarytrees2 18 >/dev/null
real 0m6.845s
user 0m0.015s
sys 0m0.000s

$ time java -server -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 18
>/dev/null
real 0m5.316s
user 0m0.015s
sys 0m0.015s
>C++: N = 20 / Time: 58891 ms
Java: N = 20 / CRASHED! OutOfMemoryError: Java Heap space
If you are going to use such high numbers, increase heap size. The
default is 64m. Try this

java -server -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 20

Jun 27 '08 #11
On Sun, 20 Apr 2008 21:34:03 -0700, "Chris Thomasson"
<cr*****@comcast.netwrote:
>"Lew" <le*@lewscanon.comwrote in message
news:48******************************@comcast.com ...
>Chris Thomasson wrote:
>>This took 15 minutes of my time.

Probably fifteen more minutes than it merited.
Strangely, it crashes with n = 20 with the following flags

cl /O2 /GL /D "NDEBUG" binarytrees2.cpp /link /ltcg

However, it worked with g++....

g++ -pipe -O3 -DNDEBUG -fomit-frame-pointer -finline-functions binar
ytrees2.cpp -o binarytrees2.exe -march=native

$ time binarytrees2 20 >/dev/null

real 0m57.075s
user 0m0.015s
sys 0m0.015s

but was slower than java

$ time java -server -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 20
>/dev/null
real 0m21.687s
user 0m0.015s
sys 0m0.000s

Jun 27 '08 #12
On Sun, 20 Apr 2008 21:13:30 -0700, "Chris Thomasson"
<cr*****@comcast.netwrote:
>-------------------------------------------------------
C++: N = 16 / Time: 828 ms
Java: N = 16 / Time: 1156 ms
-------------------------------------------------------
C++: N = 17 / Time: 1954 ms
Java: N = 17 / Time: 2359 ms
-------------------------------------------------------
C++: N = 18 / Time: 4485 ms
Java: N = 18 / Time: 7468 ms
-------------------------------------------------------
C++: N = 19 / Time: 11609 ms
Java: N = 19 / Time: 15047 ms
the problem with these high n is that you are restricting java to use
only 64m. If yuo try it with

java -server -Xms512m -Xmx512m

it will improve the result...
Jun 27 '08 #13
On Sun, 20 Apr 2008 21:13:30 -0700, "Chris Thomasson"
<cr*****@comcast.netwrote:
>C++: N = 20 / Time: 58891 ms
Java: N = 20 / CRASHED! OutOfMemoryError: Java Heap space
And for n = 20, this should

java -server -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 20
Jun 27 '08 #14
Lew
Razii wrote:
responding to the wrong post. I am giong to start a new thread
Please don't.

--
Lew
Jun 27 '08 #15
On Sun, 20 Apr 2008 21:13:30 -0700, "Chris Thomasson"
<cr*****@comcast.netwrote:
>Try these with N = 9, 10, 11 Please!

C++: http://pastebin.com/m6db54f31
Java: http://pastebin.com/m15714c90
I guess I figure out why I get different time than you (at least in
this case). The c++ version takes much longer to quit after printing
time

$ time java -server -Xms64m binarytrees2 16
stretch tree of depth 17 check: -1
131072 trees of depth 4 check: -131072
32768 trees of depth 6 check: -32768
8192 trees of depth 8 check: -8192
2048 trees of depth 10 check: -2048
512 trees of depth 12 check: -512
128 trees of depth 14 check: -128
32 trees of depth 16 check: -32
long lived tree of depth 16 check: -1
Time: 1156 ms

real 0m1.277s
user 0m0.015s
sys 0m0.015s

note that it say 1156 ms time and time reported by time command is
pretty close behind 0m1.277s

Now the C++ version

$ time binarytrees2 16
stretch tree of depth 17 check: -1
131072 trees of depth 4 check: -131072
32768 trees of depth 6 check: -32768
8192 trees of depth 8 check: -8192
2048 trees of depth 10 check: -2048
512 trees of depth 12 check: -512
128 trees of depth 14 check: -128
32 trees of depth 16 check: -32
long lived tree of depth 16 check: -1
Time: 843 ms

real 0m1.614s
user 0m0.030s
sys 0m0.000s

Huh? 843 ms reported by internal time (faster than java's 1156 ms but
time reported by time command is 1.614s? The C++ version took longer
to quit after printing 843 ms.
Jun 27 '08 #16
"Razii" <wh**********@hotmail.comwrote in message
news:2l********************************@4ax.com...
On Sun, 20 Apr 2008 21:34:03 -0700, "Chris Thomasson"
<cr*****@comcast.netwrote:
>>"Lew" <le*@lewscanon.comwrote in message
news:48******************************@comcast.co m...
>>Chris Thomasson wrote:
This took 15 minutes of my time.

Probably fifteen more minutes than it merited.

Strangely, it crashes with n = 20 with the following flags

cl /O2 /GL /D "NDEBUG" binarytrees2.cpp /link /ltcg

However, it worked with g++....
WIERD!

g++ -pipe -O3 -DNDEBUG -fomit-frame-pointer -finline-functions binar
ytrees2.cpp -o binarytrees2.exe -march=native

$ time binarytrees2 20 >/dev/null

real 0m57.075s
user 0m0.015s
sys 0m0.015s

but was slower than java

$ time java -server -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 20
>>/dev/null

real 0m21.687s
user 0m0.015s
sys 0m0.000s
What times does the Java and C++ version output for n = 20?
Time: == What?
I am not interested in measuring shutdown time. I will run the tests again
with a bigger heap setting.

Jun 27 '08 #17
I'm curious to know how fast this program would be when translated to
Java. In my computer (3.4GHz Pentium4) using gcc 4.1.2 (with the options
-O3 -march=pentium4 -fomit-frame-pointer) it takes 1 min and 17 seconds,
while taking 252 MB of memory (according to 'top').

The task the program solves is in the comment. (Note that I'm not
really asking how fast the *problem* can be solved in Java, but how fast
it can be solved by using this same algorithm/technique.)
/*
Find the largest n for which:

1) n is a prime.
2) The nth prime is smaller than 4200000000.

As the answer, print both n and the nth prime.
And while you are at it, also print the amount of
primes smaller than 4200000000.
*/

#include <bitset>
#include <iostream>

namespace
{
const unsigned MAX_VALUE = 4200000000U;
std::bitset<MAX_VALUE/2prime;

// http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
void init()
{
prime.set();
for(unsigned n = 3; n*n < MAX_VALUE; n += 2)
if(prime.test(n/2))
for(unsigned index = n*n; index < MAX_VALUE; index += n)
if(index % 2)
prime.reset(index/2);
}
}

#include <ctime>

int main()
{
const clock_t t1 = std::clock();
init();

unsigned count = prime.count(), p = MAX_VALUE+1;

std::cout << "There are " << count << " primes smaller than "
<< MAX_VALUE << std::endl;

while(true)
{
do p -= 2; while(!prime.test(p/2));
if(count%2 && prime.test(count/2)) break;
--count;
}

const int seconds = (std::clock() - t1)/CLOCKS_PER_SEC;
std::cout << count << "\n" << p << std::endl
<< seconds/60 << " mins " << seconds%60 << " s\n";
}
Jun 27 '08 #18
"Razii" <wh**********@hotmail.comwrote in message
news:2l********************************@4ax.com...
On Sun, 20 Apr 2008 21:34:03 -0700, "Chris Thomasson"
<cr*****@comcast.netwrote:
>>"Lew" <le*@lewscanon.comwrote in message
news:48******************************@comcast.co m...
>>Chris Thomasson wrote:
This took 15 minutes of my time.

Probably fifteen more minutes than it merited.

Strangely, it crashes with n = 20 with the following flags

cl /O2 /GL /D "NDEBUG" binarytrees2.cpp /link /ltcg

However, it worked with g++....
I know exactly what's causing the delayed shutdown time, and why it crashed.
Apparently, VC++ did not like the static allocator. G++ handled it better,
but it still took a long time to shutdown. Anyway, here is a fix:

http://pastebin.com/m7aca890a
Here is the new times I get:

Java: -server -Xms512m -Xmx512m -XX:NewRatio=1
__________________________________________________ _____
C++: N = 12 / Time: 46 ms
Java: N = 12 / Time: 94 ms
-------------------------------------------------------
C++: N = 13 / Time: 109 ms
Java: N = 13 / Time: 149 ms
-------------------------------------------------------
C++: N = 14 / Time: 281 ms
Java: N = 14 / Time: 297 ms
-------------------------------------------------------
C++: N = 15 / Time: 531 ms
Java: N = 15 / Time: 547 ms
-------------------------------------------------------
C++: N = 16 / Time: 1265 ms
Java: N = 16 / Time: 1078 ms
-------------------------------------------------------
C++: N = 17 / Time: 2140 ms
Java: N = 17 / Time: 1875 ms
-------------------------------------------------------
C++: N = 18 / Time: 4650 ms
Java: N = 18 / Time: 4265 ms
-------------------------------------------------------
C++: N = 19 / Time: 9450 ms
Java: N = 19 / Time: 9109 ms
-------------------------------------------------------
C++: N = 20 / Time: 21671 ms
Java: N = 20 / Time: 23819 ms
-------------------------------------------------------
C++: N = 21 / Time: 42992 ms
Java: N = 21 / Time: 41187 ms
-------------------------------------------------------

The java settings allowed it to complete the N 19 tests. The C++ version
does not have the static allocator anymore, and all the problems related to
it are gone. Can you test this one please? Its going to have a dramatically
faster shutdown time. Thanks Razii.

Jun 27 '08 #19
Juha Nieminen wrote:
for(unsigned index = n*n; index < MAX_VALUE; index += n)
if(index % 2)
prime.reset(index/2);
Actually that can be replaced with:

for(unsigned index = n*n; index < MAX_VALUE; index += n*2)
prime.reset(index/2);

OTOH, the speedup is rather small. The runtime was reduced to 1 min
and 13 seconds.
Jun 27 '08 #20
On Mon, 21 Apr 2008 15:35:43 +0300, Juha Nieminen
<no****@thanks.invalidwrote:
const unsigned MAX_VALUE = 4200000000U;
huh.
std::bitset<MAX_VALUE/2prime;

// http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
void init()
{
prime.set();
for(unsigned n = 3; n*n < MAX_VALUE; n += 2)
I am not sure about 64 bit JVM, but on 32 bit (that I have), java
version would be 10 or more times slower because there is no unsigned
in java. That means there is no choice but to use long in all these
loops, which would make it much slower.

Another problem is that this benchmark really depends on how the
Bitsets class is implemented in java. Perhaps C++ library implements
it in more efficient way.

I am going to use MAX_VALUE that stays within the bounds of int (that
also means the index shouldn't go negative also in this loop)

for(int index = n*n;index < MAX_VALUE;index += n)

(c++)
There are 54400028 primes smaller than 1073741824
54399997
1073741173
0 mins 23 s

(java) http://pastebin.com/f58cf4d36
There are 54400028 primes smaller than 1073741824
54399997
1073741173
Time: 47734 ms

java version is still two times slower (it still could be due to
inefficient implementation of Bitset in the library).

By the way, the c++ version crashes with MAX_VALUE == odd number.
Jun 27 '08 #21
On Mon, 21 Apr 2008 05:49:06 -0700, "Chris Thomasson"
<cr*****@comcast.netwrote:
>C++: N = 12 / Time: 46 ms
Java: N = 12 / Time: 94 ms
-------------------------------------------------------
C++: N = 13 / Time: 109 ms
Java: N = 13 / Time: 149 ms
-------------------------------------------------------
C++: N = 14 / Time: 281 ms
Java: N = 14 / Time: 297 ms
-------------------------------------------------------
C++: N = 15 / Time: 531 ms
Java: N = 15 / Time: 547 ms
-------------------------------------------------------
C++: N = 16 / Time: 1265 ms
Java: N = 16 / Time: 1078 ms
-------------------------------------------------------
C++: N = 17 / Time: 2140 ms
Java: N = 17 / Time: 1875 ms
-------------------------------------------------------
C++: N = 18 / Time: 4650 ms
Java: N = 18 / Time: 4265 ms
-------------------------------------------------------
C++: N = 19 / Time: 9450 ms
Java: N = 19 / Time: 9109 ms
-------------------------------------------------------
C++: N = 20 / Time: 21671 ms
Java: N = 20 / Time: 23819 ms
-------------------------------------------------------
C++: N = 21 / Time: 42992 ms
Java: N = 21 / Time: 41187 ms
-------------------------------------------------------
I get similar differences (except for n=20).

Jun 27 '08 #22
Razii wrote:
I am going to use MAX_VALUE that stays within the bounds of int (that
also means the index shouldn't go negative also in this loop)
I think you can safely use a MAX_VALUE of 2000000000 (and in fact a
bit higher) without running into problems.
Jun 27 '08 #23
"Razii" <wh**********@hotmail.comwrote in message
news:46********************************@4ax.com...
On Mon, 21 Apr 2008 05:49:06 -0700, "Chris Thomasson"
<cr*****@comcast.netwrote:
>>C++: N = 12 / Time: 46 ms
Java: N = 12 / Time: 94 ms
-------------------------------------------------------
C++: N = 13 / Time: 109 ms
Java: N = 13 / Time: 149 ms
-------------------------------------------------------
C++: N = 14 / Time: 281 ms
Java: N = 14 / Time: 297 ms
-------------------------------------------------------
C++: N = 15 / Time: 531 ms
Java: N = 15 / Time: 547 ms
-------------------------------------------------------
C++: N = 16 / Time: 1265 ms
Java: N = 16 / Time: 1078 ms
-------------------------------------------------------
C++: N = 17 / Time: 2140 ms
Java: N = 17 / Time: 1875 ms
-------------------------------------------------------
C++: N = 18 / Time: 4650 ms
Java: N = 18 / Time: 4265 ms
-------------------------------------------------------
C++: N = 19 / Time: 9450 ms
Java: N = 19 / Time: 9109 ms
-------------------------------------------------------
C++: N = 20 / Time: 21671 ms
Java: N = 20 / Time: 23819 ms
-------------------------------------------------------
C++: N = 21 / Time: 42992 ms
Java: N = 21 / Time: 41187 ms
-------------------------------------------------------

I get similar differences (except for n=20).
Yeah. I don't know what happened there. I should have ran each test 5 times
and given an average. Java is pretty damn hard to beat sometimes!

:^D

Jun 27 '08 #24
Razii wrote:
Another problem is that this benchmark really depends on how the
Bitsets class is implemented in java. Perhaps C++ library implements
it in more efficient way.
Definitely. A Java BitSet instance is not fixed in size and can
grow and shrink - and there's quite a bit of code that results from
this. (Clearing a bit, for example, results in a recalculation of the
number of words required to hold the remaining bits - something that
only pays off if you're clearing the rightmost 'on' bits in the set)!
The emphasis of the implementation appears to be more on efficient
space use over the cost of bitwise operations.

Given that this benchmark is really only using a bitset as a (sorta)
space efficient mapping of numbers to primality and makes very
little use of the other bitset features in either C++ or Java, it would
make sense to implement your own more primitive mapping facility
("FixedSizeBitMap", perhaps).

--
Steve Wampler -- sw******@noao.edu
The gods that smiled on your birth are now laughing out loud.
Jun 27 '08 #25
Steve Wampler wrote:
Given that this benchmark is really only using a bitset as a (sorta)
space efficient mapping of numbers to primality and makes very
little use of the other bitset features in either C++ or Java
There's one additional thing where a bitset feature is used
effetively: To count the number of set bits.

I measured how long it takes for the gcc implementation to do so. The
total amount of bits is 2100000000 (ie. half of the max value), from
which 198996103 are 1-bits (ie. the amount of primes smaller than
4200000000). That is, approximately 10% of (quite random) bits are on.

It takes my computer approximately 0.6 seconds to count the number of
1-bits in that bitset. That's approximately one bit test per clock cycle
in average. I'd call that efficient.
Jun 27 '08 #26
On Mon, 21 Apr 2008 22:56:24 +0300, Juha Nieminen
<no****@thanks.invalidwrote:
>It takes my computer approximately 0.6 seconds to count the number of
1-bits in that bitset. That's approximately one bit test per clock cycle
in average. I'd call that efficient.
Are you on 32-bit OS or 64-bit? If 64-bit, can you compare java vs C++
version? BitSet class uses long arrays of "words". Just make sure to
use -server and enough Xmx value that it doesn't throw out of memory
exception.
Jun 27 '08 #27
Razii wrote:
On Mon, 21 Apr 2008 22:56:24 +0300, Juha Nieminen
<no****@thanks.invalidwrote:
>It takes my computer approximately 0.6 seconds to count the number of
1-bits in that bitset. That's approximately one bit test per clock cycle
in average. I'd call that efficient.

Are you on 32-bit OS or 64-bit? If 64-bit, can you compare java vs C++
version? BitSet class uses long arrays of "words". Just make sure to
use -server and enough Xmx value that it doesn't throw out of memory
exception.
Pentium4 is a 32-bit system. Unfortunately I don't have javac
installed here.

(Btw, if you are curious to know how it's even possible for that
function to be so fast, AFAIK it uses this algorithm:
http://en.wikipedia.org/wiki/Hamming_weight )
Jun 27 '08 #28
ldv
On Apr 21, 12:55*pm, "Chris Thomasson" <cris...@comcast.netwrote:
"Razii" <whatever1...@hotmail.comwrote in message

news:pg********************************@4ax.com...


Next:
fannkuch benchmark
(c++)
http://shootout.alioth.debian.org/gp...t=fannkuch&lan....
(java)
g++ -pipe -O3 -fomit-frame-pointer -finline-functions fannkuch.cpp -o
fannkuch.exe -march=native
jc -inline+ fannkuch.class
Time in secs as N increases
N * * * *9 * * * * 10 * * * 11
g++ * *0.047s * *0.392s * *5.013s
JVM * *0.180s * *0.689s * *7.057s (-server -Xms64m)
JET* *0.278s * *0.820s * *7.792s
C++ version is faster; however, (given JIT compilation takes time),
the difference decreases with n = 11. With n = 12 (not included in
shootout site), the difference is even smaller
1m34.648s (-server -Xms64m)
1m7.465s *(g++)
In any case, JVM is again (slightly) faster thanJETnative compiler
in this case.

Strange... I would always assume that native code compiled withJETwould
outperform JIT compilation. Odd. I guess its a QOI issue wrt the nature of
theJETcompiler.
Well, just as HotSpot 6 is in many cases faster than HotSpot 5,
Excelsior claims that Excelsior JET 6.4 beta is 15%-20% faster than
6.0 _on some becnhmarks_, so they surely have more room for
improvement. :)

Any complicated piece of engineering work is full of tradeoffs. For
instance, I know for sure that the HotSpot VM allocates objects on the
heap faster than Excelsior JET VM. But full GC is slower on HotSpot,
dramatically slower if some objects die after being promoted to the
old generation. If GC pauses annoy your users, Excelsior JET may be a
better soluition even if it does not improve overall performance.

http://www.experimentalstuff.com/Tec...old/index.html

LDV
Jun 27 '08 #29
On Mon, 21 Apr 2008 16:35:42 +0300, Juha Nieminen
<no****@thanks.invalidwrote:
>for(unsigned index = n*n; index < MAX_VALUE; index += n*2)
this makes 4 sec differences with the newer java version (that
non-libray bitset).

Jun 27 '08 #30
On Mon, 21 Apr 2008 12:01:44 -0700, Steve Wampler <sw******@noao.edu>
wrote:
>Given that this benchmark is really only using a bitset as a (sorta)
space efficient mapping of numbers to primality and makes very
little use of the other bitset features in either C++ or Java, it would
make sense to implement your own more primitive mapping facility
("FixedSizeBitMap", perhaps).
I implemented my own bitset (in fact just copied the part that I
needed from API). The bitset class does not always grow and shrink. It
has a boolean sizeIsSticky = false; which is set to true if the user
uses the constructor BitSet(int nbits).

However, there were still lots of checks ad codes. Using own my own
bitset improved the speed by around 10 sec. Changing the word long[]
words to int [] words further improved the speed by 4 sec.

so here is the c++ version (fixed index += n*2)
http://pastebin.com/fc1186b7

0 mins 21 s (g++ -pipe -O3 -fomit-frame-pointer -march=native)

java version
http://pastebin.com/fef4e048

23813 ms (-server -Xms128m -Xmx128m)
69782 ms (-Xms128m -Xmx128m, i.e -client)
29531 ms (Jet native compiler)

As always -server VM makes a huge difference and is pretty close to
C++ version. It's also again faster than Jet too.

Jun 27 '08 #31
Juha Nieminen wrote:
(Btw, if you are curious to know how it's even possible for that
function to be so fast, AFAIK it uses this algorithm:
http://en.wikipedia.org/wiki/Hamming_weight )
That's most likely the same as in Java's Long/Integer.bitCount()
methods, I think.

The BitSet in Java suffers from the use of longs when run in 32bit
mode, for operations on individual bits. Long operations such as
bit shifting are double-word operations in a 32bit world.

It would be more efficient in this case to use a (simplified) bit
set class that uses ints internally instead of longs. Perhaps
considerably so for the current benchmark.

--
Steve Wampler -- sw******@noao.edu
The gods that smiled on your birth are now laughing out loud.
Jun 27 '08 #32

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

14
by: bearophileHUGS | last post by:
The The Computer Language Shootout has just published results for Python 2.5 and Psyco 1.5.2. Comparing the old (Python 2.4) Gentoo Pentium 4 results (now not visible anymore) with the new results,...
80
by: tech | last post by:
Hi, i have the following problem In file1.h namespace A { class Bar { void foo();
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.