Bytes | Developer Community
benchmarks? java vs .net

The shootout site has benchmarks comparing different languages. It
includes C# Mono vs Java but not C# .NET vs Java. So I went through
all the benchmarks on the site ...

http://kingrazi.blogspot.com/2008/05...enchmarks.html

Just to keep the post on topic for my friends at comp.lang.c++, how do
I play default Windows sounds with C++?

Jun 27 '08
Jon Harrop <jo*@ffconsultancy.com> wrote:
Please go by what he posted, not some other settings you've come up
with. Your claim that Razii "effectively turn[ed] off the GC" is simply
untrue.

My claim was clearly perfectly correct.
Rubbish. You claimed that Razii had turned the garbage collector off.
He certainly hadn't, or the program would not have run to completion,
limited as it was by his options to 512MB.

Just to make it absolutely clear, here is how Razii was running the
code:

java -server -verbose:gc -Xms512m -Xmx512m -XX:NewRatio=1

With those options, the garbage collector *does* run, and *does*
collect memory. Your claim was absolutely incorrect, and your attempt
to confuse the matter by posting other options which *did* negate the
need for the garbage collector to run does not in any way change the
options under which Razii ran his test.
Please follow this logic and explain how your logic could *possibly* be
correct:
1) The test creates 305135616 instances of TreeNode. To verify this,
just introduce a static counter which is incremented in the
constructor, and print out the result at the end of the test.

2) The option -Xmx512m limits the heap to 512MB

3) There is no way of holding 305135616 objects in memory at the same
time in Java within 512MB. (An object would have to take less than
1.759 bytes on average, which is clearly ridiculous.)

4) Therefore the memory used by some of the objects must have been
reused by other objects. The mechanism for this is garbage collection.

5) Therefore the GC had *not* been "effectively turned off".
Now rather than just repeatedly stating your claim, or arguing by using
*different* options, please address the steps above. Which of the 5
facts/deductions above do you disagree with? The conclusion directly
contradicts your claim, so you should either retract your claim or
refute the logic above.
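Step 1 of the argument above can be checked with a short sketch. This follows the usual shape of the shootout's binarytrees code, but the `allocated` counter field and the exact method names are my additions for illustration, not the benchmark source:

```java
// Sketch of a shootout-style TreeNode with a static allocation counter
// (the counter is an addition used to verify how many instances are created).
public class TreeNode {
    public static long allocated = 0;   // incremented in the constructor

    private final TreeNode left, right;
    private final int item;

    TreeNode(TreeNode left, TreeNode right, int item) {
        this.left = left;
        this.right = right;
        this.item = item;
        allocated++;                    // count every instance ever created
    }

    static TreeNode bottomUpTree(int item, int depth) {
        if (depth > 0) {
            return new TreeNode(bottomUpTree(2 * item - 1, depth - 1),
                                bottomUpTree(2 * item, depth - 1),
                                item);
        }
        return new TreeNode(null, null, item);   // leaf node
    }

    public int itemCheck() {
        if (left == null) return item;
        return item + left.itemCheck() - right.itemCheck();
    }

    public static void main(String[] args) {
        TreeNode t = bottomUpTree(0, 16);
        System.out.println("check: " + t.itemCheck());   // prints check: -1
        System.out.println("instances: " + allocated);   // 2^17 - 1 = 131071
    }
}
```

A single depth-16 tree already takes 131071 nodes; summed over every iteration of the full n=20 run, the counter reaches the hundreds of millions quoted above, which is the whole point of step 1.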

--
Jon Skeet - <sk***@pobox.com>
Web site: http://www.pobox.com/~skeet
Blog: http://www.msmvps.com/jon.skeet
C# in Depth: http://csharpindepth.com
Jun 27 '08 #251
On Sat, 7 Jun 2008 13:29:53 +0100, Jon Skeet [C# MVP]
<sk***@pobox.com> wrote:
>Now rather than just repeatedly stating your claim, or arguing by using
*different* option
He not only used different options (3 gig RAM) but he also reduced the
command line argument from 20 to 16.

Jun 27 '08 #252
On Jun 7, 1:51 pm, Razii <pyukj...@gmail.com> wrote:
He not only used different options (3 gig RAM) but he also reduced the
command line argument from 20 to 16.
Ahh... booo hoo ... Poor little Ratboy ...

regards
Andy Little

Jun 27 '08 #253
kwikius <an**@servocomm.freeserve.co.uk> wrote:
He not only used different options (3 gig RAM) but he also reduced the
command line argument from 20 to 16.

Ahh... booo hoo ... Poor little Ratboy ...
Is there any chance we could keep the discussion civil, and based on
the technical merits of the arguments instead of personal insults?

--
Jon Skeet - <sk***@pobox.com>
Web site: http://www.pobox.com/~skeet
Blog: http://www.msmvps.com/jon.skeet
C# in Depth: http://csharpindepth.com
Jun 27 '08 #254
On Sat, 7 Jun 2008 16:32:57 +0100, Jon Skeet [C# MVP]
<sk***@pobox.com> wrote:
>Is there any chance we could keep the discussion civil, and based on
the technical merits of the arguments instead of personal insults?
It's fine. kwikius is a well known troll :) He claimed to have plonked
me a dozen times but can't seem to stop reading my threads.
Jun 27 '08 #255
Razii wrote:
On Sat, 7 Jun 2008 13:29:53 +0100, Jon Skeet [C# MVP]
<sk***@pobox.com> wrote:
>>Now rather than just repeatedly stating your claim, or arguing by using
*different* option

He not only used different options (3 gig RAM) but he also reduced the
command line argument from 20 to 16.
The shootout uses n=16.

--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com/products/?u
Jun 27 '08 #256
On Sat, 07 Jun 2008 19:03:58 +0100, Jon Harrop <jo*@ffconsultancy.com>
wrote:
>The shootout uses n=16.
The shootout doesn't use 3 gig max memory. Besides, with n = 16, it's
faster to use only the -Xms64m flag and nothing else.

$ time java -server -Xms64m binarytrees 16
stretch tree of depth 17 check: -1
131072 trees of depth 4 check: -131072
32768 trees of depth 6 check: -32768
8192 trees of depth 8 check: -8192
2048 trees of depth 10 check: -2048
512 trees of depth 12 check: -512
128 trees of depth 14 check: -128
32 trees of depth 16 check: -32
long lived tree of depth 16 check: -1

real 0m0.964s

$ time java -server -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 16
stretch tree of depth 17 check: -1
131072 trees of depth 4 check: -131072
32768 trees of depth 6 check: -32768
8192 trees of depth 8 check: -8192
2048 trees of depth 10 check: -2048
512 trees of depth 12 check: -512
128 trees of depth 14 check: -128
32 trees of depth 16 check: -32
long lived tree of depth 16 check: -1

real 0m1.176s

The first one is faster with n=16


Jun 27 '08 #257
Razii wrote:
On Sat, 07 Jun 2008 19:03:58 +0100, Jon Harrop <jo*@ffconsultancy.com>
wrote:
>>The shootout uses n=16.

The shootout doesn't use 3 gig max memory.
How do you know that?
Besides, with n = 16, it's faster to use only the -Xms64m flag and
nothing else.

$ time java -server -Xms64m binarytrees 16
stretch tree of depth 17 check: -1
131072 trees of depth 4 check: -131072
32768 trees of depth 6 check: -32768
8192 trees of depth 8 check: -8192
2048 trees of depth 10 check: -2048
512 trees of depth 12 check: -512
128 trees of depth 14 check: -128
32 trees of depth 16 check: -32
long lived tree of depth 16 check: -1

real 0m0.964s

$ time java -server -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 16
stretch tree of depth 17 check: -1
131072 trees of depth 4 check: -131072
32768 trees of depth 6 check: -32768
8192 trees of depth 8 check: -8192
2048 trees of depth 10 check: -2048
512 trees of depth 12 check: -512
128 trees of depth 14 check: -128
32 trees of depth 16 check: -32
long lived tree of depth 16 check: -1

real 0m1.176s

The first one is faster with n=16
I get a 10% error here and no significant difference between the results:

$ time java -server -Xms64m binarytrees 16
....
real 0m3.094s
user 0m2.360s
sys 0m0.312s

$ time java -server -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 16
....
real 0m3.080s
user 0m2.372s
sys 0m0.300s

--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com/products/?u
Jun 27 '08 #258
On Sat, 07 Jun 2008 19:35:08 +0100, Jon Harrop <jo*@ffconsultancy.com>
wrote:
>The shootout doesn't use 3 gig max memory.

How do you know that?
They list the options they use. Besides, the computer they are using,
a Pentium 4, has only 512 MB of RAM.
>I get a 10% error here and no significant difference between the results:
I do. With n=16, -Xms64m on average is faster. If you don't get a
difference, your point about GC is still moot. Only with larger n,
like n=20, is there a big improvement in speed from using my flags. The
GC still works and it is faster, two times faster.

Jun 27 '08 #259
On Jun 7, 4:32 pm, Jon Skeet [C# MVP] <sk...@pobox.com> wrote:
kwikius <a...@servocomm.freeserve.co.uk> wrote:
He not only used different options (3 gig RAM) but he also reduced the
command line argument from 20 to 16.
Ahh... booo hoo ... Poor little Ratboy ...

Is there any chance we could keep the discussion civil, and based on
the technical merits of the arguments instead of personal insults?
Depends...

Ratboy started this and you and Jon Harrop are continuing it. I
politely hope that you will stop crossposting to comp.lang.c++ and
remove followups. (I guarantee that Ratboy will put them back so you
will need to watch that on every post.) There's no C++ content here.

Bear in mind that Ratboy here changes his email address on pretty much
every post because everyone on comp.lang.c++ killfiles him. Do you
think that is being civil? I don't. I think it's downright offensive.

Most people know how to killfile this thread in spite of Ratboy's
efforts of course but newbies often don't.

regards
Andy Little

Jun 27 '08 #260
On Sat, 7 Jun 2008 12:17:48 -0700 (PDT), kwikius
<an**@servocomm.freeserve.co.uk> wrote:
>Most people know how to killfile this thread in spite of Ratboy's
efforts of course but newbies often don't.
Ratgirl, you do know how to kill-file a thread, don't you? You seem to
be obsessively reading each and every post in each and every thread.
How come? Your group is filled with spam. Do you respond to each spam
and whine that they stop posting? If not, what's your obsession with
these threads, Ratgirl?
Jun 27 '08 #261
Razii wrote:
On Sat, 07 Jun 2008 19:35:08 +0100, Jon Harrop <jo*@ffconsultancy.com>
wrote:
>>The shootout doesn't use 3 gig max memory.
How do you know that?

They list the options they use. Besides, the computer they are using,
a Pentium 4, has only 512 MB of RAM.
Xmx is a reliable max on memory usage.

Physical memory is not. Java uses virtual memory. If there is not
enough physical memory to cover it, then it still works; it just
becomes very slow.
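Arne's point can be seen from inside the JVM: `Runtime` reports the -Xmx cap directly, regardless of how much physical RAM the machine has. A minimal sketch (the class name is mine):

```java
// Prints the heap limits the JVM is actually running under.
// maxMemory() reflects -Xmx; physical RAM does not appear anywhere.
public class HeapCap {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap   : " + rt.maxMemory() / (1024 * 1024) + " MB");
        System.out.println("total heap : " + rt.totalMemory() / (1024 * 1024) + " MB");
        System.out.println("free heap  : " + rt.freeMemory() / (1024 * 1024) + " MB");
    }
}
```

Run with `java -Xmx512m HeapCap` and the reported max heap is about 512 MB even on a machine with far more physical memory, or, via swap, far less.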

Arne
Jun 27 '08 #262
Jon Skeet [C# MVP] wrote:
Jon Harrop <jo*@ffconsultancy.com> wrote:
Please go by what he posted, not some other settings you've come up
with. Your claim that Razii "effectively turn[ed] off the GC" is simply
untrue.

My claim was clearly perfectly correct.

Rubbish. You claimed that Razii had turned the garbage collector off.
Now you have resorted to misquoting me. I think that says it all.
He certainly hadn't, or the program would not have run to completion,
limited as it was by his options to 512MB.
For n=20?
Just to make it absolutely clear, here is how Razii was running the
code:

java -server -verbose:gc -Xms512m -Xmx512m -XX:NewRatio=1
You have not specified "n" but it appears that your entire line of thinking
revolves around n=20.
With those options, the garbage collector *does* run, and *does*
collect memory.
Here is another trivial counter example using Razii's arguments as you
quoted them:

$ java -server -verbose:gc -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 13
stretch tree of depth 14 check: -1
16384 trees of depth 4 check: -16384
4096 trees of depth 6 check: -4096
1024 trees of depth 8 check: -1024
256 trees of depth 10 check: -256
64 trees of depth 12 check: -64
long lived tree of depth 13 check: -1

As you can see, the GC never ran.
Your claim was absolutely incorrect, and your attempt
to confuse the matter by posting other options which *did* negate the
need for the garbage collector to run does not in any way change the
options under which Razii ran his test.
I said "the GC is effectively off". You say "Rubbish... the need for the GC
to run had been negated.". The difference is academic.
Please follow this logic and explain how your logic could *possibly* be
correct:

1) The test creates 305135616 instances of TreeNode.

2) The option -Xmx512m limits the heap to 512MB

3) There is no way of holding 305135616 objects in memory at the same
time in Java within 512MB. (An object would have to take less than
1.759 bytes on average, which is clearly ridiculous.)

4) Therefore the memory used by some of the objects must have been
reused by other objects. The mechanism for this is garbage collection.

5) Therefore the GC had *not* been "effectively turned off".

Now rather than just repeatedly stating your claim, or arguing by using
*different* options, please address the steps above. Which of the 5
facts/deductions above do you disagree with? The conclusion directly
contradicts your claim, so you should either retract your claim or
refute the logic above.
The main problem is with your interpretation of the word "effectively". You
seem to think that you can add and remove this word at will without
affecting the meaning of a sentence when, in fact, you cannot.
Consequently, your conclusion (5) is wrong. It should be "Therefore the GC
had *not* been turned off". No disagreement here. But that says nothing
about my original statement.

There were two sides to my original point. Firstly, if you manually tweak
the GC parameters by hand for one specific input on one specific machine
then you are doing manual memory management. Garbage collection means
*automatic* memory management. So you can kiss goodbye to the idea of
claiming that your GC is fast (which is exactly what Razii was trying to
do). The same goes for explicitly calling the GC from within your code (it
is a form of manual memory management).

Secondly, Razii's technique and results for this benchmark have absolutely
no bearing on reality whatsoever. Indeed, I cannot even reproduce his
results using the same program with the same input on a slightly different
machine. So let's not pretend this is of any practical relevance.

All you have managed to do is optimize a flawed benchmark which, as I said
from the beginning, is completely fruitless.

--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com/products/?u
Jun 27 '08 #263
Jon Harrop wrote:
Jon Skeet [C# MVP] wrote:
>3) There is no way of holding 305135616 objects in memory at the same
time in Java within 512MB. (An object would have to take less than
1.759 bytes on average, which is clearly ridiculous.)

4) Therefore the memory used by some of the objects must have been
reused by other objects. The mechanism for this is garbage collection.

5) Therefore the GC had *not* been "effectively turned off".

Now rather than just repeatedly stating your claim, or arguing by using
*different* options, please address the steps above. Which of the 5
facts/deductions above do you disagree with? The conclusion directly
contradicts your claim, so you should either retract your claim or
refute the logic above.

The main problem is with your interpretation of the word "effectively". You
seem to think that you can add and remove this word at will without
affecting the meaning of a sentence when, in fact, you cannot.
Consequently, your conclusion (5) is wrong. It should be "Therefore the GC
had *not* been turned off". No disagreement here. But that says nothing
about my original statement.
If GC has GC'ed objects then it is neither "turned off" nor
"effectively turned off".

Jon's logic is 100% valid.
There were two sides to my original point. Firstly, if you manually tweak
the GC parameters by hand for one specific input on one specific machine
then you are doing manual memory management. Garbage collection means
*automatic* memory management. So you can kiss goodbye to the idea of
claiming that your GC is fast (which is exactly what Razii was trying to
do). The same goes for explicitly calling the GC from within your code (it
is a form of manual memory management).
Nonsense.

You use the term "memory management" in two different meanings here.

GC means automatic deallocation of memory dynamically allocated from the heap.

That is not what you are doing by configuring how GC works.

And the argument that something is not efficient because it is
tunable is laughable.

Arne
Jun 27 '08 #264
On Sat, 07 Jun 2008 20:00:46 -0400, Arne Vajhøj <ar**@vajhoej.dk>
wrote:
>If GC has GC'ed objects then it is neither "turned off" nor
"effectively turned off".
With n=7, there is no GC activity even without any command line GC
options:

$ java -verbose:gc binarytrees 7
stretch tree of depth 8 check: -1
256 trees of depth 4 check: -256
64 trees of depth 6 check: -64
long lived tree of depth 7 check: -1

So according to Harpo's brilliant logic, GC was turned off by Razii :)
Jun 27 '08 #265
On Jun 8, 2:06 am, King Twat <ki******@gmail.com> wrote:
I added the C++ group back.
That's what I told you to do, King Twat.

regards
Andy Little


Jun 27 '08 #266
Jon Harrop wrote:
Mark Thornton wrote:
>Rather amusing really as unintentional use of unbuffered IO is a
frequent cause of Java benchmarks running more slowly than they should.
It seems that .NET copied that characteristic as well.

Yes. I've no idea why they do that. Isn't buffered IO a better default?!
It is easier to put a buffered wrapper around an unbuffered stream
than the other way around.
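Arne's asymmetry is easy to demonstrate: wrapping an unbuffered stream in a buffered one is a single extra constructor call, whereas buffering baked into a default stream cannot be "unwrapped". A small sketch (file and class names are mine):

```java
import java.io.*;

// A buffered wrapper layered over unbuffered streams: one extra constructor
// call on each side, no change to the rest of the code.
public class BufferDemo {
    public static String roundTrip(String text) throws IOException {
        File f = File.createTempFile("bufdemo", ".txt");
        f.deleteOnExit();

        // FileWriter is unbuffered; BufferedWriter adds buffering around it.
        try (Writer out = new BufferedWriter(new FileWriter(f))) {
            out.write(text);
        }
        // Same layering on the read side.
        try (BufferedReader in = new BufferedReader(new FileReader(f))) {
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello"));   // prints hello
    }
}
```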

Arne
Jun 27 '08 #267
Jon Skeet [C# MVP] wrote:
On Jun 3, 4:39 pm, Jon Harrop <j...@ffconsultancy.com> wrote:
>FWIW, F#/Mono is 3x slower than F#/.NET on the SciMark2 benchmark. I'd like
to know how performance compares between these platforms for parallel code.

If Razii is genuinely comparing Java with Mono (I haven't looked at
any of the figures) it's a silly test to start with (unless you're
specifically interested in Mono, of course). The vast majority of C#
code runs on .NET rather than Mono - and while I applaud the Mono
team's work, I seriously doubt that it has quite as much effort going
into it as Microsoft is putting into .NET. I'd expect .NET to
outperform Mono, and on microbenchmarks like these the difference could
be quite significant in some cases.
I have frequently seen a 2x factor between MS .NET 2.0 and Mono 1.2!

Comparing with Mono is too easy.

Arne
Jun 27 '08 #268
Jon Skeet [C# MVP] wrote:
Now, you're using *those results* to form conclusions, right? If not,
there was little point in posting them. However, those results are of
Mono, not .NET.

This is why I've urged consistency. What's the point in debating Java
vs .NET if you keep posting links to results of Java vs Mono?
You are correct.

I would just prefer to consider .NET an overall group and call the
implementations MS .NET and Mono.

(probably MS's trademark lawyers would not like that, but ...)

Arne
Jun 27 '08 #269
Jon Skeet [C# MVP] wrote:
Even running .NET rather than Mono will only go so far - you'd need to
run it on both 64 bit and 32 bit platforms, as the runtimes do
different things. (IIRC the 64 bit CLR uses tail recursion more heavily
than the 32 bit CLR, for instance.)

If I were to find a really slow JVM, would that prove that Java is a
slower *language* than C#?
It is even more complex than that.

MS .NET 1.1, MS .NET 2.0, Mono 1.1, Mono 1.2

Win32/x86, Win64/x86, Win64/IA64, Linux/x86, MacOS X, Solaris/SPARC

that is 14 combinations.

If the benchmark involves file IO, then it would be relevant to
test different file systems on Linux as well.

For Java it is much worse.

1.4.2, 1.5.0, 1.6.0
SUN, IBM, BEA, Oracle
client VM, server VM
3-4 platforms per vendor
and a ton of VM tuning options

I would need to use BigInteger to calculate the number of combinations.

And believe me - there are huge differences in performance
characteristics between SUN, IBM and BEA !

I would say that trying to compare the speed of Java with the speed
of .NET reveals a big lack of knowledge about both Java and .NET !

Arne
Jun 27 '08 #270
Razii wrote:
On Tue, 3 Jun 2008 20:11:45 +0100, Jon Skeet [C# MVP]
<sk***@pobox.com> wrote:
>Now, you're using *those results* to form conclusions, right? If not,
there was little point in posting them. However, those results are of
Mono, not .NET.

Is this guy Jon Skeet really this stupid?
Unlike you he is a valuable contributor to both the Java and
.NET groups.

Arne
Jun 27 '08 #271
Jon Skeet [C# MVP] wrote:
On Jun 3, 11:15 pm, Razii <klgf...@mail.com> wrote:
><sk...@pobox.com> wrote:
>>Has it occurred to you that that may have an effect on the results?
No, it won't have a major effect.

And you know this because...?

Just claiming that it won't have an effect isn't exactly compelling
evidence.
It probably is not.

time just notes counters, executes the command and notes the counters
again. I don't know which APIs it uses to execute the command,
but it should not affect the program's speed whether it is 3 Unix
emulation calls and ShellExecute or something else.

Arne
Jun 27 '08 #272
Jon Harrop wrote:
Razii wrote:
>On Wed, 04 Jun 2008 12:21:39 -0500, Razii <ni*******@mail.com> wrote:
>>>sumcol: .NET is 2.5x faster than Java.
Now that I read your other post, I see where you got that from: by
writing custom parser that breaks benchmark rule that program must use
standard library for reading and parsing:

Now you are introducing new subjective "rules" to circumvent problems with
the benchmarks.
There is not much point in testing the speed of custom code that
does what in real apps would be done in the standard libraries.

Arne
Jun 27 '08 #273
Jon Harrop wrote:
Razii wrote:
>On Wed, 04 Jun 2008 19:29:35 +0100, Jon Harrop <jo*@ffconsultancy.com>
wrote:
>>For another subjective notion of "correct" that also has no practical
relevance, yes.
Why can't you post C# version that uses standard library and is not
slower? Perhaps that's because it's not possible?

The C# version works perfectly when you don't run your CPU in legacy mode.
Since approx. 95% of all computers run in that legacy mode, that
legacy mode is what is most relevant.

Arne
Jun 27 '08 #274
Jon Harrop <jo*@ffconsultancy.com> wrote:
My claim was clearly perfectly correct.
Rubbish. You claimed that Razii had turned the garbage collector off.

Now you have resorted to misquoting me. I think that says it all.
You've applied different options to the program to make it appear that
your claim about Razii's options was correct. I missed out the word
"effectively" above. Which of those is more important?

Similarly, you've later managed to quote something I wrote about *your*
set of options as if I was writing it about Razii's set of options.
He certainly hadn't, or the program would not have run to completion,
limited as it was by his options to 512MB.

For n=20?
Yes.
Just to make it absolutely clear, here is how Razii was running the
code:

java -server -verbose:gc -Xms512m -Xmx512m -XX:NewRatio=1

You have not specified "n" but it appears that your entire line of thinking
revolves around n=20.
Yes, which is what he ran.

The post where you claimed Razii had effectively turned the garbage
collector off was message ID
<Ro******************************@posted.plusnet>

If you look back up the thread directly from that to the last time
Razii had specified options (i.e. the run that was under discussion) it
was message ID <lv********************************@4ax.com>.

The exact command line specified was:

$ time java -server -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 20
With those options, the garbage collector *does* run, and *does*
collect memory.

Here is another trivial counter example using Razii's arguments as you
quoted them:
I only quoted the memory part because that's the only part I *saw* you
change. However, if you look
$ java -server -verbose:gc -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 13
stretch tree of depth 14 check: -1
16384 trees of depth 4 check: -16384
4096 trees of depth 6 check: -4096
1024 trees of depth 8 check: -1024
256 trees of depth 10 check: -256
64 trees of depth 12 check: -64
long lived tree of depth 13 check: -1

As you can see, the GC never ran.
Your claim was absolutely incorrect, and your attempt
to confuse the matter by posting other options which *did* negate the
need for the garbage collector to run does not in any way change the
options under which Razii ran his test.

I said "the GC is effectively off". You say "Rubbish... the need for the GC
to run had been negated.". The difference is academic.
I would agree - but you're quoting me in entirely the wrong context.
Let's have a look at the statement where I talked about "the need for
the GC to run had been negated":

<quote>
Your claim was absolutely incorrect, and your attempt
to confuse the matter by posting other options which *did* negate the
need for the garbage collector to run does not in any way change the
options under which Razii ran his test.
</quote>

Oh look, it's in the context of *your* options, not Razii's.

I never claimed, nor *would* I claim, that the need for the GC to run
had been negated with Razii's options. Those are the only options which
I think should be considered in this part of the discussion, as those
are the options for which you originally claimed that the GC had been
effectively turned off.
Now rather than just repeatedly stating your claim, or arguing by using
*different* options, please address the steps above. Which of the 5
facts/deductions above do you disagree with? The conclusion directly
contradicts your claim, so you should either retract your claim or
refute the logic above.

The main problem is with your interpretation of the word "effectively". You
seem to think that you can add and remove this word at will without
affecting the meaning of a sentence when, in fact, you cannot.
In this case it doesn't change things. If the garbage collector had
*effectively* been turned off, it would not have been able to run to
completion.
Consequently, your conclusion (5) is wrong. It should be "Therefore the GC
had *not* been turned off". No disagreement here. But that says nothing
about my original statement.
No way. You can't realistically claim a garbage collector has been
"effectively" turned off when it being turned *on* is critical to the
program running to completion. What exactly do you take "effectively"
to mean? I take it to mean "to the same effect". Now I'm happy for
"effect" to only mean in terms of computed results rather than
performance - so a fast program can be effectively the same as a slow
program - but it can't be in terms of completing the run. Options where
the program fails to run to completion are *not* "effectively" the same
as options where the program runs fine.
There were two sides to my original point. Firstly, if you manually tweak
the GC parameters by hand for one specific input on one specific machine
then you are doing manual memory management.
No - you're changing the configuration options to let the automatic
memory management work more effectively. Note how I did exactly the
same to make the .NET GC use a different implementation - and again, it
sped things up.
Garbage collection means *automatic* memory management. So you can
kiss goodbye to the idea of claiming that your GC is fast (which is
exactly what Razii was trying to do). The same goes for explicitly
calling the GC from within your code (it is a form of manual memory
management).
Well, it's a hint to the garbage collector - a hint which is usually
unnecessary, but *can* occasionally be beneficial. There's a long, long
way from that to fully manual memory management though.
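The "hint" characterisation can be sketched as follows. This is an illustration, not a benchmark: on HotSpot, `System.gc()` requests a collection but the VM is free to ignore it (and `-XX:+DisableExplicitGC` turns it into a no-op outright), so the measured figure is not guaranteed to be positive:

```java
// Illustrates System.gc() as a request rather than a command.
public class GcHintDemo {
    public static long freedEstimate() {
        Runtime rt = Runtime.getRuntime();

        // Create ~16 MB of garbage, then drop the only reference to it.
        byte[][] junk = new byte[16][];
        for (int i = 0; i < junk.length; i++) junk[i] = new byte[1 << 20];
        junk = null;

        long before = rt.totalMemory() - rt.freeMemory();
        System.gc();   // a hint: the JVM may collect now, later, or not at all
        long after = rt.totalMemory() - rt.freeMemory();
        return before - after;   // can be zero or negative if the hint is ignored
    }

    public static void main(String[] args) {
        System.out.println("estimated bytes reclaimed: " + freedEstimate());
    }
}
```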
Secondly, Razii's technique and results for this benchmark have absolutely
no bearing on reality whatsoever. Indeed, I cannot even reproduce his
results using the same program with the same input on a slightly different
machine. So let's not pretend this is of any practical relevance.
I never have.
All you have managed to do is optimize a flawed benchmark which, as I said
from the beginning, is completely fruitless.
No argument there - but then that's not what I was responding to, was
it?
Let me make this absolutely crystal clear, so that so long as you quote
from this sentence down in your reply, everything else above is
irrelevant:

Given a run of the code with these options:

$ time java -server -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 20

You claimed that Razii effectively turned the GC off. I claim that he
certainly didn't, because with the GC having no effect the program
would not have completed.

Do you disagree with my claim that with the GC *actually* turned off
(if there were some way to do that) the program would have failed to
finish, or do you think that a (hypothetical) set of options where a
program fails to finish can be *effectively* the same as a set of
options where the program manages to run?

--
Jon Skeet - <sk***@pobox.com>
Web site: http://www.pobox.com/~skeet
Blog: http://www.msmvps.com/jon.skeet
C# in Depth: http://csharpindepth.com
Jun 27 '08 #275
On Jun 8, 8:39 am, Jon Skeet [C# MVP] <sk...@pobox.com> wrote:

<...>

This stuff is still appearing on clc++.

There is no C++ related content here.

regards
Andy Little

Jun 27 '08 #276
On Sun, 08 Jun 2008 12:34:34 +0100, Mark Thornton
<mt*******@optrak.co.uk> wrote:
>In real systems this is usually not true. In most applications it is
possible to get good performance from the garbage collector.
It's not only not true. It's downright false.

http://www.idiom.com/~zilla/Computer...benchmark.html

Consider what happens when you do a new/malloc: a) the allocator looks
for an empty slot of the right size, then returns you a pointer. b)
This pointer is pointing to some fairly random place.

With GC, a) the allocator doesn't need to look for memory, it knows
where it is, b) the memory it returns is adjacent to the last bit of
memory you requested. The wandering around part happens not all the
time but only at garbage collection. And then (depending on the GC
algorithm) things get moved of course as well.
The cost of missing the cache
The big benefit of GC is memory locality. Because newly allocated
memory is adjacent to the memory recently used, it is more likely to
already be in the cache.

How much of an effect is this? One rather dated (1993) example shows
that missing the cache can be a big cost: changing an array size in a
small C program from 1023 to 1024 results in a slowdown of 17 times
(not 17%). This is like switching from C to VB! This particular
program stumbled across what was probably the worst possible cache
interaction for that particular processor (MIPS); the effect isn't
that bad in general...but with processor speeds increasing faster than
memory, missing the cache is probably an even bigger cost now than it
was then.

(It's easy to find other research studies demonstrating this; here's
one from Princeton: they found that (garbage-collected) ML programs
translated from the SPEC92 benchmarks have lower cache miss rates than
the equivalent C and Fortran programs.)

This is theory, what about practice? In a well known paper [2] several
widely used programs (including perl and ghostscript) were adapted to
use several different allocators including a garbage collector
masquerading as malloc (with a dummy free()). The garbage collector
was as fast as a typical malloc/free; perl was one of several programs
that ran faster when converted to use a garbage collector. Another
interesting fact is that the cost of malloc/free is significant: both
perl and ghostscript spent roughly 25-30% of their time in these
calls.

Besides the improved cache behavior, also note that automatic memory
management allows escape analysis, which identifies local allocations
that can be placed on the stack. (Stack allocations are clearly
cheaper than heap allocations.)
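The locality argument quoted above rests on nursery allocation being a pointer bump into contiguous memory. A toy illustration (not a real JVM allocator; the names are mine):

```java
// Toy bump-pointer allocator: allocation is a bounds check plus one addition,
// and consecutive allocations are adjacent, which is the cache-locality win.
public class BumpAllocator {
    private final byte[] heap;
    private int top = 0;          // next free offset in the "nursery"

    public BumpAllocator(int size) { heap = new byte[size]; }

    // Returns the offset of the new block, or -1 when the nursery is full
    // (a real VM would trigger a minor collection at that point).
    public int allocate(int size) {
        if (top + size > heap.length) return -1;
        int addr = top;
        top += size;              // the entire allocation step
        return addr;
    }

    public static void main(String[] args) {
        BumpAllocator a = new BumpAllocator(1024);
        System.out.println(a.allocate(16));   // prints 0
        System.out.println(a.allocate(16));   // prints 16: adjacent to the last one
    }
}
```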

Jun 27 '08 #277
Razii wrote:
Consider what happens when you do a new/malloc: a) the allocator looks
for an empty slot of the right size, then returns you a pointer.
Modern allocators have several arrays of slots of suitable sizes, and
can therefore easily find one in the right size. The next allocation of
that size will also be immediately adjacent. Only rather large sizes
require another approach, but I assume these are pretty rare in both
kinds of environments, and I guess that programs tend to hang on to
such large objects much longer as well. Deallocation of objects is
immediate, which often means that memory consumption is lower and not
dependent on when a GC might finally run. Also, no heaps of memory are
moved around in non-GC memory management.

IOW, there are arguments for both approaches. The GC one has the big
advantage that one big cause of errors, all errors regarding memory
use, are more or less completely eliminated. But I doubt I would call
speed one of the main factors to choose a GC.
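The size-class scheme described above can be sketched as segregated free lists: freed blocks are pushed onto a per-size list, so the next allocation of that size is O(1) and needs no searching. A toy model (the size classes and names are invented for illustration):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy segregated-free-list allocator: one free list per size class.
public class SizeClassAllocator {
    private static final int[] CLASSES = {16, 32, 64, 128};  // hypothetical classes
    private final Deque<Integer>[] freeLists;                // offsets of freed slots
    private int top = 0;                                     // cursor into fresh memory

    @SuppressWarnings("unchecked")
    public SizeClassAllocator() {
        freeLists = new Deque[CLASSES.length];
        for (int i = 0; i < CLASSES.length; i++) freeLists[i] = new ArrayDeque<>();
    }

    private static int classOf(int size) {
        for (int i = 0; i < CLASSES.length; i++)
            if (size <= CLASSES[i]) return i;
        // "rather large sizes require another approach"
        throw new IllegalArgumentException("large allocation");
    }

    public int allocate(int size) {
        Deque<Integer> list = freeLists[classOf(size)];
        if (!list.isEmpty()) return list.pop();  // reuse a slot of the right size
        int addr = top;
        top += CLASSES[classOf(size)];           // round up to the class size
        return addr;
    }

    public void free(int addr, int size) {
        freeLists[classOf(size)].push(addr);     // deallocation is immediate
    }
}
```

Allocating two fresh blocks of one class places them back to back, and a freed slot is handed straight back to the next request of the same class.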
--
Rudy Velthuis http://rvelthuis.de

"My last cow just died, so I won't need your bull anymore."
Jun 27 '08 #278
Lew
Razii wrote:
but with processor speeds increasing faster than
memory, missing the cache is probably an even bigger cost now than it
was then.
You might not have noticed, but processor speeds have been flat for the last
several years, or actually declined. Memory has gotten faster, and CPUs have
gotten more cache, so actually the trend is the opposite of what you stated.

--
Lew
Jun 27 '08 #279
Lew
Rudy Velthuis wrote:
Modern allocators have several arrays of slots of suitable sizes, and
can therefore easily find one in the right size. The next allocation of
that size will also be immediately adjacent. Only rather large sizes
require another approach, but I assume these are pretty rare in both
kinds of environments, and I guess that programs tend to hang on to
such large objects much longer as well. Deallocation of objects is
immediate, which often means that memory consumption is lower and not
dependent on when a GC might finally run. Also, no heaps of memory are
moved around in non-GC memory management.
Deallocation of young objects in Java takes no time at all. GCs of the young
generation take very little time for typical memory-usage patterns. It could
be, for a large class of programs, that memory management takes less time in a
GCed language like Java than in a language like C++ with manual memory management.
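A minimal illustration of the usage pattern in question: millions of short-lived objects that die young. Under a generational collector the per-object cost is little more than a pointer bump, since a minor GC only copies survivors (a sketch; the class is invented for the example):

```java
// Illustration of a young-generation-friendly allocation pattern:
// each Temp is dead immediately after use, so minor collections find
// almost nothing live to copy.
public class YoungGenDemo {
    static final class Temp { final int v; Temp(int v) { this.v = v; } }

    public static long sum(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Temp t = new Temp(i);  // short-lived: garbage after this iteration
            total += t.v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(10_000_000));
    }
}
```

Running with `-verbose:gc` shows many cheap minor collections and (for this pattern) no promotion to the old generation.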
IOW, there are arguments for both approaches. The GC one has the big
advantage that one big cause of errors, all errors regarding memory
use, are more or less completely eliminated. But I doubt I would call
speed one of the main factors to choose a GC.
It can be. The problem is that assertions about speed are nearly impossible
to make /a priori/ - there are so many factors and emergent interactions
involved that one is unlikely to guess correctly without experimentation and
measurement.

--
Lew
Jun 27 '08 #280
Lew wrote:
Razii wrote:
>but with processor speeds increasing faster than
memory, missing the cache is probably an even bigger cost now than it
was then.

You might not have noticed, but processor speeds have been flat for the
last several years, or actually declined.
Processor speed is increasing at the same speed as ever.

GHz rates are not. They reached the heat barrier. But GHz was
never a good indication for speed.

Growth in core speed has slowed down, because the way processors
get faster today is to add more cores.

Arne

Jun 27 '08 #281
Lew
Arne Vajhøj wrote:
Lew wrote:
>Razii wrote:
>>but with processor speeds increasing faster than
memory, missing the cache is probably an even bigger cost now than it
was then.

You might not have noticed, but processor speeds have been flat for
the last several years, or actually declined.

Processor speed is increasing at the same speed as ever.

GHz rates are not. They reached the heat barrier. But GHz was
never a good indication for speed.

Growth in core speed has slowed down, because the way processors
get faster today is to add more cores.
I believe you're speaking of processing speed. The term "processor speed" has
always meant clock speed of a processor in every context I've encountered it
heretofore.

Adding cores to a processor doesn't inherently make it faster. The software
has to take advantage of the additional cores.

--
Lew
Jun 27 '08 #282
Lew wrote:
Arne Vajhøj wrote:
>Lew wrote:
>>Razii wrote:
but with processor speeds increasing faster than
memory, missing the cache is probably an even bigger cost now than it
was then.

You might not have noticed, but processor speeds have been flat for
the last several years, or actually declined.

Processor speed is increasing at the same speed as ever.

GHz rates are not. They reached the heat barrier. But GHz was
never a good indication for speed.

Growth in core speed has slowed down, because the way processors
get faster today is to add more cores.

I believe you're speaking of processing speed. The term "processor
speed" has always meant clock speed of a processor in every context I've
encountered it heretofore.
Could be. But that meaning does not fit very well with the original
context.
Adding cores to a processor doesn't inherently make it faster. The
software has to take advantage of the additional cores.
It makes it potentially faster.

It is up to the programmers to utilize the potential.

Arne
Jun 27 '08 #283
Lew
Lew wrote:
>Adding cores to a processor doesn't inherently make it faster. The
software has to take advantage of the additional cores.
Arne Vajhøj wrote:
It makes it potentially faster.
And most OSes do manage to use at least some of that potential.
It is up to the programmers to utilize the potential.
I agree completely.

I also see a trend for more and more programs to at least take advantage of
multiple cores. CPUs are also getting more and faster cache memory, and main
memory is getting faster too, so the OP's assertion that "processor
speeds [are] increasing faster than memory" becomes a little less generally
reliable.

Regardless, you and the OP together are clearly correct that processing speed
is getting much faster, as is memory speed. Taken together, along with
implications of multi-processor algorithms on the memory model, there are
great effects on the state of the art in programming.

Nit-picking about specific minor terminologies aside, your conclusions are
inarguable.

--
Lew
Jun 27 '08 #284
Arne Vajhøj wrote:
Jon Harrop wrote:
>Yes. I've no idea why they do that. Isn't buffered IO a better default?!

It is easier to put a buffered wrapper around an unbuffered stream
than the other way around.
Did these languages not have optional arguments when their standard
libraries were designed?
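The wrapping described above takes one line in Java: any unbuffered Reader can be given buffering by composition after the fact (a self-contained sketch; a StringReader stands in for System.in):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Buffering added by wrapping, the easy direction Arne describes:
// the raw stream stays unbuffered, and callers opt in when they want it.
public class BufferDemo {
    public static String firstLine(Reader raw) throws IOException {
        BufferedReader in = new BufferedReader(raw);  // buffered wrapper
        return in.readLine();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(firstLine(new StringReader("hello\nworld")));
    }
}
```

Going the other way, recovering an unbuffered view from an already-buffered stream, is not possible without bypassing the buffer, which is the asymmetry behind the unbuffered default.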

--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com/products/?u
Jun 27 '08 #285
Arne Vajhøj wrote:
Jon Harrop wrote:
>Now you are introducing new subjective "rules" to circumvent problems
with the benchmarks.

There is not much point in testing the speed of custom code that
does what in real apps would be done in the standard libraries.
Absolutely. Real code would use a lexer in this case. The Java stdlib happens to
bundle a lexer that already handles this case, but lexing should be easy in
any modern language.

--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com/products/?u
Jun 27 '08 #286
Razii wrote:
On Sun, 08 Jun 2008 12:34:34 +0100, Mark Thornton
<mt*******@optrak.co.uk> wrote:
>>In real systems this is usually not true. In most applications it is
possible to get good performance from the garbage collector.

It's not only not true. It's downright false.

http://www.idiom.com/~zilla/Computer...benchmark.html

Consider what happens when you do a new/malloc: a) the allocator looks
for an empty slot of the right size, then returns you a pointer. b)
This pointer is pointing to some fairly random place.
...
This is just another strawman argument. Malloc is not the only alternative
to GC.
With GC, a) the allocator doesn't need to look for memory, it knows
where it is, b) the memory it returns is adjacent to the last bit of
memory you requested. The wandering around part happens not all the
time but only at garbage collection. And then (depending on the GC
algorithm) things get moved of course as well.
That is exactly what the STL allocators do, for example.

--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com/products/?u
Jun 27 '08 #287
Lew wrote:
Rudy Velthuis wrote:
>Modern allocators have several arrays of slots of suitable sizes, and
can therefore easily find one in the right size. The next allocation of
that size will also be immediately adjacent. Only rather large sizes
require another approach, but I assume these are pretty rare in both
kinds of environments, and I guess that programs tend to hang on to
such large objects much longer as well. Deallocation of objects is
immediate, which often means that memory consumption is lower and not
dependent on when a GC might finally run. Also, no heaps of memory are
moved around in non-GC memory management.

Deallocation of young objects in Java takes no time at all...
You are ignoring all of the overheads of a GC, like thread synchronization,
stack walking and limitations placed upon the code generator required to
keep the GC happy.

If you compare generically and assume infinite development time then
lower-level languages will surely win in terms of raw performance. The
reason the world moved on to GC'd languages is that they allow more
complicated programs to be written more robustly and efficiently in a given
amount of development time, i.e. they are more cost effective.

--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com/products/?u
Jun 27 '08 #288
Lew
Jon Harrop wrote:
You are ignoring all of the overheads of a GC, like thread synchronization,
stack walking and limitations placed upon the code generator required to
keep the GC happy.
Balanced, to a degree at least, by the absence of manual memory-management
code, which would also have an overhead of its own, and the presence of
dynamic optimizers like Hotspot.
If you compare generically and assume infinite development time then
lower-level languages will surely win in terms of raw performance. The
reason the world moved on to GC'd languages is that they allow more
complicated programs to be written more robustly and efficiently in a given
amount of development time, i.e. they are more cost effective.
Your points are well taken, but all I'm saying is that /a priori/ arguments
about the overhead of GC are not reliable. The advantages to performance that
GC brings tend to reduce the overhead of collections. Which one wins depends
so much on details of the JVM implementation, the needs of the algorithm, the
idioms followed by the app programmer, and other factors that it seems the
height of hubris to predict without measurement.

So far it seems that you must be at least mostly correct - from what I've seen
and read, most Java programs on most JVMs still seem to be somewhat slower
than most "natively compiled" programs. However, the gap has unequivocally
lessened over the years, and one can easily see it tilting the way of the
intelligently GCed platform.

--
Lew
Jun 27 '08 #289
Rudy Velthuis wrote:
Razii wrote:
>Consider what happens when you do a new/malloc: a) the allocator looks
for an empty slot of the right size, then returns you a pointer.

Modern allocators have several arrays of slots of suitable sizes, and
can therefore easily find one in the right size. The next allocation of
that size will also be immediately adjacent. Only rather large sizes
require another approach, but I assume these are pretty rare in both
kinds of environments, and I guess that programs tend to hang on to
such large objects much longer as well. Deallocation of objects is
immediate, which often means that memory consumption is lower and not
dependent on when a GC might finally run. Also, no heaps of memory are
moved around in non-GC memory management.

IOW, there are arguments for both approaches. The GC one has the big
advantage that one big cause of errors, all errors regarding memory
use, are more or less completely eliminated. But I doubt I would call
speed one of the main factors to choose a GC.
Actually GC speed is very good.

The problem people complain about is the non-deterministic
aspect of it.

Arne
Jun 27 '08 #290
Jon Harrop wrote:
Arne Vajhøj wrote:
>Jon Harrop wrote:
>>Yes. I've no idea why they do that. Isn't buffered IO a better default?!
It is easier to put a buffered wrapper around an unbuffered stream
than the other way around.

Did these languages not have optional arguments when their standard
libraries were designed?
Neither Java nor C# has optional arguments today. But method
overloading can be used instead.

It is just not good OO to do it that way.

Arne

Jun 27 '08 #291
Jon Harrop wrote:
Lew wrote:
>Rudy Velthuis wrote:
>>Modern allocators have several arrays of slots of suitable sizes, and
can therefore easily find one in the right size. The next allocation of
that size will also be immediately adjacent. Only rather large sizes
require another approach, but I assume these are pretty rare in both
kinds of environments, and I guess that programs tend to hang on to
such large objects much longer as well. Deallocation of objects is
immediate, which often means that memory consumption is lower and not
dependent on when a GC might finally run. Also, no heaps of memory are
moved around in non-GC memory management.
Deallocation of young objects in Java takes no time at all...

You are ignoring all of the overheads of a GC, like thread synchronization,
stack walking and limitations placed upon the code generator required to
keep the GC happy.
I would expect non-GC solutions to need more thread synchronization
than GC, because they will need it many more times.
If you compare generically and assume infinite development time then
lower-level languages will surely win in terms of raw performance. The
reason the world moved on to GC'd languages is that they allow more
complicated programs to be written more robustly and efficiently in a given
amount of development time, i.e. they are more cost effective.
I agree with that part.

Arne
Jun 27 '08 #292
Razii wrote:
On Wed, 4 Jun 2008 19:49:20 +0100, Jon Skeet [C# MVP]
<sk***@pobox.com> wrote:
>into the search bar, you get 0.858272132, the same as the .NET answer.
That can't be pure coincidence, but I've no idea where the similarity
is...

That's pretty simple to explain. Both use hardware for angle reduction
and get the same wrong answer. That's why it's faster but wrong. It's
not just 1e15

What about these?

Console.WriteLine(Math.Sin (1e7));

0.420547793190771 (C# with .NET)
0.42054779319078249129850658974095 (right answer)

Console.WriteLine(Math.Sin (1e10));

-0.48750602507627 (C# with .NET)
-0.48750602508751069152779429434811 (right answer)

I am sure there are better examples but the point is made.
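The same probe can be run on the Java side (a sketch; the Java spec requires Math.sin results to be within 1 ulp of the true value, with full argument reduction even for huge arguments, which is why its answers match the high-precision values above):

```java
// Probing the same large arguments in Java. No expected values are
// hard-coded in comments here; compare the printed digits against the
// high-precision results quoted above.
public class SinDemo {
    public static void main(String[] args) {
        System.out.println(Math.sin(1e7));
        System.out.println(Math.sin(1e10));
    }
}
```

The difference against .NET comes from the hardware fsin instruction's limited-precision argument reduction, not from the floating-point format itself.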
Everybody knows there is this effect. Maybe everybody except Jon H.

But does it matter ?

I don't think there will be that many usages where the differences in
accuracy between the Java way and the C# way will have much impact.

Either it will be good enough or even greater precision will be needed.

Arne
Jun 27 '08 #293
Jon Harrop wrote:
Razii wrote:
>On Wed, 4 Jun 2008 20:01:54 +0100, Jon Skeet [C# MVP]
<sk***@pobox.com> wrote:
>>Out of interest, what happens in your version of the test if you avoid
the deprecated StreamTokenizerConstructor? Force Java to actually deal
with text as text (which is why the constructor taking Stream is
deprecated) and see how the results fare.
$ time cat sum.txt | java -server sumcol (deprecated)
10500000

real 0m4.224s
user 0m0.155s
sys 0m0.327s

$ time cat sum.txt | sumcol (.NET)
10500000

real 0m6.395s
user 0m0.077s
sys 0m0.342s

changing the line to

StreamTokenizer lineTokenizer = new StreamTokenizer(new BufferedReader
(new InputStreamReader(System.in)));

$ time cat sum.txt | java -server sumcol
10500000

real 0m5.375s
user 0m0.202s
sys 0m0.296s

still faster.

You are still timing Cygwin's implementation of Unix pipes which has nothing
to do with anything.
Do you believe that there is a little test in the Cygwin code that
makes it work slower if it sees .NET code? Or why do you think the
Cygwin stuff can explain the difference between the two tests?

Arne
Jun 27 '08 #294
Jon Harrop wrote:
Razii wrote:
>On Wed, 04 Jun 2008 19:31:43 +0100, Jon Harrop <jo*@ffconsultancy.com>
wrote:
>>64-bit or 32-bit Windows?
32-bit. I hope your next answer is not that 32-bit is legacy system.

So you're running a 64-bit CPU in 32-bit mode.
That is what most people do.

32 bit Windows on a 64 bit capable Intel or AMD CPU.

And therefore obviously what is most relevant to test.

Arne
Jun 27 '08 #295
Patricia Shanahan wrote:
Arguable in the case of Java. The Java Language Specification says that
each compilation unit implicitly starts with an import of "the
predefined package java.lang".

I don't know whether C# has any libraries with such a privileged position.
There is no implicit import of namespace System.

But I am not sure that I consider that a big difference.

Both Java and C# are pretty married to their libraries anyway.

Arne
Jun 27 '08 #296
Jon Harrop wrote:
Razii wrote:
>On Wed, 04 Jun 2008 20:34:59 +0100, Jon Harrop <jo*@ffconsultancy.com>
wrote:
>>You are still timing Cygwin's implementation of Unix pipes which has
nothing to do with anything.
I am still waiting for you to verify and demonstrate it has any
effect.

Your inexplicably anomalous results already proved that.
For anyone with just a minimum of understanding of logic: not.

Arne
Jun 27 '08 #297
Mark Thornton wrote:
Jon Skeet [C# MVP] wrote:
>The VM spec is also separate from the language spec, which is a good
thing. It's just a shame that the name "Java" applies to the platform,
the runtime, and the language.
And at times several other things with no obvious relationship at the
whim of Sun's marketing department.
I guess you can argue that funding Java development and giving most
of it away for free has given them the right to try and make a few
bucks selling a Java branded Linux.

Arne
Jun 27 '08 #298
Jon Skeet [C# MVP] wrote:
Razii <ni*********@mail.com> wrote:
>As for C# using file, post the changes.

I did, early yesterday evening.
Some people have an amazing ability to miss the posts
which do not fit their pet theory.

Arne
Jun 27 '08 #299
Jon Skeet [C# MVP] wrote:
Razii <ni*******@mail.com> wrote:
>So we are done with this partialsums benchmark; this was the only case
where C# was significantly faster.

And I for one have never been claiming that .NET or C# is significantly
faster than Java overall. (If any benchmarks made custom calls to
native code, that would be interesting. I really have no idea how JNI
compares with P/Invoke in terms of speed. I know which is more pleasant
to use, mind you :)

Given how similar their overall approach is, and how both of them have
had a lot of time and money spent optimising their VMs (with more work
to be done, for sure) it's unsurprising that they're pretty much a tie.

My conclusion: don't choose between Java and .NET on performance
grounds. There are far better reasons to choose one or the other,
depending on other criteria.
Jon - that is common sense - completely off topic in this thread !

:-)

Arne
Jun 27 '08 #300
