Bytes | Developer Community

Do you use a garbage collector?

I followed a link to James Kanze's web site in another thread and was
surprised to read this comment by a link to a GC:

"I can't imagine writing C++ without it"

How many of you c.l.c++'ers use one, and in what percentage of your
projects is one used? I have never used one in personal or professional
C++ programming. Am I a holdover to days gone by?
Apr 10 '08
On Apr 11, 9:20 am, Razii <DONTwhatever...@hotmail.com> wrote:
On Thu, 10 Apr 2008 20:37:59 -0500, Razii

<DONTwhatever...@hotmail.com> wrote:
int main(int argc, char *argv[]) {
clock_t start=clock();
for (int i=0; i<=10000000; i++) {
Test *test = new Test(i);
if (i % 5000000 == 0)
cout << test;
}

If I add delete test; to this loop it gets faster. Huh? What's the
explanation for this?

2156 ms

and after I add delete test; to the loop

1781 ms

why is that?
Ah, so you found it ;)

Well, isn't it obvious?

Without delete, you are allocating more and more memory. That means
more cache issues, more memory to manage, etc. In Java, the total
amount of memory in the loop is kept low, because a lot of it is
cheaply recollected by the GC.

Mirek
Jun 27 '08 #101
On Apr 12, 5:02 am, Mirek Fidler <c...@ntllib.org> wrote:
If you have time, fix and try with U++.... (it has overloaded new/
delete).
Well, to save you a bit of time:

#include <Core/Core.h>

using namespace Upp;

class Test {
public:
Test (int c) {count = c;}
// virtual ~Test() { }
int count;
};

CONSOLE_APP_MAIN
{
RTIMING("NewDelete");
for (int i=0; i<=10000000; i++) {
Test *test = new Test(i);
delete test;
}
}

Note that commenting/uncommenting the virtual destructor has a huge
impact... (meaning allocation/deallocation itself is about as fast as
the vtable pointer assignment...)

Mirek
Jun 27 '08 #102
In article <ma*******************@ram.dialup.fu-berlin.de>,
ra*@zedat.fu-berlin.de says...
Juha Nieminen <no****@thanks.invalid> writes:
I don't see how this is so much different from what Java does.
»[A]llocation in modern JVMs is far faster than the best
performing malloc implementations. The common code path
for new Object() in HotSpot 1.4.2 and later is
approximately 10 machine instructions (data provided by
Sun; see Resources), whereas the best performing malloc
implementations in C require on average between 60 and 100
instructions per call (Detlefs, et. al.; see Resources).
If somebody wanted to make an equally meaningless claim in the opposite
direction, they could just as accurately claim that "freeing a block of
memory with free() typically consumes no more than 4 machine
instructions, while a single execution of a garbage collector typically
consumes at least 10,000 clock cycles."
And allocation performance is not a trivial component of
overall performance -- benchmarks show that many
real-world C and C++ programs, such as Perl and
Ghostscript, spend 20 to 30 percent of their total
execution time in malloc and free -- far more than the
allocation and garbage collection overhead of a healthy
Java application (Zorn; see Resources).«
If you're using the exact versions of Ghostscript and Perl they tested,
compiled with the exact C++ compiler they used, running the exact
scripts they used for testing, this comparison probably means a lot.
Changing any of these will reduce the meaning of the tests -- and with
any more than minimal changes, there's likely to be no meaning left at
all.

Again, I could equally easily exchange "Java" and "C++", by merely
replacing "Perl and Ghostscript" with a couple of carefully chosen Java
programs.

To put things in perspective, consider this: I recently profiled an
application that I wrote and maintain for my real work. According to the
profiler, the combined total of time spent in operator new and operator
delete (including everything else they called) was 0.115%. The very best
Java (or anything else) could hope to do is beat that by 0.115%, which
would hardly be enough to measure, not to mention caring about.

Of course, you don't know the exact nature of that program or even
generally what it does. You don't have the source code, so you have no
idea how it works, or whether it normally uses dynamic allocation at all
-- IOW, you know exactly as much about it as you do about the tests
cited by IBM.

Unlike them, I'll tell you at least a bit about the code. Like most of
my code, it uses standard containers where they seem useful. Unlike
some, it makes no real attempt at optimizing their usage either
(e.g. by reserving space). Essentially all the data it loads (typically
at least a few megabytes, sometimes as much as a few gigabytes) goes
into dynamically allocated memory (mostly vectors). OTOH, after loading
that data, it does multidimensional scaling and then displays the result
in 3D (using OpenGL). It allocates a lot of memory dynamically, but most
of its time is spent on computation.

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jun 27 '08 #103
In article <66*************@mid.dfncis.de>, mk*@incubus.de says...

[ ... ]
If you use a relatively low-level language such as C++, this certainly
_does_ affect your productivity. I can't understand where you get the
idea that it doesn't. A more expressive language, preferably one
designed for or adaptable to the problem domain, will in many cases
produce a dramatic improvement in productivity over the kind of manual
stone-breaking one is doing in C++.
It sounds a great deal as if you don't know C++ at all. Offhand, I can't
think of any other language I've used that has nearly as good of support
for domain-specific languages as C++.

Just for one well-known example, the Boost.Spirit library is a parser
generator library built entirely out of C++ templates. Though it's not
at all domain-specific itself, Boost.MPL (MetaProgramming Library)
supports metaprogramming that can be used to generate all sorts of other
domain specific languages. In fact, one of the primary uses for
metaprogramming is to embed domain specific languages into C++.

If you honestly care about support for domain specific languages, read
_C++ Template Metaprogramming_ by David Abrahams and Aleksey Gurtovoy.

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jun 27 '08 #104
On Fri, 11 Apr 2008 20:12:15 -0700 (PDT), Mirek Fidler
<cx*@ntllib.org> wrote:
>Well, to save you a bit of time:
I added #include <ctime>

RTIMING("NewDelete");
clock_t start=clock();

or should it be...

clock_t start=clock();
RTIMING("NewDelete");

The first one is twice as slow (375 ms);

the second one is 172 ms, around the same as java -server. Too close to
call. I increased 10,000,000 to 100,000,000.

Time: 1844 ms (java -client)
Time: 1532 ms(java -server)
Time: 47 ms (Java Jet)
Time: 1718 ms (UPP)

The above is not right, because the Jet compiler apparently optimized
it away if the following is not in the loop:

if ( i % 50000000 == 0 )
{
System.out.println( test );
}
so after adding the above back and also adding it to the UPP
version...

Time: 3203 ms (java client)
Time: 1546 ms (java server)
Time: 4406 ms (Jet)
Time: 2078 ms (Upp)

java -server is clear winner. Adding -Xms64m flag can increase
performance in this case (due to fewer frequency calls to GC).

Time: 1406 ms (-server with -Xms64m)
Jun 27 '08 #105
On Apr 12, 6:34 am, Razii <DONTwhatever...@hotmail.com> wrote:
On Fri, 11 Apr 2008 20:12:15 -0700 (PDT), Mirek Fidler

<c...@ntllib.org> wrote:
Well, to save you a bit of time:

I added #include <ctime>

RTIMING("NewDelete");
clock_t start=clock();

or should it be...

clock_t start=clock();
RTIMING("NewDelete");
FYI, "RTIMING" does the "clock" job. It measures the time spent in code
from "RTIMING" to the end of the block. Press Alt+L in TheIDE and you
will get the output log file with the number(s).

Anyway, with numbers like this, I would say we can put "superior GC
memory management performance over manual new/delete" to rest, can't
we?

The main reason why GC seems to perform better is that much more time
has been spent optimizing GC than manual alloc/free in recent years.

And also note that in this example, the GC's job is fairly simple:
there are no live blocks left in the code, which makes GC an almost
zero-time operation.

Mirek
Jun 27 '08 #106
BTW, if you want a more realistic benchmark, try this and its Java
equivalent:

#include <Core/Core.h>

using namespace Upp;

struct Tree {
Tree *left;
Tree *right;
};

Tree *CreateTree(int n)
{
if(n <= 0)
return NULL;
Tree *t = new Tree;
t->left = CreateTree(n - 1);
t->right = CreateTree(n - 1);
return t;
}

void DeleteTree(Tree *t)
{
if(t) {
DeleteTree(t->left);
DeleteTree(t->right);
delete t;
}
}

CONSOLE_APP_MAIN
{
RTIMING("Tree new/delete");
for(int i = 0; i < 100; i++)
DeleteTree(CreateTree(20));
}
Jun 27 '08 #107

"Mirek Fidler" <cx*@ntllib.org> wrote in message
news:aa**********************************@c19g2000prf.googlegroups.com...
On Apr 11, 11:44 pm, Razii <DONTwhatever...@hotmail.com> wrote:
Which "older OS"? Some 30yo?

How about mobile and embedded devices that don't have sophisticated
memory management? If a C++ application is leaking memory, the
memory
might never be returned even after the application is terminated.
This is more dangerous than a memory leak in a Java application, where,
after the application is terminated, all memory is returned by the VM.

If the VM is able to return memory to the OS, so should the C++ runtime.
The JVM can (in principle, at least) compact its heap and return the
now-free space to the OS. An environment that doesn't allow memory
compaction (which includes most C++ implementations) would find this
impossible.
Jun 27 '08 #108
On Apr 12, 7:48 am, "Mike Schilling" <mscottschill...@hotmail.com>
wrote:
"Mirek Fidler" <c...@ntllib.org> wrote in message

news:aa**********************************@c19g2000prf.googlegroups.com...
On Apr 11, 11:44 pm, Razii <DONTwhatever...@hotmail.com> wrote:
Which "older OS"? Some 30yo?
How about mobile and embedded devices that don't have sophisticated
memory management? If a C++ application is leaking memory, the
memory
might never be returned even after the application is terminated.
This is more dangerous than a memory leak in a Java application, where,
after the application is terminated, all memory is returned by the VM.
If the VM is able to return memory to the OS, so should the C++ runtime.

The JVM can (in principle, at least) compact its heap and return the
now-free space to the OS. An environment that doesn't allow memory
compaction (which includes most C++ implementations) would find this
impossible.
Yes, but we are speaking about the "application is terminated" situation
here...

Mirek
Jun 27 '08 #109
On Fri, 11 Apr 2008 21:56:03 -0700 (PDT), Mirek Fidler
<cx*@ntllib.org> wrote:
>BTW, if you want a more realistic benchmark, try this and its Java
equivalent:
that was 4 times slower than the last one.
Jun 27 '08 #110
On Fri, 11 Apr 2008 21:44:48 -0700 (PDT), Mirek Fidler
<cx*@ntllib.org> wrote:
>Anyway, with numbers like this, I would say we can put "superior GC
memory management performance over manual new/delete" to rest, can't
we?
The common claim is that GC is much slower than manual new/delete.
>The main reason why GC seems to perform better is that much more time
has been spent optimizing GC than manual alloc/free in recent years.
How could that be? C is 30 years old, and all this time has been spent
optimizing C and C++ too. GC and especially JIT compilers are newer,
and there has not been enough time to improve them.
>And also note that in this example, the GC's job is fairly simple:
there are no live blocks left in the code, which makes GC an almost
zero-time operation.
How about you or someone post an example where GC would be much slower
than manual new/delete?
Jun 27 '08 #111
On Apr 12, 8:42 am, Razii <DONTwhatever...@hotmail.com> wrote:
On Fri, 11 Apr 2008 21:56:03 -0700 (PDT), Mirek Fidler

<c...@ntllib.org> wrote:
BTW, if you want a more realistic benchmark, try this and its Java
equivalent:

that was 4 times slower than the last one.

In C++? Java? Relative?

Mirek
Jun 27 '08 #112
On Apr 12, 9:17 am, Razii <DONTwhatever...@hotmail.com> wrote:
On Fri, 11 Apr 2008 21:44:48 -0700 (PDT), Mirek Fidler

<c...@ntllib.org> wrote:
Anyway, with numbers like this, I would say we can put "superior GC
memory management performance over manual new/delete" to rest, can't
we?

The common claim is that GC is much slower than manual new/delete.
The main reason why GC seems to perform better is that much more time
has been spent optimizing GC than manual alloc/free in recent years.

How could that be? C is 30 years old and all this time has been spent
in optimizing C and C++ too.
Nope. In particular, new/delete in MSC is absolutely terrible.
Unfortunately, very little effort has gone into the C++ standard
library. A shame, but true.
GC and especially JIT compilers are
newer, and this is not enough time to improve them.
Well, everybody was throwing a lot of money at optimizing VMs and GCs.

IMO, for each developer optimizing malloc/free in a major SW company,
you will find ten people optimizing GC.

Otherwise, explain to me how we can outperform MSC new/delete by 10
times?

Mirek
Jun 27 '08 #113
On Apr 12, 8:42 am, Razii <DONTwhatever...@hotmail.com> wrote:
On Fri, 11 Apr 2008 21:56:03 -0700 (PDT), Mirek Fidler

<c...@ntllib.org> wrote:
BTW, if you want a more realistic benchmark, try this and its Java
equivalent:

that was 4 times slower than the last one.

Also, if possible, please post me your Java equivalent...

Mirek
Jun 27 '08 #114
On Sat, 12 Apr 2008 00:26:18 -0700 (PDT), Mirek Fidler
<cx*@ntllib.org> wrote:
>Otherwise, explain to me how we can outperform MSC new/delete by 10 times?
I was using g++ (MinGW); MSC9 did a little better.

13859 ms (VC++) cl /O2 /GL test.cpp /link /ltcg
17765 ms (g++) g++ -O2 -fomit-frame-pointer "test.cpp" -o "test.exe"

but, for some reason, if I use the U++ IDE to compile, these times
improve:

5142 ms (g++)
3906 ms (MSC9)

still much slower than java -server, ~1400 ms.

Jun 27 '08 #115
On Apr 12, 9:56 am, Razii <DONTwhatever...@hotmail.com> wrote:
On Sat, 12 Apr 2008 00:26:18 -0700 (PDT), Mirek Fidler

<c...@ntllib.org> wrote:
Otherwise, explain to me how we can outperform MSC new/delete by 10 times?

I was using g++ (MinGW); MSC9 did a little better.

13859 ms (VC++) cl /O2 /GL test.cpp /link /ltcg
17765 ms (g++) g++ -O2 -fomit-frame-pointer "test.cpp" -o "test.exe"

but, for some reason, if I use the U++ IDE to compile, these times
improve:

5142 ms (g++)
3906 ms (MSC9)

still much slower than java -server, ~1400 ms.
At this point, it is hard to say what you have really tested/measured ;)

This looks like you have adapted my code for normal C++ (e.g. removing
#include <Core/Core.h>), then tested the original version in TheIDE.
Correct?

Where is your Java version?

Mirek
Jun 27 '08 #116
On Sat, 12 Apr 2008 00:22:02 -0700 (PDT), Mirek Fidler
<cx*@ntllib.org> wrote:
>In C++? Java? Relative?
4 times slower than the last U++ version that was doing 100,000,000

the last U++ version was around 1700ms this one 7671 ms
Jun 27 '08 #117
On Apr 12, 10:04 am, Razii <DONTwhatever...@hotmail.com> wrote:
On Sat, 12 Apr 2008 00:22:02 -0700 (PDT), Mirek Fidler

<c...@ntllib.org> wrote:
In C++? Java? Relative?

4 times slower than the last U++ version that was doing 100,000,000

the last U++ version was around 1700ms this one 7671 ms
Ah, you have missed the important fact that we are now doing a
completely different benchmark :) One that keeps some live data in
memory, thus perhaps exhibiting the real costs of GC...

Mirek
Jun 27 '08 #118
On Sat, 12 Apr 2008 01:08:13 -0700 (PDT), Mirek Fidler
<cx*@ntllib.org> wrote:
>the last U++ version was around 1700ms this one 7671 ms

Ah, you have missed the important fact that we are now doing a
completely different benchmark :) One that keeps some live data in
memory, thus perhaps exhibiting the real costs of GC...

Java -server -Xms64m Test

Time: 17219 ms

hmm... that's twice as slow as the U++ version. Changing some flags.

Java -server -Xms128m -Xmx128m Test

Time: 11344 ms

closer now :) Changing more flags

Java -server -Xms256m -Xmx256m Test

Time: 6188 ms

I win :))

Changing flags more...

Java -server -Xms512m -Xmx512m Test

I am two times faster than U++ :)

--- java version ---

public final class Test
{
Test left;
Test right;

static Test CreateTree(int n)
{
if(n <= 0)
return null;
Test t = new Test();
t.left = CreateTree(n - 1);
t.right = CreateTree(n - 1);
return t;
}

public static void main( String[] arg )
{

long start = System.currentTimeMillis();

for(int i = 0; i < 100; i++)
CreateTree(20);
long end = System.currentTimeMillis();

System.out.println( "Time: " + ( end - start ) + " ms" );

}

}

Jun 27 '08 #119
On Sat, 12 Apr 2008 01:01:35 -0700 (PDT), Mirek Fidler
<cx*@ntllib.org> wrote:
>This looks like you have adapted my code for normal C++ (e.g. removing
#include <Core/Core.h>). Then tested in theide original version.
Correct?
Yes, I removed the <Core/Core.h> include and pasted in the C++ version
I posted earlier.
Jun 27 '08 #120
On Apr 12, 10:26 am, Razii <DONTwhatever...@hotmail.com> wrote:
On Sat, 12 Apr 2008 01:08:13 -0700 (PDT), Mirek Fidler

<c...@ntllib.org> wrote:
the last U++ version was around 1700ms this one 7671 ms
Ah, you have missed important fact that we are now doing completely
different benchmark :) One that will keep some live data in the
memory, this maybe exhibiting the real costs of GC...

Java -server -Xms64m Test
Ah, so any application in Java will now consume at least 64MB?
>
Time: 17219 ms
And even then (this is twice as much memory as C++) GC is 2 times
slower...

Be fair and post the number without tweaking the initial memory pool...
I am two times faster than U++ :)
That was cheap. Anyway, I guess we can put this to rest. Java memory
management is much slower than an optimal C++ allocator, end of story.

Mirek
Jun 27 '08 #121
On Sat, 12 Apr 2008 03:26:28 -0500, Razii
<DO*************@hotmail.com> wrote:
>Java -server -Xms512m -Xmx512m Test

I am two times faster than U++ :)
forgot to post the time with these

Time: 3875 ms (that's twice as fast as U++)

The reason, obviously, is that when -Xms is set very high, like 512m as
above, the frequency of GC is reduced:

[GC 126800K->83413K(518464K), 0.0587731 secs]
[GC 130005K->84057K(518464K), 0.0407264 secs]
[GC 130649K->82140K(518464K), 0.0242517 secs]
[GC 128732K->79580K(518464K), 0.0084715 secs]
(snip)

with Java -server -Xms64m Test, GC ran more often. The output looked
something like..

[GC 56783K->56782K(64832K), 0.0425247 secs]
[Full GC 62606K->13452K(64832K), 0.1563498 secs]
[GC 19276K->19275K(64832K), 0.0398224 secs]
[GC 25099K->25098K(64832K), 0.0455712 secs]
[GC 30922K->30921K(64832K), 0.0426875 secs]
[GC 36745K->36744K(64832K), 0.0422618 secs]
[GC 42568K->42567K(64832K), 0.0424442 secs]
[GC 48391K->48389K(64832K), 0.0426959 secs]
[GC 54213K->54212K(64832K), 0.0425037 secs]
[Full GC 60036K->10883K(64832K), 0.1355823 secs]
[GC 16707K->16706K(64832K), 0.0408873 secs]
[GC 22530K->22529K(64832K), 0.0422355 secs]
(snip)
Jun 27 '08 #122
On Sat, 12 Apr 2008 01:37:17 -0700 (PDT), Mirek Fidler
<cx*@ntllib.org> wrote:
>Ah, so any application in Java will now consume at least 64MB?
In this case we are creating zillions of objects. Even the U++ version
easily gets to 20 MB.
>And even then (this is twice as much as C++) GC is 2 times slower...

Be fair and post the number without tweaking initial memory pool...
without any flags it's Time: 23140 ms

Jun 27 '08 #123
On Sat, 12 Apr 2008 03:41:41 -0500, Razii
<DO*************@hotmail.com> wrote:
>forgot to post the time with these

Time: 3875 ms (that's twice as fast as U++)
the time keeps improving if I keep increasing -Xms:

java -verbose:gc -server -Xms1024m -Xmx1024m Test

[GC 93184K->11199K(1036928K), 0.0734596 secs]
[GC 104383K->6078K(1036928K), 0.0387792 secs]
[GC 99262K->957K(1036928K), 0.0051831 secs]
[GC 94141K->12220K(1036928K), 0.0761963 secs]
[GC 105404K->7564K(1036928K), 0.0435248 secs]
[GC 100748K->2442K(1036928K), 0.0114260 secs]
[GC 95626K->13705K(1036928K), 0.0820523 secs]
[GC 106889K->10070K(1036928K), 0.0493448 secs]
[GC 103254K->4949K(1036928K), 0.0178042 secs]
[GC 98133K->16211K(1036928K), 0.0885911 secs]
[GC 109395K->13597K(1036928K), 0.0561415 secs]
[GC 106781K->8475K(1036928K), 0.0239692 secs]
[GC 101659K->19738K(1036928K), 0.0960955 secs]
[GC 112922K->18144K(1036928K), 0.0621244 secs]
[GC 111328K->13023K(1036928K), 0.0303754 secs]
[GC 106207K->24286K(1036928K), 0.1060395 secs]
[GC 117470K->23712K(1036928K), 0.0688886 secs]
Time: 2850 ms

GC ran only 17 times. Without any flags it runs 779 times, with output
that looks like:

[GC 25857K->25856K(28460K), 0.0182224 secs]
[Full GC 25856K->9472K(28460K), 0.0949867 secs]
[GC 12032K->12031K(26924K), 0.0171254 secs]
[GC 14591K->14590K(26924K), 0.0176701 secs]
[GC 17150K->17150K(26924K), 0.0178316 secs]
[GC 19710K->19709K(26924K), 0.0177715 secs]
[GC 22269K->22268K(26924K), 0.0178945 secs]
[GC 24828K->24827K(27436K), 0.0181017 secs]
[Full GC 24827K->8443K(27436K), 0.0864060 secs]
(snip)
Jun 27 '08 #124
GC ran only 17 times. Without any flags it runs 779 times, with output
that looks like:
Obviously, the fewer times GC runs, the faster the code is.

But we can now stop pretending that GC is faster than manual
management, can't we? :)

Of course, if there is no live memory involved, it can be as fast as
manual. But that proves nothing.

(And, BTW, we are still actually using the heap. Of course, in C++,
you allocated many fewer items there.)

Mirek
Jun 27 '08 #125
On Fri, 11 Apr 2008 21:56:03 -0700 (PDT), Mirek Fidler
<cx*@ntllib.org> wrote:
>#include <Core/Core.h>

using namespace Upp;

struct Tree {
Tree *left;
Tree *right;
};

Tree *CreateTree(int n)
{
if(n <= 0)
return NULL;
Tree *t = new Tree;
t->left = CreateTree(n - 1);
t->right = CreateTree(n - 1);
return t;
}

void DeleteTree(Tree *t)
{
if(t) {
DeleteTree(t->left);
DeleteTree(t->right);
delete t;
}
}

CONSOLE_APP_MAIN
{
RTIMING("Tree new/delete");
for(int i = 0; i < 100; i++)
DeleteTree(CreateTree(20));
}

Running this version on VC++ and g++: 43718 ms (VC++) and 46890 ms
(g++).

That is 5 to 6 times slower than in U++.
Jun 27 '08 #126
Stefan Ram wrote:
OOP might have distracted minds from some other approaches
that might have been even more beneficial. But we will never
learn about such alternative histories, so we can not compare
history to them.

Some features are attributed to OOP, but actually are also
parts of other non-OOP approaches. For example, encapsulation
and a compound entity of related operations on data are
features of an ADT (abstract data type). So, one also has to
give a specific definition of OOP and non-OOP before
discussing its effects on productivity.
It seems to me like you are arguing that OOP is not the *only*
programming paradigm which improves productivity. However, that was not
the point. The question was whether OOP has increased productivity or
not. "Also this other paradigm has the same features as OOP" doesn't
really say "OOP does not increase productivity". It just says "OOP is
not the only thing that increases productivity".
Jun 27 '08 #127
On Sat, 12 Apr 2008 02:47:26 -0700 (PDT), Mirek Fidler
<cx*@ntllib.org> wrote:
>(And, BTW, we are still actually using the heap. Of course, in C++,
you allocated many fewer items there.)
If there is not enough memory on the stack, you don't have a choice.
You have to dynamically allocate memory sometimes.

I changed the loop to

for(int i = 0; i < 15; i++)
DeleteTree(CreateTree(22));

Now you don't have a choice, or do you?

this requires at least 68 MB on U++

Time: 4562 ms (U++)
Time: 27781 ms (g++)

java -server -Xms1024m -Xmx1024m

Time: 3578 ms (the max memory I saw was 300 MB -- but at least it
finished 7 times faster than g++...)

with -Xms75m -Xmx100m the time is around: 9969 ms

On Jet, 2344 ms (wow, that was fast, but the memory peak was 600 MB!)

Jun 27 '08 #128
"Roedy Green" <se*********@mindprod.com.invalid> wrote in message
news:cl********************************@4ax.com...
On Thu, 10 Apr 2008 22:42:00 -0700, "Chris Thomasson"
<cr*****@comcast.net> wrote, quoted or indirectly quoted someone who
said :
>>Oh yeah...

Get stuffed.
You want me to be put up on a wall as a trophy?
If you want people to take time to explain things to
you, get that chip off your shoulder.
Was I acting like a bastar% or something?

Jun 27 '08 #129
"Roedy Green" <se*********@mindprod.com.invalid> wrote in message
news:r6********************************@4ax.com...
On Thu, 10 Apr 2008 22:36:54 -0700, "Chris Thomasson"
<cr*****@comcast.net> wrote, quoted or indirectly quoted someone who
said :
>>If thread A allocates 256 bytes, and thread B races in and concurrently
attempts to allocate 128 bytes... Which thread is going to win?

If all threads share a common heap, new code would have to be
synchronised. If you have that synchronisation, you could use a
single global counter to use to generate a hashCode. Keep in mind we
are talking assembler here. This is the very guts of the JVM. You can
take advantage of an assembler atomic memory increment instruction for
very low overhead synchronisation.
All threads can share a common heap when they hit their respective slow
paths. I suggest you learn about Hoard and/or StreamFlow.

You could invent a JVM where each thread has its own mini-heap for
freshly created objects. Then it would not need to synchronise to
allocate an object. Long lived objects could be moved to a common
synchronised heap.
The invention(s) are already out there. How would you handle remote
deallocations? Before you answer, try doing some research.

JVMs have extreme latitude to do things any way they please so long
as the virtual machine behaves in a consistent way.
JVM has nothing to do with _any_ allocation techniques in particular.

Jun 27 '08 #130
"Lew" <le*@lewscanon.com> wrote in message
news:gM******************************@comcast.com...
Chris Thomasson wrote:
>>Before I answer you, please try and answer a simple question:

If thread A allocates 256 bytes, and thread B races in and concurrently
attempts to allocate 128 bytes... Which thread is going to win?

Both threads.

The semantics of the Java language guarantee that both allocations will
succeed. The JVM will not experience a race condition.
>Oh yeah... Thread C tries to allocate just before thread B...
Which thread is going to win?

All three.
>Think of how a single global pointer to a shared virtual memory range can
be distributed and efficiently managed...

*What* "global pointer" are you talking about? There is no "global
pointer" involved in Java's 'new' operator, at least not one that we as
developers will ever see. That is a detail of how the JVM implements
'new', and is of no concern whatsoever at the language level.
I am talking about how Roedy would implement his explicit point:

Roedy Green: "All Java has to do is add N (the size of the object) to a
counter and
zero out the object. In C++ it also has to look for a hole the right
size and record it in some sort of collection. C++ typically does not
move objects once allocated. Java does."

Java can use a multitude of allocation techniques.

[...]

Jun 27 '08 #131
"Roedy Green" <se*********@mindprod.com.invalid> wrote in message
news:s7********************************@4ax.com...
On Thu, 10 Apr 2008 20:31:49 -0500, Razii
<DO*************@hotmail.com> wrote, quoted or indirectly quoted
someone who said :
>>Creating 10000000 new objects with the keyword 'new' in tight loop.

All Java has to do is add N
All Java has to do is, what exactly?

(the size of the object) to a counter and
zero out the object. In C++ it also has to look for a hole the right
size and record it in some sort of collection. C++ typically does not
move objects once allocated. Java does.
[...]

AFAICT, in a sense, you don't know what you're talking about. Explain
how you synchronize with this counter. Are you going to stripe it? If
so, at what level of granularity? Java can use advanced memory
allocation techniques. How many memory allocators have you written?

Jun 27 '08 #132
"Mike Schilling" <ms*************@hotmail.com> wrote in message
news:f3****************@newssvr27.news.prodigy.net...
>
"Mirek Fidler" <cx*@ntllib.org> wrote in message
news:aa**********************************@c19g2000prf.googlegroups.com...
>On Apr 11, 11:44 pm, Razii <DONTwhatever...@hotmail.com> wrote:
> Which "older OS"? Some 30yo?

How about mobile and embedded devices that don't have sophisticated
memory management? If a C++ application is leaking memory, the memory
might never be returned even after the application is terminated.
This is more dangerous than a memory leak in a Java application, where,
after the application is terminated, all memory is returned by the VM.

If the VM is able to return memory to the OS, so should the C++ runtime.

The JVM can (in principle, at least) compact its heap and return the
now-free space to the OS. An environment that doesn't allow memory
compaction (which includes most C++ implementations) would find this
impossible.
Heap compaction is nothing all that special. It's certainly not tied to
a GC. Not at all.

Jun 27 '08 #133
Razii wrote:
On Thu, 10 Apr 2008 20:37:59 -0500, Razii
<DO*************@hotmail.com> wrote:
>int main(int argc, char *argv[]) {

clock_t start=clock();
for (int i=0; i<=10000000; i++) {
Test *test = new Test(i);
if (i % 5000000 == 0)
cout << test;
}

If I add delete test; to this loop it gets faster. Huh? What's the
explanation for this?

2156 ms

and after I add delete test; to the loop

1781 ms

why is that?

Because new in C++ does NOT directly call an OS allocation function. C++
internally uses something similar to C's malloc/free: memory-management
functions that use OS allocation functions underneath, but keep a
self-maintained heap of free blocks (allocated at the OS level but
unallocated at the application level).

If you keep allocating with new without ever calling delete, fresh OS
allocations have to be made. If you use delete, the block goes into the
free-blocks heap and is probably returned immediately by the next new
call. You probably see the same memory address each time through the
loop.

Regards,

Silvio Bierman
Jun 27 '08 #134
"James Kanze" <ja*********@gmail.com> wrote in message
news:ab**********************************@m44g2000hsc.googlegroups.com...
On Apr 11, 3:07 am, "Chris Thomasson" <cris...@comcast.net> wrote:
"Razii" <DONTwhatever...@hotmail.com> wrote in message
news:v5********************************@4ax.com...
On Thu, 10 Apr 2008 17:33:21 +0300, Juha Nieminen
<nos...@thanks.invalid> wrote:
>However, part of my C++ programming style just naturally also avoids
>>doing tons of news and deletes in tight loops (which is, again, very
>>different from eg. Java programming where you basically have no
>>choice)
However, Java allocates new memory blocks on its internal
heap (which is allocated in huge chunks from the OS). In
this way, in most of the cases it bypasses memory allocation
mechanisms of the underlying OS and is very fast. In C++,
each "new" allocation request will be sent to the operating
system, which is slow.
You are incorrect. Each call into "new" will "most-likely" be
"fast-pathed" into a "local" cache.
Are you sure about the "most-likely" part? From what little
I've seen, most C runtime libraries simply threw in a few locks
to make their 30 or 40 year old malloc thread safe, and didn't
bother with more. Since most of the marketing benchmarks don't
use malloc, there's no point in optimizing it. I suspect that
you're describing best practice, and not most likely. (But I'd
be pleased to learn differently.)
Yeah... I am describing best practice, with the hope that most impls
follow something similar. I should have said:

<If you're using an optimized memory allocator, then each call into
"new" will "most-likely" be "fast-pathed" into a "local" cache.>

Luckily, there are optimized allocators out there that are compatible with
C/C++.

Jun 27 '08 #135
On Apr 12, 1:23 pm, Razii <DONTwhatever...@hotmail.com> wrote:
On Sat, 12 Apr 2008 02:47:26 -0700 (PDT), Mirek Fidler

<c...@ntllib.org> wrote:
(And, BTW, we are still actually using the heap. Of course, in C++,
you allocated much less items there).

If there is not enough memory on stack, you don't have a choice.
It is not about stack only.

E.g. Vector<String> in U++ allocates a single block of memory for all
Strings (as long as they are short enough).
You
have to dynamically allocate memory sometimes.
Yes, but much less :)
I changed the loop to

for(int i = 0; i < 15; i++)
DeleteTree(CreateTree(22));

Now you don't have a choice, or do you?

this requires at least 68 MB on U++

Time: 4562 ms (U++)
Time: 27781 ms (g++)

java -server -Xms1024m -Xmx1024m

Time: 3578 ms (max memory I saw was 300 MB -- but at least it
finished 7 times faster than g++ ...

with -Xms75m -Xmx100m the time is around: 9969 ms

On Jet, 2344 ms (wow, that was fast but memory peak was 600 MB!)
IMO, manual management still wins...

Mirek
Jun 27 '08 #136

"Razii" <DO*************@hotmail.comwrote in message
news:1r********************************@4ax.com...
On Fri, 11 Apr 2008 21:44:48 -0700 (PDT), Mirek Fidler
<cx*@ntllib.orgwrote:
>>Anyway, with numbers like this, I would say we can put "superior GC
memory management performance over manual new/delete" to the rest, can
we?

The common claim is that GC is much slower than manual new/delete.
[...]

The common claim is false. The main difference is that GC is totally
non-deterministic and it's not available everywhere. Sometimes, it's not the
right choice for a given job. Act... BTW, how does GC get rid of all forms
of memory management? How do you create an efficient dynamic cache under an
environment controlled by a GC? I know the answer, and it involves a form of
manual memory management indeed...

Jun 27 '08 #137
On Apr 11, 5:24 am, Roedy Green <see_webs...@mindprod.com.invalid>
wrote:
On Thu, 10 Apr 2008 20:31:49 -0500, Razii
<DONTwhatever...@hotmail.comwrote, quoted or indirectly quoted
someone who said :
Creating 10000000 new objects with the keyword 'new' in tight loop.

All Java has to do is add N (the size of the object) to a counter and
zero out the object.
Does it mean Java allocator is serialized? Well, that would be a
problem... Good C++ allocators are not locked for the fast path.
In C++ it also has to look for a hole the right
size and record it in some sort of collection.
Which for the fast path is (or should be) equivalent of about 20 asm
ops.

What you basically need to do is to divide the size by "small
quantum" (e.g. 16) to get the bucket type, then unlink single item
from the bucket. Allocation finished.
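[Editorial note: the fast path Mirek describes can be sketched roughly as below. This is a single-threaded toy, not U++'s actual allocator; the class name, the 16-byte quantum, and the malloc fallback are all illustrative.]

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Toy size-class allocator: round the request up to a 16-byte quantum to
// pick a bucket, then pop one block off that bucket's free list.
class BucketAllocator {
    static const std::size_t kQuantum = 16;
    static const std::size_t kBuckets = 64;    // covers sizes up to ~1 KB

    struct FreeNode { FreeNode* next; };
    FreeNode* free_list_[kBuckets];

public:
    BucketAllocator() {
        for (std::size_t i = 0; i < kBuckets; ++i)
            free_list_[i] = 0;
    }

    void* allocate(std::size_t size) {
        std::size_t b = (size + kQuantum - 1) / kQuantum;  // size -> bucket
        if (b >= kBuckets)
            return std::malloc(size);          // large request: punt
        if (FreeNode* n = free_list_[b]) {     // fast path: unlink one item
            free_list_[b] = n->next;
            return n;
        }
        return std::malloc(b * kQuantum);      // empty bucket: refill slowly
    }

    void deallocate(void* p, std::size_t size) {
        std::size_t b = (size + kQuantum - 1) / kQuantum;
        if (b >= kBuckets) { std::free(p); return; }
        FreeNode* n = static_cast<FreeNode*>(p);
        n->next = free_list_[b];               // push back onto the bucket
        free_list_[b] = n;
    }
};
```

Once a bucket is warm, allocate() is a divide, a load, and a store -- roughly the handful of instructions claimed above.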

Mirek
Jun 27 '08 #138
"Mirek Fidler" <cx*@ntllib.orgwrote in message
news:ff**********************************@s39g2000prd.googlegroups.com...
On Apr 11, 5:24 am, Roedy Green <see_webs...@mindprod.com.invalid>
wrote:
>On Thu, 10 Apr 2008 20:31:49 -0500, Razii
<DONTwhatever...@hotmail.comwrote, quoted or indirectly quoted
someone who said :
>Creating 10000000 new objects with the keyword 'new' in tight loop.

All Java has to do is add N (the size of the object) to a counter and
zero out the object.

Does it mean Java allocator is serialized? Well, that would be a
problem... Good C++ allocators are not locked for the fast path.
>In C++ it also has to look for a hole the right
size and record it in some sort of collection.

Which for the fast path is (or should be) equivalent of about 20 asm
ops.

What you basically need to do is to divide the size by "small
quantum" (e.g. 16) to get the bucket type, then unlink single item
from the bucket. Allocation finished.
Yup. In the various high-performance allocators I have implemented, the
"very high-level basic pattern" I usually go with resembles "something"
like:

Determine bucket-size, and then;

1. Attempt allocation from per-thread heap.

2. Attempt allocation from per-cpu heap.

3. Attempt allocation from global heap.

4. Finally, ask the OS... ;^(
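[Editorial note: reduced to a single block size, with the per-CPU tier omitted and all names illustrative (this is not Chris's actual code), the cascade above might look like:]

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>

// One size class only, for clarity; a real allocator keeps one such
// cascade per bucket. Only the global tier needs a lock.
struct Node { Node* next; };

const std::size_t kBlockSize = 64;

thread_local Node* tl_free = 0;     // 1. per-thread heap: lock-free fast path
Node* global_free = 0;              // 3. global heap: shared, mutex-protected
std::mutex global_mutex;

void* alloc_block() {
    if (Node* n = tl_free) {                    // 1. per-thread heap
        tl_free = n->next;
        return n;
    }
    {                                           // 3. global heap
        std::lock_guard<std::mutex> lock(global_mutex);
        if (Node* n = global_free) {
            global_free = n->next;
            return n;
        }
    }
    return std::malloc(kBlockSize);             // 4. finally, ask the OS
}

void free_block(void* p) {
    Node* n = static_cast<Node*>(p);
    n->next = tl_free;                          // give it back to tier 1
    tl_free = n;
}
```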
Jun 27 '08 #139
"Chris Thomasson" <cr*****@comcast.netwrote in message
news:Cq******************************@comcast.com...
>
"Razii" <DO*************@hotmail.comwrote in message
news:1r********************************@4ax.com...
>On Fri, 11 Apr 2008 21:44:48 -0700 (PDT), Mirek Fidler
<cx*@ntllib.orgwrote:
>>>Anyway, with numbers like this, I would say we can put "superior GC
memory management performance over manual new/delete" to the rest, can
we?

The common claim is that GC is much slower than manual new/delete.
[...]

The common claim is false. The main difference is that GC is totally
non-deterministic and it's not available everywhere. Sometimes, it's not the
right choice for a given job. Act...
Act should == Etc...
BTW, how does GC get rid of all forms of memory management? How do you
create an efficient dynamic cache under an environment controlled by a GC?
I know the answer, and it involves a form of manual memory management
indeed...
Jun 27 '08 #140
Mirek Fidler wrote:
>
Does it mean Java allocator is serialized? Well, that would be a
problem... Good C++ allocators are not locked for the fast path.
(Ignoring for now that there isn't one Java allocator), typically there
is a per thread area so these counters do not need locks.
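[Editorial note: a minimal sketch of that per-thread scheme, illustrative only and not any actual JVM's code. Each thread bumps a private cursor through its own chunk, so the fast path needs no lock or atomic; only grabbing a fresh chunk touches shared state. Blocks are never individually freed here, which mirrors a collector that reclaims whole regions at once.]

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

const std::size_t kChunkSize = 1 << 20;   // 1 MB private chunk per thread

thread_local char* tlab_cur = 0;          // the per-thread "counter"
thread_local char* tlab_end = 0;

void* bump_alloc(std::size_t size) {
    size = (size + 7) & ~std::size_t(7);  // keep 8-byte alignment
    if (tlab_cur == 0 || tlab_cur + size > tlab_end) {
        // Slow path: fetch a fresh chunk (the only potentially contended
        // step; malloc stands in for whatever shared region a real
        // runtime would carve chunks from).
        char* chunk = static_cast<char*>(std::malloc(kChunkSize));
        tlab_cur = chunk;
        tlab_end = chunk + kChunkSize;
    }
    void* p = tlab_cur;                   // fast path: add N to the counter...
    tlab_cur += size;
    return p;                             // ...and hand out the old value
}
```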

Mark Thornton
Jun 27 '08 #141
On Apr 12, 12:06 pm, Razii <DONTwhatever...@hotmail.comwrote:
On Fri, 11 Apr 2008 21:56:03 -0700 (PDT), Mirek Fidler

<c...@ntllib.orgwrote:
#include <Core/Core.h>
using namespace Upp;
struct Tree {
Tree *left;
Tree *right;
};
Tree *CreateTree(int n)
{
if(n <= 0)
return NULL;
Tree *t = new Tree;
t->left = CreateTree(n - 1);
t->right = CreateTree(n - 1);
return t;
}
void DeleteTree(Tree *t)
{
if(t) {
DeleteTree(t->left);
DeleteTree(t->right);
delete t;
}
}
CONSOLE_APP_MAIN
{
RTIMING("Tree new/delete");
for(int i = 0; i < 100; i++)
DeleteTree(CreateTree(20));
}

Running this version: VC++ 43718 ms, g++ 46890 ms

that is 5 to 6 times slower than in U++
Well, that actually proves my point, does it not?

(The one about the average implementation of the standard C++ library being
crap, and about zero effort invested into manual allocators).

Mirek
Jun 27 '08 #142

"Mark Thornton" <ma*************@ntlworld.comwrote in message
news:AQ*******************@newsfe1-win.ntli.net...
Mirek Fidler wrote:
>>
Does it mean Java allocator is serialized? Well, that would be a
problem... Good C++ allocators are not locked for the fast path.

(Ignoring for now that there isn't one Java allocator), typically there is
a per thread area so these counters do not need locks.
That is, if the JVM happens to use counter(s) off a base to implement its
allocator. I prefer to use simple segregated per-thread slabs. Anyway, you're
100% correct in that there is no such thing as a "Java Allocator". There are
oh so very many different ways to do this...

Jun 27 '08 #143
It's so unfair!

Razii wrote:
On Fri, 11 Apr 2008 03:35:27 +0300, Juha Nieminen
<no****@thanks.invalidwrote:
Razii wrote:
In C++, each "new" allocation request
will be sent to the operating system, which is slow.
That's blatantly false.

Well, my friend, I have proven you wrong. Razi has been victorious
once again :)
Time: 2125 ms (C++)
Time: 328 ms (java)
--- c++--

#include <ctime>
#include <cstdlib>
#include <iostream>

using namespace std;

class Test {
public:
Test (int c) {count = c;}
This is assignment after initialization.
It should be like this:
Test(int c) : count(c) {}
virtual ~Test() { }
int count;
};

int main(int argc, char *argv[]) {

clock_t start=clock();
for (int i=0; i<=10000000; i++) {
Test *test = new Test(i);
if (i % 5000000 == 0)
cout << test;
The memory you allocated is never released, so every new is actually an
allocation request to libc.
When the heap runs out, a new page of memory is allocated from the OS.
}
clock_t endt=clock();
std::cout <<"Time: " <<
double(endt-start)/CLOCKS_PER_SEC * 1000 << " ms\n";
}

-- java ---

import java.util.*;

class Test {
Test (int c) {count = c;}
int count;

public static void main(String[] arg) {

long start = System.currentTimeMillis();

for (int i=0; i<=10000000; i++) {
Test test = new Test(i);
if (i % 5000000 == 0)
System.out.println (test);
After this, when a new object is assigned to 'test', 'test' no longer
references the older object.
So the older object is free to be released.
Once the Java VM's heap runs out, a gc() is run, and the
memory for the older objects is free for allocation again.
In this example, the Java VM requests memory pages from the OS once and
then never needs to allocate from it again.
}
long end = System.currentTimeMillis();
System.out.println("Time: " + (end - start) + " ms");

}
}
Jun 27 '08 #144
Lew
"Lew" <le*@lewscanon.comwrote in message
>*What* "global pointer" are you talking about? There is no "global
pointer" involved in Java's 'new' operator, at least not one that we
as developers will ever see. That is a detail of how the JVM
implements 'new', and is of no concern whatsoever at the language level.
Chris Thomasson wrote:
I am talking about how Roedy would implement his explicit point:

Roedy Green: "All Java has to do is add N (the size of the object) to a
counter and
zero out the object. In C++ it also has to look for a hole the right
size and record it in some sort of collection. C++ typically does not
move objects once allocated. Java does."
Yes. We knew that.

--
Lew
Jun 27 '08 #145
Lew
Lew wrote:
"Lew" <le*@lewscanon.comwrote in message
>>*What* "global pointer" are you talking about? There is no "global
pointer" involved in Java's 'new' operator, at least not one that we
as developers will ever see. That is a detail of how the JVM
implements 'new', and is of no concern whatsoever at the language level.

Chris Thomasson wrote:
>I am talking about how Roedy would implement his explicit point:

Roedy Green: "All Java has to do is add N (the size of the object) to
a counter and
zero out the object. In C++ it also has to look for a hole the right
size and record it in some sort of collection. C++ typically does not
move objects once allocated. Java does."

Yes. We knew that.
However, Roedy did not mention a "global pointer". You introduced that into
the conversation, and have yet to explain what you mean by that.

--
Lew
Jun 27 '08 #146
Lew
Chris Thomasson wrote:
Before you answer, try doing some research.
Chris, you are on the edge of being PLONKed.

--
Lew
Jun 27 '08 #147
Lew
Chris Thomasson wrote:
AFAICT, in a sense, you don't know what you're talking about. Explain how you
synchronize with this counter? Are you going to stripe it? If so, at
what level of granularity? Java can use advanced memory allocation
techniques.
How many memory allocators have you written?
These questions are meaningless and irrelevant.

No one has to synchronize with the memory "counter" (I would use the term
"pointer" myself) in Java. That is handled by the JVM.

--
Lew
Jun 27 '08 #148
"Lew" <le*@lewscanon.comwrote in message
news:Jp******************************@comcast.com...
Chris Thomasson wrote:
>AFAICT, in a sense, you don't know what you're talking about. Explain how you
synchronize with this counter? Are you going to stripe it? If so, at what
level of granularity? Java can use advanced memory allocation techniques.
How many memory allocators have you written?

These questions are meaningless and irrelevant.

No one has to synchronize with the memory "counter" (I would use the term
"pointer" myself) in Java. That is handled by the JVM.
Perhaps I am misunderstanding what Roedy meant by counter. See, I thought
that he meant count from a single base of memory. In other words, increment
a pointer. This can be analogous to a counter. Think in terms of using FAA
to increment a pointer location off of common base memory. This can
definitely be used for an allocator that does not really like to free
anything... How would you do it? I bet you would not use this method.

Jun 27 '08 #149
"Lew" <le*@lewscanon.comwrote in message
news:Jp******************************@comcast.com...
Lew wrote:
>"Lew" <le*@lewscanon.comwrote in message
>>>*What* "global pointer" are you talking about? There is no "global
pointer" involved in Java's 'new' operator, at least not one that we as
developers will ever see. That is a detail of how the JVM implements
'new', and is of no concern whatsoever at the language level.

Chris Thomasson wrote:
>>I am talking about how Roedy would implement his explicit point:

Roedy Green: "All Java has to do is add N (the size of the object) to a
counter and
zero out the object. In C++ it also has to look for a hole the right
size and record it in some sort of collection. C++ typically does not
move objects once allocated. Java does."

Yes. We knew that.

However, Roedy did not mention a "global pointer". You introduced that
into the conversation, and have yet to explain what you mean by that.
I mean atomically incrementing a global pointer off a common base of memory.
This is basic memory allocator implementation 101. You know this. Perhaps
Roedy was talking about a distributed model. What say you? Java can use
both, and all methods in between. You don't necessarily want to send atomic
mutations to a common location. Perhaps Roedy meant N counts off the bases
of multiple memory pools. I don't know. I was speculating. I hope I was
wrong. I thought he meant count off a single location. That's not going to
scale very well... You can break a big buffer into little ones and distribute
them over threads... Then the "count" would be off a thread local pool
instead of a global set of whatever... I have created a lot of allocators,
and know a lot about some of the caveats.

Jun 27 '08 #150

This discussion thread is closed

Replies have been disabled for this discussion.
