Garbage collection in C++

pushpakulkar

Hi all,

Is garbage collection possible in C++? It doesn't come as part of
language support. Is there a specific reason for this in the way the
language is designed? Or is it discouraged for some other
specific reason? If someone can give input on this, it will
be of great help.

Regards,
Pushpa

Nov 15 '08


Juha Nieminen

Pete Becker wrote:

I think the Java folks have come to the conclusion that finalization is
only useful for detecting failures to clean up non-memory resources. So
a class's finalizer throws an exception if the resources that the object
manages haven't been properly disposed of.

Personally I believe that this is caused by the limitations of Java
rather than being a good thing Java strives for.

If I'm not mistaken, C# offers a tool to alleviate that problem: The
'using' block, where destructors/finalizers can actually be very useful.

(Of course 'using' blocks can only be used locally and don't help when
the same resource is shared among different modules...)

Nov 19 '08 #101

Juha Nieminen

James Kanze wrote:

Yes and no. In my case, most of the software I write runs on a
single machine, or on just a couple of machines; I'm not writing
shrink-wrapped software. And from a cost point of view, it's
several orders of magnitudes cheaper to configure them with
sufficient memory than it is for me to optimize the memory
footprint down to the last byte.

I have worked in a project where the amount of calculations performed
by a set of programs was directly limited by the amount of available
memory. (If the program started to swap, it was just hopeless to try to
wait for it to finish.)

Reducing the memory footprint of the program to half meant that
simulations twice as large could be performed. Given that these programs
were used by several people in their own computers, I would say that the
total benefit of squeezing out the last unused bits was worth it.

And of course it also means that (after the optimization) if someone
doubled the amount of RAM in their computer, they could run simulations
four times larger than before, rather than just two. That is an enormous
benefit.

Nov 19 '08 #102

Juha Nieminen

James Kanze wrote:

There are objective facts, which do hold for everyone, whether
you like it or not. And a simple tool like garbage collection
is simply not capable of "encouraging" (or "discouraging")
anything. You're committing anthropomorphism, and giving garbage
collection a power that it just doesn't have.

Would you say that lisp encourages people to write programs using a
functional style? Would you say that C encourages people to write with
an imperative style?

Nov 19 '08 #103

Dilip

On Nov 19, 12:15 pm, Juha Nieminen <nos...@thanks.invalid> wrote:

James Kanze wrote:
There are objective facts, which do hold for everyone, whether
you like it or not. And a simple tool like garbage collection
is simply not capable of "encouraging" (or "discouraging")
anything. You're committing anthropomorphism, and giving garbage
collection a power that it just doesn't have.

Would you say that lisp encourages people to write programs using a
functional style? Would you say that C encourages people to write with
an imperative style?

That's like asking if a butcher's knife can be used to stab someone.
An object can have several intrinsic qualities. What you want to do
with it is your problem. C++ is a multi-paradigm language as has been
pointed out a million different times. How you want to approach a
software engineering problem is dependent on what tools you use within
the umbrella of the language to solve it. STL, RAII all offer a
certain way to approach a problem. Garbage collection simply adds
another dimension to it. James has been explaining this several
different ways and it just doesn't seem to click. Instead we get into
absolutes like: "Only morons use GC" or "GC encourages anti-modular
programming" or "GC refused to make coffee for me this morning...".

I am not appealing to authority here but instead of talking in
abstract like "GC will lead to this and that..." could we at least
recognize the fact that James has actually used Boehm's collector in
his projects and found real benefits? Can we at least admit practice
trumps theory?

Nov 19 '08 #104

Dilip

On Nov 19, 11:59 am, Juha Nieminen <nos...@thanks.invalid> wrote:

Pete Becker wrote:
I think the Java folks have come to the conclusion that finalization is
only useful for detecting failures to clean up non-memory resources. So
a class's finalizer throws an exception if the resources that the object
manages haven't been properly disposed of.

Personally I believe that this is caused by the limitations of Java
rather than being a good thing Java strives for.

At least in the .NET world one is heavily discouraged from writing
finalizers. As was pointed out, it is necessary only when a class has
to dispose of its "unmanaged" resources in a timely manner. Of
course, that still depends on the clients of the class
remembering to use that dreaded 'using' block. Also, throwing an
exception from a finalizer is useless because the runtime (CLR) will
simply ignore it.

If I'm not mistaken, C# offers a tool to alleviate that problem: The
'using' block, where destructors/finalizers can actually be very useful.

It's a little more complicated than that. C# doesn't have a concept of
destructors; however, C++/CLI (architected after Sutter came onboard)
retains dtors as they work in the regular C++ world while also
adding mechanisms to implement finalizers (declared with a '!'
prefix, just as '~' is used for dtors).

(Of course 'using' blocks can only be used locally and don't help when
the same resource is shared among different modules...)

Not sure what this means, but we are straying OT, so let's forget it.

Nov 19 '08 #105

Matthias Buelow

Juha Nieminen wrote:

I have worked in a project where the amount of calculations performed
by a set of programs was directly limited by the amount of available
memory. (If the program started to swap, it was just hopeless to try to
wait for it to finish.)

You seem to think that a program that uses GC uses more memory, which is
just false in the general case. Why should it use more? The amount of
memory a program is using depends on the program logic and its memory
trace and not on how the memory is managed, which is, as I said, an
uninteresting implementation detail.

Nov 19 '08 #106

Pete Becker

On 2008-11-19 12:59:55 -0500, Juha Nieminen <no****@thanks.invalid> said:

Pete Becker wrote:
I think the Java folks have come to the conclusion that finalization is
only useful for detecting failures to clean up non-memory resources. So
a class's finalizer throws an exception if the resources that the object
manages haven't been properly disposed of.

Personally I believe that this is caused by the limitations of Java
rather than being a good thing Java strives for.

If I'm not mistaken, C# offers a tool to alleviate that problem: The
'using' block, where destructors/finalizers can actually be very useful.

Well, sure, if you change the context, the answer changes. We were
talking about finalizers and garbage collection, not about finalizers
and not garbage collection.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)

Nov 19 '08 #107

Jean-Marc Bourguet

Matthias Buelow <mk*@incubus.de> writes:

Juha Nieminen wrote:

I have worked in a project where the amount of calculations performed
by a set of programs was directly limited by the amount of available
memory. (If the program started to swap, it was just hopeless to try to
wait for it to finish.)

You seem to think that a program that uses GC uses more memory, which is
just false in the general case. Why should it use more? The amount of
memory a program is using depends on the program logic and its memory
trace and not on how the memory is managed, which is, as I said, an
uninteresting implementation detail.

GCs tend to trade off space for speed (I'm probably not up to date on GC
algorithms, but ISTR that the cost of a GC scan is proportional to the
amount of live memory, so systems like generational GC scan less memory at
the risk of keeping some unreachable memory longer).

GCs use reachability as a safe approximation to liveness, but that is only
an approximation.

GC users tend not to NULL unneeded references (which would help keep the
approximation good) when there is no risk of the equivalent of a memory
leak by accumulation of reachable memory which will never be used. Some
even seem to think that the last possibility doesn't exist as "the GC will
take care of the dead memory"; wrong, only of the unreachable memory. OTOH
I've seen Lisp programs really go out of their way to "cons" less, yes, Lisp
programs which managed the memory manually (at least in critical loops).
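
The difference between "reachable" and "live" can be sketched in standard C++. This is only an illustration under assumptions: std::shared_ptr stands in for a collector (reference counting is not a tracing GC, but it shows the same effect), and the names Sample and Recorder are hypothetical.

```cpp
#include <cassert>   // for the checks in the usage note
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical payload type; the name and size are illustrative only.
struct Sample {
    std::vector<double> data;
    explicit Sample(std::size_t n) : data(n, 0.0) {}
};

// A history list that is appended to but never read again: its elements are
// reachable (the container still refers to them) yet dead (nothing uses them).
struct Recorder {
    std::vector<std::shared_ptr<Sample>> history;

    // Stores a new sample and hands back a weak observer that does not
    // keep the sample alive on its own.
    std::weak_ptr<Sample> record(std::size_t n) {
        auto s = std::make_shared<Sample>(n);
        history.push_back(s);
        return s;
    }

    // The equivalent of "NULLing unneeded references".
    void forget() { history.clear(); }
};
```

As long as history still holds the pointer, the observer returned by record() reports the sample as alive, however dead it is from the program's point of view; only after forget() does expired() return true. A tracing collector would be in the same position: it can only reclaim what the program has stopped referring to.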

Yours,

--
Jean-Marc

Nov 19 '08 #108

Chris M. Thomasson

"Matthias Buelow" <mk*@incubus.de> wrote in message
news:6o************@mid.dfncis.de...

Juha Nieminen wrote:

I have worked in a project where the amount of calculations performed
by a set of programs was directly limited by the amount of available
memory. (If the program started to swap, it was just hopeless to try to
wait for it to finish.)

You seem to think that a program that uses GC uses more memory, which is
just false in the general case.

Well, memory only gets reclaimed when the GC "decides" to run a scan. So, if
the program is making frequent allocations, and the GC does not run enough
scans to keep up, it has no choice but to keep expanding its internal heaps.

// GC world
struct foo {
  void do_something();
};
void thread() {
  foo* f;
  for (;;) {
    f = new foo();
    f->do_something();
  }
}
int main() {
  // create 32 threads
  return 0;
}

// Manual world
struct foo {
  void do_something();
};
void thread() {
  foo* f;
  for (;;) {
    f = new foo();
    f->do_something();
    delete f;
  }
}
int main() {
  // create 32 threads
  return 0;
}

Why should it use more? The amount of
memory a program is using depends on the program logic and its memory
trace and not on how the memory is managed, which is, as I said, an
uninteresting implementation detail.

Which program is going to use more memory? In the manual world, there can
only ever be up to 32 foo objects at a time. In the GC world, well, there can
potentially be hundreds, or thousands, in between GC scan intervals...
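
A rough single-threaded simulation of this argument, offered as a sketch only: it does not use a real collector. The helper peak_live and its batch parameter are hypothetical, with batch modelling how many allocations happen between two reclamation passes.

```cpp
#include <cassert>   // for the checks in the usage note
#include <cstddef>
#include <vector>

// Counts how many foo objects exist at any moment.
struct foo {
    static std::size_t live;
    foo() { ++live; }
    ~foo() { --live; }
};
std::size_t foo::live = 0;

// Allocates `total` objects one at a time and returns the peak number alive.
// batch == 1 behaves like an immediate delete; larger values behave like a
// collector that only reclaims once every `batch` allocations.
std::size_t peak_live(std::size_t total, std::size_t batch) {
    std::vector<foo*> garbage;  // no longer used, but not yet reclaimed
    std::size_t peak = 0;
    for (std::size_t i = 0; i < total; ++i) {
        garbage.push_back(new foo());  // used once, then it is garbage
        if (foo::live > peak) peak = foo::live;
        if (garbage.size() >= batch) {  // a "scan" happens here
            for (foo* f : garbage) delete f;
            garbage.clear();
        }
    }
    for (foo* f : garbage) delete f;  // final sweep
    return peak;
}
```

With batch set to 1 the peak stays at one object per allocating thread; with batch set to 1000 the peak is 1000, although the program logic is identical. How large the gap is in practice depends on how often a given collector actually runs.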

Nov 19 '08 #109

Chris M. Thomasson

"Chris M. Thomasson" <no@spam.invalid> wrote in message
news:RZ*****************@newsfe06.iad...

"Matthias Buelow" <mk*@incubus.de> wrote in message
news:6o************@mid.dfncis.de...
Juha Nieminen wrote:

I have worked in a project where the amount of calculations performed
by a set of programs was directly limited by the amount of available
memory. (If the program started to swap, it was just hopeless to try to
wait for it to finish.)

You seem to think that a program that uses GC uses more memory, which is
just false in the general case.

Well, memory only gets reclaimed when the GC "decides" to run a scan. So,
if the program is making frequent allocations, and the GC does not run
enough scans to keep up, it has no choice but to keep expanding its internal
heaps.

[...]

Here is a pseudo-code example that should really make a GC environment hog
memory:

struct node {
  node* m_next;

  void do_something_read_only() const;
};
static mutex g_lock;
static node* g_nodes = NULL;
void writer_thread() {
  for (unsigned i = 1 ;; ++i) {
    if (i % 10000) {
      node* n = new node();
      mutex::guard lock(g_lock);
      n->m_next = g_nodes;
      // membar #StoreStore
      g_nodes = n;
    } else {
      g_nodes = NULL;
    }
  }
}
void reader_thread() {
  for (;;) {
    node* n = g_nodes;
    // data-dependent load barrier
    while (n) {
      n->do_something_read_only();
      n = n->m_next;
      // data-dependent load barrier
    }
  }
}
int main() {
  // create 10 writer threads
  // create 32 reader threads
  return 0;
}

Nov 19 '08 #110

Keith H Duggar

On Nov 19, 5:31 am, James Kanze <james.ka...@gmail.com> wrote:

On Nov 19, 5:10 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
On Nov 18, 5:28 am, James Kanze <james.ka...@gmail.com> wrote:
On Nov 18, 5:23 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
On Nov 17, 4:18 am, James Kanze <james.ka...@gmail.com> wrote:
On Nov 16, 1:24 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
I really haven't ever felt the need for a GC engine in my
work. Could a GC engine have made my job easier in a few
cases? Maybe. I can't say for sure. At most it could have
perhaps saved a bit of writing work, but not increased the
correctness of my code in any way. C++ makes it quite easy
to write safe code when you follow some simple rules.
Yes and no. C++ certainly provides a number of tools which
can be used to improve safety. It doesn't require their
use, however, and I've seen a lot of programmers who don't
use them systematically. And of course, human beings being
what they are, regardless of the tools or the process,
mistakes will occasionally creep in.
Yes. However, garbage collection is /only/ going to reclaim
memory, eventually. It's not going to correct the logical and
potentially far more serious design bugs that leaked memory in
the first place.
Woah. If the error is leaked memory, garbage collection may
correct it. Or it may not, depending on whether there is
still a pointer floating around to the memory. (The Java
bugs data base has more than a few cases of memory leaks in
it.)
That's not what garbage collection is for. Garbage
collection isn't designed to make an incorrect program
correct---I don't think any tool can guarantee that.
And I never claimed GC does or was designed to correct errors.
One can see from the attached context it was you who posited
"mistakes will occasionally creep in" not I. If you did not
mean memory leaks what did you mean?

Nothing in particular. Just that regardless of the technique
used, code written by human beings will contain errors; your
development process should be designed to detect and remove them
as far upstream as possible.

Agreed.

([GC] also prevents them from being used as a security hole
--- very important if you're connecting to the web.)

True. Point taken.

(This in response to your claim
that garbage collection masks errors, whereas in fact, it makes
the detection of some errors, like dangling pointers, possible.)

Garbage collection (like all of the other tools I know) is
designed to make it easier to write a correct program. It
also makes the effects of some errors (dangling pointers)
less critical.
And it's exactly by lessening the effects of some errors that
GC can actually /hide/ those errors.

No. It is exactly by lessening the effects of those errors that
it makes their detection possible.

[snip]

Purify will catch the error, but delivered code doesn't run
under Purify, so if the error doesn't show up in your test
cases, you're hosed without garbage collection; you have
undefined behavior, and while it might core dump, it might also
do anything else. Including (as has actually happened in one
case) allowing someone connected to your server to break into
your machine (and if the server is running as root, to do pretty
much anything it wants with root privileges). With garbage
collection, of course, there is no undefined behavior; you set
whatever bits you need to identify the error in the
deconstructed object, and you test them with each use of the
object, handling the detected error however you think best. (I
like assert for this, so I know I get the core dump.)

The problem is that in C++, when you deconstruct an object, you
also free the memory, and that memory can be reused for another
object, so you can't guarantee any state which would identify it
as having been deconstructed. When you deconstruct an object
and are using garbage collection, you can scribble all over the
object, overwriting it with values that can't possibly be legal
vptr's, and you can be moderately sure that those values won't
be overwritten as long as the ex-object is still accessible.
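
A minimal sketch of the "set a bit, test it on every use" idea, under loudly stated assumptions: there is no collector here, so std::shared_ptr is used to keep the storage valid while any reference remains (standing in for the GC guarantee described above), dispose() is a hypothetical name for the logical end of life, and the check throws rather than asserts so that the example can be exercised.

```cpp
#include <cassert>   // for the checks in the usage note
#include <memory>
#include <stdexcept>

class Foo {
public:
    // Logical end of life: mark the object dead. The storage itself stays
    // valid for as long as some shared_ptr still refers to it.
    void dispose() { zombie_ = true; }

    bool is_zombie() const { return zombie_; }

    // Every use tests the bit first, so a stale reference is detected
    // instead of silently operating on a dead object.
    void activate() {
        if (zombie_) throw std::logic_error("Foo used after dispose()");
        ++activations_;
    }

    int activations() const { return activations_; }

private:
    bool zombie_ = false;
    int activations_ = 0;
};
```

If one owner calls dispose() and drops its pointer while another part of the program still holds a copy, a later activate() through that copy fails in a defined way. With plain new and delete the same sequence is undefined behavior, because the memory may already have been reused. The cost is visible too: one extra flag per object and one extra test per call.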

This is a case where garbage collection is necessary for maximum
robustness. But it obviously doesn't solve everything. You
can still dangle pointers to local objects, and a rogue pointer
can still overwrite anything. In the end, the real question is
how much undefined behavior can you accept; in my experience,
undefined behavior is a sure recipe for reduced robustness. And
garbage collection removes one (and regretfully only one)
potential source of undefined behavior.

Thank you. I understand your point much more clearly now. And
as far as I can see you are right. GC does enable more advanced
error detection.

That said, "[setting] whatever bits you need" and "testing them
with each use of the object" was discussed with respect to
zombies in the referenced thread. The concern is that such bits
and checks cost both space and time (at least for non-virtual
functions and member variable access). Do you agree this is
a real cost and a legitimate concern? Or is there a clever
way around those costs?

As an aside, it seems that the simple constructor/destructor
paradigm has proven to be extremely flexible in implementing a
variety of resource management solutions. Do you agree?

More or less. The destructor paradigm certainly rates as one of
C++'s successes, and IMHO, beats finally hands down. Which
doesn't mean that finally wouldn't be nice as well. Nothing
wrong with having a choice. (I'd actually like to see a way of
creating "destructors" ad hoc. Something along the lines of:
cleanup { code } ;
, which would basically create an anonymous variable whose
destructor executes the code.

Indeed. Anonymous destructors would provide for cleaner syntax.
However, since they are already supported with more verbosity

void foo ( ) {
struct anon {
~anon ( ) {
std::cout << "(zombie) : brains! brains!\n" ;
}
} a ;
}

perhaps many would consider them to be only "sugar". On the
other hand lambda expressions are in this sense also "sugar"
and they were accepted into C++0x.

C++ is a multi-paradigm language, usable in many
contexts. If you're writing kernel code, a garbage
collector certainly has no place; nor do exceptions, for
that matter. And if you're implementing a garbage
collector, obviously, you can't use it. But for most
application programs, it's stupid not to.
RAII, RRID, STL containers, automatic variables, value
types, and other software design patterns have served
exceptionally well in eliminating both the need and the
want for GC for me.
Exactly. You've mentioned a number of very useful tools.
Garbage collection is just one more to add to the list.
Sometimes, it will mean that you need to write less code.
Well, I agree with you! It is "just one more" tool and
sometimes it means you write less code. That said it does not
come without cost and those who require it less and value it
less than you are not stupid.

There's a difference. Those who decide in a particular
application that it isn't appropriate aren't stupid. Those who
refuse to consider it, on the other hand, are certainly showing
unreasonable prejudice. As a professional, I have a
responsibility to my clients to provide the best service
possible at the lowest possible cost. Not using a tool which
would result in a more robust program at a lower price would be
a serious violation of professional deontology. And I can't know
whether the tool would result in a more robust program at a
lower price in any particular case unless I consider it with an
open mind.

Agreed on all points. Well said.

Thanks for the continued discussion and your expertise!

I was about to say the same thing. (And don't take any harsh
statements I may have made at the beginning of the discussion
too literally. I like to exercise rhetoric litote, exaggerating
a statement to bring a point home. I never mean it personally,
and I certainly don't think that everyone who doesn't see an
immediate need for garbage collection is stupid.)

I know very well that you are highly skilled and informed and was
thus sure you had something to say worth hearing; and I wanted to
hear it! I wasn't about to let harsh rhetoric distract me from
that goal ;-)

KHD

Nov 20 '08 #111

Hendrik Schober

James Kanze wrote:

On Nov 19, 5:10 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
[...]
C++

Foo * x = new Foo() ;
//in a code far far away a reference is squirreled away
Foo * y = getX() ;
//time passes, we want x to never be used again
delete x ;
//in a code far far away the squirrel digs up his nut
y->activate() ;

Java

Foo x = new Foo() ;
//in a code far far away a reference is squirreled away
Foo y = getX() ;
//time passes, we want x to never be used again so what do
//you put here to indicate this? Roll your own "zombify"?
//in a code far far away the squirrel digs up his nut
y.activate() ;

In the C++ version, Purify (or similar) will catch the
dangling pointer or if it sneaks by (as you say "mistakes will
creep in") you have at least some chance that the code cores
and reveals the error. In Java (and in GC in general?) you will
never know. What am I missing?

Purify will catch the error, but delivered code doesn't run
under Purify, so if the error doesn't show up in your test
cases, you're hosed without garbage collection; [...]

I don't think this can be discussed that generally. It
might just be that accessing the object at this time
might do something blatantly stupid and by having GC
allowing it, instead of the app core dumping it might
be much worse. In this case, GC might indeed mask an
error. Of course, as you've shown, it could be just the
other way around... My conclusion is: GC neither helps
nor hinders errors.
OTOH, there is the argument that GC only deals with one
resource (although admittedly the one that's probably
most common), but doesn't do anything to help you with
all the others.

Schobi

Nov 20 '08 #112

Hendrik Schober

James Kanze wrote:

On Nov 19, 12:51 am, Matthias Buelow <m...@incubus.de> wrote:
Juha Nieminen wrote:
Because the majority of programmers hired out there are incompetent?

Well, you have identified the core problem, all right.

I disagree, and I would take exception to such a qualification.
I've been working in computer processing for over thirty years, most
of them as a consultant, and I've seen and worked with literally
hundreds of programmers. In all that time, I've seen only one
who could be qualified as incompetent. [...]

That just means you've mostly been assigned to one kind of
shop. Good for you. Go visit some I have seen.

Schobi

Nov 20 '08 #113

Hendrik Schober

James Kanze wrote:

On Nov 18, 9:29 pm, Hendrik Schober <spamt...@gmx.de> wrote:
>James Kanze wrote:
>>On Nov 17, 8:29 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
James Kanze wrote:
>Sometimes I get the impression that garbage collection
>actually causes people to write *less* modular and more
>imperative programs. GC doesn't really encourage
>encapsulation and modularity.

>>>>Garbage collection doesn't "encourage" anything.

>>>I tend to disagree.

>>Sorry, but it's a statement of fact. Garbage collection is
just a tool; a piece of code. It can't encourage or
discourage anything.

> OO is "just a tool", too.

OO is more than just that.

Oh come on. You're not trying to tell me that this simple
rabulistic assumption is all the base you have for your
argument, are you?
(For a starter: Just about everyone coming to C++ from
languages like Java will tell you that the need to take
care of your memory is the most important aspect of C++.
If GC removes that need, it's more than a tool, too, and
does indeed encourage a very different programming style.)

[...]

Schobi

Nov 20 '08 #114

James Kanze

On Nov 20, 5:13 am, Keith H Duggar <dug...@alum.mit.edu> wrote:

On Nov 19, 5:31 am, James Kanze <james.ka...@gmail.com> wrote:

[...]

This is a case where garbage collection is necessary for
maximum robustness. But it obviously doesn't solve
everything. You can still dangle pointers to local objects,
and a rogue pointer can still overwrite anything. In the
end, the real question is how much undefined behavior can
you accept; in my experience, undefined behavior is a sure
recipe for reduced robustness. And garbage collection
removes one (and regretfully only one) potential source of
undefined behavior.

Thank you. I understand your point much more clearly now. And
as far as I can see you are right. GC does enable more
advanced error detection.

That said, "[setting] whatever bits you need" and "testing
them with each use of the object" was discussed with respect
to zombies in the referenced thread. The concern is that such
bits and checks cost both space and time (at least for
non-virtual functions and member variable access). Do you
agree this is a real cost and a legitimate concern? Or is
there a clever way around those costs?

As they say, there's no such thing as a free lunch. In most
specific cases, I think you will be able to come up with some
solution which doesn't entail any extra space associated with
the object; classes with no redundancy and no extra bits are
rare. And often (but certainly not always) you should be able
to trick hardware into making the check (set a pointer to an
invalid value, for example). But even in these cases, you have
the extra space overhead associated with garbage collection, and
you have created extra work for the programmer. If the compiler
were to take charge of this, the extra space associated with the
object would be systematic, but it could presumably somehow be
integrated into the overall overhead of memory management. (In
all of the manual memory management schemes I've seen, you have
at least one extra pointer per allocated block. Which must be a
multiple of 4 or 8, depending on the machine, so you have 2 or 3
free bits to play with. If you're the compiler or the library
implementer; you can't reasonably get at them from user code.)
On the other hand, unless the compiler was very, very smart (and
I'm not aware of any research having been done on ways to
optimize this), you'd have a systematic runtime overhead.
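
The "2 or 3 free bits" remark can be illustrated with a tagged pointer. This is a sketch resting on assumptions: it relies on the usual, implementation-defined mapping between pointers and std::uintptr_t, and the names TaggedPtr and Block are hypothetical. It is not how any particular allocator stores its bookkeeping.

```cpp
#include <cassert>   // for the checks in the usage note
#include <cstdint>

// Hypothetical block type; an 8-byte alignment leaves the 3 low bits of
// every valid address equal to zero.
struct alignas(8) Block {
    int payload;
};

// Packs a small tag into the low bits that alignment leaves unused.
template <typename T>
class TaggedPtr {
    static constexpr std::uintptr_t mask = alignof(T) - 1;

public:
    // Tag bits that do not fit into the free low bits are discarded.
    TaggedPtr(T* p, unsigned tag)
        : bits_(reinterpret_cast<std::uintptr_t>(p) | (tag & mask)) {}

    // Strips the tag and returns the original pointer.
    T* get() const { return reinterpret_cast<T*>(bits_ & ~mask); }

    unsigned tag() const { return static_cast<unsigned>(bits_ & mask); }

private:
    std::uintptr_t bits_;
};
```

A "destroyed" flag could live in one of those bits at no extra space cost, but only code that owns the pointer representation (the compiler or the library) can play this trick safely, which is the point being made above.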

In practice, in all of the cases I'm aware of where a dangling
pointer was used to breach security, the technique used was to
cause the vptr to be overwritten in a way which caused the
malicious code to be executed when a virtual function was called
through the dangling pointer. I'm not aware of a security
problem which doesn't involve pointers to functions somewhere,
so just zapping all of the memory with values that would be
invalid as pointers (and ensuring that it doesn't get reused as
long as there is a pointer to it) would be sufficient. If the
goal is larger, to detect as many programming errors as
possible, as soon as possible, then more effort would be
involved.

As an aside, it seems that the simple
constructor/destructor paradigm has proven to be extremely
flexible in implementing a variety of resource management
solutions. Do you agree?

More or less. The destructor paradigm certainly rates as
one of C++'s successes, and IMHO, beats finally hands down.
Which doesn't mean that finally wouldn't be nice as well.
Nothing wrong with having a choice. (I'd actually like to
see a way of creating "destructors" ad hoc. Something along
the lines of:
cleanup { code } ;
, which would basically create an anonymous variable whose
destructor executes the code.

Indeed. Anonymous destructors would provide for cleaner
syntax. However, since they are already supported with more
verbosity

void foo ( ) {
struct anon {
~anon ( ) {
std::cout << "(zombie) : brains! brains!\n" ;
}
} a ;
}

perhaps many would consider them to be only "sugar". On the
other hand lambda expressions are in this sense also "sugar"
and they were accepted into C++0x.

Exactly. The current support becomes a lot more awkward if the
cleanup code needs to access local variables.
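
With the lambdas that were eventually standardized in C++11, something close to the "cleanup { code } ;" idea can be written as a small scope guard. The sketch below is one possible shape, not a proposed or standard facility; Cleanup, make_cleanup and run are hypothetical names.

```cpp
#include <cassert>   // for the checks in the usage note
#include <string>
#include <utility>
#include <vector>

// Runs the stored callable when the guard goes out of scope.
template <typename F>
class Cleanup {
public:
    explicit Cleanup(F f) : f_(std::move(f)), active_(true) {}
    // Moving transfers the duty to run the cleanup to the new guard.
    Cleanup(Cleanup&& other)
        : f_(std::move(other.f_)), active_(other.active_) {
        other.active_ = false;
    }
    Cleanup(const Cleanup&) = delete;
    Cleanup& operator=(const Cleanup&) = delete;
    ~Cleanup() { if (active_) f_(); }

private:
    F f_;
    bool active_;
};

template <typename F>
Cleanup<F> make_cleanup(F f) { return Cleanup<F>(std::move(f)); }

// The cleanup code reads and writes local variables through the
// by-reference capture, which a local struct cannot do directly.
std::vector<std::string> run() {
    std::vector<std::string> log;
    {
        std::string step = "work";
        auto guard = make_cleanup([&] { log.push_back("cleanup after " + step); });
        log.push_back(step);
    }  // guard's destructor fires here, before step is destroyed
    return log;
}
```

Calling run() yields the entries "work" and "cleanup after work", in that order. The guard is declared after the variables it touches, so it is destroyed first and the references it captured are still valid when the cleanup runs.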

Time constraints have meant that I haven't been active in the
lambda proposal that was adopted. I regret this, because I had
definite ideas about it---IMHO, at the base, what we need is
anonymous classes (or lambda classes, the name isn't important);
constructs which, conceptually, implicitly define a class with
protected references to all of the accessible local variables
and create an instance of it with all of the references
initialized. (Obviously, I expect the compiler to "optimize"
the references out in the generated code.) A lambda function
would be nothing more than a functional object which derived
from such a class; a lambda expression would wrap the expression
in a lambda function, then generate the object. Cleanup, above
would be derived from the class as well, except that the code
would define the destructor, rather than an operator()(). And
I'd expect the class itself to be available, so that the
programmer could derive from it and add additional state, if he
wanted.

I've not studied the lambda proposition in detail, so I don't
know how much of the above it might incorporate. A quick
glance, however, does show that it involves a "closure object",
which sounds very much like the instance of my anonymous class
(although described in somewhat different language). So it
shouldn't be too hard to add "cleanup" as an extension, or in a
later version of the standard, if someone can find time to write
up a proposal for it.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Nov 20 '08 #115

James Kanze

On Nov 20, 9:49 am, Hendrik Schober <spamt...@gmx.de> wrote:

James Kanze wrote:
On Nov 19, 5:10 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
[...]
C++

Foo * x = new Foo() ;
//in a code far far away a reference is squirreled away
Foo * y = getX() ;
//time passes, we want x to never be used again
delete x ;
//in a code far far away the squirrel digs up his nut
y->activate() ;

Java

Foo x = new Foo() ;
//in a code far far away a reference is squirreled away
Foo y = getX() ;
//time passes, we want x to never be used again so what do
//you put here to indicate this? Roll your own "zombify"?
//in a code far far away the squirrel digs up his nut
y.activate() ;

In the C++ version, Purify (or similar) will catch the
dangling pointer or if it sneaks by (as you say "mistakes
will creep in") you have at least some chance that the
code cores and reveals the error. In Java (and in GC in
general?) you will never know. What am I missing?

Purify will catch the error, but delivered code doesn't run
under Purify, so if the error doesn't show up in your test
cases, you're hosed without garbage collection; [...]

I don't think this can be discussed that generally. It
might just be that accessing the object at this time
might do something blatantly stupid and by having GC
allowing it, instead of the app core dumping it might
be much worse.

The problem is that in real life, the application didn't core
dump. The memory was reallocated as a buffer, where user input
was written. And the user designed his input so that it
corresponded to a vptr which pointed to malicious code, and
breached security when the dangling pointer was used.

With garbage collection, the "destructor" sets the vptr to an
invalid pointer. And since the memory can't be reallocated as
long as it is reachable, the invalid pointer stays set, and the
crash is guaranteed (which is what you want).

What it comes down to is that we're replacing undefined behavior
with defined. You may not like what the defined behavior is,
out of the box, but you can intervene to make it whatever you
want. Whereas undefined behavior is, well, undefined.

[...]

OTOH, there is the argument that GC only deals with one
resource (although admittedly the one that's probably most
common), but doesn't do anything to help you with all the
others.

I'll admit that I don't understand this argument. Obviously,
garbage collection deals with only one resource. But you need
different solutions for different resources; what makes garbage
collection useful is that it deals transparently with the only
resource nine tenths of your classes are concerned with. So you
have less work to do.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Nov 20 '08 #116

James Kanze

On Nov 19, 8:39 pm, Pete Becker <p...@versatilecoding.com> wrote:

On 2008-11-19 12:59:55 -0500, Juha Nieminen <nos...@thanks.invalid> said:

Pete Becker wrote:
I think the Java folks have come to the conclusion that
finalization is only useful for detecting failures to clean
up non-memory resources. So a class's finalizer throws an
exception if the resources that the object manages haven't
been properly disposed of.

Personally I believe that this is caused by the limitations
of Java rather than being a good thing Java strives for.

If I'm not mistaken, C# offers a tool to alleviate that
problem: The 'using' block, where destructors/finalizers can
actually be very useful.

Well, sure, if you change the context, the answer changes. We
were talking about finalizers and garbage collection, not
about finalizers and not garbage collection.

There's a general problem of vocabulary, I think. In the
somewhat distant past, I've heard people assimilate finalizers
and C++ destructors; more recent articles do stress the
differences. If I understand what Juha is describing (and what
Herb Sutter has described to me in the past), what C# is
offering is still a third possibility, somewhat closer to a
finally clause than to anything else. Calling it finalization
certainly leads to confusion, but it isn't the first time that
we've had different concepts hiding under the same name.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Nov 20 '08 #117

James Kanze

On Nov 20, 10:00 am, Hendrik Schober <spamt...@gmx.de> wrote:

James Kanze wrote:
On Nov 19, 12:51 am, Matthias Buelow <m...@incubus.de> wrote:
Juha Nieminen wrote:
Because the majority of programmers hired out there are incompetent?

Well, you have identified the core problem, all right.

I disagree, and I would take exception to such a
qualification. I've been working in computer processing for over
thirty years, most of them as a consultant, and I've seen
and worked with literally hundreds of programmers. In all
that time, I've seen only one who could be qualified as
incompetent. [...]

That just means you've mostly been assigned to one kind of
shops. Good for you. Go visit some I have seen.

See my comments else-thread. Very, very few programmers are
really incompetent. Very, very few are gifted enough to be able
to overcome mismanagement, however, and quite a few are
mismanaged in ways that prevent their competence from being
used, or even seen.

Also, there are vastly different competences. I consider myself
a competent programmer, at least in procedural and OO languages
(C++ and Java, but I'm sure that I would have no problem with C#
or even Smalltalk), and quite good at many aspects of low-level
design, but I think you'd be disappointed if you asked me to do
system architecture or write a user manual.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Nov 20 '08 #118

James Kanze

On Nov 19, 7:15 pm, Juha Nieminen <nos...@thanks.invalid> wrote:

James Kanze wrote:
There are objective facts, which do hold for everyone,
whether you like it or not. And a simple tool like garbage
collection is simply not capable of "encouraging" (or
"discouraging") anything. You're committing anthropomorphism,
and giving garbage collection a power that it just doesn't
have.

Would you say that lisp encourages people to write programs
using a functional style? Would you say that C encourages
people to write with an imperative style?

A programming language is more than just a low level tool.
Using Basic can be harmful. Garbage collection just isn't at
that level; it just doesn't have that much power.

In other words, C++ with garbage collection will still be C++,
and will encourage and discourage whatever habits C++ encourages
or discourages today. C++ with garbage collection will have no
relationship with Java, or C#, or Lisp, or whatever, or at
least, no more relationship with them than it has today.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Nov 20 '08 #119

Pete Becker

On 2008-11-20 04:39:10 -0500, James Kanze <ja*********@gmail.com> said:

On Nov 20, 9:49 am, Hendrik Schober <spamt...@gmx.de> wrote:
>OTOH, there is the argument that GC only deals with one
resource (although admittedly the one that's probably most
common), but doesn't do anything to help you with all the
others.

I'll admit that I don't understand this argument. Obviously,
garbage collection deals with only one resource. But you need
different solutions for different resources; what makes garbage
collection useful is that it deals transparently with the only
resource nine tenths of your classes are concerned with. So you
have less work to do.

The essence of the argument is that managing resources can be tricky,
and programmers need to practice it. If you don't have to manage
memory, your opportunities to learn resource management are
significantly decreased. The result in practice is often code that is
sloppy about resources. Java code, for example, often relies on the
garbage collector to recover file handles that have been abandoned
without closing the corresponding file.
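As an illustration of the kind of resource discipline being discussed, here is a minimal RAII sketch in C++. It is an assumption-laden example, not code from any poster: `ScopedFile` and its `open_count()` bookkeeping are hypothetical names, and `std::tmpfile()` is used only so the example needs no particular path. It shows a file handle being released at scope exit, on both the normal and the exceptional path, without relying on a collector.

```cpp
#include <cstdio>
#include <stdexcept>

// Hypothetical RAII wrapper (name and interface are assumptions made for
// this sketch): the destructor closes the file deterministically.
class ScopedFile {
public:
    explicit ScopedFile(std::FILE* f) : file_(f) {
        if (!file_) throw std::runtime_error("could not open file");
        ++open_count();
    }
    ~ScopedFile() {
        std::fclose(file_);
        --open_count();
    }
    ScopedFile(const ScopedFile&) = delete;
    ScopedFile& operator=(const ScopedFile&) = delete;

    void write(const char* text) {
        if (std::fputs(text, file_) == EOF)
            throw std::runtime_error("write failed");
    }

    // Number of wrappers currently holding an open file (demo bookkeeping).
    static int& open_count() {
        static int count = 0;
        return count;
    }

private:
    std::FILE* file_;
};

// Normal path: the handle is closed when f goes out of scope.
int handles_left_after_normal_exit() {
    {
        ScopedFile f(std::tmpfile());
        f.write("hello\n");
    }
    return ScopedFile::open_count();
}

// Exceptional path: stack unwinding still runs the destructor.
int handles_left_after_exception() {
    try {
        ScopedFile f(std::tmpfile());
        f.write("partial\n");
        throw std::runtime_error("something went wrong mid-operation");
    } catch (const std::runtime_error&) {
        // f has already been destroyed and closed by the time we get here
    }
    return ScopedFile::open_count();
}
```

Both functions should report zero open handles, since the close happens at a known point rather than whenever a finalizer happens to run.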

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
(www.petebecker.com/tr1book)

Nov 20 '08 #120

Juha Nieminen

Matthias Buelow wrote:

Juha Nieminen wrote:

> I have worked in a project where the amount of calculations performed
by a set of programs was directly limited by the amount of available
memory. (If the program started to swap, it was just hopeless to try to
wait for it to finish.)

You seem to think that a program that uses GC uses more memory, which is
just false in the general case.

I was not talking about GC in particular, but modern "high-level
languages" in general (and thinking about Java in particular).

GC in itself doesn't make a language memory-hog if the language offers
the tools to create very memory-efficient implementations and hide them
behind nice abstractions (eg. in similar ways as C++ does).

OTOH there are languages (such as Java) where allocating each object
dynamically is basically the only choice you have, which makes creating
memory-efficient programs a lot more difficult. For instance, it makes
it very difficult to squeeze every object into 2 bytes inside a byte
array (as a completely imaginary example).
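A rough sketch of what such a "2 bytes per object inside a byte array" layout could look like in C++ follows. The record layout (a 12-bit value plus a 4-bit tag) and the name `PackedArray` are invented for illustration; the post gives no details. The class hides the packing behind ordinary accessors, which is the kind of abstraction the paragraph above refers to.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container: each logical object is a 12-bit value and a
// 4-bit tag, stored in exactly two bytes of one contiguous byte array.
class PackedArray {
public:
    explicit PackedArray(std::size_t n) : bytes_(2 * n, 0) {}

    void set(std::size_t i, unsigned value, unsigned tag) {
        const unsigned packed = ((tag & 0xFu) << 12) | (value & 0xFFFu);
        bytes_[2 * i]     = static_cast<std::uint8_t>(packed & 0xFFu);
        bytes_[2 * i + 1] = static_cast<std::uint8_t>(packed >> 8);
    }

    unsigned value(std::size_t i) const { return load(i) & 0xFFFu; }
    unsigned tag(std::size_t i) const { return load(i) >> 12; }

    std::size_t size() const { return bytes_.size() / 2; }
    std::size_t payload_bytes() const { return bytes_.size(); }

private:
    // Reassemble the 16-bit record from its two bytes (low byte first).
    unsigned load(std::size_t i) const {
        return static_cast<unsigned>(bytes_[2 * i]) |
               (static_cast<unsigned>(bytes_[2 * i + 1]) << 8);
    }

    std::vector<std::uint8_t> bytes_;
};
```

A thousand such objects occupy 2000 bytes of payload, with no per-object header or separate allocation.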

Nov 20 '08 #121

On 20 Nov., 10:31, James Kanze <james.ka...@gmail.com> wrote:

Exactly. The current support becomes a lot more awkward if the
cleanup code needs to access local variables.
[...]
the references out in the generated code.) A lambda function
would be nothing more than a functional object which derived
from such a class; a lambda expression would wrap the expression
in a lambda function, then generate the object. Cleanup, above
would be derived from the class as well, except that the code
would define the destructor, rather than an operator()().

What you want seems to be a "scope guard". A Lambda-based scope guard
could be used like this:

template<typename F> requires Callable<F> && MoveConstructible<F>
class scope_guard { /* ... library magic ... */ };

template<typename F> requires Callable<F> && MoveConstructible<F>
scope_guard<F> make_scope_guard(F f);

double* example()
{
    double* p = new double[123];
    auto && sg = make_scope_guard( [p]{delete[] p;} );
    // compute data or throw exception here
    sg.cancel(); // makes dtor do nothing
    return p;
}

There's no derivation, no virtual functions and no dynamically
allocated function object involved. Of course, the example's purpose
was ONLY to demonstrate how lambdas can be used to make arbitrary
scope guards. In this particular instance managing the dynamically
allocated array should be done differently. At least something like
this:

unique_ptr<double[]> example2()
{
    unique_ptr<double[]> p (new double[123]);
    // compute data or throw exception here
    return p;
}

It comes with the benefit of the function's declaration being self-
explanatory w.r.t. the life-time management of the allocated array. It
gets even better if you stick with std::vector which will become
movable in C++0x:

std::vector<double> example3();

I've not studied the lambda proposition in detail, so I don't
know how much of the above it might incorporate. A quick
glance, however, does show that it involves a "closure object",
which sounds very much like the instance of my anonymous class
(although described in somewhat different language). So it
shouldn't be too hard to add "cleanup" as an extension, or in a
later version of the standard, if someone can find time to write
up a proposal for it.

It seems that the library solution is as convenient and flexible as a
dedicated language feature would be. So, a "clean up" core language
feature is hardly justified.
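For readers wondering what the elided "library magic" might amount to, here is one possible sketch in standard C++11, written without the draft concepts syntax used in the declarations above. It is an assumption about how such a guard could be written, not the implementation the poster had in mind; `cleanup_runs()` is a hypothetical helper that exists only to exercise it.

```cpp
#include <utility>

// One possible scope guard: runs the stored callable on destruction
// unless cancel() was called first.
template <typename F>
class scope_guard {
public:
    explicit scope_guard(F f) : f_(std::move(f)), active_(true) {}

    // Moving transfers responsibility for the cleanup to the new guard.
    scope_guard(scope_guard&& other)
        : f_(std::move(other.f_)), active_(other.active_) {
        other.active_ = false;
    }

    scope_guard(const scope_guard&) = delete;
    scope_guard& operator=(const scope_guard&) = delete;

    ~scope_guard() {
        if (active_) f_();
    }

    void cancel() { active_ = false; }  // makes the destructor do nothing

private:
    F f_;
    bool active_;
};

template <typename F>
scope_guard<F> make_scope_guard(F f) {
    return scope_guard<F>(std::move(f));
}

// Hypothetical demo helper: returns how many times the cleanup ran.
int cleanup_runs(bool cancel) {
    int runs = 0;
    {
        auto sg = make_scope_guard([&runs] { ++runs; });
        if (cancel) sg.cancel();
    }
    return runs;
}
```

The guard holds the lambda by value, so, as noted above, there is no derivation, no virtual call and no dynamic allocation of the function object.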

Cheers!
SG

Nov 20 '08 #122

Matthias Buelow

James Kanze wrote:

Because it does:-). That's been my experience, anyway.

Juha was saying that certain applications (or algorithms, in general)
can't be run in a GC'd environment on the same problem size as they can
in a manually managed environment, which is just plain false. Of course
a, say, Unix process may have some more memory mapped to it if that's
still in the GC's memory pool internally just as it also happens with
some malloc implementations which don't free their stuff in a timely
fashion. This is, however, irrelevant to the application itself.

I don't know enough about Juha's application, but there are
certainly applications which shouldn't use garbage collection,
and others which should use some sort of mixed model, or perhaps
require special tuning for garbage collection to be effective.

Well, a GC-managed environment can always be turned into a manually
managed environment much easier than the other way round.

That's not true when you start pushing towards the limits. In
the distant past, I've had programs which would run out of
memory using one implementation of malloc, and not with another.

Application specialities and implementation details. :)

Nov 20 '08 #123

Matthias Buelow

SG wrote:

It seems that the library solution is as convenient and flexible as a
dedicated language feature would be. So, a "clean up" core language
feature is hardly justified.

It's beginning to resemble Perl, though.

Nov 20 '08 #124

Juha Nieminen

Matthias Buelow wrote:

Juha was saying that certain applications (or algorithms, in general)
can't be run in a GC'd environment on the same problem size as they can
in a manually managed environment, which is just plain false.

Except that I didn't say anything of the sorts.

What I said had absolutely nothing to do with GC. What I said is that in
many "high-level" languages it's very difficult to implement, for
example, very memory-efficient programs while still keeping a fair
amount of abstraction and maintainability to the code. I like C++
because it offers tools to do that.

It is true, however, that there are *some* GC'd programming languages
which are more of a memory hog than others.

Nov 20 '08 #125

Pete Becker

On 2008-11-20 11:50:35 -0500, Noah Roberts <us**@example.net> said:

>
I do have to say QString bothers me. For one there's wstring if you
want to support unicode.

What portable guarantees does wstring provide that allow you to support
Unicode? And if the support is so good, why is C++0x adding u16string
and u32string? <g>
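A small sketch of the portability point, assuming a C++11 compiler: the code-unit count of a `std::wstring` depends on the platform's `wchar_t`, whereas `std::u16string` and `std::u32string` have a fixed encoding. The function names are hypothetical and the character (U+1F600, outside the Basic Multilingual Plane) is just a convenient test case.

```cpp
#include <cstddef>
#include <string>

// U+1F600 lies outside the BMP, so UTF-16 needs a surrogate pair for it.
std::size_t utf16_units() {
    std::u16string s = u"\U0001F600";
    return s.size();   // always 2 code units
}

std::size_t utf32_units() {
    std::u32string s = U"\U0001F600";
    return s.size();   // always 1 code unit
}

// Implementation-dependent: typically 2 where wchar_t is 16 bits
// (e.g. Windows) and 1 where it is 32 bits (e.g. most Unix systems).
std::size_t wide_units() {
    std::wstring s = L"\U0001F600";
    return s.size();
}
```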

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
(www.petebecker.com/tr1book)

Nov 20 '08 #126

Chris M. Thomasson

"Chris M. Thomasson" <no@spam.invalid> wrote in message
news:gg**********@aioe.org...

>
"Matthias Buelow" <mk*@incubus.de> wrote in message
news:6o************@mid.dfncis.de...
>Chris M. Thomasson wrote:

>>Why do you think that most of Google is written in C++?

Because.. it weighs.. as much.. as a.. duck..?

lol!

Seriously... Why do you think they choose C++ for their custom web servers,
indexers, file systems, databases, etc.?

Nov 20 '08 #127

Matthias Buelow

Chris M. Thomasson wrote:

Seriously... Why do you think they choose C++ for their custom
web servers, indexers, file systems, databases, etc.?

First, I don't know if what you say is true. Second, I have no clue.
Probably for a similar reason why I'm programming in C++; that's not
because I like the language.

Nov 20 '08 #128

Matthias Buelow

Chris M. Thomasson wrote:

The point I was trying to make is that the sample pseudo-code program
only needed to have less than 33 foo objects allocated at any one time.
This cannot be guaranteed with GC because of its non-deterministic
nature. Your program will be using more memory than it has to.

No it won't. It will be using exactly those 33 objects. What makes you
think otherwise? The "garbage" is by definition unused memory and will
be reclaimed if more memory is needed just in the same way that if you
delete or free() some stuff in C++, it might hang around on a free list
for a while but that's an implementation detail that is completely
irrelevant to the question of how much memory the program is using.

If you have an algorithm that uses 1 million memory "cells" on a machine
with 1 million "cells" available, it can run in the same way no matter
if that memory is managed by hand or if the GC does it.

Nov 20 '08 #129

Chris M. Thomasson

I created a little test application for Java and C++:
http://pastebin.com/mc3e3f4e
(Java version)
http://pastebin.com/m4bf7db0a
(My C++ version)
As you can see, my C++ version uses a custom region allocation scheme I
created and posted to this group here:

http://groups.google.com/group/comp....8dc967c7ddba7c
I compile C++ version with MSVC 2005 Express. I compile Java version with
Java version 1.6.0_05. The command line I use for Java version is `java
tree'. The platform is Windows XP SP2 on my old HyperThreaded P4 3.06 GHz with
only 512 MB RAM.

C++ Output:
__________________________________________________ ____________________
Time: 12953 ms

Java Output:
__________________________________________________ ____________________
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at tree.create_tree(tree.java:9)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.main(tree.java:19)

The Java version crashes because it apparently runs out of memory! Its
memory usage in Task Manager spikes up and beyond 80 MB, and then BAM! it
dies... What total crap!
Anyway, the C++ version memory usage peaks out at around 29-30 MB.
So, on my platform, the GC in Java is not good for this example program. It
simply does not keep up, and the memory usage flies out of control.
However, the C++ version uses custom region allocator, and its memory usage
is non-volatile and very stable. I can actually run it on my older system.
Cool.
I conclude, that for this example, GC is no good. Clever manual memory
management is key to achieving efficient memory usage.

Nov 20 '08 #130

Chris M. Thomasson

"Matthias Buelow" <mk*@incubus.de> wrote in message
news:6o************@mid.dfncis.de...

Chris M. Thomasson wrote:

>The point I was trying to make is that the sample pseudo-code program
only needed to have less than 33 foo objects allocated at any one time.
This cannot be guaranteed with GC because of its non-deterministic
nature. Your program will be using more memory than it has to.

No it won't. It will be using exactly those 33 objects. What makes you
think otherwise? The "garbage" is by definition unused memory and will
be reclaimed if more memory is needed just in the same way that if you
delete or free() some stuff in C++, it might hang around on a free list
for a while but that's an implementation detail that is completely
irrelevant to the question of how much memory the program is using.

If you have an algorithm that uses 1 million memory "cells" on a machine
with 1 million "cells" available, it can run in the same way no matter
if that memory is managed by hand or if the GC does it.

Why does the Java version run out of memory on my machine with 512mb of
memory, and the C++ version does not:
http://pastebin.com/mc3e3f4e
(Java version)
http://pastebin.com/m4bf7db0a
(My C++ version)

Here is the output I get:

C++ Output:
__________________________________________________ ____________________
Time: 12953 ms

Java Output:
__________________________________________________ ____________________
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at tree.create_tree(tree.java:9)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.main(tree.java:19)

Humm... Interesting.

Nov 20 '08 #131

Matthias Buelow

Chris M. Thomasson wrote:

The Java version crashes because it apparently runs out of memory! Its
memory usage in Task Manager spikes up and beyond 80 MB, and then BAM!
it dies... What total crap!

It probably exhausts the stack in the recursion.

I conclude, that for this example, GC is no good. Clever manual memory
management is key to achieving efficient memory usage.

If anything at all, your example would only "prove" that C++ is more
than 10x as verbose as Java.

Nov 20 '08 #132

Matthias Buelow

Chris M. Thomasson wrote:

Why does the Java version run out of memory on my machine with 512mb of
memory, and the C++ version does not:

Programming error? From a quick glance, the two programs look totally
different so they're hard to compare (the C++ version is more than 10x
the size of the Java version, so wtf.?)
Besides, you won't trick me into defending Java, anyway.

Nov 20 '08 #133

Chris M. Thomasson

"Matthias Buelow" <mk*@incubus.de> wrote in message
news:6o************@mid.dfncis.de...

Chris M. Thomasson wrote:

>Seriously... Why do you think they choose C++ for their custom
web servers, indexers, file systems, databases, etc.?

First, I don't know if what you say is true.

One example, Google BigTable is written in C++:

http://labs.google.com/papers/bigtable-osdi06.pdf

Second, I have no clue.
Probably for a similar reason why I'm programming in C++; that's not
because I like the language.

Nov 20 '08 #134

Chris M. Thomasson

"Matthias Buelow" <mk*@incubus.de> wrote in message
news:6o************@mid.dfncis.de...

Chris M. Thomasson wrote:

>Why does the Java version run out of memory on my machine with 512mb of
memory, and the C++ version does not:

Programming error?

Where? I don't see any at all. None at all.

From a quick glance, the two programs look totally
different so they're hard to compare (the C++ version is more than 10x
the size of the Java version, so wtf.?)

The C++ version has my region allocator in there. Focus on the latter end of
the code under the:
/* Tree: C++ -vs- Java
__________________________________________________ ___________*/
section please. Thanks.

Besides, you won't trick me into defending Java, anyway.

Nov 20 '08 #135

Chris M. Thomasson

"Matthias Buelow" <mk*@incubus.de> wrote in message
news:6o************@mid.dfncis.de...

Chris M. Thomasson wrote:

>The Java version crashes because it apparently runs out of memory! Its
memory usage in Task Manager spikes up and beyond 80 MB, and then BAM!
it dies... What total crap!

It probably exhausts the stack in the recursion.

Can you blow the stack in a Java program? BTW, the error explicitly
mentioned Java heap space. Also, if it's the stack, why did it not blow up in
the C++ version?

>I conclude, that for this example, GC is no good. Clever manual memory
management is key to achieving efficient memory usage.

If anything at all, your example would only "prove" that C++ is more
than 10x as verbose as Java.

Here is the code with my region allocator stripped:

/* Tree: C++ -vs- Java
__________________________________________________ ___________*/
#include <ctime>
#include <iostream>

struct tree {
    tree* m_left;
    tree* m_right;
};

static region_allocator<tree, 1024 * 32> g_tree_alloc;

tree* create_tree(int n) {
    if (n < 1) {
        return NULL;
    }

    tree* const t = g_tree_alloc.allocate();
    t->m_left = create_tree(n - 1);
    t->m_right = create_tree(n - 1);

    return t;
}

int main() {
    std::clock_t const start = std::clock();
    for (int r = 0; r < 10; ++r) {
        for (int i = 0; i < 15; ++i) {
            create_tree(22);
            g_tree_alloc.flush();
        }
    }
    std::clock_t const end = std::clock();
    std::cout << "Time: " <<
        double(end - start) / CLOCKS_PER_SEC * 1000 << " ms\n";

    return 0;
}
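Since the listing above omits the allocator itself, the following is a deliberately simple stand-in that offers the same two member functions, `allocate()` and `flush()`. It is only a sketch under stated assumptions and is not the region allocator posted to the group: it assumes `T` is default-constructible and needs no destructor call per object, and `node`, `build()`, `count()` and `block_count()` are hypothetical names added for the demonstration.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in region allocator (NOT the one from the thread): objects are
// carved out of fixed-size blocks, and flush() releases them all at once
// by rewinding, keeping the blocks for reuse.
template <typename T, std::size_t BlockSize>
class region_allocator {
public:
    T* allocate() {
        if (used_ == BlockSize) {            // current block exhausted
            ++current_;
            used_ = 0;
        }
        if (current_ == blocks_.size()) {    // no recycled block left: grow
            blocks_.push_back(std::unique_ptr<T[]>(new T[BlockSize]));
        }
        return &blocks_[current_][used_++];
    }

    // "Frees" every object in O(1); previously returned pointers become stale.
    void flush() {
        current_ = 0;
        used_ = 0;
    }

    std::size_t block_count() const { return blocks_.size(); }

private:
    std::vector<std::unique_ptr<T[]>> blocks_;
    std::size_t current_ = 0;   // index of the block being filled
    std::size_t used_ = 0;      // slots handed out from that block
};

// Hypothetical node type and helpers mirroring the tree test above.
struct node {
    node* left;
    node* right;
};

node* build(region_allocator<node, 1024>& alloc, int depth) {
    if (depth < 1) return nullptr;
    node* const t = alloc.allocate();
    t->left = build(alloc, depth - 1);
    t->right = build(alloc, depth - 1);
    return t;
}

std::size_t count(const node* t) {
    return t ? 1 + count(t->left) + count(t->right) : 0;
}
```

Because `flush()` only rewinds two indices, building a tree of the same size again reuses the existing blocks instead of requesting more memory, which would explain a flat memory profile like the one reported for the C++ run.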

Anyway, if I change the line `create_tree(22);' to `create_tree(15)', the
Java version finally works on my platform. Here is the output I get:

C++:
________________________________________
Time: 64 ms

Java:
________________________________________
Time: 450 ms

So much for Java GC allocator being faster than well written C++ allocator!
:^D

Nov 20 '08 #136

Matthias Buelow

Chris M. Thomasson wrote:

Can you blow the stack in a Java program? BTW, the error explicitly
mentioned Java heap space. Also, if it's the stack, why did it not blow up
in the C++ version?

What do I know? You're comparing apples to oranges. Perhaps java is
simply using more memory for its classes and/or stack frames.

I'm running the java version on my system right now (the "java"
installed is Gnu gij) and it stabilizes at just below 256mb RSS (which
might be the preconfigured VM heap size), no crash yet but it's still
running since a couple minutes.

At best, your test only says that a) Java appears to use more memory
than a C++ program with a handcrafted allocator and b) your program,
including the handcrafted allocator, is 10x longer than the Java one.
Not very interesting results, imho.

Nov 20 '08 #137

Chris M. Thomasson

"Matthias Buelow" <mk*@incubus.de> wrote in message
news:6o************@mid.dfncis.de...

Chris M. Thomasson wrote:

>Can you blow the stack in a Java program? BTW, the error explicitly
mentioned Java heap space. Also, if it's the stack, why did it not blow up
in the C++ version?

What do I know? You're comparing apples to oranges. Perhaps java is
simply using more memory for its classes and/or stack frames.

I'm running the java version on my system right now (the "java"
installed is Gnu gij) and it stabilizes at just below 256mb RSS (which
might be the preconfigured VM heap size), no crash yet but it's still
running since a couple minutes.

Set the `create_tree(22)' line to `create_tree(15)' just to give yourself
some peace of mind that the Java program does indeed work. As-is, I cannot
see anything wrong with it. Also, I don't think you can blow the stack in
Java. IIRC, it will dynamically expand it.

Should I attempt to tweak both programs so that all recursion is removed?
Personally, I don't think Java has a problem with blowing the stack. But, I
will tinker around with it anyway.

At best, your test only says that a) Java appears to use more memory
than a C++ program with a handcrafted allocator and

Yes. This is because of the GC. 256MB sustained for Java, vs 29-30mb spikes
for C++, is major difference.

b) your program,
including the handcrafted allocator, is 10x longer than the Java one.

That extra coding effort pays off in the long run.

Not very interesting results, imho.

Humm.. Well, I have to disagree for now. IMVHO, it shows how a GC language
falls short when compared to clever custom allocation scheme.

Nov 20 '08 #138

Matthias Buelow

Chris M. Thomasson wrote:

Humm.. Well, I have to disagree for now. IMVHO, it shows how a GC
language falls short when compared to clever custom allocation scheme.

You didn't compare GC vs. non-GC. You compared Java vs. C++.
In addition, you compared a general purpose memory manager with a custom
allocator written for exactly the kind of usage pattern that your
example exhibits.
I can't really be bothered to completely list the many ways your
"comparison" is flawed, sorry.

Nov 20 '08 #139

Chris M. Thomasson

"Chris M. Thomasson" <no@spam.invalid> wrote in message
news:ZI*****************@newsfe04.iad...
[...]

Anyway, if I change the line `create_tree(22);' to `create_tree(15)', the
Java version finally works on my platform. Here is the output I get:

C++:
________________________________________
Time: 64 ms

Java:
________________________________________
Time: 450 ms

So much for Java GC allocator being faster than well written C++
allocator!
:^D

I compile the C++ version on MINGW with full optimization using
`create_tree(15)' and I get:
Time: 216 ms
It seems like MINGW produces much slower code than a "release" build with
MSVC 2005 Express.

Nov 20 '08 #140

Chris M. Thomasson

"Matthias Buelow" <mk*@incubus.de> wrote in message
news:6o************@mid.dfncis.de...

Chris M. Thomasson wrote:

>Humm.. Well, I have to disagree for now. IMVHO, it shows how a GC
language falls short when compared to clever custom allocation scheme.

You didn't compare GC vs. non-GC. You compared Java vs. C++.

Java is a GC lang, C++ is non-GC lang. I compare performance of memory
allocation. Apparently, the Java program is a huge memory hog, C++ program
is not.

In addition, you compared a general purpose memory manager with a custom
allocator written for exactly the kind of usage pattern that your
example exhibits.
I can't really be bothered to completely list the many ways your
"comparison" is flawed, sorry.

I compare clever manual memory management technique with automatic memory
management. The call to `g_tree_alloc.flush()' is manual management. I don't
see how it's flawed. Are you saying that I should use plain 'new/delete' for
every allocation/deallocation? C++ is much more flexible than that. The GC
version cannot really compete with it...

Nov 20 '08 #141

Chris M. Thomasson

"Matthias Buelow" <mk*@incubus.de> wrote in message
news:6o************@mid.dfncis.de...

Chris M. Thomasson wrote:

>The point I was trying to make is that the sample pseudo-code program
only needed to have less than 33 foo objects allocated at any one time.
This cannot be guaranteed with GC because of its non-deterministic
nature. Your program will be using more memory than it has to.

No it won't. It will be using exactly those 33 objects. What makes you
think otherwise?

Because it's true.

The "garbage" is by definition unused memory and will
be reclaimed if more memory is needed just in the same way that if you
delete or free() some stuff in C++, it might hang around on a free list
for a while but that's an implementation detail that is completely
irrelevant to the question of how much memory the program is using.

If you have an algorithm that uses 1 million memory "cells" on a machine
with 1 million "cells" available, it can run in the same way no matter
if that memory is managed by hand or if the GC does it.

Nov 20 '08 #142

Thomas J. Gritzan

Chris M. Thomasson wrote:

I created a little test application for Java and C++:
http://pastebin.com/mc3e3f4e
(Java version)
http://pastebin.com/m4bf7db0a
(My C++ version)
As you can see, my C++ version uses a custom region allocation scheme I
created and posted to this group here:

http://groups.google.com/group/comp....8dc967c7ddba7c

It's a bit unfair to compare one algorithm in one language and another
algo in another language. You can find a test program that shows that
whatever you want is better than whatever you want.

To make it a little bit more fair, you could run the Java GC the same
time you flush the region allocator in C++:
http://pastebin.com/m40d757b7

By the way, how is your region allocator different from the pool
allocator in boost?

--
Thomas

Nov 20 '08 #143

Noah Roberts

Chris M. Thomasson wrote:

"Matthias Buelow" <mk*@incubus.de> wrote in message
news:6o************@mid.dfncis.de...
>Chris M. Thomasson wrote:

>>Humm.. Well, I have to disagree for now. IMVHO, it shows how a GC
language falls short when compared to clever custom allocation scheme.

You didn't compare GC vs. non-GC. You compared Java vs. C++.

Java is a GC lang, C++ is non-GC lang. I compare performance of memory
allocation. Apparently, the Java program is a huge memory hog, C++
program is not.

To compare fairly, did you compile the Java version with the GCC native
compiler? If not then you're comparing a native byte code program to
one running in a VM.

>In addition, you compared a general purpose memory manager with a custom
allocator written for exactly the kind of usage pattern that your
example exhibits.

Although I imagine it's possible to do this within Java, or any GC'd
language, I also imagine it wouldn't translate as cleanly...it would be
harder.

>I can't really be bothered to completely list the many ways your
"comparison" is flawed, sorry.

I compare clever manual memory management technique with automatic
memory management. The call to `g_tree_alloc.flush()' is manual
management. I don't see how it's flawed. Are you saying that I should use
plain 'new/delete' for every allocation/deallocation? C++ is much more
flexible than that. The GC version cannot really compete with it...

You can very likely write an allocation pool in Java. A fair comparison
would do that. Then you could compare the difficulty of the task,
writing a fast manager in C++ vs. in a GC environment hell bent on doing
it for you. Then weigh the benefits of each with the frequency in which
such a task is necessary.

Nov 20 '08 #144

Sam

Noah Roberts writes:

Chris M. Thomasson wrote:
>"Matthias Buelow" <mk*@incubus.de> wrote in message
news:6o************@mid.dfncis.de...
>>Chris M. Thomasson wrote:

Humm.. Well, I have to disagree for now. IMVHO, it shows how a GC
language falls short when compared to clever custom allocation scheme.

You didn't compare GC vs. non-GC. You compared Java vs. C++.

Java is a GC lang, C++ is non-GC lang. I compare performance of memory
allocation. Apparently, the Java program is a huge memory hog, C++
program is not.

To compare fairly, did you compile the Java version with the GCC native
compiler? If not then you're comparing a native-code program to
one running in a VM.

Modern Java VMs compile Java bytecode down into native machine code. If you
believe Java's proponents, Java VMs should do an even better job than
compiled languages, since the Java VMs can use runtime metrics to choose how
to optimally compile bytecode.

Nov 21 '08 #145

James Kanze

On Nov 21, 12:37 am, r...@zedat.fu-berlin.de (Stefan Ram) wrote:

George Kettleborough <g.kettleboro...@member.fsf.org> writes:
I suppose what I don't get is why you would ever want to
create objects on the stack. It's something you can't do in
Java,

I just read this in »comp.lang.java.programmer«:

John B. Matthews wrote in <nospam-20B427.16262820112...@news.motzarella.org>:
|I was pleasantly surprised by the speed of the JScience
|library, possibly due to stack-based allocation afforded
|by Javolution, on which JScience is based. (...)

|<http://jscience.org/>
|<http://javolution.org/>

(End of quotation from »comp.lang.java.programmer«.)

»Objects can be allocated on the "stack" and transparently
recycled. With Javolution, your application is busy doing
the real work not memory management (e.g. Javolution
RealtimeParser is 3-5x faster than conventional XML
parsers only because it does not waste 2/3 of the CPU
doing memory allocation/garbage collection).«

I'm not sure that this is quite relevant to the original
question. The poster said that he didn't get "why *you* would
ever want to create objects on the stack". In Java, you can't
create objects on the stack. The compiler, of course, works
under more or less the same "as if" rule as a C++ compiler; it
can do more or less anything it wants, as long as the output of
your program doesn't change. Including allocating variables on
the stack, even if you've written "new" in your code. (Note
that a C++ compiler could theoretically do this as well. But trying to
identify cases where it would be applicable is probably wasted
effort on the part of the compiler, because if the object could
have been on the stack, the programmer wouldn't have used
"new".)

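To make the distinction concrete, here is a minimal C++ sketch of the two
choices the programmer has: an object with automatic storage, whose
lifetime ends with its scope, and one created with "new", whose lifetime
can outlive the function that created it. `Tracker`, `scoped` and
`make_escaping` are hypothetical names used only for illustration, and
`std::unique_ptr` (C++11) is used here merely to make the ownership of the
"new" object explicit.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that counts how many instances are currently alive.
struct Tracker {
    static int live;
    Tracker() { ++live; }
    ~Tracker() { assert(live > 0); --live; }
    Tracker(const Tracker&) = delete;
    Tracker& operator=(const Tracker&) = delete;
};
int Tracker::live = 0;

// Automatic storage: `t` is destroyed when the function returns, so the
// count observed inside the scope drops back once the call completes.
int scoped() {
    Tracker t;
    return Tracker::live;  // read while `t` is still alive
}

// Dynamic storage: the object outlives this function, which is the usual
// reason for writing "new" in the first place.
std::unique_ptr<Tracker> make_escaping() {
    return std::unique_ptr<Tracker>(new Tracker);
}
```

If the object never needs to escape, the first form is the one a C++
programmer would normally write, which is why there is little left for the
compiler to gain by converting "new" into stack allocation.
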
--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Nov 21 '08 #146

James Kanze

On Nov 20, 3:17 pm, Matthias Buelow <m...@incubus.de> wrote:

SG wrote:
It seems that the library solution is as convenient and
flexible as a dedicated language feature would be. So, a
"clean up" core language feature is hardly justified.

It's beginning to resemble Perl, though.

You've noticed that too. The syntax, despite having some real
problems of its own, is still not nearly as bad as Perl, and we
haven't abandoned static type checking completely, but it's
definitely becoming a case of many different ways of doing
something (all equally unreadable).

Nov 21 '08 #147

James Kanze

On Nov 20, 7:11 pm, Matthias Buelow <m...@incubus.de> wrote:

Chris M. Thomasson wrote:
Seriously... Why do you think they choose C++ for their custom
web servers, indexers, file systems, databases, etc.?

First, I don't know if what you say is true. Second, I have no
clue. Probably for a similar reason why I'm programming in
C++; that's not because I like the language.

Yes. Often, the choice is really only between C, C++ and Java
(with maybe a possibility of Fortran thrown in on the side).

Still, when push comes to shove---I don't really like C++ that
much, but I like the other languages I've seen even less. C++
seems to lack simplicity and elegance; the other languages lack
expressivity. And with enough expressivity, it's possible to
hide the lack of simplicity and elegance.

Nov 21 '08 #148

James Kanze

On Nov 20, 3:09 pm, Matthias Buelow <m...@incubus.de> wrote:

James Kanze wrote:
Because it does:-). That's been my experience, anyway.

Juha was saying that certain applications (or algorithms, in
general) can't be run in a GC'd environment on the same
problem size as they can in a manually managed environment,
which is just plain false.

It's not just plain false. In fact, it's not false at all;
there are certainly applications for which manual memory
management will increase the possible problem size, given a
fixed upper limit of available memory. Obviously, you can make
them work for the same size problem with garbage collection by
triggering the collector at each allocation, but that generally
comes out to being unacceptably expensive in terms of runtime.

I'm actually rather surprised how often I have to repeat this
here: there is no silver bullet. Garbage collection offers a
number of advantages for a number of programs, but it won't make
a silk purse out of a sow's ear, and it's not a universal
solution, applicable everywhere. It's simply one more tool to
be considered; in the end, whether it is applicable depends on
the engineering trade-offs involved.

Of course a, say, Unix process may have some more memory
mapped to it if that's still in the GC's memory pool
internally just as it also happens with some malloc
implementations which don't free their stuff in a timely
fashion. This is, however, irrelevant to the application
itself.

Yes and no. It's not irrelevant if the implementation of the
allocator requires that memory to be mapped, for whatever
reasons.

I don't know enough about Juha's application, but there are
certainly applications which shouldn't use garbage
collection, and others which should use some sort of mixed
model, or perhaps require special tuning for garbage
collection to be effective.

Well, a GC-managed environment can always be turned into a
manually managed environment much easier than the other way
round.

You'll have to explain that to me. If the code works with
manual management, just replace malloc with gc_malloc, #define
free to nothing, recompile, and it works with garbage
collection. The reverse is not so obvious.

That's not true when you start pushing towards the limits.
In the distant past, I've had programs which would run out
of memory using one implementation of malloc, and not with
another.

Application specialities and implementation details. :)

Which may be relevant to your problem.

Nov 21 '08 #149

James Kanze

On Nov 20, 9:41 pm, "Chris M. Thomasson" <n...@spam.invalid> wrote:

"Matthias Buelow" <m...@incubus.de> wrote in message

news:6o************@mid.dfncis.de...

Chris M. Thomasson wrote:

Humm.. Well, I have to disagree for now. IMVHO, it shows
how a GC language falls short when compared to clever
custom allocation scheme.

You didn't compare GC vs. non-GC. You compared Java vs. C++.

Java is a GC lang, C++ is non-GC lang.

Java is not C++ with garbage collection. Java and C++ are
different languages, with different idioms. The fact that
garbage collection is mandatory in Java, and optional in C++, is
only one difference among many.

I compare performance of memory allocation.

No. You compare performance of two specific implementations of
two specific very different languages.

Apparently, the Java program is a huge memory hog, C++ program
is not.

No. Apparently, the Java implementation you used is a huge memory
hog. Or maybe there was something about the way you used it.
Who knows.

In addition, you compared a general purpose memory manager
with a custom allocator written for exactly the kind of
usage pattern that your example exhibits. I can't really be
bothered to completely list the many ways your "comparison"
is flawed, sorry.

I compare clever manual memory management technique with
automatic memory management.

You compare investing a lot of time and effort (read cost) in
custom memory management with just using something out of the
box.

The call to `g_tree_alloc.flush()' is manual management. I
don't see how it's flawed. Are you saying that I should use
plain 'new/delete' for every allocation/deallocation?

That's the idiomatic way of doing things in the language. It's
also the cheapest way (in terms of cost).

C++ is much more flexible than that.

So is Java, when it comes down to it.

Nov 21 '08 #150
