C++0x Garbage Collection

Goalie_Ca

I have been reading (or at least googling) about the potential addition
of optional garbage collection to C++0x. There are numerous myths and
whatnot with very little detailed information.

Will this work be library based or language based and will it be based
on that of managed C++? Then of course there are the finer technical
questions raised (especially due to pointer abuse). Is a GC for C++
just a pipe dream, or is there a lot of work in the committee to realise
it?

Jun 24 '06 #1

Roland Pibinger

On 23 Jun 2006 23:49:30 -0700, "Goalie_Ca" <go******@gmail.com> wrote:

I have been reading (or at least googling) about the potential addition
of optional garbage collection to C++0x. There are numerous myths and
whatnot with very little detailed information.

C++ needs no garbage collection because it offers something better:
deterministic resource management with destructors. You should have
googled for 'RAII' instead of 'garbage collection'. Actually, RAII is
the 'unique selling proposition' of C++. Introducing GC into C++ would
only make it a worse Java or C#.
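
As a minimal sketch of the RAII idiom referred to here (the class File and the function write_then_read are made-up names for illustration, not part of any library), a destructor can own the job of closing a file:

```cpp
#include <cstdio>
#include <cstring>
#include <stdexcept>

// Hypothetical RAII wrapper around a C FILE*; the name File is illustrative.
class File {
public:
    File(const char* path, const char* mode) : fp_(std::fopen(path, mode)) {
        if (!fp_) throw std::runtime_error("cannot open file");
    }
    ~File() { std::fclose(fp_); }   // runs at scope exit, even during unwinding

    File(const File&) = delete;     // exactly one owner per handle
    File& operator=(const File&) = delete;

    std::FILE* get() const { return fp_; }

private:
    std::FILE* fp_;
};

// Writes a short text through one handle, then reads it back through another.
bool write_then_read(const char* path) {
    {
        File out(path, "w");
        std::fputs("hello", out.get());
    }   // out is destroyed here: the file is flushed and closed, no explicit call

    File in(path, "r");
    char buf[16] = {0};
    return std::fgets(buf, sizeof buf, in.get()) != nullptr
        && std::strcmp(buf, "hello") == 0;
}
```

No call site has to remember to close anything; the cleanup is tied to the lifetime of the local object.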

Best wishes,
Roland Pibinger

Jun 24 '06 #2

Mirek Fidler

Roland Pibinger wrote:

On 23 Jun 2006 23:49:30 -0700, "Goalie_Ca" <go******@gmail.com> wrote:
I have been reading (or at least googling) about the potential addition
of optional garbage collection to C++0x. There are numerous myths and
whatnot with very little detailed information.

C++ needs no garbage collection because it offers something better:
deterministic resource management with destructors. You should have
googled for 'RAII' instead of 'garbage collection'.

Alternatively, go directly here:

http://upp.sf.net

and take a look at sources ;)

You can e.g. start here:

http://upp.sourceforge.net/www$uppweb$vsswing$en-us.html

RAII rules GC.

Mirek

Jun 24 '06 #3

Roland Pibinger

On Sat, 24 Jun 2006 09:58:51 +0200, Mirek Fidler <cx*@volny.cz> wrote:

Roland Pibinger wrote:

On 23 Jun 2006 23:49:30 -0700, "Goalie_Ca" <go******@gmail.com> wrote:
I have been reading (or at least googling) about the potential addition
of optional garbage collection to C++0x. There are numerous myths and
whatnot with very little detailed information.

C++ needs no garbage collection because it offers something better:
deterministic resource management with destructors. You should have
googled for 'RAII' instead of 'garbage collection'.

Alternatively, go directly here:

http://upp.sf.net

and take a look at sources ;)

You can e.g. start here:

http://upp.sourceforge.net/www$uppweb$vsswing$en-us.html

Definitely worth a look and a trial although, for my taste, too "smart
and aggressive" in the use of C++.

RAII rules GC.

:-)

Best wishes,
Roland Pibinger

Jun 24 '06 #4

Joe Seigh

Goalie_Ca wrote:

I have been reading (or at least googling) about the potential addition
of optional garbage collection to C++0x. There are numerous myths and
whatnot with very little detailed information.

Will this work be library based or language based and will it be based
on that of managed C++? Then of course there are the finer technical
questions raised (especially due to pointer abuse). Is a GC for C++
just a pipe dream or is there a lot of work in the committee to realise
it.

Part of the problem is there are different forms of GC. Even for tracing
GC there are enough differences that being able to plug in at the library
level might be a little problematic. Whatever solution they pick is not
likely to be neutral as far as the alternative solutions are concerned.

--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.

Jun 24 '06 #5

Jerry Coffin

In article <11*********************@y41g2000cwy.googlegroups.com>,
go******@gmail.com says...

I have been reading (or at least googling) about the potential addition
of optional garbage collection to C++0x. There are numerous myths and
whatnot with very little detailed information.

Will this work be library based or language based and will it be based
on that of managed C++? Then of course there are the finer technical
questions raised (especially due to pointer abuse). Is a GC for C++
just a pipe dream or is there a lot of work in the committee to realise
it.

There were a couple of lengthy threads about this in
comp.lang.c++.moderated. See:

http://tinyurl.com/s57fq

and:

http://tinyurl.com/p5op2

For starters. When I said lengthy, I wasn't kidding though -- reading
through all this will take considerable time (and this thread will
probably echo many of the same arguments).

--
Later,
Jerry.

The universe is a figment of its own imagination.

Jun 24 '06 #6

Goalie_Ca

Thanks for those threads. I can see that in general people are divided
at every level on how to approach this problem. To me, the outsider, it
appears that it will not likely make this revision although there is
clearly support for having one included.

Jerry Coffin wrote:

In article <11*********************@y41g2000cwy.googlegroups.com>,
go******@gmail.com says...
I have been reading (or at least googling) about the potential addition
of optional garbage collection to C++0x. There are numerous myths and
whatnot with very little detailed information.

Will this work be library based or language based and will it be based
on that of managed C++? Then of course there are the finer technical
questions raised (especially due to pointer abuse). Is a GC for C++
just a pipe dream or is there a lot of work in the committee to realise
it.

There were a couple of lengthy threads about this in
comp.lang.c++.moderated. See:

http://tinyurl.com/s57fq

and:

http://tinyurl.com/p5op2

For starters. When I said lengthy, I wasn't kidding though -- reading
through all this will take considerable time (and this thread will
probably echo many of the same arguments).

--
Later,
Jerry.

The universe is a figment of its own imagination.

Jun 25 '06 #7

Ron House

Goalie_Ca wrote:

Thanks for those threads. I can see that in general people are divided
at every level on how to approach this problem. To me, the outsider, it
appears that it will not likely make this revision although there is
clearly support for having one included.

My question is a simple one: how do we combine destructors with GC?
Destructors do not become superfluous just because one usage for them
does. Closing files and shutting down other resources in a timely manner
becomes hard when object termination occurs at an indeterminate time. Is
there a way to do this that is sufficiently practical and efficient to
raise no major objections if put into a standard? It is one thing for an
add-on to do it, as we use or not use the add-on to our own liking; but
a standard is another matter.

--
Ron House ho***@usq.edu.au
http://www.sci.usq.edu.au/staff/house

Jun 26 '06 #8

Jerry Coffin

In article <44**************@usq.edu.au>, ho***@usq.edu.au says...

Goalie_Ca wrote:
Thanks for those threads. I can see that in general people are divided
at every level on how to approach this problem. To me, the outsider, it
appears that it will not likely make this revision although there is
clearly support for having one included.

My question is a simple one: how do we combine destructors with GC?

One of the threads I previously cited was titled "Reconciling Garbage
Collection with Deterministic Finalization". Even with little or no
knowledge of the subject matter, the mere fact that the thread went
on for well over 300 posts tends to show that nobody has a really
solid answer for that.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Jun 26 '06 #9

Alf P. Steinbach

* Ron House:

Goalie_Ca wrote:
Thanks for those threads. I can see that in general people are divided
at every level on how to approach this problem. To me, the outsider, it
appears that it will not likely make this revision although there is
clearly support for having one included.

My question is a simple one: how do we combine destructors with GC?
Destructors do not become superfluous just because one usage for them
does. Closing files and shutting down other resources in a timely manner
becomes hard when object termination occurs at an indeterminate time. Is
there a way to do this that is sufficiently practical and efficient to
raise no major objections if put into a standard? It is one thing for an
add-on to do it, as we use or not use the add-on to our own liking; but
a standard is another matter.

It's very simple: object destruction is not memory reclamation, and
memory reclamation is not object destruction.

The job of a garbage collector is solely to reclaim memory.

If an object has a non-trivial destructor, one with possible side
effects, then that object cannot be automatically destroyed by a garbage
collector in order to reclaim memory, because then the garbage collector
would intrude in the arena of program logic and correctness.

Thus, a C++ garbage collector compatible with the current standard can
reclaim a region of memory only when:

A. There are no live references to the region (circular references
are not live), and

B. All remaining objects in the region have trivial destructors.

The case where there are no remaining objects in the region (i.e. all
have been destroyed) might seem to be of no practical advantage, but it
is if ::operator delete, rather than deallocating at once, just invokes
object destruction and marks the memory for later automatic reclamation,
which can proceed e.g. when the program's later waiting for user input.

This reduces garbage collection in C++ to an /optimization/ and /memory
leak slurper/, not affecting correctness except to the degree it slurps.
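
A rough sketch of that "destroy now, reclaim later" split, under stated assumptions: DeferredReclaimer, make_raw and Tracer are hypothetical names, the memory is obtained from the global ::operator new, and types with extended alignment or class-specific allocation functions are not handled.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical helper: holds memory whose objects are already destroyed,
// waiting to be handed back to the allocator at a convenient time.
class DeferredReclaimer {
public:
    ~DeferredReclaimer() { reclaim(); }

    // Runs the destructor right away, so program logic stays deterministic,
    // but only queues the raw memory for later release.
    template <typename T>
    void destroy_now_reclaim_later(T* p) {
        if (!p) return;
        p->~T();
        pending_.push_back(p);
    }

    // Meant to be called when the program is idle, e.g. waiting for input.
    // Returns the number of blocks released.
    std::size_t reclaim() {
        std::size_t n = pending_.size();
        for (void* raw : pending_) ::operator delete(raw);
        pending_.clear();
        return n;
    }

    std::size_t pending() const { return pending_.size(); }

private:
    std::vector<void*> pending_;
};

// Test type that counts how often its destructor runs.
struct Tracer {
    explicit Tracer(int& counter) : counter_(counter) {}
    ~Tracer() { ++counter_; }
    int& counter_;
};

// Allocates with ::operator new so that ::operator delete is the match.
// If T's constructor throws, the block leaks; acceptable for a sketch only.
template <typename T, typename Arg>
T* make_raw(Arg& arg) {
    void* raw = ::operator new(sizeof(T));
    return new (raw) T(arg);
}
```

The destructor's side effects happen at the point of the call; only the return of the memory is postponed, which is why correctness is unaffected.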

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Jun 26 '06 #10

Joe Seigh

Jerry Coffin wrote:

In article <44**************@usq.edu.au>, ho***@usq.edu.au says...
Goalie_Ca wrote:
Thanks for those threads. I can see that in general people are divided
at every level on how to approach this problem. To me, the outsider, it
appears that it will not likely make this revision although there is
clearly support for having one included.

My question is a simple one: how do we combine destructors with GC?

One of the threads I previously cited was titled "Reconciling Garbage
Collection with Deterministic Finalization". Even with little or no
knowledge of the subject matter, the mere fact that the thread went
on for well over 300 posts tends to show that nobody has a really
solid answer for that.

No surprise, since it was sort of at the all-or-nothing level. You could
provide a lower-level API for use by GC implementations and let the
market decide. Of course you'd have to recognise that there are different
forms of GC out there, not just tracing GC, otherwise there really wouldn't
be any market to decide. Kind of like Henry Ford's "you can have any color
you want as long as it's black".

--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.

Jun 26 '06 #11

Ron House

Alf P. Steinbach wrote:

* Ron House:

My question is a simple one: how do we combine destructors with GC?
Destructors do not become superfluous just because one usage for them
does. Closing files and shutting down other resources in a timely
manner becomes hard when object termination occurs at an indeterminate
time. Is there a way to do this that is sufficiently practical and
efficient to raise no major objections if put into a standard? It is
one thing for an add-on to do it, as we use or not use the add-on to
our own liking; but a standard is another matter.

It's very simple: object destruction is not memory reclamation, and
memory reclamation is not object destruction.

The job of a garbage collector is solely to reclaim memory.

If an object has a non-trivial destructor, one with possible side
effects, then that object cannot be automatically destroyed by a garbage
collector in order to reclaim memory, because then the garbage collector
would intrude in the arena of program logic and correctness.

Thus, a C++ garbage collector compatible with the current standard can
reclaim a region of memory only when:

A. There are no live references to the region (circular references
are not live), and

B. All remaining objects in the region have trivial destructors.

The case where there are no remaining objects in the region (i.e. all
have been destroyed) might seem to be of no practical advantage, but it
is if ::operator delete, rather than deallocating at once, just invokes
object destruction and marks the memory for later automatic reclamation,
which can proceed e.g. when the program's later waiting for user input.

This reduces garbage collection in C++ to an /optimization/ and /memory
leak slurper/, not affecting correctness except to the degree it slurps.

Well it can't be "very simple" if the solution doesn't cover the real
motivation for having GC. I don't give a fig for efficiency (within
reason). The motivation for GC is to prevent errors: remove the need for
deleting dynamic memory so that you remove the need to keep track of it,
which simplifies a whole slew of algorithms and designs. If we have to
keep on keeping track for 'difficult' objects, then the algorithms and
methods will be made complicated anyway; errors will still be possible.
So, can we have GC (that is, auto-reclamation of space, meaning no
programming to call delete) along with deterministic execution of the
destructor? For example, we might be satisfied if we could: 1) be sure
destructors would immediately be called at the end for stack variables,
2) be sure they would be called sooner or later for lost heap variables,
and 3) called deterministically by using delete deliberately.

--
Ron House ho***@usq.edu.au
http://www.sci.usq.edu.au/staff/house

Jun 27 '06 #12

Alf P. Steinbach

* Ron House:

Alf P. Steinbach wrote:
* Ron House:
My question is a simple one: how do we combine destructors with GC?
Destructors do not become superfluous just because one usage for them
does. Closing files and shutting down other resources in a timely
manner becomes hard when object termination occurs at an
indeterminate time. Is there a way to do this that is sufficiently
practical and efficient to raise no major objections if put into a
standard? It is one thing for an add-on to do it, as we use or not
use the add-on to our own liking; but a standard is another matter.

It's very simple: object destruction is not memory reclamation, and
memory reclamation is not object destruction.

The job of a garbage collector is solely to reclaim memory.

If an object has a non-trivial destructor, one with possible side
effects, then that object cannot be automatically destroyed by a
garbage collector in order to reclaim memory, because then the garbage
collector would intrude in the arena of program logic and correctness.

Thus, a C++ garbage collector compatible with the current standard can
reclaim a region of memory only when:

A. There are no live references to the region (circular references
are not live), and

B. All remaining objects in the region have trivial destructors.

The case where there are no remaining objects in the region (i.e. all
have been destroyed) might seem to be of no practical advantage, but
it is if ::operator delete, rather than deallocating at once, just
invokes object destruction and marks the memory for later automatic
reclamation, which can proceed e.g. when the program's later waiting
for user input.

This reduces garbage collection in C++ to an /optimization/ and
/memory leak slurper/, not affecting correctness except to the degree
it slurps.

Well it can't be "very simple" if the solution doesn't cover the real
motivation for having GC. I don't give a fig for efficiency (within
reason). The motivation for GC is to prevent errors: remove the need for
deleting dynamic memory so that you remove the need to keep track of it,
which simplifies a whole slew of algorithms and designs.

The above does that: managing memory.

If we have to
keep on keeping track for 'difficult' objects, then the algorithms and
methods will be made complicated anyway; errors will still be possible.

They are, but more so when non-memory cleanup responsibility is left to
GC. Look to Java, where the poor programmers have to do manually what
C++ does automatically, because the Java GC idea prevents RAII. GC
manages memory, not other resources, and when cajoled into managing
other resources does an extremely poor job, so poor that you're better
off without it (zombies, meta-invariants, that sort of thing).

So, can we have GC (that is, auto-reclamation of space, meaning no
programming to call delete) along with deterministic execution of the
destructor? For example, we might be satisfied if we could: 1) be sure
destructors would immediately be called at the end for stack variables,
2) be sure they would be called sooner or later for lost heap variables,
and 3) called deterministically by using delete deliberately.

You'd not really want (2). At least, I don't. For example, if you have
an open file and leave it to GC to close it, then after a while you'll
have an arbitrary number of open files hanging around, waiting for GC to
close them -- if ever -- preventing access to those files both from
other programs and your own (IMO relying on GC for this is an error).

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Jun 27 '06 #13

Ian Collins

Ron House wrote:

Alf P. Steinbach wrote:
* Ron House:
My question is a simple one: how do we combine destructors with GC?
Destructors do not become superfluous just because one usage for them
does. Closing files and shutting down other resources in a timely
manner becomes hard when object termination occurs at an
indeterminate time. Is there a way to do this that is sufficiently
practical and efficient to raise no major objections if put into a
standard? It is one thing for an add-on to do it, as we use or not
use the add-on to our own liking; but a standard is another matter.

It's very simple: object destruction is not memory reclamation, and
memory reclamation is not object destruction.

The job of a garbage collector is solely to reclaim memory.

If an object has a non-trivial destructor, one with possible side
effects, then that object cannot be automatically destroyed by a
garbage collector in order to reclaim memory, because then the garbage
collector would intrude in the arena of program logic and correctness.

Thus, a C++ garbage collector compatible with the current standard can
reclaim a region of memory only when:

A. There are no live references to the region (circular references
are not live), and

B. All remaining objects in the region have trivial destructors.

The case where there are no remaining objects in the region (i.e. all
have been destroyed) might seem to be of no practical advantage, but
it is if ::operator delete, rather than deallocating at once, just
invokes object destruction and marks the memory for later automatic
reclamation, which can proceed e.g. when the program's later waiting
for user input.

This reduces garbage collection in C++ to an /optimization/ and
/memory leak slurper/, not affecting correctness except to the degree
it slurps.

Well it can't be "very simple" if the solution doesn't cover the real
motivation for having GC. I don't give a fig for efficiency (within
reason). The motivation for GC is to prevent errors:

That should read "prevent errors introduced through lazy programming".

remove the need for
deleting dynamic memory so that you remove the need to keep track of it,
which simplifies a whole slew of algorithms and designs. If we have to
keep on keeping track for 'difficult' objects, then the algorithms and
methods will be made complicated anyway; errors will still be possible.

My platform has an effective GC library, but I only use it during
acceptance test runs, to verify that there aren't any leaks.

GC doesn't belong in the language; if it is to be used at all, it should
be in a library.

RAII is a much more effective and deterministic tool.

--
Ian Collins.

Jun 27 '06 #14

Roland Pibinger

On Tue, 27 Jun 2006 05:07:45 +0200, "Alf P. Steinbach"
<al***@start.no> wrote:

* Ron House:
So, can we have GC (that is, auto-reclamation of space, meaning no
programming to call delete) along with deterministic execution of the
destructor? For example, we might be satisfied if we could: 1) be sure
destructors would immediately be called at the end for stack variables,
2) be sure they would be called sooner or later for lost heap variables,
and 3) called deterministically by using delete deliberately.

You'd not really want (2). At least, I don't. For example, if you have
an open file and leave it to GC to close it, then after a while you'll
have an arbitrary number of open files hanging around, waiting for GC to
close them -- if ever -- preventing access to those files both from
other programs and your own (IMO relying on GC for this is an error).

If someone wanted to introduce GC into C++ with a newly-made 'gcnew'
operator, the _compiler_ would have to check that all data members of the
gcnew-ed object (and its base classes) have trivial (i.e.
non-implemented) destructors. Otherwise the compiler would have to
issue a compile-time error. This means that not even the current
std::string could be a member of a gcnew-ed object. Moreover, since
objects with trivial destructors usually are of value type, and value
types are best handled with value semantics, the introduction of even a
limited form of GC into C++ seems highly questionable.
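
The kind of compile-time check described here can be approximated with the std::is_trivially_destructible trait from the later C++11 library. This is only a sketch: gc_collectable, require_collectable, Point and Named are hypothetical names, and no real 'gcnew' is involved.

```cpp
#include <string>
#include <type_traits>

struct Point { double x, y; };               // trivial destructor
struct Named { std::string name; int id; };  // std::string member: non-trivial

// Hypothetical predicate: would the rule above allow T to be collected?
template <typename T>
constexpr bool gc_collectable() {
    return std::is_trivially_destructible<T>::value;
}

// Hypothetical guard a gcnew-like facility could apply to its argument type.
// Instantiating it with a non-collectable type is a compile-time error.
template <typename T>
void require_collectable() {
    static_assert(gc_collectable<T>(),
                  "type has a non-trivial destructor");
}
```

With these definitions require_collectable<Point>() compiles, while require_collectable<Named>() is rejected because of the std::string member.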

Best regards,
Roland Pibinger

Jun 27 '06 #15

Dilip

Ron House wrote:

Goalie_Ca wrote:
Thanks for those threads. I can see that in general people are divided
at every level on how to approach this problem. To me, the outsider, it
appears that it will not likely make this revision although there is
clearly support for having one included.

My question is a simple one: how do we combine destructors with GC?
Destructors do not become superfluous just because one usage for them
does. Closing files and shutting down other resources in a timely manner
becomes hard when object termination occurs at an indeterminate time. Is
there a way to do this that is sufficiently practical and efficient to
raise no major objections if put into a standard? It is one thing for an
add-on to do it, as we use or not use the add-on to our own liking; but
a standard is another matter.

Years ago when C# first surfaced, a Microsoft employee posted a
_lengthy_ analysis on the resource management/deterministic
finalization conflict. The link is here:
http://discuss.develop.com/archives/...OTNET&P=R28572
It talks about the pros & cons of reference counting vs. deterministic
finalization. Might be relevant to this thread.
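
For readers who want the reference counting side in C++ terms, here is a small sketch using std::shared_ptr (standardized in C++11, after this thread; boost::shared_ptr was the contemporary equivalent). Connection and open_after_last_owner_gone are made-up names.

```cpp
#include <cassert>
#include <memory>

// Stand-in for a resource; it tracks how many instances are currently open.
struct Connection {
    explicit Connection(int& open_count) : open_count_(open_count) { ++open_count_; }
    ~Connection() { --open_count_; }
    int& open_count_;
};

// Returns the number of open connections after the last owner is gone.
int open_after_last_owner_gone() {
    int open = 0;
    {
        std::shared_ptr<Connection> a = std::make_shared<Connection>(open);
        std::shared_ptr<Connection> b = a;  // two owners, one object
        a.reset();                          // still alive: b owns it
        assert(open == 1);
    }   // b goes out of scope: the count hits zero and the destructor runs here
    return open;
}
```

Nobody calls delete, yet the destructor runs at a predictable point. The known cost is that objects referring to each other in a cycle are never released by reference counting alone.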

Jun 27 '06 #16

Ron House

Alf P. Steinbach wrote:

* Ron House:
Alf P. Steinbach wrote:
Well it can't be "very simple" if the solution doesn't cover the real
motivation for having GC. I don't give a fig for efficiency (within
reason). The motivation for GC is to prevent errors: remove the need
for deleting dynamic memory so that you remove the need to keep track
of it, which simplifies a whole slew of algorithms and designs.

The above does that: managing memory.

Not if it manages it quirkily - only for 'simple' objects, but not others.

If we have to keep on keeping track for 'difficult' objects, then the
algorithms and methods will be made complicated anyway; errors will
still be possible.

They are, but more so when non-memory cleanup responsibility is left to
GC. Look to Java, where the poor programmers have to do manually what
C++ does automatically, because the Java GC idea prevents RAII. GC
manages memory, not other resources, and when cajoled into managing
other resources does an extremely poor job, so poor that you're better
off without it (zombies, meta-invariants, that sort of thing).

That is what I am seriously considering, which is why I was surprised by
your original remark that it was "very simple". If these ideas are
basically at odds, then it is the opposite of simple, it is impossible.

So, can we have GC (that is, auto-reclamation of space, meaning no
programming to call delete) along with deterministic execution of the
destructor? For example, we might be satisfied if we could: 1) be sure
destructors would immediately be called at the end for stack
variables, 2) be sure they would be called sooner or later for lost
heap variables, and 3) called deterministically by using delete
deliberately.

You'd not really want (2). At least, I don't. For example, if you have
an open file and leave it to GC to close it, then after a while you'll
have an arbitrary number of open files hanging around, waiting for GC to
close them -- if ever -- preventing access to those files both from
other programs and your own (IMO relying on GC for this is an error).

My point in saying "be sure" is to rule out "if ever". I.e., you are not
criticising my requirement, but the violation of it. Now it might be
true that we cannot "have it all" in this matter, but I am looking for
evidence that this has been shown conclusively, as opposed to it merely
being that no one has found out the way to do it yet.

--
Ron House ho***@usq.edu.au
http://www.sci.usq.edu.au/staff/house
Ethics website: http://www.sci.usq.edu.au/staff/house/goodness

Jun 28 '06 #17

Kaz Kylheku

Ron House wrote:

Goalie_Ca wrote:
Thanks for those threads. I can see that in general people are divided
at every level on how to approach this problem. To me, the outsider, it
appears that it will not likely make this revision although there is
clearly support for having one included.

My question is a simple one: how do we combine destructors with GC?

The most straightforward way is for the garbage collector to know how
to invoke the equivalent of delete on the objects. (I would abhor a
scheme whereby GC'able objects have to inherit from some special base
class with a virtual destructor; it would be better if this was
intelligent somehow).

For some types of objects, calling the destructor at GC time might be
too late. Fine; those objects have to be coded with two-step
destruction. Deterministically do the actions that have to be taken,
using a chain of member functions. The destructor then behaves as a
finalizer. It ensures that those actions happen, plus any other cleanup
actions.

A class written like that can be used with garbage collection, as well
as with new/delete and RAII.
Destructors do not become superfluous just because one usage for them
does. Closing files and shutting down other resources in a timely manner
becomes hard when object termination occurs at an indeterminate time.

The indeterminate lifetime of an object isn't caused by garbage
collection. It's caused by the semantics of the program. Garbage
collection solves the problem of computing that lifetime.

If you don't have garbage collection or some other scheme, you still
have to compute the lifetime of an object and call for it to be
destroyed by explicit delete.

And if you do that, that delete call may also be too late in releasing
an operating system resource.

You simply have to regard that operating system resource handle as
having its own lifetime, which is contained within the lifetime of the
encapsulating object.

You can take the responsibility for computing that contained lifetime,
and let the garbage collector determine the main lifetime.

You don't want to use the destructor for ending the resource lifetime,
because that will turn your entire object into garbage, while it is
still reachable.

So the obvious thing is to release the resource and change the state of
the object to indicate that the object does not have that resource.

Garbage collection is not incompatible with your program knowing when
to release a resource.

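class ResourceWrapper {
private:
    SomeResource *res;
public:
    // ...

    // Step one: release the resource early, at a time the program chooses.
    // Safe to call more than once.
    void ReleaseResource()
    {
        if (res != 0) {
            DestroyResource(res);
            res = 0;
        }
    }

    // Step two: the destructor acts as a finalizer and covers the case
    // where ReleaseResource() was never called.
    virtual ~ResourceWrapper()
    {
        ReleaseResource();
    }
};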
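
To try the idea out, here is a self-contained variant of the class above. The definitions of SomeResource and DestroyResource, the constructor taking a flag, and HasResource() are assumptions added so that the sketch compiles and its effect can be observed; they are not part of the original snippet.

```cpp
// Hypothetical stand-ins for the names used in the snippet above.
struct SomeResource { bool* released; };

inline void DestroyResource(SomeResource* r) {
    *r->released = true;   // lets the caller observe the release
    delete r;
}

class ResourceWrapper {
public:
    explicit ResourceWrapper(bool* released_flag)
        : res(new SomeResource{released_flag}) {}

    ResourceWrapper(const ResourceWrapper&) = delete;
    ResourceWrapper& operator=(const ResourceWrapper&) = delete;

    bool HasResource() const { return res != nullptr; }

    // Deterministic step: end the resource lifetime, keep the object valid.
    void ReleaseResource() {
        if (res != nullptr) {
            DestroyResource(res);
            res = nullptr;
        }
    }

    // Finalizer step: whoever destroys the object (scope exit, delete, or a
    // collector) still gets the resource released if nobody did it earlier.
    virtual ~ResourceWrapper() { ReleaseResource(); }

private:
    SomeResource* res;
};
```

After an explicit ReleaseResource() the object is still reachable and valid, it just reports that it no longer holds the resource; a second call, or the later destructor, is a no-op.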

Okay, so that gets the obvious out of the way. Now the problem.

The issue is that functions like ResourceWrapper::ReleaseResource() are
ad hoc. Whereas a destructor is a formalism built into the language.
The destructor formalism does something nice. Namely, it ensures that
the member and base destructors are called.

Here, the partial cleanup done by ReleaseResource() has the
responsibility of doing whatever needs to be done in the base and
member objects, if anything. If ReleaseResource() is virtual and is
overridden, the derived ReleaseResource() will probably have to call
the parent one.

In the intelligently designed Common Lisp Object System, any method can
be endowed with auxiliary methods which are called if that primary
method is called. The auxiliary methods can be specialized throughout
the class lattice, and be designated as "before", "after" or "around".
In CLOS terminology, the automatic constructor calling in C++ resembles
before methods being fired. Whereas destructors are after-methods. Sort
of. The most derived destructor that is called is kind of like the
primary method, and the base ones that are called are like
after-methods. There is no counterpart to the automatic calling of
destructors on member objects.

C++ could benefit from member functions which can be extended with
auxiliaries. In a class having some virtual function Foo() it would be
nice to be able to define a special overload of Foo() which is always
called before Foo(), and another overload which is always called after.
That is to say, if the virtual Foo() is invoked on the object, then the
before Foo() is called in that class, and in all the derived classes
which also have one. Then the appropriate override of Foo() is invoked,
at whatever level in the class hierarchy that may be, and then the
afters are called, in derived to base order.

With befores and afters, certain aspects of resource cleanup would be
easier to manage. The ResourceWrapper would look like this:

virtual void ReleaseResource()
{
    // nothing to do here now; it's moved to the after function
}

after void ReleaseResource()
{
    if (res != 0) {
        DestroyResource(res);
        res = 0;
    }
}

So now if ReleaseResource() is called on that object, no matter how
that is derived, the ReleaseResource() after-function is called.
(Provided that no bullshit happens with exceptions!) If someone derives
from this class and overrides ReleaseResource(), that will not prevent
our cleanup from happening.

These before and after functions don't need any weird magic in the
vtable or anything. They are not virtual (and in fact making them so
ought to be forbidden).

It works simply like this. When the compiler translates the virtual
function which looks like this:

void ReleaseResource(int arg)
{
    BODY;
}

it generates the code as:

virtual void ReleaseResource(int arg)
{
    BEFORE(arg);
    BODY;
    AFTER(arg);
}

Here, BEFORE represents the name of the outermost "before
ReleaseResource" function, and AFTER the call to the nearest "after
ReleaseResource". Both functions are called in the normal way. Imagine
scope resolution being used to specify them exactly with whatever funny
names they have known to the compiler.

And of course the befores and afters are automatically instrumented
with exception-safe code which ensures their own continuity. E.g. an
"after ReleaseResource(int arg) { BODY; }" is generated as:

after void ReleaseResource(int arg)
{
    BODY;
    AFTER(arg);
}

No exception safety there; that is deliberate. If anything throws, the
subsequent actions are not invoked.
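
Without any language change, part of this effect can be approximated today with a non-virtual public function that wraps a private virtual one. The sketch below is only that, an approximation: Base, Derived, DoReleaseResource, AfterRelease and the log string are hypothetical names, and it covers a single "after" step in the base class rather than a chain through every level of the hierarchy.

```cpp
#include <string>

class Base {
public:
    virtual ~Base() = default;

    // Non-virtual entry point: derived classes cannot bypass the after step.
    void ReleaseResource() {
        DoReleaseResource();   // the overridable "primary" part
        AfterRelease();        // always runs afterwards, unless the body throws
    }

    const std::string& Log() const { return log_; }

protected:
    void Note(const char* s) { log_ += s; }

private:
    virtual void DoReleaseResource() {}           // customization point
    void AfterRelease() { Note("base-after;"); }  // fixed cleanup of this class

    std::string log_;
};

class Derived : public Base {
private:
    // Overriding the body does not remove the base class cleanup.
    void DoReleaseResource() override { Note("derived-body;"); }
};
```

As in the proposal, there is deliberately no exception handling: if the overridden body throws, the after step is skipped.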

Jun 28 '06 #18

Kaz Kylheku

Alf P. Steinbach wrote:

It's very simple: object destruction is not memory reclamation, and
memory reclamation is not object destruction.

The job of a garbage collector is solely to reclaim memory.
That is too naive.

An object must remain valid while it is reachable. And while it remains
valid, it will continue to hold whatever resources it has always held.
Those resources must be cleaned up when that object becomes
unreachable.

Mature garbage-collected language implementations all have a way to
register finalization hooks: code run on an object when it's about to
be reclaimed.

They also have features like weak pointers and weak hash tables: these
can refer to objects weakly. When objects are GC'd, they automatically
disappear from weak hash tables, and weak pointers that refer to them
safely become null.

The only time it's a problem to do finalization at GC time is when a
more timely behavior is needed. For instance, if you keep opening files
and depend on GC to close them, you could run out of operating system file
handles, given a sufficiently large heap.

That doesn't mean destructors shouldn't be run at GC time, only that
some responsibilities have to be taken care of before that happens.
If an object has a non-trivial destructor, one with possible side
effects, then that object cannot be automatically destroyed by a garbage
collector in order to reclaim memory, because then the garbage collector
would intrude in the arena of program logic and correctness.
Be that as it may, people do this quite happily.
Thus, a C++ garbage collector compatible with the current standard can
reclaim a region of memory only when:

A. There are no live references to the region (circular references
are not live), and
Of course circular references are not live. The search for reachable
objects begins with global variables and live locals.
B. All remaining objects in the region have trivial destructors.
So in fact you are calling destructors from the garbage collector.
You're merely insisting that they be trivial, because you're too scared
of having programmer code invoked in the context of the garbage
collector.
The case where there are no remaining objects in the region (i.e. all
have been destroyed) might seem to be of no practical advantage, but it
is if ::operator delete, rather than deallocating at once, just invokes
object destruction and marks the memory for later automatic reclamation,
which can proceed e.g. when the program's later waiting for user input.

If you have garbage collection, then operator delete becomes a
dangerous tool which undermines the garbage collector. You should not
be using it. If you use delete, then you create the risk that you are
destroying an object to which references still remain.

Quite simply, the delete operator cannot be trusted, and so the memory
cannot be marked for reclamation.

Since you've established that these objects have only trivial
destructors, then operator delete is reduced to a noop.

But in a better design, where non-trivial destructors are allowed under
GC, what you would do is this: operator delete calls the destructor
chain on the object, but does not deallocate the memory. Instead, it
sets a flag which indicates that the destructors have been called
already. Later, when that object is garbage collected, the collector
will honor that "destroyed already" flag and not invoke a redundant
destructor call. It will just do the memory reclamation.
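
The "destroyed already" flag can be sketched as a toy model. This is not a real collector: GcCell, gc_new, gc_delete and gc_reclaim are hypothetical names, and gc_reclaim is simply called by hand at the point where a collector would have found the cell unreachable.

```cpp
#include <cassert>
#include <new>

// Toy cell: raw storage for a T plus the "destroyed already" flag.
template <typename T>
struct GcCell {
    alignas(T) unsigned char storage[sizeof(T)];
    T* obj = nullptr;        // set by gc_new
    bool destroyed = false;  // true once the destructor has run
};

// Allocate a cell and construct a T inside it.
template <typename T>
GcCell<T>* gc_new() {
    GcCell<T>* c = new GcCell<T>;
    c->obj = new (c->storage) T();
    return c;
}

// What delete would do under this design: destroy, but keep the memory.
template <typename T>
void gc_delete(GcCell<T>* c) {
    if (!c->destroyed) {
        c->obj->~T();
        c->destroyed = true;
    }
}

// What the collector would do once the cell is unreachable.
template <typename T>
void gc_reclaim(GcCell<T>* c) {
    gc_delete(c);  // runs the destructor only if it has not run yet
    delete c;      // memory reclamation; GcCell itself has no destructor logic
}

struct Tracked {
    static int dtor_calls;
    ~Tracked() { ++dtor_calls; }
};
int Tracked::dtor_calls = 0;

// Returns the number of destructor calls for two objects.
int demo_destroyed_flag() {
    Tracked::dtor_calls = 0;
    GcCell<Tracked>* a = gc_new<Tracked>();
    gc_delete(a);   // programmer destroys explicitly
    gc_reclaim(a);  // collector later: no second destructor call
    GcCell<Tracked>* b = gc_new<Tracked>();
    gc_reclaim(b);  // programmer forgot: collector runs the destructor
    return Tracked::dtor_calls;
}
```

Each object is destroyed exactly once whichever path reaches it first. The sketch does nothing about references that are still used after gc_delete; that remains undefined behavior, as the post says.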

So everyone is happy. C++ programmers who think garbage collectors
should only hunt down memory and not handle object destruction can
knock themselves out by computing their own object lifetimes and
calling delete. When these programmers screw that up and forget to call
delete, the GC will save their asses by calling the destructors for
them and reclaiming memory. Those guys won't notice that this happened;
in fact if the destructor has some highly visible side effect, they
will probably still take credit for it, even though it was the GC doing
it for them. And so they can go on believing that garbage collectors
should only hunt down memory and not handle object destruction. When
they call delete and then continue to have references to that object
and use them, they will get the undefined behavior that they crave and
deserve.

Perfect!

Jun 28 '06 #19

Kaz Kylheku

Alf P. Steinbach wrote:

They are, but more so when non-memory cleanup responsibility is left to
GC. Look to Java, where the poor programmers have to do manually what
C++ does automatically, because the Java GC idea prevents RAII.
GC in C++ would not prevent RAII, because you would still be able to
define objects in automatic storage, and as members of other objects,
etc.

{
// RAII at work
ObjectWithResource o;
}

// GC at work
ObjectWithResource *po = new ObjectWithResource;

#if 0
// optional: don't wait for GC, release resource now:
// maybe this is needed, maybe not. depends on resource.
po->releaseResource();
#endif
GC manages memory, not other resources, and when cajoled into managing
other resources does an extremely poor job, so poor that you're better
off without it (zombies, meta-invariants, that sort of thing).
Don't let Java's mistakes, whatever they are, reflect badly on garbage
collection with finalization hooks. There is nothing wrong with having
some function called on an object that is about to be reclaimed, and
it's a sane design for that function to be the destructor, if we are in
C++ land.
Just have a flag so it's not called twice. If the destructor decides to
make that object reachable (for instance using: global_pointer = this)
who cares. The destructor has been called, and won't be called a second
time. End of story.

So, can we have GC (that is, auto-reclamation of space, meaning no
programming to call delete) along with deterministic execution of the
destructor? For example, we might be satisfied if we could: 1) be sure
destructors would immediately be called at the end for stack variables,
2) be sure they would be called sooner or later for lost heap variables,
and 3) called deterministically by using delete deliberately.

and 3) (a): called at most once, no matter what.
You'd not really want (2). At least, I don't. For example, if you have
an open file and leave it to GC to close it, then after a while you'll
have an arbitrary number of open files hanging around, waiting for GC to
close them -- if ever -- preventing access to those files both from
There is no connection between (2) and the problem you are describing.
If that happens, it means that the programmer made the mistake of
solely relying on the destructor to clean up that resource.

Under GC discipline, the destructor must be regarded only as a last
resort cleanup for these kinds of resources.

The object should have some alternate method for releasing the file or
whatever resource, and the program should call that method.

The destructor /should/ still release that resource if it still exists
by that time. Not only because that will plug a resource leak, but
because the C++ class should still be useable for RAII discipline, when
instances are defined in automatic storage.

If I have some LogFile object, I'm not going to call Close() on it if I
defined it in a block scope. I want the destructor to do that.

If I used new to dynamically allocate it, I don't want to wait for the
destructor to close the file. So I will call ptr->Close().
other programs and your own (IMO relying on GC for this is an error).

Yes, relying on GC for this is an error. But that doesn't mean GC
shouldn't call destructors. It means that the destructor cannot be the
only way for that cleanup to take place.
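
A minimal sketch of the LogFile idea, assuming a counter in place of a real operating system handle so that it runs anywhere. The counter g_open_handles and the demo function are inventions for illustration, and the final delete merely stands in for whatever a collector would do later.

```cpp
#include <cassert>

// Stand-in for the operating system's handle table.
static int g_open_handles = 0;

class LogFile {
public:
    LogFile() : open_(true) { ++g_open_handles; }
    LogFile(const LogFile&) = delete;
    LogFile& operator=(const LogFile&) = delete;

    // Timely, explicit release; safe to call more than once.
    void Close() {
        if (open_) {
            --g_open_handles;
            open_ = false;
        }
    }

    // Last-resort cleanup: covers block-scope use and late reclamation.
    ~LogFile() { Close(); }

private:
    bool open_;
};

// Returns the number of handles still open at the three checkpoints, summed.
int demo_logfile() {
    {
        LogFile scoped;  // block scope: the destructor closes it
    }
    int after_scope = g_open_handles;

    LogFile* heap = new LogFile;  // dynamic: close now, do not wait
    heap->Close();
    int after_close = g_open_handles;

    delete heap;  // stands in for later reclamation; no double close
    return after_scope + after_close + g_open_handles;
}
```

The same class serves both disciplines: defined in a block it behaves as plain RAII, and allocated with new it can be closed early while the destructor remains as a backstop.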

But, also, the situation is not necessarily a calamity. If the program
only loses track of a small number of such objects, maybe it's okay for
them to be reclaimed later. It's only if a sufficiently large number of
unreclaimed objects pile up, containing open handles, that it becomes a
problem.

Consider the short lived program. Opens up a bunch of files, does some
work and then dies. The file handles close on process death. GC didn't
even have a chance to run! Who cares.

Also, there could always be an API for invoking GC explicitly. An
application could call for a full GC at specific checkpoints to flush
out these objects. In some applications, that solution could be quite
adequate and easier than coding the explicit calls to close the
resource, in all the right places.

In Common Lisp there is a macro called WITH-OPEN-FILE. It opens a file,
binds it to a variable and establishes a scope for expressions where
that variable is known. No matter how that scope terminates, the file
is closed.

You don't have to use that macro: you can open the file directly using
the underlying API. Then you are on your own. If you don't close it, GC
will still do it. Only who knows when.

All that stuff works. Common Lisp programmers don't sit around
lamenting GC problems; everything that is discussed here is pretty much
a non-issue.

Instead of looking at Java, why not skip that hornet's nest and go
upstream to some of the places where it borrows some of its
misunderstood and misimplemented ideas.

Jun 28 '06 #20

Kaz Kylheku

Ian Collins wrote:

Ron House wrote:
Well it can't be "very simple" if the solution doesn't cover the real
motivation for having GC. I don't give a fig for efficiency (within
reason). The motivation for GC is to prevent errors:
That should read "prevent errors introduced through lazy programming"

Garbage collection allows for certain programs to be expressed which
are impossible without it, because they would collapse under the
complexity of managing the memory.

He is lazy who avoids work in order to be idle.

He is not lazy who avoids work in order to do other, more valuable
work.

If you think that automatically managing object lifetimes is laziness,
then never define a smart pointer.

You know, those constructors and destructors, all they do is prevent
errors introduced through lazy programming. Real programmers always
remember to call the right initialization function after allocating the
right amount of memory.

I invite you to take the Maxima project, a computational algebra system
written in Common Lisp, and rewrite it in C++, with no smart pointers
whatsoever for managing memory. Ha ha.

remove the need for
deleting dynamic memory so that you remove the need to keep track of it,
which simplifies a whole slew of algorithms and designs. If we have to
keep on keeping track for 'difficult' objects, then the algorithms and
methods will be made complicated anyway; errors will still be possible.

My platform has an effective GC library, but I only use it during
acceptance test runs, to verify that there aren't any leaks.

GC doesn't belong in the language, if it is to be used at all, it should
be in a library.

GC isn't anywhere. There is no new syntax, and no function to call.

If you have special GC-related features like weak pointers, or an
explicit call to the garbage collector, then you need an extra library.

It does make sense for the GC run-time support to be optional in C++
implementations. Many C++ implementations make exception handling
support optional, and run-time-type-identification optional.

Don't pay for what you don't use; makes sense.
RAII is a much more effective and deterministic tool.

RAII is limited to block scopes. It does nothing for dynamically
allocated objects.

C++ programmers sometimes use RAII to implement memory management via
smart pointers (e.g. reference counting). That is no longer
deterministic; you don't know which instance of the smart pointer will
take the object with it when it is destroyed.
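
The point about reference-counted smart pointers can be illustrated with std::shared_ptr, which was standardized after this thread (C++11); the Node type and the demo function are assumptions for this sketch. Resetting one owner does nothing visible, and the object dies with whichever owner happens to be last.

```cpp
#include <cassert>
#include <memory>

// Records its own destruction through a caller-supplied flag.
struct Node {
    explicit Node(bool* d) : destroyed(d) {}
    ~Node() { *destroyed = true; }
    bool* destroyed;
};

// Returns true if the object survived the first owner and died with the second.
bool demo_last_owner_wins() {
    bool destroyed = false;
    std::shared_ptr<Node> first = std::make_shared<Node>(&destroyed);
    std::shared_ptr<Node> second = first;  // two owners

    first.reset();                         // not the last owner: nothing happens
    bool alive_after_first = !destroyed;

    second.reset();                        // last owner: destructor runs here
    return alive_after_first && destroyed;
}
```

In a larger program the order in which owners let go is decided at run time, which is the sense in which the post calls this "no longer deterministic". The destructor still runs immediately when the count reaches zero.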

And, besides, RAII is for the mentally weak and lazy, right? Real,
non-lazy programmers always remember to clean everything up at every
exit point from the function, and deal with all possible exceptions.

Jun 28 '06 #21

Alf P. Steinbach

* Kaz Kylheku:

Alf P. Steinbach wrote:
It's very simple: object destruction is not memory reclamation, and
memory reclamation is not object destruction.

The job of a garbage collector is solely to reclaim memory.

That is too naive.

Thinking you have solved the problem of coupling garbage collection to
object destruction deserves some label, but not a factual statement.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Jun 28 '06 #22

Ian Collins

Kaz Kylheku wrote:

Ian Collins wrote:
Ron House wrote:
Well it can't be "very simple" if the solution doesn't cover the real
motivation for having GC. I don't give a fig for efficiency (within
reason). The motivation for GC is to prevent errors:

That should read "prevent errors introduced through lazy programming"

Garbage collection allows for certain programs to be expressed which
are impossible without it, because they would collapse under the
complexity of managing the memory.

He is lazy who avoids work in order to be idle.

He is not lazy who avoids work in order to do other, more valuable
work.

If you think that automatically managing object lifetimes is laziness,
then never define a smart pointer.

I use them all the time.

--
Ian Collins.

Jun 28 '06 #23

peter koch

Roland Pibinger skrev:

On Tue, 27 Jun 2006 05:07:45 +0200, "Alf P. Steinbach" [snip]

When someone wanted to introduce GC into C++ with a newly-made 'gcnew'
operator the _compiler_ would have to check if all data members of the
gcnew-ed object (and its base classes) have trivial (ie.
non-implemented) destructors. Otherwise the compiler would have to
issue a compile time error. This means that not even the current
std::string could be a member of a gcnew-ed object.
Why a compile-time error? So far as I know it is perfectly conforming
to leak resources. Also, if you were a bit more modest and required an
all-or-nothing approach and used a garbage collected new rather than a
special purpose gcnew, the destructor of std::string would be empty
(disregarding the "noop" call of delete), and there would be no err...
ehm ... warning.

/Peter

Moreover, since
objects with trivial destructors usually are of value type
What do you base that statement on? Actually this is mostly my
experience too, but if you look at the threads referred to by Jerry
Coffin this certainly can't be the experience of souls like
Alexandrescu and Sutter.
and value
types are best handled with value semantics the introduction of even a
limited form of GC into C++ seems highly questionable.
No need to.... Boehm has this nice library that does a lot of what you
need.

Best regards,
Roland Pibinger

/Peter

Jun 28 '06 #24

I V

On Tue, 27 Jun 2006 18:27:04 +0000, Roland Pibinger wrote:

std::string could be a member of a gcnew-ed object. Moreover, since
objects with trivial destructors usually are of value type and value

The obvious exception would be types designed to be used polymorphically,
which probably have trivial destructors, probably aren't value objects,
and can't be used with value semantics.
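
The compile-time check discussed in this subthread can be approximated in later standards with the std::is_trivially_destructible trait (C++11). The types below and the gc_new_allowed helper are hypothetical; no actual 'gcnew' operator is implied.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Polymorphic, but with no user-declared destructor anywhere.
struct Shape {
    virtual double area() const { return 0.0; }
};
struct Square : Shape {
    double side = 1.0;
    double area() const override { return side * side; }
};

// Has a std::string member, so its destructor does real work.
struct Named {
    std::string name;
};

// Hypothetical gate that a 'gcnew'-style allocator could apply.
template <typename T>
constexpr bool gc_new_allowed() {
    return std::is_trivially_destructible<T>::value;
}

static_assert(gc_new_allowed<Square>(), "polymorphic, yet trivially destructible");
static_assert(!gc_new_allowed<std::string>(), "std::string owns memory");
static_assert(!gc_new_allowed<Named>(), "a member with a non-trivial destructor");
```

One caveat: a destructor declared virtual is never trivial, so a polymorphic base written with the usual virtual destructor would fail this check even when the destructor body is empty.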

Jun 28 '06 #25

Ron House

Kaz Kylheku wrote:

Ron House wrote:
Goalie_Ca wrote:
Thanks for those threads. I can see that in general people are divided
at every level on how to approach this problem. To me, the outsider, it
appears that it will not likely make this revision although there is
clearly support for having one included.
My question is a simple one: how do we combine destructors with GC?

The most straightforward way is for the garbage collector to know how
to invoke the equivalent of delete on the objects. (I would abhor a
scheme whereby GC'able objects have to inherit from some special base
class with a virtual destructor; it would be better if this was
intelligent somehow).

For some types of objects, calling the destructor at GC time might be
too late.

That's the reason finalisers aren't the equivalent of destructors, and
the reason I ask if destructors can be combined with GC - without being
replaced by finalisers.

For example, is it possible to write a GC that cleans up immediately the
last pointer disappears, rather than in a general sweep, and still be
acceptably efficient? Note: I have looser ideas about what's efficient
than a lot of people, within reasonable bounds. Turning C++ into a
hyper-reliable (defensive against programmer error) language is much
more important to me than cpu cycles.
...
If you don't have garbage collection or some other scheme, you still
have to compute the lifetime of an object and call for it to be
destroyed by explicit delete.
That's why I want GC - _and_ destructors (not finalisers).
You simply have to regard that operating system resource handle as
having its own lifetime, which is contained within the lifetime of the
encapsulating object.
Do we really? I know this is usually claimed, but is there a proof or
does it merely seem 'obvious'?
You can take the responsibility for computing that contained lifetime,
and let the garbage collector determine the main lifetime.
That at least is safe. But sometimes it makes the coder turn somersaults
to get things working right. Sometimes a module receives an object and
doesn't know whether it must be deleted or not. That means that we
always have to have programming designs where each module takes care of
its own. Comparison of this with the sort of thing possible in, for
example, Haskell, shows how restrictive this can be.
You don't want to use the destructor for ending the resource lifetime,
because that will turn your entire object into garbage, while it is
still reachable.
So the obvious thing is release the resource and change the state of
the object to indicate that the object does not have that resource.
Garbage collection is not incompatible with your program knowing when
to release a resource.
...

Your example of BEFORE and AFTER is interesting. I had wondered about
that idea and wasn't sure it really is useful. Your example is a good
one. It is not clear to me that it solves this problem without requiring
app-specific code. We might not want to change the compiler, but we do
want to at least have classes that make everything automatic.

--
Ron House ho***@usq.edu.au
http://www.sci.usq.edu.au/staff/house
Ethics website: http://www.sci.usq.edu.au/staff/house/goodness

Jun 29 '06 #26

Ron House

Kaz Kylheku wrote:

Under GC discipline, the destructor must be regarded only as a last
resort cleanup for these kinds of resources.

But this is exactly my question: Is it really a "must" or are we all
just assuming it's a must because we haven't spotted the way to do it
yet? Has anyone proved that it really is a must?

--
Ron House ho***@usq.edu.au
http://www.sci.usq.edu.au/staff/house
Ethics website: http://www.sci.usq.edu.au/staff/house/goodness

Jun 29 '06 #27

Alf P. Steinbach

* Ron House:

Kaz Kylheku wrote:
So the obvious thing is release the resource and change the state of
the object to indicate that the object does not have that resource.

Garbage collection is not incompatible with your program knowing when
to release a resource.
...

Your example of BEFORE and AFTER is interesting.

It's mostly bull. Kaz advocates zombie objects, like in Java. The idea
is that by introducing enough complexity, and by posting enough barely
different follow-ups, people will be smothered by details and not be
able to argue against (and I for one will not reply to every posting).

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Jun 29 '06 #28

Kaz Kylheku

Ron House wrote:

Kaz Kylheku wrote:
Ron House wrote:
Goalie_Ca wrote:

Thanks for those threads. I can see that in general people are divided
at every level on how to approach this problem. To me, the outsider, it
appears that it will not likely make this revision although there is
clearly support for having one included.

My question is a simple one: how do we combine destructors with GC?

The most straightforward way is for the garbage collector to know how
to invoke the equivalent of delete on the objects. (I would abhor a
scheme whereby GC'able objects have to inherit from some special base
class with a virtual destructor; it would be better if this was
intelligent somehow).

For some types of objects, calling the destructor at GC time might be
too late.

That's the reason finalisers aren't the equivalent of destructors, and

The general reason why destructors aren't the same thing as finalizers
is that they live in different programming languages.
the reason I ask if destructors can be combined with GC - without being
replaced by finalisers.

For example, is it possible to write a GC that cleans up immediately the
last pointer disappears, rather than in a general sweep, and still be
acceptably efficient?

So your definition of a destructor appears to be: something which runs
just after the pointer disappears. Whereas a finalizer is something
that runs just before the object is scavenged and re-entered into the
free store.

The problem is that the C++ destructor meets the latter definition a
lot more closely than the former.

Why is that event interesting when the program erases its last pointer
to an object?

I would argue that an interesting event is when the program loses the
last pointer to an object from an interesting subset of all the
pointers to that object.

Pointers which are not in that set behave semantically like weak
pointers.

For instance, suppose you have implemented some cache of objects from
which they expire based on some aging scheme. That cache has pointers
to the objects, of course. However, when /only/ that cache has a
pointer to some object, then, semantically, that object is practically
as good as garbage. If nobody grabs it before the time comes up, it
will be scavenged.

Moreover, certain actions may have to be taken when the object is
entered into the cache to be expired.

These kinds of schemes are found in C++ programs.

They are sometimes combined with reference counting too. I remember
working on systems where it was known that a certain module held a
reference to an object. Therefore, some clean up actions were triggered
on, guess what, the 2 -> 1 refcount transition. A notification was sent
out through the framework, and then that special module would drop
/its/ reference, which would trigger the destructor and delete.

So in other words, the 2 -> 1 refcount was the real disappearance of
the object, and that module effectively held only a weak pointer. Of
course there was nothing manifestly different about the reference that
it held; it was all in the semantics.

Aren't destructors being used for finalization in these situations?

Formally, what is a finalizer? It is a special entry created in the
memory manager which holds a weak pointer to some object, and a
function which is to be called when /only/ weak pointers to the object
remain, prior to that object being reclaimed.

That function itself is sometimes called a finalizer, transitively.
There is no reason why a C++ destructor cannot serve as that function.
That's what it's for: to perform the last rites on an object.

I think what you're asking for is to have an additional notification
when the program loses its last (non weak) reference to an object.

But computing that notification is the same thing as knowing that the
object's lifetime has ended. You call the destructor, and so you might
as well re-enter the object into the free storage. Once the destructor
has run, the object is reduced to just being memory. You can't do
anything with it, and so there is no point in registering any other
function on it.

Therefore, it doesn't make sense to want 'destructors /and/
finalizers'. It does make sense to have the choice to have delayed
/and/ timely finalization, for different objects.

To make use of that choice, it's necessary to have weak pointers. To
have a resource cleaned up in a timely way, the program has to be able
to indicate that certain references to the encapsulating object are
weak.

Then if the program enables that "just in time" garbage collection, it
will get the finalization trigger as soon as the last non-weak
reference is lost. The object is destroyed and all of the weak
references lapse into null values.
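
The behavior described here (finalization as soon as the last non-weak reference is lost, with weak references lapsing to null) is what std::shared_ptr and std::weak_ptr from later standards provide through reference counting rather than tracing collection. A minimal sketch, with Resource and the demo function invented for illustration:

```cpp
#include <cassert>
#include <memory>

// Records its own destruction through a caller-supplied flag.
struct Resource {
    explicit Resource(bool* r) : released(r) {}
    ~Resource() { *released = true; }
    bool* released;
};

// Returns true if the weak reference worked before and lapsed afterwards.
bool demo_weak_reference() {
    bool released = false;
    std::shared_ptr<Resource> strong = std::make_shared<Resource>(&released);
    std::weak_ptr<Resource> cache_entry = strong;  // does not keep it alive

    bool usable_before = (cache_entry.lock() != nullptr);

    strong.reset();                 // last non-weak reference lost
    bool finalized_now = released;  // the destructor has already run

    bool lapsed = cache_entry.expired() && cache_entry.lock() == nullptr;
    return usable_before && finalized_now && lapsed;
}
```

The cache in the earlier example would hold entries like cache_entry: they let the cache find live objects without being the reason those objects stay alive.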

This still leaves the problem of what to do if the program design wants
to only have the resource cleaned up, not the entire object: to have
the resource cleaned up at some point when the object is still
reachable by ordinary non-weak references. The object continues to be
used without that resource. Then the program is on its own anyway.

What it boils down to is this. You have some object O which holds a
resource handle R. The object O is shared by multiple references to it.
A module can either hold a reference to O, or not hold a reference to
O. Only through its reference to O can a module express its interest in
resource R. The problem is that holding or not holding a reference
is only a boolean value: interest yes, or interest no. But there are
two entities there, O and R. The boolean interest value doesn't hold
enough information to express interest in these two separately.

What the program needs is a way to express two kinds of strong
references:

- references which express interest in O with or without the resource
R.
- references which express interest in O with the resource R.

Now when there are no more references of the second kind, when the only
references to O that remain are weak references and strong references
which do not care whether R is valid, the resource R can be deallocated
at that point.

How can you implement such a scheme? One way is to implement the second
type of reference, expressing, "interest in O with the resource R",
using reference counting. What you can do is allocate a second object,
call it P, which holds a reference count, a reference to O, and a
method to invoke when the count hits zero. Modules which are interested
in R manipulate pointers to P instead of O. By holding references to P
instead, they express a special interest in O related to the meaning of
the reference count in P.

Modules manipulating P take care to manage the reference count among
themselves: when they copy the pointer, they bump up the refcount, and
when they erase it, they drop the refcount.

When the refcount hits zero, the special method is run on P, which does
something with object O: namely, it destroys resource R, and replaces
that handle with an invalid value, an action that can be encapsulated
in a method on O.

So provided that the refcounts are accurately managed, and the
appropriate kind of reference is used by every module to express the
correct kind of interest, the resource will be accurately managed.
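
The O/P arrangement can be sketched with std::shared_ptr doing the counting. In this illustration a shared_ptr to Object stands in for an ordinary collected reference to O, and a shared_ptr to ResourceInterest plays the role of P; the class names, the integer handle, and the demo function are all assumptions.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// O: an object that can outlive its resource R.
class Object {
public:
    bool has_resource() const { return handle_ != 0; }
    void release_resource() { handle_ = 0; }  // stand-in for destroying R
private:
    int handle_ = 1;  // hypothetical handle; nonzero means valid
};

// P: its lifetime expresses "interest in O with the resource R".
class ResourceInterest {
public:
    explicit ResourceInterest(std::shared_ptr<Object> o) : o_(std::move(o)) {}
    // Runs when the count of P references hits zero: release R, keep O.
    ~ResourceInterest() { o_->release_resource(); }
    Object& object() { return *o_; }
private:
    std::shared_ptr<Object> o_;
};

// Returns true if R survived one interested module leaving and died with the last.
bool demo_two_kinds_of_interest() {
    // Interest in O with or without R.
    std::shared_ptr<Object> o = std::make_shared<Object>();

    // Two modules interested in O with R.
    std::shared_ptr<ResourceInterest> p1 = std::make_shared<ResourceInterest>(o);
    std::shared_ptr<ResourceInterest> p2 = p1;

    p1.reset();
    bool still_held = o->has_resource();  // p2 is still interested

    p2.reset();                           // last interest gone: R is released
    return still_held && !o->has_resource();
}
```

After the last P reference is gone, O is still reachable through o and still usable, just without its resource, which is the state the post describes.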

No help is needed from the garbage collector; it can delay
finalization.

Or, instead of refcounting, a special memory arena could be used for
these P objects, an arena where garbage collection is greedy: it tracks
the lifetime of these objects accurately. When all the references to P
objects disappear, it is reclaimed right away. Its finalizer action
performs the resource cleanup on P. So the effect is like that of
reference counting.

These P objects could hold references to each other, denoting
hierarchies of interest. Suppose that object P1 expresses interest
related to resource R1 in object O. Object P2 expresses interest related
to resource R2 in that object. How do you express total interest in O
having both resources? By a third object P12 which holds a reference to
O, and also to P1 and P2. The cleanup method of P12 does nothing with
O. All it does is null out P12's references to P1 and P2. If doing so
erases the last reference to P1, then P1's cleanup action is triggered,
releasing the resource O.R1. Likewise for P2.

Jun 29 '06 #29
