By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
431,795 Members | 1,216 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 431,795 IT Pros & Developers. It's quick & easy.

Thread problem (constructor+assignement in one step?)

P: n/a
I am trying to create a garbage collection class in C++ to collect
instances of a specific class (so no general gc). The approach is to
use smart pointers and a mark and a simple sweep gc.

smartptr<B> pB;
pB = new B(args);

The constructor of B has been adjusted to register new instances with
the garbage collector. The constructor of smartptr has also been
adjusted to register new instances of smart pointers with the garbage
collector.
To collect, the gc now iterates all known smart pointers and marks the
referenced objects as active. After this, it iterates all known objects
and deletes all thhose that have not been marked active. The gc will be
called explicitly.

This works fine so far when using a single thread. However, when using
multiple threads, there is a problem.
The assignement above is essentially two steps. First create the object,
then do the assignement. What if between these two steps another thread
runs the gc? It will find the new created object not (yet) referenced
and delete it.

What would be the proper solution to prevent this? At a first thought,
you might come to
gc.lock();
pB = new B(args);
gc.unlock();

But then consider
B* f(...)
{
...
return new B(args);
}

...
pB = f(...);

Here, construction and assignement are not on one line. Thus using the
lock strategy will not work. Wrapping the whole function call with
lock/unlock will result in bad performance.

(In fact, the reason of using a gc like this is a little more
complicated. At the bottom line: no I cannot move to reference counting
or one of the smart pointer implementations).

I am looking for any suggestions.
Jul 22 '05 #1
Share this Question
Share on Google+
16 Replies


P: n/a
On Tue, 05 Oct 2004 10:29:01 +0200, yuraukar <yu******@hotmail.com>
wrote:
I am trying to create a garbage collection class in C++ to collect
instances of a specific class (so no general gc). The approach is to
use smart pointers and a mark and a simple sweep gc.

smartptr<B> pB;
pB = new B(args);
Are you sure it is sensible to offer direct assignment of a pointer to
a smart pointer? You probably should make the constructor explicit.
The constructor of B has been adjusted to register new instances with
the garbage collector.
Isn't that the problem? Why does B have to know about the garbage
collector?

The constructor of smartptr has also beenadjusted to register new instances of smart pointers with the garbage
collector.
Why doesn't it also register the passed pointer?
I am looking for any suggestions.


Seems like if you do the above, your problem is solved. Alternatively,
you could create a factory function to create B objects that returns a
smartptr<B>.

Tom
Jul 22 '05 #2

P: n/a
yuraukar wrote:
I am trying to create a garbage collection class in C++ to collect
instances of a specific class (so no general gc). The approach is to
use smart pointers and a mark and a simple sweep gc.

smartptr<B> pB;
pB = new B(args);

The constructor of B has been adjusted to register new instances with
the garbage collector. The constructor of smartptr has also been
adjusted to register new instances of smart pointers with the garbage
collector.


Perhaps put a flag in instances of B that indicates whether the instance has
been assigned to any smartptr yet, and change smartptr<B>::operator= to set
this flag. Then don't collect instances of B until they have been assigned.

Of course real GCs avoid this problem by scanning the stack and registers,
and by using GC safe-points.

--
David Hopwood <da******************@blueyonder.co.uk>
Jul 22 '05 #3

P: n/a


David Hopwood wrote:

yuraukar wrote:
I am trying to create a garbage collection class in C++ to collect
instances of a specific class (so no general gc). The approach is to
use smart pointers and a mark and a simple sweep gc.

smartptr<B> pB;
pB = new B(args);

The constructor of B has been adjusted to register new instances with
the garbage collector. The constructor of smartptr has also been
adjusted to register new instances of smart pointers with the garbage
collector.


Perhaps put a flag in instances of B that indicates whether the instance has
been assigned to any smartptr yet, and change smartptr<B>::operator= to set
this flag. Then don't collect instances of B until they have been assigned.

Of course real GCs avoid this problem by scanning the stack and registers,
and by using GC safe-points.


What's a GC safe-point (I sort of have an idea) and how do they interact with
the GC? I know how to do concurrent GC but I'm wondering what the conventional
approach is.

Joe Seigh
Jul 22 '05 #4

P: n/a
Joe Seigh wrote:


David Hopwood wrote:

Of course real GCs avoid this problem by scanning the stack and
registers, and by using GC safe-points.


What's a GC safe-point (I sort of have an idea) and how do they interact
with
the GC? I know how to do concurrent GC but I'm wondering what the
conventional approach is.


http://wefts.sourceforge.net/wefts-a...alSection.html

I designed that exactly for mark-* family multithreaded GC, although it may
be used also for other reasons.

Giancarlo Niccolai.
Jul 22 '05 #5

P: n/a
yuraukar wrote:
I am trying to create a garbage collection class in C++ to collect
instances of a specific class (so no general gc). The approach is to
use smart pointers and a mark and a simple sweep gc.

smartptr<B> pB;
pB = new B(args);

The constructor of B has been adjusted to register new instances with
the garbage collector. The constructor of smartptr has also been
adjusted to register new instances of smart pointers with the garbage
collector.
To collect, the gc now iterates all known smart pointers and marks the
referenced objects as active. After this, it iterates all known objects
and deletes all thhose that have not been marked active. The gc will be
called explicitly.

This works fine so far when using a single thread. However, when using
multiple threads, there is a problem.
The assignement above is essentially two steps. First create the object,
then do the assignement. What if between these two steps another thread
runs the gc? It will find the new created object not (yet) referenced
and delete it.

What would be the proper solution to prevent this? At a first thought,
you might come to
gc.lock();
pB = new B(args);
gc.unlock();

But then consider
B* f(...)
{
...
return new B(args);
}

...
pB = f(...);

Here, construction and assignement are not on one line.


Even if it were on a single line that does not guarantee that everything
on that line will be executed atomically. In fact there is no way in C++
to insure that (even simple) statements are not interrupted during their
execution. Even a simple assignment can be translated into several
machine code instructions, which means another thread can interrupt even
halfway during the assignment. You will need to you the synchronisation
mechanisms offered by the OS you are using.

Maybe you are better of by marking inactive objects as opposed to
marking active objects.

--
Peter van Merkerk
peter.van.merkerk(at)dse.nl
Jul 22 '05 #6

P: n/a

Here, construction and assignement are not on one line.

Even if it were on a single line that does not guarantee that everything
on that line will be executed atomically. In fact there is no way in C++
to insure that (even simple) statements are not interrupted during their
execution. Even a simple assignment can be translated into several
machine code instructions, which means another thread can interrupt even
halfway during the assignment. You will need to you the synchronisation
mechanisms offered by the OS you are using.


I am aware of that. My point was that in the example above, there is no
way of wrapping the line with a gc.lock()/gc.unlock() call.

Maybe you are better of by marking inactive objects as opposed to
marking active objects.


And how would this solve my problem? As far as I see, the issue is that
the new created instance is not referenced at the time the gc is run
because it is in some intermediate state (not yet assigned).
Jul 22 '05 #7

P: n/a
David Hopwood wrote:


Perhaps put a flag in instances of B that indicates whether the instance
has
been assigned to any smartptr yet, and change smartptr<B>::operator= to set
this flag. Then don't collect instances of B until they have been assigned.
And if for some reason, the instance is never assigned, the gc will
never collect it...
Of course real GCs avoid this problem by scanning the stack and registers,
and by using GC safe-points.


Jul 22 '05 #8

P: n/a
Tom Widmer wrote:

Are you sure it is sensible to offer direct assignment of a pointer to
a smart pointer? You probably should make the constructor explicit.
Not sure what you mean here - what would this look like?
The constructor of B has been adjusted to register new instances with
the garbage collector.

Isn't that the problem? Why does B have to know about the garbage
collector?

Reason is that I want to do gc for a single class only. As far as I see
it any mark-and-sweep gc needs to know or be able to find out (a) all
instances and (b) all references to these.

General purpose GCs try to find out about (a) and (b) by scanning heap,
stack, registers, etc. This seems complicated, OS and probably
machine-dependent.
Given my constraints, I could as well actively maintain the two lists.
This is what I have done.
The constructor of smartptr has also been
adjusted to register new instances of smart pointers with the garbage
collector.

Why doesn't it also register the passed pointer?

The smart pointer registers itself, it contains a pointer to the
instance. When the GC is run, it can ask the smart pointer to which
object it points, so no need to change the registration with each
assignement.

Seems like if you do the above, your problem is solved. Alternatively,
you could create a factory function to create B objects that returns a
smartptr<B>.

Factory function seems like a solution to me. I didn't understand the
other solution.
Jul 22 '05 #9

P: n/a
On Tue, 05 Oct 2004 16:48:20 +0200, yuraukar <yu******@hotmail.com>
wrote:
Tom Widmer wrote:

Are you sure it is sensible to offer direct assignment of a pointer to
a smart pointer? You probably should make the constructor explicit.
Not sure what you mean here - what would this look like?


explicit smartptr(T* ptr) //explicit added to signature
:m_ptr(ptr)
{
registerMe();
registerPtr();
}

Now you can't write:

smartptr<B> ptr = new B();

but instead you might write
smartptr<B> ptr(new B());
The constructor of B has been adjusted to register new instances with
the garbage collector.

Isn't that the problem? Why does B have to know about the garbage
collector?

Reason is that I want to do gc for a single class only. As far as I see
it any mark-and-sweep gc needs to know or be able to find out (a) all
instances and (b) all references to these.


Yes, but the B class doesn't necessarily need to inform the GC of its
existence. Instead, the smart pointer constructor can register the B
instances that it controls. My "registerPtr()" call above is meant to
do that.
General purpose GCs try to find out about (a) and (b) by scanning heap,
stack, registers, etc. This seems complicated, OS and probably
machine-dependent.
Given my constraints, I could as well actively maintain the two lists.
This is what I have done.


Ok.
The constructor of smartptr has also been
adjusted to register new instances of smart pointers with the garbage
collector.

Why doesn't it also register the passed pointer?

The smart pointer registers itself, it contains a pointer to the
instance. When the GC is run, it can ask the smart pointer to which
object it points, so no need to change the registration with each
assignement.


But the first time a "raw" pointer is put under the control of a smart
pointer, isn't that the best time to register that "raw" pointer with
the GC system?

Seems like if you do the above, your problem is solved. Alternatively,
you could create a factory function to create B objects that returns a
smartptr<B>.

Factory function seems like a solution to me. I didn't understand the
other solution.


Basically, "B" doesn't attempt to register with the GC. Instead, the
constructor of smartptr that takes a B* will register that B* with the
GC system. This suggests that your usage should always be:

smartptr<B> ptr(new B); //ptr constructor registers ptr *and* new B.
and you should generally avoid:
B* ptr = new B;

This may not be appropriate, depending on whether Bs are always
created and immediate used to initialize a smartptr<B>.

Tom
Jul 22 '05 #10

P: n/a
On Tue, 05 Oct 2004 16:37:04 +0200, yuraukar <yu******@hotmail.com>
wrote:
David Hopwood wrote:


Perhaps put a flag in instances of B that indicates whether the instance
has
been assigned to any smartptr yet, and change smartptr<B>::operator= to set
this flag. Then don't collect instances of B until they have been assigned.

And if for some reason, the instance is never assigned, the gc will
never collect it...


Is this ever going to happen (assuming you are writing the code)?
Isn't it what you want in any case? e.g.

B b; //don't want this to be collected!
smartptr<B> ptr(new B); //do want this to be collected.

Why can't you use reference counting? A summary of your usage of both
the raw B pointers and the smartptr<B> classes would help point to a
solution...

Tom
Jul 22 '05 #11

P: n/a


Tom Widmer wrote:


Now you can't write:

smartptr<B> ptr = new B();

but instead you might write
smartptr<B> ptr(new B());

Why is the latter preferable?

Joe Seigh
Jul 22 '05 #12

P: n/a
On Tue, 05 Oct 2004 16:36:32 GMT, Joe Seigh <js*******@xemaps.com>
wrote:


Tom Widmer wrote:


Now you can't write:

smartptr<B> ptr = new B();

but instead you might write
smartptr<B> ptr(new B());

Why is the latter preferable?


That particular syntax isn't particularly preferable, rather it is the
prevention of implicit conversions from pointers to smartpointers in
general that is preferable, since you can end up accidentally
assigning a pointer to a smart pointer that you didn't intend to. e.g.

void f(smartptr<B> p);

B b;
f(&b); //whoops.

Tom
Jul 22 '05 #13

P: n/a


Tom Widmer wrote:

On Tue, 05 Oct 2004 16:36:32 GMT, Joe Seigh <js*******@xemaps.com>
wrote:


Tom Widmer wrote:


Now you can't write:

smartptr<B> ptr = new B();

but instead you might write
smartptr<B> ptr(new B());

Why is the latter preferable?


That particular syntax isn't particularly preferable, rather it is the
prevention of implicit conversions from pointers to smartpointers in
general that is preferable, since you can end up accidentally
assigning a pointer to a smart pointer that you didn't intend to. e.g.

void f(smartptr<B> p);

B b;
f(&b); //whoops.

Tom

Jul 22 '05 #14

P: n/a


Tom Widmer wrote:

On Tue, 05 Oct 2004 16:36:32 GMT, Joe Seigh <js*******@xemaps.com>
wrote:


Tom Widmer wrote:


Now you can't write:

smartptr<B> ptr = new B();

but instead you might write
smartptr<B> ptr(new B());

Why is the latter preferable?


That particular syntax isn't particularly preferable, rather it is the
prevention of implicit conversions from pointers to smartpointers in
general that is preferable, since you can end up accidentally
assigning a pointer to a smart pointer that you didn't intend to. e.g.

void f(smartptr<B> p);

B b;
f(&b); //whoops.


You mean assigning the same object to two separate smart pointers. That
doesn't really solve the problem. You have to resort to the factory pattern
which makes the ctor private and sort of makes templates somewhat less useful
since you have to do some custom coding for each class you want to use with
smart pointers. It's a C++ problem which won't ever get fixed since you can
sort of do pointer abstraction in C++, just not very well. You can contrive
all you want but it won't fix it and it won't get fixed.

Joe Seigh
Jul 22 '05 #15

P: n/a
Tom Widmer wrote:

And if for some reason, the instance is never assigned, the gc will
never collect it...

Is this ever going to happen (assuming you are writing the code)?
Isn't it what you want in any case? e.g.

B b; //don't want this to be collected!
smartptr<B> ptr(new B); //do want this to be collected.

Why can't you use reference counting? A summary of your usage of both
the raw B pointers and the smartptr<B> classes would help point to a
solution...


In fact my current implementation has been using ref counting and ran
into serious problems when circular references had to be supported. So
using a GC should resolve the problem.

In fact, instances of B could have references to other instanced of B
and so forth. Circular references are allowed. Look at this as a group.
Now as long as any thread holds a pointer to an instance in a group, the
group should not be deleted.
I tried introducing the group concept as well. Did ref counting on the
group, but then... merging groups would be an issue as soon as you
construct a reference from an instance in one group to another.
At the bottom line: GC seems to be like the solution, provided I can
find a good GC implementation.

The GC holds a list of B's and smart pointers. Each B can reference
other B's with normal pointers. The list of referenced B's can be
obtained with something like B->getReferencedBs().
The m&s GC interates the smart pointers, marks the B's, marks B's
referenced by these B's etc. At the end it deletes all not marked B's.

So I have two pointers in my code: smart pointers and normal pointers
(weak pointers).

Your other post recommending that constructor of B does not register the
instance with the GC, but the smart pointer does sounds good. The
disadvantage would be that B's never associated with smart pointers
would never be seen by the GC.

I will think of that.
Jul 22 '05 #16

P: n/a
On Wed, 06 Oct 2004 16:11:54 +0200, yuraukar <yu******@hotmail.com>
wrote:
Tom Widmer wrote:

And if for some reason, the instance is never assigned, the gc will
never collect it...

Is this ever going to happen (assuming you are writing the code)?
Isn't it what you want in any case? e.g.

B b; //don't want this to be collected!
smartptr<B> ptr(new B); //do want this to be collected.

Why can't you use reference counting? A summary of your usage of both
the raw B pointers and the smartptr<B> classes would help point to a
solution...


In fact my current implementation has been using ref counting and ran
into serious problems when circular references had to be supported. So
using a GC should resolve the problem.


You can use a weak_ptr style object to break cycles. Using
boost::shared_ptr and boost::weak_ptr may to be less error prone and a
lot more efficient than writing your own GC implementation. However, I
agree that reference counting isn't always appropriate.
In fact, instances of B could have references to other instanced of B
and so forth. Circular references are allowed. Look at this as a group.
Now as long as any thread holds a pointer to an instance in a group, the
group should not be deleted.
I tried introducing the group concept as well. Did ref counting on the
group, but then... merging groups would be an issue as soon as you
construct a reference from an instance in one group to another.
At the bottom line: GC seems to be like the solution, provided I can
find a good GC implementation.

The GC holds a list of B's and smart pointers. Each B can reference
other B's with normal pointers. The list of referenced B's can be
obtained with something like B->getReferencedBs().
Ahh, this is why B must be an integral part of the GC mechanism.
The m&s GC interates the smart pointers, marks the B's, marks B's
referenced by these B's etc. At the end it deletes all not marked B's.
Sounds reasonable.
So I have two pointers in my code: smart pointers and normal pointers
(weak pointers).

Your other post recommending that constructor of B does not register the
instance with the GC, but the smart pointer does sounds good. The
disadvantage would be that B's never associated with smart pointers
would never be seen by the GC.

I will think of that.


It does solve your original problem at least. I suppose you could also
provide an explicit method to register a B with the gc. Obviously you
would only call that when a B was safely referenceable by the GC, due
to be being referenced by another B. Alternatively, it could be used
as a kind of delayed delete call.

Tom
Jul 22 '05 #17

This discussion thread is closed

Replies have been disabled for this discussion.