IDisposable, using(), RAII and structs [Discussion]

codymanix

Last night I had several thoughts about RAII and want to discuss them a bit.

Why doesn't C# support destructors in structs? Wouldn't that make RAII
possible like in C++? When the struct goes out of scope, the dtor could be
called immediately (no GC needed).

For that, you wouldn't have to declare the whole File class as a struct (which
would be bad for performance when File has a lot of data members).
Instead, a thin wrapper would be a good solution:

public struct MyFile
{
File f;
public MyFile(string p) { f=new File(p); }
~MyFile() { f.Close(); }
}

That would do it.
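
For comparison, here is a minimal sketch of the closest thing C# offers today: a struct that implements IDisposable and is consumed through using(). It is only an illustration under assumptions: FileStream stands in for the hypothetical File class of the post, and IsOpen and MyFileDemo are made-up names.

```csharp
using System;
using System.IO;

// Hypothetical thin wrapper; FileStream stands in for the "File" class of the post.
public struct MyFile : IDisposable
{
    private readonly FileStream f;

    public MyFile(string path)
    {
        f = new FileStream(path, FileMode.OpenOrCreate, FileAccess.ReadWrite);
    }

    // True while the underlying stream is still open (illustrative helper).
    public bool IsOpen => f != null && f.CanRead;

    // Runs deterministically when the using block is left, no GC involved.
    public void Dispose()
    {
        f?.Dispose();
    }
}

public static class MyFileDemo
{
    // Opens the file inside a using block and reports whether it is still open afterwards.
    public static bool OpenAndClose(string path)
    {
        MyFile copy;
        using (var file = new MyFile(path))
        {
            copy = file; // a struct copy shares the same FileStream
        }
        return copy.IsOpen; // the stream was closed when the block ended
    }
}
```

Note that nothing forces the caller to write the using block, and copies of the struct share one stream, which is exactly the gap the proposal is about.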

I have a second thought that went through my mind. What about introducing an
interface which is recognized by the compiler and forces usage of the
using() statement?

public interface IDisposableForce : IDisposable {}

A second possibility would be declaring an attribute which marks
IDisposable-implementing classes and states whether using() should be
mandatory or at least suggested.

enum WarnLevel { NoWarn, WarnLvl1, WarnLvl2, WarnLvl3, WarnLvl4, GenerateError }

[AttributeUsage(AttributeTargets.Class)]
class UsingAttribute : Attribute
{
    public UsingAttribute(WarnLevel w) {}
}

// If warning level 3 or higher is set, a warning is issued
// when this class is used without a using() statement.
[Using(WarnLevel.WarnLvl3)]
public class MyFile : File
{
    public MyFile(string p) : base(p) {}
}

What do you think?

--
cody

[Freeware, Games and Humor]
www.deutronium.de.vu || www.deutronium.tk

Nov 15 '05


Andreas Huber

Eric,

> Sometimes reference counting IS a good thing, and I believe that instead of
> the IDisposable pattern being the catch-all (and hence unguaranteed,
> because .NET newbies don't realize that, say, a SqlConnection really needs
> to be "Close"d or "Dispose"d before it goes out of scope), there should be a
> special type that the compiler can look at, disallowing the type from
> referencing OTHER reference-counted types (which prevents the memory leak of
> two ref-counted objects pointing at each other).

The following questions come to mind:
- Could a refcounted (RCed) object hold references to GCed objects?
- Could a GCed object hold references to RCed objects?

Regards,

Andreas

Nov 15 '05 #21

Andreas Huber

Cody,

> > Even an explicit cast does not remove the contradiction. It still
> > means that a type requiring deterministic cleanup is subjected to
> > non-deterministic cleanup. That is not normally a good idea and would
> > more often than not lead to subtle errors, IMO.
>
> This is no error but just a performance problem. Forcing the programmer to
> use an explicit cast is good for preventing unintentional boxing. Maybe one
> could go so far as to forbid boxing of refcounted objects.

> > > > GC was introduced to solve the circular reference problem. If .NET
> > > > supported a special type of class that is reference-counted then there
> > > > would be no way to keep people from introducing cycles.
> > >
> > > There would be no problem: the mark&sweep GC can easily detect such lost
> > > objects. Using refcounting & mark/sweep together can be a powerful
> > > feature IMO.
> >
> > I don't think so. The initial rationale for introducing
> > RAII/refcounting in .NET was that it would help to implement classes
> > that need deterministic clean-up, right? If you let GC clean up the
> > cycles then those objects are finalized non-deterministically, which
> > defeats the original purpose.
> > I think deterministic clean-up via RAII/refcounting and
> > non-deterministic clean-up via GC are two very different beasts. IMO,
> > there is no way to mix the two without running into contradictions.
>
> Where is the problem? These Disposable classes use reference counting for
> deterministic GC. The mark&sweep GC is just a failsafe for the case
> when refcounting fails for some reason, to prevent memory leaks.

What is your rationale for introducing reference counting then? More
specifically: Why is reference-counted *memory* management better than
garbage collection?

In the title of this thread IDisposable & using are mentioned. You
don't need those for *memory* management!

> > Therefore, for some situations manual cleanup through
> > IDisposable/using() is pretty much inevitable.
>
> It is still unsafe and crappy.

You're not alone with that thinking (I once had similar concerns).
Many people have tried to come up with a better solution but to my
knowledge none has ever come up with something substantial. Have you
followed the link I posted? The guy that wrote that article is an MS
person involved in the design of .NET. He explains how he also once
was a strong proponent of refcounting but became more and more
convinced that a GC-only strategy is better the more he explored the
exact semantics of a twin (GC and refcounting) solution.

Moreover, isn't it interesting that all other GCed languages (Java,
Python, etc.) work just like .NET does? There is always something
similar to the .NET dispose pattern. Scores of very smart people have
designed those languages and it speaks volumes that they haven't come
up with a better solution either.

BTW, to my knowledge, none of the other GCed languages supports
something similar to using.

> It is not very funny to note that you can write programs in C++ that are
> safer than in C#. Wasn't that what C# was invented for? Safety? No more
> memory leaks? Mission failed.

You cannot produce what is classically known as a *memory* leak in a
garbage collected language. This is because of the very definition of how
a garbage collector works.
If you believe you can, please post a program that does produce
*memory* leaks.

Regards,

Andreas

Nov 15 '05 #22

cody

> > > I don't think so. The initial rationale for introducing
> > > RAII/refcounting in .NET was that it would help to implement classes
> > > that need deterministic clean-up, right? If you let GC clean up the
> > > cycles then those objects are finalized non-deterministically, which
> > > defeats the original purpose.
> > > I think deterministic clean-up via RAII/refcounting and
> > > non-deterministic clean-up via GC are two very different beasts. IMO,
> > > there is no way to mix the two without running into contradictions.
> >
> > Where is the problem? These Disposable classes use reference counting for
> > deterministic GC. The mark&sweep GC is just a failsafe for the case
> > when refcounting fails for some reason, to prevent memory leaks.
>
> What is your rationale for introducing reference counting then? More
> specifically: Why is reference-counted *memory* management better than
> garbage collection?

I wasn't talking about memory management, I'm talking about freeing native
resources such as handles. With mark&sweep you can run into problems because
it takes time until the GC kicks in and frees them. But native resource
handles may be very limited, so you can quickly run out of them.

With refcounting, you can guarantee that Dispose() is called immediately
when the refcount reaches zero. Note that disposing an object does not mean
freeing the object itself. The managed object itself can then be freed later
by mark & sweep.
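
The idea of "dispose at refcount zero, free the memory later via GC" can be sketched by hand in today's C#. This is only an illustration under assumptions: RefCounted<T>, AddRef and Release are hypothetical names, nothing in the runtime calls them automatically, and there is no guard against AddRef after the count has already reached zero.

```csharp
using System;
using System.Threading;

// Hypothetical helper: disposes the wrapped resource when the last holder releases it.
// The wrapper and the resource object themselves are still reclaimed by the GC later.
public sealed class RefCounted<T> where T : class, IDisposable
{
    private readonly T resource;
    private int count = 1; // the creator holds the first reference

    public RefCounted(T resource)
    {
        this.resource = resource ?? throw new ArgumentNullException(nameof(resource));
    }

    public T Value => resource;

    public int Count => Volatile.Read(ref count);

    // Registers one more holder of the resource.
    public void AddRef()
    {
        Interlocked.Increment(ref count);
    }

    // Returns true if this call dropped the count to zero and disposed the resource.
    public bool Release()
    {
        if (Interlocked.Decrement(ref count) != 0) return false;
        resource.Dispose();
        return true;
    }
}
```

Every holder has to pair AddRef with Release manually, and two such objects releasing each other only from their own Dispose would never reach zero, which is the cycle problem discussed in this thread.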

> > It is still unsafe and crappy.
>
> You're not alone with that thinking (I once had similar concerns).
> Many people have tried to come up with a better solution but to my
> knowledge none has ever come up with something substantial. Have you
> followed the link I posted? The guy that wrote that article is an MS
> person involved in the design of .NET. He explains how he also once
> was a strong proponent of refcounting but became more and more
> convinced that a GC-only strategy is better the more he explored the
> exact semantics of a twin (GC and refcounting) solution.

I understand that mark&sweep is better for freeing normal memory (when there
is enough of it). But you will run into problems when using mark&sweep with
limited resources. Make a loop that opens network connections. The
mark&sweep system cannot react that fast: there will be thousands of open
connections and the mark&sweep still does nothing. It doesn't even have a
clue that the app will run out of handles soon. With memory management, the
GC notices when there is not enough memory and runs automatically.

I will follow the link you posted; maybe I'll get further insight.

> Moreover, isn't it interesting that all other GCed languages (Java,
> Python, etc.) work just like .NET does? There is always something
> similar to the .NET dispose pattern. Scores of very smart people have
> designed those languages and it speaks volumes that they haven't come
> up with a better solution either.

Maybe they were very smart, but nobody is perfect. If they had done a perfect
job, we would never need further inventions and improvements. In a hundred
years people would still use .NET with its mark&sweep because it is the
perfect solution which cannot be improved. Years ago, when they invented
assembler, they would never have thought it possible that unused memory
could be freed automatically.

> You cannot produce what is classically known as a *memory* leak in a
> garbage collected language.

Maybe not a memory leak, but a native resource leak. Additionally, finalizers
are not even guaranteed to run, even when the program exits.

I wish you a happy Nikolaustag (not sure what its name is in English) :-)

--
cody

[Freeware, Games and Humor]
www.deutronium.de.vu || www.deutronium.tk

Nov 15 '05 #23

Andreas Huber

Cody,

[snip]

> > > Where is the problem? These Disposable classes use reference counting for
> > > deterministic GC. The mark&sweep GC is just a failsafe for the case
> > > when refcounting fails for some reason, to prevent memory leaks.
> >
> > What is your rationale for introducing reference counting then? More
> > specifically: Why is reference-counted *memory* management better than
> > garbage collection?
>
> I wasn't talking about memory management, I'm talking about freeing native
> resources such as handles.

Ok, you meant resource management.

> With mark&sweep you can run into problems because it takes time until the
> GC kicks in and frees them. But native resource handles may be very
> limited, so you can quickly run out of them.

Resources limited by factors other than memory are just one problem,
and not even the worst, because the GC can still clean up for you when
you need very few resources.
The following two problems are much worse:
1. Cleanup affecting multiple objects. For example, a StreamWriter
object internally holds its own buffer and a reference to a Stream
object. If you forget to call Dispose on the StreamWriter object and
it is finalized later there is *no way* it could empty the buffer into
the Stream. This is because the Stream also has a finalizer, which
might have been called before the finalizer of the StreamWriter
object. The effect is that the Stream is closed all right but you will
inevitably lose the data in the StreamWriter buffer. Please note that
garbage collection can *never* save you here. The data is always lost!
2. Non-shareable resources like e.g. FileStream. A FileStream object
holds an *exclusive* lock on the file it is currently writing to. If
you fail to call Dispose here and try to reopen the file a bit later
you are almost guaranteed to be greeted with an exception. This is
because the first FileStream object has most probably not yet been
finalized and therefore never had the chance to close the file.
Please note that GC can almost never save you here because it is
unlikely that it will consistently kick in before you try to reopen
the file.
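
A minimal sketch of how both problems are avoided when the programmer states the cleanup point explicitly. The class and method names (FlushDemo, WriteThenReopen) are made up for illustration:

```csharp
using System;
using System.IO;

public static class FlushDemo
{
    // Writes text through a StreamWriter, then immediately reopens the file and reads it back.
    public static string WriteThenReopen(string path, string text)
    {
        using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None))
        using (var writer = new StreamWriter(stream))
        {
            writer.Write(text); // may still sit in the writer's internal buffer here
        } // the writer is disposed first (flushing its buffer), then the stream is closed

        // Problem 2: the exclusive handle is already released, so reopening works right away.
        // Problem 1: the buffered text has reached the file, so nothing is lost.
        return File.ReadAllText(path);
    }
}
```

Without the using blocks, whether the text reaches the file and when the file can be reopened would depend on finalization order and timing.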

Now, let's get back to your original proposal: You wanted to use
refcounting for classes that need deterministic clean-up. If
refcounting fails because there are cycles you wanted to let the GC
take care of it.
I think it should be pretty obvious that ***neither refcounting nor
GC*** is capable of cleaning up correctly when you have objects of
type 1 or 2 forming cycles. The programmer needs to say when to clean
up such objects and that's exactly what Dispose() is for.
Yes, there is a way to fix problem 1 but it is not very convincing as
it forces programmers to avoid certain types of cycles and hurts
finalization performance. Since problem no. 2 cannot be fixed anyway
the .NET designers decided to not trade implementation difficulties
and worse performance for only slightly easier cleanup.

> With refcounting, you can guarantee that Dispose() is called immediately
> when the refcount reaches zero.

Only when there are no cycles.

> Note that disposing an object does not mean freeing the object itself.
> The managed object itself can then be freed later by mark & sweep.

I am very aware of that.

> > You're not alone with that thinking (I once had similar concerns).
> > Many people have tried to come up with a better solution but to my
> > knowledge none has ever come up with something substantial. Have you
> > followed the link I posted? The guy that wrote that article is an MS
> > person involved in the design of .NET. He explains how he also once
> > was a strong proponent of refcounting but became more and more
> > convinced that a GC-only strategy is better the more he explored the
> > exact semantics of a twin (GC and refcounting) solution.
>
> I understand that mark&sweep is better for freeing normal memory (when there
> is enough of it). But you will run into problems when using mark&sweep with
> limited resources. Make a loop that opens network connections. The
> mark&sweep system cannot react that fast: there will be thousands of open
> connections and the mark&sweep still does nothing. It doesn't even have a
> clue that the app will run out of handles soon. With memory management, the
> GC notices when there is not enough memory and runs automatically.

Again, I'm very aware of that. However, what you say here is not *the*
problem why the Dispose pattern is necessary. For example, you could
imagine a system that has its own GC for every shareable resource type
that is limited by other factors than memory.

[snip]
> > Moreover, isn't it interesting that all other GCed languages (Java,
> > Python, etc.) work just like .NET does? There is always something
> > similar to the .NET dispose pattern. Scores of very smart people have
> > designed those languages and it speaks volumes that they haven't come
> > up with a better solution either.
>
> Maybe they were very smart, but nobody is perfect. If they had done a perfect
> job, we would never need further inventions and improvements. In a hundred
> years people would still use .NET with its mark&sweep because it is the
> perfect solution which cannot be improved. Years ago, when they invented
> assembler, they would never have thought it possible that unused memory
> could be freed automatically.

As I tried to explain above, the problem is *systemic* and has nothing
to do with language designers not being smart enough or a GC not being
perfect or technology not being available. There simply is no
practical way that certain types of cleanup could be performed
automatically *and* correctly.
In theory fully automatic cleanup is possible though: You simply let
the GC run whenever a reference goes out of scope. However, it should
be clear that doing so wastes ridiculous amounts of precious cycles on
something humans are so much better at.

> Additionally, finalizers are not even guaranteed to run, even when the
> program exits.

This is only true if your program does funny or very lengthy things
during finalization (e.g. creating objects or writing s..tloads of
data to the disk). The finalizer thread will call all finalizers as
long as the number of finalizable objects does not increase and the
whole finalization does not take longer than somewhere around 40
seconds (I forgot the actual number).

> I wish you a happy Nikolaustag (not sure what its name is in English) :-)

Thanks, I wish you the same. I know I'm late ;-). BTW, I'm actually
Swiss, so there's no need to translate...

Regards,

Andreas

Nov 15 '05 #24

Eric Newton

Andreas, (without thoroughly examining the idea), wouldn't it be fine to
A) create a refcounted type
B) require that a refcounted type may NOT directly or indirectly refer to
another refcounted type, verifiable via compiler and runtime (when using
generic object references, or not allow refcounted types to hold object
references at all...)

I believe that could work... and prevents the circular reference that
plagued rich COM object models...

(And BTW, how did MS ever prevent memory leaks in the MSXML parsing object
model??? I just gotta know!)
--
Eric Newton
C#/ASP Application Developer
http://ensoft-software.com/
er**@cc.ensoft-software.com [remove the first "CC."]

"Andreas Huber" <ah****@gmx.net > wrote in message
news:3e******** *************** ***@posting.goo gle.com...

Cody,

[snip]
> Where is the problem? These Disposable classes use referencecounti ng for > deterministic GC. The mark&sweep GC is just as a failsafe just for
the
case
> when refcounting fails for some reason, to prevent memory leaks.

What is your rationale for introducing reference counting then? More
specifically: Why is reference-counted *memory* management better than
garbage collection?
I didn't talk about memory management, I talk about freeing native resources such as handles.

Ok, you meant resource management.
With mark&sweep you can run into problems because it takes
time until gc kicks in and frees it. But native resource handles may be very limited so you can quickly run out of them.

Resources limited by other factors than memory are just one problem
and not even the worst because the GC can still clean up for you when
you need very few resources.
The following two problems are much worse:
1. Cleanup affecting multiple objects. For example, a StreamWriter
object internally holds its own buffer and a reference to a Stream
object. If you forget to call Dispose on the StreamWriter object and
it is finalized later there is *no way* it could empty the buffer into
the Stream. This is because the Stream also has a finalizer, which
might have been called before the finalizer of the StreamWriter
object. The effect is that the Stream is closed all right but you will
inevitably lose the data in the StreamWriter buffer. Please note that
garbage collection can *never* save you here. The data is always lost!
2. Non-shareable resources like e.g. FileStream. A FileStream object
holds an *exclusive* lock on the file it is currently writing to. If
you fail to call Dispose here and try to reopen the file a bit later
you are almost guaranteed to be greeted with an exception. This is
because the first FileStream object has most probably not yet been
finalized and therefore never had the chance to close the file.
Please note that GC can almost never save you here because it is
unlikely that it will consistently kick in before you try to reopen
the file.

Now, let's get back to your original proposal: You wanted to use
refcounting for classes that need deterministic clean-up. If
refcounting fails because there are cycles you wanted to let the GC
take care of it.
I think it should be pretty obvious that ***neither refcounting nor
GC*** is capable of cleaning up correctly when you have objects of
type 1 or 2 forming cycles. The programmer needs to say when to clean
up such objects and that's exactly what Dispose() is for.
Yes, there is a way to fix problem 1 but it is not very convincing as
it forces programmers to avoid certain types of cycles and hurts
finalization performance. Since problem no. 2 cannot be fixed anyway
the .NET designers decided to not trade implementation difficulties
and worse performance for only slightly easier cleanup.
With refcounting, you can guarantee that dispose() is called immediately
when the refcount is zero.

Only when there are no cycles.
Note that disposing an object does not mean
freeing the object itself. The managed object itself can then freed later by mark & sweep.

I am very aware of that.
You're not alone with that thinking (I once had similar concerns).
Many people have tried to come up with a better solution but to my
knowledge none has ever come up with something substantial. Have you
followed the link I posted? The guy that worte that article is an MS
person involved in the design of .NET. He explains how he also once
was a strong proponent of refcounting but became more and more
convinced that a GC-only strategy is better the more he explored the
exact semantics of a twin (GC and refcounting) solution.

I understand that mark&sweep is better for freeing normal memory (when there
is enough of it). But you will run into problems when using mark&sweep with limited resources. Make a loop that opens network connections. The
mark&sweep system cannot react so fast, there will be thousands of open
connections and the mark&sweep still does nothing. It doesn't even have a clue that the app will run out of handles soon. With memory management, the gc notices when there is not enough memory and runs automatically.

Again, I'm very aware of that. However, what you say here is not *the*
problem why the Dispose pattern is necessary. For example, you could
imagine a system that has its own GC for every shareable resource type
that is limited by other factors than memory.

[snip] Moreover, isn't it interesting that all other GCed languages (Java,
Python, etc.) work just like .NET does? There is always something
similar to the .NET dispose pattern. Scores of very smart people have
designed those languages and it speaks volumes that they haven't come
up with a better solution either.

Maybe they was very smart but nobody is perfect. When they did a perfect job then we never would need further inventions and improvements. In hundred
years the people would still use .NET which its mark&sweep because it is the perfect solution which cannot be improved. Years ago when they invented
assembler they would have never even though that is would be possible that unused memory can automatically freed.

As I tried to explain above, the problem is *systemic* and has nothing
to do with language designers not being smart enough or a GC not being
perfect or technology not being available. There simply is no
practical way that certian types of cleanup could be performed
automatically *and* correctly.
In theory fully automatic cleanup is possible though: You simply let
the GC run whenever a reference goes out of scope. However, it should
be clear that doing so wastes ridiculous amounts of precious cycles on
something humans are so much better.
Aditionally, finalizers are
not even guaranteed to run even when the program exits.

This is only true if your program does funny or very lengthy things
during finalization (e.g. creating objects or writing s..tloads of
data to the disk). The finalizer thread will call all finalizers as
long as the number of finalizable objects does not increase and the
whole finalization does not take longer than somewhere around 40
seconds (I forgot the actual number).

I wish you a happy Nikolaustag (Not sure what its name is in english)

:-)
Thanks, I wish you the same. I know I'm late ;-). BTW, I'm actually
Swiss, so there's no need to translate...

Regards,

Andreas

Nov 15 '05 #25

Andreas Huber

Eric,

> Andreas, (without thoroughly examining the idea), wouldn't it be fine to
> A) create a refcounted type
> B) require that a refcounted type may NOT directly or indirectly refer to
> another refcounted type, verifiable via compiler and runtime (when using
> generic object references, or not allow refcounted types to hold object
> references at all...)
>
> I believe that could work... and prevents the circular reference that
> plagued rich COM object models...

Yes it would work fine, given that such a type couldn't contain
references to GCed objects and GCed objects couldn't contain
references to such refcounted types. However, how useful would such a
type be?

> (And BTW, how did MS ever prevent memory leaks in the MSXML parsing object
> model??? I just gotta know!)

Sorry, I don't understand. What property of the MSXML lib are you
referring to?

Regards,

Andreas

P.S. I won't be able to answer further post before Jan 5th...

Nov 15 '05 #26

Eric Newton

comments inline:

"Andreas Huber" <ah****@gmx.net> wrote in message
news:3e**************************@posting.google.com...

> Eric,
>
> > Andreas, (without thoroughly examining the idea), wouldn't it be fine to
> > A) create a refcounted type
> > B) require that a refcounted type may NOT directly or indirectly refer to
> > another refcounted type, verifiable via compiler and runtime (when using
> > generic object references, or not allow refcounted types to hold object
> > references at all...)
> > I believe that could work... and prevents the circular reference that
> > plagued rich COM object models...
>
> Yes it would work fine, given that such a type couldn't contain
> references to GCed objects and GCed objects couldn't contain
> references to such refcounted types. However, how useful would such a
> type be?
>
> > (And BTW, how did MS ever prevent memory leaks in the MSXML parsing
> > object model??? I just gotta know!)
>
> Sorry, I don't understand. What property of the MSXML lib are you
> referring to?

Well, I was actually referring to the whole library. Since the MSXML nodes
all point to a DOMDocument and the DOMDocument points to all the nodes
through a collection (a circular reference), how on earth did this not leak
memory when you set the DOMDocument instance to Nothing without some kind of
Release or Dispose or whatever...?

> Regards,
>
> Andreas
>
> P.S. I won't be able to answer further posts before Jan 5th...

And by the way, Chris B on the CLR team has introduced the HandleCollector,
basically a reference counting mechanism that will collect on a new
instantiation of a finite resource, instead of on release... fascinating how
if you look at a problem from a different perspective you can achieve the
same result (in a sense...)
--
Eric Newton
C#/ASP Application Developer
http://ensoft-software.com/
er**@cc.ensoft-software.com [remove the first "CC."]

Nov 15 '05 #27

news.bluewin.ch

Eric,

> > > (And BTW, how did MS ever prevent memory leaks in the MSXML parsing
> > > object model??? I just gotta know!)
> >
> > Sorry, I don't understand. What property of the MSXML lib are you
> > referring to?
>
> Well, I was actually referring to the whole library. Since the MSXML
> nodes all point to a DOMDocument and the DOMDocument points to all
> the nodes through a collection (a circular reference), how on earth
> did this not leak memory when you set the DOMDocument instance to Nothing
> without some kind of Release or Dispose or whatever...?

I don't know the library at all. The main strategy to prevent
circular-reference-memory-leaks is to use weak pointers:

http://www.boost.org/libs/smart_ptr/weak_ptr.htm

So, while a DOMDocument would hold ordinary refcounted pointers to its
nodes, the nodes themselves would internally hold only weak pointers to
their DOMDocument. I know that COM does not have the weak pointer concept
but you can just as well use plain C++ pointers instead (i.e. you don't call
AddRef). The node member function returning the DOMDocument pointer simply
calls AddRef on the DOMDocument object...
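
The ownership shape described above (strong references downwards, weak references back up) can be sketched in C# with WeakReference<T>. This is only an analogy under assumptions: Document and Node are made-up types, not the MSXML classes, and in .NET the tracing GC would reclaim such a cycle even with strong back references, so the weak reference here only illustrates the pattern.

```csharp
using System;
using System.Collections.Generic;

// Owner side: holds ordinary (strong) references to its nodes.
public sealed class Document
{
    private readonly List<Node> nodes = new List<Node>();

    public IReadOnlyList<Node> Nodes => nodes;

    public Node CreateNode(string name)
    {
        var node = new Node(name, this);
        nodes.Add(node);
        return node;
    }
}

// Child side: holds only a weak reference back to its owner.
public sealed class Node
{
    private readonly WeakReference<Document> owner;

    internal Node(string name, Document document)
    {
        Name = name;
        owner = new WeakReference<Document>(document);
    }

    public string Name { get; }

    // Returns the owning document, or null if it has already been collected.
    public Document OwnerDocument
    {
        get { return owner.TryGetTarget(out Document doc) ? doc : null; }
    }
}
```

The getter hands out a strong reference only for as long as the caller keeps it, which mirrors the "call AddRef in the accessor" trick described for COM.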

[snip]

> And by the way, Chris B on the CLR team has introduced the
> HandleCollector, basically a reference counting mechanism that will
> collect on a new instantiation of a finite resource, instead of on
> release... fascinating how if you look at a problem from a different
> perspective you can achieve the same result (in a sense...)

I guess I fail to get your point. How would such a HandleCollector be able
to provide *deterministic* clean-up? In other words, HandleCollector solves
the resources-limited-by-other-factors-than-memory problem but it does not
solve the other clean-up problems that I described in an earlier post...
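
For reference, a minimal sketch of how System.Runtime.InteropServices.HandleCollector is typically wired into a wrapper type. PooledHandle, the collector name and the thresholds 10 and 50 are assumptions chosen for illustration; no real native handle is acquired here.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around some scarce native handle.
public sealed class PooledHandle : IDisposable
{
    // Name and thresholds are illustrative assumptions.
    private static readonly HandleCollector Collector =
        new HandleCollector("PooledHandle", 10, 50);

    private bool disposed;

    public PooledHandle()
    {
        // Counts the new handle; once the count passes the threshold,
        // the collector may force a garbage collection.
        Collector.Add();
    }

    // Number of handles currently tracked by the collector.
    public static int LiveCount => Collector.Count;

    public void Dispose()
    {
        if (disposed) return;
        disposed = true;
        Collector.Remove();
        GC.SuppressFinalize(this);
    }

    // Fallback path when Dispose was forgotten: still non-deterministic.
    ~PooledHandle()
    {
        if (!disposed)
        {
            disposed = true;
            Collector.Remove();
        }
    }
}
```

The collector only makes the GC run sooner when many handles are alive. The moment at which a forgotten handle is released is still up to finalization, so the cleanup stays non-deterministic.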

Regards,

Andreas

Nov 15 '05 #28

Andreas Huber

news.bluewin.ch wrote:
[snip]

Oops, sorry for the bogus name...

Nov 15 '05 #29
