"Andre Kaufmann" <an*********************@t-online.de> wrote in message
news:uw**************@TK2MSFTNGP02.phx.gbl...
Born wrote:
>GC is really garbage itself
[...]
Negative effects by the destruction delay:
1) Efficiency issue
It's bad for CPU/resource-intensive but memory-cheap objects.
It's also bad to permanently allocate/deallocate small objects on a
native heap. It's rather time-consuming compared to the managed one, and
it doesn't scale that well if you have a multi-core CPU.
The framework GC is actually one of the more efficient garbage collectors
out there. Properly done, reference counting can take up to 50% of your
program's CPU cycles. Reference counting will also leave objects that refer
to each other in memory, even when no other objects refer to them. Google
for my GC posts in late 2005 for sample code in VB 6 and VB 2005 for a
simple program that demonstrates the difference between the mark/sweep
algorithm in dotNet vs the reference counting algorithm in the VB 6/COM
model.
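In C++ terms, the cyclic-reference problem described above can be shown with std::shared_ptr. This is a minimal sketch (the Node/WNode types and their alive counters are illustrative, not from the original posts): two objects that point at each other through strong references are never freed, while weak references break the cycle.

```cpp
#include <memory>

// Node holds a STRONG reference to the next node.
struct Node {
    std::shared_ptr<Node> next;
    static inline int alive = 0;  // live-instance counter (C++17 inline variable)
    Node()  { ++alive; }
    ~Node() { --alive; }
};

// When the external shared_ptrs go out of scope, each node still holds a
// strong reference to the other, so neither refcount reaches zero and the
// destructors never run -- the cycle leaks (intentionally, for illustration).
int leak_with_cycle() {
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;  // cycle of strong references
    }
    return Node::alive;  // still 2 -- both nodes leaked
}

// WNode holds a WEAK reference instead, breaking the cycle.
struct WNode {
    std::weak_ptr<WNode> next;
    static inline int alive = 0;
    WNode()  { ++alive; }
    ~WNode() { --alive; }
};

int no_leak_with_weak() {
    {
        auto a = std::make_shared<WNode>();
        auto b = std::make_shared<WNode>();
        a->next = b;
        b->next = a;  // weak refs don't keep the nodes alive
    }
    return WNode::alive;  // 0 -- both destroyed deterministically
}
```

A mark/sweep collector frees both cycles; pure reference counting frees only the second.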
The dotNet GC is actually a three-generation GC. What this means is that
when the Gen(0) heap fills up, all accessible objects are marked and their
size is computed. If there is insufficient space in the Gen(1) heap, then
and only then are the surviving Gen(1) objects promoted to the Gen(2)
heap. Once there is sufficient space in the Gen(1) heap, the marked
objects in the Gen(0) heap are copied to the Gen(1) heap and the heap
pointer in Gen(0) is reset to the base address. The only time a full
mark/sweep/compact garbage collection is done is when the Gen(2) heap is
full. This appears to be done before the framework requests additional
memory from the OS.
[...]
2) Logic issue
[...]
Don't tell me the IDisposable pattern is for that. There may be more than
one strong reference and you don't know when and where to call Dispose.
You could use reference counting as well for a managed object. Instead of
calling the destructor you call Dispose if the reference counter is 0.
You never have to call the IDisposable interface's Dispose yourself. The
GC will do this for you. Yes, it does result in those objects possibly
being left in memory for one more GC cycle, but the benefit is that you,
as the programmer, never need to worry about dangling pointers and memory
leaks. As long as your objects themselves clean up properly in the Dispose
method, memory leaks and dangling pointers simply cannot occur. The GC
class even provides methods so that, if your object allocates a lot of
unmanaged system memory, you can tell the framework how much is being
allocated in its constructor. You then tell the framework about the
release of this memory in your Dispose method.
>
>[...]
An example:
Suppose we're doing a 3D game. A radar is monitoring a target. Obviously,
the radar should hold a weak reference to the target. When the target is
killed, logical confusion is immediately brought to the radar watcher
(the gamer). Is the target destroyed or not? You cannot tell him, hey,
it's killed but still shown on the radar because you've got to wait for
the GC to remove it.
What does the state of an object (resource) have to do with the memory it
has allocated? That's the only thing the GC is supposed to do - manage
memory.
Your thinking is backwards. You are letting your resource management
determine the scope of an object's lifetime. You need to use the
application domain logic, in this case, the object going off the edge of
the radar or being destroyed by a missile, to remove the references to the
object. Yes, the object's resources will still be allocated for an
indeterminate time, but the object will never again be able to appear on
your radar anyway. The runtime will eventually get around to releasing
those resources via the object's IDisposable interface - you don't have to
do it yourself. In C++ you must explicitly handle the release of the
memory at some point in your program execution. In C#, the runtime will
execute the destruction code for you when it either has idle time or needs
the memory to satisfy an allocation request.
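The weak-reference idea in the radar example can be sketched in C++ with std::weak_ptr (Target and Radar are hypothetical names for illustration): the moment game logic drops the last strong reference, the radar's lock() fails, so the target vanishes from the display immediately, with no waiting on a collector.

```cpp
#include <memory>
#include <string>

struct Target { std::string name; };

struct Radar {
    std::weak_ptr<Target> tracked;  // weak: does not keep the target alive

    // Report the tracked target's name, or "no contact" if game logic has
    // already destroyed it.
    std::string contact() const {
        if (auto t = tracked.lock())  // promote to a strong ref if still alive
            return t->name;
        return "no contact";
    }
};
```

Game logic (not the radar) owns the target; destroying it is a single reset() of the owning pointer, and every observer sees the change deterministically.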
>Reason 2:
[...]
Fairly speaking, GC itself is not garbage. However, when Java and C#
integrate it and prevent the user from manually managing memory, it
becomes garbage. Note GC in Java and C# is not really an additive as
someone would argue, since there is no way to do real memory management
like delete obj in C++.
In C++ you have many classes which handle memory by themselves, e.g.
most STL classes, to ensure that memory is not permanently allocated or
that the memory is contiguous. GC handles this automatically.
>Better memory management in my mind is reference counting + smart
pointer, which makes things automatic and correct. You have deterministic
destructions while no need to manually call Dispose.
As I wrote, why not also implement reference counting for managed
objects, which call Dispose if the reference count is 0?
Though there is a performance impact, because for thread-safe reference
counting you have to use Interlocked functions.
The performance impact can be huge. Early Smalltalk implementations spent
50% of their time reference counting.
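For comparison, here is a minimal intrusive reference count in C++ using std::atomic, the analogue of the Interlocked functions mentioned above (the RefCounted type is a hypothetical sketch, not a production design): every ownership change pays for an atomic read-modify-write, which is where the cost comes from.

```cpp
#include <atomic>

struct RefCounted {
    std::atomic<int> refs{1};  // the creator holds the first reference
    virtual ~RefCounted() = default;

    // Analogue of InterlockedIncrement.
    void add_ref() { refs.fetch_add(1, std::memory_order_relaxed); }

    // Analogue of InterlockedDecrement. Returns true if this call released
    // the last reference and destroyed the object -- the point where a
    // managed runtime would call Dispose.
    bool release() {
        if (refs.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete this;
            return true;
        }
        return false;
    }
};
```

Objects must be heap-allocated (release() calls delete this), and in practice a smart-pointer wrapper would call add_ref/release so the counts can never be forgotten.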
>You need not to manually change the reference count as smart pointers
help you achieve it.
Agreed, you would have to call them manually in C#, because there's no
RAII - which I'm really missing in C#.
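For readers unfamiliar with the term, RAII ties cleanup to scope. A tiny C++ sketch (the File wrapper is illustrative, standing in for any handle-owning class): the destructor closes the handle at scope exit, even if an exception is thrown, so there is no Dispose call to forget.

```cpp
#include <cstdio>

class File {
    std::FILE* f;
public:
    explicit File(const char* path, const char* mode)
        : f(std::fopen(path, mode)) {}
    ~File() { if (f) std::fclose(f); }  // deterministic cleanup at scope exit
    File(const File&) = delete;          // forbid copies: no double-close
    File& operator=(const File&) = delete;
    bool ok() const { return f != nullptr; }
    std::FILE* get() const { return f; }
};
```

The C# "using" statement approximates this, but only when the caller remembers to write it; in C++ the class itself guarantees the cleanup.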
>The only problem with this approach is cyclic reference. However, even if
not theoretically proven, the problem generally can be solved by
replacing some strong references with weak references.
When using weak references, you still need to handle dangling pointer
exceptions.
Yes, but it's sometimes tricky, and in complex object hierarchies you have
a very high chance of building cyclic references, which are hard to deal
with.
>I believe the restriction by GC is one of the main reasons why in some
fields (the gaming industry, for example), Java or C# is rarely used in
serious products that face real computing challenges.
Hm, perhaps because one can't see that Java or C# is used. E.g. the game
Chrome is written in Java (not 100%, but many parts of it).
>Solution
>1) The ideal solution is to convince the language providers to give us
back the ability of managing memory on our own. GC can still be there,
and it becomes a real additive in that situation.
It's there in C# - you can mix C++ modules with C#. The penalty is that
you now spend your time managing memory.
GC doesn't solve resource allocation problems. They are different than in
C++, and so are the problems you have to face. It's the same with memory
handling. In C++ you still have to think over and over again about how the
memory is handled and whether it's better to use an object cache.
Otherwise you will face performance problems too.
This is what the IDisposable and Finalize model is for. Yes, it takes an
additional GC cycle to free the memory, but the object has a chance to
clean up after itself first.
>
>2) Transfer the burden to the user. We can ask the user to always take
special cautions (for example, always use "using" in C# to have Dispose
correctly called even when an exception occurs). Things can work if the
user does them right. However, that's risky in nature and not a robust
solution.
Isn't that the case? The developer has to use "using", e.g. for file
objects, which shall release the file handles directly after their usage.
I admit that sometimes I'm missing reference counting when I'm dealing
with objects stored in multiple lists. How shall I know when to call
Dispose?
E.g. if a file object is stored in 2 or more lists and has to be removed
from one of the lists, how do I know if I have to call Dispose? The only
performant solution for me would be to use reference counting.
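In C++ this two-lists situation is exactly what shared_ptr's count solves. A minimal sketch (FileHandle and its open_count are hypothetical, standing in for any disposable resource): removing the object from one list leaves it alive, and the "Dispose" work runs automatically when the last list drops its copy.

```cpp
#include <memory>
#include <vector>

struct FileHandle {
    static inline int open_count = 0;  // how many handles are currently open
    FileHandle()  { ++open_count; }    // stands in for acquiring the handle
    ~FileHandle() { --open_count; }    // stands in for Dispose/close
};

// Returns true if the handle stayed open while one list still held it and
// was closed exactly when the last list released it.
bool shared_ownership_demo() {
    auto h = std::make_shared<FileHandle>();
    std::vector<std::shared_ptr<FileHandle>> listA{h}, listB{h};
    h.reset();                                        // drop creator's reference
    listA.clear();                                    // removed from first list
    bool still_open = (FileHandle::open_count == 1);  // listB keeps it alive
    listB.clear();                                    // last reference released
    bool now_closed = (FileHandle::open_count == 0);  // cleanup ran right here
    return still_open && now_closed;
}
```

No list needs to know about the other; each simply owns a reference, and the count answers "may I clean up yet?" for free.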
Though you can't have smart pointers, which are automatically destroyed
and decrease the reference count of an object automatically. You have
to do it manually in C#. :-( Perhaps there's a better solution in C#
that I don't know yet? (Any comments and solutions would be highly
appreciated.)
You never need to call Dispose. If an object's class implements
IDisposable, the GC will call Dispose for you before it actually
deallocates the memory.
Andre
I have researched various allocation/deallocation schemes over the years.
Here's a quick summary:
Explicit allocation/deallocation
- Example: C malloc/free and C++ new/delete
- Benefits: Very easy to implement in the runtime, and initial
allocations/deallocations are fast.
- Drawbacks: Dangling pointers, attempts to access deallocated/deleted
objects (null pointer references), heap fragmentation
- Where I would use: Only in real-time environments where reliable
response time is the driving factor.
Reference Counting:
- Example: Early Smalltalk and VB 6
- Benefits: Programmer doesn't have to worry about memory allocation.
Impossible to dangle pointers or attempt to dereference pointers to
deallocated objects. Easy to implement in the runtime.
- Drawbacks: Cyclic objects never being freed from memory. Early Smalltalk
implementations spent up to 50% of their time doing reference counting.
Doesn't handle resources other than memory.
- Where I would use: General purpose computing
Simple Mark/Sweep/Compact
- Example: Some older Lisp implementations
- Benefits: Programmer doesn't have to worry about memory allocation. Self
referencing objects will be collected.
- Drawbacks: Long pauses while the GC cycle runs. I've seen Lisp machines
pause for several minutes while the GC cycle runs. Implementation can be
tricky. Object lifetime cannot be predicted.
- Where I would use: General purpose computing.
Multi-Generational Mark/Sweep/Compact with explicit disposal interfaces
- Example: Java and dotNet
- Benefits: Programmer controls the allocation of unmanaged resources
through constructors and deallocation through destructors, limiting the
amount of code that actually handles this issue. Usually faster than simple
mark/sweep/compact since most GC passes don't need to process all
generations of memory, especially since most objects have very short
accessible lifetimes and won't be accessible the next time the GC needs to
run.
- Drawbacks: Hard to implement (but not too much more difficult than simple
mark/sweep/compact). Object lifetime cannot be predicted. Must provide at
least one more generation than the maximum number of generations the
destruction of an object can take. Otherwise this becomes a very bloated
version of the simple mark/sweep/compaction algorithm.
- Where I would use: Everywhere, including most real-time systems.
As you move up this chain of memory management, your thinking of how to
manage objects in your program needs to change, especially when moving
from explicit allocation/deallocation to any of the other three models.
Moving from reference counting to mark/sweep removes the restriction
against creating cyclic references, making the end developer's job easier.
Moving from simple mark/sweep/compaction to generational provides, in most
cases, a major performance boost for the same source code.
Mike Ober.