Andre - A response to your post in the "C# memory problem: no end for our problem ?" thread

Hi,

I just noticed your post in the "C# memory problem: no end for our problem?"
thread.
In the post you implied that I do not know how the garbage collector works
and that I mislead people. Since the thread is over a month old, I decided to
start a new one with my response.

Please see my comments inline.

"Andre" <fo********@hotmail.com> wrote in message
news:3E**************@hotmail.com...
This is not true; the GC kicks in when a memory allocation reaches a
threshold (which depends on several things, e.g. the amount of physical
memory available, the size of the processor cache etc.).

You're right about the threshold.. however it has nothing to do with the
processor's cache size, and saying "the amount of physical memory" is
not entirely correct.. read Jones and Lins "Garbage Collection -
Algorithms for automatic dynamic memory management" for more information


I have not read the book you're recommending, but I can't understand how
this book can tell how the "generational garbage collector implemented by
the CLR" determines the threshold values. I do not know for sure that the
threshold is dependent on the processor cache size, but I've seen it
mentioned in articles written about the GC of the CLR and I think it
makes sense. The size of the generation #0 threshold is initially about
160 KB (as far as I remember) and this value is of course dynamically
updated after a GC.
I suspect that the book you're referring to gives some algorithm for
determining the threshold value, but it's up to the implementor to use
whatever input parameters they see fit (physical memory, cache size,
number of surviving instances, etc).

A simple test can prove this:

Add the following code to a .NET program.

    Random rnd = new Random();
    // One extra slot, because case 0 also writes to data[i+1].
    object[] data = new object[10001];
    while( true )
    {
        // Each pass overwrites the slots, turning the old instances into garbage.
        for( int i=0; i < 10000; i++ )
        {
            switch( rnd.Next( 3 ) )
            {
                case 0:
                    data[i] = "This is test" + i;
                    data[i+1] = "This is test" + (i+1);
                    break;
                case 1:
                    data[i] = new int[10];
                    break;
                case 2:
                    data[i] = new object();
                    break;
            }
        }
    }

When the above code is inserted in the Main method of a console application, the managed heap uses between 400 KB and 5000 KB of memory and the CPU utilization is 100%. Even though the Task Manager cannot show the size of the managed memory, it does show that the used memory doesn't keep increasing until all physical memory is used (when running the test, 290 MB of physical memory was still available). The numbers presented were retrieved from the upcoming version of our .NET Memory Profiler.


So what were you trying to prove again? Again - don't say "amount of
physical size", say "heap size" or "managed heap size" ... it only uses
a little of it and the threshold increases as more memory is consumed
and *if* more memory is available.


You left out the following part in the snippet of my post:

"A common misconception regarding the garbage collector of the CLR is
that it runs whenever the system runs out of physical memory, or when there
is some idle time it can use to clean up the memory."

This text was followed by the "This is not true..." sentence, which you
started your post with.

I was simply trying to prove that the GC collects memory even if there is
plenty of physical memory left, and even if there's no idle processor time.
I've seen several people try to explain why their application is using
large amounts of memory by claiming that the garbage collector only collects
at idle time or in low memory situations.
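
The same point can be checked without a profiler. The following is only a rough sketch, not part of the original test: it assumes a runtime that exposes GC.CollectionCount (.NET 2.0 or later), and the class and method names are made up for illustration. It counts how many gen #0 collections run while a tight loop allocates short-lived arrays, with no idle time and with memory to spare.

```csharp
using System;

static class GcCountDemo
{
    // Allocates many short-lived arrays in a busy loop and returns the number
    // of gen #0 collections that the runtime performed in the meantime.
    public static int CountGen0Collections(int allocations)
    {
        int before = GC.CollectionCount(0);
        object[] data = new object[1000];
        for (int i = 0; i < allocations; i++)
        {
            // Overwriting a slot turns the previous array into garbage.
            data[i % data.Length] = new int[10];
        }
        GC.KeepAlive(data);
        return GC.CollectionCount(0) - before;
    }

    static void Main()
    {
        int collections = CountGen0Collections(20000000);
        Console.WriteLine("Gen #0 collections during the loop: " + collections);
        Console.WriteLine("Managed heap bytes now: " + GC.GetTotalMemory(false));
    }
}
```

The exact count depends on the runtime and the machine, so no particular number should be expected. A non-zero count while the loop is running at full CPU load is what contradicts the "idle time" and "low memory" explanations.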

I'm trying to find the time to write an article on how the CLR uses physical memory and how it uses generations to improve the performance of the garbage collector (all articles I've read simply tell you that generations are used to improve performance, but do not explain how).


If you don't know how a generational GC works, what makes you think
you can write an article on it (and mislead people)? Read Jones and Lins
for more information... in a nutshell, a generational garbage collector
improves performance by dividing the managed heap into different
'generations' and moves objects to these spaces according to the number
of times they survive a collection.. if an object survives a certain
number of GCs, it is moved from nursery (the first generation) to the
next generation.... this effectively improves performance because the GC
collects older generations at a very low rate .. this is because of the
hypothesis that "all objects die young".. and so the first generation
gets frequent GCs but since nursery size is set to a small figure, GC is
effectively fast and quick.


I believe I have a very good knowledge of the generational GC of the CLR,
and your "in a nutshell" explanation of a generational GC is more or less
exactly the same explanation I've seen in many different articles. The thing
that all these articles fail to address is how this increases performance.

I'll try to make a short explanation of what I mean.

The hypothesis you mention ("all objects die young") is probably better
stated as "most objects die young and those that don't will live forever".
Dividing the heap into generations merely on this hypothesis will not
increase performance significantly. Only objects that survive a GC need
to be relocated (i.e. compacted), and those objects are assumed to live
forever. Thus, old objects can be compacted into the bottom of the heap and
after that they will probably not need to be relocated very often (since all
neighbouring objects are also old and are assumed to live forever, or at
least for a long time).
This behaviour will be the same even if no generational GC is used. The main
problem solved by a generational GC is reducing the number of references to
look at when performing a collection.
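
The promotion of surviving instances can be observed with GC.GetGeneration. The following is a minimal sketch with made-up class and method names; it forces collections with GC.Collect purely for demonstration, and the generation an instance ends up in is runtime behaviour rather than a documented guarantee.

```csharp
using System;

static class PromotionDemo
{
    // Returns the generation of one live instance right after allocation
    // and after each of two forced collections.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int[] seen = new int[3];
        seen[0] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[1] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor);
        // Keep the instance reachable until all observations are done.
        GC.KeepAlive(survivor);
        return seen;
    }

    static void Main()
    {
        int[] seen = ObserveGenerations();
        Console.WriteLine("After allocation:     gen #" + seen[0]);
        Console.WriteLine("After 1st collection: gen #" + seen[1]);
        Console.WriteLine("After 2nd collection: gen #" + seen[2]);
        Console.WriteLine("Highest generation:   gen #" + GC.MaxGeneration);
    }
}
```

This only shows that survivors move to older generations. It says nothing about why that makes collections cheaper, which is the point of the paragraphs that follow.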

Consider a case where you have an application with 1 million long-lived
instances, each having 5 references to other instances. If this application
is performing a large number of allocations of short-lived instances, a gen
#0 collection may be triggered several times per second. Without optimizing
the references to look at, the GC would have to look at every one of the 5
million references to make sure that none of them references a gen #0
instance. What the generational GC does is to keep track of whether any
reference in an older-generation instance has changed, by using "write
barriers". When a GC (gen 0 or gen 1) is performed, only the references that
have changed in older generations need to be looked at. This optimization may
very well reduce the number of references to look at from 5 million to
close to zero, a very significant improvement. Of course the garbage
collector still has to look at all the stack-based references (local
variables and method parameters) and other internal references; this is not
affected by the generational garbage collector.
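
The effect of the write barrier can be illustrated with a toy model. This is only a sketch: Node, ToyHeap and WriteBarrierDemo are hypothetical types invented for the example, the heap is scaled down to 100,000 old instances to keep memory use modest, and the real collector tracks modified memory in a much more compact way than a set of objects. The idea is the same, though: every reference store goes through a barrier that remembers which old instances were written to, and a gen #0 collection only has to look at those.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: these types are hypothetical and do not mirror the
// CLR's internal data structures.
class Node
{
    public const int RefsPerNode = 5;
    public bool IsOld;
    public readonly Node[] Refs = new Node[RefsPerNode];
}

class ToyHeap
{
    public readonly List<Node> Old = new List<Node>();

    // Old instances whose references were written since the last collection.
    private readonly HashSet<Node> dirtyOld = new HashSet<Node>();

    // The "write barrier": every reference store records a modified old owner.
    public void WriteRef(Node owner, int slot, Node target)
    {
        owner.Refs[slot] = target;
        if (owner.IsOld) dirtyOld.Add(owner);
    }

    // References a gen #0 collection must check if nothing is tracked.
    public long RefsToScanWithoutBarrier()
    {
        return (long)Old.Count * Node.RefsPerNode;
    }

    // References a gen #0 collection must check when writes are tracked.
    public long RefsToScanWithBarrier()
    {
        return (long)dirtyOld.Count * Node.RefsPerNode;
    }

    // After a collection the tracked set starts out empty again.
    public void FinishCollection()
    {
        dirtyOld.Clear();
    }
}

static class WriteBarrierDemo
{
    public static ToyHeap BuildHeap(int oldCount, int modifiedCount)
    {
        ToyHeap heap = new ToyHeap();
        Node previous = null;
        for (int i = 0; i < oldCount; i++)
        {
            Node n = new Node();
            n.IsOld = true;
            heap.WriteRef(n, 0, previous); // old instances referencing each other
            heap.Old.Add(n);
            previous = n;
        }
        heap.FinishCollection(); // pretend a collection has just completed

        // Only a few old instances get a reference to a new (gen #0) instance.
        for (int i = 0; i < modifiedCount; i++)
        {
            heap.WriteRef(heap.Old[i], 1, new Node());
        }
        return heap;
    }

    static void Main()
    {
        ToyHeap heap = BuildHeap(100000, 3);
        Console.WriteLine("Without barrier: " + heap.RefsToScanWithoutBarrier());
        Console.WriteLine("With barrier:    " + heap.RefsToScanWithBarrier());
    }
}
```

With 100,000 old instances of which 3 were modified, the model has to look at 500,000 references without tracking and at 15 with tracking. The stack-based references are left out of the model, since they have to be scanned in both cases.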

After I posted the original message, I found the following articles on
MSDN:
http://msdn.microsoft.com/library/en...anagedapps.asp
and
http://msdn.microsoft.com/library/en...etgcbasics.asp
(watch for linewraps)

These articles do mention the use of write barriers used by the garbage
collector and they also provide more low level information about the garbage
collector, making me less motivated to write an article.

Anyway, if I write an article about the garbage collector, it will focus on
implementation details of the CLR garbage collector implemented by
Microsoft, it will not be a description of garbage collectors in general.

Finally, I don't understand how the phrase "the CLR uses generations to
improve the performance of the garbage collector" is technically (and maybe
even entirely) incorrect, as you stated in your next post. As you said, "the
*garbage collector* implemented by the CLR 'is a' generational Garbage
collector", but I think it's quite OK (albeit not perfect) to say
that a generational garbage collector uses generations.

Best regards,

Andreas Suurkuusk
SciTech Software AB
Download our .NET Memory Profiler at http://www.scitech.se/memprofiler





Jul 19 '05 #1