Bytes | Software Development & Data Engineering Community
GC.Collect: Exactly how does it work?

I understand the basic premise: when an object is out of scope or has
been set to null (given that there are no funky finalizers), executing
GC.Collect will clean up your resources.

So I have a basic test. I read a bunch of data into a DataSet by using
a command and a data adapter object, .Dispose()-ing as I go. The moment
the data is in the DataSet, the Mem Usage column in Task Manager
goes up by 50 MB (which is about right). I then .Dispose() the DataSet,
set it to null and call GC.Collect. The Mem Usage column reports that
out of the 50 MB, 44 MB has been reclaimed. I call GC.Collect a
few more times, but Mem Usage never goes back to the original figure. 6 MB
has been lost/leaked somewhere.

What am I missing here?
Regards
Regards
Nov 17 '05 #1
Task Manager does not report the memory in use, but the memory requested
by the application from the OS. After you free it, the application (the .NET
Framework) marks it as free but holds on to it, figuring that since you used
that much once, you'll need it again. The OS can seize the memory back if it
needs it for some other application, but in your test, it didn't need to.

--
Truth,
James Curran
[erstwhile VC++ MVP]

Home: www.noveltheory.com Work: www.njtheater.com
Blog: www.honestillusion.com Day Job: www.partsearch.com


Nov 17 '05 #2
James Curran wrote:
Task Manager does not report the memory in use, but the memory requested
by the application from the OS. After you free it, the application (the .NET
Framework) marks it as free but holds on to it, figuring that since you used
that much once, you'll need it again. The OS can seize the memory back if it
needs it for some other application, but in your test, it didn't need to.


So how can I measure the real memory usage of the application?
Nov 17 '05 #3
It depends on what you mean by "real".

For the OS and other programs, your memory usage is what Task Manager
reports: the chunk of memory assigned to your program.

Frankly, I don't know for sure how to measure the memory used by live
objects. What does GC.GetTotalMemory give you?
On second thought, I think that unless the GC maintains a counter of memory
allocated/freed, there is no way to know this. Maybe somebody with deeper
knowledge of the GC implementation can give you a better answer.
cheers,

--
Ignacio Machin,
ignacio.machin AT dot.state.fl.us
Florida Department Of Transportation
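A rough way to put the two numbers mentioned above side by side is a sketch
like the following (GC.GetTotalMemory and the Process class exist in .NET 1.x;
the exact figures will vary from run to run, so treat the output as
illustrative only):

```csharp
using System;
using System.Diagnostics;

class MemProbe
{
    static void Main()
    {
        // Bytes the runtime believes are allocated on the GC heap.
        // Passing true forces a collection first, so the number
        // approximates live objects only.
        long gcBytes = GC.GetTotalMemory(true);

        // The working set as the OS sees it: all pages the process
        // currently holds in physical memory -- managed heap, JIT-ed
        // code, loaded DLLs and all. This is roughly what Task
        // Manager's "Mem Usage" column shows.
        long workingSet = Process.GetCurrentProcess().WorkingSet;

        Console.WriteLine("GC heap (live objects): {0:N0} bytes", gcBytes);
        Console.WriteLine("Working set:            {0:N0} bytes", workingSet);
    }
}
```

The gap between the two numbers is exactly the point of this thread: the
working set includes everything the process has asked the OS for, not just
live managed objects.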


Nov 17 '05 #4



The GC heap is just another Win32 process heap, initially created by the OS
on request of the CLR, consisting of two segments of 16 MB each (16 KB
committed): one for the Gen0-2 objects and one segment for the Large Object
Heap.
When you start to instantiate (non-large) objects, the committed space in
the heap (the first segment) starts to grow. Now suppose that you keep
instantiating objects without ever releasing any instance until the segment
gets full; when that happens, the CLR asks the OS for another segment of 16
MB (16 KB committed) and continues to allocate object space from that
segment.
Let's suppose the second segment is full when you start to release all
of the allocated objects (supposing that's possible). The GC starts to collect
and compact the heap, say until all objects are gone. That leaves you with a
GC heap of 32 MB, consisting of two segments of 16 MB committed space. The GC
has plenty of free space in the heap, but the heap space is not returned to
the OS unless there is memory pressure.
Under memory pressure, the OS signals the CLR to trim its working set, and
the CLR will return the additional segment to the OS.
So what you noticed is simply what is described above: you have plenty of
free memory, and the OS is not reclaiming anything from the running
processes.
Willy.

Nov 17 '05 #5
Thanks, Willy. I understand the part you described (thanks to you, in
another thread), however the part I don't get is how GC.Collect actually
works. You mentioned that when the objects are released, GC collects &
compacts the 16MB sets but does not release those sets to the OS. Is
that what GC.Collect does: just collect & compact, but not release/trim
working set?

If that's the case, how come the Mem Usage column in the Task Manager
does go down when GC.Collect is executed (and there is no memory
pressure)? An additional question here: how can I signal the CLR to
reduce (i.e. release/trim) its original set (since GC.Collect won't do it)?

If that's not the case and GC.Collect does in fact collect/compact and
release/trim, why am I losing 6 MB in the process?

Regards


Nov 17 '05 #6



Ok, let me start with a small correction and a disclaimer. The disclaimer
first: what I'm talking about is valid for v1.x and only for the workstation
version of the GC. The correction: at the start, the CLR reserves 2
segments of 16 MB (each having 72 KB committed) for the gen0-2 heap, plus a 16
MB segment for the LOH.

Consider the following (console) sample, and say we break at [1], [2] and [3]
respectively to take a look at the managed heap:

using System;
using System.Collections;

static void Main() {
    // [1]
    ArrayList[] al = new ArrayList[1000000];
    for (int m = 0; m < 1000000; m++)
        al[m] = new ArrayList(1);
    // [2]
    for (int n = 0; n < 1000000; n++)
    {
        al[n] = null;
    }
    GC.Collect();
    // [3]
}

At the start [1] of a (CLR hosted) process, the GC heap looks like this:

|_--------------|_----------------|......................|---------------|
       S0               S1                  free              LOH 16MB

S0 = 16MB - 72KB committed regions (_)
S1 = 16MB - 72KB committed regions

Objects allocated at the start of the program fit in the initial committed
part of the S0 segment, so this committed region contains gen0, 1 and 2.
Say the reachable objects account for 6 KB of heap space here.

When we break at [2], the heap has grown such that S0 and S1 are completely
filled (committed regions) and a third segment had to be created.

|______________|_______________|....|________------|....|---------------|
       S0              S1                  S2              LOH 16MB

S0 = 16MB - x MB committed regions
S1 = 16MB - y MB committed regions
S2 = 16MB - z MB committed regions

S0 and S1 contain Gen2 objects (those that survived recent collections);
S2 now holds Gen1 and Gen0.
Total object space ~42 MB.

Let's force a collection and break at [3]; now the heap looks like:

|_---------------|____------------|....|_____----------|....|---------------|
        S0               S1                   S2               LOH 16MB

S0 = 16MB - x MB committed regions
S1 = 16MB - y MB committed regions
S2 = 16MB - z MB committed regions

Total object space = what we had at [1] (6 KB), but as you notice, the CLR
didn't de-commit all regions and didn't return segment S2 to the OS.
The amount of region space that is not de-committed depends on a number of
heuristics, like the allocation scheme and the frequency of the most recent
object allocations.
When you run the above sample you'll see that x, y, z account for ~10 MB (your
mileage may vary, of course), so when you look at the working set of the
process, you'll notice a growth of ~10 MB too. So if we started with 6 MB at
[1], we will see 16 MB when we are at [3].

What you could do (but you should never do) is try to reduce the
working set of the process by setting the Process.MaxWorkingSet property.
Note that this will not change the heap layout and will not return anything
to the OS; the only thing it does is force a page-out of unused process
pages.
Changing the committed region space and the allocated segment space is in
the hands of the CLR and the OS. Both of them know what to do, and when, much
better than you do, so keep it that way; after all, this is why GC memory
allocators were invented, right?
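For completeness only, a sketch of the trim warned against above (shown purely
to illustrate what the property does; running it in a real application just
trades a smaller Task Manager number for a burst of page faults):

```csharp
using System;
using System.Diagnostics;

class TrimSketch
{
    static void Main()
    {
        Process p = Process.GetCurrentProcess();

        // Forcing the maximum working set down toward the minimum makes
        // the OS page out everything it can. The GC heap layout is
        // untouched and no memory is returned to the OS -- pages simply
        // move to the paging file.
        p.MaxWorkingSet = p.MinWorkingSet;

        // Task Manager's "Mem Usage" now drops sharply, but the first
        // access to any paged-out object faults those pages back in.
        Console.WriteLine("Working set after trim: {0:N0} bytes", p.WorkingSet);
    }
}
```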

Willy.



Nov 17 '05 #7
Willy, thanks, very enlightening. I ran the test and it turned out just
like you said. I do have a couple of followup questions:

1. Where do you get all this information? I've read a lot of
literature on this topic (Jeffrey Richter's work and some others, gotten
to know the Allocator Profiler, etc.), but I haven't seen any
references anywhere to the size of segments, committed RAM, etc.

2. What constitutes an LOH object; how big does an object have to be? What are
the rules for compacting/disposing/releasing it?

3. You mentioned that this applies to the workstation version of the
CLR. My software will run on Win2k servers and Windows 2003 servers
(not advanced, just standard). How are the rules different for the
servers?

4. In the example you described, after the 3rd breakpoint, I applied
some memory pressure (the PC dipped into virtual memory). The Mem Usage
column of the console app kept going lower and lower (the more pressure
I applied). Eventually it bottomed out at 100k. Am I to believe that
the whole little console app can be run in 100k? If not, where did it
all go?

Thank you.
Nov 17 '05 #8
Frank, See inline.

Willy.

"Frank Rizzo" <no**@none.com> wrote in message
news:%2****************@TK2MSFTNGP15.phx.gbl...
Willy, thanks, very enlightening. I ran the test and it turned out just
like you said. I do have a couple of followup questions:

1. Where do you get all this information? I've read a lot of literature
on this topic (Jeffrey Richter's work and some others, gotten to know the
Allocator Profiler, etc...), but I haven't seen anywhere any references to
the size of segment commited ram, etc...
Doing a lot of debugging, using low-level profilers and tools, and peeking
into the CLR sources. Note also that a managed process is just a Win32
process; the OS has no idea what the CLR is, and the process data structures
are exactly the same as in any other non-CLR Win32 process. The CLR manages
its own tiny environment and has its own memory allocator and GC, but this is
nothing new: the VB6 runtime also has a GC and a memory allocator, C++
runtimes have several possible memory allocators, and all of them use the
common OS heap/memory manager.
2. What constitues an LOH, how big does an object have to be? What are
the rules for compacting/disposing/releasing it.
Objects larger than 85 KB go to the LOH. The rules for disposing and
releasing are the same as for smaller objects. The LOH is not compacted;
only the garbage is collected.
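A small way to observe the threshold mentioned above (a sketch; the public
API gives no direct "is this on the LOH" test, but on the desktop CLR an
object placed on the LOH is reported as belonging to the oldest generation
straight after allocation, while a fresh small object starts in Gen0):

```csharp
using System;

class LohProbe
{
    static void Main()
    {
        byte[] small = new byte[80 * 1024];   // below the LOH threshold
        byte[] large = new byte[100 * 1024];  // above it: lands on the LOH

        // GC.GetGeneration is the only hint the public API offers here.
        Console.WriteLine("small array: Gen{0}", GC.GetGeneration(small));
        Console.WriteLine("large array: Gen{0}", GC.GetGeneration(large));
        Console.WriteLine("MaxGeneration: {0}", GC.MaxGeneration);
    }
}
```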
3. You mentioned that this applies to the workstation version of the CLR.
My software will run on Win2k servers and Windows 2003 servers (not
advanced, just standard). How are the rules different for the servers?
The server GC version must be explicitly loaded and is only available on
multi-proc machines (this includes HT). You can host the server GC version
by specifying it in your application's config file:
<runtime>
  <gcServer enabled="true" />
</runtime>
or by hosting the CLR.

4. In the example you described, after the 3rd breakpoint, I applied some
memory pressure (the PC diped into virtual memory). The Mem Usage column
of the console app kept going lower and lower (the more pressure I
applied). Eventually it bottomed out at 100k. Am I to believe that the
whole little console app can be run in 100k? If not, where did it all go?


The trimmed read/write pages go to the paging file; the read-only pages are
thrown away and will be reloaded from the image files (.exe, .dll, etc.) when
needed. No, a console application cannot run in 100 KB; the missing pages
will be reloaded from the page file or the loaded libraries. That's why you
should never trim the working set yourself: all you are doing is initiating a
lot of page faults, with a lot of disk I/O as a result.
Nov 17 '05 #9
Thanks, Willy. The education you provided for me has been invaluable.

Regards.

Nov 17 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.
