473,883 Members | 3,034 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Garbage Collection in C

Abstract
--------
Garbage collection is a method of managing memory by using a "collector"
library. Periodically, or triggered by an allocation request, the
collector looks for unused memory chunks and recycles them.
This memory allocation strategy has been adapted to C (and C++) by the
library written by Hans J Boehm and Alan J Demers.

Why a Garbage Collector?
-----------------------
Standard C knows only the malloc/calloc/free functions. The programmer
must manage each block of memory it allocates, never forgetting to call
the standard function free() for each block. Any error is immediately
fatal, but helas, not with immediate consequences. Many errors like
freeing a block twice (or more) or forgetting to free an allocated
block will be discovered much later (if at all). This type of bugs are
very difficult to find and a whole industry of software packages
exists just to find this type of bugs.

The garbage collector presents a viable alternative to the traditional
malloc/free "manual" allocation strategies. The allocator of Boehm
tries to find unused memory when either an allocation request is
done, or when explicitely invoked by the programmer.

The main advantage of a garbage collector is that the programmer is
freed from the responsability of allocating/deallocating memory. The
programmer requests memory to the GC, and then the rest is *automatic*.
Limitations of the GC.
---------------------
The GC needs to see all pointers in a program. Since it scans
periodically memory, it will assume that any block in its block list is
free to reuse when it can't find any pointers to it. This means that the
programmer can't store pointers in the disk, or in the "windows extra
bytes", as it was customary to do under older windows versions, or
elsewhere.

This is actually not a limitation since most programs do not write
pointers to disk, and expect them to be valid later...
Obviously, there is an infinite way to hide pointers (by XORing them
with some constant for instance) to hide them from the collector.

This is of no practical significance. Pointers aren't XORed in normal
programs, and if you stay within the normal alignment requirements
of the processor, everything works without any problems.

Performance considerations
--------------------------
In modern workstations, the time needed to make a complete sweep in
mid-size projects is very small, measured in some milliseconds. In
programs that are not real time the GC time is completely undetectable.
I have used Boehm's GC in the IDE of lcc-win32, specially in the
debugger. Each string I show in the "automatic" window is allocated
using the GC. In slow machines you can sometimes see a pause of
less than a second, completely undetectable unless you know that is
there and try to find it.

It must be said too that the malloc/free system is slow too, since at
each allocation request malloc must go through the list of free blocks
trying to find a free one. Memory must be consolidated too, to avoid
fragmentation, and a malloc call can become very expensive, depending
on the implementation and the allocation pattern done by the program.
Portability
-----------
Boehm's GC runs under most standard PC and UNIX/Linux platforms. The
collector should work on Linux, *BSD, recent Windows versions, MacOS X,
HP/UX, Solaris, Tru64, Irix and a few other operating systems. Some
ports are more polished than others. There are instructions for porting
the collector to a new platform. Kenjiro Taura, Toshio Endo, and Akinori
Yonezawa have made available a parallel collector.

Conclusions
-----------
The GC is a good alternative to traditional allocation strategies for C
(and C++). The main weakness of the malloc/free system is that it
doesn't scale. It is impossible to be good at doing a mind numbing task
without any error 100% of the time. You can be good at it, you can be
bad at it, but you can NEVER be perfect. It is human nature.

The GC frees you from those problems, and allows you to conecntrate in
the problems that really matter, and where you can show your strength
as software designer. It frees you from the boring task of keeping track
of each memory block you allocate.

jacob

Oct 11 '06
142 6920
SM Ryan <wy*****@tang o-sierra-oscar-foxtrot-tango.fake.orgw rites:
Jean-Marc Bourguet <jm@bourguet.or gwrote:

# Destructors do more than releasing memory. And finalisation is not a
# good substitute for these tasks as finalisation is not synchronous.

If you always know exactly when your destructors are being
called, they're not really destructors--they're methods you
haven't named. Destructors were intended to solve the problem
if collecting anonymous temporaries and the bookkeeping of
collecting local variables and class variables. You know the
destructors will be called before the stack or heap is
overwritten, but the whole point of destructors was you
don't need to know the exact time or place they are called.

If I need to guarentee a program action synchronise to some
external concern, I would have a explicit method to do so
and call it explicitly.
Destructor were invented to release *ressources* when they are no more
needed. Memory is probably the most common of the ressources freed by
destructors, but it's not the only one. In C++, we rely on destructor to
free other ressources which need to be freed whatever the way the scope is
left (so there is usually no need to put catch clauses to free the
ressources in case of exceptions). Some of these ressources need to be
freed synchronously, a mutex is probably the canonical example of such kind
of ressources. That's so common that there is a name for this idiom: RAII
(Ressource Acquisition Is Initialization) .

Another use of destructors is to notify other objects of the
destruction... this use can sometimes be replaced by the use of weak
references in conjunction with a GC, sometimes not.

Yours,

--
Jean-Marc
Oct 20 '06 #111

Richard Tobin wrote:
In article <11************ **********@f16g 2000cwb.googleg roups.com>,
William Hughes <wp*******@hotm ail.comwrote:
I do not understand. I thought that GC had to scan the entire
amount of memory used by the program. I regularly have programs
that use over a gibabyte of memory. How is this
scanned without a noticable pause?

There are over 40 years of research on the subject, so it's not
surprising that this problem has been largely overcome.

In particular, many modern garbage collectors avoid scanning the whole
of memory, except perhaps very rarely.
How can this be done in a C context without providing the GC with
extra information (e.g. I am not going to use this bit of
memory to store pointers)? True, a scan at any given
time could insure that certain areal of memory do
not contain pointers. But, given that a program may well
write frequently to a large area of memory, how does the GC
know that this memory does not "now" contain pointers. Scan
on write might aleviate the problem somewhat, but only somewhat,
given that most or all of the memory may be overwritten.
(And how could one would implement "scan on write" without slowing
the program to a crawl?)

- William Hughes

Oct 20 '06 #112
William Hughes wrote:
Richard Tobin wrote:
>>In article <11************ **********@f16g 2000cwb.googleg roups.com>,
William Hughes <wp*******@hotm ail.comwrote:

>>>I do not understand. I thought that GC had to scan the entire
amount of memory used by the program. I regularly have programs
that use over a gibabyte of memory. How is this
scanned without a noticable pause?

There are over 40 years of research on the subject, so it's not
surprising that this problem has been largely overcome.

In particular, many modern garbage collectors avoid scanning the whole
of memory, except perhaps very rarely.


How can this be done in a C context without providing the GC with
extra information (e.g. I am not going to use this bit of
memory to store pointers)? True, a scan at any given
time could insure that certain areal of memory do
not contain pointers. But, given that a program may well
write frequently to a large area of memory, how does the GC
know that this memory does not "now" contain pointers. Scan
on write might aleviate the problem somewhat, but only somewhat,
given that most or all of the memory may be overwritten.
(And how could one would implement "scan on write" without slowing
the program to a crawl?)

- William Hughes
Boehm's GC is incremental. Each time you make a GC_malloc call, it will
do a bit of garbage collection, so the full pause is less than it
might be.

Yes, it implies that you design your data structures to avoid mixing
data and pointers. This is not very difficult to do.
Oct 20 '06 #113
jacob navia <ja***@jacob.re mcomp.frwrites:
[...]
Boehm's GC is incremental. Each time you make a GC_malloc call, it will
do a bit of garbage collection, so the full pause is less than it
might be.

Yes, it implies that you design your data structures to avoid mixing
data and pointers. This is not very difficult to do.
I see, so there's *another* restriction (though not an absolute one in
this case) that you haven't mentioned until now.

So if I've declared a structure consisting of a large array of doubles
and a single pointer (say, for a linked list), then I need to re-write
my code to use it with GC?

--
Keith Thompson (The_Other_Keit h) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Oct 20 '06 #114
In article <ln************ @nuthaus.mib.or g>,
Keith Thompson <ks***@mib.orgw rote:
>Yes, it implies that you design your data structures to avoid mixing
data and pointers. This is not very difficult to do.
>So if I've declared a structure consisting of a large array of doubles
and a single pointer (say, for a linked list), then I need to re-write
my code to use it with GC?
You certainly don't have to do that to use the Boehm collector.

If you wish to improve the performance of the garbage collector, you
can use various functions instead of the usual malloc(). In particular,
you can use GC_MALLOC_ATOMI C() to allocate space that will definitely
not contain any pointers. Presumably that's what Jacob is referring to.

If you're interested, there's lots of information at

http://www.hpl.hp.com/personal/Hans_Boehm/gc/

-- Richard
Oct 21 '06 #115

Richard Tobin wrote:
In article <ln************ @nuthaus.mib.or g>,
Keith Thompson <ks***@mib.orgw rote:
Yes, it implies that you design your data structures to avoid mixing
data and pointers. This is not very difficult to do.
So if I've declared a structure consisting of a large array of doubles
and a single pointer (say, for a linked list), then I need to re-write
my code to use it with GC?

You certainly don't have to do that to use the Boehm collector.
Well, it will work, but it might take a *lot* of time.
If you wish to improve the performance of the garbage collector, you
can use various functions instead of the usual malloc(). In particular,
you can use GC_MALLOC_ATOMI C() to allocate space that will definitely
not contain any pointers.
But the person who has to insure that the space will not contain
any pointers is me. To do this I have to allocate different areas
for pointers and data. To do this I have to avoid mixing data
and pointers in the first place. Then, I have to go through
my code and change the appropriate malloc calls
to GC_MALLOC_ATOMI C(). Sounds like rewriting code
to me.

- William Hughes

Oct 21 '06 #116
In article <11************ **********@i3g2 000cwc.googlegr oups.com>,
William Hughes <wp*******@hotm ail.comwrote:
>If you wish to improve the performance of the garbage collector, you
can use various functions instead of the usual malloc(). In particular,
you can use GC_MALLOC_ATOMI C() to allocate space that will definitely
not contain any pointers.
>But the person who has to insure that the space will not contain
any pointers is me. To do this I have to allocate different areas
for pointers and data. To do this I have to avoid mixing data
and pointers in the first place. Then, I have to go through
my code and change the appropriate malloc calls
to GC_MALLOC_ATOMI C(). Sounds like rewriting code
to me.
I explained that you don't have to change your code to use the garbage
collector, but you can to make it more efficient. Why do you restate
this as if you're disagreeing with me?

By the way, in most programs you can use GC_MALLOC_ATOMI C() without
changing your data structures. Just use it for the allocations that
would be pointer-free anyway, such as strings and arrays of numbers.

-- Richard
Oct 21 '06 #117

Richard Tobin wrote:
In article <11************ **********@i3g2 000cwc.googlegr oups.com>,
William Hughes <wp*******@hotm ail.comwrote:
If you wish to improve the performance of the garbage collector, you
can use various functions instead of the usual malloc(). In particular,
you can use GC_MALLOC_ATOMI C() to allocate space that will definitely
not contain any pointers.
But the person who has to insure that the space will not contain
any pointers is me. To do this I have to allocate different areas
for pointers and data. To do this I have to avoid mixing data
and pointers in the first place. Then, I have to go through
my code and change the appropriate malloc calls
to GC_MALLOC_ATOMI C(). Sounds like rewriting code
to me.

I explained that you don't have to change your code to use the garbage
collector, but you can to make it more efficient. Why do you restate
this as if you're disagreeing with me?
Because although it will "work" without being more efficient, it may
have very long pauses, and may be so slow as to be useless.
You do not have to rewrite code to get it to work, but you
may have to rewrite code to get it to work usefully. (And worse, you
may be unable to rewrite code to get it to work usefully)
>
By the way, in most programs you can use GC_MALLOC_ATOMI C() without
changing your data structures. Just use it for the allocations that
would be pointer-free anyway, such as strings and arrays of numbers.
And if you have large data structures that do contain pointers
you are SOL. Being told that "most programs" do not have this problem
is cold comfort.

And this does not address the problem of memory allocation in
libraries not under your control (no you cannot simply link them
with another malloc/free as you will then have two mallocs working
on the same address space and same physical memory).

- William Hughes

Oct 21 '06 #118
Richard Tobin wrote:
In article <ln************ @nuthaus.mib.or g>,
Keith Thompson <ks***@mib.orgw rote:

>>>Yes, it implies that you design your data structures to avoid mixing
data and pointers. This is not very difficult to do.

>>So if I've declared a structure consisting of a large array of doubles
and a single pointer (say, for a linked list), then I need to re-write
my code to use it with GC?


You certainly don't have to do that to use the Boehm collector.

If you wish to improve the performance of the garbage collector, you
can use various functions instead of the usual malloc(). In particular,
you can use GC_MALLOC_ATOMI C() to allocate space that will definitely
not contain any pointers. Presumably that's what Jacob is referring to.

If you're interested, there's lots of information at

http://www.hpl.hp.com/personal/Hans_Boehm/gc/

-- Richard
Well, yes. Mr Hughes was speaking about an application using
1GB RAM. In those cases optimization could be necessary.
GC_MALLOC_ATOMI C() optimizes the GC scan.
Oct 21 '06 #119
William Hughes wrote:
Richard Tobin wrote:
>>In article <ln************ @nuthaus.mib.or g>,
Keith Thompson <ks***@mib.orgw rote:

>>>>Yes, it implies that you design your data structures to avoid mixing
data and pointers. This is not very difficult to do.
>>>So if I've declared a structure consisting of a large array of doubles
and a single pointer (say, for a linked list), then I need to re-write
my code to use it with GC?

You certainly don't have to do that to use the Boehm collector.


Well, it will work, but it might take a *lot* of time.

>>If you wish to improve the performance of the garbage collector, you
can use various functions instead of the usual malloc(). In particular,
you can use GC_MALLOC_ATOMI C() to allocate space that will definitely
not contain any pointers.


But the person who has to insure that the space will not contain
any pointers is me. To do this I have to allocate different areas
for pointers and data. To do this I have to avoid mixing data
and pointers in the first place. Then, I have to go through
my code and change the appropriate malloc calls
to GC_MALLOC_ATOMI C(). Sounds like rewriting code
to me.

- William Hughes
It is rewriting code. If you have an application that uses 1GB
of RAM, it is normal that you optimize RAM usage and the GC
if you use it.

Since you have said that the allocations do not happen
under your control, you shouldn't use the GC for that application
in my opinion. Your third party library is working, and since
you do not have the source code you are stuck with it.

The GC is not magic, and it will save you some work but there
is a minimal effort from your side to be done. If you do not
want to do ANY effort just go on using your system as it is.

We are not gaining any money with the GC, so if you do not use
it we do not lose any "sale" !!!

jacob
Oct 21 '06 #120

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
2341
by: Bob | last post by:
Are there any known applications out there used to test the performance of the .NET garbage collector over a long period of time? Basically I need an application that creates objects, uses them, and then throws them away and then monitors the garbage collection and store statistics on it, preferably in C#. I want to know what is the longest period of time that an application may lock up while garbage collection is processing. Thanks!
6
810
by: Ganesh | last post by:
Is there a utility by microsoft (or anyone) to force garbage collection in a process without have access to the process code. regards Ganesh
11
2752
by: Rick | last post by:
Hi, My question is.. if Lisp, a 40 year old language supports garbage collection, why didn't the authors of C++ choose garbage collection for this language? Are there fundamental reasons behind this? Is it because C is generally a 'low level' language and they didn't want garbage collection to creep into C++ and ruin everything? Just wondering :)
34
6449
by: Ville Voipio | last post by:
I would need to make some high-reliability software running on Linux in an embedded system. Performance (or lack of it) is not an issue, reliability is. The piece of software is rather simple, probably a few hundred lines of code in Python. There is a need to interact with network using the socket module, and then probably a need to do something hardware- related which will get its own driver written in C.
5
3628
by: Bob lazarchik | last post by:
Hello: We are considering developing a time critical system in C#. Our tool used in Semiconductor production and we need to be able to take meaurements at precise 10.0 ms intervals( 1000 measurement exactly 10 ms apart. In the future this may decrease to 5ms ). I am concerned that if garbage collection invokes during this time it may interfere with our measurement results. I have looked over the garbage collection mechanism and see no...
8
3056
by: mike2036 | last post by:
For some reason it appears that garbage collection is releasing an object that I'm still using. The object is declared in a module and instantiated within a class that is in turn instantiated by the mainline. The class that instantiated the object in question is definitely still in existence at the point garbage collection swoops in and yanks it out from under my processing. Is there a way to ensure an instantiated object cannot be freed...
28
3202
by: Goalie_Ca | last post by:
I have been reading (or at least googling) about the potential addition of optional garbage collection to C++0x. There are numerous myths and whatnot with very little detailed information. Will this work be library based or language based and will it be based on that of managed C++? Then of course there are the finer technical questions raised (especially due to pointer abuse). Is a GC for C++ just a pipe dream or is there a lot of work...
56
3736
by: Johnny E. Jensen | last post by:
Hellow I'am not sure what to think about the Garbage Collector. I have a Class OutlookObject, It have two private variables. Private Microsoft.Office.Interop.Outlook.Application _Application = null; Private Microsoft.Office.Interop.Outlook.NameSpace _Namespace = null; The Constructor: public OutlookObject()
350
11981
by: Lloyd Bonafide | last post by:
I followed a link to James Kanze's web site in another thread and was surprised to read this comment by a link to a GC: "I can't imagine writing C++ without it" How many of you c.l.c++'ers use one, and in what percentage of your projects is one used? I have never used one in personal or professional C++ programming. Am I a holdover to days gone by?
158
7951
by: pushpakulkar | last post by:
Hi all, Is garbage collection possible in C++. It doesn't come as part of language support. Is there any specific reason for the same due to the way the language is designed. Or it is discouraged due to some specific reason. If someone can give inputs on the same, it will be of great help. Regards, Pushpa
0
9944
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
11154
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
1
10863
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
1
7977
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
7136
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5807
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
6005
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
4622
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
3
3241
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.