Joe Wright wrote:
Richard Heathfield wrote:
Some time ago now I attacked the GC problem, as I saw it, with what I
called GE or Garbage Elimination.
It has to do with wrapping malloc, calloc, realloc and free such that
they all come to GE first. GE is basically a manager of a linked list of
allocation data, including addresses and sizes.
GE.h adds 'size_t size(void *p);' to the vocabulary, size of the allocation.
User calls to *alloc are simply recorded in the list and then passed to
their libc namesakes. Calls to free() are looked up in the list and if
found result in a call to libc free() and deletion from the list. If the
user calls free(x) and x is not in the list, we simply return, a NOP.
What do you think of it?
This does not solve the general problem of garbage collection. GC
typically eliminates the need for free() by automatically recycling
memory any time that memory stops being referenced by any pointer that
is ultimately reachable from within the program's current run state.
From what I can tell, what you've done is simply enhanced the memory
model by adding a size(void *p) function, and have a method for
detecting bogus calls to free() (including double frees.) There is no
end of ways to enhance C's memory model, and this is certainly one of
them, but this is not GC. True GCs are somewhat difficult to implement
in C and ultimately rely on platform specific behavior (to access all
live run-time autos, and all writable data areas, for example), though
they certainly exist (the Boehm GC mechanism, for example).
That all being said, I fully endorse the idea of enhancing C's memory
model. A big reason why people have run away from C and gone to other
languages is that the whole malloc/free programming model is very
hard to sustain, especially given the bare-bones support that the
language gives you. But in doing such enhancements, you should not
target an attempt to duplicate GC, since that truly changes programming
paradigms -- C has way too much cruft in it to support a true paradigm
shift without substantial change to the language (comparable to what
C++ or Objective C did.)
What you want to do is to go ahead and embrace C's basic way of doing
things (after all, that's the environment we are in), but attack all its
main weaknesses. I.e., when a GC advocate says "C's memory model is
bad because of reason/example x" you want to be able to respond, "No,
that's easily detectable and correctable with an enhanced C memory
manager that does y". With this in mind, let us return to your "GE"
enhancement.
In your case, rather than *IGNORING* attempts to free garbage, you
should generate some sort of diagnostic to provide the programmer with
information telling them that they did something wrong. In fact you
can do this:
    /* In some include file somewhere, say "estdlib.h" */
    #define free(x) free_enhanced ((x), __FILE__, __LINE__)
    void free_enhanced (void *, const char *, int);

    /* In some module, say estdlib.c */
    void free_enhanced (void * ptr, const char * file, int line) {
        if (wasLegallyAllocated (ptr))
            specialfree (ptr); /* find the real header, then free */
        else
            diagnostic_badfree (file, line, ptr);
    }
That way, in your error log, or message, or whatever you decide to do,
you know exactly which call to free is failing. It's very important for
the programmer to know that this bad thing is going on in his/her
program, as it may be symptomatic of more serious problems, and simply
avoiding this one anomaly is likely to be a bandage that just doesn't do
the surgeon's job.
Another really easy to implement feature is to simply keep track of the
total amount of memory currently allocated, as well as the lifetime
maximum allocated by the program. You could then provide two simple
functions that just returned these values. These are usually
sufficient hints for most programmers to know if they are leaking
memory or not.
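Here is a minimal sketch of that bookkeeping. It assumes a small size
header stored in front of each block, and the names emalloc, efree,
mem_current and mem_peak are made up for illustration (they are not part
of GE). It also leans on C11's max_align_t to keep the payload aligned;
an older compiler would need a hand-rolled alignment union instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header; the union forces worst-case alignment (C11). */
union sizeHdr {
    size_t sz;
    max_align_t align;
};

static size_t curBytes  = 0;   /* bytes currently allocated */
static size_t peakBytes = 0;   /* lifetime maximum of curBytes */

void * emalloc (size_t sz) {
    union sizeHdr * h;
    if (sz > (size_t) -1 - sizeof *h) return NULL;   /* overflow guard */
    h = malloc (sizeof *h + sz);
    if (!h) return NULL;
    h->sz = sz;
    curBytes += sz;
    if (curBytes > peakBytes) peakBytes = curBytes;
    return h + 1;   /* payload starts right after the header */
}

void efree (void * p) {
    union sizeHdr * h;
    if (!p) return;
    h = (union sizeHdr *) p - 1;   /* step back to the header */
    assert (h->sz <= curBytes);    /* bookkeeping invariant */
    curBytes -= h->sz;
    free (h);
}

size_t mem_current (void) { return curBytes; }
size_t mem_peak    (void) { return peakBytes; }
```

If mem_current() is still non-zero when the program thinks it has
released everything, or mem_peak() keeps climbing from run to run,
something is probably leaking. Note that efree here trusts its argument;
it would be combined with the legality check shown earlier.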
As long as you are tracking all allocations in a linked list, you also
might as well provide a memory traversal mechanism. I.e., an ability
to walk through all allocated memory locations. Why would you do this?
Well you would do it in conjunction with an enhancement that tracked
*where* each allocation came from:
    /* In some include file somewhere, say "estdlib.h" */
    #define malloc(x) malloc_enhanced ((x), __FILE__, __LINE__)
    void * malloc_enhanced (size_t, const char *, int);

    /* In some module, say estdlib.c */
    struct enhancedMemHdr {
        struct enhancedMemHdr * linkNext;
        size_t sz;
        const char * moduleOrg;
        int lineNumberOrg;
        char mem[1]; /* struct hack */
    };

    void * malloc_enhanced (size_t sz, const char * file, int line) {
        struct enhancedMemHdr * ptr;
        if (!sz) {
            diagnostic_badmalloc (file, line, sz);
            return NULL;
        }
        /* store sz, file & line as well: */
        ptr = specialmalloc (sz, file, line);
        if (!ptr) {
            diagnostic_outofmemory (file, line, sz); /* don't abort */
            return NULL;
        }
        return ptr->mem; /* hand back the payload, not the header */
    }
So the point is not to look into the memory itself while you walk the
allocations (you wouldn't have type information, so that would be kind
of useless) but rather you would be interested in *where* the memory
got allocated. So you could do some simple statistics to figure out
which call sites account for most of your allocated memory.
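To make the traversal idea concrete, here is a rough sketch. To keep it
short it stores each record beside the block rather than in front of it
(so it is not the enhancedMemHdr layout), and every name in it
(site_malloc, site_free, walk_allocations, bytes_live_at) is
hypothetical:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one per live allocation. */
struct allocRec {
    struct allocRec * next;
    void * mem;          /* the block handed to the caller */
    size_t sz;
    const char * file;   /* where the allocation came from */
    int line;
};

static struct allocRec * allocList = NULL;

/* Allocate sz bytes and remember the requesting file and line. */
void * site_malloc (size_t sz, const char * file, int line) {
    struct allocRec * r = malloc (sizeof *r);
    if (!r) return NULL;
    r->mem = malloc (sz);
    if (!r->mem) { free (r); return NULL; }
    r->sz = sz; r->file = file; r->line = line;
    r->next = allocList;
    allocList = r;
    return r->mem;
}

/* Unlink and release a block; unknown pointers are left alone here,
   though a real version would report them. */
void site_free (void * p) {
    struct allocRec ** pp;
    for (pp = &allocList; *pp; pp = &(*pp)->next) {
        if ((*pp)->mem == p) {
            struct allocRec * dead = *pp;
            *pp = dead->next;
            free (dead->mem);
            free (dead);
            return;
        }
    }
}

typedef void (*allocVisitor) (const struct allocRec * rec, void * ctx);

/* Call fn once for every live allocation. */
void walk_allocations (allocVisitor fn, void * ctx) {
    const struct allocRec * r;
    for (r = allocList; r; r = r->next) fn (r, ctx);
}

struct siteTotal { const char * file; int line; size_t bytes; };

static void addIfSameSite (const struct allocRec * rec, void * ctx) {
    struct siteTotal * t = ctx;
    if (rec->line == t->line && strcmp (rec->file, t->file) == 0)
        t->bytes += rec->sz;
}

/* Total bytes still live that were allocated at file:line. */
size_t bytes_live_at (const char * file, int line) {
    struct siteTotal t;
    t.file = file; t.line = line; t.bytes = 0;
    walk_allocations (addIfSameSite, &t);
    return t.bytes;
}
```

With the macro trick above supplying __FILE__ and __LINE__, a call such
as bytes_live_at ("parser.c", 120) would tell you how much memory from
that one call site is still outstanding (the file name and line are just
an example). The visitor never touches the payload, only the record.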
--
Paul Hsieh
http://www.pobox.com/~qed/ http://bstring.sf.net/