Bytes | Software Development & Data Engineering Community

General method for dynamically allocating memory for a string

I have searched the internet for malloc and dynamic malloc; however, I still
don't know or readily see what the general way is to allocate memory for a
char * variable to which I want to assign a substring that I found inside a
string.

Any ideas?
Aug 30 '06
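One general approach to the question above is to allocate the substring's length plus one byte for the terminator, copy the characters, and terminate the copy. The following is only a sketch: the helper name `dup_substring` is hypothetical and not from the thread, and it assumes the caller already knows where the substring starts and how long it is.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy len characters starting at start into
 * freshly allocated storage and terminate the copy with '\0'.
 * The caller must make sure at least len characters are readable at
 * start, and must free() the result. Returns NULL if malloc fails. */
char *dup_substring(const char *start, size_t len)
{
    char *copy;

    assert(start != NULL);
    copy = malloc(len + 1);          /* +1 for the terminating '\0' */
    if (copy != NULL) {
        memcpy(copy, start, len);
        copy[len] = '\0';
    }
    return copy;
}
```

A typical call would pass the result of strstr() as start, for example `dup_substring(found, strlen("memory"))`, after checking that strstr() did not return NULL.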
On Fri, 1 Sep 2006, Ben Pfaff wrote:
Tak-Shing Chan <t.****@gold.ac.uk> writes:
>On Fri, 1 Sep 2006, Ben Pfaff wrote:
>>Tak-Shing Chan <t.****@gold.ac.uk> writes:

(3) In general, writing pointers to files is always
nonportable. So, I am not sure why you are posting this in a
group that values portability.

Writing a pointer to a file is *not* always non-portable. It is
portable to write a pointer to a file, read it back within the
same run of the program, and then use the pointer (as long as the
lifetime of the associated memory has not been reached).

An intervening realloc could mess it up.

realloc terminates the memory's lifetime.
I don't think so. The new object might overlap with the old
one. Perhaps you meant ``object'' rather than ``memory''?

Tak-Shing
Sep 1 '06 #51
Walter Roberson said:
In article <87************@benpfaff.org>,
Ben Pfaff <bl*@cs.stanford.edu> wrote:
>>Writing a pointer to a file is *not* always non-portable. It is
portable to write a pointer to a file, read it back within the
same run of the program, and then use the pointer (as long as the
lifetime of the associated memory has not been reached).

Reference, please?

My recollection is that the relevant section says only that
when something is written out to a binary file, that the same
binary value will be read back in. When, though, it comes to a pointer,
that doesn't promise that the reconstituted pointer points to anything.
fread and fwrite work "as if" they are successive calls to fgetc and fputc
respectively. These functions may ostensibly deal in ints, but the Standard
requires that they actually read in and write out unsigned chars. So fread
and fwrite effectively read and write objects as if they were a collection
of unsigned chars. In other words, what is being read/written is the object
representation.

Is it your contention that two objects of the same type with the same object
representation can have different values?
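The argument about object representations can be tried out in a few lines. This is only a sketch, not code from the thread: the function name `pointer_survives_file` is made up, and it assumes tmpfile() can create a binary temporary file on the host.

```c
#include <assert.h>
#include <stdio.h>

/* Writes the object representation of a pointer to a temporary binary
 * file, reads it back during the same run, and compares the result
 * with the original. Returns 1 if they compare equal, 0 if they do
 * not, and -1 if any I/O step fails. */
int pointer_survives_file(int *original)
{
    int *restored = NULL;
    int result = -1;
    FILE *fp;

    assert(original != NULL);
    fp = tmpfile();
    if (fp == NULL)
        return -1;

    if (fwrite(&original, sizeof original, 1, fp) == 1) {
        rewind(fp);   /* a positioning call is required between output and input */
        if (fread(&restored, sizeof restored, 1, fp) == 1)
            result = (restored == original);
    }
    fclose(fp);
    return result;
}
```

The sketch checks every I/O call, so a failed fread never leaves the caller holding an indeterminate pointer.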

There is a section of the standard that talks about writing out
pointers and reading them back in, but that has to do with using
the %p printf() format, which Richard did not do.
Only because I couldn't face the sheer yukkiness of writing a scanf call.
:-)
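For comparison, the %p route mentioned above might look like the sketch below. The function name `pointer_survives_text` and the buffer size are assumptions; the only guarantee relied on is that a %p value written and read back during the same program execution compares equal to the original.

```c
#include <assert.h>
#include <stdio.h>

/* Converts a pointer to text with %p and back again with sscanf.
 * Returns 1 if the restored pointer compares equal to the original,
 * 0 if it does not, and -1 if either conversion fails. */
int pointer_survives_text(void *original)
{
    char text[128];          /* assumed large enough for any %p output */
    void *restored = NULL;
    int written;

    assert(original != NULL);
    written = snprintf(text, sizeof text, "%p", original);
    if (written < 0 || (size_t)written >= sizeof text)
        return -1;
    if (sscanf(text, "%p", &restored) != 1)
        return -1;
    return restored == original;
}
```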

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Sep 1 '06 #52
Tak-Shing Chan said:
On Fri, 1 Sep 2006, Ben Pfaff wrote:
>Tak-Shing Chan <t.****@gold.ac.uk> writes:
>> (3) In general, writing pointers to files is always
nonportable. So, I am not sure why you are posting this in a
group that values portability.

Writing a pointer to a file is *not* always non-portable. It is
portable to write a pointer to a file, read it back within the
same run of the program, and then use the pointer (as long as the
lifetime of the associated memory has not been reached).

An intervening realloc could mess it up.
There was no intervening realloc in the code which you said was
non-portable. If you had claimed "in general, writing pointers to files is
always non-portable if you realloc before you read them back in", that
would be different (and utterly irrelevant to the discussion). But you
didn't.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Sep 1 '06 #53
Tak-Shing Chan <t.****@gold.ac.uk> writes:
On Fri, 1 Sep 2006, Ben Pfaff wrote:
>Tak-Shing Chan <t.****@gold.ac.uk> writes:
>>On Fri, 1 Sep 2006, Ben Pfaff wrote:

Tak-Shing Chan <t.****@gold.ac.uk> writes:

(3) In general, writing pointers to files is always
nonportable. So, I am not sure why you are posting this in a
group that values portability.

Writing a pointer to a file is *not* always non-portable. It is
portable to write a pointer to a file, read it back within the
same run of the program, and then use the pointer (as long as the
lifetime of the associated memory has not been reached).

An intervening realloc could mess it up.

realloc terminates the memory's lifetime.

I don't think so. The new object might overlap with the old
one. Perhaps you meant ``object'' rather than ``memory''?
Yes, that's more precise.
--
"Some programming practices beg for errors;
this one is like calling an 800 number
and having errors delivered to your door."
--Steve McConnell
Sep 1 '06 #54
On Fri, 1 Sep 2006, Richard Heathfield wrote:
Tak-Shing Chan said:
>On Fri, 1 Sep 2006, Ben Pfaff wrote:
>>Tak-Shing Chan <t.****@gold.ac.uk> writes:

(3) In general, writing pointers to files is always
nonportable. So, I am not sure why you are posting this in a
group that values portability.

Writing a pointer to a file is *not* always non-portable. It is
portable to write a pointer to a file, read it back within the
same run of the program, and then use the pointer (as long as the
lifetime of the associated memory has not been reached).

An intervening realloc could mess it up.

There was no intervening realloc in the code which you said was
non-portable. If you had claimed "in general, writing pointers to files is
always non-portable if you realloc before you read them back in", that
would be different (and utterly irrelevant to the discussion). But you
didn't.
In any case, it would hide serious errors (such as having an
intervening realloc or free) where conforming implementations are
not required to issue a diagnostic.

Tak-Shing
Sep 1 '06 #55
Tak-Shing Chan wrote:
On Fri, 1 Sep 2006, Ben Pfaff wrote:
>Tak-Shing Chan <t.****@gold.ac.uk> writes:
>>On Fri, 1 Sep 2006, Ben Pfaff wrote:

Tak-Shing Chan <t.****@gold.ac.uk> writes:

(3) In general, writing pointers to files is always
nonportable. So, I am not sure why you are posting this in a
group that values portability.
Writing a pointer to a file is *not* always non-portable. It is
portable to write a pointer to a file, read it back within the
same run of the program, and then use the pointer (as long as the
lifetime of the associated memory has not been reached).
An intervening realloc could mess it up.


realloc terminates the memory's lifetime.


I don't think so. The new object might overlap with the old
one. Perhaps you meant ``object'' rather than ``memory''?

Tak-Shing
You raise a very important point here, one that furthers the argument for
an automatic garbage collector.

When you do a

q = realloc(p,2*n);

after a successful realloc the p pointer and ALL THE ALIASES you have
done for that object, "including but not limited to":

o structures that contain that pointer in some field
o functions, that hold aliases to that object address in the stack
in their parameter list
o other local variables that use pointers based on that pointer,
for instance p+5 or &p[78]

ARE ALL INVALID and must be ALL invalidated!!!

Manually.
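One way to keep that manual invalidation manageable is to hold exactly one pointer to the block and remember positions as indices instead of interior pointers. The sketch below illustrates the idea; the names `int_buffer` and `int_buffer_grow` are hypothetical and not from the thread.

```c
#include <assert.h>
#include <stdlib.h>

struct int_buffer {
    int *data;         /* the only stored pointer to the block */
    size_t capacity;   /* number of elements currently allocated */
};

/* Doubles the capacity (or starts at 4 elements). Returns 1 on success.
 * On failure returns 0 and leaves buf unchanged, so the old block is
 * neither leaked nor lost. */
int int_buffer_grow(struct int_buffer *buf)
{
    size_t new_capacity;
    int *bigger;

    assert(buf != NULL);
    new_capacity = buf->capacity ? buf->capacity * 2 : 4;
    if (new_capacity > (size_t)-1 / sizeof *bigger)
        return 0;                       /* size computation would overflow */

    bigger = realloc(buf->data, new_capacity * sizeof *bigger);
    if (bigger == NULL)
        return 0;

    buf->data = bigger;                 /* the single alias, updated in one place */
    buf->capacity = new_capacity;
    return 1;
}
```

Code that would otherwise keep `p + 5` or `&p[78]` keeps the index 5 or 78 and writes `buf.data[i]` at the point of use, so a move by realloc does not leave it with a stale address.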

Using the GC you do

q = GC_malloc(2*n);
if (q) {
    memcpy(q,p,n); // Copy the old object
    p = NULL;
}
else {
    // No more memory: handle error
}

Now q contains the reallocated object but the old
storage will be freed only (and only then) when there are no
pointers to it!

Obviously, if you are 100% sure that there are no aliases to
the object, you can use GC_realloc();

But this is not without problems too, since the aliases will
point to old object copies and not to the good one, and
those copies will NOT be updated.

This can be a problem, or it may be harmless; it depends on the
application. For instance if the new object is updated by
adding members to it (a table of structures for instance) it
can be OK to keep the old shorter version, but that could also be dangerous.

In the case of
struct {
    int nbOfElements;
    T data;
};

after a realloc of "data", the old pointer stored elsewhere
will be utterly wrong, and a GC solution could make
the problem even worse.

But this is a matter of discipline of alias usage and discipline
of pointer usage in general.

Another thread of discussion.
Yes, memory management *is* tricky.

Sep 1 '06 #56
Tak-Shing Chan said:

<snip>
In any case, it would hide serious errors
What would? You seem to be talking about hypothetical code hiding
hypothetical errors. The actual posted code seems to have passed you by
completely.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Sep 1 '06 #57
Richard Heathfield wrote:
Philip Potter said:
>>You seem to be arguing that because garbage collection can't protect
against all memory leaks without care from the user, it is worthless.


No, I'm sure it has a place in other languages. I just don't think it sits
very well with C, for at least two reasons:

1) As we have already seen upthread, automatic garbage collection (AGC)
encounters serious difficulties when faced with C's heavy use of pointers
for keeping track not only of memory blocks but of particular positions
within them.

2) C is commonly used where not only performance but even timing (in
real-time stuff) is an important criterion for acceptance of the program,
and AGC implementations are notorious for introducing arbitrary delays at
various times through a program's run.

As I have already agreed, AGC is undoubtedly of value to some people in some
situations. I just don't think AGC is a good fit for C programmers, in the
general case. That's not to say that it might not be useful on occasion,
but I don't believe such occasions are sufficiently common to make the
subject a worthwhile diversion away from the topic of this newsgroup, which
is C programming, *not* "C programming + Jacob Navia's AGC extensions".
One situation where GC can be used to good effect with C is as a
diagnostic tool. I often soak test applications linked with a GC
library that logs reclaimed memory and use the library to highlight any
memory leeks that occur. If the application is correctly written, the
GC shouldn't reclaim any blocks.

I would never countenance using the GC library as a band-aid for poor
programming practice, but as another tool in the box, it has its place.

--
Ian Collins.
Sep 1 '06 #58
Ian Collins said:

<snip>
One situation where GC can be used to good effect with C is as a
diagnostic tool.
Absolutely. Likewise automatic bounds checking, with which I would never
burden production code but which has its place in a test rig.
I often soak test applications linked with a GC
library that logs reclaimed memory and use the library to highlight any
memory leeks that occur.
Not broccoli, then? :-)

Seriously, I don't actually use a GC lib for detecting memory leaks, but I
do use a wrapper around *alloc and free which I wrote specifically for
performing such detections. But it really slows the code down, and
generates colossal logfiles, so it gets ripped out of production code via a
#define (or rather, the absence of one).
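A wrapper of the kind described might be sketched as follows. This is not the poster's code: the names `tracked_malloc`, `tracked_free`, `XMALLOC`, `XFREE` and the switch `TRACK_ALLOCATIONS` are all invented for illustration, and a real version would probably record each block rather than just count them.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static long live_blocks;   /* blocks allocated and not yet freed */

/* Allocates like malloc, logs the call site and counts the block. */
void *tracked_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);

    if (p != NULL) {
        ++live_blocks;
        fprintf(stderr, "%s:%d: allocated %lu bytes at %p\n",
                file, line, (unsigned long)size, p);
    }
    return p;
}

/* Frees like free, logs the call site and uncounts the block. */
void tracked_free(void *p, const char *file, int line)
{
    if (p != NULL) {
        assert(live_blocks > 0);   /* more frees than allocations is a bug */
        --live_blocks;
        fprintf(stderr, "%s:%d: freeing %p\n", file, line, p);
    }
    free(p);
}

/* A non-zero result at program exit suggests a leak. */
long tracked_live_blocks(void)
{
    return live_blocks;
}

/* Hypothetical switch: without it the macros cost nothing at run time. */
#ifdef TRACK_ALLOCATIONS
#define XMALLOC(n) tracked_malloc((n), __FILE__, __LINE__)
#define XFREE(p)   tracked_free((p), __FILE__, __LINE__)
#else
#define XMALLOC(n) malloc(n)
#define XFREE(p)   free(p)
#endif
```

Application code calls only XMALLOC and XFREE, so leaving TRACK_ALLOCATIONS undefined removes the logging from a production build without touching any call site.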

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Sep 1 '06 #59
Richard Heathfield wrote:
Ian Collins said:

<snip>
>>One situation where GC can be used to good effect with C is as a
diagnostic tool.


Absolutely. Likewise automatic bounds checking, with which I would never
burden production code but which has its place in a test rig.

>>I often soak test applications linked with a GC
library that logs reclaimed memory and use the library to highlight any
memory leeks that occur.


Not broccoli, then? :-)
Must be out of season....
Seriously, I don't actually use a GC lib for detecting memory leaks, but I
do use a wrapper around *alloc and free which I wrote specifically for
performing such detections. But it really slows the code down, and
generates colossal logfiles, so it gets ripped out of production code via a
#define (or rather, the absence of one).
That's one advantage of the GC library, it imposes minimal overhead and
only logs real or potential leaks.

--
Ian Collins.
Sep 1 '06 #60
On Fri, 01 Sep 2006 12:47:27 +0200, in comp.lang.c , jacob navia
<ja***@jacob.remcomp.fr> wrote:
>Richard Heathfield wrote:
>jacob navia said:

>>>As you may know, I am not selling anything here,

No, I don't know that. There is considerable evidence to the contrary.

A product for zero dollars and zero cents?
You're confusing "selling" and "receiving money for". People sell
each other things all day every day, without any money changing hands.
For instance, you can sell your knowledge in return for good will.

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Sep 1 '06 #61
jacob navia wrote:
CBFalconer wrote:
>Revise the example:

T *p;

if ((p = malloc(n * sizeof *p)) != NULL) {
    T *q;

    q = p+1; p = NULL;
    dosomethingwith(q);
    q--;
    ....
}

The revised example will not fool the GC since the block can be reached
from q, since it points between the beginning and the end of the block.
Will not fool which GC? The Boehm GC won't get "fooled"
by that only because it's fairly conservative and tends
to err on the side of leakage rather than premature
collection (AIUI, it will compare bits inside objects,
but I'm willing to be convinced otherwise).

Some will definitely get fooled by that (the GC will
track references to objects and won't bother looking
at every possible byte in the object with every
possible alignment - why should it?).

--
goose
Have I offended you? Send flames to root@localhost
real email: lelanthran at gmail dot com
website : www.lelanthran.com
Sep 1 '06 #62
Ian Collins wrote:
<snipped>
>
That's one advantage of the GC library, it imposes minimal overhead and
only logs real or potential leaks.
Or it may not log any at all. Some GCs are rather conservative,
you know.

--
goose
Have I offended you? Send flames to root@localhost
real email: lelanthran at gmail dot com
website : www.lelanthran.com
Sep 1 '06 #63
On Fri, 1 Sep 2006, Richard Heathfield wrote:
Tak-Shing Chan said:

<snip>
> In any case, it would hide serious errors

What would? You seem to be talking about hypothetical code hiding
hypothetical errors. The actual posted code seems to have passed you by
completely.
Yes, I was talking about hypothetical errors. But this is
the same type of argument you used against malloc casts---the
/hypothetical/ error of missing <stdlib.h> when an incomplete
program was posted.

Tak-Shing
Sep 1 '06 #64
Tak-Shing Chan said:
On Fri, 1 Sep 2006, Richard Heathfield wrote:
>Tak-Shing Chan said:

<snip>
>> In any case, it would hide serious errors

What would? You seem to be talking about hypothetical code hiding
hypothetical errors. The actual posted code seems to have passed you by
completely.

Yes, I was talking about hypothetical errors.
I wasn't. If you want to start a new thread about hypothetical errors, feel
free.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Sep 1 '06 #65
Richard Heathfield <in*****@invalid.invalid> writes:
Richard Bos said:
>Ben Pfaff <bl*@cs.stanford.edu> wrote:
>>rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:

<snip>
>The reasoning given in the comment is also bogus. It is a very _bad_
programming habit to start expecting that you can free pointers twice.

You can free a null pointer any number of times you like. I
think that is what the "it" in "delete it more than once" means.

Yes, you can. My point was that relying on this capability by setting
pointers to null after you're done with them is a bad programming habit,
not a good one, because you _will_ get in the habit of calling free() on
any which pointer, with the reasoning that "it'll either be a valid
pointer or null".

True enough - you shouldn't *rely* on it, in the sense of arbitrarily
littering your code with calls to free(). Nevertheless, setting pointers to
NULL after you're done with them is a /good/ habit, not a bad one. It's
called "defence in depth".
I'm not going to say that I disagree with you, but I'm going to offer
a counterargument anyway.

Setting a free()d pointer to NULL can prevent certain kinds of errors
-- or rather, it can prevent the *symptoms* of certain kinds of
errors.

For example:

some_type *ptr = malloc(sizeof *ptr);
...
free(ptr);
...
/*
* Now I don't remember whether I called free(ptr) or not.
* I'll free it here, just in case.
*/
free(ptr);

As written, the second call to free() invokes undefined behavior. In
fact, evaluating ptr in preparation for the call invokes UB; the call
to free() doesn't really have anything to do with it.

If the first free() call were replaced with:

free(ptr);
ptr = NULL;

then the second free() call wouldn't invoke UB. But it would *still*
(probably) be a symptom of a logical error, namely the failure to keep
track of whether you've already called free(). Setting ptr to NULL
after the first free() would mask the symptom of the error; leaving it
alone could *potentially* cause the second free() to blow up, making
the error easier to detect. (Or it could do nothing; such is the
nature of undefined behavior.)

On the other hand, you could legitimately have reached the point of
the second free() by any of several paths, some of which have free()d
the pointer and some of which haven't. In that case, *if* all the
paths that free() it then set it to NULL, then free()ing it again is
harmless and sensible.

I'd probably be more comfortable with a program design that invokes
free() exactly once, if and only if it's needed, but if having your
program do a little unnecessary work at run time makes it easier to
develop, that's not always a bad thing.
>The truly good programming habit, in this case, is to do your bleedin'
bookkeeping, and keep in mind which pointers you have free()d, and which
you haven't.

That's certainly true, but it is also a good idea to recognise that you
might be fallible, and to take precautions against that fallibility. A null
pointer is far more useful than a pointer with an indeterminate value.
In an ideal world, the question would never arise; you wouldn't make
that mistake in the first place, or at least you'd already have
corrected it.

In software intended to be maximally robust (which, I hasten to add,
isn't *always* worth the effort), it probably makes sense to implement
this kind of defense in depth *and* to detect and log cases where
errors were caught by the second or later line of defense.
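The "defense in depth plus logging" idea could be sketched roughly as below. The names `release_block`, `RELEASE` and `redundant_free_count` are hypothetical, and reporting to stderr is just one possible choice of log.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static unsigned long redundant_frees;   /* times the second line of defense fired */

/* Frees block. If block is already null, the call is harmless but is
 * counted and logged, since it probably points at a bookkeeping error. */
void release_block(void *block, const char *file, int line)
{
    assert(file != NULL);
    if (block == NULL) {
        ++redundant_frees;
        fprintf(stderr, "%s:%d: release of a null pointer\n", file, line);
        return;
    }
    free(block);
}

unsigned long redundant_free_count(void)
{
    return redundant_frees;
}

/* Frees p and nulls it. p is evaluated twice, so it must be a plain
 * lvalue without side effects. */
#define RELEASE(p) \
    do { release_block((p), __FILE__, __LINE__); (p) = NULL; } while (0)
```

A second RELEASE on the same variable then stays well defined, yet still shows up in the log. As noted elsewhere in the thread, this does nothing for other copies of the pointer that were not set to NULL.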

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Sep 1 '06 #66
Keith Thompson wrote:
In software intended to be maximally robust (which, I hasten to add,
isn't *always* worth the effort), it probably makes sense to implement
this kind of defense in depth *and* to detect and log cases where
errors were caught by the second or later line of defense.
This is a very good insight.

Why take risks?

A function call with a NULL pointer is extremely fast
in these days of GHz CPUs.

Setting a pointer to null is a single memory write,
practically nothing.
Sep 2 '06 #67
Hello Richard:

We will have to agree to disagree. I agree that my code below if/else
is retarded. I agree that it should have been different. This is what
I SHOULD have coded:

if( str2 != NULL )
printf( "str2: %s\n", str2);

free( str2 );
str2 = 0;
....

It sounds like you have the good fortune to work either by yourself or
in a very small team. I currently work on a 500+ developer contract
here in the United States with millions of lines of code. No "cheap
hack" can keep you safe in such an environment. When dealing with
global data in such an environment the above technique can mean the
difference between mission critical system downtime or a harmless
no-op.

This is not meant to foster an absurd programming practice similar to
this:
if( str2 != NULL )
printf( "str2: %s\n", str2);

free( str2 );
str2 = 0;
....
/* not sure if I really freed the memory */
free( str2 ); // Yea! I'm okay!

It is purely a measure of defensive programming. I have learned these
idioms through working in larger teams with varying levels of expertise
with a lot of little updates to the code base. This has worked well
for my team. I would gladly manage the code if I were the sole
maintainer. In my environment I am not. This idiom works for me. Of
course, your mileage may vary.

-Randall

Richard Bos wrote:
Ben Pfaff <bl*@cs.stanford.edu> wrote:
rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
Frederick Gotham <fg*******@SPAM.com> wrote:
>
>Randall posted:
if( str2 != NULL ) {
    printf( "str2: %s\n", str2 );
} else {
    // Setting a null pointer to zero ensures you
    // can delete it more than once (free) without
    // undefined behavior. This is a good
    // programming habit.
    str2 = 0;
}
>>
> As Richard Heathfield pointed out, both "else" clauses are redundant.
>
The reasoning given in the comment is also bogus. It is a very _bad_
programming habit to start expecting that you can free pointers twice.
You can free a null pointer any number of times you like. I
think that is what the "it" in "delete it more than once" means.

Yes, you can. My point was that relying on this capability by setting
pointers to null after you're done with them is a bad programming habit,
not a good one, because you _will_ get in the habit of calling free() on
any which pointer, with the reasoning that "it'll either be a valid
pointer or null". That habit will then bite you when you encounter an
exception; e.g., when you make a copy of a pointer and only set one copy
to null, or when you have to work with other people's code which doesn't
nullify any pointers.
The truly good programming habit, in this case, is to do your bleedin'
bookkeeping, and keep in mind which pointers you have free()d, and which
you haven't. Don't just pass any which pointer to any which function,
relying on cheap hacks to keep you "safe".

Richard
Sep 2 '06 #68
"Randall" <ra*************@gmail.com> writes:
Hello Richard:

We will have to agree to disagree. I agree that my code below if/else
is retarded. I agree that it should have been different. This is what
I SHOULD have coded:

if( str2 != NULL )
printf( "str2: %s\n", str2);

free( str2 );
str2 = 0;
...
Please don't top-post. See <http://www.caliburn.nl/topposting.html>
(and most of the articles in this newsgroup) for details.

Out of curiosity, is there some reason you use NULL in the comparison
and 0 in the assignment? I would have used NULL for both.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Sep 2 '06 #69
Randall wrote:
We will have to agree to disagree. I agree that my code below if/else
is retarded. I agree that it should have been different. This is what
I SHOULD have coded:

if( str2 != NULL )
printf( "str2: %s\n", str2);

free( str2 );
str2 = 0;
...

[...] It is purely a measure of defensive programming. I have learned these
idioms through working in larger teams with varying levels of expertise
with a lot of little updates to the code base. This has worked well
for my team. [...]
Indeed, it's good practice. But it's repetitive, wordy and perhaps not
as expressive as you might intend. How about:

printf ("str2: %s\n", strfForPrintf (str2));
safeFree (str2);

where you have the following macros defined:

#define strfForPrintf(str) ((str)?(str):"<NULL>")
#define safeFree(str) do { free (str); str = NULL; } while (0)
Richard Bos wrote:
Ben Pfaff <bl*@cs.stanford.edu> wrote:
You can free a null pointer any number of times you like. I
think that is what the "it" in "delete it more than once" means.
Yes, you can. My point was that relying on this capability by setting
pointers to null after you're done with them is a bad programming habit,
not a good one, because you _will_ get in the habit of calling free() on
any which pointer, with the reasoning that "it'll either be a valid
pointer or null".
You're reaching. In some programming environments you *wrap* your
calls to free(), or you use tools like purify etc, and you get status
reports about attempts to free NULL. I.e., you actually get visibility
into potential programming failures, as opposed to just suffering from
a double free which has UB, and can be difficult to track down or trace
without some effort.
[...] That habit will then bite you when you encounter an
exception; e.g., when you make a copy of a pointer and only set one copy
to null, or when you have to work with other people's code which doesn't
nullify any pointers.
The truly good programming habit, in this case, is to do your bleedin'
bookkeeping, and keep in mind which pointers you have free()d, and which
you haven't. [...]
If you follow your own logic for a second, don't you also have to do
everyone else's book keeping too, since you might be using other
people's code that doesn't nullify pointers? Following
standards/patterns and using safety mechanisms like NULLing pointers is
a far more scalable approach in terms of the number of programmers you
can sustain on a single project.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Sep 2 '06 #70
Keith Thompson said:

<free(p); p = NULL;>
I'm not going to say that I disagree with you, but I'm going to offer
a counterargument anyway.
Good man. :-)
Setting a free()d pointer to NULL can prevent certain kinds of errors
-- or rather, it can prevent the *symptoms* of certain kinds of
errors.
Well, it can actually prevent errors. If p is NULL, p's value is not
indeterminate. It's not actually /valid/, but it's testably invalid.
For example:

some_type *ptr = malloc(sizeof *ptr);
...
free(ptr);
...
/*
* Now I don't remember whether I called free(ptr) or not.
* I'll free it here, just in case.
*/
free(ptr);
Sloppy programming, of course. (I typed "Sloopy" on the first attempt, and
was half-tempted to leave it uncorrected.)
As written, the second call to free() invokes undefined behavior. In
fact, evaluating ptr in preparation for the call invokes UB; the call
to free() doesn't really have anything to do with it.
Quite so.
>
If the first free() call were replaced with:

free(ptr);
ptr = NULL;

then the second free() call wouldn't invoke UB. But it would *still*
(probably) be a symptom of a logical error, namely the failure to keep
track of whether you've already called free().
Sure, but now you have only one error instead of two, and as far as the
compiler is concerned the error is harmless. As far as the programmer's
boss is concerned, however, it may not be.
Setting ptr to NULL
after the first free() would mask the symptom of the error; leaving it
alone could *potentially* cause the second free() to blow up, making
the error easier to detect. (Or it could do nothing; such is the
nature of undefined behavior.)
Or it could destroy an entire continent, which is why I really really don't
like the idea. Better to have a deterministic program that you can debug
predictably, IMHO.
On the other hand, you could legitimately have reached the point of
the second free() by any of several paths, some of which have free()d
the pointer and some of which haven't. In that case, *if* all the
paths that free() it then set it to NULL, then free()ing it again is
harmless and sensible.
In practice, it's very easy to ensure (or rather, *almost* ensure) that all
paths do free it, meaning that there is no need to give it "one for luck".
I mean, of course, using an opaque type. Yes, it's true that some bonehead
can free(p); instead of TDestroy(&p); if he really really insists, which is
why I have to say "almost".
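The opaque-type approach might be sketched like this. Only the name TDestroy appears above; `TCreate` and the single `value` member are assumptions made to keep the example short. In a real project the struct definition would sit in one .c file and the header would expose only the typedef and the two functions.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct T T;   /* all that client code would normally see */

struct T {
    int value;        /* placeholder member, purely illustrative */
};

/* Hypothetical constructor: returns NULL if allocation fails. */
T *TCreate(int value)
{
    T *t = malloc(sizeof *t);

    if (t != NULL)
        t->value = value;
    return t;
}

/* Frees *pt and sets the caller's pointer to NULL, so that a second
 * call through the same variable is a harmless no-op. */
void TDestroy(T **pt)
{
    assert(pt != NULL);
    if (*pt != NULL) {
        free(*pt);
        *pt = NULL;
    }
}
```

Because destruction goes through the address of the caller's variable, the free and the nulling cannot be separated by accident, though other copies of the pointer remain the caller's responsibility.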

<snip>
In an ideal world, the question would never arise; you wouldn't make
that mistake in the first place, or at least you'd already have
corrected it.
Indeed. But since it's not an ideal world, it makes sense to make your
programs as debuggable as possible, and in my opinion that means keeping
them deterministic!
In software intended to be maximally robust (which, I hasten to add,
isn't *always* worth the effort), it probably makes sense to implement
this kind of defense in depth *and* to detect and log cases where
errors were caught by the second or later line of defense.
Right.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Sep 2 '06 #71
we******@gmail.com said:

<snip>
How about:

printf ("str2: %s\n", strfForPrintf (str2));
safeFree (str2);

where you have the following macros defined:

#define strfForPrintf(str) ((str)?(str):"<NULL>")
#define safeFree(str) do { free (str); str = NULL; } while (0)
Your second macro is badly named, I think. If it were truly a "safe Free",
it would be okay to invoke it like this: safeFree((void *)rand()); /* ! */

<snip>

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Sep 2 '06 #72
On Fri, 1 Sep 2006 17:37:22 UTC, Tak-Shing Chan <t.****@gold.ac.uk>
wrote:
On Fri, 1 Sep 2006, Richard Heathfield wrote:
jacob navia said:
CBFalconer wrote:
Revise the example:

T *p;

if ((p = malloc(n * sizeof *p)) != NULL) {
    T *q;

    q = p+1; p = NULL;
    dosomethingwith(q);
    q--;
    ....
}
The revised example will not fool the GC since the block can be reached
from q, since it points between the beginning and the end of the block.
<shrug> Finding ways to fool it is a moderately trivial exercise:

p = malloc(n * sizeof *p);
pos = ftell(fp);
fwrite(&p, sizeof p, 1, fp);
p = NULL;
fseek(fp, pos, SEEK_SET);
fread(&p, sizeof p, 1, fp);

Your rebuttal is worse than the disease, because:

(1) If fwrite or fread fails, any further use of p will
invoke undefined behaviour.
Like when you've written some other data. You've lost completely.
(2) Even with error checking inserted, you would still be
leaking memory when fread fails.
Writing data to disk is always risky. When you can't read it back
you're always in trouble. There is no difference between a pointer, a
float or a simple text string. You are clearly saying: avoid writing data to
disk whenever you need to read it back at some point.
(3) In general, writing pointers to files is always
nonportable.
No, it is always fully portable.

So, I am not sure why you are posting this in a
group that values portability.
What you mean is the case of writing a pointer to a file, exiting the
program, and then expecting a pointer read back from that file to always work.

To cite the twit:

void f(void) {
    int *p = GBmalloc(4711);

    ......

}

GBmalloc does NOT see that the allocated memory has become unused, because p
is not set to NULL, even as

void f(void) {

    void *p = GBmalloc(4711);
    ......

    p = NULL; /* invokes magically GBFree(p); */

}

is pure crap.

Moving pointers to other functions in other translation units will have
random effects with that garbage, especially when those other
translation units are called with static arrays as well as with
dynamically allocated arrays. It lies in the hands of the programmer
to define when free() should be called, because no GB will ever be
able to free it - except in an interpreter.

GB in C creates more problems than it can ever solve once the program is a
bit more complex than a hello-world one.

GB in C++ is erroneous enough. GB in C is completely senseless. It may
be a bit helpful in a C interpreter, but in truly complex applications
it is the main source of UB in any case.

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 German is here!
Sep 2 '06 #73
On Fri, 1 Sep 2006 17:57:28 UTC, ro******@ibd.nrc-cnrc.gc.ca (Walter
Roberson) wrote:
In article <87************@benpfaff.org>,
Ben Pfaff <bl*@cs.stanford.edu> wrote:
Writing a pointer to a file is *not* always non-portable. It is
portable to write a pointer to a file, read it back within the
same run of the program, and then use the pointer (as long as the
lifetime of the associated memory has not been reached).

Reference, please?
Practise.
My recollection is that the relevant section says only that
when something is written out to a binary file, that the same
binary value will be read back in. When, though, it comes to a pointer,
that doesn't promise that the reconstituted pointer points to anything.
Reference please.

The pointer will point to exactly the same memory location as it did
when it was written. What magic would make the memory unusable merely
because the pointer to it gets written to a file? Clearly the pointer
becomes useless when the program holding the data in memory dies. But
so long as the program is active in the same run, the memory holds
its content.
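
For what it's worth, the same-run case being argued about can be sketched as below. This is only an illustration: roundtrip_pointer is a made-up name, tmpfile() stands in for whatever file the program would really use, and whether the standard guarantees the result is exactly the point in dispute here. The return values of fwrite and fread are checked, and the caller keeps its own copy of the pointer, so a failed read neither leaks the block nor leaves a garbage pointer in use.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write the pointer value p to a temporary binary
   file, read it back within the same program run, and return the value
   that was read. Returns NULL on any I/O failure; the caller still
   holds the original pointer, so nothing is lost. */
void *roundtrip_pointer(void *p)
{
    void *q = NULL;
    FILE *fp = tmpfile();

    if (fp == NULL)
        return NULL;
    if (fwrite(&p, sizeof p, 1, fp) != 1) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);                 /* reposition between write and read */
    if (fread(&q, sizeof q, 1, fp) != 1)
        q = NULL;               /* never hand back a half-read value */
    fclose(fp);
    return q;
}
```

The caller compares the result against NULL before using it and frees the block through its own copy of the pointer.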
There is a section of the standard that talks about writing out
pointers and reading them back in, but that has to do with using
the %p printf() format, which Richard did not do.
You know the difference between binary and text files?

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 German is here!
Sep 2 '06 #74
On Fri, 1 Sep 2006 09:55:23 UTC, jacob navia <ja***@jacob.remcomp.fr>
wrote:
You said "setting a pointer to NULL tells the GC that the
associated memory
is free to reuse". Now you appear to be making a different claim - i.e.
that *all* pointers to that memory have to be set to NULL before the GC can
assume the associated memory is free to reuse. So now the programmer has to
*ensure* that, if he sets up multiple pointers that all end up at the same
address, he *has* to set *all* of them to NULL, to appease the garbage
collector; and if he misses even one, he'll have a memory leak.

Yes, this can be a problem with the GC.
There are more problems than solutions with GC in C.
As you may know, I am not selling anything here, and I am not saying
that the GC is a "solve it all" magic bullet.

The algorithm of a conservative GC is that if a block of memory
is *reachable* from any of the roots (and in this case a global pointer
is a root) then that block can't be reused.
Is the GC an interpreter, so that it can track each and every action on
the pointers it hands out? No? Then you have lost already.
Setting pointers to NULL after you are done with them is a harmless
operation, nothing will be destroyed unless it can be destroyed.

It is MUCH simpler than calling free() only ONCE for each block.
So your GC will either fail to release the area, guaranteeing memory
leaks, or it will try to free() static memory. It is easy to
provoke that.
If you have several dozen *ALIASES* for a memory block, as you did
in the code above, you must free the block ONLY ONCE, what can be
extremely tricky in a real life situation.
When you are a real programmer and not a stupid hacker it will always
be easy to have exactly one free() for each malloc(), even in
more complex code. When the lifetime of a memory block ends, free() it
and set the referencing pointer to NULL. You can have a million aliases
of a pointer - but if you are not just a stupid hacker you will
always know how to tell whether a given memory block is in use or
not. No magic needed, no unreliable GC needed.
Personally, I find it far more convenient to call free() when I'm done with
the memory block.

If you have visibility, as in this example, OK. But if you don't,
i.e. you are passing memory blocks around to all kinds of routines that
may store that pointer in structures, or pass it again around, it
can be extremely tricky to know how many aliases you already have
for that memory block and to invalidate ALL of them.
You are describing perfectly the situation in which your GC will fail
miserably.

Then, if I am in a tricksy part of the code and wish to
set pointers to NULL as a safety precaution, I can do that, and if I am in
a simple but performance-critical part of the code, I can choose not to do
it. The way you describe automatic garbage collection, I would be denied
those choices.

Setting a pointer to NULL is such a CHEAP operation in ALL machines
that it is not worth speaking about. It means zeroing a
memory location.
Boah, ey! As you say, it means setting the pointer to NULL, but not
free()ing a memory location that is used somewhere under conditions
hidden from a magic GC.

When your CRT is able to tell you how much memory it has handed out and
not yet gotten back, it is easy to fix forgotten free()s during the
debug phase. When the magic GC gives you the same message at such
points, you are mostly out of luck.

I've written lots of highly complex C programs using intensive
malloc(), realloc() and free(), always ending up with no memory leak. I
had to test lots of less complex C++ programs using GC and always
found lots of memory leaks, which ended in weeks of searching for the
causes and resolving them.

GC in C simply makes no sense.

Instead of hacking around blindly, you have to learn to write fail-safe
programs and let them run 24h/366d/year using exactly the
resources they need for the current work. You'll avoid GC and C++
whenever possible.

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 German is here!
Sep 2 '06 #75
On Fri, 1 Sep 2006 11:01:21 UTC, "Philip Potter"
<ph***********@xilinx.com> wrote:
Richard, while you have earned a lot of my respect and admiration through
your posting, I do feel that you are on a little bit of a crusade here. You
seem to be arguing that because garbage collection can't protect against all
memory leaks without care from the user, it is worthless. This is a somewhat
ridiculous argument - after all, manual memory management has precisely the
same property.
No. It is much easier to write error-free programs using
malloc()/free() than with a defective GC. A GC that can't handle
each and every piece of dynamic memory perfectly is unusable by
design.
GCs don't give you a license not to think, but I didn't see anybody arguing
that they do.
GC as it should be designed araises this claim. In C it will fail
always miserably, so it is useless.

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 German is here!
Sep 2 '06 #76
Richard Heathfield wrote:

[ snip all ]

Some time ago now I attacked the GC problem, as I saw it, with what I
called GE or Garbage Elimination.

It has to do with wrapping malloc, calloc, realloc and free such that
they all come to GE first. GE is basically a manager of a linked list of
allocation data, including addresses and sizes.

GE.h adds 'size_t size(void *p);' to the vocabulary, size of the allocation.

User calls to *alloc are simply recorded in the list and then passed to
their libc namesakes. Calls to free() are looked up in the list and if
found result in a call to libc free() and deletion from the list. If the
user calls free(x) and x is not in the list, we simply return, a NOP.

What do you think of it?
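
A minimal sketch of the scheme described above, for readers who want something concrete. This is not the actual GE code: the names ge_malloc, ge_size, ge_free and struct ge_node are invented for illustration, and only malloc is wrapped (calloc and realloc would follow the same pattern).

```c
#include <stdlib.h>

/* One list node per live allocation (hypothetical layout). */
struct ge_node {
    void *addr;
    size_t size;
    struct ge_node *next;
};

static struct ge_node *ge_head = NULL;

/* Record the allocation, then pass the request on to libc malloc. */
void *ge_malloc(size_t size)
{
    struct ge_node *n = malloc(sizeof *n);
    void *p;

    if (n == NULL)
        return NULL;
    p = malloc(size);
    if (p == NULL) {
        free(n);
        return NULL;
    }
    n->addr = p;
    n->size = size;
    n->next = ge_head;
    ge_head = n;
    return p;
}

/* Size of a recorded allocation, or 0 if p is not in the list. */
size_t ge_size(void *p)
{
    struct ge_node *n;

    for (n = ge_head; n != NULL; n = n->next)
        if (n->addr == p)
            return n->size;
    return 0;
}

/* Free p only if it is in the list.
   Returns 1 if the block was freed, 0 if p was unknown (a no-op). */
int ge_free(void *p)
{
    struct ge_node **pp;

    for (pp = &ge_head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->addr == p) {
            struct ge_node *dead = *pp;
            *pp = dead->next;   /* unlink the node */
            free(dead->addr);
            free(dead);
            return 1;
        }
    }
    return 0;
}
```

In this sketch ge_free reports whether it found the pointer instead of returning nothing, so the caller can choose to ignore, log or abort on an unknown pointer.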

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Sep 2 '06 #77
Joe Wright wrote:
Richard Heathfield wrote:

[ snip all ]

Some time ago now I attacked the GC problem, as I saw it, with what I
called GE or Garbage Elimination.

It has to do with wrapping malloc, calloc, realloc and free such that
they all come to GE first. GE is basically a manager of a linked list of
allocation data, including addresses and sizes.

GE.h adds 'size_t size(void *p);' to the vocabulary, size of the
allocation.

User calls to *alloc are simply recorded in the list and then passed to
their libc namesakes. Calls to free() are looked up in the list and if
found result in a call to libc free() and deletion from the list. If the
user calls free(x) and x is not in the list, we simply return, a NOP.

What do you think of it?
I am afraid you are actually repeating the code of malloc.
The system function malloc has been probably coded with great care to do
exactly that:

Find a free block in the list of free blocks!

You are just adding a layer that does the same.

jacob

Sep 2 '06 #78
Joe Wright wrote:
Richard Heathfield wrote:

[ snip all ]

Some time ago now I attacked the GC problem, as I saw it, with
what I called GE or Garbage Elimination.

It has to do with wrapping malloc, calloc, realloc and free such
that they all come to GE first. GE is basically a manager of a
linked list of allocation data, including addresses and sizes.

GE.h adds 'size_t size(void *p);' to the vocabulary, size of the
allocation.

User calls to *alloc are simply recorded in the list and then
passed to their libc namesakes. Calls to free() are looked up in
the list and if found result in a call to libc free() and deletion
from the list. If the user calls free(x) and x is not in the list,
we simply return, a NOP.

What do you think of it?
All that is usually available, in some sort of system dependent
manner, in any malloc package. You could examine my nmalloc and
its associated maldbg package in nmalloc.zip, available at:

<http://cbfalconer.home.att.net/download/>

bearing in mind that the package is designed for use with DJGPP,
but can probably be easily ported to many systems.

The problem is that handling the malloced pointers is not enough.
You also have to be able to handle all the pointers created within
any C program by such things as the & operator, and by automatic
conversion of array references. To include these you have to get
into the warp and woof of the complete compiler/code-generator, and
you will have significant efficiency losses since all pointer
references will have to be indirect, through a system table.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Sep 2 '06 #79
jacob navia wrote:
Joe Wright wrote:
>Richard Heathfield wrote:

[ snip all ]

Some time ago now I attacked the GC problem, as I saw it, with what I
called GE or Garbage Elimination.

It has to do with wrapping malloc, calloc, realloc and free such that
they all come to GE first. GE is basically a manager of a linked list
of allocation data, including addresses and sizes.

GE.h adds 'size_t size(void *p);' to the vocabulary, size of the
allocation.

User calls to *alloc are simply recorded in the list and then passed
to their libc namesakes. Calls to free() are looked up in the list and
if found result in a call to libc free() and deletion from the list.
If the user calls free(x) and x is not in the list, we simply return,
a NOP.

What do you think of it?
I think that if an invalid pointer is passed it should abort the program
since something serious has gone wrong and you don't know how much else
has been messed up.
I am afraid you are actually repeating the code of malloc.
The system function malloc has been probably coded with great care to do
exactly that:

Find a free block in the list of free blocks!

You are just adding a layer that does the same.

No, his layer does *not* do the same as the C library. Most
implementations will crash at some point after you have called free with
an invalid pointer. Joe Wright's wrapper will prevent that.
--
Flash Gordon
Sep 2 '06 #80
jacob navia <ja***@jacob.remcomp.fr> writes:
Joe Wright wrote:
>Richard Heathfield wrote:
[ snip all ]
Some time ago now I attacked the GC problem, as I saw it, with what
I called GE or Garbage Elimination.
It has to do with wrapping malloc, calloc, realloc and free such
that they all come to GE first. GE is basically a manager of a
linked list of allocation data, including addresses and sizes.
GE.h adds 'size_t size(void *p);' to the vocabulary, size of the
allocation.
User calls to *alloc are simply recorded in the list and then passed
to their libc namesakes. Calls to free() are looked up in the list
and if found result in a call to libc free() and deletion from the
list. If the user calls free(x) and x is not in the list, we simply
return, a NOP.
What do you think of it?

I am afraid you are actually repeating the code of malloc.
The system function malloc has been probably coded with great care to
do exactly that:

Find a free block in the list of free blocks!

You are just adding a layer that does the same.
But that layer in the system malloc() is not exposed to the user. The
only *portable* way to do this kind of thing is to add a wrapper
around *alloc() and free(). Such a wrapper can detect errors that
malloc() can't (such as free()ing the same pointer twice).

It could significantly hurt performance if you keep track of
allocations with a simple linear linked list.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Sep 2 '06 #81
>Some time ago now I attacked the GC problem, as I saw it, with what I
>called GE or Garbage Elimination.

It has to do with wrapping malloc, calloc, realloc and free such that
they all come to GE first. GE is basically a manager of a linked list of
allocation data, including addresses and sizes.

GE.h adds 'size_t size(void *p);' to the vocabulary, size of the allocation.

User calls to *alloc are simply recorded in the list and then passed to
their libc namesakes. Calls to free() are looked up in the list and if
found result in a call to libc free() and deletion from the list. If the
user calls free(x) and x is not in the list, we simply return, a NOP.

What do you think of it?
I'd like to see it have a debugging mode with these characteristics:

(1) Attempting to free something not in the list results in a call to abort().
Or perhaps just logs it.

(2) Allocated memory is initialized to a machine-specific bit pattern
that matches neither all bits 0 nor the bit pattern of a null
pointer, and which is likely to cause program aborts when used as
a pointer accidentally. It should be pessimally aligned. This
pattern should be known to users of the platform so it can easily
be spotted in memory dumps by a debugger. Example: on 32-bit
machines where a null pointer does not use this pattern (no,
0xdeadbeef is *NOT* always used as the bit pattern for a null
pointer, even on machines with 32-bit pointers): 0xdeadbeef.

(3) Deallocated memory is scribbled on with a machine-specific bit
pattern different from the one used in (2) above. The objective
here is to find errors like:

free(foo);
free(foo->next);
where freed memory is used. Example: on 32-bit machines where a
null pointer does not use this pattern: 0xf3eebee1.

Another useful mode is to have (2) and (3) use random bit patterns.
Also, code like this could easily have a memory-leak-detection feature.
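
Points (2) and (3) might look roughly like the following. Treat it as a sketch under assumptions: dbg_malloc, dbg_free and union dbg_hdr are hypothetical names, the size is kept in a small header in front of the block so dbg_free knows how much to scribble on, and the union is only an attempt to keep the user area suitably aligned. The patterns are written byte by byte, so they read as 0xdeadbeef and 0xf3eebee1 in a memory dump regardless of byte order.

```c
#include <stdlib.h>
#include <string.h>

/* Header stored in front of every block. The extra members are there
   only to pad the header to a plausible alignment (an assumption). */
union dbg_hdr {
    size_t size;
    long double ld;
    void *vp;
    long l;
};

static const unsigned char fill_alloc[4] = { 0xde, 0xad, 0xbe, 0xef };
static const unsigned char fill_free[4]  = { 0xf3, 0xee, 0xbe, 0xe1 };

/* Repeat the 4-byte pattern pat over n bytes starting at dst. */
static void fill(unsigned char *dst, size_t n, const unsigned char *pat)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = pat[i % 4];
}

/* Allocate size bytes and initialise them to the "fresh" pattern. */
void *dbg_malloc(size_t size)
{
    union dbg_hdr *h;

    if (size > (size_t)-1 - sizeof *h)
        return NULL;            /* header + size would overflow */
    h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    fill((unsigned char *)(h + 1), size, fill_alloc);
    return h + 1;
}

/* Scribble the "freed" pattern over the block, then release it. */
void dbg_free(void *p)
{
    union dbg_hdr *h;

    if (p == NULL)
        return;
    h = (union dbg_hdr *)p - 1;
    fill(p, h->size, fill_free);
    free(h);
}
```

An optimizing compiler may be entitled to drop the scribble in dbg_free because nothing reads the bytes afterwards, so a real debugging build would probably need to guard against that.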

Sep 3 '06 #82
Keith Thompson wrote:
jacob navia <ja***@jacob.remcomp.fr> writes:
>Joe Wright wrote:
>>Richard Heathfield wrote:
[ snip all ]
Some time ago now I attacked the GC problem, as I saw it, with what
I called GE or Garbage Elimination.
It has to do with wrapping malloc, calloc, realloc and free such
that they all come to GE first. GE is basically a manager of a
linked list of allocation data, including addresses and sizes.
GE.h adds 'size_t size(void *p);' to the vocabulary, size of the
allocation.
User calls to *alloc are simply recorded in the list and then passed
to their libc namesakes. Calls to free() are looked up in the list
and if found result in a call to libc free() and deletion from the
list. If the user calls free(x) and x is not in the list, we simply
return, a NOP.
What do you think of it?
I am afraid you are actually repeating the code of malloc.
The system function malloc has been probably coded with great care to
do exactly that:

Find a free block in the list of free blocks!

You are just adding a layer that does the same.

But that layer in the system malloc() is not exposed to the user. The
only *portable* way to do this kind of thing is to add a wrapper
around *alloc() and free(). Such a wrapper can detect errors that
malloc() can't (such as free()ing the same pointer twice).

It could significantly hurt performance if you keep track of
allocations with a simple linear linked list.
I could use some sort of hash I guess but according to the second rule
(for experts only) I haven't optimized it yet. :-)

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Sep 3 '06 #83
av
On Sat, 02 Sep 2006 20:34:19 GMT, Keith Thompson wrote:
>jacob navia <ja***@jacob.remcomp.fr> writes:
>Joe Wright wrote:
>>Richard Heathfield wrote:
[ snip all ]
Some time ago now I attacked the GC problem, as I saw it, with what
I called GE or Garbage Elimination.
It has to do with wrapping malloc, calloc, realloc and free such
that they all come to GE first. GE is basically a manager of a
linked list of allocation data, including addresses and sizes.
GE.h adds 'size_t size(void *p);' to the vocabulary, size of the
allocation.
User calls to *alloc are simply recorded in the list and then passed
to their libc namesakes. Calls to free() are looked up in the list
and if found result in a call to libc free() and deletion from the
list. If the user calls free(x) and x is not in the list, we simply
return, a NOP.
What do you think of it?

I am afraid you are actually repeating the code of malloc.
The system function malloc has been probably coded with great care to
do exactly that:

Find a free block in the list of free blocks!

You are just adding a layer that does the same.

But that layer in the system malloc() is not exposed to the user. The
only *portable* way to do this kind of thing is to add a wrapper
around *alloc() and free(). Such a wrapper can detect errors that
malloc() can't (such as free()ing the same pointer twice).

It could significantly hurt performance if you keep track of
allocations with a simple linear linked list.
Wrong: in 90% of cases (the way malloc and free should be used), e.g.
something like

{ p1=malloc(n1);
p2=malloc(n2);
p3=malloc(n3);
p4=malloc(n4);
...NO MALLOC HERE
free(p4);
free(p3);
free(p2);
free(p1);
}

the search for p1, p2, p3, p4 is O(1) if the search goes like

void free(void *k)
{
    int i;
    ...
    for (i = last; i >= 0; --i)
        if (vettore[i] == k) break;
    if (i < 0) /* error */;

}
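
To make the claim above checkable, here is a small self-contained sketch of a newest-first search. The fixed-size table and the names reg_add and reg_remove are assumptions made for the example, not part of anybody's posted code. reg_remove reports how many entries it had to compare, so the cost for a given order of frees can be observed directly.

```c
#include <stddef.h>

#define REG_MAX 1024

static void *reg_ptrs[REG_MAX];  /* recorded pointers, oldest first */
static int reg_count = 0;

/* Record p. Returns 0 on success, -1 if the table is full. */
int reg_add(void *p)
{
    if (reg_count >= REG_MAX)
        return -1;
    reg_ptrs[reg_count++] = p;
    return 0;
}

/* Remove p, scanning from the newest entry backwards.
   Returns the number of entries compared, or -1 if p was not found. */
int reg_remove(void *p)
{
    int i, j;

    for (i = reg_count - 1; i >= 0; --i) {
        if (reg_ptrs[i] == p) {
            int steps = reg_count - i;   /* comparisons made so far */
            for (j = i; j < reg_count - 1; ++j)
                reg_ptrs[j] = reg_ptrs[j + 1];
            --reg_count;
            return steps;
        }
    }
    return -1;
}
```

When blocks are released in the reverse order of allocation, every lookup succeeds on the first comparison; releasing the oldest block first costs one comparison per live entry.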
Sep 3 '06 #84
av
On Sat, 02 Sep 2006 20:03:01 +0100, Flash Gordon wrote:
>jacob navia wrote:
>Joe Wright wrote:
>>Richard Heathfield wrote:

[ snip all ]

Some time ago now I attacked the GC problem, as I saw it, with what I
called GE or Garbage Elimination.

It has to do with wrapping malloc, calloc, realloc and free such that
they all come to GE first. GE is basically a manager of a linked list
of allocation data, including addresses and sizes.

GE.h adds 'size_t size(void *p);' to the vocabulary, size of the
allocation.

User calls to *alloc are simply recorded in the list and then passed
to their libc namesakes. Calls to free() are looked up in the list and
if found result in a call to libc free() and deletion from the list.
If the user calls free(x) and x is not in the list, we simply return,
a NOP.
For me, freeing a pointer that was not returned by malloc is an error
that has to be signaled.
>>What do you think of it?

I think that if an invalid pointer is passed it should abort the program
since something serious has gone wrong and you don't know how much else
has been messed up.
>I am afraid you are actually repeating the code of malloc.
The system function malloc has been probably coded with great care to do
exactly that:

Find a free block in the list of free blocks!

You are just adding a layer that does the same.


No, his layer does *not* do the same as the C library. Most
implementations will crash at some point after you have called free with
an invalid pointer. Joe Wright's wrapper will prevent that.
For me, a crash that shows an error is better than silently running
wrong.
Sep 3 '06 #85

In article <ln************@nuthaus.mib.org>, Keith Thompson <ks***@mib.org> writes:
Joe Wright wrote:
Some time ago now I attacked the GC problem, as I saw it, with what
I called GE or Garbage Elimination.
It has to do with wrapping malloc, calloc, realloc and free such
that they all come to GE first. GE is basically a manager of a
linked list of allocation data, including addresses and sizes.
GE.h adds 'size_t size(void *p);' to the vocabulary, size of the
allocation.
User calls to *alloc are simply recorded in the list and then passed
to their libc namesakes. Calls to free() are looked up in the list
and if found result in a call to libc free() and deletion from the
list. If the user calls free(x) and x is not in the list, we simply
return, a NOP.
My inclination would be to have an optional callback for this and
other error conditions; that would let the library user log the
error, or abort, or whatever seems appropriate for the application.

But the general idea is a good one, and many of the projects I've
worked on had similar allocation wrappers. (They also often support
things like generating allocation reports at program termination, to
help detect leaks.)
It could significantly hurt performance if you keep track of
allocations with a simple linear linked list.
It *could*, particularly due to caching effects (linked lists generally
have poor cache behavior), but there's ample room for optimization if
that becomes an issue, and I suspect that for most C programs it
won't. If a performance-critical program is doing dynamic memory
allocation in the inner loop, that suggests the program might stand a
bit of optimization.

IME, C programs tend to do significantly less dynamic allocation than
programs written in languages with more dynamic-allocation sugar,
presumably for the obvious reason - the programmer has to write all
the allocation code explicitly.

And the tradeoff advantages are sufficient to justify a significant
performance cost in many cases anyway; for example, just duplicating
the malloc housekeeping data somewhere not adjacent to the allocated
storage itself, and using it as a preliminary check, prevents dup-free
heap-smashing attacks, which have been very successful against a wide
range of programs. It's a cheap security measure.

--
Michael Wojcik mi************@microfocus.com

Poe said that poetry was exact.
But pleasures are mechanical
and know beforehand what they want
and know exactly what they want. -- Elizabeth Bishop
Sep 3 '06 #86
av <av@ala.a> writes:
On Sat, 02 Sep 2006 20:34:19 GMT, Keith Thompson wrote:
>>jacob navia <ja***@jacob.remcomp.fr> writes:
[...]
>>Find a free block in the list of free blocks!

You are just adding a layer that does the same.

But that layer in the system malloc() is not exposed to the user. The
only *portable* way to do this kind of thing is to add a wrapper
around *alloc() and free(). Such a wrapper can detect errors that
malloc() can't (such as free()ing the same pointer twice).

It could significantly hurt performance if you keep track of
allocations with a simple linear linked list.

Wrong: in 90% of cases (the way malloc and free should be used), e.g.
something like

{ p1=malloc(n1);
p2=malloc(n2);
p3=malloc(n3);
p4=malloc(n4);
...NO MALLOC HERE
free(p4);
free(p3);
free(p2);
free(p1);
}

the search of p1, p2, p3, p4 is O(1) if there is a search like
[snip]

Yes, that can happen in *some* cases -- which is why I wrote that it
*could* significantly hurt performance.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Sep 3 '06 #87
Joe Wright wrote:
Richard Heathfield wrote:
Some time ago now I attacked the GC problem, as I saw it, with what I
called GE or Garbage Elimination.

It has to do with wrapping malloc, calloc, realloc and free such that
they all come to GE first. GE is basically a manager of a linked list of
allocation data, including addresses and sizes.

GE.h adds 'size_t size(void *p);' to the vocabulary, size of the allocation.

User calls to *alloc are simply recorded in the list and then passed to
their libc namesakes. Calls to free() are looked up in the list and if
found result in a call to libc free() and deletion from the list. If the
user calls free(x) and x is not in the list, we simply return, a NOP.

What do you think of it?
This does not solve the general problem of garbage collection. GC
typically eliminates the need for free() by automatically recycling
memory any time that memory stops being referenced by any pointer that
is ultimately reachable from within the program's current run state.
From what I can tell, what you've done is simply enhanced the memory
model by adding a size(void *p) function, and have a method for
detecting bogus calls to free() (including double frees.) There is no
end of ways to enhance C's memory model, and this is certainly one of
them, but this is not GC. True GCs are somewhat difficult to implement
in C and ultimately rely on platform specific behavior (to access all
live run-time autos, and all writable data areas, for example). Though
they certainly exist (the Boehm GC mechanism, for example.)

That all being said, I fully endorse the idea of enhancing C's memory
model. A big reason why people have run away from C and gone to other
languages is because the whole malloc/free programming model is very
hard to sustain, especially given the bare-bones support that the
language gives you. But in doing such enhancements, you should not
target an attempt to duplicate GC, since that truly changes programming
paradigms -- C has way too much cruft in it to support a true paradigm
shift without substantial change to the language (comparable to what
C++ or Objective C did.)

What you want to do is to go ahead and embrace C's basic way of doing
things (after all, that's the environment we are in), but attack all its
main weaknesses. I.e., when a GC advocate says "C's memory model is
bad because of reason/example x" you want to be able to respond, "No,
that's easily detectable and correctable with an enhanced C memory
manager that does y". With this in mind, let us return to your "GE"
enhancement.

In your case, rather than *IGNORING* attempts to free garbage, you
should generate some sort of diagnostic to provide the programmer with
information telling them that they did something wrong. In fact you
can do this:

/* In some include file somewhere, say "estdlib.h" */
#define free(x) free_enhanced ((x), __FILE__, __LINE__)
void free_enhanced (void *, const char *, int);

/* In some module, say estdlib.c */
void free_enhanced (void * ptr, const char * file, int line) {
    if (wasLegallyAllocated (ptr))
        specialfree (ptr); /* find the real header, then free */
    else
        diagnostic_badfree (file, line, ptr);
}

So that in your error log, or message or whatever you decide to do, you
know exactly which call to free is failing. It's very important for the
programmer to know that this bad thing is going on in his/her program,
as this may be symptomatic of more serious problems in the program, and
simply avoiding this one anomaly is likely to be a bandage that just
doesn't do the surgeon's job.

Another really easy to implement feature is to simply keep track of the
total amount of memory currently allocated, as well as the lifetime
maximum allocated by the program. You could then provide two simple
functions that just returned these values. These are usually
sufficient hints for most programmers to know if they are leaking
memory or not.
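
As a rough illustration of the two counters just described (not taken from any existing library; trk_malloc, trk_free, trk_bytes_in_use and trk_bytes_peak are names made up here), the size of each block can be kept in a small header so that the free wrapper knows how much to subtract:

```c
#include <stdlib.h>

/* Header in front of each block. The extra members only pad it to a
   plausible alignment for the user area (an assumption). */
union trk_hdr {
    size_t size;
    long double ld;
    void *vp;
    long l;
};

static size_t trk_current = 0;  /* bytes currently allocated */
static size_t trk_peak = 0;     /* lifetime maximum of trk_current */

void *trk_malloc(size_t size)
{
    union trk_hdr *h;

    if (size > (size_t)-1 - sizeof *h)
        return NULL;            /* header + size would overflow */
    h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    trk_current += size;
    if (trk_current > trk_peak)
        trk_peak = trk_current;
    return h + 1;
}

void trk_free(void *p)
{
    union trk_hdr *h;

    if (p == NULL)
        return;
    h = (union trk_hdr *)p - 1;
    trk_current -= h->size;
    free(h);
}

size_t trk_bytes_in_use(void) { return trk_current; }
size_t trk_bytes_peak(void)   { return trk_peak; }
```

If trk_bytes_in_use() is not back to zero (or to a known baseline) at a point where everything should have been released, something has probably leaked.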

As long as you are tracking all allocations in a linked list, you also
might as well provide a memory traversal mechanism. I.e., an ability
to walk through all allocated memory locations. Why would you do this?
Well you would do it in conjunction with an enhancement that tracked
*where* each allocation came from:

/* In some include file somewhere, say "estdlib.h" */
#define malloc(x) malloc_enhanced ((x), __FILE__, __LINE__)
void * malloc_enhanced (size_t, const char *, int);

/* In some module, say estdlib.c */
struct enhancedMemHdr {
    struct enhancedMemHdr * linkNext;
    size_t sz;
    const char * moduleOrg;
    int lineNumberOrg;
    char mem[1]; /* struct hack */
};

void * malloc_enhanced (size_t sz, const char * file, int line) {
    struct enhancedMemHdr * ptr;
    if (!sz) {
        diagnostic_badmalloc (file, line, sz);
        return NULL;
    }
    /* store sz, file & line as well: */
    ptr = specialmalloc (sz, file, line);
    if (!ptr) {
        diagnostic_outofmemory (file, line, sz); /* don't abort */
        return NULL; /* never dereference a null header */
    }
    return ptr->mem; /* ptr is a pointer, so -> rather than . */
}

So the point is not to look into the memory itself while you walk the
allocations (you wouldn't have type information, so that would be kind
of useless) but rather you would be interested in *where* the memory
got allocated. So you could do some simple statistics to figure out
where most of your memory was being allocated.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Sep 3 '06 #88
"Herbert Rosenau" <os****@pc-rosenau.dewrote in message
news:wm***************************@JUPITER1.PC-ROSENAU.DE...
On Fri, 1 Sep 2006 11:01:21 UTC, "Philip Potter"
<ph***********@xilinx.com> wrote:
Richard, while you have earned a lot of my respect and admiration
through
your posting, I do feel that you are on a little bit of a crusade here.
You
seem to be arguing that because garbage collection can't protect against
all
memory leaks without care from the user, it is worthless. This is a
somewhat
ridiculous argument - after all, manual memory management has precisely
the
same property.

No. It is much easier to write error-free programs using
malloc()/free() than with a defective GC. A GC that can't handle
each and every piece of dynamic memory perfectly is unusable by
design.
GCs don't give you a license not to think, but I didn't see anybody
arguing
that they do.

GC as it should be designed araises this claim.
I'm not sure what this sentence means. Assuming "araises" is a typo for
"raises", you're changing the definition of a GC to be something which is
impossible to implement, and then rebutting it. It's a straw man argument.

The fact of the matter is that GCs are not inherently broken. Java and LISP
programs do actually work without crashing, provided the programmer
understands what they are doing. (And if the programmer doesn't know what
they are doing, well, all is lost.)
In C it will fail
always miserably, so it is useless.
I'm prepared to talk about whether or not GCs are suited to C, but blind
assertions do little to convince me that they are not.

I personally feel that GCs are suited to some forms of programming but not
others. A programmer using a GC still has to take care - but less care IMHO
than one using malloc()/free(); GCs reduce development time and maintenance
costs, at the cost of less efficient memory management and bigger runtime
footprint - lower performance, in short.

If this isn't what the Standards committee thinks C is about, then I'm
not going to argue. But I'm also not going to argue with anybody who plugs a
GC into C for their own personal use.

Philip

Sep 4 '06 #89
we******@gmail.com wrote:

[ snipped ]

Thanks Paul for a very thoughtful and helpful critique of GE. Some of
your suggested enhancements are provided for. GE.h has a new prototype
allowing 'void * Free(void *);' and provides a typedef of the node
structure for the list. Free returns a pointer to the beginning of the
list. Also besides 'size_t size(void *)' there is 'size_t sizeall(void)'
which sums all the sizes.

Also 'void freeall(void)' which trips through the list freeing each
allocation and removing its node in turn. I think I have to revisit this
function to ensure allocations get freed in reverse order so that I
don't free **a before I free all the *a pointers.
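
A minimal sketch of the idea, for readers following along. This is not the actual GE code: the names track_node, track_alloc, track_sizeall and track_freeall are made up for illustration. Each new allocation is pushed onto the head of a singly linked list, so walking the list from the head frees the newest allocation first, which gives the reverse order mentioned above.

```c
#include <stdlib.h>

/* Hypothetical bookkeeping node; not the real GE.h layout. */
struct track_node {
    struct track_node *next;
    void *mem;
    size_t size;
};

static struct track_node *track_head = NULL;

/* Allocate size bytes and record the block at the head of the list. */
void *track_alloc(size_t size)
{
    struct track_node *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->mem = malloc(size);
    if (node->mem == NULL) {
        free(node);
        return NULL;
    }
    node->size = size;
    node->next = track_head;
    track_head = node;
    return node->mem;
}

/* Sum of the sizes of all blocks that are still tracked. */
size_t track_sizeall(void)
{
    size_t total = 0;
    for (const struct track_node *n = track_head; n != NULL; n = n->next)
        total += n->size;
    return total;
}

/* Free every tracked block, newest first; return how many were freed. */
size_t track_freeall(void)
{
    size_t count = 0;
    while (track_head != NULL) {
        struct track_node *node = track_head;
        track_head = node->next;
        free(node->mem);
        free(node);
        count++;
    }
    return count;
}
```

Since the list itself holds every pointer, track_freeall never has to read a[i] out of an already freed block, so in this sketch the order matters less than it does when freeing a nested structure by hand.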

Thanks again.

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Sep 4 '06 #90
On Sun, 3 Sep 2006 22:58:39 UTC, we******@gmail.com wrote:
/* In some module, say estdlib.c */
struct enhancedMemHdr {
    struct enhancedMemHdr * linkNext;
    size_t sz;
    const char * moduleOrg;
    int lineNumberOrg;
    char mem[1]; /* struct hack */
};
It's faulty. It does not help against the worst kinds of pointer
mishandling. It will not even catch things such as:

1. It doesn't protect against memory overwrites. An overwrite will destroy
the bookkeeping that the underlying memory manager knows nothing about,
AND may or may not destroy malloc's own bookkeeping too.

char *p = malloc(1000);
char *p2 = p - 3;    /* points before the start of the block */
strcpy(p2, "I'm destroying memory management now");

or

long *p = malloc(400);
long *p2 = p + 100;  /* with a 4-byte long: one past the end of the block */
strcpy((char *)p2,
       "I'm writing behind the size of the allocated memory block now");

In both cases the crash will occur some 345 calls to other functions
later.

2. When you have to make 1,000,000 calls to get all the memory you need,
you need 20,000,000 bytes of extra memory. In a program I maintain, that
costs exactly the amount of memory the program needs for its primitive
job of storing all of its data: it relies heavily on short structs that
fill nested lists of nested lists of nested lists of nested lists of
structs, besides a number of dynamically allocated buffers ranging from
a few bytes up to some megabytes each. That means 20 MB of memory is
not dispensable.

I've written my own c/m/realloc() to save 8 to 12 of the 16 bytes that my
standard memory manager uses for its internal bookkeeping on each
requested block. That gives me a higher number of requestable blocks:
since the majority of memory blocks are smaller than 20, 40 or 120 bytes,
spending 16 bytes on each one purely for management is too much.
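
For readers who want to see how per-block overhead can be avoided altogether, here is a minimal sketch of a fixed-size pool. It is not the allocator described above, and every name and constant in it (pool_block, pool_alloc, pool_free, POOL_BLOCK_SIZE, POOL_BLOCK_COUNT) is an assumption made for illustration. A free block stores the free-list link inside its own payload, so a block in use carries no header at all.

```c
#include <stddef.h>

#define POOL_BLOCK_SIZE  32    /* assumed payload size of every block */
#define POOL_BLOCK_COUNT 1024  /* assumed capacity of the pool */

/* A block is either on the free list or handed out, never both,
   so the link can share storage with the payload. */
union pool_block {
    union pool_block *next_free;            /* valid only while free */
    unsigned char payload[POOL_BLOCK_SIZE]; /* valid only while in use */
    max_align_t force_alignment;            /* C11: strictest alignment */
};

static union pool_block pool_storage[POOL_BLOCK_COUNT];
static union pool_block *pool_free_list = NULL;
static size_t pool_next_unused = 0;

/* Return one block of POOL_BLOCK_SIZE bytes, or NULL if the pool is full. */
void *pool_alloc(void)
{
    if (pool_free_list != NULL) {
        union pool_block *block = pool_free_list;
        pool_free_list = block->next_free;
        return block;
    }
    if (pool_next_unused < POOL_BLOCK_COUNT)
        return &pool_storage[pool_next_unused++];
    return NULL;
}

/* Give a block obtained from pool_alloc() back to the pool. */
void pool_free(void *ptr)
{
    if (ptr == NULL)
        return;
    union pool_block *block = ptr;
    block->next_free = pool_free_list;
    pool_free_list = block;
}
```

The trade-off is that the pool only serves one block size and cannot tell a foreign pointer from one of its own, so it suits the many-small-structs case and nothing else.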

3. The risk that your struct ends up returning a pointer to an
unaligned address is high.
Even on 32-bit machines, alignment to an address divisible by 4
may not be enough. What if a long double needs an alignment of
8 instead of 4?

It's not even portable, because nothing says that an alignment of 6
or 7 is not a requirement for some data or pointer size. c/m/realloc()
are required to return a pointer suitably aligned for every
possible data type.
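
One possible way around the alignment concern, sketched under the assumption of a C11 compiler: pad the header up to a multiple of alignof(max_align_t), so that the address handed to the caller keeps the alignment malloc() itself guarantees. (C11 defines valid alignments as powers of two.) The names mem_hdr, HDR_SPACE, hdr_malloc, hdr_of and hdr_free are hypothetical and are not part of the code under discussion.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical debug header kept in front of every user block. */
struct mem_hdr {
    size_t sz;
    const char *file;
    int line;
};

/* Header size rounded up to a multiple of the strictest alignment. */
#define HDR_SPACE \
    ((sizeof(struct mem_hdr) + alignof(max_align_t) - 1) \
     / alignof(max_align_t) * alignof(max_align_t))

/* Allocate sz user bytes behind a padded header; NULL on failure. */
void *hdr_malloc(size_t sz, const char *file, int line)
{
    if (sz > SIZE_MAX - HDR_SPACE)
        return NULL;                      /* size computation would wrap */
    unsigned char *raw = malloc(HDR_SPACE + sz);
    if (raw == NULL)
        return NULL;
    struct mem_hdr *hdr = (struct mem_hdr *)raw;
    hdr->sz = sz;
    hdr->file = file;
    hdr->line = line;
    return raw + HDR_SPACE;               /* same alignment as raw */
}

/* Recover the header belonging to a pointer from hdr_malloc(). */
const struct mem_hdr *hdr_of(const void *user)
{
    return (const struct mem_hdr *)((const unsigned char *)user - HDR_SPACE);
}

/* Release a block obtained from hdr_malloc(). */
void hdr_free(void *user)
{
    if (user != NULL)
        free((unsigned char *)user - HDR_SPACE);
}
```

The padding costs a few extra bytes per block, which is exactly the overhead objected to in point 2, so this only answers the alignment objection.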

void * malloc_enhanced (size_t sz, const char * file, int line) {
    struct enhancedMemHdr * ptr;
    if (!sz) {
        diagnostic_badmalloc (file, line, sz);
        return NULL;
    }
    /* store sz, file & line as well: */
    ptr = specialmalloc (sz, file, line);
    if (!ptr) {
        diagnostic_outofmemory (file, line, sz); /* don't abort */
        return NULL; /* never dereference a failed allocation */
    }
    return ptr->mem;
}
Too expensive in time for practical use, where memory allocation takes
more than half of the runtime needed to create the structures that
store the data in an ordered manner.
So the point is not to look into the memory itself while you walk the
allocations (you wouldn't have type information, so that would be kind
of useless) but rather you would be interested in *where* the memory
got allocated. So you could do some simple statistics to figure out
where your memory was mostly being allocated for.
To make it possible to detect memory overwrites you have to change
your struct:
1. Don't use the struct hack. Use a separate memory area for your
memory management, one that is truly separate from the area you give to
your users. That means you have to build an array of n structs to
get space for the management data, and you have to build a list of
those arrays so that it can be extended when an array gets full and
more memory is requested.

You have to spend more and more runtime to
- find a free element in your management structure for reuse
- find the management struct again to free it after you have free()d
the memory the user wants to free.
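
The separate-table idea can be sketched roughly as follows. This is only an illustration with made-up names (side_rec, side_table, side_malloc, side_free, side_size, SIDE_MALLOC) and a fixed capacity instead of the extensible list of arrays described above. The linear search in side_find() is the runtime cost the two bullet points refer to.

```c
#include <stdlib.h>

#define SIDE_TABLE_SLOTS 256   /* assumed fixed capacity for the sketch */

/* Bookkeeping lives here, away from the memory given to the user. */
struct side_rec {
    void *ptr;                 /* NULL marks an unused slot */
    size_t sz;
    const char *file;
    int line;
};

static struct side_rec side_table[SIDE_TABLE_SLOTS];

/* Linear search for the record of ptr; searching for NULL finds a free slot. */
static struct side_rec *side_find(const void *ptr)
{
    for (size_t i = 0; i < SIDE_TABLE_SLOTS; i++)
        if (side_table[i].ptr == ptr)
            return &side_table[i];
    return NULL;
}

/* Allocate sz bytes and record where the request came from. */
void *side_malloc(size_t sz, const char *file, int line)
{
    struct side_rec *slot = side_find(NULL);
    if (slot == NULL || sz == 0)
        return NULL;
    void *ptr = malloc(sz);
    if (ptr == NULL)
        return NULL;
    slot->ptr = ptr;
    slot->sz = sz;
    slot->file = file;
    slot->line = line;
    return ptr;
}

/* Size recorded for ptr, or 0 if ptr is not a tracked block. */
size_t side_size(const void *ptr)
{
    const struct side_rec *slot = (ptr != NULL) ? side_find(ptr) : NULL;
    return (slot != NULL) ? slot->sz : 0;
}

/* Free a tracked block; return 1 on success, 0 if ptr is unknown. */
int side_free(void *ptr)
{
    struct side_rec *slot = (ptr != NULL) ? side_find(ptr) : NULL;
    if (slot == NULL)
        return 0;
    free(ptr);
    slot->ptr = NULL;
    return 1;
}

#define SIDE_MALLOC(sz) side_malloc((sz), __FILE__, __LINE__)
```

Because the records sit outside the user's blocks, an overrun cannot trample them, and side_free() can at least reject a pointer it never handed out. A hash table keyed on the pointer would cut the search cost, at the price of more code.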
An advanced C programmer needs no nanny to get memory management
right. A beginner will learn that quickly or should use C++ and new
instead.

And a nanny that does not really help in catching faulty memory
writes is no help either. Faulty here means writing beyond the limits
of an allocated block, either before its start or after its end.

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of German eComStation
eComStation 1.2 in German is here!
Sep 4 '06 #91
av
On Mon, 4 Sep 2006 18:25:34 +0000 (UTC), Herbert Rosenau wrote:

>p = malloc(1000)
p2 = p - 3;
memcpy(p2, "I'm destroying memory management now");
In this case, when p is freed my program knows that something is wrong
and I take the default exit.
>or

long p = malloc(400);
p2 = p + 100;
strcpy(p2, "I'm writing behind the size of the allocated memory block
now");
the same here
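
For illustration only, one common way a wrapper can notice both quoted overwrites when the block is freed is to put guard bytes directly before and after the user area and check them in the free routine. This is a sketch under that assumption, not the code described in this post. The names guard_malloc, guard_free, GUARD_LEN, GUARD_BYTE and GUARD_PREFIX are invented.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN   8      /* assumed number of guard bytes on each side */
#define GUARD_BYTE  0xA5   /* assumed fill pattern */
#define GUARD_ALIGN alignof(max_align_t)

/* Room for the stored size plus the front guard, padded for alignment. */
#define GUARD_PREFIX \
    ((sizeof(size_t) + GUARD_LEN + GUARD_ALIGN - 1) / GUARD_ALIGN * GUARD_ALIGN)

/* Layout: [size ... front guard][user bytes][rear guard] */
void *guard_malloc(size_t sz)
{
    if (sz > SIZE_MAX - GUARD_PREFIX - GUARD_LEN)
        return NULL;
    unsigned char *raw = malloc(GUARD_PREFIX + sz + GUARD_LEN);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &sz, sizeof sz);
    memset(raw + GUARD_PREFIX - GUARD_LEN, GUARD_BYTE, GUARD_LEN);
    memset(raw + GUARD_PREFIX + sz, GUARD_BYTE, GUARD_LEN);
    return raw + GUARD_PREFIX;
}

/* 1 if all GUARD_LEN bytes at g still hold the pattern, else 0. */
static int guard_intact(const unsigned char *g)
{
    for (size_t i = 0; i < GUARD_LEN; i++)
        if (g[i] != GUARD_BYTE)
            return 0;
    return 1;
}

/* Free the block; return 0 if both guards are intact, -1 otherwise. */
int guard_free(void *user)
{
    if (user == NULL)
        return 0;
    unsigned char *raw = (unsigned char *)user - GUARD_PREFIX;
    size_t sz;
    memcpy(&sz, raw, sizeof sz);
    int ok = guard_intact(raw + GUARD_PREFIX - GUARD_LEN)
          && guard_intact(raw + GUARD_PREFIX + sz);
    free(raw);
    return ok ? 0 : -1;
}
```

The check only fires at free time, and only for writes that land in the guard bytes. A write far outside the block, or one that also corrupts the stored size, can still go unnoticed or do damage first, which is the objection raised in the quoted post.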
>In both cases the crash will occure 345 calls of some other functions
from here.
At the latest when it frees p. If the information for the pointers is
kept in its own array, the error shows up only when p is freed, and the
write cannot be "destroying memory management".
Sep 5 '06 #92
Philip Potter wrote:
"Herbert Rosenau" <os****@pc-rosenau.de> wrote in message
news:wm***************************@JUPITER1.PC-ROSENAU.DE...
>>In C it will fail
always miserably, so it is useless.


I'm prepared to talk about whether or not GCs are suited to C, but blind
assertions do little to convince me that they are not.

I personally feel that GCs are suited to some forms of programming but not
others. A programmer using a GC still has to take care - but less care IMHO
than one using malloc()/free(); GCs reduce development time and maintenance
costs, at the cost of less efficient memory management and bigger runtime
footprint - lower performance, in short.

If this isn't what the Standards committee thinks C is about, then I'm
not going to argue. But I'm also not going to argue with anybody who plugs a
GC into C for their own personal use.

Philip
Just one example:

I wrote the debugger of the lcc-win32 compiler system using the GC.

Without it, I would have never finished it.

There are 3 threads in the debugger: the editor (UI), the
debugger itself, and the debuggee, the program that is to be debugged.

The debugger and the editor that displays the results of the
debugger share a lot of buffers, and state. Messages are sent
to the UI for display, and input is retrieved from the user
and sent to the debugger. At the same time, complicated
data structures are built on the fly to support the displays
for the current line, structures that must be thrown away
immediately and rebuilt when the debugger reaches another,
completely unrelated point...

Using malloc/free this was HELL!!!

Tracking each and every buffer, each and every data structure,
is so complex that it would have taken months and months
to develop and debug. Debugging a debugger
is not really an easy thing to do... especially if the debugger
is not able to debug itself.

Freeing complicated hierarchical data structures is very difficult.
And it takes time and effort. Each pointer that you have stored
somewhere must be checked... is it still valid???

The GC allowed me to develop the debugger without ALL those
questions.

I concentrated on the debugger, and memory management is now automatic.

jacob
Sep 5 '06 #93

Philip Potter wrote:
"Herbert Rosenau" <os****@pc-rosenau.de> wrote in message
news:wm***************************@JUPITER1.PC-ROSENAU.DE...
On Fri, 1 Sep 2006 11:01:21 UTC, "Philip Potter"
<ph***********@xilinx.com> wrote:
Richard, while you have earned a lot of my respect and admiration through
your posting, I do feel that you are on a little bit of a crusade here. You
seem to be arguing that because garbage collection can't protect against all
memory leaks without care from the user, it is worthless. This is a somewhat
ridiculous argument - after all, manual memory management has precisely the
same property.
No. It is much easier to write error-free programs using
malloc()/free() than with a defective GC. A GC that can't handle
each and every piece of dynamic memory perfectly is unusable by
design.
GCs don't give you a license not to think, but I didn't see anybody
arguing that they do.
GC as it should be designed araises this claim.

I'm not sure what this sentence means. Assuming "araises" is a typo for
"raises", you're changing the definition of a GC to be something which is
impossible to implement, and then rebutting it. It's a straw man argument.

The fact of the matter is that GCs are not inherently broken. Java and LISP
programs do actually work without crashing, provided the programmer
understands what they are doing. (And if the programmer doesn't know what
they are doing, well, all is lost.)
In C it will fail
always miserably, so it is useless.

I'm prepared to talk about whether or not GCs are suited to C, but blind
assertions do little to convince me that they are not.

I personally feel that GCs are suited to some forms of programming but not
others. A programmer using a GC still has to take care - but less care IMHO
than one using malloc()/free(); GCs reduce development time and maintenance
costs, at the cost of less efficient memory management and bigger runtime
footprint - lower performance, in short.
GC isn't always inherently less efficient or lower performance
than using malloc()/free(). For some applications, using GC
produces a faster application than using malloc().
If this isn't what the Standards committee thinks C is about, then I'm
not going to argue. But I'm also not going to argue with anybody who plugs a
GC into C for their own personal use.
How do you expect to fit in on comp.lang.c with
a reasonable attitude like that? :)

Sep 6 '06 #94
On Tue, 5 Sep 2006 22:35:39 UTC, jacob navia <ja***@jacob.remcomp.fr>
wrote:
>
Just one example:

I wrote the debugger of the lcc-win32 compiler system using the GC.

Without it, I would have never finished it.
Umm, it seems you have to learn programming, not hacking.
There are 3 threads in the debugger: the editor (UI), the
debugger itself, and the debuggee, the program that is to be debugged.
Ugh, on a real OS a debugger would not be a thread of the debuggee,
and the debuggee would never be a thread of the debugger. You'd have
2 independent processes, where the OS allows the debugger to control
the debuggee. You would have to learn how process and thread control
works on the system the debugger and debuggee run on. You'd have to
learn how the debug control interface of the OS works. It's not really
easy, but it is doable. Hacking around is not an option.
The debugger and the editor that displays the results of the
debugger share a lot of buffers, and state. Messages are sent
to the UI for display, and input is retrieved from the user
and sent to the debugger. At the same time, complicated
data structures are built on the fly to support the displays
for the current line, structures that must be thrown away
immediately and rebuilt when the debugger reaches another,
completely unrelated point...
Gee, you have to learn how the memory manager of the OS works, besides
the memory manager of the CRT. If you were a real compiler developer,
you would be able to write a CRT that supports the debugger.
Using malloc/free this was HELL!!!
Only if you're not able to understand how the memory managers of both
the OS and the CRT work. You need to be a real programmer, not
just hack around.
Tracking each and every buffer, each and every data structure,
is so complex that it would have taken months and months
to develop and debug. Debugging a debugger
is not really an easy thing to do... especially if the debugger
is not able to debug itself.
Where is the problem? I'm not a compiler developer. I have had no need to
develop a debugger, but I have learned enough about both the OS's memory
manager and the CRT's memory manager of my compiler to read, understand
and interpret their memory management structures. I can't find anything
horrible in them. There is nothing horrible about managing 2 linear
doubly linked lists like the malloc family does. The only thing you have
to do is learn to program properly instead of hacking around.
Freeing complicated hierarchical data structures is very difficult.
Really? It seems you're only a bloody hacker who knows nothing about
programming. You should start from the beginning and learn how to program.
And it takes time and effort. Each pointer that you have stored
somewhere must be checked... is it still valid???
That is really simple. As a developer you simply set each invalid
pointer to NULL. As the developer of a debugger you catch the event
you get from the OS when the debuggee fails on an illegal memory access.
What is horrible about that?

What is the problem with multiply nested data structures? Once you've
learned how to use them in assembler, you'll laugh about them in C.
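
To make the "set each invalid pointer to NULL" advice concrete, here is a small sketch with hypothetical names (tree_node, tree_new, tree_free). It frees a nested structure recursively and clears the caller's pointer so that it cannot be used by accident afterwards.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical nested structure: each node owns its label,
   its children and its following siblings. */
struct tree_node {
    struct tree_node *first_child;
    struct tree_node *next_sibling;
    char *label;
};

/* Create a node holding its own copy of label; NULL on failure. */
struct tree_node *tree_new(const char *label)
{
    struct tree_node *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->label = malloc(strlen(label) + 1);
    if (node->label == NULL) {
        free(node);
        return NULL;
    }
    strcpy(node->label, label);
    node->first_child = NULL;
    node->next_sibling = NULL;
    return node;
}

/* Free the subtree at *pnode and set the caller's pointer to NULL. */
void tree_free(struct tree_node **pnode)
{
    if (pnode == NULL || *pnode == NULL)
        return;
    struct tree_node *node = *pnode;
    tree_free(&node->first_child);   /* children first */
    tree_free(&node->next_sibling);  /* then the rest of this level */
    free(node->label);
    free(node);
    *pnode = NULL;
}
```

Note the limit of the technique: only the pointer passed in is cleared. Any other copy of it, for example one held by another thread or stored in a display buffer, still dangles, which is the case the quoted question about validity is worried about. The recursion over siblings is kept for brevity; a long sibling list would be better freed in a loop.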
The GC allowed me to develop the debugger without ALL those
questions.
GC was not, is not and never will be a solution for the inabilities of a
C programmer; it brings in more problems than it can resolve.
If you're unable to handle the malloc family correctly, then C is not the
right language for you. Go to Java instead.
I concentrated on the debugger, and memory management is now automatic.
And will produce memory leaks.
jacob
who has proven again that he knows nothing about programming but seems
to be great at hacking around blindly.

Jacob, you should really learn how real programming works before you
try to write such complicated things as a compiler or debugger.
They are both too big for your current knowledge.
Having extensive knowledge of the assembly language of a specific
platform does not make you a good C programmer.
You have to learn what C is, how C is designed, the standard, what it
means, and how and why it is designed (and what for).

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of German eComStation
eComStation 1.2 in German is here!
Sep 9 '06 #95
