A debugging implementation of malloc

In the C tutorial for lcc-win32, I have a small chapter
about a debugging implementation of malloc.

Here is the code, and the explanations that go with it.

I would appreciate your feedback both about the code and
the associated explanations.

---------------------------------------------------------------------

Instead of using malloc/free directly, here are two implementations of
equivalent functions with some added safety features:
o Freeing NULL is allowed and is not an error.
o Double freeing is made impossible.
o Any overwrite immediately at the end of the block is checked for.
o Memory is initialized to zero.
o A count of allocated memory is kept in a global variable.

#include <stdlib.h>
#define MAGIC 0xFFFF
#define SIGNATURE 12345678L
size_t AllocatedMemory;
void *allocate(size_t size)
{
register char *r;
register int *ip = NULL;
size += 3 * sizeof(int);
r = malloc(size);
if (r == NULL)
return r;
AllocatedMemory += size;
ip = (int *) r;
// At the start of the block we write the signature
*ip++ = SIGNATURE;
// Then we write the size of the block in bytes
*ip++ = (int) size;
// We zero the data space
memset(ip, 0, size - 3*sizeof(int));
// We write the magic number at the end of the block,
// just behind the data
ip = (int *) (&r[size - sizeof(int)]);
*ip = MAGIC;
// Return a pointer to the start of the data area
return (r + 2 * sizeof(int));
}
void release(void *pp)
{
register int *ip = NULL;
int s;
register char *p = pp;
if (p == NULL) // Freeing NULL is allowed
return;
// The start of the block is two integers before the data.
p -= 2 * sizeof(int);
ip = (int *) p;
if (*ip == SIGNATURE) {
// Overwrite the signature so that this block
// can’t be freed again
*ip++ = 0;
s = *ip;
ip = (int *) (&p[s - sizeof(int)]);
if (*ip != MAGIC) {
ErorPrintf(“Overwritten block size %d”, s);
return;
}
*ip = 0;
AllocatedMemory -= s;
free(p);
}
else {
/* The block has been overwritten. Complain. */
ErrorPrintf(“Wrong block passed to release”);
}
}

The allocate function adds to the requested size space for 3 integers.
1) The first is a magic number (a signature) that allows the
identification of this block as a block allocated by our allocation
system.
2) The second is the size of the block. After these two numbers, the data
follows.
3) The data is followed by a third number that is placed at the end of
the block. An overwrite past the end of a block will probably hit
this number first. Since the “release” function checks for this, we
will be able to detect when a block has been overwritten.

At any time, the user can ask for the size of total allocated memory
(valid blocks in circulation) by querying the AllocatedMemory variable.

The “release” function accepts NULL (which is ignored). If the pointer
passed to it is not NULL, it will check that it is a valid block, and
that the signature is still there, i.e. that no memory overwrites have
happened during the usage of the block.
Jun 24 '06 #1
41 Replies


jacob navia wrote:
> In the C tutorial for lcc-win32, I have a small chapter
> about a debugging implementation of malloc.
>
> Here is the code, and the explanations that go with it.
>
> I would appreciate your feedback both about the code and
> the associated explanations.

Are there any unit tests?

--
Ian Collins.
Jun 24 '06 #2

On 2006-06-24, jacob navia <ja***@jacob.remcomp.fr> wrote:
> In the C tutorial for lcc-win32, I have a small chapter
> about a debugging implementation of malloc.
>
> Here is the code, and the explanations that go with it.
>
> I would appreciate your feedback both about the code and
> the associated explanations.
>
> ---------------------------------------------------------------------
>
> Instead of using directly malloc/free here are two implementations of
> equivalent functions with some added safety features:
> o Freeing NULL is allowed and is not an error.

free already lets you free(NULL) without an error, doesn't it?

> #include <stdlib.h>
> #define MAGIC 0xFFFF
> #define SIGNATURE 12345678L
> size_t AllocatedMemory;
> void *allocate(size_t size)
> {
> register char *r;
> register int *ip = NULL;
> size += 3 * sizeof(int);
> r = malloc(size);
> if (r == NULL)
> return r;
> AllocatedMemory += size;
> ip = (int *) r;
> // At the start of the block we write the signature
> *ip++ = SIGNATURE;
> // Then we write the size of the block in bytes
> *ip++ = (int) size;

You might as well store the size in your "metadata" as a size_t instead
of truncating to int-- no real harm in it, and people do allocate big
blocks sometimes.

> // We zero the data space
> memset(ip, 0, size - 3*sizeof(int));
> // We write the magic number at the end of the block,
> // just behind the data
> ip = (int *) (&r[size - sizeof(int)]);
> *ip = MAGIC;


This will likely cause alignment problems-- suppose size were 17, and
sizeof (int) 4, malloc returns 16-byte aligned pointers, and that on the
target machine ints have to be stored at 4-byte aligned addresses.

r is 0x10 (say). &r[size - 4] will be 0x1d, which is not 4-byte aligned (0x1d %
4 is 1)

Better to make your magic number guard band something like an array of
char, so it can always go at the end of a block, whatever the size of
the block.
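
A minimal sketch of that guard-band idea, assuming a 4-byte pattern chosen only for illustration. The names GUARD, GUARD_LEN, guard_write and guard_ok are hypothetical and do not come from the posted code. Because the guard is copied and compared as bytes, it can sit at any address, whatever the block size.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical guard pattern; any fixed byte sequence would do. */
static const unsigned char GUARD[] = { 0xDE, 0xAD, 0xBE, 0xEF };
#define GUARD_LEN sizeof GUARD

/* Write the guard just past the user's data. `data` points at the
   start of the user area and `size` is the size the user asked for.
   The caller must have reserved GUARD_LEN extra bytes. */
static void guard_write(void *data, size_t size)
{
    memcpy((unsigned char *)data + size, GUARD, GUARD_LEN);
}

/* Return 1 if the guard is intact, 0 if it has been overwritten. */
static int guard_ok(const void *data, size_t size)
{
    return memcmp((const unsigned char *)data + size, GUARD, GUARD_LEN) == 0;
}
```

With size 17, the guard lands at an odd offset and is still read and written safely, since no int access is involved.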
Jun 24 '06 #3

jacob navia wrote:
> void release(void *pp)
> {
> register int *ip = NULL;
> int s;
> register char *p = pp;
> if (p == NULL) // Freeing NULL is allowed
> return;
> // The start of the block is two integers before the data.
> p -= 2 * sizeof(int);

Maybe consider an alignment check (p divisible by sizeof(int)) and check
for p > 2*sizeof(int) before the subtraction. Paranoid, but protects
against release( 7 );

> ip = (int *) p;
> if (*ip == SIGNATURE) {
> // Overwrite the signature so that this block
> // can’t be freed again
> *ip++ = 0;
> s = *ip;

I'd bring s (and call it size) into the if{} scope.

> ip = (int *) (&p[s - sizeof(int)]);


Another paranoid alignment check here?

--
Ian Collins.
Jun 24 '06 #4

On Sat, 24 Jun 2006 23:00:35 +0200, jacob navia
<ja***@jacob.remcomp.fr> wrote:
> In the C tutorial for lcc-win32, I have a small chapter
> about a debugging implementation of malloc.
>
> Here is the code, and the explanations that go with it.
>
> I would appreciate your feedback both about the code and
> the associated explanations.
>
> ---------------------------------------------------------------------
>
> Instead of using directly malloc/free here are two implementations of
> equivalent functions with some added safety features:
> o Freeing NULL is allowed and is not an error.
> o Double freeing is made impossible.
> o Any overwrite immediately at the end of the block is checked for.
> o Memory is initialized to zero.
> o A count of allocated memory is kept in a global variable.
>
> #include <stdlib.h>
> #define MAGIC 0xFFFF
> #define SIGNATURE 12345678L
> size_t AllocatedMemory;
> void *allocate(size_t size)
> {
> register char *r;
> register int *ip = NULL;
> size += 3 * sizeof(int);
> r = malloc(size);
> if (r == NULL)
> return r;
> AllocatedMemory += size;

At this point r contains the address of a block of memory properly
aligned for any possible object.

> ip = (int *) r;
> // At the start of the block we write the signature
> *ip++ = SIGNATURE;

If sizeof(int)*CHAR_BIT < 32, SIGNATURE will not fit.

> // Then we write the size of the block in bytes
> *ip++ = (int) size;
> // We zero the data space
> memset(ip, 0, size - 3*sizeof(int));

It is legal for the user to malloc 0 bytes. I could not find a
description of what memset does if the third argument is 0. Let's
hope it checks first.

> // We write the magic number at the end of the block,
> // just behind the data
> ip = (int *) (&r[size - sizeof(int)]);

There is no guarantee that size provided by the user is a multiple of
sizeof(int). If it is not, then this address calculation produces a
misaligned result.

> *ip = MAGIC;

And this attempt to store into the "misaligned int" invokes undefined
behavior.

> // Return a pointer to the start of the data area
> return (r + 2 * sizeof(int));

Here you give the user an address two ints into the block of memory. If
double requires 8 byte alignment and sizeof(int) is 2, the address you
return is not suitable for storing a double. Similarly if sizeof(long
long) is 16 and sizeof(int) is 4 (probably a more likely situation
with new systems).

> }

<snip>
Remove del for email
Jun 24 '06 #5

Barry Schwarz wrote:

> At this point r contains the address of a block of memory properly
> aligned for any possible object.

Is there a portable way of finding out what this alignment is?

--
Ian Collins.
Jun 24 '06 #6

jacob navia <ja***@jacob.remcomp.fr> writes:
> In the C tutorial for lcc-win32, I have a small chapter
> about a debugging implementation of malloc.
>
> Here is the code, and the explanations that go with it.
>
> I would appreciate your feedback both about the code and
> the associated explanations.
>
> ---------------------------------------------------------------------
>
> Instead of using directly malloc/free here are two implementations of
> equivalent functions with some added safety features:
> o Freeing NULL is allowed and is not an error.

That's not an added safety feature; free(NULL) is explicitly required
to do nothing.

> o Double freeing is made impossible.

Well, unlikely.

> o Any overwrite immediately at the end of the block is checked for.
> o Memory is initialized to zero.

Arguably, that could mask certain errors. For example, if I use the
standard malloc() function, and I attempt to access the allocated
memory before I've initialized it, I've invoked undefined behavior.
By initializing the memory to zero, you make such buggy code behave
consistently, making it more difficult to track down the error.

You might consider setting the allocated memory to a known bad value,
perhaps all-ones, or 0xDEADBEEF, or 0xE5E5E5E5.

> o A count of allocated memory is kept in a global variable.
>
> #include <stdlib.h>
> #define MAGIC 0xFFFF
> #define SIGNATURE 12345678L
> size_t AllocatedMemory;
> void *allocate(size_t size)
> {
> register char *r;
> register int *ip = NULL;

The common wisdom is that "register" is more likely to interfere with
the compiler's optimizer than to actually make the code run faster. I
don't know how true that is. Declaring something as "register"
*should* be harmless as long as you don't try to take its address, but
it's a little odd to see it in new code.

> size += 3 * sizeof(int);
> r = malloc(size);
> if (r == NULL)
> return r;
> AllocatedMemory += size;
> ip = (int *) r;
> // At the start of the block we write the signature
> *ip++ = SIGNATURE;
> // Then we write the size of the block in bytes
> *ip++ = (int) size;

Why do you store the size as an int? (And why the unnecessary cast?)
It seems far more natural, more correct, and potentially more useful,
to store it as a size_t.

> // We zero the data space
> memset(ip, 0, size - 3*sizeof(int));
> // We write the magic number at the end of the block,
> // just behind the data
> ip = (int *) (&r[size - sizeof(int)]);
> *ip = MAGIC;
> // Return a pointer to the start of the data area
> return (r + 2 * sizeof(int));

r is correctly aligned for any type. You assume here that
r + 2 * sizeof(int) is also correctly aligned for any type. That's
very likely to be true (and there's no portable way to check it),
but you should at least document the assumption.

> }
> void release(void *pp)
> {
> register int *ip = NULL;
> int s;
> register char *p = pp;
> if (p == NULL) // Freeing NULL is allowed
> return;
> // The start of the block is two integers before the data.
> p -= 2 * sizeof(int);
> ip = (int *) p;
> if (*ip == SIGNATURE) {
> // Overwrite the signature so that this block
> // can’t be freed again
> *ip++ = 0;
> s = *ip;
> ip = (int *) (&p[s - sizeof(int)]);
> if (*ip != MAGIC) {
> ErorPrintf(“Overwritten block size %d”, s);

Typo: that should be "ErrorPrintf", not "ErorPrintf". Also, your
quotation marks show up in my newsreader as "\223" and "\224" (I think
those are Windows-specific character codes). Both of these lead me to
suspect that you didn't copy this from an actual compilable source
file. (No need, I think, to go into the reasons why this is a Bad
Idea.)

> return;
> }
> *ip = 0;
> AllocatedMemory -= s;
> free(p);
> }
> else {
> /* The block has been overwritten. Complain. */
> ErrorPrintf(“Wrong block passed to release”);
> }
> }


I don't know what ErrorPrintf does, but printing error messages from a
memory allocation function seems like a bad idea. If it doesn't abort
the program, the caller won't know that the call failed; if it does,
the caller can't handle the problem (perhaps by attempting to do some
cleanup before terminating the program).

Since this is intended to be a replacement for free(), which has no
way to indicate a failure, it's difficult to come up with a good way
to indicate an error. Perhaps the best compromise would be to have
release() return a status value, 0 for success and any of several
defined error codes on failure. Or it could set errno.

[...]

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jun 25 '06 #7

> Barry Schwarz wrote:
>> At this point [where r is the result of a malloc() call] r contains
>> the address of a block of memory properly aligned for any possible
>> object.

In article <4g*************@individual.net>
Ian Collins <ia******@hotmail.com> wrote:
> Is there a portable way of finding out what this alignment is?


No. I consider this a flaw in the C standards.

If there *were* a portable way to find this, you could write portable
"augmented malloc and free" routines along the lines of those Mr Navia
has provided (with more work to deal with alignment, of course). But
there is not, so you cannot.

If you want to go ahead and write such routines, probably the best
approach is to mark any alignment assumptions you make, and try to
write the code such that, for porting to a new system with different
alignment requirements, changing one or two "#define"s is all that
is needed.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Jun 25 '06 #8

Chris Torek wrote:
>> Barry Schwarz wrote:
>> At this point [where r is the result of a malloc() call] r contains
>> the address of a block of memory properly aligned for any possible
>> object.
>
> In article <4g*************@individual.net>
> Ian Collins <ia******@hotmail.com> wrote:
>> Is there a portable way of finding out what this alignment is?
>
> No. I consider this a flaw in the C standards.

Thanks, I didn't think so.

> If there *were* a portable way to find this, you could write portable
> "augmented malloc and free" routines along the lines of those Mr Navia
> has provided (with more work to deal with alignment, of course). But
> there is not, so you cannot.

I've used max(sizeof(void*), sizeof(long double)) as a 'fairly safe'
value for alignment in similar situations.

> If you want to go ahead and write such routines, probably the best
> approach is to mark any alignment assumptions you make, and try to
> write the code such that, for porting to a new system with different
> alignment requirements, changing one or two "#define"s is all that
> is needed.


I agree.

--
Ian Collins.
Jun 25 '06 #9

Ian Collins wrote:

> I've used max(sizeof(void*), sizeof(long double)) as a 'fairly safe'
> value for alignment in similar situations.

Oops, looking back I used max(sizeof(void*), sizeof(double)). The
targets I was using didn't have long double and I don't use them anyway...

Where long double does exist, you'd have to use the lowest common
multiple of sizeof(void*) and sizeof(long double).

--
Ian Collins.
Jun 25 '06 #10

Thanks for the answers.
1) True, I do not consider alignment problems, and that could
be a problem in machines where integers are aligned at some
power of 2 boundary. A work-around would be to just
memcpy (or equivalent) and memcmp (or an equivalent loop)
instead of accessing the signature at the end of the
block as an integer.

2) This code is very old, and I had forgotten that
free now accepts NULL. It was written originally under
Windows 3.0 (16 bit OS), and I have used it ever
since.

3) A problem arises when size_t is much wider than
int. In 64 bit Windows, for instance, this
code would truncate a size_t when the user attempts
to allocate more than 2GB of memory in a single block.

A problem then arises here. What should this routine do with
an allocation request of that size? It could be a legal
request, but it could also be the result of a negative
number being passed to malloc...
Microsoft proposed in TR 24731 a new type, rsize_t,
whose maximum value would be the largest legal request
for an object, limited to the range of a signed int (2GB
on 32 bit systems). I think the solution would be along those lines here.

Again, thanks to all people that answered.

jacob
Jun 25 '06 #11

jacob navia wrote:

> 3) A problem arises when size_t is much bigger than
> sizeof int. In 64 bit windows, for instance, this
> code would truncate a size_t when the user attempts
> to allocate more than 2GB of memory in a single block.
>
> A problem arises then here. What should this routine do with
> an allocation request of that size? It could be a legal
> request, but it would also be the result of a negative
> number being passed to malloc...


I had to make the same call for the debug allocator I use. I settled
for an assert on the size being <= INT_MAX, assuming a request bigger
than that to more likely be a negative number than a real request.

--
Ian Collins.
Jun 25 '06 #12

Ian Collins <ia******@hotmail.com> writes:
> Ian Collins wrote:
>> I've used max(sizeof(void*), sizeof(long double)) as a 'fairly safe'
>> value for alignment in similar situations.
>
> Oops, looking back I used max(sizeof(void*), sizeof(double)). The
> targets I was using didn't have long double and I don't use them anyway...
>
> Where long double does exist, you'd have to use the lowest common
> multiple of sizeof(void*) and sizeof(long double).


Really? long double has been standard since ANSI C89; any C compiler
that doesn't provide it is non-conforming. (It can have the same
characteristics as double, so it's just a matter of recognizing the
syntax.)

Are you sure you can trust this implementation to get the rest of the
language right?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jun 25 '06 #13

jacob navia wrote:
> In the C tutorial for lcc-win32, I have a small chapter
> about a debugging implementation of malloc.
> [...]


Others have commented on the alignment issues, and on
the wisdom of initializing newly-allocated memory to a
"useful" value like all zeroes. I've got a few further
suggestions:

When an allocation is released, consider filling it
with garbage. This may help catch erroneous uses of an
area after it's been freed, as in the famously incorrect
code for freeing a linked list:

for (ptr = head; ptr != NULL; ptr = ptr->next)
free(ptr);
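
For contrast, a minimal sketch of a loop that avoids the use-after-free by saving the successor before releasing each node. The struct node type and the free_list name are assumptions made for illustration.

```c
#include <stdlib.h>

/* Hypothetical singly linked list node. */
struct node {
    struct node *next;
    int value;
};

/* Free every node, reading ptr->next before ptr itself is freed.
   Returns the number of nodes released. */
static size_t free_list(struct node *head)
{
    size_t count = 0;
    struct node *ptr = head;

    while (ptr != NULL) {
        struct node *next = ptr->next;  /* save before free */
        free(ptr);
        ptr = next;
        count++;
    }
    return count;
}
```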

release() leaks the memory that's passed to it, so the
package isn't suitable for debugging programs that churn
through a lot of allocations and releases and expect the
released memory to be recycled. Consider passing released
areas to free(), perhaps not immediately (so the garbage
mentioned above has time to cause damage). You might make
a FIFO list of released areas, and let it grow to a few
thousand entries before handing the oldest to free(). You
might even postpone free() until malloc() returns NULL.
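
A rough sketch of such a deferred-free queue, under some loud assumptions: the queue is tiny (4 slots rather than thousands), the fill byte 0xCC is arbitrary, and the caller passes the block size because this sketch keeps no per-block metadata. The names deferred_release, quarantine, q_head and q_count are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define QUARANTINE_SLOTS 4     /* small for illustration only */
#define GARBAGE_BYTE     0xCC  /* arbitrary fill value */

static void *quarantine[QUARANTINE_SLOTS];
static size_t q_head;   /* index of the oldest parked block */
static size_t q_count;  /* number of blocks currently parked */

/* Fill the block with garbage and park it in a FIFO. Once the FIFO
   is full, the oldest parked block is handed to free(). */
static void deferred_release(void *p, size_t size)
{
    if (p == NULL)
        return;
    memset(p, GARBAGE_BYTE, size);
    if (q_count == QUARANTINE_SLOTS) {
        free(quarantine[q_head]);
        q_head = (q_head + 1) % QUARANTINE_SLOTS;
        q_count--;
    }
    quarantine[(q_head + q_count) % QUARANTINE_SLOTS] = p;
    q_count++;
}
```

While a block sits in the queue, a stale pointer into it reads garbage instead of plausible old data, which is the point of the delay.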

Consider keeping the "metadata" -- addresses, sizes,
and so on -- separate from the allocated data. The idea is to
make the metadata less vulnerable to common off-by-one errors.
(I've used a skip-list with the block address as a key for
this purpose; that's not portable because C doesn't define
comparison between pointers to "unrelated" objects, but it
works for mainstream C implementations.) Maintaining the
metadata separately involves more overhead, but this is, after
all, a debugging package.

Consider adding a sanity-checker that visits all the
allocated areas and checks their signatures for signs of
damage. Making the checker callable by the user can help in
"wolf fence" debugging. If sufficiently desperate, sanity-
check at every transaction.

I found it helpful to keep a sequence number in the
metadata for each block, and to let the user program query
the current value. This is useful when tracking down leaks:

open files, initialize libraries, ...
epoch = getCurrentSeqno();
do something that should be memory-neutral ...
listSurvivingAllocationsSince(epoch);

... where the final call just traverses all the allocations
that haven't been freed, reporting any whose sequence numbers
are greater than `epoch'.
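
One way this could look, as a sketch only: metadata is kept in a separate fixed-size table (an assumption made to keep the example short, a skip-list or similar would scale better), and each record carries a sequence number. The names tracked_alloc, tracked_free, get_current_seqno and count_surviving_since are hypothetical stand-ins for the calls in the outline above.

```c
#include <stdlib.h>

#define MAX_TRACKED 1024   /* arbitrary table size for this sketch */

struct record {
    void *ptr;             /* NULL marks an unused slot */
    size_t size;
    unsigned long seqno;   /* allocation sequence number */
};

static struct record table[MAX_TRACKED];
static unsigned long next_seqno = 1;

/* Sequence number of the most recent allocation (0 if none yet). */
static unsigned long get_current_seqno(void)
{
    return next_seqno - 1;
}

static void *tracked_alloc(size_t size)
{
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (table[i].ptr == NULL) {
            void *p = malloc(size);
            if (p == NULL)
                return NULL;
            table[i].ptr = p;
            table[i].size = size;
            table[i].seqno = next_seqno++;
            return p;
        }
    }
    return NULL;  /* table full */
}

static void tracked_free(void *p)
{
    if (p == NULL)
        return;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (table[i].ptr == p) {
            table[i].ptr = NULL;
            free(p);
            return;
        }
    }
    /* Unknown pointer: silently ignored in this sketch. */
}

/* Count live allocations whose sequence number is greater than epoch. */
static size_t count_surviving_since(unsigned long epoch)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_TRACKED; i++)
        if (table[i].ptr != NULL && table[i].seqno > epoch)
            n++;
    return n;
}
```

The lookup uses only pointer equality, so it stays within what C defines for unrelated objects, at the cost of a linear scan.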

--
Eric Sosman
es*****@acm-dot-org.invalid

Jun 25 '06 #14

Eric Sosman wrote:
> jacob navia wrote:
>> In the C tutorial for lcc-win32, I have a small chapter
>> about a debugging implementation of malloc.
>> [...]
>
> Others have commented on the alignment issues, and on
> the wisdom of initializing newly-allocated memory to a
> "useful" value like all zeroes. I've got a few further
> suggestions:
>
> When an allocation is released, consider filling it
> with garbage. This may help catch erroneous uses of an
> area after it's been freed, as in the famously incorrect
> code for freeing a linked list:
>
> for (ptr = head; ptr != NULL; ptr = ptr->next)
> free(ptr);
>
> release() leaks the memory that's passed to it, so the
> package isn't suitable for debugging programs that churn
> through a lot of allocations and releases and expect the
> released memory to be recycled. Consider passing released
> areas to free(), perhaps not immediately (so the garbage
> mentioned above has time to cause damage).

I do pass the pointer to the free() function. release()
does NOT leak memory. Maybe an oversight?

> You might make a FIFO list of released areas, and let it grow to a few
> thousand entries before handing the oldest to free(). You
> might even postpone free() until malloc() returns NULL.
>
> Consider keeping the "metadata" -- addresses, sizes,
> and so on -- separate from the allocated data. The idea is to
> make the metadata less vulnerable to common off-by-one errors.
> (I've used a skip-list with the block address as a key for
> this purpose; that's not portable because C doesn't define
> comparison between pointers to "unrelated" objects, but it
> works for mainstream C implementations.) Maintaining the
> metadata separately involves more overhead, but this is, after
> all, a debugging package.
>
> Consider adding a sanity-checker that visits all the
> allocated areas and checks their signatures for signs of
> damage. Making the checker callable by the user can help in
> "wolf fence" debugging. If sufficiently desperate, sanity-
> check at every transaction.

I considered that, just making a table with all the allocated areas
and scanning that table in a memory check pass. Within the context
of a tutorial however, I do not know if it is that useful...

I will mention your ideas in the tutorial, and outline how
they could be implemented.

> I found it helpful to keep a sequence number in the
> metadata for each block, and to let the user program query
> the current value. This is useful when tracking down leaks:
>
> open files, initialize libraries, ...
> epoch = getCurrentSeqno();
> do something that should be memory-neutral ...
> listSurvivingAllocationsSince(epoch);
>
> ... where the final call just traverses all the allocations
> that haven't been freed, reporting any whose sequence numbers
> are greater than `epoch'.


To detect memory leaks, I use the global variable AllocatedMemory.
You have just to change your example to:
open files, initialize libraries, ...
mem = AllocatedMemory;
do something that should be memory-neutral ...
if (AllocatedMemory != mem) {
//Memory leak detected
}
Obviously your approach is better since it would detect which
memory blocks are leaked, but a simpler approach still yields
useful information.

Thanks for your input

jacob
Jun 25 '06 #15

On Sun, 25 Jun 2006 10:17:39 +0200, jacob navia
<ja***@jacob.remcomp.fr> wrote:
> Thanks for the answers.
> 1) True, I do not consider alignment problems, and that could
> be a problem in machines where integers are aligned at some
> power of 2 boundary. A work-around would be to just
> memcpy (or equivalent) and memcmp (or an equivalent loop)
> instead of accessing the signature at the end of the
> block as an integer.


This doesn't solve the problem that you are returning an address
*only* two ints beyond the allocated address.

Two possibilities come to mind:

On the very first call to your function, allocate a small
block (sizeof(long long)+sizeof(long double)). Convert the address to
unsigned long. Determine N, the number of low order zeros in the
result. Instead of using sizeof(int) as your delta, use pow(2,N).
Free the memory and get on with what the user requested.

On the very first call, compute max(sizeof(short),
sizeof(int), sizeof(size_t), sizeof(ptrdiff_t), sizeof(long long),
sizeof(long double), ...), betting on the assumption that the longest
object imposes the strictest alignment.
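
A sketch of the first possibility, with caveats stated up front: converting a pointer to an integer is implementation-defined, uintptr_t is an optional C99 type, and one address can happen to have more low-order zeros than malloc guarantees. The PROBE_CAP limit is an assumption added here so a page-aligned address does not produce an absurd result, and probe_alignment is a hypothetical name. A shift replaces pow(2,N) to stay in integer arithmetic.

```c
#include <stdint.h>
#include <stdlib.h>

#define PROBE_CAP 64  /* assumed upper bound on the useful alignment */

/* Estimate malloc's alignment from the low-order zero bits of one
   returned address. Returns 0 if the probe allocation fails. */
static size_t probe_alignment(void)
{
    void *p = malloc(sizeof(long long) + sizeof(long double));
    uintptr_t addr;
    size_t align = 1;

    if (p == NULL)
        return 0;
    addr = (uintptr_t)p;   /* keep only the integer value */
    free(p);

    /* Double `align` for each low-order zero bit, up to the cap. */
    while ((addr & 1) == 0 && align < PROBE_CAP) {
        align <<= 1;
        addr >>= 1;
    }
    return align;
}
```

The result is a power of two between 1 and PROBE_CAP; it is a guess based on a single sample, not a guarantee.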

<snip>
Remove del for email
Jun 25 '06 #16

Barry Schwarz wrote:
> [...]
> On the very first call, compute max(sizeof(short),
> sizeof(int), sizeof(size_t), sizeof(ptrdiff_t), sizeof(long long),
> sizeof(long double), ...), betting on the assumption that the longest
> object imposes the strictest alignment.


K&R (now retronymically called "K&R I") showed an easy
way to do this without a run-time computation by using the
size of a union whose elements are all the primitive types
likely to need special alignment. Paraphrasing (because
some of today's types didn't exist then):

#include <stdint.h> /* for intmax_t */

union all_kinds {
char c;
short s;
int i;
long l;
long long ll;
intmax_t im;
float f;
double d;
long double ld;
void *vp;
void (*fp)(void);
/* others as seem needful */
};
#define ALIGNSIZE sizeof(union all_kinds)

K&R did not, of course, claim that this was perfect.
Still, it's as effective as Barry Schwarz's suggestion and
doesn't require any special first-call computation.

A refinement is possible, since the sizeof a type may
be greater than the type's required alignment:

#include <stddef.h> /* for offsetof */

struct aligned {
char pad;
union all_kinds u;
};
#define ALIGNSIZE offsetof(struct aligned, u)

This still isn't perfect, but may sometimes turn out to
be more economical.

--
Eric Sosman
es*****@acm-dot-org.invalid
Jun 25 '06 #17

jacob navia wrote:
> Eric Sosman wrote:
>> release() leaks the memory that's passed to it, [...]
>
> I do pass the pointer to the free() function. release()
> does NOT leak memory. Maybe an oversight?


Yes, indeed: mine. (I thought it was odd to omit free()
so I went back and double-checked; looks like I should have
triple- or quadruple-checked ... Sorry for the slander!)

--
Eric Sosman
es*****@acm-dot-org.invalid
Jun 25 '06 #18

Eric Sosman <es*****@acm-dot-org.invalid> writes:
> When an allocation is released, consider filling it with
> garbage.


This is a good idea. If you know a little about the target
platform, you can even choose particularly good garbage. In
"debug" allocators I have designed, I have chosen 0xcc, because
on x86 that is a "debug trap" instruction when executed and it's
unlikely to be a valid pointer value for user programs. (I think
I originally got this idea from a book.)
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Jun 25 '06 #19

Keith Thompson wrote:
> Ian Collins <ia******@hotmail.com> writes:
>> Ian Collins wrote:
>>> I've used max(sizeof(void*), sizeof(long double)) as a 'fairly safe'
>>> value for alignment in similar situations.
>>
>> Oops, looking back I used max(sizeof(void*), sizeof(double)). The
>> targets I was using didn't have long double and I don't use them anyway...
>>
>> Where long double does exist, you'd have to use the lowest common
>> multiple of sizeof(void*) and sizeof(long double).
>
> Really? long double has been standard since ANSI C89; any C compiler
> that doesn't provide it is non-conforming. (It can have the same
> characteristics as double, so it's just a matter of recognizing the
> syntax.)

OK, scratch "didn't have" (I didn't check) and emphasise "we didn't use
them": neither target had an FPU, so FP operations were strongly
discouraged.

--
Ian Collins.
Jun 25 '06 #20


"jacob navia" <ja***@jacob.remcomp.fr> wrote

> Microsoft proposed in the TR 24731 to coin a new type
> rsize_t that would be the maximum value of a legal request
> for an object, and limited to a signed int, in 32 bit systems
> 2GB. I think the solution would be along those lines here.

And so gibberish types multiply.
Why not just go back to malloc() taking an int?
--
Buy my book 12 Common Atheist Arguments (refuted)
$1.25 download or $7.20 paper, available www.lulu.com/bgy1mm
Jun 25 '06 #21

Malcolm wrote:
> "jacob navia" <ja***@jacob.remcomp.fr> wrote
>> Microsoft proposed in the TR 24731 to coin a new type
>> rsize_t that would be the maximum value of a legal request
>> for an object, and limited to a signed int, in 32 bit systems
>> 2GB. I think the solution would be along those lines here.
>
> And so gibberish types multiply.
> Why not just go back to malloc() taking an int?


Wouldn't be ideal in a 64 bit system, long would be a better bet. But
there still isn't a guarantee it would be the same size as size_t.

--
Ian Collins.
Jun 25 '06 #22

jacob navia <ja***@jacob.remcomp.fr> writes:
> Thanks for the answers.
> 1) True, I do not consider alignment problems, and that could
> be a problem in machines where integers are aligned at some
> power of 2 boundary. A work-around would be to just
> memcpy (or equivalent) and memcmp (or an equivalent loop)
> instead of accessing the signature at the end of the
> block as an integer.

If I remember and understand the code correctly, you allocate two
extra ints at the start of the block, and one at the end, and return
(malloc() + 2 * sizeof(int)). (That's not a valid expression, but you
get the idea.)

If the two ints cause the rest of the block to be misaligned, using
memcpy() to access the int you put at the end of the block will help
only for that int; the user's data will still be potentially
misaligned (depending on what's stored in it).

[...]

> 3) A problem arises when size_t is much bigger than
> sizeof int. In 64 bit windows, for instance, this
> code would truncate a size_t when the user attempts
> to allocate more than 2GB of memory in a single block.

So don't truncate it. Store it as a size_t.

> A problem arises then here. What should this routine do with
> an allocation request of that size? It could be a legal
> request, but it would also be the result of a negative
> number being passed to malloc...
> Microsoft proposed in the TR 24731 to coin a new type
> rsize_t that would be the maximum value of a legal request
> for an object, and limited to a signed int, in 32 bit systems
> 2GB. I think the solution would be along those lines here.


Perhaps, but since rsize_t isn't in the language, there's no portable
way to define it or determine its upper bound.

My advice:

Store the size as a size_t. Don't even think about truncating it. If
the code is intended to be portable, you can't assume that
malloc((size_t)INT_MAX + 1)
or
malloc((size_t)INT_MAX * 2)
is invalid.

Define a constant for the maximum alignment in bytes, and use it
when you decide how much to allocate. For example:
#define MAX_ALIGN (sizeof(long long))

The offset of the user's data in the malloc()ed block would then be
max(MAX_ALIGN, 2 * sizeof(size_t)).

If you want to check for accidental negative values, define a constant
and do an explicit check in your code; treat any attempt to allocate
more than that as an error. For example:
#define MAX_ALLOC (SIZE_MAX / 2)

For systems where some type requires stricter alignment than long
long, or where the maximum sensible allocation is something other than
SIZE_MAX / 2, the constants are easily configured by editing a line or
two in the source (or using whatever configuration system you might
use).
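
A minimal sketch of the advice above, assuming MAX_ALIGN and 2 * sizeof(size_t) are both powers of two so that the larger one is a multiple of the other. The names allocate_sz, allocated_size, release_sz, DATA_OFFSET and TRAILER are made up for illustration and are not from the original code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed values; adjust per platform as suggested above. */
#define MAX_ALIGN (sizeof(long long))
#define MAX_ALLOC (SIZE_MAX / 2)

/* Offset of the user's data: max(MAX_ALIGN, 2 * sizeof(size_t)). */
#define DATA_OFFSET \
    (MAX_ALIGN > 2 * sizeof(size_t) ? MAX_ALIGN : 2 * sizeof(size_t))

#define SIGNATURE ((size_t)12345678UL)
#define TRAILER   ((size_t)0xFFFFu)

void *allocate_sz(size_t size)
{
    unsigned char *raw;
    size_t header[2];
    size_t trailer = TRAILER;

    /* Treat anything above MAX_ALLOC as an error (e.g. a negative
       value that was converted to size_t). */
    if (size > MAX_ALLOC)
        return NULL;

    raw = malloc(DATA_OFFSET + size + sizeof trailer);
    if (raw == NULL)
        return NULL;

    header[0] = SIGNATURE;
    header[1] = size;                 /* kept as size_t, never truncated */
    memcpy(raw, header, sizeof header);
    memset(raw + DATA_OFFSET, 0, size);
    /* The trailer may be misaligned, so it is copied byte-wise. */
    memcpy(raw + DATA_OFFSET + size, &trailer, sizeof trailer);
    return raw + DATA_OFFSET;
}

/* Reads the stored size back from the header. */
size_t allocated_size(const void *p)
{
    size_t header[2];
    memcpy(header, (const unsigned char *)p - DATA_OFFSET, sizeof header);
    assert(header[0] == SIGNATURE);
    return header[1];
}

void release_sz(void *p)
{
    if (p != NULL)
        free((unsigned char *)p - DATA_OFFSET);
}
```

Only the trailer needs memcpy(); the header sits at the malloc()ed address, and the user's data starts at a multiple of MAX_ALIGN under the stated assumption.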

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jun 25 '06 #23

Barry Schwarz wrote:
<snip>
It is legal for the user to malloc 0 bytes. I could not find a
description of what memset does if the third argument is 0. Let's
hope it checks first.

What do you mean? Why would it have to check anything? It's just as legal
for the user to memset() 0 bytes.

My copy of the standard says "void *memset(void *s, int c, size_t n); [..]
The memset function copies the value of c (converted to an unsigned char)
into each of the first n characters of the object pointed to by s." This
statement is well-defined if n == 0; memset() must do nothing.

Sometimes edge cases are special; it's a good idea if they're not.
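
A small check of the point above, as a sketch only. The function name is made up. Note that the pointer passed to memset() still has to be valid even when n is 0.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if a zero-length memset leaves the buffer untouched. */
int zero_length_memset_is_noop(void)
{
    char buf[4] = { 'a', 'b', 'c', 'd' };
    void *ret = memset(buf, 0, 0);   /* n == 0: no bytes are written */
    return ret == buf && memcmp(buf, "abcd", 4) == 0;
}
```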

S.
Jun 25 '06 #24

Ben Pfaff wrote:
Eric Sosman <es*****@acm-dot-org.invalid> writes:

When an allocation is released, consider filling it with
garbage.

This is a good idea. If you know a little about the target
platform, you can even choose particularly good garbage. In
"debug" allocators I have designed, I have chosen 0xcc, because
on x86 that is a "debug trap" instruction when executed and it's
unlikely to be a valid pointer value for user programs. (I think
I originally got this idea from a book.)


Fill patterns that resemble small odd negative values make
good garbage:

- negative, because negative values will be "obviously
wrong" in many contexts

- small, because if interpreted as unsigned values on a
two's complement machine the values will be "huge," and
stand a good chance of being detected as "ridiculous"

- odd, because if interpreted as a pointer there's a
decent chance of provoking a data-alignment trap

0x9F was used in one commercial product I worked on, and
0xBADD was seen a lot in the 16-bit days. 0xDEADBEEF is a
popular favorite. I'm personally rather fond of 0xDECEA5ED,
but that's mostly for its amusement value.
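
To see the three properties on a concrete pattern, here is a rough sketch using the 0x9F byte mentioned above. The helper names scribble and first_word are invented for the example, and it assumes the implementation provides the exact-width type int32_t.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FREED_FILL 0x9F   /* one of the byte patterns mentioned above */

/* Overwrites n bytes at p with the fill byte. */
void scribble(void *p, size_t n)
{
    memset(p, FREED_FILL, n);
}

/* Reads the first four bytes back as a 32-bit signed value. */
int32_t first_word(const void *p)
{
    int32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Because every byte is the same, the result does not depend on byte order: the word reads as 0x9F9F9F9F, which is negative as an int32_t, odd, and above 0x80000000 when viewed as unsigned.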

--
Eric Sosman
es*****@acm-dot-org.invalid
Jun 25 '06 #25

Barry Schwarz <sc******@doezl.net> writes:
On Sun, 25 Jun 2006 10:17:39 +0200, jacob navia
<ja***@jacob.remcomp.fr> wrote:
Thanks for the answers.
1) True, I do not consider alignment problems, and that could
be a problem in machines where integers are aligned at some
power of 2 boundary. A work-around would be to just
memcpy (or equivalent) and memcmp (or an equivalent loop)
instead of accessing the signature at the end of the
block as an integer.
This doesn't solve the problem that you are returning an address
*only* two ints beyond the allocated address.

Two possibilities come to mind:

On the very first call to your function, allocate a small
block (sizeof(long long)+sizeof(long double)). Convert the address to
unsigned long. Determine N, the number of low order zeros in the
result. Instead of using sizeof(int) as your delta, use pow(2,N).
Free the memory and get on with what the user requested.


I've worked on systems where the number of low-order zeros in the
representation of an address did *not* correspond directly to the
alignment in bytes of the address. (Cray vector systems, where
hardware addresses denote 64-bit words, and byte pointers
(CHAR_BIT==8) are implemented in software by storing an offset in the
high-order 3 bits of the pointer.)
On the very first call, compute max(sizeof(short),
sizeof(int), sizeof(size_t), sizeof(ptrdiff_t), sizeof(long long),
sizeof(long double), ...), betting on the assumption that the longest
object imposes the strictest alignment.


You can also create a union of all those types, then wrap it in a
struct:

struct foo {
char c;
union everything u;
};

If you've included enough stuff in "union everything", then
offsetof(struct foo, u) will probably be the strictest alignment,
suitable for use by malloc(). If long long is 8 bytes and requires
only 4-byte alignment, this method will give you 4 rather than 8,
which could save some space.
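
Spelled out, the trick might look like the sketch below. The member list of "union everything" is only a guess at "enough stuff", and GUESSED_ALIGN and round_up_to_align are names invented here. The standard does not promise that the padding before u is minimal, so the result is a likely value, not a guaranteed one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "union everything": extend the member list as needed. */
union everything {
    char c;
    short s;
    int i;
    long l;
    long long ll;
    float f;
    double d;
    long double ld;
    void *vp;
    void (*fp)(void);
    size_t sz;
};

struct foo {
    char c;
    union everything u;
};

/* Probably the strictest alignment among the listed members, in bytes. */
#define GUESSED_ALIGN offsetof(struct foo, u)

/* Rounds n up to the next multiple of GUESSED_ALIGN. */
size_t round_up_to_align(size_t n)
{
    size_t a = GUESSED_ALIGN;
    return (n + a - 1) / a * a;
}
```

A debugging allocator could then use round_up_to_align(2 * sizeof(size_t)) as the offset of the user's data.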

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jun 25 '06 #26

"Malcolm" <re*******@btinternet.com> writes:
"jacob navia" <ja***@jacob.remcomp.fr> wrote

Microsoft proposed in the TR 24731 to coin a new type
rsize_t that would be the maximum value of a legal request
for an object, and limited to a signed int, in 32 bit systems
2GB. I think the solution would be along those lines here.

And so gibberish types multiply.
Why not just go back to malloc() taking an int?


Because it won't always work. There are good reasons for int to be 32
bits even on 64-bit systems. (with 8-bit char, making int 64 bits
would leave a hole in the type system; short would be either 16 or 32
bits). But there's no reason to forbid malloc()ing more than INT_MAX
bytes.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jun 25 '06 #27

In article <4g*************@individual.net>,
Ian Collins <ia******@hotmail.com> wrote:
I've used max(sizeof(void*), sizeof(long double)) as a 'fairly safe'
value for alignment in similar situations.


Some implementations do not require doubles (for example) to have
alignment "as big as" their size. If you wanted to take advantage
of this you could declare structs such as

struct {char a; double b;}

and use offsetof() to get the alignment.

-- Richard
Jun 25 '06 #28

Eric Sosman wrote:
jacob navia wrote:
In the C tutorial for lcc-win32, I have a small chapter
about a debugging implementation of malloc.
[...]
Others have commented on the alignment issues, and on
the wisdom of initializing newly-allocated memory to a
"useful" value like all zeroes. I've got a few further
suggestions:

When an allocation is released, consider filling it
with garbage.


MSVC's debug-mode heap fills free entries with a predictable pattern
that is checked when it is used for a subsequent allocation. If you
are willing to concede this amount of a performance hit, then I would
endorse MSVC's strategy, as it allows you to detect double-frees, and
other heap corruptions in general.
[...] Consider keeping the "metadata" -- addresses, sizes,
and so on -- separate from the allocated data. The idea is to
make the metadata less vulnerable to common off-by-one errors.
Right, but then you are also less likely to detect some kinds of
errors. For example:

ptr1 = malloc(100);
ptr2 = malloc(100);
memcpy (ptr-16, ptr-16, 132);

The memcpy is erroneous, but if you are using a constant signature,
then there will appear to be no detectable error. Mixing the meta-data
in with the real data payloads makes it more "fragile"; but it is this
fragility that is precisely what you want to focus on to detect errors
more exhaustively.
(I've used a skip-list with the block address as a key for
this purpose; that's not portable because C doesn't define
comparison between pointers to "unrelated" objects, but it
works for mainstream C implementations.) Maintaining the
metadata separately involves more overhead, but this is, after
all, a debugging package.
In implementing my own debugging heaps, I've come to the conclusion
that you can do very good corruption detection with very little
performance hit. This is an important point, since it makes the option
of shipping the release version with the debug heap enabled
realistically available to you.
Consider adding a sanity-checker that visits all the
allocated areas and checks their signatures for signs of
damage. Making the checker callable by the user can help in
"wolf fence" debugging. If sufficiently desperate, sanity-
check at every transaction.
More generally, you should just make a heap-walker, like WATCOM C/C++'s
Clib includes. This would allow you to walk all allocated heap
entries, and sanity check them one by one. The programmer could then
instrument their own code with such checks as necessary.

My own debug heap includes a full heap implementation, so I
also track and walk the free entries.
I found it helpful to keep a sequence number in the
metadata for each block, and to let the user program query
the current value. This is useful when tracking down leaks:

open files, initialize libraries, ...
epoch = getCurrentSeqno();
do something that should be memory-neutral ...
listSurvivingAllocationsSince(epoch);

... where the final call just traverses all the allocations
that haven't been freed, reporting any whose sequence numbers
are greater than `epoch'.


This sounds interesting. But I would also add in __FILE__, __LINE__ as
well. In fact, in my debug heaps, I typically have:

void * dbgMalloc_internal (size_t size, const char * setto__FILE__,
int setto__LINE__);
#define dbgMalloc(s) dbgMalloc_internal ((s), __FILE__, __LINE__)
/* ... etc. ... */

If you need function pointers, you can just make wrapper functions.

There are two other very useful ideas to put into a debug heap manager.
As you malloc memory, convert the pointer to an intptr_t, and keep
track of the minimum and maximum values that are generated by the
allocator. These values usually correspond to a fairly narrow range of
integer addresses. Then the debug free function would first check that
the (intptr_t) cast value of its parameter was within the range of all
possible pointers that were returned by the debug malloc, calloc and
realloc functions. This allows you to (with very high probability)
detect attempts to free memory that didn't come from the heap in the
first place.
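
A rough sketch of the range check, with invented names (ranged_alloc, ranged_release, plausible_heap_pointer). It swaps intptr_t for uintptr_t so that the comparisons are unsigned, and it assumes the implementation provides that optional type.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

static uintptr_t lowest = UINTPTR_MAX;  /* smallest address handed out */
static uintptr_t highest = 0;           /* largest address handed out */

void *ranged_alloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL) {
        uintptr_t a = (uintptr_t)p;
        if (a < lowest)
            lowest = a;
        if (a > highest)
            highest = a;
    }
    return p;
}

/* Returns 1 if p lies within the range of addresses handed out so far,
   0 if it cannot have come from ranged_alloc. */
int plausible_heap_pointer(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a >= lowest && a <= highest;
}

void ranged_release(void *p)
{
    if (p == NULL)
        return;
    /* Rejects addresses outside the observed range. */
    assert(plausible_heap_pointer(p));
    free(p);
}
```

A pointer inside the range is not proven valid; the check only rules out the ones that are clearly foreign.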

The second is how I do the signatures. Using a random constant (like
0x1234567) on the head and foot works fine, but won't help you detect
an over-large memcpy such as I've shown in my first example above. So
for the signature, I usually pick the value:

((intptr_t) ptr) ^ MAGIC_CONST

Where ptr is exactly equal to the address of the position where each
signature is stored. So every signature is address specific (and thus
not movable) while still retaining the same bit entropy as your
MAGIC_CONST.
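
As a sketch, assuming uintptr_t is available (used here in place of intptr_t) and with helper names invented for the example:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAGIC_CONST ((uintptr_t)0x1234567u)

/* Signature for the slot at `where`: the slot's own address XOR a constant. */
static uintptr_t signature_for(const void *where)
{
    return (uintptr_t)where ^ MAGIC_CONST;
}

/* memcpy avoids assuming the slot is aligned for uintptr_t. */
void plant_signature(void *where)
{
    uintptr_t sig = signature_for(where);
    memcpy(where, &sig, sizeof sig);
}

/* Returns 1 if the slot still holds the signature for its own address. */
int signature_intact(const void *where)
{
    uintptr_t sig;
    memcpy(&sig, where, sizeof sig);
    return sig == signature_for(where);
}
```

A signature copied from one block to another by an over-large memcpy no longer matches, because the expected value depends on where it is stored.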

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jun 25 '06 #29

we******@gmail.com wrote:
Eric Sosman wrote:
jacob navia wrote:

[...] Consider keeping the "metadata" -- addresses, sizes,
and so on -- separate from the allocated data. The idea is to
make the metadata less vulnerable to common off-by-one errors.
Right, but then you are also less likely to detect some kinds of
errors. For example:

ptr1 = malloc(100);
ptr2 = malloc(100);
memcpy (ptr-16, ptr-16, 132);


(I guess the two appearances of ptr are supposed to
be ptr1 and ptr2. Not too sure where the 132 came from;
are you assuming that the two allocations are contiguous?
Hmmm -- maybe not. Moot point, though, because the problem
is not with the separate metadata but with the constant
signature, and as it turns out neither of us uses a constant
signature.)
The memcpy is erroneous, but if you are using a constant signature,
then there will appear to be no detectable error. Mixing the meta-data
in with the real data payloads makes it more "fragile"; but it is this
fragility that is precisely what you want to focus on to detect errors
more exhaustively.


Yah. Instead of a constant signature, I've used a block-
specific signature formed by hashing the block address with
the block length. I plant a copy of the signature before and
after the "payload" of each block, and check it on free() and
realloc(). (Reading further along in your reply, I see you've
done something fairly similar.)

The separately-allocated metadata provides a further validity
check. Instead of just trusting the pointer given to free() or
realloc(), use that pointer as a key and search for it in the
metadata. Not found? Bad argument! Found? That's nice, now
check the block signatures for signs of damage.
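
The lookup-then-check order might be sketched as follows. This is not the skip-list version described elsewhere in the thread: a fixed-size table with a linear search stands in for it, which only needs pointer equality. MAX_BLOCKS, FRONT, block_sig and the function names are assumptions made for this example, and FRONT is assumed to be a multiple of the strictest alignment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_BLOCKS 1024               /* assumed capacity for this sketch */
#define FRONT 16                      /* assumed multiple of the strictest alignment */
#define GUARD (sizeof(uintptr_t))     /* assumed to be no larger than FRONT */

enum check { CHECK_OK, CHECK_UNKNOWN, CHECK_DAMAGED };

struct entry { unsigned char *raw; size_t size; };
static struct entry table[MAX_BLOCKS];   /* raw == NULL marks a free slot */

/* Stand-in hash of block address and length; any decent mix would do. */
static uintptr_t block_sig(const void *raw, size_t size)
{
    return ((uintptr_t)raw * 31u) ^ (uintptr_t)size ^ (uintptr_t)0x5A5A5A5Au;
}

void *checked_alloc(size_t size)
{
    size_t i;
    uintptr_t sig;
    unsigned char *raw;

    if (size > SIZE_MAX - FRONT - GUARD)
        return NULL;
    for (i = 0; i < MAX_BLOCKS && table[i].raw != NULL; i++)
        ;
    if (i == MAX_BLOCKS)
        return NULL;
    raw = malloc(FRONT + size + GUARD);
    if (raw == NULL)
        return NULL;
    sig = block_sig(raw, size);
    memcpy(raw + FRONT - GUARD, &sig, GUARD);   /* copy before the payload */
    memcpy(raw + FRONT + size, &sig, GUARD);    /* copy after the payload */
    table[i].raw = raw;
    table[i].size = size;
    return raw + FRONT;
}

enum check checked_release(void *p)
{
    size_t i;
    uintptr_t sig, head, tail;
    unsigned char *raw;

    if (p == NULL)
        return CHECK_OK;
    /* Look the pointer up before touching any memory around it. */
    for (i = 0; i < MAX_BLOCKS; i++)
        if (table[i].raw != NULL && table[i].raw + FRONT == (unsigned char *)p)
            break;
    if (i == MAX_BLOCKS)
        return CHECK_UNKNOWN;          /* not found: bad argument */
    raw = table[i].raw;
    sig = block_sig(raw, table[i].size);
    memcpy(&head, raw + FRONT - GUARD, GUARD);
    memcpy(&tail, raw + FRONT + table[i].size, GUARD);
    table[i].raw = NULL;
    free(raw);
    return (head == sig && tail == sig) ? CHECK_OK : CHECK_DAMAGED;
}
```

Because the size lives in the table and not in the block, an overrun can damage the guards but not the length used to find them.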

No one scheme seems likely to detect every error, so there's
a sort of guessing game about what sorts of error are both likely
and resistant to detection by other means. Then you put together
a data structure that helps detect the likely errors, but is not
particularly likely to be damaged by them. Nothing's perfect, of
course; there's really no defense against

for (;;)
*(*argv + rand()) = 42;

--
Eric Sosman
es*****@acm-dot-org.invalid
Jun 26 '06 #30

Eric Sosman <es*****@acm-dot-org.invalid> writes:
jacob navia wrote:
In the C tutorial for lcc-win32, I have a small chapter
about a debugging implementation of malloc.
[...]


Others have commented on the alignment issues, and on
the wisdom of initializing newly-allocated memory to a
"useful" value like all zeroes. I've got a few further
suggestions:

When an allocation is released, consider filling it
with garbage. This may help catch erroneous uses of an
area after it's been freed, as in the famously incorrect
code for freeing a linked list:

for (ptr = head; ptr != NULL; ptr = ptr->next)
free(ptr);
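
For contrast, a corrected loop reads the link before releasing the node. The struct layout and the names free_list and push are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Frees every node; returns the number of nodes released. */
size_t free_list(struct node *head)
{
    size_t count = 0;
    struct node *ptr = head;
    while (ptr != NULL) {
        struct node *next = ptr->next;  /* read the link before freeing */
        free(ptr);
        ptr = next;
        count++;
    }
    return count;
}

/* Helper for trying it out: pushes a node on the front of the list.
   On allocation failure the list is returned unchanged. */
struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->value = value;
    n->next = head;
    return n;
}
```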

[...]

Better yet: when allocating memory, fill it with one kind of garbage
(to detect, though not reliably, the error of accessing uninitialized
memory), and when releasing it, fill it with another kind of garbage
(to detect, though not reliably, the distinct error of attempting to
access freed memory). It could be a substantial performance hit, but
it might be worth it if it detects errors.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jun 26 '06 #31

Keith Thompson wrote:
Eric Sosman <es*****@acm-dot-org.invalid> writes:
jacob navia wrote:
In the C tutorial for lcc-win32, I have a small chapter
about a debugging implementation of malloc.
[...]


Others have commented on the alignment issues, and on
the wisdom of initializing newly-allocated memory to a
"useful" value like all zeroes. I've got a few further
suggestions:

When an allocation is released, consider filling it
with garbage. This may help catch erroneous uses of an
area after it's been freed, as in the famously incorrect
code for freeing a linked list:

for (ptr = head; ptr != NULL; ptr = ptr->next)
free(ptr);


[...]

Better yet: when allocating memory, fill it with one kind of garbage
(to detect, though not reliably, the error of accessing uninitialized
memory), and when releasing it, fill it with another kind of garbage
(to detect, though not reliably, the distinct error of attempting to
access freed memory). It could be a substantial performance hit, but
it might be worth it if it detects errors.


I fill newly-allocated blocks with 0xFEEDC0DE, and fill
released blocks with 0xDECEA5ED. (0xFEEDC0DE violates one of
the suggested guidelines I posted elsethread, but I've been
unable to change it due to an arm injury suffered while trying
to pat myself on the back over my own cleverness. ;-)

--
Eric Sosman
es*****@acm-dot-org.invalid

Jun 26 '06 #32

Keith Thompson wrote:
"Malcolm" <re*******@btinternet.com> writes:
"jacob navia" <ja***@jacob.remcomp.fr> wrote
Microsoft proposed in the TR 24731 to coin a new type
rsize_t that would be the maximum value of a legal request
for an object, and limited to a signed int, in 32 bit systems
2GB. I think the solution would be along those lines here.

And so gibberish types multiply.
Why not just go back to malloc() taking an int?


Because it won't always work. There are good reasons for int to be 32
bits even on 64-bit systems. (with 8-bit char, making int 64 bits
would leave a hole in the type system; short would be either 16 or 32
bits). But there's no reason to forbid malloc()ing more than INT_MAX
bytes.

Isn't "int" supposed to be the "natural" integer type for a platform? The
"hole" argument is compelling up to a point, but an implementation can
always provide int_least32_t and int_fast32_t for code that needs such types
(which, for ease of portability, will typically hide these behind typedefs
anyway).

Today's systems properly make the tradeoff, but I can imagine 64-bit
architectures where having a 32-bit int would be silly. ('short' would
indeed be likely to be 32 bits there, and there might be a separate 16-bit
type not covered by the regular types).

S.
Jun 26 '06 #33

Skarmander wrote:
Isn't "int" supposed to be the "natural" integer type for a platform?


By tradition, but not by force of law.

--
pete
Jun 26 '06 #34

Skarmander <in*****@dontmailme.com> writes:
Keith Thompson wrote:
"Malcolm" <re*******@btinternet.com> writes:
"jacob navia" <ja***@jacob.remcomp.fr> wrote
Microsoft proposed in the TR 24731 to coin a new type
rsize_t that would be the maximum value of a legal request
for an object, and limited to a signed int, in 32 bit systems
2GB. I think the solution would be along those lines here.

And so gibberish types multiply.
Why not just go back to malloc() taking an int?

Because it won't always work. There are good reasons for int to be 32
bits even on 64-bit systems. (with 8-bit char, making int 64 bits
would leave a hole in the type system; short would be either 16 or 32
bits). But there's no reason to forbid malloc()ing more than INT_MAX
bytes.

Isn't "int" supposed to be the "natural" integer type for a platform?


Sure, but "natural" is a loose enough concept that the requirement is
unenforceable.
The "hole" argument is compelling up to a point, but an implementation
can always provide int_least32_t and int_fast32_t for code that needs
such types (which, for ease of portability, will typically hide these
behind typedefs anyway).
Sure, for C99. I don't think C99 has yet influenced the choice
of sizes of predefined types. A lot of C code is written
to be portable to C90 implementations; such code cannot use
<stdint.h>. (Unless it uses Doug Gwyn's q8 implementation,
<http://www.lysator.liu.se/c/q8/index.html> (link currently down).)
Today's systems properly make the tradeoff, but I can imagine 64-bit
architectures where having a 32-bit int would be silly. ('short' would
indeed be likely to be 32 bits there, and there might be a separate
16-bit type not covered by the regular types).


I don't have to imagine them. I've used systems with:

char 8 bits
short 32 bits
int 64 bits

and others with

char 8 bits
short 64 bits
int 64 bits

The lack of a 32-bit integer type on the latter was inconvenient at
times. (The implementation was C90 only.)

(If you're curious, the systems were a Cray T3E (based on the Alpha
processor) and a Cray T90 (vector system), respectively.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jun 26 '06 #35

Keith Thompson posted:

and others with

char 8 bits
short 64 bits
int 64 bits

The lack of a 32-bit integer type on the latter was inconvenient at
times. (The implementation was C90 only.)

I'm curious, why is the lack of a 32-Bit integer type inconvenient? It
would seem that the 64-Bit integer types would provide all the
"functionality" you need... unless you're depending on 0 rolling backwards
into 4294967295?
--

Frederick Gotham
Jun 26 '06 #36

Keith Thompson wrote:
Skarmander <in*****@dontmailme.com> writes:
Keith Thompson wrote:
"Malcolm" <re*******@btinternet.com> writes:
"jacob navia" <ja***@jacob.remcomp.fr> wrote
Microsoft proposed in the TR 24731 to coin a new type
rsize_t that would be the maximum value of a legal request
for an object, and limited to a signed int, in 32 bit systems
2GB. I think the solution would be along those lines here.

And so gibberish types multiply.
Why not just go back to malloc() taking an int?
Because it won't always work. There are good reasons for int to be 32
bits even on 64-bit systems. (with 8-bit char, making int 64 bits
would leave a hole in the type system; short would be either 16 or 32
bits). But there's no reason to forbid malloc()ing more than INT_MAX
bytes.

Isn't "int" supposed to be the "natural" integer type for a platform?


Sure, but "natural" is a loose enough concept that the requirement is
unenforceable.
The "hole" argument is compelling up to a point, but an implementation
can always provide int_least32_t and int_fast32_t for code that needs
such types (which, for ease of portability, will typically hide these
behind typedefs anyway).


Sure, for C99. I don't think C99 has yet influenced the choice
of sizes of predefined types. A lot of C code is written
to be portable to C90 implementations; such code cannot use
<stdint.h>. (Unless it uses Doug Gwyn's q8 implementation,
<http://www.lysator.liu.se/c/q8/index.html> (link currently down).)

Clarify: I meant to imply that implementations are free to provide
nonstandard integer types as an extension. (Like, say, __int32.) Portable
code that needs exactly-sized integer types will usually use its own typedef
sequestered in a header; customizing this is easy.

If implementations offer additional integer types through C99 standard types
that's even better; I was just using the C99 types to unambiguously describe
the kind of types I'm talking about.
Today's systems properly make the tradeoff, but I can imagine 64-bit
architectures where having a 32-bit int would be silly. ('short' would
indeed be likely to be 32 bits there, and there might be a separate
16-bit type not covered by the regular types).


I don't have to imagine them. I've used systems with:

char 8 bits
short 32 bits
int 64 bits

and others with

char 8 bits
short 64 bits
int 64 bits

The lack of a 32-bit integer type on the latter was inconvenient at
times. (The implementation was C90 only.)

(If you're curious, the systems were a Cray T3E (based on the Alpha
processor) and a Cray T90 (vector system), respectively.)

Thanks for the information. I've only been familiar with the recent x86-64
platforms, which for obvious reasons make very sure to keep 32-bit types around.

S.
Jun 26 '06 #37

Frederick Gotham <fg*******@SPAM.com> writes:
Keith Thompson posted:

and others with

char 8 bits
short 64 bits
int 64 bits

The lack of a 32-bit integer type on the latter was inconvenient at
times. (The implementation was C90 only.)

I'm curious, why is the lack of a 32-Bit integer type inconvenient? It
would seem that the 64-Bit integer types would provide all the
"functionality" you need... unless you're depending on 0 rolling backwards
into 4294967295?


The particular case I'm thinking of involved an externally defined
data format. The code, which was in a system header file, used a
32-bit bit field where most other implementations used an ordinary
struct member. This caused problems for code that took that member's
address.

A workaround was possible, but it was, as I said, inconvenient.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jun 26 '06 #38


Eric Sosman wrote:
jacob navia wrote:
In the C tutorial for lcc-win32, I have a small chapter
about a debugging implementation of malloc.
[...]

Others have commented on the alignment issues, and on
the wisdom of initializing newly-allocated memory to a
"useful" value like all zeroes. I've got a few further
suggestions:

When an allocation is released, consider filling it
with garbage. This may help catch erroneous uses of an
area after it's been freed, as in the famously incorrect
code for freeing a linked list:

for (ptr = head; ptr != NULL; ptr = ptr->next)
free(ptr);

release() leaks the memory that's passed to it, so the
package isn't suitable for debugging programs that churn
through a lot of allocations and releases and expect the
released memory to be recycled. Consider passing released
areas to free(), perhaps not immediately (so the garbage
mentioned above has time to cause damage). You might make
a FIFO list of released areas, and let it grow to a few
thousand entries before handing the oldest to free(). You
might even postpone free() until malloc() returns NULL.
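
One possible shape for the delayed free(), as a sketch only. The names, the ring-buffer layout and the fill byte are assumptions, and the caller passes the size because plain malloc() does not record it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define QUARANTINE_SLOTS 4096   /* "a few thousand entries"; assumed value */
#define FREED_FILL 0xDE         /* arbitrary garbage byte for this sketch */

static void *ring[QUARANTINE_SLOTS];
static size_t ring_head;        /* index of the oldest entry */
static size_t ring_count;       /* number of entries currently held */

/* Scribbles over the block and parks it; when the ring is full, the
   oldest parked block is handed to free() first. */
void delayed_release(void *p, size_t size)
{
    if (p == NULL)
        return;
    memset(p, FREED_FILL, size);
    if (ring_count == QUARANTINE_SLOTS) {
        free(ring[ring_head]);
        ring_head = (ring_head + 1) % QUARANTINE_SLOTS;
        ring_count--;
    }
    ring[(ring_head + ring_count) % QUARANTINE_SLOTS] = p;
    ring_count++;
}

/* Hands every parked block to free(), for example after malloc() has
   returned NULL. Returns the number of blocks released. */
size_t flush_quarantine(void)
{
    size_t released = 0;
    while (ring_count > 0) {
        free(ring[ring_head]);
        ring_head = (ring_head + 1) % QUARANTINE_SLOTS;
        ring_count--;
        released++;
    }
    return released;
}

/* Number of blocks currently parked. */
size_t quarantined(void)
{
    return ring_count;
}
```

While a block is parked it is still allocated, so a stale read through an old pointer sees the garbage fill and not recycled data.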

Consider keeping the "metadata" -- addresses, sizes,
and so on -- separate from the allocated data. The idea is to
make the metadata less vulnerable to common off-by-one errors.
(I've used a skip-list with the block address as a key for
this purpose; that's not portable because C doesn't define
comparison between pointers to "unrelated" objects, but it
works for mainstream C implementations.) Maintaining the
metadata separately involves more overhead, but this is, after
all, a debugging package.

Consider adding a sanity-checker that visits all the
allocated areas and checks their signatures for signs of
damage. Making the checker callable by the user can help in
"wolf fence" debugging. If sufficiently desperate, sanity-
check at every transaction.

I found it helpful to keep a sequence number in the
metadata for each block, and to let the user program query
the current value. This is useful when tracking down leaks:

open files, initialize libraries, ...
epoch = getCurrentSeqno();
do something that should be memory-neutral ...
listSurvivingAllocationsSince(epoch);

... where the final call just traverses all the allocations
that haven't been freed, reporting any whose sequence numbers
are greater than `epoch'.
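
A bare-bones sketch of the sequence-number idea. getCurrentSeqno and listSurvivingAllocationsSince are the names used above; everything else (the record layout, tracked_alloc, tracked_release, the plain linked list) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical metadata record, kept apart from the user's block. */
struct record {
    void *ptr;
    size_t size;
    unsigned long seqno;
    struct record *next;
};

static struct record *live;          /* allocations not yet released */
static unsigned long current_seqno;  /* bumped on every allocation */

unsigned long getCurrentSeqno(void)
{
    return current_seqno;
}

void *tracked_alloc(size_t size)
{
    struct record *rec = malloc(sizeof *rec);
    if (rec == NULL)
        return NULL;
    rec->ptr = malloc(size);
    if (rec->ptr == NULL) {
        free(rec);
        return NULL;
    }
    rec->size = size;
    rec->seqno = ++current_seqno;
    rec->next = live;
    live = rec;
    return rec->ptr;
}

void tracked_release(void *p)
{
    struct record **link;
    if (p == NULL)
        return;
    for (link = &live; *link != NULL; link = &(*link)->next) {
        if ((*link)->ptr == p) {
            struct record *rec = *link;
            *link = rec->next;      /* unlink the record, then free both */
            free(rec->ptr);
            free(rec);
            return;
        }
    }
    assert(!"tracked_release: pointer was not allocated here");
}

/* Reports live blocks allocated after `epoch`; returns how many. */
size_t listSurvivingAllocationsSince(unsigned long epoch)
{
    size_t count = 0;
    const struct record *rec;
    for (rec = live; rec != NULL; rec = rec->next) {
        if (rec->seqno > epoch) {
            fprintf(stderr, "leak? seqno %lu, %lu bytes at %p\n",
                    rec->seqno, (unsigned long)rec->size, rec->ptr);
            count++;
        }
    }
    return count;
}
```

Blocks allocated before the epoch are ignored, so long-lived setup allocations do not clutter the report.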
An excellent collection of suggestions. *applause*

Jul 17 '06 #39


Chris Torek wrote:
Barry Schwarz wrote:
At this point [where r is the result of a malloc() call] r contains
the address of a block of memory properly aligned for any possible
object.

In article <4g*************@individual.net>
Ian Collins <ia******@hotmail.com> wrote:
Is there a portable way of finding out what this alignment is?

No. I consider this a flaw in the C standards.

If there *were* a portable way to find this, you could write portable
"augmented malloc and free" routines along the lines of those Mr Navia
has provided (with more work to deal with alignment, of course). ...
I think you mean to add a qualifier that pointers
can be turned into integers and operated on sensibly.
Otherwise knowing the alignment doesn't help.

Jul 17 '06 #40

<en******@yahoo.com> wrote in message
news:11**********************@m79g2000cwm.googlegroups.com...

Chris Torek wrote:
Barry Schwarz wrote:
At this point [where r is the result of a malloc() call] r contains
the address of a block of memory properly aligned for any possible
object.

In article <4g*************@individual.net>
Ian Collins <ia******@hotmail.com> wrote:
Is there a portable way of finding out what this alignment is?

No. I consider this a flaw in the C standards.

If there *were* a portable way to find this, you could write portable
"augmented malloc and free" routines along the lines of those Mr Navia
has provided (with more work to deal with alignment, of course). ...

I think you mean to add a qualifier that pointers
can be turned into integers and operated on sensibly.
Otherwise knowing the alignment doesn't help.
Dumb question:

If we have no guarantee of modular math on the pointers, then why not
repeated subtraction on the pointers (which is clearly guaranteed)?

Jul 17 '06 #41


Dann Corbit wrote:
<en******@yahoo.com> wrote in message
news:11**********************@m79g2000cwm.googlegroups.com...

Chris Torek wrote:
Barry Schwarz wrote:
At this point [where r is the result of a malloc() call] r contains
the address of a block of memory properly aligned for any possible
object.

In article <4g*************@individual.net>
Ian Collins <ia******@hotmail.com> wrote:
Is there a portable way of finding out what this alignment is?

No. I consider this a flaw in the C standards.

If there *were* a portable way to find this, you could write portable
"augmented malloc and free" routines along the lines of those Mr Navia
has provided (with more work to deal with alignment, of course). ...
I think you mean to add a qualifier that pointers
can be turned into integers and operated on sensibly.
Otherwise knowing the alignment doesn't help.

Dumb question:

If we have no guarantee of modular math on the pointers, then why not
repeated subtraction on the pointers (which is clearly guaranteed)?
Subtraction on pointers means you have a base
pointer to start with. I took the comment to
mean that finding the initial base pointer is
what we're looking for. Can't do subtraction
if there's no base to subtract.

Jul 18 '06 #42
