When to check the return value of malloc

Howdy,

I was reflecting recently on malloc.

Obviously, for tiny allocations like 20 bytes to strcpy a filename or
something, there's no point putting in a check on the return value of
malloc.

OTOH, if you're allocating a gigabyte for a large array, this might
fail, so you should definitely check for a NULL return.

So somewhere in between these extremes, there must be a point where you
stop ignoring malloc's return value, and start checking it.

Where do people draw this line? I guess it depends on the likely system
the program will be deployed on, but are there any good rules of thumb?

Rgds,
MJ

Jan 18 '08
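For concreteness, the "always check" position argued throughout this thread
looks like the following even for a tiny request; a minimal sketch, with an
illustrative filename and error message that come from no particular post:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char *name = "example.txt";
    char *copy = malloc(strlen(name) + 1); /* tiny request, still checked */

    if (copy == NULL) {
        fprintf(stderr, "out of memory\n"); /* the check costs two lines */
        return EXIT_FAILURE;
    }
    strcpy(copy, name);
    puts(copy);
    free(copy);
    return 0;
}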
Malcolm McLean wrote:
In gcc you have a -O setting to set the optimisation level. Normally you
leave it unset whilst debugging, and ramp it up to maximum for the
production build.
Why?
--
Army1987 (Replace "NOSPAM" with "email")
Jan 24 '08 #101

"Army1987" <ar******@NOSPAM.itwrote in message
Malcolm McLean wrote:
>In gcc you have a -O setting to set the optimisation level. Normally you
leave it unset whilst debugging, and ramp it up to maximum for the
production build.
Why?
Because it compiles faster with the optimisation settings set low. Also, if
you turn debugging or profiling on it gives a stack trace rather than
merging subroutines.
Then there are a few bugs that appear only with high optimisation settings.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Jan 24 '08 #102
On Thu, 24 Jan 2008 20:19:19 +0000 (UTC), Army1987
<ar******@NOSPAM.it> wrote:
>Malcolm McLean wrote:
>In gcc you have a -O setting to set the optimisation level. Normally you
leave it unset whilst debugging, and ramp it up to maximum for the
production build.
Why?
Two reasons. One, large systems build faster unoptimized. Two,
following a debugger through optimized code can be very difficult.

--
Al Balmer
Sun City, AZ
Jan 24 '08 #103
Richard Tobin wrote:
[ using int vs size_t for the size in a malloc replacement]
I've had several bugs (my own and others) where a miscalculation
resulted in an attempt to malloc() a negative size. If the size
parameter could be negative, you might get a more informative
error message. (Of course, after a while you get to recognise what
malloc failing with an argument of 4294967292 means.)
Well, it might have discovered errors for you, and I don't think anyone
claims that it isn't possible, but that is not what people are arguing
against. No, the problem with using a malloc-like function using int for
the size is that it is not a general replacement. Even worse, you might
call this with 0x14000 and since int is 16 bits it will allocate 0x4000!
This will obviously cause buffer overflows. Further, a legitimate
allocation of 0x20000 is simply not possible. Lastly, for another program
which just inputs and outputs numbers and therefore never needs more than a
few hundred bytes, the artificial limit of 0x7fff bytes is pure nonsense.

Seriously, the remedy for this has been already mentioned in this thread
long ago: use size_t, as it was meant to be, and use a configurable maximum
allocation size.

Some further notes:
1. I know that int is not 16 bits. ;)
2. Converting size_t to int can cause buffer overflows. Using size_t the
whole way through doesn't guarantee freedom from those errors, you can
still mess up size calculations. Using xcalloc( size, count) or a similar
interface helps.

Uli

Jan 24 '08 #104
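A minimal sketch of the size_t-based, overflow-checked interface Ulrich
suggests; the body and the configurable cap are illustrative assumptions,
not his actual code:

#include <stdlib.h>

static size_t xalloc_limit = 0; /* configurable maximum; 0 means no cap */

void *xcalloc(size_t count, size_t size)
{
    /* reject multiplication overflow before count * size can wrap;
       (size_t)-1 is the largest value a size_t can hold */
    if (size != 0 && count > (size_t)-1 / size)
        return NULL;
    if (xalloc_limit != 0 && count * size > xalloc_limit)
        return NULL;
    return calloc(count, size);
}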
In article <5v*************@mid.uni-berlin.de>,
Ulrich Eckhardt <do******@knuut.de> wrote:
>Well, it might have discovered errors for you, and I don't think anyone
claims that it isn't possible, but that is not what people are arguing
against. No, the problem with using a malloc-like function using int for
the size is that it is not a general replacement. Even worse, you might
call this with 0x14000 and since int is 16 bits it will allocate 0x4000!
I'm certainly not suggesting that you should use int for a
general-purpose malloc() replacement. Rather, it would be an
argument for size_t being signed.

-- Richard
--
:wq
Jan 24 '08 #105
In article <bi********************************@4ax.com>,
Al Balmer <al******@att.net> wrote:
>>In gcc you have a -O setting to set the optimisation level. Normally you
leave it unset whilst debugging, and ramp it up to maximum for the
production build.
>>Why?
>Two reasons. One, large systems build faster unoptimized. Two,
following a debugger through optimized code can be very difficult.
On the other hand, there are several common errors that are only
detected when optimisation is done, so I have it turned on from the
start. Occasionally I have to turn it off temporarily to debug
something.

-- Richard
--
:wq
Jan 24 '08 #106

"Richard Tobin" <ri*****@cogsci.ed.ac.ukwrote in message
In article <5v*************@mid.uni-berlin.de>,
Ulrich Eckhardt <do******@knuut.dewrote:
>>Well, it might have discovered errors for you, and I don't think anyone
claims that it isn't possible, but that is not what people are arguing
against. No, the problem with using a malloc-like function using int for
the size is that it is not a general replacement. Even worse, you might
call this with 0x14000 and since int is 16 bits it will allocate 0x4000!

I'm certainly not suggesting that you should use int for a
general-purpose malloc() replacement. Rather, it would be an
argument for size_t being signed.
Really Ulrich is making an argument against fixed-size types. Unless he is
absolutely consistent in using either size_t or int in indexing that array,
either could come unstuck. size_t might be 16 bits and int 32.
If the limits were specified at declaration time, like in Ada, we wouldn't
have this problem. Nor would we if we took the Visual Basic approach of a
variable object, or the Perl approach of "it's both a string of digits and
an integer".

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Jan 24 '08 #107
On Thu, 24 Jan 2008 06:15:29 -0600, santosh wrote
(in article <fn**********@registered.motzarella.org>):
Kelsey Bjarnason wrote:
>[snips]

<snip>
>And if nothing else, malloc doesn't lie about what it actually does,
and works on all legitimate size requests, if memory is, in fact,
available to meet the request.

Unfortunately, only on systems that do not overcommit memory.
malloc() will also fail on such systems. Don't believe it? Try and
malloc 2GB on a 32-bit box running one of these platforms. (you may
have a kernel mod to allow 3GB per proc, if so, extend the request size
accordingly).
--
Randy Howard (2reply remove FOOBAR)
"The power of accurate observation is called cynicism by those
who have not got it." - George Bernard Shaw

Jan 24 '08 #108
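A quick way to try Randy's experiment; the 2GB figure is from his post, the
rest of this little test program is an illustrative sketch:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t request = (size_t)2 * 1024 * 1024 * 1024; /* 2GB, as in the post */
    void *p = malloc(request);

    printf("malloc(%lu) %s\n", (unsigned long)request,
           p != NULL ? "succeeded" : "failed");
    free(p); /* free(NULL) is harmless */
    return 0;
}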
Ulrich Eckhardt said:

<snip>
Some further notes:
1. I know that int is not 16 bits. ;)
Hmmm. How sure are you? I've certainly used systems where it /is/ 16 bits.
I've also used systems where it isn't.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Jan 25 '08 #109
Richard Tobin wrote:
In article <5v*************@mid.uni-berlin.de>,
Ulrich Eckhardt <do******@knuut.de> wrote:
>>Well, it might have discovered errors for you, and I don't think anyone
claims that it isn't possible, but that is not what people are arguing
against. No, the problem with using a malloc-like function using int for
the size is that it is not a general replacement. Even worse, you might
call this with 0x14000 and since int is 16 bits it will allocate 0x4000!

I'm certainly not suggesting that you should use int for a
general-purpose malloc() replacement. Rather, it would be an
argument for size_t being signed.
If it was 32 bits large, why wouldn't I be allowed to allocate 3GiB?
Further, signed types don't behave in any particular way on overflow
either, they rather cause undefined behaviour and this behaviour shows[1]!
Lastly, you could argue that a malloc() replacement should take an argument
that is simply larger than size_t and then check if the conversion to
size_t is valid. Seriously, all you achieve is that allocations are topped
at a certain limit, but this limit is neither technically necessary nor is
it even configurable!

Uli

[1]
This code:

int s = count * sizeof (element);
if(s<0)
error("overflow");

is flawed, because the check is only possibly triggered when undefined
behaviour has already been caused by signed overflow. There are popular
compilers that use this to optimise out the check.
Jan 25 '08 #110
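For contrast with the flawed signed check in footnote [1], here is a sketch
of the usual well-defined test, performed in unsigned arithmetic before the
multiplication; the type name element is a stand-in:

#include <stdlib.h>

typedef double element; /* stand-in for whatever is being allocated */

void *alloc_elements(size_t count)
{
    /* unsigned division cannot overflow, so this test is always defined */
    if (count > (size_t)-1 / sizeof (element))
        return NULL; /* count * sizeof (element) would wrap */
    return malloc(count * sizeof (element));
}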
Malcolm McLean wrote:
"Richard Tobin" <ri*****@cogsci.ed.ac.ukwrote in message
>In article <5v*************@mid.uni-berlin.de>,
Ulrich Eckhardt <do******@knuut.dewrote:
>>>Well, it might have discovered errors for you, and I don't think anyone
claims that it isn't possible, but that is not what people are arguing
against. No, the problem with using a malloc-like function using int for
the size is that it is not a general replacement. Even worse, you might
call this with 0x14000 and since int is 16 bits it will allocate
0x4000!

I'm certainly not suggesting that you should use int for a
general-purpose malloc() replacement. Rather, it would be an
argument for size_t being signed.
Really Ulrich is making an argument against fixed-size types.
Am I? In case you mean types that dynamically adapt themselves to the stored
value, i.e. types that are not limited like integers in C, then yes, those
would surely avoid overflows. I wouldn't want to pay the price though,
neither for their dynamic allocation on an embedded platform nor for
checking whether the dynamic allocation behind the last multiplication
perhaps failed. This would place a really heavy burden on the required
error-handling code.
Unless he is absolutely consistent in using either size_t or int in
indexing that array, either could come unstuck. size_t might be 16 bits
and int 32.
I have yet to find a platform where size_t is smaller than int. That said, I
do use size_t instead of int to index through arrays, exactly because I
also store the size of the arrays in size_t (unless they are constants).
This allows me to raise the warning levels of the compiler without getting
too many warnings.
If the limits were specified at declaration time, like in Ada, we wouldn't
have this problem. Nor would we if we took the Visual Basic approach of a
variable object, or the Perl approach of "it's both a string of digits and
an integer".
You can well do that, though it will look extremely clumsy because C doesn't
allow you to overload operators. However, C's builtin types can't do that.

Uli

Jan 25 '08 #111
Ulrich Eckhardt wrote:
Richard Tobin wrote:
>In article <5v*************@mid.uni-berlin.de>,
Ulrich Eckhardt <do******@knuut.de> wrote:
>>>Well, it might have discovered errors for you, and I don't think
anyone claims that it isn't possible, but that is not what people are
arguing against. No, the problem with using a malloc-like function
using int for the size is that it is not a general replacement. Even
worse, you might call this with 0x14000 and since int is 16 bits it
will allocate 0x4000!

I'm certainly not suggesting that you should use int for a
general-purpose malloc() replacement. Rather, it would be an
argument for size_t being signed.

If it was 32 bits large, why wouldn't I be allowed to allocate 3GiB?
And we are back to Malcolm's stand that everyone must move to 64-bit
ints.
Further, signed types don't behave in any particular way on overflow
either, they rather cause undefined behaviour and this behaviour
shows[1]! Lastly, you could argue that a malloc() replacement should
take an argument that is simply larger than size_t and then check if
the conversion to size_t is valid. Seriously, all you achieve is that
allocations are topped at a certain limit, but this limit is neither
technically necessary nor is it even configurable!
size_t is fine as long as you use it correctly. Or at least I have not
encountered any notable nuisances in its usage. Yes, loops are
sometimes a bit awkward with an unsigned type, and need more attention,
but that is a trivial issue.

There is no point in using a signed type for size values, since size can
never be negative. Malcolm does have a point, but it's not applicable
to C as it is.

[1]
This code:

int s = count * sizeof (element);
if(s<0)
error("overflow");

is flawed, because the check is only possibly triggered when undefined
behaviour has already been caused by signed overflow. There are
popular compilers that use this to optimise out the check.
Yes we have had long threads from time to time on how to preempt
overflow.

Jan 25 '08 #112
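One example of the extra attention santosh mentions: a loop counting down
with an unsigned index can't test i >= 0, which is always true for size_t;
a common idiom is to decrement inside the condition (the program itself is
just an illustration):

#include <stdio.h>

static void print_backwards(const int *a, size_t n)
{
    size_t i;

    for (i = n; i-- > 0; ) /* visits n-1 down to 0, then stops */
        printf("%d\n", a[i]);
}

int main(void)
{
    int a[] = { 1, 2, 3 };
    print_backwards(a, sizeof a / sizeof a[0]);
    return 0;
}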

"Ulrich Eckhardt" <do******@knuut.dewrote in message
>
If it was 32 bits large, why wouldn't I be allowed to allocate 3GiB?
A lot of machines won't let you install more than 2GB, despite being 32 bit.
Under the hood, they use signed rather than unsigned integers.

It is a nuisance for the policy that "everything shall be signed" - but
remember

If you are allocating 3GB on a 4GB machine, you are taking up virtually all
available memory. It is not unreasonable to be expected to code that
specially.
If you need 3GB now you'll most certainly need 5GB in a few months' time. So
this awkward window won't last long and soon you'll move to the glory of 64
bit pointers. And 64 bit ints if enough committee members read this to make
that recommendation.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm
Jan 25 '08 #113
Malcolm McLean wrote:
>
"Ulrich Eckhardt" <do******@knuut.dewrote in message
>>
If it was 32 bits large, why wouldn't I be allowed to allocate 3GiB?
A lot of machines won't let you install more than 2GB, despite being
32 bit. Under the hood, they use signed rather than unsigned
integers.

It is a nuisance for the policy that "everything shall be signed" - but
remember

If you are allocating 3GB on a 4GB machine, you are taking up
virtually all available memory. It is not unreasonable to be expected
to code that specially.
If you need 3GB now you'll most certainly need 5GB in a few months'
time. So this awkward window won't last long and soon you'll move to
the glory of 64 bit pointers. And 64 bit ints if enough committee
members read this to make that recommendation.
What is the ISO C Committee supposed to do? As hardware transitions from
32 to 64 bits, C compilers ought to move forward as well. Fixing int to
64 bits would break too much existing code. Instead, use long.

Jan 25 '08 #114
In article <5v*************@mid.uni-berlin.de>,
Ulrich Eckhardt <do******@knuut.de> wrote:
>I'm certainly not suggesting that you should use int for a
general-purpose malloc() replacement. Rather, it would be an
argument for size_t being signed.
>If it was 32 bits large, why wouldn't I be allowed to allocate 3GiB?
The existence of arguments one way doesn't preclude the existence of
arguments the other way.

-- Richard
--
:wq
Jan 25 '08 #115
On Fri, 25 Jan 2008 06:34:33 -0600, Malcolm McLean wrote
(in article <Ep*********************@bt.com>):
32 bits aren't quite enough to count everyone in the world.

And they are too many in some other places. ;-)

--
Randy Howard (2reply remove FOOBAR)
"The power of accurate observation is called cynicism by those
who have not got it." - George Bernard Shaw

Jan 25 '08 #116
Richard Tobin wrote:
In article <5v*************@mid.uni-berlin.de>,
Ulrich Eckhardt <do******@knuut.de> wrote:
>>I'm certainly not suggesting that you should use int for a
general-purpose malloc() replacement. Rather, it would be an
argument for size_t being signed.
>>If it was 32 bits large, why wouldn't I be allowed to allocate 3GiB?

The existence of arguments one way doesn't preclude the existence of
arguments the other way.
Ahem, Richard, in case you were trying to say anything it totally missed me.
Please be a bit more explicit.

Thanks

Uli

Jan 25 '08 #117
Malcolm McLean wrote:
"Ulrich Eckhardt" <do******@knuut.dewrote in message
>If it was 32 bits large, why wouldn't I be allowed to allocate
3GiB?

A lot of machines won't let you install more than 2GB, despite
being 32 bit. Under the hood, they use signed rather than
unsigned integers.
Or, more likely, they simply reserve the memory with the most
significant bit set for the use of the OS.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

--
Posted via a free Usenet account from http://www.teranews.com

Jan 25 '08 #118

"Kelsey Bjarnason" <kb********@gmail.comwrote in message
Thus, either your function is a pointless waste of time, because malloc
will never fail, or malloc *can* fail, so the error handling code in the
caller *will* be called.

So which is it?
You can have something that will never happen, yet could happen. Like a
monkey typing out the works of Shakespeare.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Jan 26 '08 #119
On Thu, 24 Jan 2008 17:33:29 +0530, santosh wrote:
Kelsey Bjarnason wrote:
>[snips]

<snip>
>And if nothing else, malloc doesn't lie about what it actually does,
and works on all legitimate size requests, if memory is, in fact,
available to meet the request.

Unfortunately, only on systems that do not overcommit memory.
In which case, his xmalloc won't help.

Jan 27 '08 #120
Kelsey Bjarnason wrote:
>
As a simple example, let's ponder a function which parses data from a
file, where the expectation is that the typical file so parsed is
relatively large - say several gigabytes.

Option 1: run through it, character by character, using fgetc() or some
equivalent. This was actually tested, and performance sucked.

Option 2: run through it line by line, using fgets or some equivalent.
This was also tested, works better, but not as well as it should.

Option 3: allocate a buffer of size N, read N bytes' worth of data,
process in memory. This, based on testing, is the most efficient, with
efficiency increasing, on average, with increasing values of N, to a
point, say around 64MB, where the benefits of more memory are
insufficient to justify increased usage.
You forgot option 4, use the operating system's file mapping API and let
the OS take care of the memory allocations.

--
Ian Collins.
Jan 27 '08 #121
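A minimal sketch of Kelsey's option 3 in standard C; the 64MB figure comes
from his post, and process() is a placeholder for whatever parsing the real
application does:

#include <stdio.h>
#include <stdlib.h>

#define CHUNK ((size_t)64 * 1024 * 1024) /* the ~64MB sweet spot cited above */

static void process(const char *buf, size_t len)
{
    (void)buf; (void)len; /* placeholder: parse the chunk in memory */
}

int parse_file(FILE *fp)
{
    char *buf = malloc(CHUNK);
    size_t got;

    if (buf == NULL)
        return -1; /* a request this size is clearly worth checking */
    while ((got = fread(buf, 1, CHUNK, fp)) > 0)
        process(buf, got);
    free(buf);
    return ferror(fp) ? -1 : 0;
}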
Ian Collins wrote:
Kelsey Bjarnason wrote:
>As a simple example, let's ponder a function which parses data
from a file, where the expectation is that the typical file so
parsed is relatively large - say several gigabytes.

Option 1: run through it, character by character, using fgetc()
or some equivalent. This was actually tested, and performance
sucked.

Option 2: run through it line by line, using fgets or some
equivalent. This was also tested, works better, but not as well
as it should.

Option 3: allocate a buffer of size N, read N bytes' worth of
data, process in memory. This, based on testing, is the most
efficient, with efficiency increasing, on average, with
increasing values of N, to a point, say around 64MB, where the
benefits of more memory are insufficient to justify increased
usage.

You forgot option 4, use the operating system's file mapping API
and let the OS take care of the memory allocations.
He also forgot option 1a, replace fgetc with getc and use the
systems builtin buffers.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

--
Posted via a free Usenet account from http://www.teranews.com

Jan 28 '08 #122
On Fri, 18 Jan 2008 22:39:24 -0500, CBFalconer <cb********@yahoo.com>
wrote:
Randy Howard wrote:
Eric Sosman wrote
"Obviously," you can allocate an infinite amount of
memory as long as you get it in 20-byte chunks? Did you
used to work for Enron or something?
This thread was useful, now I know I never have to buy extra
memory again.

PROVIDED you malloc it in 20 byte chunks. Since the standard
specifies that freed memory be made available again, you must be
perfectly safe in allocating 4k by:

for (i = 0; i < 20; i++) a[i] = malloc(20);
for (i = 0; i < 20; i++) free(a[i]);
ptr = malloc(4000);

with suitable declarations for a, i, ptr, and all the needed
#includes. Learning is wunnerful.
Apparently this is some strange new value of 20 * 20 of which I was
previously unaware.

(I know it's a joke, but cf. Twain on lightning != lightningbug.)
<G>
- formerly david.thompson1 || achar(64) || worldnet.att.net
Jan 28 '08 #123
Randy Howard <ra*********@FOOverizonBAR.net> writes:
I have systems here with ECC memory in them. The memory is official,
supported memory from the hardware vendor. I can replace those memory
modules with a dozen other samples. I can replace the motherboard with
any of several identical motherboards from the same vendor. I can make
it corrupt memory in less than 15 seconds on any variation you choose.
It's called poor signal integrity, and it's far more common than most
people think. They just don't have typical usage patterns that employ
data that makes this problem obvious.
Interesting. What usage patterns make this problem obvious?
--
Ben Pfaff
http://benpfaff.org
Jan 28 '08 #124
On Sun, 27 Jan 2008 22:57:30 -0600, Ben Pfaff wrote
(in article <87************@blp.benpfaff.org>):
Randy Howard <ra*********@FOOverizonBAR.net> writes:
>I have systems here with ECC memory in them. The memory is official,
supported memory from the hardware vendor. I can replace those memory
modules with a dozen other samples. I can replace the motherboard with
any of several identical motherboards from the same vendor. I can make
it corrupt memory in less than 15 seconds on any variation you choose.
It's called poor signal integrity, and it's far more common than most
people think. They just don't have typical usage patterns that employ
data that makes this problem obvious.

Interesting. What usage patterns make this problem obvious?
Something like a BERT tester is common, often implemented in software
for memory testing in a production system. A SmartBits is a common
hardware solution for testing networked devices in a similar fashion.

In essentially random usage, you might experience a lockup, reboot, or
unexplained data corruption every once in a while, and blame it on your
OS, an application you don't like, sunspots, the phase of the moon, or
it being a Monday. :)

--
Randy Howard (2reply remove FOOBAR)
"The power of accurate observation is called cynicism by those
who have not got it." - George Bernard Shaw

Jan 28 '08 #125

"santosh" <sa*********@gmail.comwrote in message
You can set a global variable from the call-back to tell the higher
level code the amount of memory that was actually allocated. Ugly, but
doable.
That's a workaround. However it is easier just to call malloc() if you've
got a failure strategy. xmalloc() is for when the only failure strategy,
realistically, is to abort, or for when you are running in a multi-tasking
environment and you know that you can make small amounts of memory available
by killing off something else or allowing it to finish.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm
Jan 28 '08 #126
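A sketch of the xmalloc()-with-emergency-hook design Malcolm describes; his
actual library may differ, and the handler signature here is an assumption:

#include <stdio.h>
#include <stdlib.h>

/* optional emergency handler: returns nonzero if it released some memory */
static int (*xmalloc_hook)(size_t wanted) = NULL;

void *xmalloc(size_t size)
{
    void *p;

    while ((p = malloc(size)) == NULL) {
        /* give the hook one chance per attempt to free memory, then retry */
        if (xmalloc_hook == NULL || !xmalloc_hook(size)) {
            fprintf(stderr, "xmalloc: out of memory (%lu bytes)\n",
                    (unsigned long)size);
            exit(EXIT_FAILURE);
        }
    }
    return p;
}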
Malcolm McLean wrote:
>
"Marty James" <ma**@nospam.comwrote in message
Obviously, for tiny allocations like
20 bytes to strcpy a filename or
something, there's no point putting in a check
on the return value of malloc.
Dereferencing an allocated pointer,
without first checking the value, is wrong.
OTOH, if you're allocating a gigabyte for a large array, this might
fail, so you should definitely check for a NULL return.

So somewhere in between these extremes,
there must be a point where you
stop ignoring malloc's return value, and start checking it.

Where do people draw this line?
Whenever an allocated pointer is going to be dereferenced
or used in pointer arithmetic, the value should be checked first.
I guess it depends on the likely system
the program will be deployed on,
but are there any good rules of thumb?
Yes.
No.
Imagine you've got 2GB installed and are allocating 20 bytes.
The system is stressed and programs crash or terminate
for lack of memory once a day. Any
more than that, and no-one would tolerate it.
So the chance of the crash being caused by your allocation is
1/100,000,000.
If your program is a mail sorter
and makes an allocation for every piece of mail sorted,
then with odds like that,
your program won't crash when you test it on the machine in the lab,
but it will crash five times
on the day that it is installed in the post office.

--
pete
Jan 28 '08 #127
[snips]

On Mon, 28 Jan 2008 05:05:14 +0530, santosh wrote:
You can set a global variable from the call-back to tell the higher
level code the amount of memory that was actually allocated. Ugly, but
doable.
Ugly and in many cases _not_ doable. CLC aside, many apps use threads.

Jan 28 '08 #128
santosh wrote:
I think I get what Malcolm is getting at.
Basically he does not want to
implement a full error-detection and recovery scenario for small
allocations, which /he/ thinks have only a minute chance of failure.
For such allocations he uses xmalloc() which will either get you the
requested memory or call exit().
I don't see a relationship between
the size of the requested memory,
and whether or not quitting the program
is the best way to handle an allocation failure.

--
pete
Jan 29 '08 #129
On Jan 28, 4:15 pm, pete <pfil...@mindspring.com> wrote:
santosh wrote:
I think I get what Malcolm is getting at.
Basically he does not want to
implement a full error-detection and recovery scenario for small
allocations, which /he/ thinks have only a minute chance of failure.
For such allocations he uses xmalloc() which will either get you the
requested memory or call exit().

I don't see a relationship between
the size of the requested memory,
and whether or not quitting the program
is the best way to handle an allocation failure.
Neither is there any relationship between the size of the requested
block and the probability of failure:

Allocation series 1:
void *bigarr[10000000];
size_t index;
for (index = 0; index < sizeof bigarr/sizeof bigarr[0]; index++)
{
bigarr[index] = malloc(10);
}

Allocation series 2:
void *p = malloc(10000000);

as far as I can see, allocation series 1 never allocated a large
object and yet is more likely to fail than allocation 2 which does
allocate a large block of RAM.

Jan 29 '08 #130
Marty James wrote:
Howdy,

I was reflecting recently on malloc.
Had a somewhat heated discussion on that recently myself.
Obviously, for tiny allocations like 20 bytes to strcpy a filename or
something, there's no point putting in a check on the return value of
malloc.

OTOH, if you're allocating a gigabyte for a large array, this might
fail, so you should definitely check for a NULL return.

So somewhere in between these extremes, there must be a point where you
stop ignoring malloc's return value, and start checking it.
Actually it's not so much about the size but more about the "what else".
If you have an alternative to malloc'ing memory, nice, check and do it.
If you can't continue, do a graceful exit, e.g.
in standard C I would wrap malloc, do-or-die style;
on Unix-style systems I would catch SIGSEGV.

Don't know if anyone pointed this out, but alternatives to malloc are no
safer, e.g. instead of

void somefunc(void) {
    char *buffer;
    buffer = malloc(SIZE); // oh, i need to check?
    // do something
    free(buffer);
}

you might write

void somefunc(void) {
    char buffer[SIZE]; // how do i check here?
    // do something
}
>
Where do people draw this line? I guess it depends on the likely system
the program will be deployed on, but are there any good rules of thumb?
I think the basic rule is tied to the question "can I do something
better than quitting?"
I don't think you can find a real-world solution to this question in the
C standard specs. It's just too OS-specific.
>
Rgds,
MJ
Jan 29 '08 #131
pete wrote:
santosh wrote:
>I think I get what Malcolm is getting at.
Basically he does not want to
implement a full error-detection and recovery scenario for small
allocations, which /he/ thinks have only a minute chance of failure.
For such allocations he uses xmalloc() which will either get you the
requested memory or call exit().

I don't see a relationship between
the size of the requested memory,
and whether or not quitting the program
is the best way to handle an allocation failure.
Malcolm has subsequently explained in another thread with Kelsey
Bjarnason that /he/ considers quitting an acceptable, even good,
strategy for failures of small-sized allocations, say a few
hundred bytes. He has said more than once that his xmalloc() is /not/
suitable for use if there is any real chance of the allocation failing,
i.e. it is meant only for allocations that should almost never fail.

Jan 29 '08 #132

"Kelsey Bjarnason" <kb********@gmail.comwrote in message
If I can't allocate 100 bytes, it may well be because I've already
allocated 100MB and there's nothing left to allocate. I _could_ possibly
reduce my buffer size to 50MB, freeing up space for the 100 byte
allocation. Oh wait, no, I can't, the allocator just crashed my app.
That's where your intuition fails you.
if you successfully allocate 100MB, it is inconceivable that a request for
100 bytes should then fail, unless the big allocation was deliberately
tailored to the amount of memory in the machine.

You could reduce a large allocation, change algorithms, and then get those
100 bytes. xmalloc() would not prevent you from doing that, because the
handler function can shrink the 100MB buffer down to 50 MB. There's even a
hook pointer provided to help. But why do I get the impression that people
are fantasising about these amazing recovery strategies?

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Jan 29 '08 #133
Malcolm McLean wrote:
>
"Kelsey Bjarnason" <kb********@gmail.comwrote in message
>If I can't allocate 100 bytes, it may well be because I've already
allocated 100MB and there's nothing left to allocate. I _could_
possibly reduce my buffer size to 50MB, freeing up space for the 100
byte
allocation. Oh wait, no, I can't, the allocator just crashed my app.
That's where your intuition fails you.
if you successfully allocate 100MB, it is inconceivable that a request
for 100 bytes should then fail,
Why inconceivable? Let's say I have 512 MB. The program in question has
allocated and is using 100 MB. I also have four other programs, each
using roughly 100 MB. Now it's perfectly likely that the next
allocation request my program makes, even a 100 byte one, will fail,
because the system memory is nearly all used up and the rest is
occupied by the OS itself.
unless the big allocation was
deliberately tailored to the amount of memory in the machine.
This is also one scenario which a good program must handle.
You could reduce a large allocation, change algorithms, and then get
those 100 bytes. xmalloc() would not prevent you from doing that,
because the handler function can shrink the 100MB buffer down to 50
MB. There's even a hook pointer provided to help. But why do I get the
impression that people are fantasising about these amazing recovery
strategies?
It's not fantasy. It is done, but I'll admit that only rarely, because
it requires more than a little technical competence and patience. While
commercial code is often hamstrung by tight schedules, hobby code can
suffer because the author lacked the necessary patience.

But it's not fantasy. And xmalloc() is not the proper primitive for the
job.

Jan 29 '08 #134
Malcolm McLean wrote:
>
"Kelsey Bjarnason" <kb********@gmail.comwrote in message
>If I can't allocate 100 bytes, it may well be because I've already
allocated 100MB and there's nothing left to allocate. I _could_ possibly
reduce my buffer size to 50MB, freeing up space for the 100 byte
allocation. Oh wait, no, I can't, the allocator just crashed my app.
That's where your intuition fails you.
if you successfully allocate 100MB, it is inconceivable that a request
for 100 bytes should then fail, [...]
I'll have to explain this phenomenon to my bank: If I write
a big check and they honor it, they are then obliged to honor a
subsequent small check, even if the first one emptied my account.
Sweet!

--
Er*********@sun.com
Jan 29 '08 #135
[snips]

On Tue, 29 Jan 2008 12:12:09 -0800, user923005 wrote:
Let's consider (for a moment) small allocations. We have one initial
question to answer:
"Why was malloc(50) called instead of just char foo[50]?"
Possibility: the app uses something akin to a linked list to store the
actual file names loaded, presumably so it can later save them. The file
select and open mechanism can thus open the file _and_ add the name of
the file to the list. At least as applies to the example Keith suggested.

Jan 29 '08 #136
On Jan 29, 5:12 am, Kelsey Bjarnason <kbjarna...@gmail.com> wrote:
[snips]

On Tue, 29 Jan 2008 12:12:09 -0800, user923005 wrote:
Let's consider (for a moment) small allocations. We have one initial
question to answer:
"Why was malloc(50) called instead of just char foo[50]?"

Possibility: the app uses something akin to a linked list to store the
actual file names loaded, presumably so it can later save them. The file
select and open mechanism can thus open the file _and_ add the name of
the file to the list. At least as applies to the example Keith suggested.
Now, it is not likely that we will be storing millions of filenames,
but a linked list is another structure which holds an arbitrary list
of data. It's probably a very poor choice for something that holds
millions of items though, unless it is a forward-only cursor through a
data set.
Jan 29 '08 #137
"Malcolm McLean" <re*******@btinternet.comwrote:
"Eric Sosman" <Er*********@sun.comwrote in message

I'll have to explain this phenomenon to my bank: If I write
a big check and they honor it, they are then obliged to honor a
subsequent small check, even if the first one emptied my account.
Sweet!
I don't know how much I've got in my account, but I know it's several
thousand pounds, and I know the amount, which I won't reveal, to the nearest
thousand. So I write a cheque in round thousands. It doesn't bounce - by
luck, because I didn't employ any particular rounding strategy up or down.
Now I write another cheque for one pound. Given that cheque one didn't
bounce, calculate the probability that cheque two will bounce.
Exactly one in a thousand.

Programs which do more than a thousand small allocations are a dime a
dozen.

Richard
Jan 30 '08 #138
"Flash Gordon" <sp**@flash-gordon.me.ukwrote in message
Malcolm McLean wrote, On 29/01/08 22:04:
Alternatively it could be a perfectly valid request. This is why you
should not be applying such an arbitrary limit. Of course, as you say your
xmalloc is only suitable for small requests the problem could be solved by
changing the parameter to an unsigned char.
32K is the minimum range of an int. So xmalloc() should not be called from
portable code if there is any possibility that the request will exceed this
size.
32K is a pretty big structure, but not such a big array, in this day and
age. It is however a string that is too long for the standard library
functions to work well with.
So xmalloc() for allocating individual structures and strings, malloc() for
arrays.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm
Jan 30 '08 #139
dj******@csclub.uwaterloo.ca.invalid wrote:
CBFalconer <cb********@maineline.net> wrote:
>He also forgot option 1a, replace fgetc with getc and use the
systems builtin buffers.

getc and fgetc will use the same stdio buffer, so if the problem
is buffer size that won't solve anything.
You miss the point, which is that getc is encouraged to be a macro,
so that use can avoid any system function calls. When properly
implemented it effectively puts the system buffers in the program
coding.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

--
Posted via a free Usenet account from http://www.teranews.com

Jan 30 '08 #140
On Tue, 29 Jan 2008 13:37:03 -0800, user923005 wrote:
On Jan 29, 5:12 am, Kelsey Bjarnason <kbjarna...@gmail.com> wrote:
>[snips]

On Tue, 29 Jan 2008 12:12:09 -0800, user923005 wrote:
Let's consider (for a moment) small allocations. We have one initial
question to answer:
"Why was malloc(50) called instead of just char foo[50]?"

Possibility: the app uses something akin to a linked list to store the
actual file names loaded, presumably so it can later save them. The
file select and open mechanism can thus open the file _and_ add the
name of the file to the list. At least as applies to the example Keith
suggested.

Now, it is not likely that we will be storing millions of filenames,
No, but it was an applicable answer to the question asked.

but
a linked list is another structure which holds an arbitrary list of
data. It's probably a very poor choice for something that holds
millions of items though, unless it is a forward-only cursor through a
data set.
Indeed. However, it's not the only sort. A tree, for example, might be
useful in holding a large number of small objects sorted or otherwise
organized by some means.

Jan 30 '08 #141
CBFalconer wrote:
>
dj******@csclub.uwaterloo.ca.invalid wrote:
CBFalconer <cb********@maineline.net> wrote:
He also forgot option 1a, replace fgetc with getc and use the
systems builtin buffers.
getc and fgetc will use the same stdio buffer, so if the problem
is buffer size that won't solve anything.

You miss the point, which is that getc is encouraged to be a macro,
so that use can avoid any system function calls. When properly
implemented it effectively puts the system buffers in the program
coding.
I doubt it.
When getc is also implemented as a macro,
then the simplest way to define the functions getc and fgetc,
is to have the function bodies consist
solely of a call to the macro, like this:

int fgetc(FILE *stream)
{
return getc(stream);
}

int (getc)(FILE *stream)
{
return getc(stream);
}

--
pete
Jan 30 '08 #142
In article <47***************@yahoo.com>,
CBFalconer <cb********@maineline.net> wrote:
>dj******@csclub.uwaterloo.ca.invalid wrote:
>CBFalconer <cb********@maineline.net> wrote:
>>He also forgot option 1a, replace fgetc with getc and use the
systems builtin buffers.

getc and fgetc will use the same stdio buffer, so if the problem
is buffer size that won't solve anything.

You miss the point, which is that getc is encouraged to be a macro,
so that use can avoid any system function calls.
If the problem is the function call overhead, then that will probably
be a useful solution, as I noted.
When properly
implemented it effectively puts the system buffers in the program
coding.
It's highly unlikely that the "system buffer" that fgetc uses is any
different from the buffer that getc uses. A macro implementation of
getc will still have to do a system call to re-fill the buffer when
it's empty.
dave

--
Dave Vandervies dj3vande at eskimo dot com
I like "fun" risks, rather than "lazy" risks.
(If I'm going to kill myself, I want to have a good time on the way.)
--Graham Reed in the scary devil monastery
Jan 30 '08 #143
dj******@csclub.uwaterloo.ca.invalid wrote:
CBFalconer <cb********@maineline.net> wrote:
>dj******@csclub.uwaterloo.ca.invalid wrote:
>>CBFalconer <cb********@maineline.net> wrote:

He also forgot option 1a, replace fgetc with getc and use the
systems builtin buffers.

getc and fgetc will use the same stdio buffer, so if the
problem is buffer size that won't solve anything.

You miss the point, which is that getc is encouraged to be a
macro, so that use can avoid any system function calls.

If the problem is the function call overhead, then that will
probably be a useful solution, as I noted.
>When properly implemented it effectively puts the system buffers
in the program coding.

It's highly unlikely that the "system buffer" that fgetc uses is
any different from the buffer that getc uses. A macro
implementation of getc will still have to do a system call to
re-fill the buffer when it's empty.
True, but by then the frequency is reduced by something like 80 to
1000 times.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
--
Posted via a free Usenet account from http://www.teranews.com

Jan 30 '08 #144
CBFalconer <cb********@yahoo.com> writes:
dj******@csclub.uwaterloo.ca.invalid wrote:
>CBFalconer <cb********@maineline.net> wrote:
>>dj******@csclub.uwaterloo.ca.invalid wrote:
CBFalconer <cb********@maineline.net> wrote:
He also forgot option 1a, replace fgetc with getc and use the
systems builtin buffers.

getc and fgetc will use the same stdio buffer, so if the
problem is buffer size that won't solve anything.

You miss the point, which is that getc is encouraged to be a
macro, so that use can avoid any system function calls.

If the problem is the function call overhead, then that will
probably be a useful solution, as I noted.
>>When properly implemented it effectively puts the system buffers
in the program coding.

It's highly unlikely that the "system buffer" that fgetc uses is
any different from the buffer that getc uses. A macro
implementation of getc will still have to do a system call to
re-fill the buffer when it's empty.

True, but by then the frequency is reduced by something like 80 to
1000 times.
The buffer behavior of getc should normally be identical to the buffer
behavior of fgetc. The only likely difference is the overhead of a
single function call for fgetc. If getc performs a system call to
refill the buffer every 1000 times, then fgetc performs a system call
to refill the buffer every 1000 times. (It's not clear whether their
behavior with respect to refilling the buffer is required to be
identical, but there's no reason for them to be different.)

--
Keith Thompson (The_Other_Keith) <ks***@mib.org>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Feb 2 '08 #145
In article <87************@kvetch.smov.org>,
Keith Thompson <ks***@mib.org> wrote:
(It's not clear whether their
behavior with respect to refilling the buffer is required to be
identical, but there's no reason for them to be different.)
getc is defined to be equivalent to fgetc except that it's allowed to
expose macro-ness by evaluating its argument more than once.

So if buffer refills can be exposed to strictly conforming programs
(it's not obvious to me whether or how that could be done), they are in
fact required to have the same behavior wrt buffering.
The fact that it's defined as equivalent except for an allowance not
relevant to buffering behavior is at the very least a pretty strong
hint.
dave

--
Dave Vandervies dj3vande at eskimo dot com
In short, I've *thought* about this topic for longer than the average
reader of this newsgroup has been alive. I think I understand it.
--P.J. Plauger in comp.lang.c
Feb 2 '08 #146
Keith Thompson wrote:
CBFalconer <cb********@yahoo.com> writes:
>dj******@csclub.uwaterloo.ca.invalid wrote:
>>CBFalconer <cb********@maineline.net> wrote:
dj******@csclub.uwaterloo.ca.invalid wrote:
CBFalconer <cb********@maineline.net> wrote:
>He also forgot option 1a, replace fgetc with getc and use the
>systems builtin buffers.
>
getc and fgetc will use the same stdio buffer, so if the
problem is buffer size that won't solve anything.

You miss the point, which is that getc is encouraged to be a
macro, so that use can avoid any system function calls.

If the problem is the function call overhead, then that will
probably be a useful solution, as I noted.

When properly implemented it effectively puts the system buffers
in the program coding.

It's highly unlikely that the "system buffer" that fgetc uses is
any different from the buffer that getc uses. A macro
implementation of getc will still have to do a system call to
re-fill the buffer when it's empty.

True, but by then the frequency is reduced by something like 80 to
1000 times.

The buffer behavior of getc should normally be identical to the buffer
behavior of fgetc. The only likely difference is the overhead of a
single function call for fgetc. If getc performs a system call to
refill the buffer every 1000 times, then fgetc performs a system call
to refill the buffer every 1000 times. (It's not clear whether their
behavior with respect to refilling the buffer is required to be
identical, but there's no reason for them to be different.)
Not so. The typical action of a getc macro will be something like:

#define getc(f) do { \
if (f->ix <= f->sz) return f->buf[f->ix++]; \
else return _getnew(f); \
} while (0)

totally avoiding all system calls until the buffer is emptied.
It's those intermediate system calls that can eat up the
performance. Of course there need not be any getc macro available
either.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

--
Posted via a free Usenet account from http://www.teranews.com

Feb 3 '08 #147
CBFalconer wrote, On 02/02/08 22:02:
Keith Thompson wrote:
>CBFalconer <cb********@yahoo.com> writes:
>>dj******@csclub.uwaterloo.ca.invalid wrote:
CBFalconer <cb********@maineline.net> wrote:
dj******@csclub.uwaterloo.ca.invalid wrote:
>CBFalconer <cb********@maineline.net> wrote:
>>He also forgot option 1a, replace fgetc with getc and use the
>>systems builtin buffers.
>getc and fgetc will use the same stdio buffer, so if the
>problem is buffer size that won't solve anything.
You miss the point, which is that getc is encouraged to be a
macro, so that use can avoid any system function calls.
If the problem is the function call overhead, then that will
probably be a useful solution, as I noted.

When properly implemented it effectively puts the system buffers
in the program coding.
It's highly unlikely that the "system buffer" that fgetc uses is
any different from the buffer that getc uses. A macro
implementation of getc will still have to do a system call to
re-fill the buffer when it's empty.
True, but by then the frequency is reduced by something like 80 to
1000 times.
The buffer behavior of getc should normally be identical to the buffer
behavior of fgetc. The only likely difference is the overhead of a
single function call for fgetc. If getc performs a system call to
refill the buffer every 1000 times, then fgetc performs a system call
to refill the buffer every 1000 times. (It's not clear whether their
behavior with respect to refilling the buffer is required to be
identical, but there's no reason for them to be different.)

Not so. The typical action of a getc macro will be something like:

#define getc(f) do { \
if (f->ix <= f->sz) return f->buf[f->ix++]; \
^^^^^^
else return _getnew(f); \
^^^^^^
} while (0)
I somehow hope that the getc macro looks significantly different to
that, since it should not be causing the function it is called from to
return! In the version of glibc on this machine it is simply
#define getc(_fp) _IO_getc (_fp)
Presumably _IO_getc is compiler magic.
totally avoiding all system calls until the buffer is emptied.
It's those intermediate system calls that can eat up the
performance. Of course there need not be any getc macro available
either.
I would normally use getc rather than fgetc just in case it is more
efficient (also it is one less character to type), but with modern
systems I would not actually expect any major difference.
--
Flash Gordon
Feb 3 '08 #148
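For reference: a getc-style macro has to expand to an expression, not a
statement, which is why the do/while version above cannot work in contexts
like c = getc(fp). A sketch of the classic expression form, with invented
field names and refill helper (compare the real Sun expansion in the next
post):

struct file_sketch {
    int cnt;            /* characters remaining in the buffer */
    unsigned char *ptr; /* next character to hand out */
};

int refill(struct file_sketch *f); /* read more data, return first char */

/* cnt may go negative here; refill() resets it along with ptr */
#define GETC(f) ((f)->cnt-- > 0 ? (int)*(f)->ptr++ : refill(f))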
Flash Gordon wrote:
>
I would normally use getc rather than fgetc just in case it is more
efficient (also it is one less character to type), but with modern
systems I would not actually expect any major difference.
Just because it's a quiet Sunday afternoon, I just tried compiling this
with Sun cc:

#include <stdio.h>

int main() {
int n = getc( stdin );
}

The preprocessor output is:

int main()
{
int n =
(--((&__iob[0]))->_cnt<0?__filbuf((&__iob[0])):(int)*((&__iob[0]))->_ptr++);
}

Due to cc "knowing" and therefore being able to inline standard library
functions. I guess other compilers can do the same.

--
Ian Collins.
Feb 3 '08 #149
Ian Collins wrote:
>
Flash Gordon wrote:

I would normally use getc rather than fgetc just in case it is more
efficient (also it is one less character to type), but with modern
systems I would not actually expect any major difference.

Just because it's a quiet Sunday afternoon,
I just tried compiling this with Sun cc:

#include <stdio.h>

int main() {
int n = getc( stdin );
}

The preprocessor output is:

int main()
{
int n =
(--((&__iob[0]))->_cnt<0?__filbuf((&__iob[0])):(int)*((&__iob[0]))->_ptr++);
}

Due to cc "knowing"
and therefore being able to inline standard library
functions. I guess other compilers can do the same.
Are you sure getc isn't a macro on your Sun cc?

/* BEGIN new.c */

#include <stdio.h>

int main(void)
{

#ifdef getc
puts("getc is a macro.");
#else
puts("getc is not a macro.");
#endif

return 0;
}

/* END new.c */

--
pete
Feb 3 '08 #150
