size_t or int for malloc-type functions?

Richard Heathfield wrote:

av said:

>On Wed, 03 Jan 2007 09:57:52 +0000, Richard Heathfield wrote:

>>The C Standard correctly defines size_t.

C Standard is wrong on that

No, it isn't.

--

+-------------------+ .:\:\:/:/:.
| PLEASE DO NOT F :.:\:\:/:/:.:
| FEED THE TROLLS | :=.' - - '.=:
| | '=(\ 9 9 /)='
| Thank you, | ( (_) )
| Management | /`-vvv-'\
+-------------------+ / \
| | @@@ / /|,,,,,|\ \
| | @@@ /_// /^\ \\_\
@x@@x@ | | |/ WW( ( ) )WW
\||||/ | | \| __\,,\ /,,/__
\||/ | | | jgs (______Y______)
/\/\/\/\/\/\/\/\//\/\\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
================================================== ============

fix (vb.): 1. to paper over, obscure, hide from public view; 2.
to work around, in a way that produces unintended consequences
that are worse than the original problem. Usage: "Windows ME
fixes many of the shortcomings of Windows 98 SE". - Hutchinson

Jan 3 '07 #160

Richard Bos

jacob navia <ja***@jacob.remcomp.fr> wrote:

Randy Howard wrote:
I'll take a stab at it. The proposal is to arbitrarily chop the available
address space, for those rare cases in which someone wishes to
malloc an amount of memory /larger/ than the available address space on
the processor. I.e., they want a boatload of RAM, so restrict them to
less than half of what they need to make things "safer". That has BS
written all over it, imo.

Not at all. Please just see what the proposal really was
before answering fantasy proposals.

The proposal was discussing the idea of using a signed
type to avoid problems with small negative numbers
that get translated into huge unsigned ones.

And that's where it breaks down. You see, where are you going to _get_
these small negative numbers? Are you going to get a negative
multiplicand from sizeof? No, because that's defined as giving a
positive number under all circumstances. Is your programmer going to
specify a negative number of objects? Hardly likely. That would be a
blunder of the first order.
So whence the negative number? Probably, one supposes, from multiplying
two largeish positive numbers and getting a signed integer overflow. Ah,
but! But signed integer overflow causes undefined behaviour. So the
error is not trying to allocate a negative number of bytes, the error is
computing the negative in the first place, and it's an error that is
allowed to be fatal and cannot reliably be caught.

Of course, there _is_ an easy way to stop the undefined behaviour. That
way is not to use signed integers for sizes in the first place.
Multiplying an unsigned integer by an (unsigned) size_t gives you
another unsigned integer. The multiplication cannot overflow, and cannot
cause UB. It _can_ wrap around, but that error is fairly easy to detect;
the way to do this is left as an exercise to the reader, but should not
evade any first-year student of C.
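For readers who want the exercise spelled out, one possible check is sketched below. The helper name size_mul_ok is made up for illustration and is not from the thread or the standard library; the sketch assumes size_t is at least as wide as unsigned int, so the multiplication is carried out in unsigned arithmetic.

```c
#include <stddef.h>

/* Hypothetical helper: multiply two sizes and report whether the
 * product fits in size_t. Returns 1 and stores the product in *out
 * when it fits; returns 0 and leaves *out untouched when it wrapped. */
int size_mul_ok(size_t n, size_t size, size_t *out)
{
    /* Unsigned multiplication is always defined; the result is
     * reduced modulo SIZE_MAX + 1. */
    size_t product = n * size;

    /* If no wrap-around occurred, dividing the product by one factor
     * gives back the other factor exactly. */
    if (n != 0 && product / n != size)
        return 0;

    *out = product;
    return 1;
}
```

A caller would test the return value before handing the product to malloc(), so a wrapped size is rejected instead of silently producing a short allocation.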

So, by suggesting that instead of the unsigned size_t, we should use
signed int or ssize_t, you are effectively advocating replacing a safe
method of handling malloc() in which overly large sizes are easily
spotted, by an unsafe method in which overly large numbers cause
untrappable errors which can only be caught after the damage has already
been done, and in which the program may crash before you even get to
check whether the result is negative at all. Is that wise? Seems to me
that it's not.

Richard

Jan 3 '07 #161

kuyper

CBFalconer wrote:

ku****@wizard.net wrote:
jacob navia wrote:
... snip ...

>
Who cares about rings?
Rings are the mathematical construct that correspond to the way
in which the C standard defines arithmetic for unsigned types.

IIRC a ring defines a set of objects that are members of the ring,
and a set of operations on those objects, such that <m1 operation
m2> yields a member of the ring. unsigned objects and the
operations +, -, and * meet this definition. The operation / does
not. Exponentiation does. Many bits have been dropped in my
memory, however.

The last time I studied mathematical rings was about 30 years ago; a
lot of the bits have dropped from my memory too. I don't remember how
the mathematicians decided to handle division for rings. I think that
they declared that rings are not closed under division. However, for
every case where m1/m2 gives a value that is also a member of a ring,
the C division operator gives that same member. Mathematics is
sufficiently flexible that I'm sure that every C type has a
corresponding mathematical construct which exactly matches its
behavior, but no C type is an exact match to any simple mathematical
construct. A ring is the mathematical concept that is the closest
simple match to the behavior of C unsigned types.

Jan 3 '07 #162

kuyper

av wrote:
,,,,

your dear c standard is wrong in the definition on size_t

is there someone agree with me?

You use too little English to explain what it is you're talking about,
and the English you do provide has such sloppy grammar and punctuation
that I can't figure out what it is that you're trying to say.

In order to validly argue that a definition of a term is wrong, you
must reference a differing, more authoritative definition. In this case,
the authoritative definition of size_t is the one provided by the C
standard - there is no higher authority you can refer to, to justify
calling the definition wrong. The standard's definition might be poorly
written, unreadable, useless, meaningless, internally inconsistent,
inconsistent with other parts of the standard, inconsistent with some
other standard, unimplementable, or it might possess any of a wide
variety of other negative characteristics. But since the C standard is the
relevant authority, its definition of size_t is inherently incapable
of being wrong.

So - which of those other negative characteristics describes the
problem you're complaining about?

Jan 3 '07 #163

kuyper

Mark McIntyre wrote:

On 2 Jan 2007 15:05:33 -0800, in comp.lang.c , "Old Wolf"
<ol*****@inspire.net.nz> wrote:

Mark McIntyre wrote:
If the argument is unsigned, you can't pass a -ve value to it.
"argument" means the value passed in. In this code:

#include <stdlib.h>
int foo() { malloc(-1); }

the argument to malloc is the value -1 (of type int).

No, that's just how you typed it. As far as malloc is concerned, you
passed UINT_MAX.

The argument passed to malloc is, as he said, -1. The parameter value
received by malloc() is SIZE_MAX, which might or might not be the same
as UINT_MAX.

Still, the result is the same: you can pass a negative value to
malloc(), but malloc() can't receive it.
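The distinction between the argument and the parameter can be seen in a small sketch. The function received_size below is a made-up stand-in for malloc() that simply reports the value it was given; it assumes a prototype is in scope at the call, so the int argument is converted to size_t before the call happens.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for malloc(): it only reports what arrived. */
size_t received_size(size_t n)
{
    return n;
}

/* The argument expression is the int -1. Because the prototype above
 * is visible, it is converted to size_t as if by assignment, and the
 * parameter holds SIZE_MAX. */
int minus_one_arrives_as_size_max(void)
{
    return received_size(-1) == SIZE_MAX;
}
```

Whether SIZE_MAX equals UINT_MAX depends on the implementation, which is why the comparison is made against SIZE_MAX.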

Jan 3 '07 #164

Ben Pfaff

CBFalconer <cb********@yahoo.com> writes:

Ben Pfaff wrote:

Keith Thompson <ks***@mib.org> writes:

jacob navia <ja***@jacob.remcomp.fr> writes:
void *calloc(size_t n,size_t s)
{
    long long siz = (long long)n * (long long)s;
    void *result;
    if (siz>>32)
        return 0;
    result = malloc((size_t)siz);
    if (result)
        memset(result,0,(size_t)siz);
    return result;
}

[...]

I wonder if "siz&~0xFFFFFFFF" might be marginally more efficient than
"siz>>32". [...]

I'd recommend "siz > SIZE_MAX" as being both clear and portable.

I don't think so. Ignoring the casts, maybe

if (!(SIZE_MAX - siz)) thingsarebad();
else carryon();

I seriously doubt that most machines can generate a value larger
than SIZE_MAX.

I don't understand your objection. You believe that the product
of one size_t and another size_t, both converted to long long,
cannot be larger than SIZE_MAX? (Jacob has postulated 32-bit
size_t, by the way.)

I don't understand why Jacob didn't use unsigned long long, by
the way. It would seem a more straightforward choice.
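A rough sketch of the unsigned variant follows. It is an illustration under stated assumptions, not code from the thread: uint32_t stands in for the 32-bit size_t that Jacob postulated, uint64_t stands in for unsigned long long, and the function name is invented.

```c
#include <stdint.h>

/* Hypothetical check: does n * s fit in a 32-bit size?
 * The product of two 32-bit values is at most (2^32 - 1)^2, which is
 * below 2^64, so the 64-bit unsigned multiplication cannot wrap. */
int fits_in_32_bits(uint32_t n, uint32_t s)
{
    uint64_t wide = (uint64_t)n * s;

    return wide <= UINT32_MAX;
}
```

Because every operand is unsigned, there is no signed overflow to worry about, and the comparison against the maximum reads the same way as the "siz > SIZE_MAX" test suggested above.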
--
"When in doubt, treat ``feature'' as a pejorative.
(Think of a hundred-bladed Swiss army knife.)"
--Kernighan and Plauger, _Software Tools_

Jan 3 '07 #165

Stephen Sprunk

"Old Wolf" <ol*****@inspire.net.nz> wrote in message
news:11**********************@42g2000cwt.googlegroups.com...

Richard Heathfield wrote:
>On any given implementation, either size_t is big enough to store
65521 * 65552 or it isn't. If it is, there is no issue. And if it is
not, your request is meaningless, since you're asking for an object
bigger than the system can provide.

Firstly, systems might exist where you can allocate more memory
than SIZE_MAX.

Not portably and in a single object. size_t is _defined_ to be able to
hold the size of the largest possible object. That, of course, doesn't
exclude systems where you can allocate SIZE_MAX bytes multiple times
(e.g. MS DOS) or where there is some other (i.e. non-portable)
allocator.

That aside, wouldn't it be more sensible behaviour for malloc to
return NULL or take some other action when you try to request an
object bigger than the system can provide, rather than returning
a smaller object than requested? I think this is Navia's point.

malloc() has no way of knowing what you _tried_ to request; it only
knows what value actually showed up in its argument. If that doesn't
match what you thought you were passing, there's no way for malloc() to
know.

If malloc() doesn't know that you tried to request SIZE_MAX+N bytes,
then how is it supposed to know to respond to that case?

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking
--
Posted via a free Usenet account from http://www.teranews.com

Jan 3 '07 #166

Stephen Sprunk

"Randy Howard" <ra*********@FOOverizonBAR.net> wrote in message
news:00*****************************@news.verizon.net...

On Tue, 2 Jan 2007 15:49:03 -0600, christian.bau wrote
(in article <11**********************@k21g2000cwa.googlegroups.com>):
>And most 32 bit systems have limit at 3GB or 3.5GB.

Actually it's usually 2GB, due to splitting of address space between
the kernel and user space. Some go as high as 3GB with special boot
options.

RedHat has a special Linux kernel that gives just under 4GB of user
address space; a bit of kernel space is still required to keep syscalls
working, but it's pretty small. It's mainly used by database folks, who
should be moving to AMD64 now anyways (with its 2^51 bytes of user
space, currently).

However, there are hacks (outside of malloc) that allow for
what Intel calls "Physical Address Extension" (PAE) to allow
systems with Intel 32-bit processors to see memory above 4GB, sort
of like the old extended/expanded memory hacks in the DOS days.
Again, proprietary, and different APIs to use the memory from what
standard C supports.

A single process will never see more than 32 bits of memory at a time,
since both the virtual and linear address spaces are limited to that
size. What PAE does is allow the OS to map those 32 bits of per-process
linear address space into 36 bits of physical address space. That means
you still can't have more than 4GB in a single app, but you could run
sixteen apps that each have their own 4GB without conflict.

Some OSes may provide a way for processes to "see" different parts of
the 36-bit space at various times, similar to how EMS allowed 16-bit
apps to "see" different 1MB chunks of a 16MB physical address space.
Obviously that requires a lot of non-portable trickery to ensure the
right memory chunk is in place when you dereference a pointer.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking
--
Posted via a free Usenet account from http://www.teranews.com

Jan 3 '07 #167

Kenny McCormack

In article <e0********************************@4ax.com>, av <av@ala.a> wrote:
....

>>>Besides which av is a troll.

who is the troll?

The trolls (code for "people who speak truth (not claptrap)") are:

1) You
2) Me
3) Jacob
4) Frederick
5) Old Wolf

And growing. Applications for membership are always accepted.

Jan 3 '07 #168

CBFalconer <cb********@yahoo.com> writes:

jacob navia wrote:
>christian.bau wrote:

... snip ...

lcc-win32 uses this:

void *calloc(size_t n,size_t s)
{
    long long siz = (long long)n * (long long)s;
    void *result;
    if (siz>>32)
        return 0;
    result = malloc((size_t)siz);
    if (result)
        memset(result,0,(size_t)siz);
    return result;
}

Bad. long longs can overflow, leading to undefined behaviour. No
guarantee you ever get to testing the product. Casts are always
suspicious.

I've already commented on this code. *If* you happen to know that
LLONG_MAX >= SIZE_MAX*SIZE_MAX, then no overflow is possible. You
can't assume that in portable code, but C runtime code needn't be
portable; it's free to depend on any implementation-specific behavior.

I agree that the casts should be removed. In this case, though, they
happen to be harmless, as long as declarations for malloc() and
memset() are visible.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Jan 3 '07 #169

Cesar Rabak

jacob navia wrote:

Ben Pfaff wrote:

[snipped]

>Here is part of what the Rationale says about integer overflow.
I believe that it supports my position:

The keyword unsigned is something of a misnomer, suggesting
as it does in arithmetic that it is non-negative but
capable of overflow. The semantics of the C type unsigned
is that of modulus, or wrap-around, arithmetic for which
overflow has no meaning. The result of an unsigned
arithmetic operation is thus always defined, whereas the
result of a signed operation may be undefined.

Yes, I know that, and I agree that the semantics of unsigned is
wrap around. What I am saying is that when "the result cannot be
represented" and this wrap around semantics reduces the result,
this reduced result is mathematically WRONG in the sense of the USUAL
multiplication operation.

PHEW!!!!

Specifically when I use the malloc (n * sizeof *p) "idiom"
even if the semantics are well defined this is NOT what I
intended with that multiplication!!!!

There is no point in throwing me standards texts because I am not
questioning them. I am just saying that this "results that cannot be
represented" lead to a wrong result being passed to malloc!!!
I just can't understand why it is impossible to agree in such
an evident stuff !!!

I think the root is the difference in perspective.

There is a Standard that documents pretty well this behaviour. So it is
a limitation the user of the library has to live with.

OTOH, we have an interpretation of this as problematic, as it may not be the
"expectation" of a user not completely aware of the (well described)
way the wrapping works.

So jacob, the question is not the 'evidence' but the _interpretation_:
people used to the Standard see this as business as usual, and you as
an opportunity for improvement...

my 0.01999...

--
Cesar Rabak

Jan 3 '07 #170

Cesar Rabak

Kenny McCormack wrote:

In article <e0********************************@4ax.com>, av <av@ala.a> wrote:
...

>>>Besides which av is a troll.
who is the troll?

The trolls (code for "people who speak truth (not claptrap)") are:

1) You
2) Me
3) Jacob
4) Frederick
5) Old Wolf

And growing. Applications for membership are always accepted.

...and will start-of-year applicants be free of fees? :-D

Jan 3 '07 #171

CBFalconer <cb********@yahoo.com> writes:

Ben Pfaff wrote:
>Keith Thompson <ks***@mib.org> writes:
jacob navia <ja***@jacob.remcomp.fr> writes:
void *calloc(size_t n,size_t s)
{
    long long siz = (long long)n * (long long)s;
    void *result;
    if (siz>>32)
        return 0;
    result = malloc((size_t)siz);
    if (result)
        memset(result,0,(size_t)siz);
    return result;
}

[...]

I wonder if "siz&~0xFFFFFFFF" might be marginally more efficient than
"siz>>32". [...]

I'd recommend "siz > SIZE_MAX" as being both clear and portable.

I don't think so. Ignoring the casts, maybe

if (!(SIZE_MAX - siz)) thingsarebad();
else carryon();

I seriously doubt that most machines can generate a value larger
than SIZE_MAX.

The code assumes 32-bit size_t and 64-bit long long. "siz > SIZE_MAX"
should work correctly given those assumptions.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Jan 3 '07 #172

"Stephen Sprunk" <st*****@sprunk.org> writes:

"Old Wolf" <ol*****@inspire.net.nz> wrote in message
news:11**********************@42g2000cwt.googlegroups.com...
>Richard Heathfield wrote:
>>On any given implementation, either size_t is big enough to store
65521 * 65552 or it isn't. If it is, there is no issue. And if it is
not, your request is meaningless, since you're asking for an object
bigger than the system can provide.

Firstly, systems might exist where you can allocate more memory
than SIZE_MAX.

Not portably and in a single object. size_t is _defined_ to be able
to hold the size of the largest possible object. That, of course,
doesn't exclude systems where you can allocate SIZE_MAX bytes multiple
times (e.g. MS DOS) or where there is some other (i.e. non-portable)
allocator.

[...]

That's not quite correct. The standard merely defines size_t as "the
unsigned integer type of the result of the sizeof operator" (C99
7.17p2). There are objects to which you can't apply the sizeof
operator, for example objects created by the *alloc() functions.
calloc() in particular can be used to *request* the creation of an
object bigger than SIZE_MAX bytes (though an implementation is likely
to reject all such requests).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Jan 3 '07 #173

"Stephen Sprunk" <st*****@sprunk.org> writes:

"Old Wolf" <ol*****@inspire.net.nz> wrote in message
news:11**********************@42g2000cwt.googlegroups.com...
>Richard Heathfield wrote:
>>On any given implementation, either size_t is big enough to store
65521 * 65552 or it isn't. If it is, there is no issue. And if it is
not, your request is meaningless, since you're asking for an object
bigger than the system can provide.

Firstly, systems might exist where you can allocate more memory
than SIZE_MAX.

Not portably and in a single object. size_t is _defined_ to be able
to hold the size of the largest possible object. That, of course,
doesn't exclude systems where you can allocate SIZE_MAX bytes multiple
times (e.g. MS DOS) or where there is some other (i.e. non-portable)
allocator.

Jan 3 '07 #174

Mark McIntyre

On Wed, 03 Jan 2007 04:31:15 +0000, in comp.lang.c , Richard
Heathfield <rj*@see.sig.invalid> wrote:

>No, the argument to malloc is the expression -1, which is clearly of type
int, and is equally clearly negative!

The compiler doesn't read "-1" though, does it? It's an expression of
type int, whose value is represented by some bits which, when regarded
as a signed int, equal -1, and when regarded as an unsigned long, make
up (say) 0xFFFF.
Imagine if you'd passed in 'a'.

>As far as malloc is concerned, it
*receives* a parameter with the value (size_t)-1, which is a (very)
positive value.

No, it receives a set of bits in some memory address or register, and
interprets them as a size_t.

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Jan 3 '07 #175

pete

Mark McIntyre wrote:

On Wed, 03 Jan 2007 04:31:15 +0000, in comp.lang.c , Richard
Heathfield <rj*@see.sig.invalid> wrote:

No, the argument to malloc is the expression -1,
which is clearly of type int,
and is equally clearly negative!

As far as malloc is concerned, it
*receives* a parameter with the value (size_t)-1, which is a (very)
positive value.

No, it receives a set of bits in some memory address or register, and
interprets them as a size_t.

As far as malloc is concerned,
it *initializes* a parameter with the value (size_t)-1,
which is a (very) positive value.

--
pete

Jan 4 '07 #176

Joe Wright

CBFalconer wrote:

jacob navia wrote:
... snip ...
>Or are you implying that

65521 x 65552 is 65296 and NOT 4295032592

Yes, if done with unsigned shorts, i.e. any unsigned 16 bit type.
That's what the standard prescribes. Have you ever bothered to
read the standard? Without doing so how can you dare to modify the
lcc compiler?

No, it's done with 32-bit ints. Although 65521 fits in 16 bits, 65552
requires 17 bits. The product of the two wants 33 bits.

65521        int
65552        int
65296        int (truncated)
4295032592   long long
...and the corresponding binary...
                                    00000000 00000000 11111111 11110001
                                    00000000 00000001 00000000 00010000
                                    00000000 00000000 11111111 00010000
00000000 00000000 00000000 00000001 00000000 00000000 11111111 00010000
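The figures above can be reproduced with fixed-width types. This is only a sketch: the function names are invented, uint32_t stands in for a 32-bit unsigned type, and it assumes int is no wider than 32 bits so the first multiplication is performed in 32-bit unsigned arithmetic.

```c
#include <stdint.h>

/* Product reduced modulo 2^32, as a 32-bit unsigned type would give. */
uint32_t product32(uint32_t a, uint32_t b)
{
    return a * b;
}

/* Same product computed in 64 bits, where all 33 bits fit. */
uint64_t product64(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}
```

With the operands 65521 and 65552, the 32-bit version yields 65296 and the 64-bit version yields 4295032592; the two differ by exactly 2^32, which is the bit that was dropped.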

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---

Jan 4 '07 #177

ku****@wizard.net wrote:

...
The last time I studied mathematical rings was about 30 years ago; a
lot of the bits have dropped from my memory too. I don't remember how
the mathematicians decided to handle division for rings.

They invented Fields. But note that the mathematical notion of
multiplicative inverses in Fields does not correspond to integer
division with rounding.

I think that they declared that rings are not closed under division.

Division by zero is usually excluded. So division is rarely closed
in any case.

However, for
every case where m1/m2 gives a value that is also a member of a ring,
the C division operator gives that same member.

That's too strong a statement. It is trivial to generate Rings whose
elements are normal integers, but whose addition and multiplication
functions are entirely different to the ordinary arithmetic of integers.
The elements are really just labels; it's the function mappings that
determine what is a ring. C's unsigned arithmetic uses particular
mappings that are just one example.

...A ring is the mathematical concept that is the closest
simple match to the behavior of C unsigned types.

Yes, but it's unnecessary to consider rings in general in order to
understand unsigned integers. In fact, mathematical texts often explain
modular arithmetic first to give a grounding example in preparation for
later discussion of groups and rings in more formal terms.

The important part is the notion that in the abstract, the operators +
and * are just function mappings, not 'computations'. In other words, +
and * are in essence just pure lookup tables.

For unsigned integers, the mapping/table is 'onto', that is, for every
pair (a,b) there exists a corresponding element. Signed integer
operators are not onto since there are pairs that do not have
mappings. Because of that, C's signed integer arithmetic is not
a ring. However, the typical two's complement implementation
completes the mapping and does form a ring.

The standard explains the completeness of the unsigned mapping
in terms of wrap-around. But that explanation is really just a
convenient definition. Hardware implementers certainly don't think
in terms of it, they don't need to. All that is important is what the
definition implies. Modulo arithmetic has some particularly useful
properties, and it all stems from that simple definition.
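A few of those properties can be checked directly. The sketch below uses invented function names and plain unsigned int; it relies only on the standard's rule that unsigned results are reduced modulo UINT_MAX + 1.

```c
#include <limits.h>

/* Sum and product are defined for every pair of operands; the result
 * is reduced modulo UINT_MAX + 1. */
unsigned wrapped_sum(unsigned a, unsigned b)     { return a + b; }
unsigned wrapped_product(unsigned a, unsigned b) { return a * b; }

/* Additive inverse: the value y such that x + y wraps to 0. */
unsigned additive_inverse(unsigned x)
{
    return 0u - x;
}

/* UINT_MAX + 1 wraps around to 0 rather than overflowing. */
int max_plus_one_is_zero(void)
{
    return wrapped_sum(UINT_MAX, 1u) == 0u;
}
```

Every element having an additive inverse, and every sum and product landing back inside the type, is what makes the ring comparison apt for unsigned types.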

Unfortunately, C's definition directly introduces the concept of an
intermediary result that needs to be force fitted back into the
appropriate range. In other parts, particularly floating point it's
called a mathematical result. But an intermediary result is the
way most people actually think of and understand modular
arithmetic.

But as is often the case in clc, neither the original post, nor the
spin off debate is anything new... [may need unwrapping...]

http://groups.google.com/group/comp....be47fcef0f83f7

--
Peter

Jan 4 '07 #178

ku****@wizard.net wrote:

av wrote:
,,,,
>your dear c standard is wrong in the definition on size_t

is there someone agree with me?

You use too little English to explain what it is you're talking
about, and the English you do provide has such sloppy grammar and
punctuation that I can't figure out what it is that you're trying
to say.

av is a troll. Ignore it.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Jan 4 '07 #179

Mark McIntyre wrote:

Richard Heathfield <rj*@see.sig.invalid> wrote:
No, the argument to malloc is the expression -1,
which is clearly of type int,
and is equally clearly negative!

As far as malloc is concerned, it
*receives* a parameter with the value (size_t)-1, which is a (very)
positive value.
No, it receives a set of bits in some memory address or register, and
interprets them as a size_t.

That's an 'all the world is intel' view. The standard is more general
but otherwise quite explicit.

pete wrote:

As far as malloc is concerned,
it *initializes* a parameter with the value (size_t)-1,
which is a (very) positive value.

The effect is the same, but to nitpick, the standard says assignment,
not initialisation: 6.5.2.2p4

An argument may be an expression of any object type. In preparing for
the call to a function, the arguments are evaluated, and each
parameter is assigned the value of the corresponding argument.

So malloc doesn't do anything because its parameters already have
their values when it is actually called. This is indeed how most
calling conventions work. The code for the function definition assumes
that the values have already been assigned. The only way this can
happen (efficiently) in the case where conversions are necessary is
if the assignments occur prior to the call.

It is for this reason that even non variadic functions can require a
prototype in scope before the call in order for the function to be
called correctly.

Functions taking size_t parameters are candidates; malloc is one of
them. Although -1 is unlikely in real code, I have certainly seen
dynamic character array allocations where the argument is not a
size_t. Such calls require a prototype to be in place in order that the
correct conversion be applied.

--
Peter

Jan 4 '07 #180

Mark McIntyre said:

On Wed, 03 Jan 2007 04:31:15 +0000, in comp.lang.c , Richard
Heathfield <rj*@see.sig.invalid> wrote:

>>No, the argument to malloc is the expression -1, which is clearly of type
int, and is equally clearly negative!

The compiler doesn't read "-1" though, does it?

No, you're right - the preprocessor reads -1 (in the code under
consideration), and converts it into a pp-token. But I don't think it's
unreasonable to speak of the compiler, or at least the implementation,
reading -1 as being the argument to malloc.

It's an expression of
type int, whose value is represented by some bits which, when regarded
as a signed int, equal -1, and when regarded as an unsigned long, make
up (say) 0xFFFF.

Let's say 0xFFFFFFFF instead, shall we?

Imagine if you'd passed in 'a'.

Then it would be a positive number with a value <= CHAR_MAX, and certainly
representable as a size_t, and it would be of no relevance whatsoever to
this discussion.

>
>>As far as malloc is concerned, it
*receives* a parameter with the value (size_t)-1, which is a (very)
positive value.

No, it receives a set of bits in some memory address or register, and
interprets them as a size_t.

Nothing in the C spec requires malloc to receive its parameter via a memory
address or register, or to do any interpreting of that value as a size_t.
It could receive the parameter by parcel post or carrier pigeon, with any
necessary interpretation already done for it, for all the C Standard cares.
But the Standard *does* require malloc to receive a parameter of type
size_t. So I maintain that my statement was correct, and note that yours
was rather less so.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Jan 4 '07 #181

Harald van Dĳk

CBFalconer wrote:

Nelu wrote:

... snip ...

Why not:

void *calloc(size_t n, size_t s) {
    void *result;
    size_t sz;
    if(SIZE_MAX/n<s) {
        sz=n*s;
        result=malloc(n*s);
        if(result) {
            memset(result,0,sz);
            return result;
        }
    }
    return NULL;
}

I think that works everywhere. I would rework it slightly to:

void *calloc(size_t n, size_t s) {
    void *result;
    size_t sz;

    result = NULL;
    if (SIZE_MAX / n < s) {

What if n == 0 ?

        sz = n * s;
        result = malloc(sz);
        if (result) memset(result, 0, sz);
    }
    return result;
}

largely to install some blanks for readability. :-)

Jan 4 '07 #182

On Wed, 03 Jan 2007 15:16:51 GMT, Richard Bos wrote:

>jacob navia <ja***@jacob.remcomp.fr> wrote:

>Randy Howard wrote:
I'll take a stab at it. The proposal is to arbitrarily chop the available
address space, for those rare cases in which someone wishes to
malloc an amount of memory /larger/ than the available address space on
the processor. I.e., they want a boatload of RAM, so restrict them to
less than half of what they need to make things "safer". That has BS
written all over it, imo.

Not at all. Please just see what the proposal really was
before answering fantasy proposals.

The proposal was discussing the idea of using a signed
type to avoid problems with small negative numbers
that get translated into huge unsigned ones.

And that's where it breaks down. You see, where are you going to _get_
these small negative numbers? Are you going to get a negative
multiplicand from sizeof? No, because that's defined as giving a
positive number under all circumstances. Is your programmer going to
specify a negative number of objects? Hardly likely. That would be a
blunder of the first order.
So whence the negative number? Probably, one supposes, from multiplying
two largeish positive numbers and getting a signed integer overflow. Ah,
but! But signed integer overflow causes undefined behaviour. So the
error is not trying to allocate a negative number of bytes, the error is
computing the negative in the first place, and it's an error that is
allowed to be fatal and cannot reliably be caught.

Of course, there _is_ an easy way to stop the undefined behaviour. That
way is not to use signed integers for sizes in the first place.
Multiplying an unsigned integer by an (unsigned) size_t gives you
another unsigned integer. The multiplication cannot overflow, and cannot
cause UB. It _can_ wrap around, but that error is fairly easy to detect;
the way to do this is left as an exercise to the reader, but should not
evade any first-year student of C.

So, by suggesting that instead of the unsigned size_t, we should use
signed int or ssize_t, you are effectively advocating replacing a safe
method of handling malloc() in which overly large sizes are easily
spotted, by an unsafe method in which overly large numbers cause
untrappable errors which can only be caught after the damage has already
been done, and in which the program may crash before you even get to
check whether the result is negative at all. Is that wise? Seems to me
that it's not.

Richard

google "integer overflow" & bugtraq,
some type can not "overflow" (using mod)
and if "overflow" it has to be easy to find they have "overflow"

Jan 4 '07 #183

Jun Woong

Harald van Dĳk wrote:

CBFalconer wrote:

[...]

void *calloc(size_t n, size_t s) {
    void *result;
    size_t sz;

    result = NULL;
    if (SIZE_MAX / n < s) {

What if n == 0 ?

That should be

if (n == 0 || s == 0 || SIZE_MAX / n > s) {

or something like that.
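Putting the zero check and the size check together, a complete version might look like the sketch below. The name checked_calloc is invented so it does not clash with the library's calloc(), and the comparison is written as s <= SIZE_MAX / n, which is an assumption on my part: it also admits the case where the product is exactly representable.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical calloc-like allocator that refuses requests whose
 * total size would wrap around in size_t. */
void *checked_calloc(size_t n, size_t s)
{
    void *result = NULL;

    /* The zero tests come first, so the division never has a zero
     * divisor. Otherwise n * s fits exactly when s <= SIZE_MAX / n. */
    if (n == 0 || s == 0 || s <= SIZE_MAX / n) {
        size_t sz = n * s;

        result = malloc(sz);
        if (result != NULL)
            memset(result, 0, sz);
    }
    return result;
}
```

A request such as checked_calloc(SIZE_MAX, 2) falls through to the final return and yields NULL, instead of allocating a block far smaller than the caller asked for. What malloc(0) returns for a zero-sized request remains implementation-defined.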
--
Jun, Woong (woong at icu.ac.kr)
Samsung Electronics Co., Ltd.

``All opinions expressed are mine, and do not represent
the official opinions of any organization.''

Jan 4 '07 #184

christian.bau

ku****@wizard.net wrote:

In order to validly argue that a definition of a term is wrong, you
must reference a differing, more authoritative definition. In this case,
the authoritative definition of size_t is the one provided by the C
standard - there is no higher authority you can refer to, to justify
calling the definition wrong. The standard's definition might be poorly
written, unreadable, useless, meaningless, internally inconsistent,
inconsistent with other parts of the standard, inconsistent with some
other standard, unimplementable, or it might possess any of a wide
variety of other negative characteristics. But since the C standard is the
relevant authority, its definition of size_t is inherently incapable
of being wrong.

The word "wrong" is used with different meanings; it can mean
inappropriate, ill-advised, unsound and so on. I could reasonably say
"The definition of the strncpy function in the C Standard is wrong".
Some people say "trigraphs are wrong" or harsher things. On the other
hand, I don't think the definition of size_t is wrong in that sense.

Jan 5 '07 #185

christian.bau

Stephen Sprunk wrote:

"Old Wolf" <ol*****@inspire.net.nz> wrote in message
news:11**********************@42g2000cwt.googlegroups.com...
Firstly, systems might exist where you can allocate more memory
than SIZE_MAX.

Not portably and in a single object. size_t is _defined_ to be able to
hold the size of the largest possible object. That, of course, doesn't
exclude systems where you can allocate SIZE_MAX bytes multiple times
(e.g. MS DOS) or where there is some other (i.e. non-portable)
allocator.

I did a search for "size_t" in the C99 final draft, and I couldn't
actually find anything that says size_t must be able to store the size
of any array returned by calloc. You cannot define a type that is
bigger than can be represented using size_t (at least a compiler cannot
let you use sizeof for such a type), and a call to malloc/realloc
cannot request such an object, but a call to calloc can.

Being able to allocate such a large object would have some other
consequences, like indexing and pointer subtraction might be harder to
implement, but I didn't find anything that doesn't allow calloc to
return large objects.

Jan 5 '07 #186

Richard Heathfield wrote:

Mark McIntyre said:
Richard Heathfield <rj*@see.sig.invalid> wrote:
No, the argument to malloc is the expression -1, which is clearly of type
int, and is equally clearly negative!
The compiler doesn't read "-1" though, does it?

No, you're right - the preprocessor reads -1 (in the code under
consideration), and converts it into a pp-token.

No, -1 is actually two pp-tokens, a punctuator and a pp-number. It's
surprising what sorts of things constitute a pp-number (e.g. 8teen),
but a leading sign is not part of it. Even when preprocessing tokens
are converted to tokens, -1 remains as two tokens.

All that said, -1 is still the expression that constitutes the argument
to malloc. It evaluates to -1 (ta da!) and has type int. Prior to
calling, malloc's parameter, which has type size_t, is _assigned_ the
value -1. The semantic rules of assignment require a conversion take
place from the int value to size_t. When the call is actually made,
malloc's parameter already has the converted value. [At least, that is
the way the standard defines the call.]

The major problem that can occur is when malloc is not prototyped and
the mentioned conversion does not take place (more precisely the
behaviour is undefined). Mark's description is more literally true of
what takes place on most implementations when undefined behaviour is
invoked.

Of course, Mark's description is also literally true of the way that
many implementations (upon which size_t has the same size as int,
neither has padding bits, and the conversion is a no-op) perform the
call, but it is unwise to think that all implementations are similar in
that regard.

Certainly, on most (all?) stack based implementations where size_t
is wider than int, it is easy to look at the disassembly and see that
what gets pushed onto the stack is a size_t and not an int. This is
an indication that the implementation writer has read the standard
and knows that a conversion needs to be performed before the
function call itself.
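
The conversion described above can be seen in a small sketch. Here received_size is a hypothetical stand-in for any prototyped function with a size_t parameter (such as malloc); it is not a real library function, and SIZE_MAX assumes a C99 <stdint.h>.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a prototyped function taking size_t:
   it simply returns the value its parameter received. */
size_t received_size(size_t n)
{
    return n;
}

/* The argument expression -1 has type int. Because the prototype is
   visible, it is converted to size_t as if by assignment, which
   yields SIZE_MAX. Returns 1 if that is what arrived. */
int minus_one_arrives_as_size_max(void)
{
    return received_size(-1) == SIZE_MAX;
}
```

So a call such as malloc(-1) with the prototype in scope is a request for SIZE_MAX bytes, not for a negative amount.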

--
Peter

Jan 5 '07 #187

Zara

On Tue, 02 Jan 2007 18:12:03 +0100, jacob navia
<ja***@jacob.remcomp.fr> wrote:

>Richard Heathfield a écrit :
>>
>>>The bug I am referring to is when you multiply
65521 * 65552 --> 65 296

If you can meaningfully allocate 65521*65552 bytes in a single aggregate
object, then sizeof has to be able to report the size of such an object,
which means size_t has to be at least 33 bits, which means the "bug" you
refer to doesn't occur. If size_t is no more than 32 bits, it doesn't make
sense to try to allocate an aggregate object 2^32-1 bytes in size.

C'MON HEATHFIELD
can you STOP IT???????

BUGS NEVER MAKE SENSE!!!

THAT'S WHY THEY ARE BUGS!!!!

<..>

Please, jacob, do stop shouting.

All RH is saying is that the bug is neither in the standard nor in
malloc; it lies in the program that requests a chunk of memory greater
than the maximum value of size_t. And so it is.

Trying to put the bug in the standard library routines or in the
standard's wording is trying to hide the bug made by the programmer.
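
For reference, the arithmetic behind the quoted figures can be checked with a short sketch. It models a 32-bit size_t with uint32_t, which is an assumption for illustration only; the function name is hypothetical.

```c
#include <stdint.h>

/* Multiplies in 64 bits, then keeps only the low 32 bits, which is
   what a 32-bit unsigned type would hold after wrapping around. */
uint32_t product_mod_2_32(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * b);
}
```

The full product 65521 * 65552 is 4295032592, which is 65296 more than 2^32 (4294967296), so a 32-bit size_t ends up holding 65296.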

Best regards (and my best wishes for peace)
Zara

Jan 5 '07 #188

pete

Zara wrote:

>
On Tue, 02 Jan 2007 18:12:03 +0100, jacob navia
<ja***@jacob.remcomp.fr> wrote:

Richard Heathfield a écrit :
>
The bug I am referring to is when you multiply
65521 * 65552 --> 65 296
If you can meaningfully allocate 65521*65552 bytes
in a single aggregate object,
then sizeof has to be able to report the size of such an object,
which means size_t has to be at least 33 bits,
which means the "bug" you refer to doesn't occur.
If size_t is no more than 32 bits, it doesn't make
sense to try to allocate an aggregate
object 2^32-1 bytes in size.

C'MON HEATHFIELD
can you STOP IT???????

BUGS NEVER MAKE SENSE!!!

THAT'S WHY THEY ARE BUGS!!!!

<..>

Please, jacob, do stop shouting.

All RH is saying is that the bug is neither in the standard nor in
malloc,

The function under discussion is calloc, not malloc.

it lies in the program that requests a chunk of memory greater
than the maximum value of size_t. And so it is.

What RH said is wrong.
sizeof operates on types.
sizeof does not need to be able to report how many bytes
of memory are allocated by calloc.

--
pete

Jan 5 '07 #189

Jun Woong wrote:

Harald van Dijk wrote:
CBFalconer wrote:
[...]

>>>
void *calloc(size_t n, size_t s) {
void *result;
size_t sz;

result = NULL;
if (SIZE_MAX / n < s) {

What if n == 0 ?

That should be

if (n == 0 || s == 0 || SIZE_MAX / n > s) {

or something like that.

The version I have just put in nmalloc.c (not yet published) is:

/* calloc included here to ensure that it handles the
same range of sizes (s * n) as does malloc. The
multiplication n*s can wrap, yielding a too small
value, so we must ensure calloc rejects this.
*/
void *ncalloc(size_t n, size_t s)
{
void *result;
size_t sz;

result = NULL;
if (!n || ((size_t)-1) / n > s) {
sz = n * s;
if ((result = nmalloc(sz))) memset(result, 0, sz);
}
return result;
} (* ncalloc *)

which makes the output of ncalloc be that of nmalloc(0) whenever
either n or s is 0. I think there is still a possible glitch when
((size_t)-1) / n == s. The only thing that needs protection
against n==0 is the division. s==0 will simply force sz==0.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Jan 5 '07 #190

Zara wrote:

jacob navia <ja***@jacob.remcomp.fr> wrote:
Richard Heathfield a écrit :
>>The bug I am referring to is when you multiply
65521 * 65552 --> 65 296

If you can meaningfully allocate 65521*65552 bytes in a single aggregate
object, then sizeof has to be able to report the size of such an object,
which means size_t has to be at least 33 bits, which means the "bug" you
refer to doesn't occur. If size_t is no more than 32 bits, it doesn't make
sense to try to allocate an aggregate object 2^32-1 bytes in size.
C'MON HEATHFIELD
can you STOP IT???????

BUGS NEVER MAKE SENSE!!!

THAT'S WHY THEY ARE BUGS!!!!

Please, jacob, do stop shouting.

All RH is saying is that the bug is neither in the standard nor in
malloc,

Yet the committee itself has considered the issue as a weakness
of the standard.

it lies in the program that requests a chunk of memory greater
than the maximum value of size_t. And so it is.

Trying to put the bug in the standard library routines or in the
standard's wording is trying to hide the bug made by the programmer.

True. However... the committee's job is to codify C in terms of actual
practice. [The most perverse example on record is gets().]

I don't think it's an exaggeration to say there are hundreds of
thousands of C programs that allocate memory dynamically. A significant
portion (probably even most!) fail to check for size_t overflow (er,
wrap-around). With size_t becoming wider and wider, now is as good a
time as any to consider allocation functions that take size_t and what
they should do with extremely large (likely bogus) requests.

Richard's stance is basically that new programs should get it right
from the start and existing programs which get it wrong should
be modified and corrected. Sound idea, but a tad idealistic. ;-)

The committee's options include considering more pragmatic
viewpoints. Sure, the options considered don't prevent the kinds
of bug being discussed, but greater detection and mitigation of
effects is not necessarily a Bad Thing (tm).

--
Peter

Jan 5 '07 #191

pete

pete wrote:

>
Zara wrote:

On Tue, 02 Jan 2007 18:12:03 +0100, jacob navia
<ja***@jacob.remcomp.fr> wrote:

>Richard Heathfield a écrit :
>>
>>>The bug I am referring to is when you multiply
>>>65521 * 65552 --> 65 296
>>
>>
>If you can meaningfully allocate 65521*65552 bytes
>in a single aggregate object,
>then sizeof has to be able to report the size of such an object,
>which means size_t has to be at least 33 bits,
>which means the "bug" you refer to doesn't occur.
>If size_t is no more than 32 bits, it doesn't make
>sense to try to allocate an aggregate
>object 2^32-1 bytes in size.
>>
>>
>
>C'MON HEATHFIELD
>can you STOP IT???????
>
>BUGS NEVER MAKE SENSE!!!
>
>THAT'S WHY THEY ARE BUGS!!!!
<..>

Please, jacob, do stop shouting.

All RH is saying is that the bug is neither in the standard nor in
malloc,

The function under discussion is calloc, not malloc.

it lies in the program that requests a chunk of memory greater
than the maximum value of size_t. And so it is.

What RH said is wrong.
sizeof operates on types.
sizeof does not need to be able to report how many bytes
of memory are allocated by calloc.

But that's not really the point.
If calloc can't return a pointer to
"space for an array of nmemb objects,
each of whose size is size"
then it should return a null pointer,
rather than a pointer to
((size_t)nmemb * (size_t)size) bytes of memory.

--
pete

Jan 5 '07 #192

pete

CBFalconer wrote:

>
Jun Woong wrote:
Harald van Dijk wrote:
CBFalconer wrote:
[...]
>>
void *calloc(size_t n, size_t s) {
void *result;
size_t sz;

result = NULL;
if (SIZE_MAX / n < s) {

What if n == 0 ?
That should be

if (n == 0 || s == 0 || SIZE_MAX / n > s) {

or something like that.

The version I have just put in nmalloc.c (not yet published) is:

/* calloc included here to ensure that it handles the
same range of sizes (s * n) as does malloc. The
multiplication n*s can wrap, yielding a too small
value, so we must ensure calloc rejects this.
*/
void *ncalloc(size_t n, size_t s)
{
void *result;
size_t sz;

result = NULL;
if (!n || ((size_t)-1) / n > s) {
sz = n * s;
if ((result = nmalloc(sz))) memset(result, 0, sz);
}
return result;
} (* ncalloc *)

which makes the output of ncalloc be that of nmalloc(0) whenever
either n or s is 0. I think there is still a possible glitch when
((size_t)-1) / n == s.

Then why don't you make it

if (!n || ((size_t)-1) / n >= s)

instead?

The only thing that needs protection
against n==0 is the division. s==0 will simply force sz==0.

--
pete

Jan 5 '07 #193

Peter Nilsson said:

Richard Heathfield wrote:
>Mark McIntyre said:
Richard Heathfield <rj*@see.sig.invalid> wrote:
No, the argument to malloc is the expression -1, which is clearly of
type int, and is equally clearly negative!

The compiler doesn't read "-1" though, does it?

No, you're right - the preprocessor reads -1 (in the code under
consideration), and converts it into a pp-token.

No, -1 is actually two pp-tokens, a punctuator and a pp-number.
It's surprising what sorts of things constitute a pp-number
(e.g. 8teen), but a leading sign is not part of it. Even when
preprocessing tokens are converted to tokens, -1 remains as
two tokens.

I appear to have been out-pedanted again. :-)

Thanks for the correction.
--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Jan 5 '07 #194

CBFalconer wrote:

... The version I have just put in nmalloc.c (not yet published) is:

/* calloc included here to ensure that it handles the
same range of sizes (s * n) as does malloc. The
multiplication n*s can wrap, yielding a too small
value, so we must ensure calloc rejects this.
*/
void *ncalloc(size_t n, size_t s)
{
void *result;
size_t sz;

result = NULL;
if (!n || ((size_t)-1) / n > s) {
sz = n * s;
if ((result = nmalloc(sz))) memset(result, 0, sz);
}
return result;
} (* ncalloc *)

Make sure you fix the Pascal comments. ;-)

which makes the output of ncalloc be that of nmalloc(0) whenever
either n or s is 0. I think there is still a possible glitch when
((size_t)-1) / n == s.

Equality implies that n * s plus some non-negative remainder equals
(size_t)-1. In other words, you still have the mathematical relation
n*s <= (size_t)-1.

Of course, you may wish to reserve (size_t)-1 for use as an in-band
error signal (for other functions in your suite), but that's your
choice.
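
Putting the points above together, a hedged sketch of a calloc-style wrapper using the >= test might look like this. checked_calloc is a hypothetical name, and plain malloc stands in for the thread's nmalloc, whose source is not shown here.

```c
#include <stdlib.h>
#include <string.h>

/* Sketch only: checked_calloc is a hypothetical name, and malloc is a
   stand-in for nmalloc. Returns NULL when n * s cannot be represented
   in size_t, instead of allocating a wrapped (too small) block. */
void *checked_calloc(size_t n, size_t s)
{
    void *result = NULL;
    size_t sz;

    /* n == 0 is tested first so the division never divides by zero.
       (size_t)-1 / n >= s holds exactly when n * s <= (size_t)-1. */
    if (n == 0 || (size_t)-1 / n >= s) {
        sz = n * s;
        result = malloc(sz);
        if (result != NULL)
            memset(result, 0, sz);
    }
    return result;
}
```

With n or s equal to 0 the result is whatever malloc(0) returns, which may be a null pointer or a unique pointer depending on the implementation.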

The only thing that needs protection
against n==0 is the division. s==0 will simply force sz==0.

--
Peter

Jan 5 '07 #195

pete said:

<snip>

What RH said is wrong.
sizeof operates on types.

...and expressions.

sizeof does not need to be able to report how many bytes
of memory are allocated by calloc.

Yes, on reflection I'll take the hit on that. Apologies if I misled anyone.
I think I had malloc firmly on the brain, but calloc on my fingertips, so
to speak.
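
A short sketch of the sizeof point, for illustration only (the function names are hypothetical): sizeof accepts both a type name and an expression, and applied to a pointer it reports the size of the pointer, not of whatever block the pointer refers to.

```c
#include <stddef.h>

/* sizeof applied to a parenthesised type name. */
size_t size_of_type(void)
{
    return sizeof(double[10]);
}

/* sizeof applied to an expression. The operand is not evaluated here
   (it is not a variable length array), so the null pointer is never
   dereferenced; only the type of *p matters. */
size_t size_of_expression(void)
{
    double (*p)[10] = NULL;
    return sizeof *p;
}

/* sizeof applied to a pointer yields the size of the pointer itself,
   regardless of how much memory it might point to. */
size_t size_of_pointer(void)
{
    double *q = NULL;
    return sizeof q;
}
```

This is why sizeof never has to report the size of a block obtained from calloc: the program only ever holds a pointer to it.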

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Jan 5 '07 #196

pete said:

<snip>

If calloc can't return a pointer to
"space for an array of nmemb objects,
each of whose size is size"
then it should return a null pointer,
rather than a pointer to
((size_t)nmemb * (size_t)size) bytes of memory.

Certainly true.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Jan 5 '07 #197

Richard Heathfield <rj*@see.sig.invalid> writes:

pete said:

<snip>

>What RH said is wrong.
sizeof operates on types.

...and expressions.

>sizeof does not need to be able to report how many bytes
of memory are allocated by calloc.

Yes, on reflection I'll take the hit on that. Apologies if I misled anyone.
I think I had malloc firmly on the brain, but calloc on my fingertips, so
to speak.

I think the fact that calloc() can *theoretically* allocate objects
bigger than SIZE_MAX bytes is just accidental. Any declared object
presumably can't be bigger than that, because sizeof needs to work.
malloc() can't allocate anything bigger than SIZE_MAX bytes because
its argument is of type size_t. calloc() can because it takes two
arguments, but (I'm fairly sure) it doesn't take two arguments to
enable it to allocate huge objects.

It would have been perfectly reasonable for the standard to say that
no single object can be bigger than SIZE_MAX bytes; I'd be surprised
if any implementation of calloc() actually *can* allocate more than
SIZE_MAX bytes. (The ones I've seen work by calling malloc().)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Jan 5 '07 #198

Dietmar Schindler

Randy Howard wrote:

>
On Mon, 1 Jan 2007 18:59:04 -0600, jacob navia wrote
This discussion is not designed for people that do not make errors.

No doubt, since I've been searching for several decades for such a
person, and have yet to find one.

You can find many of them six feet under the ground.

--
Dietmar Schindler

Jan 5 '07 #199