Plauger, size_t and ptrdiff_t

"ro***********@yahoo.com" <ro***********@yahoo.com> writes:

In another thread, a poster mentioned the Posix ssize_t definition
(signed version of size_t). My initial reaction was to wonder what the
point of the Posix definition was when ptrdiff_t was already defined as
such.

I got the idea that ptrdiff_t had to be the same size as size_t from
Plauger's "The Standard C Library," where he states "... It is always
the signed type that has the same number of bits as the unsigned type
chosen for size_t..." This language would not rule out one being int
and the other long so long as sizeof(int)==sizeof(long) for the
implementation.

Now I can't see anywhere in the standard that would require that, at
least not directly, and it seems that a size_t of unsigned int and a
ptrdiff_t of long (where int and long are different sizes) would be
possible. C99 defines SIZE_MAX as being at least 65535, and
PTRDIFF_MIN/MAX as being at least -/+65535.

So do size_t and ptrdiff_t have to be the same size (or base type) or
not?

There's no requirement in the standard for size_t and ptrdiff_t to be
the same size, but I don't know of any implementation where they
differ.

ptrdiff_t is "the signed integer type of the result of subtracting two
pointers"; size_t is "the unsigned integer type of the result of the
sizeof operator".

Suppose a system only supports objects up to 65535 bytes. The sizeof
operator can only yield values from 0 to 65535, so 16 bits are
sufficient, but pointer subtraction for pointers to elements of an
array of 65535 bytes could yield values from -65535 to +65535, so
ptrdiff_t would have to be at least 17 bits.
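
The bit counts in that example can be checked with a small sketch. The function names value_bits and signed_bits_for_difference are made up for illustration; nothing here comes from the standard library beyond uintmax_t.

```c
#include <stdint.h>

/* Value bits needed to represent the unsigned value v (0 needs 0 bits). */
int value_bits(uintmax_t v)
{
    int n = 0;

    while (v != 0) {
        n++;
        v >>= 1;
    }
    return n;
}

/* Bits a signed type needs to hold every value in
   [-max_size, +max_size]: the value bits plus one sign bit. */
int signed_bits_for_difference(uintmax_t max_size)
{
    return value_bits(max_size) + 1;
}
```

For a maximum object size of 65535, value_bits gives 16 and signed_bits_for_difference gives 17, which is the gap described above.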

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Feb 17 '06 #3

"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...

"ro***********@yahoo.com" <ro***********@yahoo.com> writes:
In another thread, a poster mentioned the Posix ssize_t definition
(signed version of size_t). My initial reaction was to wonder what the
point of the Posix definition was when ptrdiff_t was already defined as
such.

I got the idea that ptrdiff_t had to be the same size as size_t from
Plauger's "The Standard C Library," where he states "... It is always
the signed type that has the same number of bits as the unsigned type
chosen for size_t..." This language would not rule out one being int
and the other long so long as sizeof(int)==sizeof(long) for the
implementation.

Now I can't see anywhere in the standard that would require that, at
least not directly, and it seems that a size_t of unsigned int and a
ptrdiff_t of long (where int and long are different sizes) would be
possible. C99 defines SIZE_MAX as being at least 65535, and
PTRDIFF_MIN/MAX as being at least -/+65535.

So do size_t and ptrdiff_t have to be the same size (or base type) or
not?

There's no requirement in the standard for size_t and ptrdiff_t to be
the same size, but I don't know of any implementation where they
differ.

ptrdiff_t is "the signed integer type of the result of subtracting two
pointers"; size_t is "the unsigned integer type of the result of the
sizeof operator".

Suppose a system only supports objects up to 65535 bytes. The sizeof
operator can only yield values from 0 to 65535, so 16 bits are
sufficient, but pointer subtraction for pointers to elements of an
array of 65535 bytes could yield values from -65535 to +65535, so
ptrdiff_t would have to be at least 17 bits.

Yes, but. X3J11 was painfully aware of this problem, which is why
we explicitly decided to allow some pointer differences to be
unrepresentable as a ptrdiff_t. I said what I said because that
was the reality at the time and it was certainly the intent of the
committee. Whether it got captured well in words...

BTW, even when you do get a ptrdiff_t overflow, on a (very common)
twos-complement machine with quiet wraparound on overflow there are
remarkably few cases where it matters.
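
A minimal sketch of the wraparound point, assuming a simulated 16-bit machine. Real pointer subtraction that overflows ptrdiff_t is undefined behavior, so this uses uint16_t "addresses" instead of pointers; wrapped_diff and add_offset are hypothetical names.

```c
#include <stdint.h>

/* Simulated 16-bit address difference with quiet wraparound: subtract
   modulo 65536, then reinterpret the result as a signed 16-bit value. */
int16_t wrapped_diff(uint16_t a, uint16_t b)
{
    uint16_t d = (uint16_t)(a - b);

    /* Reinterpret by hand so the sketch does not depend on the
       implementation-defined out-of-range conversion to int16_t. */
    if (d < 0x8000u)
        return (int16_t)d;
    return (int16_t)((int32_t)d - 65536);
}

/* Add a (possibly wrapped) difference back to an address, modulo 65536. */
uint16_t add_offset(uint16_t addr, int16_t off)
{
    return (uint16_t)(addr + (uint16_t)off);
}
```

With addresses 0x1000 and 0xAC40 (40000 bytes apart), wrapped_diff(0xAC40, 0x1000) comes out negative, so a sign test on the difference gives the wrong answer. Adding that wrapped difference back to 0x1000 still lands on 0xAC40, which is the common case where the overflow does not matter.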

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

Feb 17 '06 #4

On 2006-02-17, Keith Thompson <ks***@mib.org> wrote:

"ro***********@yahoo.com" <ro***********@yahoo.com> writes:
In another thread, a poster mentioned the Posix ssize_t definition
(signed version of size_t). My initial reaction was to wonder what the
point of the Posix definition was when ptrdiff_t was already defined as
such.

I got the idea that ptrdiff_t had to be the same size as size_t from
Plauger's "The Standard C Library," where he states "... It is always
the signed type that has the same number of bits as the unsigned type
chosen for size_t..." This language would not rule out one being int
and the other long so long as sizeof(int)==sizeof(long) for the
implementation.

Now I can't see anywhere in the standard that would require that, at
least not directly, and it seems that a size_t of unsigned int and a
ptrdiff_t of long (where int and long are different sizes) would be
possible. C99 defines SIZE_MAX as being at least 65535, and
PTRDIFF_MIN/MAX as being at least -/+65535.

So do size_t and ptrdiff_t have to be the same size (or base type) or
not?

There's no requirement in the standard for size_t and ptrdiff_t to be
the same size, but I don't know of any implementation where they
differ.

How about an implementation where size_t is unsigned int and ptrdiff_t
is long? If all base types have only the minimum ranges, you can fit
size_t in an unsigned int but can't fit ptrdiff_t in an int.

Feb 17 '06 #5

"P.J. Plauger" <pj*@dinkumware.com> writes:

"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...

[...]

There's no requirement in the standard for size_t and ptrdiff_t to be
the same size, but I don't know of any implementation where they
differ.

ptrdiff_t is "the signed integer type of the result of subtracting two
pointers"; size_t is "the unsigned integer type of the result of the
sizeof operator".

Suppose a system only supports objects up to 65535 bytes. The sizeof
operator can only yield values from 0 to 65535, so 16 bits are
sufficient, but pointer subtraction for pointers to elements of an
array of 65535 bytes could yield values from -65535 to +65535, so
ptrdiff_t would have to be at least 17 bits.

Yes, but. X3J11 was painfully aware of this problem, which is why
we explicitly decided to allow some pointer differences to be
unrepresentable as a ptrdiff_t. I said what I said because that
was the reality at the time and it was certainly the intent of the
committee. Whether it got captured well in words...

BTW, even when you do get a ptrdiff_t overflow, on a (very common)
twos-complement machine with quiet wraparound on overflow there are
remarkably few cases where it matters.

Ok, I see that the standard explicitly allows the result of ptr1-ptr2
not to fit in a ptrdiff_t (it's undefined behavior if it doesn't).
But I don't see anything that requires, or even encourages, size_t and
ptrdiff_t to be the same size.

If the maximum object size is, say, 65535 bytes, the standard
*permits* size_t and ptrdiff_t to be 16 bits, but is there any reason
(as far as the standard is concerned) not to make size_t 16 bits and
ptrdiff_t 32 bits?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Feb 17 '06 #6

Jordan Abel <ra*******@gmail.com> writes:

On 2006-02-17, Keith Thompson <ks***@mib.org> wrote:

[...]

There's no requirement in the standard for size_t and ptrdiff_t to be
the same size, but I don't know of any implementation where they
differ.

How about an implementation where size_t is unsigned int and ptrdiff_t
is long? If all base types have only the minimum ranges, you can fit
size_t in an unsigned int but can't fit ptrdiff_t in an int.

Then that would be an implementation I don't know of. (It's legal as
far as I know.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Feb 17 '06 #7

"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...

"P.J. Plauger" <pj*@dinkumware.com> writes:
"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...

[...]
There's no requirement in the standard for size_t and ptrdiff_t to be
the same size, but I don't know of any implementation where they
differ.

ptrdiff_t is "the signed integer type of the result of subtracting two
pointers"; size_t is "the unsigned integer type of the result of the
sizeof operator".

Suppose a system only supports objects up to 65535 bytes. The sizeof
operator can only yield values from 0 to 65535, so 16 bits are
sufficient, but pointer subtraction for pointers to elements of an
array of 65535 bytes could yield values from -65535 to +65535, so
ptrdiff_t would have to be at least 17 bits.

Yes, but. X3J11 was painfully aware of this problem, which is why
we explicitly decided to allow some pointer differences to be
unrepresentable as a ptrdiff_t. I said what I said because that
was the reality at the time and it was certainly the intent of the
committee. Whether it got captured well in words...

BTW, even when you do get a ptrdiff_t overflow, on a (very common)
twos-complement machine with quiet wraparound on overflow there are
remarkably few cases where it matters.

Ok, I see that the standard explicitly allows the result of ptr1-ptr2
not to fit in a ptrdiff_t (it's undefined behavior if it doesn't).
But I don't see anything that requires, or even encourages, size_t and
ptrdiff_t to be the same size.

If the maximum object size is, say, 65535 bytes, the standard
*permits* size_t and ptrdiff_t to be 16 bits, but is there any reason
(as far as the standard is concerned) not to make size_t 16 bits and
ptrdiff_t 32 bits?

There's just Nixon's rule -- you could do it, but it would be wrong.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

Feb 18 '06 #8

"P.J. Plauger" <pj*@dinkumware.com> writes:

"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...

[...]

Ok, I see that the standard explicitly allows the result of ptr1-ptr2
not to fit in a ptrdiff_t (it's undefined behavior if it doesn't).
But I don't see anything that requires, or even encourages, size_t and
ptrdiff_t to be the same size.

If the maximum object size is, say, 65535 bytes, the standard
*permits* size_t and ptrdiff_t to be 16 bits, but is there any reason
(as far as the standard is concerned) not to make size_t 16 bits and
ptrdiff_t 32 bits?

There's just Nixon's rule -- you could do it, but it would be wrong.

And why exactly would it be wrong? I see nothing in the standard that
even vaguely implies that size_t and ptrdiff_t should be the same
size, and there are realistic circumstances (see above) in which it
would make perfectly good sense, IMHO, for ptrdiff_t to be larger than
size_t. The standard's wording caters to implementations that choose
to make them the same size, but that's very different from encouraging
them to be the same size.

Your opinion does carry significant weight, but I'd be very interested
in knowing the reasoning behind it. In the meantime, I can't think of
any reason to write code that could break on systems where ptrdiff_t
is bigger than size_t.
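
One way to write code that does not care which of the two types is wider is to check the value before converting. This is only a sketch; diff_to_size is a hypothetical helper, not a standard function.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert a pointer difference to a size_t without assuming anything
   about the relative widths of ptrdiff_t and size_t.
   Returns 1 and stores the result on success, 0 if d does not fit. */
int diff_to_size(ptrdiff_t d, size_t *out)
{
    if (d < 0)
        return 0;               /* negative: no size_t equivalent */
    if ((uintmax_t)d > SIZE_MAX)
        return 0;               /* only possible if ptrdiff_t is wider */
    *out = (size_t)d;
    return 1;
}
```

On implementations where ptrdiff_t is no wider than size_t the second test can never fire, and a compiler may say so in a warning; it costs nothing to leave it in.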

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Feb 18 '06 #9

"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...

"P.J. Plauger" <pj*@dinkumware.com> writes:
"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org... [...]
Ok, I see that the standard explicitly allows the result of ptr1-ptr2
not to fit in a ptrdiff_t (it's undefined behavior if it doesn't).
But I don't see anything that requires, or even encourages, size_t and
ptrdiff_t to be the same size.

If the maximum object size is, say, 65535 bytes, the standard
*permits* size_t and ptrdiff_t to be 16 bits, but is there any reason
(as far as the standard is concerned) not to make size_t 16 bits and
ptrdiff_t 32 bits?

There's just Nixon's rule -- you could do it, but it would be wrong.

And why exactly would it be wrong?

Because there's next to nothing to be gained by it, and some people
would be surprised that ptrdiff_t is unnecessarily large. (I won't
even mention a third of a century of past practice.)

I see nothing in the standard that
even vaguely implies that size_t and ptrdiff_t should be the same
size,

There's nothing in the C Standard that prohibits 93-bit chars, either.

and there are realistic circumstances (see above) in which it
would make perfectly good sense, IMHO, for ptrdiff_t to be larger than
size_t.

Maybe by one bit, but as I said before even there it matters way less
than you might think.

The standard's wording caters to implementations that choose
to make them the same size, but that's very different from encouraging
them to be the same size.

Right. The C Standard is not intended to rule out silly implementations.

Your opinion does carry significant weight, but I'd be very interested
in knowing the reasoning behind it. In the meantime, I can't think of
any reason to write code that could break on systems where ptrdiff_t
is bigger than size_t.

Nor do I encourage that either.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

Feb 19 '06 #10

tmp123

P.J. Plauger wrote:

"Keith Thompson" <ks***@mib.org> wrote in message
"P.J. Plauger" <pj*@dinkumware.com> writes:
"Keith Thompson" <ks***@mib.org> wrote in message

[...]
Ok, I see that the standard explicitly allows the result of ptr1-ptr2
not to fit in a ptrdiff_t (it's undefined behavior if it doesn't).
But I don't see anything that requires, or even encourages, size_t and
ptrdiff_t to be the same size.

If the maximum object size is, say, 65535 bytes, the standard
*permits* size_t and ptrdiff_t to be 16 bits, but is there any reason
(as far as the standard is concerned) not to make size_t 16 bits and
ptrdiff_t 32 bits?

There's just Nixon's rule -- you could do it, but it would be wrong.

And why exactly would it be wrong?

Because there's next to nothing to be gained by it, and some people
would be surprised that ptrdiff_t is unnecessarily large. (I won't
even mention a third of a century of past practice.)

....

Sorry, I do not understand something:

Result of calloc is an array that, according to calloc parameters, can
be up to size_t**2 bytes. ptrdiff_t must be long enough for pointer
differences in its elements. Thus, ptrdiff_t must be greater than
size_t (or both must have the maximum system value)

In other words: it is easy to think of a system that doesn't allow big
structure definitions (for example, a maximum of one "memory page" of
4096 bytes --> size_t=4096), but this system can accept big dynamic
arrays (ptrdiff_t --> +/-0x7FFFFFFF).

I suppose there are some more related rules I do not know.

Kind regards.

Feb 19 '06 #11

"tmp123" <tm****@menta.net> wrote in message
news:11**********************@g47g2000cwa.googlegroups.com...

P.J. Plauger wrote:
"Keith Thompson" <ks***@mib.org> wrote in message
> "P.J. Plauger" <pj*@dinkumware.com> writes:
>> "Keith Thompson" <ks***@mib.org> wrote in message
> [...]
>>> Ok, I see that the standard explicitly allows the result of ptr1-ptr2
>>> not to fit in a ptrdiff_t (it's undefined behavior if it doesn't).
>>> But I don't see anything that requires, or even encourages, size_t
>>> and
>>> ptrdiff_t to be the same size.
>>>
>>> If the maximum object size is, say, 65535 bytes, the standard
>>> *permits* size_t and ptrdiff_t to be 16 bits, but is there any reason
>>> (as far as the standard is concerned) not to make size_t 16 bits and
>>> ptrdiff_t 32 bits?
>>
>> There's just Nixon's rule -- you could do it, but it would be wrong.
>
> And why exactly would it be wrong?
Because there's next to nothing to be gained by it, and some people
would be surprised that ptrdiff_t is unnecessarily large. (I won't
even mention a third of a century of past practice.)

...

Sorry, I do not understand something:

Result of calloc is an array that, according to calloc parameters, can
be up to size_t**2 bytes.

The calloc parameters might permit you to request that, but the system
can't deliver it. size_t by definition is an unsigned integer type
big enough to represent the count of bytes in the largest object you
can declare or allocate.

ptrdiff_t must be long enough for pointer
differences in its elements.

Actually, no. The difference between two pointers is permitted to
overflow when represented as an object of type ptrdiff_t.

Thus, ptrdiff_t must be greater than
size_t (or both must have the maximum system value)

Neither is true.

In other words: it is easy to think of a system that doesn't allow big
structure definitions (for example, a maximum of one "memory page" of
4096 bytes --> size_t=4096), but this system can accept big dynamic
arrays (ptrdiff_t --> +/-0x7FFFFFFF).

I suppose there are some more related rules I do not know.

Yes. See above.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

Feb 19 '06 #12

"P.J. Plauger" <pj*@dinkumware.com> writes:

"tmp123" <tm****@menta.net> wrote in message
news:11**********************@g47g2000cwa.googlegroups.com...

[...]

ptrdiff_t must be long enough for pointer
differences in its elements.

Actually, no. The difference between two pointers is permitted to
overflow when represented as an object of type ptrdiff_t.

Agreed, but you seem to be arguing that it's *better* for an
implementation to allow pointer subtraction to overflow than to make
ptrdiff_t bigger than size_t. I respectfully disagree.

Note that this would be necessary only when the maximum object size is
greater than SIZE_MAX/2 *and* size_t is the largest available integer
type. Since C99 requires support for 64-bit integers, that's unlikely
to happen for a long time. But a system with a 32-bit size_t that
allows objects larger than 2 gigabytes (but not larger than 4
gigabytes) might reasonably have a 64-bit ptrdiff_t. (33 bits would
suffice, but I'm assuming hardware support for 32 and 64 bits.) (It
might also reasonably expand its size_t to 64 bits.)
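
The condition above (object size versus what ptrdiff_t can hold) can also be enforced from the program's side. A minimal sketch, assuming a hypothetical wrapper named malloc_diffable: it refuses any request larger than PTRDIFF_MAX, so the difference between two char pointers into the returned block always fits in a ptrdiff_t.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: only hand out blocks small enough that any
   byte-pointer difference inside them is representable in ptrdiff_t. */
void *malloc_diffable(size_t n)
{
    if (n > (uintmax_t)PTRDIFF_MAX)
        return NULL;            /* subtraction could overflow; refuse */
    return malloc(n);
}
```

This is a choice about what the program asks for, not a statement about what any particular implementation does.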

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Feb 19 '06 #13

On 2006-02-19, tmp123 <tm****@menta.net> wrote:

P.J. Plauger wrote:
"Keith Thompson" <ks***@mib.org> wrote in message
> "P.J. Plauger" <pj*@dinkumware.com> writes:
>> "Keith Thompson" <ks***@mib.org> wrote in message
> [...]
>>> Ok, I see that the standard explicitly allows the result of ptr1-ptr2
>>> not to fit in a ptrdiff_t (it's undefined behavior if it doesn't).
>>> But I don't see anything that requires, or even encourages, size_t and
>>> ptrdiff_t to be the same size.
>>>
>>> If the maximum object size is, say, 65535 bytes, the standard
>>> *permits* size_t and ptrdiff_t to be 16 bits, but is there any reason
>>> (as far as the standard is concerned) not to make size_t 16 bits and
>>> ptrdiff_t 32 bits?
>>
>> There's just Nixon's rule -- you could do it, but it would be wrong.
>
> And why exactly would it be wrong?

Because there's next to nothing to be gained by it, and some people
would be surprised that ptrdiff_t is unnecessarily large. (I won't
even mention a third of a century of past practice.)

...

Sorry, I do not understand something:

Result of calloc is an array that, according to calloc parameters, can
be up to size_t**2 bytes. ptrdiff_t must be long enough for pointer
differences in its elements. Thus, ptrdiff_t must be greater than
size_t (or both must have the maximum system value)

The system can [and many do. mine does.] fail all calls to calloc that
result in a size that won't fit in size_t.
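
A sketch of that kind of check, for illustration only. The names product_overflows and calloc_checked are hypothetical, and this is not a claim about how any particular library implements calloc.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Nonzero if nmemb * size cannot be represented in a size_t. */
int product_overflows(size_t nmemb, size_t size)
{
    return nmemb != 0 && size > SIZE_MAX / nmemb;
}

/* calloc-like wrapper that fails cleanly instead of letting the
   byte count wrap around. */
void *calloc_checked(size_t nmemb, size_t size)
{
    if (product_overflows(nmemb, size))
        return NULL;
    return calloc(nmemb, size);
}
```

The division avoids ever computing the product, so the test itself cannot overflow.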

Feb 19 '06 #14

On 2006-02-19, Keith Thompson <ks***@mib.org> wrote:

"P.J. Plauger" <pj*@dinkumware.com> writes:
"tmp123" <tm****@menta.net> wrote in message
news:11**********************@g47g2000cwa.googlegroups.com... [...]
ptrdiff_t must be long enough for pointer
differences in its elements.

Actually, no. The difference between two pointers is permitted to
overflow when represented as an object of type ptrdiff_t.

Agreed, but you seem to be arguing that it's *better* for an
implementation to allow pointer subtraction to overflow than to make
ptrdiff_t bigger than size_t. I respectfully disagree.

But a system could implement a "magic overflow" - i.e. the size of
size_t, char *, and ptrdiff_t are all the same and overflow acts like
typical twos-complement systems.

Then the [char *] pointer difference 0x01-0xFFF3 is 14 [not -65522], and
adding 14 to the pointer 0xFFF3 results in 0x01.

Note that this would be necessary only when the maximum object size is
greater than SIZE_MAX/2 *and* size_t is the largest available integer
type. Since C99 requires support for 64-bit integers, that's unlikely
to happen for a long time. But a system with a 32-bit size_t that
allows objects larger than 2 gigabytes (but not larger than 4
gigabytes) might reasonably have a 64-bit ptrdiff_t. (33 bits would
suffice, but I'm assuming hardware support for 32 and 64 bits.) (It
might also reasonably expand its size_t to 64 bits.)

Feb 19 '06 #15

Malcolm

"P.J. Plauger" <pj*@dinkumware.com> wrote

Suppose a system only supports objects up to 65535 bytes. The sizeof
operator can only yield values from 0 to 65535, so 16 bits are
sufficient, but pointer subtraction for pointers to elements of an
array of 65535 bytes could yield values from -65535 to +65535, so
ptrdiff_t would have to be at least 17 bits.

Yes, but. X3J11 was painfully aware of this problem, which is why
we explicitly decided to allow some pointer differences to be
unrepresentable as a ptrdiff_t. I said what I said because that
was the reality at the time and it was certainly the intent of the
committee. Whether it got captured well in words...

BTW, even when you do get a ptrdiff_t overflow, on a (very common)
twos-complement machine with quiet wraparound on overflow there are
remarkably few cases where it matters.

My own view is that it is time the size_t, ptrdiff_t ugliness was put to
rest.

It is designed to solve a problem that almost never occurs in practice,
which is that the size of an object in memory exceeds the size of an
integer.

There is maybe a case for making malloc() take a special type as an
argument, but there is a much weaker case for then allowing the type to run
through the code, so that every count of objects (and most integers count
something), and hence every array index, has to be a size_t.

As this problem shows, the strategy doesn't even have the advantage of
making every program that uses the new types theoretically correct for all
values, unless you get into the nonsense of making ptrdiff_t a bit wider
than size_t.

--
Buy my book 12 Common Atheist Arguments (refuted)
$1.25 download or $6.90 paper, available www.lulu.com

Feb 19 '06 #16

Jordan Abel <ra*******@gmail.com> writes:

On 2006-02-19, Keith Thompson <ks***@mib.org> wrote:
"P.J. Plauger" <pj*@dinkumware.com> writes:
"tmp123" <tm****@menta.net> wrote in message
news:11**********************@g47g2000cwa.googlegroups.com...

[...]
ptrdiff_t must be long enough for pointer
differences in its elements.

Actually, no. The difference between two pointers is permitted to
overflow when represented as an object of type ptrdiff_t.

Agreed, but you seem to be arguing that it's *better* for an
implementation to allow pointer subtraction to overflow than to make
ptrdiff_t bigger than size_t. I respectfully disagree.

But a system could implement a "magic overflow" - i.e. the size of
size_t, char *, and ptrdiff_t are all the same and overflow acts like
typical twos-complement systems.

Then the [char *] pointer difference 0x01-0xFFF3 is 14 [not -65522], and
adding 14 to the pointer 0xFFF3 results in 0x01.

Sure, a system could do that, and many do. This is clearly allowed by
the standard. I'm suggesting that, given that ptrdiff_t can overflow
for large objects, it's better to make ptrdiff_t big enough so it
can't overflow (even if it's not the same size as size_t) than to
force it to be the same size as size_t.

I know that most systems have ptrdiff_t and size_t the same size. I'm
just not convinced that there's any significant disadvantage in making
them different sizes *if* there's some benefit (avoiding overflow) in
doing so.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Feb 19 '06 #17

"Malcolm" <re*******@btinternet.com> writes:

"P.J. Plauger" <pj*@dinkumware.com> wrote
Suppose a system only supports objects up to 65535 bytes. The sizeof
operator can only yield values from 0 to 65535, so 16 bits are
sufficient, but pointer subtraction for pointers to elements of an
array of 65535 bytes could yield values from -65535 to +65535, so
ptrdiff_t would have to be at least 17 bits.

Yes, but. X3J11 was painfully aware of this problem, which is why
we explicitly decided to allow some pointer differences to be
unrepresentable as a ptrdiff_t. I said what I said because that
was the reality at the time and it was certainly the intent of the
committee. Whether it got captured well in words...

BTW, even when you do get a ptrdiff_t overflow, on a (very common)
twos-complement machine with quiet wraparound on overflow there are
remarkably few cases where it matters.

My own view is that it is time the size_t, ptrdiff_t ugliness was put to
rest.

It is designed to solve a problem that almost never occurs in practice,
which is that the size of an object in memory exceeds the size of an
integer.

I assume you mean that the size of an object (in bytes) exceeds the
maximum value of an integer. And I assume that by "integer" you
really mean "int".

So are you suggesting that we should use unsigned int to represent
object sizes?

On typical 64-bit systems, including several that I work on, type int
is 32 bits (and long is 64 bits). If int were made 64 bits, then
there would be a gap in the type system; char is 8 bits, int would be
64 bits, and short would be either 16 or 32 bits. (A C99 extended
integer type could solve this, but such types aren't commonly
implemented.) One such system in particular has 8 gigabytes of
physical memory. I don't know whether objects bigger than 4 gigabytes
are allowed, but there's no fundamental reason they shouldn't be --
and both size_t and ptrdiff_t are 64 bits.

There is maybe a case for making malloc() take a special type as an
argument, but there is a much weaker case for then allowing the type to run
through the code, so that every count of objects (and most integers count
something), and hence every array index, has to be a size_t.

As this problem shows, the strategy doesn't even have the advantage of
making every program that uses the new types theoretically correct for all
values, unless you get into the nonsense of making ptrdiff_t a bit wider
than size_t.

Even if you assume that making ptrdiff_t wider than size_t is
nonsense, it's not necessary in this case. Limiting object sizes to
values representable in 32 bits would be absurd; limiting them to
values representable in 64 bits will be more than enough for many
years.

What exactly are you proposing?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Feb 19 '06 #18

Malcolm

"Keith Thompson" <ks***@mib.org> wrote

Even if you assume that making ptrdiff_t wider than size_t is
nonsense, it's not necessary in this case. Limiting object sizes to
values representable in 32 bits would be absurd; limiting them to
values representable in 64 bits will be more than enough for many
years.

What exactly are you proposing?

Change the standard so that malloc() "takes as an argument an integral type
of type int or higher".

Therefore if you have a machine with a huge memory and small ints, the
compiler writer is free to say that malloc() should take an unsigned long
long, or whatever.

However anyone can write code with ints as array indices, and know that it
is portable - the huge array user is the one making the non-portable
assumption.

Usually you never want to allocate more memory than can be held in an
integer, and of the exceptions usually you want to make only one, very
program- and platform-specific, allocation.

--
Buy my book 12 Common Atheist Arguments (refuted)
$1.25 download or $6.90 paper, available www.lulu.com

Feb 21 '06 #19

"Malcolm" <re*******@btinternet.com> writes:

"Keith Thompson" <ks***@mib.org> wrote
Even if you assume that making ptrdiff_t wider than size_t is
nonsense, it's not necessary in this case. Limiting object sizes to
values representable in 32 bits would be absurd; limiting them to
values representable in 64 bits will be more than enough for many
years.

What exactly are you proposing?

Change the standard so that malloc() "takes as an argument an integral type
of type int or higher".

Therefore if you have a machine with a huge memory and small ints, the
compiler writer is free to say that malloc() should take an unsigned long
long, or whatever.

However anyone can write code with ints as array indices, and know that it
is portable - the huge array user is the one making the non-portable
assumption.

You propose making malloc()'s argument type implementation-defined.
It already is; the only real difference is that the standard gives a
name to that argument type, namely size_t.

If you want to call malloc() with an int argument, you're already free
to do so; it will be converted to size_t as long as the prototype is
visible. If you want to index arrays with ints, that's also perfectly
legal.
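
A small sketch of both points, assuming a hypothetical helper named make_letters: an int is passed straight to malloc() and an int is used as the array index.

```c
#include <stdlib.h>

/* Build a string of n-1 letters using plain int for both the
   allocation size and the index. Returns NULL on failure. */
char *make_letters(int n)
{
    static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz";
    char *s;
    int i;

    if (n <= 0)
        return NULL;    /* a negative int would convert to a huge size_t */
    s = malloc(n);      /* the prototype in <stdlib.h> converts n to size_t */
    if (s == NULL)
        return NULL;
    for (i = 0; i < n - 1; i++)
        s[i] = alphabet[i % 26];    /* int used as an array index */
    s[n - 1] = '\0';
    return s;
}
```

The one trap is the sign: the conversion to size_t is well defined for a negative int, but the result is a very large request, hence the explicit check.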

I really don't see any advantage in what you propose.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Feb 21 '06 #20

On 2006-02-21, Malcolm <re*******@btinternet.com> wrote:

"Keith Thompson" <ks***@mib.org> wrote
Even if you assume that making ptrdiff_t wider than size_t is
nonsense, it's not necessary in this case. Limiting object sizes to
values representable in 32 bits would be absurd; limiting them to
values representable in 64 bits will be more than enough for many
years.

What exactly are you proposing?

Change the standard so that malloc() "takes as an argument an integral type
of type int or higher".

Therefore if you have a machine with a huge memory and small ints, the
compiler writer is free to say that malloc() should take an unsigned long
long, or whatever.

However anyone can write code with ints as array indices, and know that it
is portable - the huge array user is the one making the non-portable
assumption.

Usually you never want to allocate more memory than can be held in an
integer, and of the exceptions usually you want to make only one, very
program- and platform-specific, allocation.

Some leeway for the types of arguments in general would be nice,
actually - for example, it'd be nice if fseek() could take an off_t on
unix systems, eliminating the need for the non-standard fseeko()
function.

Feb 22 '06 #21

Jordan Abel <ra*******@gmail.com> writes:
[...]

Some leeway for the types of arguments in general would be nice,
actually - for example, it'd be nice if fseek() could take an off_t on
unix systems, eliminating the need for the non-standard fseeko()
function.

off_t isn't necessarily an integer type; it's "an object type other
than an array type capable of recording all the information needed to
specify uniquely every position within a file". The non-standard
fseeko() is identical to fseek() except that it takes an off_t rather
than a long; I think that it's supported only on systems where off_t
is an integer type. (Note that fpos_t also has to store the file's
parse state, of type mbstate_t.)

What's needed for fseek() and ftell() isn't more leeway, it's a better
definition. The problem is that they're unnecessarily tied to one
particular predefined integer type, long, that isn't necessarily big
enough for the purpose.

A better solution would be to add a new typedef (fseek_t?) for an
integer type to be used for the offset argument of fseek() and the
result of ftell(). Using long int was reasonable in C90, in which
long int was the largest integer type. In C99, though, we have a
requirement for a 64-bit type, but long int can still reasonably be 32
bits even on systems that support multi-gigabyte files. I think
continuing to use long int for fseek() and ftell() was a mistake on
the part of the C99 committee (unless such a change would have broken
existing code).
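
To make the limitation concrete, here is a rough sketch of what portable code has to do today when it holds a 64-bit offset but can only call fseek(). The names fits_in_long() and seek_abs() are hypothetical, made up for illustration; on a system with a 32-bit long, any offset past 2147483647 is simply rejected.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Returns nonzero if `offset` can be passed to fseek() without truncation. */
int fits_in_long(long long offset)
{
    return offset >= LONG_MIN && offset <= LONG_MAX;
}

/* Hypothetical wrapper: seek to an absolute offset held in a long long.
   Returns -1 if the offset is negative or does not fit in a long,
   otherwise returns whatever fseek() returns. */
int seek_abs(FILE *fp, long long offset)
{
    assert(fp != NULL);
    if (offset < 0 || !fits_in_long(offset))
        return -1;
    return fseek(fp, (long)offset, SEEK_SET);
}
```

Where long is 64 bits the range check never fails for a valid offset; where it is 32 bits, the check is the only thing standing between the caller and a silently truncated offset.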

64-bit file sizes should be adequate for a long time.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Feb 22 '06 #22

On 2006-02-22, Keith Thompson <ks***@mib.org> wrote:

Jordan Abel <ra*******@gmail.com> writes:
[...]

Some leeway for the types of arguments in general would be nice,
actually - for example, it'd be nice if fseek() could take an off_t on
unix systems, eliminating the need for the non-standard fseeko()
function.

off_t isn't necessarily an integer type; it's "an object type other
than an array type capable of recording all the information needed to
specify uniquely every position within a file"

You're thinking of fpos_t. off_t is a signed integral type. It's in
POSIX; there is of course no such thing in standard C.

The non-standard fseeko() is identical to fseek() except that it takes
an off_t rather than a long; I think that it's supported only on
systems where off_t is an integer type. (Note that fpos_t also has to
store the file's parse state, of type mbstate_t.)

What's needed for fseek() and ftell() isn't more leeway, it's a better
definition. The problem is that they're unnecessarily tied to one
particular predefined integer type, long, that isn't necessarily big
enough for the purpose.

A better solution would be to add a new typedef (fseek_t?) for an
integer type to be used for the offset argument of fseek() and the
result of ftell().

That's what off_t is for in POSIX.

Using long int was reasonable in C90, in which long int was the
largest integer type. In C99, though, we have a requirement for a
64-bit type, but long int can still reasonably be 32 bits even on
systems that support multi-gigabyte files. I think continuing to use
long int for fseek() and ftell() was a mistake on the part of the C99
committee (unless such a change would have broken existing code).

The problem is that existing code is permitted to declare, say,
extern long ftell();

The question is how much existing code does such a thing, and how much
is it permissible to break. _any_ new change can break existing code.
C99 breaks existing code that uses inline as an identifier, IIRC.

64-bit file sizes should be adequate for a long time.

Feb 22 '06 #23

Keith Thompson <ks***@mib.org> writes:

Jordan Abel <ra*******@gmail.com> writes:
[...]
Some leeway for the types of arguments in general would be nice,
actually - for example, it'd be nice if fseek() could take an off_t on
unix systems, eliminating the need for the non-standard fseeko()
function.

off_t isn't necessarily an integer type; it's "an object type other
than an array type capable of recording all the information needed to
specify uniquely every position within a file". The non-standard
fseeko() is identical to fseek() except that it takes an off_t rather
than a long; I think that it's supported only on systems where off_t
is an integer type. (Note that fpos_t also has to store the file's
parse state, of type mbstate_t.)

The fseeko() and ftello() functions are defined by POSIX (or at least
by The Open Group Base Specifications Issue 6, which I think is
basically the same thing). POSIX also requires off_t to be a signed
integer type. I don't know whether it would be practical for a future
version of the C standard to require this as well (I frankly don't
understand the multibyte/wide character stuff very well); doing so
might break things for some non-POSIX systems.
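
For reference, a small sketch of the POSIX interface in use. The helper stream_size() is a hypothetical name; the _FILE_OFFSET_BITS define is a glibc-style assumption for getting a 64-bit off_t on 32-bit targets, and other systems may need something different or nothing at all.

```c
#define _POSIX_C_SOURCE 200809L  /* expose fseeko() and ftello() */
#define _FILE_OFFSET_BITS 64     /* assumption: glibc-style request for 64-bit off_t */
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: size of an open stream in bytes, or -1 on error.
   The stream's original position is restored before returning. */
off_t stream_size(FILE *fp)
{
    assert(fp != NULL);

    off_t saved = ftello(fp);            /* remember where we were */
    if (saved == (off_t)-1)
        return -1;
    if (fseeko(fp, 0, SEEK_END) != 0)    /* jump to the end */
        return -1;

    off_t size = ftello(fp);             /* offset of the end is the size */
    if (fseeko(fp, saved, SEEK_SET) != 0)
        return -1;
    return size;
}
```

The code has the same shape as the fseek()/ftell() version; only the offset type differs, which is why the result is not squeezed through a long.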

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Feb 22 '06 #24

Jordan Abel <ra*******@gmail.com> writes:

On 2006-02-22, Keith Thompson <ks***@mib.org> wrote:
Jordan Abel <ra*******@gmail.com> writes:
[...]
Some leeway for the types of arguments in general would be nice,
actually - for example, it'd be nice if fseek() could take an off_t on
unix systems, eliminating the need for the non-standard fseeko()
function.

off_t isn't necessarily an integer type; it's "an object type other
than an array type capable of recording all the information needed to
specify uniquely every position within a file"

You're thinking of fpos_t. off_t is a signed integral type. It's in
POSIX; there is of course no such thing in standard C.

You're right of course. Stupid mistake on my part. That'll teach me
to post while using a rented brain.

The non-standard fseeko() is identical to fseek() except that it takes
an off_t rather than a long; I think that it's supported only on
systems where off_t is an integer type. (Note that fpos_t also has to
store the file's parse state, of type mbstate_t.)

As mentioned elsewhere, off_t is POSIX-specific and is required to be
an integer type.

What's needed for fseek() and ftell() isn't more leeway, it's a better
definition. The problem is that they're unnecessarily tied to one
particular predefined integer type, long, that isn't necessarily big
enough for the purpose.

A better solution would be to add a new typedef (fseek_t?) for an
integer type to be used for the offset argument of fseek() and the
result of ftell().

That's what off_t is for in POSIX.

Right. So it seems to me that making off_t a system-specific typedef
for a signed integer type, and using it rather than long int for
fseek() and ftell(), would have been a really good idea. In other
words, adopt POSIX's fseeko() and ftello(), but call them fseek() and
ftell(). On most modern systems, then, off_t would be a 64-bit
integer type (probably an alias for long long).

fpos_t, fgetpos(), and fsetpos() would presumably be left as they are.

Using long int was reasonable in C90, in which long int was the
largest integer type. In C99, though, we have a requirement for a
64-bit type, but long int can still reasonably be 32 bits even on
systems that support multi-gigabyte files. I think continuing to use
long int for fseek() and ftell() was a mistake on the part of the C99
committee (unless such a change would have broken existing code).

The problem is that existing code is permitted to declare, say,
extern long ftell();

The question is how much existing code does such a thing, and how much
is it permissible to break. _any_ new change can break existing code.
C99 breaks existing code that uses inline as an identifier, IIRC.

Yes, along with restrict (the new keywords _Bool, _Complex, and
_Imaginary can't conflict with any strictly conforming program; I
guess the committee decided _Inline and _Restrict were just too ugly).

Any *sane* program will just use "#include <stdio.h>", of course. In
fact, I'm not sure it's possible to use fseek() and ftell() without a
"#include <stdio.h>", since each takes an argument of type FILE*.

A change would also affect any program that uses a function pointer
that points to either fseek() or ftell(), which might be reasonable if
you want to determine at run time whether to call the standard
function or one with the same type (say, a dummy or wrapper function).
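
A sketch of the kind of code I have in mind; seek_fn, checked_seek(), and choose_seek() are hypothetical names invented for the example. If the offset parameter of fseek() changed from long to some other type, the conditional expression in choose_seek() would no longer have two operands of the same pointer type and the code would stop compiling.

```c
#include <assert.h>
#include <stdio.h>

/* The function type shared by fseek() and any stand-in for it. */
typedef int (*seek_fn)(FILE *, long, int);

/* Hypothetical wrapper with the same type as fseek(): rejects negative
   absolute offsets, otherwise defers to the standard function. */
int checked_seek(FILE *fp, long offset, int whence)
{
    assert(fp != NULL);
    if (whence == SEEK_SET && offset < 0)
        return -1;
    return fseek(fp, offset, whence);
}

/* Decide at run time whether to call the wrapper or fseek() itself. */
seek_fn choose_seek(int want_checks)
{
    return want_checks ? checked_seek : fseek;
}
```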

If concern for backward compatibility prevents changing the types of
fseek() and ftell(), I suppose a future C standard could just add
fseeko(), ftello(), and off_t from POSIX, and deprecate fseek() and
ftell(). Unfortunately, that would make the set of positioning
functions even more cluttered than it already is.

Did C99 change the types of any C90 library functions while leaving
their names as they were?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Feb 22 '06 #25