Incrementing variables past limits

Bas Wassink

Hi there,

Does the ANSI standard say anything about incrementing variables past
their limits ?

When I compile code like this:

unsigned char x = 255;
x++;
printf ( "%d\n", x );

with GCC, the output is 0.

From an assembly point of view this seems to be logical behaviour, but is
it defined in the standard, or is it implementation dependent ??

I've googled for an answer, read K&R2 and the c.l.c FAQ but couldn't find
any decent answer..
Thanks in advance,
Bas Wassink.

Nov 14 '05 #1

Eric Sosman

Bas Wassink wrote:

Hi there,

Does the ANSI standard say anything about incrementing variables past
their limits ?

When I compile code like this:

unsigned char x = 255;
x++;
printf ( "%d\n", x );

with GCC, the output is 0.

From an assembly point of view this seems to be logical behaviour, but is
it defined in the standard, or is it implementation dependent ??

I've googled for an answer, read K&R2 and the c.l.c FAQ but couldn't find
any decent answer..

The result is well-defined for unsigned integers: they
obey the rules of modular ("clock") arithmetic.

For unsigned integers, there are no guarantees. The
program may trap, or may deliver some implementation-defined
result. Most implementations "wrap around" from the most
positive to the most negative value, but the C language
itself doesn't promise this behavior.

--
Er*********@sun.com

Nov 14 '05 #2

Ben Pfaff

Bas Wassink <id******@email.org> writes:

Does the ANSI standard say anything about incrementing variables past
their limits ?

Yes. Unsigned integer types wrap around. With other types it's
unpredictable and you should avoid doing out-of-bounds arithmetic
with them.
--
"We put [the best] Assembler programmers in a little glass case in the hallway
near the Exit sign. The sign on the case says, `In case of optimization
problem, break glass.' Meanwhile, the problem solvers are busy doing their
work in languages most appropriate to the job at hand." --Richard Riehle

Nov 14 '05 #3

Christopher Benson-Manica

<cr**********@news1brm.Central.Sun.COM>
Eric Sosman <er*********@sun.com> wrote:

For unsigned integers, there are no guarantees. The
    ^^^^^^^^

(I know you meant "signed" :)

program may trap, or may deliver some implementation-defined
result. Most implementations "wrap around" from the most
positive to the most negative value, but the C language
itself doesn't promise this behavior.

--
Christopher Benson-Manica
ataru(at)cyberspace.org

Nov 14 '05 #4

Eric Sosman

Christopher Benson-Manica wrote:

<cr**********@news1brm.Central.Sun.COM>
Eric Sosman <er*********@sun.com> wrote:

For unsigned integers, there are no guarantees. The
    ^^^^^^^^ (I know you meant "signed" :)

I know you're right. Sorry about that.

--
Er*********@sun.com

Nov 14 '05 #5

Keith Thompson

Bas Wassink <id******@email.org> writes:

Does the ANSI standard say anything about incrementing variables past
their limits ?

When I compile code like this:

unsigned char x = 255;
x++;
printf ( "%d\n", x );

with GCC, the output is 0.

From an assembly point of view this seems to be logical behaviour, but is
it defined in the standard, or is it implementation dependent ??

I've googled for an answer, read K&R2 and the c.l.c FAQ but couldn't find
any decent answer..

For unsigned types, overflow has well-defined behavior; the result
wraps around. More precisely, the result is reduced modulo the number
that is one greater than the largest value that can be represented by
the resulting type. For unsigned char (assuming UCHAR_MAX==255), the
value 256 reduces to 0; gcc is behaving correctly.

For signed types, overflow causes undefined behavior. Wraparound is
fairly common, but you shouldn't depend on it; some implementations
may produce a trap. (A conversion to a signed type, if the value
cannot be represented, either yields an implementation-defined result
or raises an implementation-defined signal.)
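
The unsigned rule described above can be sketched in a few lines. This is only an illustration: `incr_uchar` is a hypothetical helper name, and the sketch assumes the usual case in which int can represent UCHAR_MAX + 1.

```c
#include <limits.h>

/* Hypothetical helper: increment an unsigned char and return the
   stored result.  x is promoted for the addition; converting the sum
   back to unsigned char reduces it modulo UCHAR_MAX + 1. */
unsigned char incr_uchar(unsigned char x)
{
    x++;
    return x;
}
```

Calling incr_uchar(UCHAR_MAX) yields 0 whether UCHAR_MAX is 255 or something larger, because the reduction is defined in terms of UCHAR_MAX + 1 rather than 256.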

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #6

Bas Wassink

Thank you for your quick reply, this will help me with writing clean
and portable 6502/6510-emulation code..

Bas Wassink

Nov 14 '05 #7

Derrick Coetzee

Bas Wassink wrote:

Does the ANSI standard say anything about incrementing variables past
their limits ?

Others have explained the rules for signed and unsigned integers. Also
note, however, that casting from signed to unsigned, incrementing, and
then casting back won't work. It's the last step that fails; a cast from
an unsigned value to a signed type that cannot represent that value is
implementation-defined (ref C99 6.3.1.3.3). If performance isn't a big
deal, the easiest way to increment a signed integer is a branch:

#include <limits.h>

int signed_incr(int i) {
    if (i == INT_MAX)
        i = INT_MIN;
    else
        i++;
}

--
Derrick Coetzee
I grant this newsgroup posting into the public domain. I disclaim all
express or implied warranty and all liability. I am not a professional.

Nov 14 '05 #8

Peter Nilsson

Keith Thompson wrote:

Bas Wassink <id******@email.org> writes:
Does the ANSI standard say anything about incrementing variables past their limits ?

When I compile code like this:

unsigned char x = 255;
[Note that 255 need not be the limit for unsigned char.]
x++;
printf ( "%d\n", x );

with GCC, the output is 0.

From an assembly point of view this seems to be logical behaviour, but is
it defined in the standard, or is it implementation dependent ??

For unsigned types, overflow has well-defined behavior; the result
wraps around. More precisely, the result is reduced modulo the
number that is one greater than the largest value that can be represented by
the resulting type. For unsigned char [assuming UCHAR_MAX==255], the
value 256 reduces to 0; gcc is behaving correctly.

Unsigned short has scope for problems...

unsigned short us = -1;
us++; /* UB if USHRT_MAX == INT_MAX */

The paranoid can safely use...

us += 1u;

--
Peter

Nov 14 '05 #9

Micah Cowan

Derrick Coetzee wrote:

Bas Wassink wrote:
Does the ANSI standard say anything about incrementing variables past
their limits ?

Others have explained the rules for signed and unsigned integers. Also
note, however, that casting from signed to unsigned, incrementing, and
then casting back won't work. It's the last step that fails; a cast from
an unsigned value to a signed type that cannot represent that value is
implementation-defined (ref C99 6.3.1.3.3). If performance isn't a big
deal, the easiest way to increment a signed integer is a branch:

#include <limits.h>

int signed_incr(int i) {
    if (i == INT_MAX)
        i = INT_MIN;
    else
        i++;
}

Of course, the above would not actually modify the original
value, and doesn't specify a return value, but I think we all
know what you meant.
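
For reference, a version that does return the incremented value might look like the sketch below. Returning the result for the caller to assign back, rather than taking a pointer to the variable, is an assumption about the intended interface.

```c
#include <limits.h>

/* Returns i + 1, wrapping from INT_MAX to INT_MIN explicitly so that
   no signed overflow (undefined behavior) ever occurs. */
int signed_incr(int i)
{
    if (i == INT_MAX)
        return INT_MIN;
    return i + 1;
}
```

A caller would write i = signed_incr(i); to update a variable in place.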

Actually, in the vast majority of cases, it would be much better
to indicate an error when (i == INT_MAX) than to silently roll
it. I generally check explicitly for overflow on every arithmetic
operation that might cause one.
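
A checked increment along those lines could be sketched as follows. The name `checked_incr` and its return convention (1 on success, 0 on overflow) are hypothetical, chosen only for illustration.

```c
#include <limits.h>

/* Hypothetical helper: stores i + 1 in *out and returns 1 on success;
   returns 0 and leaves *out untouched if i + 1 would overflow. */
int checked_incr(int i, int *out)
{
    if (i == INT_MAX)
        return 0;
    *out = i + 1;
    return 1;
}
```

The test happens before the addition, so the overflowing expression is never evaluated; checking the result afterwards would already be too late.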

As an absurdly extreme--but true--case, failure to catch such a
rollover caused massive radiation overexposure, and even death,
to patients treated by the infamous Therac-25 linear accelerator
used for radiation therapy.

http://courses.cs.vt.edu/~cs3604/lib.../Therac_1.html

Nov 14 '05 #10

Bas Wassink

On Tue, 04 Jan 2005 21:43:59 -0800, Peter Nilsson wrote:

[Note that 255 need not be the limit for unsigned char.]

So, I would be better off using an unsigned int for representing a byte
and performing the wrap-around myself, right ?

Can I rely on fputc to write an eight-bit value to a file ( passing "wb"
to fopen ) or is this implementation dependent too ?

Nov 14 '05 #11

infobahn

Bas Wassink wrote:

On Tue, 04 Jan 2005 21:43:59 -0800, Peter Nilsson wrote:
[Note that 255 need not be the limit for unsigned char.]

So, I would be better off using an unsigned int for representing a byte
and performing the wrap-around myself, right ?

Can I rely on fputc to write an eight-bit value to a file ( passing "wb"
to fopen ) or is this implementation dependent too ?

The Standard says:

"The fputc function writes the character specified by c (converted
to an unsigned char) to the output stream pointed to by stream, at
the position indicated by the associated file position indicator for
the stream (if defined), and advances the indicator appropriately.
If the file cannot support positioning requests, or if the stream
was opened with append mode, the character is appended to the output
stream."

So the answer is that CHAR_BIT bits will be written. This must
be at least 8, but can be higher.
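
For emulation code that needs exactly eight significant bits, one option is to keep the value in an unsigned int and mask it explicitly. The following is a sketch under that assumption; `inc8` and `write_octet` are hypothetical names, not anything from the Standard.

```c
#include <stdio.h>

/* Hypothetical helpers for an emulator that models 8-bit registers. */

/* Increment an emulated 8-bit register held in an unsigned int,
   wrapping at 256 regardless of how wide unsigned char really is. */
unsigned inc8(unsigned v)
{
    return (v + 1u) & 0xFFu;
}

/* Write the low 8 bits of v with fputc.  Returns 0 on success, -1 on
   a write error.  Where CHAR_BIT > 8 the stream still receives one
   (wider) character, but its value is always in the range 0..255. */
int write_octet(FILE *fp, unsigned v)
{
    return fputc((int)(v & 0xFFu), fp) == EOF ? -1 : 0;
}
```

The mask makes the wrap-around independent of UCHAR_MAX; it does not change how many bits the implementation stores per character in the file.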

Nov 14 '05 #12

Chris Croughton

On Wed, 05 Jan 2005 04:52:04 GMT, Derrick Coetzee
<dc****@moonflare.com> wrote:

Bas Wassink wrote:
Does the ANSI standard say anything about incrementing variables past
their limits ?

Others have explained the rules for signed and unsigned integers. Also
note, however, that casting from signed to unsigned, incrementing, and
then casting back won't work. It's the last step that fails; a cast from
an unsigned value to a signed type that cannot represent that value is
implementation-defined (ref C99 6.3.1.3.3). If performance isn't a big
deal, the easiest way to increment a signed integer is a branch:

How about straight assignment? If I do:

int test(int i)
{
    unsigned int u;
    u = i;
    return (int)u;
}

is that defined? I have a suspicion that it isn't (because although u
can represent a negative value of i in some implementation-defined way,
the conversion back to an int may not be able to represent that unsigned
value).

More usefully, is there any real computer and C compiler on which it
will fail in practice?

Chris C

Nov 14 '05 #13

Eric Sosman

Bas Wassink wrote:

On Tue, 04 Jan 2005 21:43:59 -0800, Peter Nilsson wrote:

[Note that 255 need not be the limit for unsigned char.]

So, I would be better off using an unsigned int for representing a byte
and performing the wrap-around myself, right ?

"Better" depends on your purposes; there is no "best"
way to define "better."

Machines with 8-bit bytes are the overwhelming majority
these days, the principal exceptions being processors that
are specially tailored to particular tasks like signal
processing. If you can accept the limitation of only
running on "mainstream" systems, just go ahead and write
code that assumes an 8-bit byte. But you should insert a
simple test for the sake of the poor sod who may someday
try to run your code on a machine with 11- or 32-bit bytes:

#include <limits.h>
#if UCHAR_MAX != 255
#error "Sorry; must have an 8-bit byte"
#endif

A good, solid error at compile-time will save him a lot of
futile effort; he may not thank you for assuming the 8-bit
byte, but he'll at least not curse you for the time he
spends trying to fix your "bugs."

--
Er*********@sun.com

Nov 14 '05 #14

CBFalconer

Chris Croughton wrote:

.... snip ...

How about straight assignment? If I do:

int test(int i)
{
    unsigned int u;
    u = i;
    return (int)u;
}

is that defined? I have a suspicion that it isn't (because although
u can represent a negative value of i in some implementation-defined
way, the conversion back to an int may not be able to represent that
unsigned value).

You answered your own question.

More usefully, is there any real computer and C compiler on which it
will fail in practice?

Define fail. Remember, these casts are value transformations, not
bodily transference of bit patterns. Your attitude reminds me of
Microsoft, who at least have the excuse that all their software
runs on the same hardware family. That attitude will lead to
similar reliability.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 14 '05 #15

Keith Thompson

Chris Croughton <ch***@keristor.net> writes:
[...]

How about straight assignment? If I do:

int test(int i)
{
    unsigned int u;
    u = i;
    return (int)u;
}

is that defined? I have a suspicion that it isn't (because although u
can represent a negative value of i in some implementation-defined way,
the conversion back to an int may not be able to represent that unsigned
value).

If i is non-negative, there's no problem at all.

If i is negative, the conversion from signed to unsigned depends on
the implementation-defined value of UINT_MAX, but is otherwise defined
by the language (even on systems that use a representation other than
2's-complement).

But the conversion from unsigned to signed, if the result can't be
represented, either yields an implementation-defined result or raises
an implementation-defined signal (no nasal demons in this case). Note
that "implementation-defined" means the implementation has to document
its behavior.

More usefully, is there any real computer and C compiler on which it
will fail in practice?

If by "fail" you mean doing something other than returning the
original value of i, I don't know.
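
Code that wants to avoid the implementation-defined step altogether can undo the conversion arithmetically. The sketch below is one way to do that; `unsigned_to_int` is a hypothetical name, and the approach assumes UINT_MAX == 2 * INT_MAX + 1, which is common but not guaranteed by the Standard.

```c
#include <limits.h>

/* Hypothetical helper: recover the int that was converted to unsigned,
   without relying on the implementation-defined unsigned-to-int
   conversion.  Assumes UINT_MAX == 2 * INT_MAX + 1 (an assumption,
   not a guarantee of the Standard). */
int unsigned_to_int(unsigned u)
{
    if (u <= (unsigned)INT_MAX)
        return (int)u;              /* value is representable as-is */
    /* u stands for the negative value u - (UINT_MAX + 1). */
    return -(int)(UINT_MAX - u) - 1;
}
```

Every cast here converts a value that the target type can represent, so only the well-defined conversion rules are exercised.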

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #16

Chris Croughton

On Wed, 05 Jan 2005 17:33:54 GMT, CBFalconer
<cb********@yahoo.com> wrote:

Chris Croughton wrote:

More usefully, is there any real computer and C compiler on which it
will fail in practice?

Define fail.

Not give the same number in the signed variable as it started with.
I.e. is there any real world situation where

i != (int)(unsigned int)i

is true (for i of type int)? Has there ever been such a system? Would
anyone in their right mind make such a system, and would anyone buy it
if they did?

Remember, these casts are value transformations, not
bodily transference of bit patterns. Your attitude reminds me of
Microsoft, who at least have the excuse that all their software
runs on the same hardware family. That attitude will lead to
similar reliability.

There are probably thousands if not millions of programs out there which
already do that sort of thing. It's certainly not limited to one
processor family, every processor I have ever heard of has the property
that an unsigned variable can act as a container for the signed value of
the same size, and can restore it. Quite a lot of printf code would
fail, for a start, as would a lot of code using variadic functions...

Have you ever heard of any processor where that wasn't true?

(For that matter, is anyone still producing machines with 1's complement
or sign+magnitude integers?)

Chris C

Nov 14 '05 #17

Eric Sosman

Chris Croughton wrote:

[... int i = -42; assert(i == (int)(unsigned)i); ...]

Have you ever heard of any processor where that wasn't true?

(For that matter, is anyone still producing machines with 1's complement
or sign+magnitude integers?)

I've personally used ones' complement and signed magnitude
(decimal!) computers. Admittedly, it was a while ago. But the
fact that I've seen them dwindle away doesn't suggest to me that
they're gone forever; rather, it suggests that techniques in the
computer industry are subject to change. The assumption that
things will remain forever the way they happen to be today has
not been tenable up to now; are you confident that Change has
come to its end?

Or as a colleague of mine likes to say, "We work in a
fashion-driven industry."

Preparing for every conceivable raising or lowering of
computers' hemlines carries a cost, and failing to be prepared
carries a risk. In the context of a given project you may
well decide that the risk is too small and the cost too large.
That's fine; that's part of what engineering is about. But an
implicit decision that all risks are zero is just as foolhardy
as a decision that all costs are justified.

BTW, what are your thoughts on the C0x Committee's decision
to allow balanced ternary integers? ;-)

--
Er*********@sun.com

Nov 14 '05 #18

Chris Croughton

On Wed, 05 Jan 2005 16:57:30 -0500, Eric Sosman
<er*********@sun.com> wrote:

I've personally used ones' complement and signed magnitude
(decimal!) computers.

I've used a 1's-C computer and a signed magnitude one -- almost 30 years
ago! I haven't used or even seen either since the late '70s, though.
Do Burroughs still make computers? I think theirs was the S/M one.

Admittedly, it was a while ago. But the
fact that I've seen them dwindle away doesn't suggest to me that
they're gone forever; rather, it suggests that techniques in the
computer industry are subject to change. The assumption that
things will remain forever the way they happen to be today has
not been tenable up to now; are you confident that Change has
come to its end?

Any change which means that a signed value can't be cast to its unsigned
equivalent and then back would, I think, break a lot of code. Yes,
things might change (like using balanced ternary) but it would break an
enormous amount of code. The change from BCD to binary was bad enough
(and there are still BCD based languages).

Or as a colleague of mine likes to say, "We work in a
fashion-driven industry."

After a fashion <g>. Although some of the 'fashions' have lasted a long
time for such things, the 8 bit byte for instance (I suspect that almost
all of several major operating systems plus their utilities would have
to be rewritten for a different-sized byte). Even Unicode is only
gradually becoming accepted.

Preparing for every conceivable raising or lowering of
computers' hemlines carries a cost, and failing to be prepared
carries a risk. In the context of a given project you may
well decide that the risk is too small and the cost too large.
That's fine; that's part of what engineering is about. But an
implicit decision that all risks are zero is just as foolhardy
as a decision that all costs are justified.

I didn't say anything about such a decision. I'm looking at risk
assessment -- is it really worth writing code which will be inefficient
and hard to maintain in order to cope with a possible hole which the
standard allows but no one is likely to implement that way? Is the
probability of someone producing a system which breaks a lot of code
higher than that of the next C standard breaking code? (Anyone who used
a variable called 'restrict' or 'inline' will have run foul of that in
C99).

BTW, what are your thoughts on the C0x Committee's decision
to allow balanced ternary integers? ;-)

I'd like to see how they propose to square it with all the references to
'binary' in the C specification <g>. Yes, it is possible to emulate bit
operations using b-tits[1] but C as we know it would not be an efficient
language to program such a machine...

[1] Ternary digits ought to be called tits. If they aren't someone was
slipping when they were named[2]...

[2] Robert A. Heinlein used ternary in some of his futuristic computers.
Knowing his proclivities, how did he miss calling them tits?

Chris C

Nov 14 '05 #19

Keith Thompson

Chris Croughton <ch***@keristor.net> writes:
[...]

[1] Ternary digits ought to be called tits. If they aren't someone was
slipping when they were named[2]...

[2] Robert A. Heinlein used ternary in some of his futuristic computers.
Knowing his proclivities, how did he miss calling them tits?

They're actually called "trits".

<http://www.catb.org/~esr/jargon/html/T/trit.html>

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #20

Eric Sosman

Chris Croughton wrote:

On Wed, 05 Jan 2005 16:57:30 -0500, Eric Sosman
<er*********@sun.com> wrote:
[...]
Preparing for every conceivable raising or lowering of
computers' hemlines carries a cost, and failing to be prepared
carries a risk. In the context of a given project you may
well decide that the risk is too small and the cost too large.
That's fine; that's part of what engineering is about. But an
implicit decision that all risks are zero is just as foolhardy
as a decision that all costs are justified.

I didn't say anything about such a decision. I'm looking at risk
assessment -- is it really worth writing code which will be inefficient
and hard to maintain in order to cope with a possible hole which the
standard allows but no one is likely to implement that way? Is the
probability of someone producing a system which breaks a lot of code
higher than that of the next C standard breaking code? (Anyone who used
a variable called 'restrict' or 'inline' will have run foul of that in
C99).

You are asking the right questions, but I doubt there is
a single universally correct answer to any of them. All must
be considered in the context of the project at hand; all are
meaningless without that context. This means that a careful
programmer (you, for example) will give different answers at
different times.

[1] Ternary digits ought to be called tits. If they aren't someone was
slipping when they were named[2]...

They weren't keeping abreast of developments ...

If C adopts ternary, perhaps memmove() should be renamed
to mammove() ...

... and we'll all start worshipping a bust of Denise
Writchie's bust ...

--
Eric Sosman
es*****@acm-dot-org.invalid

Nov 14 '05 #21

Andrey Tarasevich

Peter Nilsson wrote:

...
Unsigned short has scope for problems...

unsigned short us = -1;
us++; /* UB if USHRT_MAX == INT_MAX */

??? Why? I don't see any potential for UB here, regardless of the
USHRT_MAX value.

The paranoid can safely use...

us += 1u;

I don't see how it is different.

--
Best regards,
Andrey Tarasevich

Nov 14 '05 #22

Keith Thompson

Andrey Tarasevich <an**************@hotmail.com> writes:

Peter Nilsson wrote:
...
Unsigned short has scope for problems...

unsigned short us = -1;
us++; /* UB if USHRT_MAX == INT_MAX */

??? Why? I don't see any potential for UB here, regardless of the
USHRT_MAX value.

us++;
does the equivalent to
us = us + 1;
(and then yields the value of us before it was incremented).

In the expression (us + 1), the value of us (which is of type unsigned
short) is promoted to int (if int can hold all the values of type
unsigned short) or to unsigned int (otherwise).

Normally, if int is bigger than short, the value will be promoted to
int and the addition will not overflow; if int is the same size as
short, the value will be promoted to unsigned int and the addition
will wrap around to 0.

But if USHRT_MAX == INT_MAX, the value of us will be promoted to type
int, resulting in a value of INT_MAX, and INT_MAX + 1 will overflow
and cause undefined behavior.

I doubt that any real-world implementations are affected by this.
Type int would have to be effectively only 1 bit wider than type
short, which could only happen if there are padding bits.

The paranoid can safely use...

us += 1u;

I don't see how it is different.

In the expression us + 1u, us will be promoted to unsigned int even if
USHRT_MAX == INT_MAX, avoiding the undefined behavior.
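
The difference can be packaged as a small function. This is a sketch only, and `incr_ushort` is a hypothetical name introduced for illustration.

```c
#include <limits.h>

/* Hypothetical helper: increment an unsigned short using an unsigned
   int operand, so the addition is carried out in unsigned int and can
   never overflow a signed int. */
unsigned short incr_ushort(unsigned short us)
{
    us += 1u;   /* us + 1u has type unsigned int; wraps, never traps */
    return us;
}
```

Assigning the unsigned int sum back to us reduces it modulo USHRT_MAX + 1, so incr_ushort(USHRT_MAX) is 0 on every conforming implementation.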

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #23

Andrey Tarasevich

Keith Thompson wrote:

...
Unsigned short has scope for problems...

unsigned short us = -1;
us++; /* UB if USHRT_MAX == INT_MAX */

??? Why? I don't see any potential for UB here, regardless of the
USHRT_MAX value.

us++;
does the equivalent to
us = us + 1;
(and then yields the value of us before it was incremented).

In the expression (us + 1), the value of us (which is of type unsigned
short) is promoted to int (if int can hold all the values of type
unsigned short) or to unsigned int (otherwise).

Oh, I see. Thank you for the explanation.

As a side note, to me the whole thing feels like a defect: the committee
took a "shortcut" by defining the increment and compound assignment
operators through their "regular" binary counterparts (instead of
providing independent definitions for them) and ended up with this as an
unintentional side-effect. On the other hand, I could be wrong, since
this specification is consistent with C89/90.

--
Best regards,
Andrey Tarasevich

Nov 14 '05 #24

Christian Bau

In article <10*************@news.supernews.com>,
Andrey Tarasevich <an**************@hotmail.com> wrote:

Keith Thompson wrote:
...
Unsigned short has scope for problems...

unsigned short us = -1;
us++; /* UB if USHRT_MAX == INT_MAX */

??? Why? I don't see any potential for UB here, regardless of the
USHRT_MAX value.

us++;
does the equivalent to
us = us + 1;
(and then yields the value of us before it was incremented).

In the expression (us + 1), the value of us (which is of type unsigned
short) is promoted to int (if int can hold all the values of type
unsigned short) or to unsigned int (otherwise).

Oh, I see. Thank you for the explanation.

As a side note, to me the whole thing feels like a defect: the committee
took a "shortcut" by defining the increment and compound assignment
operators through their "regular" binary counterparts (instead of
providing independent definitions for them) and ended up with this as an
unintentional side-effect. On the other hand, I could be wrong, since
this specification is consistent with C89/90.

On the other hand, would you have wanted "us++" to behave in any way
different from "us = us + 1"?

Nov 14 '05 #25

Andrey Tarasevich

Christian Bau wrote:

...
As a side note, to me the whole thing feels like a defect: the committee
took a "shortcut" by defining the increment and compound assignment
operators through their "regular" binary counterparts (instead of
providing independent definitions for them) and ended up with this as an
unintentional side-effect. On the other hand, I could be wrong, since
this specification is consistent with C89/90.

On the other hand, would you have wanted "us++" to behave in any way
different from "us = us + 1"?

Well, probably not. But if I'd continue to insist that this is a
"defect" I'd have to come to a logical conclusion that the real root of
the problem is not in the way the increment is defined, but rather in
the very existence of implicit integral promotions and/or in the way
these promotions behave. I don't have any desire to make any attacks on
integral promotions at the moment :), particularly because I don't
remember the rationale that led to them being defined the way they are.

--
Best regards,
Andrey Tarasevich

Nov 14 '05 #26

Chris Torek

Given a situation in which USHRT_MAX may equal INT_MAX:

Keith Thompson wrote:
us++;
does the equivalent to
us = us + 1;
(and then yields the value of us before it was incremented).

Now suppose "us" is initially set to USHRT_MAX, and USHRT_MAX
is indeed equal to INT_MAX.

In the expression (us + 1), the value of us (which is of type unsigned
short) is promoted to int (if int can hold all the values of type
unsigned short) or to unsigned int (otherwise).

And thus the signed int, whose value is INT_MAX, has 1 added
to it. The effect is undefined, and possibly a runtime trap,
which is quite undesirable.

In article <10*************@news.supernews.com>
Andrey Tarasevich <an**************@hotmail.com> wrote:

Oh, I see. Thank you for the explanation.

As a side note, to me the whole thing feels like a defect: the committee
took a "shortcut" by defining the increment and compound assignment
operators through their "regular" binary counterparts (instead of
providing independent definitions for them) and ended up with this as an
unintentional side-effect. On the other hand, I could be wrong, since
this specification is consistent with C89/90.

The real flaw, in my opinion, is that the original ANSI C (C89)
committee decided on this bizarre, inconsistent widening treatment of
types in the first place: when an unsigned type is widened, if the
wider signed type can represent all values of the narrower unsigned
type, the resulting type is signed, otherwise the resulting type
is unsigned.

In other words, on a 16-bit implementation, "unsigned short" widens
to "unsigned int", because USHRT_MAX (65535) exceeds INT_MAX (32767)
but not UINT_MAX (65535). On a 32-bit implementation on otherwise
similar (or even identical) hardware, "unsigned short" widens to
*signed* int, because USHRT_MAX (65535) is much less than INT_MAX
(2147483647). Sometimes this makes the code behave differently
on the two compilers, even if they use the same hardware:

unsigned short us = 0;
...
if ((us - 1) > 1)
...

Here, if unsigned short is 16 bits and plain int is also 16 bits
(Compiler A), "us - 1" is (unsigned int)65535, which is greater
than 1. But if unsigned short is 16 bits and plain int is 32 bits
(Compiler B, on the same hardware), "us - 1" is (signed int)-1,
which is less than 1.
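
A programmer who wants the Compiler A result everywhere can ask for unsigned arithmetic explicitly. The two functions below are a sketch contrasting the spellings; both names are hypothetical.

```c
/* Hypothetical helpers contrasting two spellings of "us - 1 > 1". */

/* Result depends on whether unsigned short promotes to int or to
   unsigned int, so it can differ between implementations. */
int exceeds_one_promoted(unsigned short us)
{
    return (us - 1) > 1;
}

/* Forces unsigned int arithmetic: for us == 0 the difference is
   UINT_MAX on every implementation, so the test is true. */
int exceeds_one_unsigned(unsigned short us)
{
    return ((unsigned)us - 1u) > 1u;
}
```

With us == 0, exceeds_one_unsigned returns 1 under both compilers, while exceeds_one_promoted returns 1 under Compiler A and 0 under Compiler B.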

The C89 Rationale called such situations "questionably signed";
the theory is that it is hard to tell what the programmer intended
in the first place. So they came up with these so-called "value
preserving" rules. The problem is that, in the presence of any
kind of arithmetic, the value they preserve depends on the relative
values of USHRT_MAX vs INT_MAX (or UINT_MAX vs LONG_MAX, and so
on).

The alternative to "value-preserving" is the so-called "sign
preserving" or "unsigned preserving" rule. This is what Unix-based
systems actually did, and it is CLEARLY (note opinion :-) ) the
better method BY FAR, because it does not require comparing USHRT_MAX
and INT_MAX at all. Instead, a narrow unsigned type *always* widens
to the wider unsigned type.

Note that this completely solves the issue at hand, because then
the fact that "++us" accomplishes the same thing as "us = us + 1"
is not a problem: "us" expands to unsigned int, which has the usual
clock-arithmetic semantics and either goes from 65535 to 65536 or
goes from 65535 to 0 (as appropriate), and then that value is put
back into "us", which always produces 0.

(As far as I can tell, there is only one drawback to "unsigned
preserving" behavior, and that is what happens if plain char is
unsigned. This problem can be solved by fiat: we already know that
the I/O library is problematic if UCHAR_MAX > INT_MAX, so we can
simply rule that UCHAR_MAX < INT_MAX and, if necessary [and I am
not sure whether it is], that plain char violates the "unsigned
preserving" behavior and widens to signed int.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Nov 14 '05 #27
