"Might be undefined" Behaviour

Frederick Gotham

I have a general idea of the different kinds of behaviour described by the
C Standard, such as:

(1) Well-defined behaviour:

int a = 2, b = 3;

int c = a + b;

(Jist: The code will work perfectly.)
(2) Implementation-defined behaviour

unsigned i = -1;

if (i > 65535) DoSomething();
else DoSomethingElse();

(Jist: Different things can happen on different platforms, but the
program shouldn't crash.)
(3) Unspecified Behaviour

int i = Func1() + Func2();

(Jist: We don't know which function is called first.)
(4) Undefined Behaviour

int i = INT_MAX;

++i;

(The implementation can do whatever it likes, and the program may very
well crash.)

I'm looking for a term though to describe a code snippet which _might_
invoke undefined behaviour. Here's an example:

int i = 32767;

++i;

Given the minimum range of "int", this code may fail on some systems and
succeed on others. I hesitate though to simply label it as "undefined
behaviour". How would I describe a code snippet which may invoke undefined
behaviour depending on implementation-specific details?

--

Frederick Gotham

Nov 21 '06 #1

Eric Sosman

Frederick Gotham wrote On 11/21/06 16:43,:

I have a general idea of the different kinds of behaviour described by the
C Standard, such as:

(1) Well-defined behaviour:

int a = 2, b = 3;

int c = a + b;

(Jist: The code will work perfectly.)

"Gist."

(2) Implementation-defined behaviour

unsigned i = -1;

if (i > 65535) DoSomething();
else DoSomethingElse();

(Jist: Different things can happen on different platforms, but the
program shouldn't crash.)
(3) Unspecified Behaviour

int i = Func1() + Func2();

(Jist: We don't know which function is called first.)
(4) Undefined Behaviour

int i = INT_MAX;

++i;

(The implementation can do whatever it likes, and the program may very
well crash.)

I'm looking for a term though to describe a code snippet which _might_
invoke undefined behaviour. Here's an example:

int i = 32767;

++i;

Given the minimum range of "int", this code may fail on some systems and
succeed on others. I hesitate though to simply label it as "undefined
behaviour". How would I describe a code snippet which may invoke undefined
behaviour depending on implementation-specific details?

"It is implementation-defined whether the behavior is
defined or undefined." Note that mere "implementation-defined"
doesn't quite cover the situation, because an implementation
where INT_MAX==32767 is not required to define the behavior of
this code.

Perhaps we could abbreviate this notion with the phrase
"implementation-undefined."

--
Er*********@sun.com

Nov 21 '06 #2

CBFalconer

Frederick Gotham wrote:

... snip ...

I'm looking for a term though to describe a code snippet which
_might_ invoke undefined behaviour. Here's an example:

int i = 32767;

++i;

Given the minimum range of "int", this code may fail on some
systems and succeed on others. I hesitate though to simply label
it as "undefined behaviour". How would I describe a code snippet
which may invoke undefined behaviour depending on implementation-
specific details?

#include <limits.h>

....

if (INT_MAX == i) overflowerror();
else ++i;

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Nov 21 '06 #3

Peter Nilsson

Frederick Gotham wrote:

I have a general idea of the different kinds of behaviour described by the
C Standard, such as:

(1) Well-defined behaviour:

int a = 2, b = 3;

int c = a + b;

(Jist: The code will work perfectly.)

Your next example is perfectly well defined too IMO.

The standard defines the term strictly conforming _program_. Though it
qualifies it as _output_ behaviour. For example, the following is a
strictly conforming program even though some of the operations use
implementation defined or unspecified values and results...

#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned x = UINT_MAX;
    unsigned y = -1u / 2;
    unsigned z = x / y;
    printf("The answer to life, the universe and everything is... ");
    printf("%u\n", z * 21);
    return 0;
}

(2) Implementation-defined behaviour

unsigned i = -1;
if (i > 65535) DoSomething();
else DoSomethingElse();

An implementation is not required to document the behaviour of that
piece of code. Note that the standard already documents the behaviour
of the first declaration and the behaviour of the if conditional and
statement as a whole.

(Jist: Different things can happen on different platforms, but the
program shouldn't crash.)

Implementation defined for me means something like: (-5 >> 1)

(3) Unspecified Behaviour

int i = Func1() + Func2();

(Jist: We don't know which function is called first.)

(4) Undefined Behaviour

int i = INT_MAX;
++i;

Note that there is a stronger class of undefined behaviour, namely
constraint violations. [In a clc thread a while back, committee members
stated that the behaviour was undefined even if the standard does in fact
specify the behaviour in normative text outside of the constraint.]

(The implementation can do whatever it likes, and the program may very
well crash.)

I'm looking for a term though to describe a code snippet which _might_
invoke undefined behaviour. Here's an example:

int i = 32767;
++i;

Given the minimum range of "int", this code may fail on some systems and
succeed on others. I hesitate though to simply label it as "undefined
behaviour".

How would I describe a code snippet which may invoke undefined
behaviour depending on implementation-specific details?

Schrödinger C?

Potentially undefined behaviour?

"Not portable" seems to be quite common.

--
Peter

Nov 22 '06 #4

Christopher Benson-Manica

Frederick Gotham <fg*******@spam.com> wrote:

I have a general idea of the different kinds of behaviour described by the
C Standard, such as:

(snip)

http://c-faq.com/ansi/undef.html

--
C. Benson Manica | I *should* know what I'm talking about - if I
cbmanica(at)gmail.com | don't, I need to know. Flames welcome.

Nov 22 '06 #5

Frederick Gotham

Peter Nilsson:

The standard defines the term strictly conforming _program_. Though it
qualifies it as _output_ behaviour.

I realise that.

For example, the following is a strictly conforming program even though
some of the operations use implementation defined or unspecified values
and results...

#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned x = UINT_MAX;
    unsigned y = -1u / 2;
    unsigned z = x / y;
    printf("The answer to life, the universe and everything is... ");
    printf("%u\n", z * 21);
    return 0;
}

You sure about that? My own understanding of a "strictly conforming
program" is a program whose output is identical on every implementation. An
example would be:

#include <stdio.h>

int main(void) { printf("%d", 42); }

A counter-example would be:

#include <limits.h>
#include <stdio.h>

int main(void) { printf("%d", INT_MAX); }

(By the way, I don't see why we're talking about "strictly conforming
programs".)

(2) Implementation-defined behaviour

unsigned i = -1;
if (i > 65535) DoSomething();
else DoSomethingElse();

An implementation is not required to document the behaviour of that
piece of code.

Yes, it is. It must document the range of "unsigned int", thus indicating
which leg of the "if" statement will be executed.

--

Frederick Gotham

Nov 22 '06 #6

Richard Heathfield

Frederick Gotham said:

Peter Nilsson:

For example, the following is a strictly conforming program even though
some of the operations use implementation defined or unspecified values
and results...

#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned x = UINT_MAX;
    unsigned y = -1u / 2;
    unsigned z = x / y;
    printf("The answer to life, the universe and everything is... ");
    printf("%u\n", z * 21);
    return 0;
}

You sure about that? My own understanding of a "strictly conforming
program" is a program whose output is identical on every implementation.

...such as the one quoted above, for example. :-)

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Nov 22 '06 #7

Clark S. Cox III

Frederick Gotham wrote:

Peter Nilsson:

The standard defines the term strictly conforming _program_. Though it
qualifies it as _output_ behaviour.

I realise that.

For example, the following is a strictly conforming program even though
some of the operations use implementation defined or unspecified values
and results...

#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned x = UINT_MAX;
    unsigned y = -1u / 2;
    unsigned z = x / y;
    printf("The answer to life, the universe and everything is... ");
    printf("%u\n", z * 21);
    return 0;
}

You sure about that? My own understanding of a "strictly conforming
program" is a program whose output is identical on every implementation.

The program above *does* have output that is identical on every platform.

--
Clark S. Cox III
cl*******@gmail.com

Nov 22 '06 #8

Frederick Gotham

Richard Heathfield:

You sure about that? My own understanding of a "strictly conforming
program" is a program whose output is identical on every implementation.

...such as the one quoted above, for example. :-)

I read as far as the following line:

unsigned y = -1u / 2;

, and I didn't know what happened when the negation operator was applied to
an unsigned integer type, so I gave up!

I'll take your word for it that the output is identical on every
implementation.

--

Frederick Gotham

Nov 22 '06 #9

Richard Heathfield

Frederick Gotham said:

Richard Heathfield:

You sure about that? My own understanding of a "strictly conforming
program" is a program whose output is identical on every implementation.

...such as the one quoted above, for example. :-)

I read as far as the following line:

unsigned y = -1u / 2;

, and I didn't know what happened when the negation operator was applied
to an unsigned integer type, so I gave up!

I'll take your word for it that the output is identical on every
implementation.

Oh, no no no no no, don't do that. Let's demonstrate it. It's much more
educational.

Firstly, we need to know that C does arithmetic on unsigned integer types
like this: let's say the unsigned type can represent values in the range 0
to N-1 (i.e. N different values, so it has log2(N) bits). And let's say
that we try to stick a value into it that isn't in that range. Well, what
happens is that, if it's below the range, you can (conceptually) keep
adding N to the value over and over until it hits the range, and if it's above
the range, you can keep subtracting N until, again, it hits the range. This
is called "reducing the value modulo N" (even though it might mean
increasing the value!).

So -1u actually becomes -1u + N (where, of course, N is (UINT_MAX + 1) in
this case). -1 + UINT_MAX + 1 is, naturally enough, UINT_MAX, and so in our
example program y ends up with UINT_MAX / 2, which is going to be no more
than half the value of UINT_MAX (and it could be a smidgen less, of
course). Clearly, this will divide into UINT_MAX twice (possibly leaving a
small remainder which will be lost because of integer division rules), so
the program calculates the value 2, irrespective of the value of UINT_MAX.

And then of course it prints a value equal to 21 times this result - on any
hosted implementation.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Nov 22 '06 #10

Harald van Dĳk

Frederick Gotham wrote:

Richard Heathfield:

You sure about that? My own understanding of a "strictly conforming
program" is a program whose output is identical on every implementation.
...such as the one quoted above, for example. :-)

I read as far as the following line:

unsigned y = -1u / 2;

, and I didn't know what happened when the negation operator was applied to
an unsigned integer type, so I gave up!

I'll take your word for it that the output is identical on every
implementation.

Unsigned arithmetic is performed modulo TYPE_MAX+1, so -1u is simply
UINT_MAX. If I recall correctly, you've used (unsigned char) -1 as an
alternative to UCHAR_MAX in the past. The principle here is the same.

Nov 22 '06 #11

Frederick Gotham

Richard Heathfield:

So -1u actually becomes -1u + N (where, of course, N is (UINT_MAX + 1)
in this case).

Let's say that:

sizeof(long) > sizeof(int)

And let's say that we want to store the max value for an "unsigned int"
inside a long.

We could write:

long i = UINT_MAX;

or:

long i = (unsigned)-1;

or:

long i = -1U;

That right? Should the third one be interpreted as:

(1) Take the R-value:

Type: int unsigned
Value: 1

(2) Now let's go to our magical world of omnipotent mathematics, where
nothing overflows and where any number can be represented. In this magical
world, we negate the number 1, yielding -1.

(3) Now let's get back to the real world of C and try to store -1 in an
unsigned int.

Would that sound about right? How about trying to store the max value for
an unsigned short in a long? Would I be right in thinking that the
following would _not_ be OK:

long i = -(short unsigned)1;

Reason being that, before it's negated, it's probably promoted to "int",
yielding an actual -1 which is then stored in the long? Exactly like:

long i = -(int)(short unsigned)1;

Even if it promoted to unsigned int rather than signed int, it still
wouldn't yield the max value for an unsigned short, right? Because it would
be as follows:

long i = -(unsigned)(short unsigned)1;

This would give us the max value for an unsigned int rather than an
unsigned short, right? (Yes I realise they might be equal.)

So what have I learned? Well, it doesn't seem like I'd ever find the need
to negate an unsigned integer.

--

Frederick Gotham

Nov 22 '06 #12

Richard Heathfield

Frederick Gotham said:

<snip>

So what have I learned? Well, it doesn't seem like I'd ever find the need
to negate an unsigned integer.

<shrug> -1 is a kind of shorthand, an idiom, for "largest possible value of
this (unsigned) type". Nobody's forcing you to use it. You may, however,
come across it when reading other people's code, so it's as well to be
aware of it.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Nov 22 '06 #13

Frederick Gotham

Richard Heathfield:

So what have I learned? Well, it doesn't seem like I'd ever find the need
to negate an unsigned integer.

<shrug> -1 is a kind of shorthand, an idiom, for "largest possible value of
this (unsigned) type". Nobody's forcing you to use it. You may, however,
come across it when reading other people's code, so it's as well to be
aware of it.

Ah yes, I've no problem with using negative values to achieve a particular
unsigned value, such as:

size_t i = -1;

But this is because the literal, 1, is signed rather than unsigned. It's
equivalent to:

size_t i = -(int)1;

Note that it doesn't negate an unsigned integer type object.

The following, however, _does_ negate an unsigned integer type object:

size_t i = 1;

size_t len = -i;

It just doesn't look right to me...

--

Frederick Gotham

Nov 22 '06 #14

CBFalconer

Frederick Gotham wrote:

Richard Heathfield:

You sure about that? My own understanding of a "strictly conforming
program" is a program whose output is identical on every implementation.

...such as the one quoted above, for example. :-)

I read as far as the following line:

unsigned y = -1u / 2;

, and I didn't know what happened when the negation operator was applied to
an unsigned integer type, so I gave up!

That results in the constant UINT_MAX, which is odd. Division by 2
results in (UINT_MAX - 1) / 2. The result is system dependent.
The final result (UINT_MAX / y) is always 2. The printf emits
Richard's favorite value, 42. Straightforward.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Nov 22 '06 #15

Eric Sosman

Richard Heathfield wrote On 11/22/06 14:50,:

Frederick Gotham said:

<snip>

So what have I learned? Well, it doesn't seem like I'd ever find the need
to negate an unsigned integer.

<shrug> -1 is a kind of shorthand, an idiom, for "largest possible value of
this (unsigned) type". Nobody's forcing you to use it. You may, however,
come across it when reading other people's code, so it's as well to be
aware of it.

... and to pay attention to the details. I once spent
time chasing a bug whose root cause was (paraphrased)

unsigned long x = -1u;

... as a shorthand for "Fill `x' with 1-bits."

--
Er*********@sun.com

Nov 22 '06 #16

Keith Thompson

Frederick Gotham <fg*******@SPAM.com> writes:
[...]

Note that it doesn't negate an unsigned integer type object.

The following, however, _does_ negate an unsigned integer type object:

size_t i = 1;

size_t len = -i;

It just doesn't look right to me...

Well, it is. You might consider adjusting your expectations.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 22 '06 #17

Giorgio Silvestri

"Frederick Gotham" <fg*******@SPAM.com> wrote in message
news:mO*******************@news.indigo.ie...

Richard Heathfield:

So -1u actually becomes -1u + N (where, of course, N is (UINT_MAX + 1)
in this case).

Let's say that:

sizeof(long) > sizeof(int)

And let's say that we want to store the max value for an "unsigned int"
inside a long.

We could write:

long i = UINT_MAX;

or:

long i = (unsigned)-1;

or:

long i = -1U;

sizeof(long) > sizeof(int)

is not particularly interesting.

Probably you want:

LONG_MAX >= UINT_MAX

If you consider "padding bits" the following is possible:

sizeof(long) > sizeof(int)

and

LONG_MAX < UINT_MAX

--
Giorgio Silvestri
DSP/Embedded/Real Time OS Software Engineer

Nov 22 '06 #18

Frederick Gotham

Giorgio Silvestri:

sizeof(long) > sizeof(int)

is not particularly interesting.

Probably you want:

LONG_MAX >= UINT_MAX

If you consider "padding bits" the following is possible:

sizeof(long) > sizeof(int)

and

LONG_MAX < UINT_MAX

Yes I realise that. However, I've overstepped my year's quota for mentioning
IMAX_BITS.

--

Frederick Gotham

Nov 22 '06 #19

Mark McIntyre

On Wed, 22 Nov 2006 19:53:49 GMT, in comp.lang.c , Frederick Gotham
<fg*******@SPAM.com> wrote:

Ah yes, I've no problem with using negative values to achieve a particular
unsigned value, such as:

size_t i = -1;

but...

size_t i = 1;
size_t len = -i;

It just doesn't look right to me...

The two are absolutely identical, and any decent compiler would
optimise the latter into the former.

It's worth reviewing one's prejudices occasionally, in case they're
invalid.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Nov 22 '06 #20

Frederick Gotham

Eric Sosman:

I once spent
time chasing a bug whose root cause was (paraphrased)

unsigned long x = -1u;

... as a shorthand for "Fill `x' with 1-bits."

Just to make sure I understand here... What that line does is equivalent
to:

unsigned const one_unsigned = 1;

long unsigned x = -one_unsigned;

In order to compute "-one_unsigned", we must traipse over to omnipotent
maths world. In omnipotent maths world, we calculate that the negative of 1
is -1. We then try to store -1 in an unsigned int. Therefore, we can expand
the code to:

unsigned const one_unsigned = 1;

unsigned const negative_of_one_unsigned = -one_unsigned;

long unsigned x = negative_of_one_unsigned;

And of course, this is equal to:

long unsigned x = UINT_MAX;

What the person _should_ have written is:

long unsigned x = ULONG_MAX;

or:

long unsigned x = -1;

Or, if the objective was to set all bits (including padding bits) to 1:

long unsigned x;
memset(&x,UCHAR_MAX,sizeof x);

Anyhow... this just goes to show how I'd never have a use for negating an
unsigned integer.

--

Frederick Gotham

Nov 22 '06 #21

Eric Sosman

Frederick Gotham wrote On 11/22/06 16:30,:

Eric Sosman:

I once spent
time chasing a bug whose root cause was (paraphrased)

unsigned long x = -1u;

... as a shorthand for "Fill `x' with 1-bits."

Just to make sure I understand here... What that line does is equivalent
to:

unsigned const one_unsigned = 1;

long unsigned x = -one_unsigned;

Yes. It's also equivalent to

unsigned int y = -1; /* an int's worth of 1-bits */
unsigned long x = y; /* 0's (maybe) and 1's */

An interesting observation on the original bug is that it
has two different one-character fixes, an insertion or a
deletion:

unsigned long x = -1uL; /* fix by inserting L */
unsigned long x = -1; /* fix by deleting u */

--
Er*********@sun.com

Nov 22 '06 #22

websnarf

Peter Nilsson wrote:

Frederick Gotham wrote:
I'm looking for a term though to describe a code snippet which _might_
invoke undefined behaviour. Here's an example:

int i = 32767;
++i;

Given the minimum range of "int", this code may fail on some systems and
succeed on others. I hesitate though to simply label it as "undefined
behaviour".

How would I describe a code snippet which may invoke undefined
behaviour depending on implementation-specific details?

Schrödinger C?

Potentially undefined behaviour?

But undefined behaviour itself means potentially random behaviour. So
you mean potentially potentially random behaviour?

"Not portable" seems to be quite common.

But that's a description for practically the whole language.

More accurate would be "not portably defined". Or if one is targeting
"C" (i.e., all C platforms simultaneously) then it should just be
called "undefined behaviour".

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 22 '06 #23

Old Wolf

Frederick Gotham wrote:

Or, if the objective was to set all bits (including padding bits) to 1:

long unsigned x;
memset(&x,UCHAR_MAX,sizeof x);

Can unsigned integers have padding bits?

Nov 23 '06 #24

CBFalconer

Old Wolf wrote:

Frederick Gotham wrote:

Or, if the objective was to set all bits (including padding bits)
to 1:

long unsigned x;
memset(&x,UCHAR_MAX,sizeof x);

Can unsigned integers have padding bits?

Yes

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Nov 23 '06 #25

Keith Thompson

"Old Wolf" <ol*****@inspire.net.nz> writes:

Frederick Gotham wrote:
Or, if the objective was to set all bits (including padding bits) to 1:

long unsigned x;
memset(&x,UCHAR_MAX,sizeof x);

Can unsigned integers have padding bits?

Yes (but unsigned char can't).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 23 '06 #26

pete

Frederick Gotham wrote:

(By the way, I don't see why we're talking about "strictly conforming
programs".)

Neither do I.
"Correct program" is the topic suggested
by the subject line of this thread: Re: "Might be undefined" Behaviour.

If the worst that you can say about a program's output
in terms of behavior,
is that it contains "unspecified behavior"
then you have a "correct program".

(32767 + 1) is undefined because there is no limit
imposed by the standard on the consequences
of the evaluation of that expression.

--
pete

Nov 29 '06 #27