Bytes | Developer Community
Does integer overflow cause undefined behaviour?

Hi all,

Does the following code invoke undefined behaviour ?

[z852378@premier dumpyard]$ cat a1.cc
#include <iostream>
#include <limits>

int main() {
int a = INT_MAX/2;
std::cout << "a is " << a << std::endl;
a = a * a;
std::cout << "a is " << a << std::endl;
return 0;
}
[z852378@premier dumpyard]$ g++ a1.cc
[z852378@premier dumpyard]$ ./a.out
a is 1073741823
a is -2147483647

thanks

May 19 '06 #1
dragoncoder wrote:
Does the following code invoke undefined behaviour ?
Yes. 5/5.
[z852378@premier dumpyard]$ cat a1.cc
#include <iostream>
#include <limits>

int main() {
int a = INT_MAX/2;
std::cout << "a is " << a << std::endl;
a = a * a;
std::cout << "a is " << a << std::endl;
return 0;
}
[z852378@premier dumpyard]$ g++ a1.cc
[z852378@premier dumpyard]$ ./a.out
a is 1073741823
a is -2147483647


V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
May 19 '06 #2
dragoncoder wrote:
Hi all,

Does the following code invoke undefined behaviour ?

[z852378@premier dumpyard]$ cat a1.cc
#include <iostream>
#include <limits>

int main() {
int a = INT_MAX/2;
std::cout << "a is " << a << std::endl;
a = a * a;
std::cout << "a is " << a << std::endl;
return 0;
}
[z852378@premier dumpyard]$ g++ a1.cc
[z852378@premier dumpyard]$ ./a.out
a is 1073741823
a is -2147483647


Yes, but nearly every existing implementation of C++ defines integer
overflow as harmless. And in fact the Standard even observes as much:
"Note: most existing implementations of C++ ignore integer overflows."
5.1/5.

Greg

May 19 '06 #3
Greg wrote:

Yes, but nearly every existing implementation of C++ defines integer
overflow as harmless.
Which has nothing to do with whether its behavior is undefined.
Undefined behavior is not necessarily harmful.
And in fact the Standard even observes as much:
"Note: most existing implementations of C++ ignore integer overflows."


Which may or may not be harmful, depending on what it means to ignore integer
overflows and on what the program does with the result. But it is still
undefined behavior, because the C++ standard doesn't require any
particular behavior in that case.
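
If you want to stay within defined behavior, you have to keep the overflow
from happening in the first place. A minimal sketch of one way to do that for
the original example (just an illustration; it assumes nothing beyond
<climits>, which is where INT_MAX actually lives):

#include <climits>   // INT_MAX, INT_MIN
#include <iostream>

// Stores a*b in result and returns true only when the product fits in an
// int; the tests use division, so they can never overflow themselves.
bool safe_mul(int a, int b, int& result)
{
    if (a > 0 && b > 0 && a > INT_MAX / b) return false;  // too large
    if (a > 0 && b < 0 && b < INT_MIN / a) return false;  // too small
    if (a < 0 && b > 0 && a < INT_MIN / b) return false;  // too small
    if (a < 0 && b < 0 && b < INT_MAX / a) return false;  // too large
    result = a * b;
    return true;
}

int main()
{
    int a = INT_MAX / 2;
    int r = 0;
    if (safe_mul(a, a, r))
        std::cout << "a*a is " << r << std::endl;
    else
        std::cout << "a*a would overflow int" << std::endl;  // taken here
    return 0;
}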

--

Pete Becker
Roundhouse Consulting, Ltd.
May 19 '06 #4
Pete Becker wrote:
Yes, but nearly every existing implementation of C++ defines integer
overflow as harmless.


Which has nothing to do with whether its behavior is undefined. Undefined
behavior is not necessarily harmful.


I thought the nearest toilet could explode.

Closer to the metal, I thought that certain bit patterns couldn't form valid
integers, and could cause a trap. Hence, I thought that overflowing an
integer could hit one of those bit patterns.

Is overflow implementation-defined as safe-to-read-but-garbage-bits? Or
might the nearest toilet.. you know..?

--
Phlip
http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!!
May 19 '06 #5
* Phlip:
Pete Becker wrote:
Yes, but nearly every existing implementation of C++ defines integer
overflow as harmless.

Which has nothing to do with whether its behavior is undefined. Undefined
behavior is not necessarily harmful.


I thought the nearest toilet could explode.

Closer to the metal, I thought that certain bit patterns couldn't form valid
integers, and could cause a trap. Hence, I thought that overflowing an
integer could hit one of those bit patterns.

Is overflow implementation-defined as safe-to-read-but-garbage-bits? Or
might the nearest toilet.. you know..?


Most C++ implementations employ two's complement representation of
signed integers, with no trapping.

That means arithmetic modulo 2^n, just as with unsigned integers.
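
For example (a minimal sketch, assuming a 32-bit int, which is typical but
not guaranteed):

#include <iostream>

int main()
{
    unsigned int u = 4294967295u;  // UINT_MAX for a 32-bit unsigned int
    u = u + 1;                     // well-defined: wraps to 0, modulo 2^32
    std::cout << "u is " << u << std::endl;

    int s = 2147483647;            // INT_MAX for a 32-bit int
    // s = s + 1;                  // formally undefined behavior; on the
                                   // usual two's complement, non-trapping
                                   // hardware it just wraps to -2147483648
    std::cout << "s is " << s << std::endl;
    return 0;
}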

Many, including me, have argued that the C++ standard should stop
supporting the ENIAC, MANIAC and NUSSE computers, and require two's
complement no-trapping representation. Which, for example, would make
std::clock much more useful (IIRC), and would make a very large part of
the existing C++ code base conforming. But ironically, since you have
to search very obscure corners of the universe to find a C++ compiler
that doesn't do that already, i.e. because it's already a de-facto
standard, because the existing practice is so overwhelmingly uniform,
there's not sufficient pressure to have it standardized...

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
May 20 '06 #6

Pete Becker wrote:
Greg wrote:

Yes, but nearly every existing implementation of C++ defines integer
overflow as harmless.


Which has nothing to do with whether its behavior is undefined.
Undefined behavior is not necessarily harmful.


Undefined behavior is the absence of a defined behavior (in this case,
by the C++ Standard) - but it is not an affirmative behavior in its own
right. And it is quite common for implementations or architectures to
define behavior not defined by the Standard. Dereferencing a NULL
pointer would be an obvious example of a behavior undefined by the
Standard - but which nonetheless has a behavior defined in almost every
implementation. Integer overflow would be another example of a
widely-defined operation.
And in fact the Standard even observes as much:
"Note: most existing implementations of C++ ignore integer overflows."


Which may or may not be harmful, depending on what it means to ignore integer
overflows and on what the program does with the result. But it is still
undefined behavior, because the C++ standard doesn't require any
particular behavior in that case.


An implementation is free to define its own behavior for behavior not
defined by the Standard (and nearly all do in this case). So there is
never a certainty that a behavior left undefined by the Standard will
also be undefined by the implementation or the architecture in a given
case.

Greg

May 20 '06 #7
> Many, including me, have argued that the C++ standard should stop
supporting the ENIAC, MANIAC and NUSSE computers, and require two's
complement no-trapping representation. Which, for example, would make
std::clock much more useful (IIRC), and would make a very large part of
the existing C++ code base conforming. But ironically, since you have
to search very obscure corners of the universe to find a C++ compiler
that doesn't do that already, i.e. because it's already a de-facto
standard, because the existing practice is so overwhelmingly uniform,
there's not sufficient pressure to have it standardized...


Sometimes this makes me think that we would need some sort of
"substandard" that "defines" some of those undefined behaviours - as
most code in fact needs to use some relatively harmless "undefineds"
to exist (e.g. memory allocator routines or GC), or can use them to
improve performance (memcpy of non-PODs) - and it
would still be quite an advantage for portability to know that a certain
platform supports such a "substandard".

Mirek
May 20 '06 #8

"Alf P. Steinbach" <al***@start.no> skrev i meddelandet
news:4d*************@individual.net...
* Phlip:
Pete Becker wrote:
Yes, but nearly every existing implementation of C++ defines
integer
overflow as harmless.
Which has nothing to do with whether its behavior is undefined.
Undefined behavior is not necessarily harmful.
I thought the nearest toilet could explode.

Closer to the metal, I thought that certain bit patterns couldn't
form valid integers, and could cause a trap. Hence, I thought that
overflowing an integer could hit one of those bit patterns.

Is overflow implementation-defined as
safe-to-read-but-garbage-bits? Or might the nearest toilet.. you
know..?


Most C++ implementations employ two's complement representation of
signed integers, with no trapping.

That means arithmetic modulo 2^n, just as with unsigned integers.

Many, including me, have argued that the C++ standard should stop
supporting the ENIAC, MANIAC and NUSSE computers, and require two's
complement no-trapping representation.

Or Univac / Unisys hardware, still in production, which still offers
one's complement 36 bit words, for backward compatibility.
Which, for example, would make std::clock much more useful (IIRC),
and would make a very large part of the existing C++ code base
conforming. But ironically, since you have to search very obscure
corners of the universe to find a C++ compiler that doesn't do that
already, i.e. because it's already a de-facto standard, because the
existing practice is so overwhelmingly uniform, there's not
sufficient pressure to have it standardized...


Don't you think it would have been hard for the C++ committee to ban
this kind of hardware, when these chapters were drafted around 1990?

I am right now working in a project, modifying communications with a
financial institution, which would allow them to shut down their
Unisys machines in september *this year*. If it doesn't work out as
well as planned, they might continue to run next year as well.
How good is a standard that makes C++ impossible to implement on that
kind of hardware? Just for this tiny detail.
Bo Persson
May 20 '06 #9
Phlip wrote:
Yes, but nearly every existing implementation of C++ defines integer
overflow as harmless.
Which has nothing to do with whether its behavior is undefined. Undefined
behavior is not necessarily harmful.

I thought the nearest toilet could explode.


I don't see how that could happen.

When the standard says that the behavior of a program that uses some
code construct is undefined it means that the standard doesn't tell you
what happens when you do it. That's all. It doesn't require that bad
things happen, and it certainly doesn't say what something that's
"harmless" actually does.
Closer to the metal, I thought that certain bit patterns couldn't form valid
integers, and could cause a trap. Hence, I thought that overflowing an
integer could hit one of those bit patterns.
Could be. The behavior of a code construct whose behavior is undefined
is undefined. <g>

Is overflow implementation-defined as safe-to-read-but-garbage-bits?


It's not implementation-defined behavior. It's undefined behavior. The
difference is that in the former case, the compiler's documentation has
to tell you what it does.

--

Pete Becker
Roundhouse Consulting, Ltd.
May 20 '06 #10
Mirek Fidler wrote:

Sometimes this makes me think that we would need some sort of
"substandard" that "defines" some of those undefined behaviours - as
most code in fact needs to use some relatively harmless "undefineds"
to exist (e.g. memory allocator routines or GC), or can use them to
improve performance (memcpy of non-PODs) - and it
would still be quite an advantage for portability to know that a certain
platform supports such a "substandard".


Sure, you can kill performance and make compiler writers work harder in
order to improve "portability" for a small fraction of the code that
people write. Java did it with their floating-point math, and had to
undo it.

--

Pete Becker
Roundhouse Consulting, Ltd.
May 20 '06 #11
Greg wrote:
Dereferencing a NULL
pointer would be an obvious example of a behavior undefined by the
Standard - but which nonetheless has a behavior defined in almost every
implementation.
Really? What can I expect to happen when I dereference a null pointer
with MSVC 7.1? And where is it documented? After I've done this, what is
the state of my program?
Integer overflow would be another example of a
widely-defined operation.

Really? How is it defined for MSVC 7.1? And where is it documented? Has
Microsoft promised that whatever this behavior is, it will never be changed?

I'm not disputing that you can often figure out what happens in
particular cases. But that's not a sufficient basis for saying that it's
well defined for that compiler. Unless the compiler's specification
tells you exactly what you can expect, you're guessing. That's often
appropriate, but it doesn't make what you're doing well defined.
And in fact the Standard even observes as much:
"Note: most existing implementations of C++ ignore integer overflows."
Which may or may not be harmful, depending on what it means to ignore integer
overflows and on what the program does with the result. But it is still
undefined behavior, because the C++ standard doesn't require any
particular behavior in that case.

An implementation is free to define its own behavior for behavior not
defined by the Standard (and nearly all do in this case).


Of course it is.
So there is
never a certainty that a behavior left undefined by the Standard will
also be undefined by the implementation or the architecture in a given
case.


Of course not. Nobody said otherwise.

--

Pete Becker
Roundhouse Consulting, Ltd.
May 20 '06 #12
* Bo Persson:

How good is a standard that makes C++ impossible to implement on that
kind of hardware [Univac / Unisys]?
These olde beasties are finally being shut down. And you think someone
is going to write a C++ program for them? Hah.

Just for this tiny detail.


The basic substrate the language is built in, does matter.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
May 20 '06 #13
Pete Becker wrote:
Greg wrote:
Dereferencing a NULL
pointer would be an obvious example of a behavior undefined by the
Standard - but which nonetheless has a behavior defined in almost every
implementation.

Really? What can I expect to happen when I dereference a null pointer
with MSVC 7.1? And where is it documented? After I've done this, what is
the state of my program?


Okay, bad example. <g> SEH is, of course, an abomination when mixed with
C++, but it's arguably well-defined.

A complete analysis would involve three steps:

1. Is there something reasonable that can be required in place of what's
currently undefined behavior?

2. Can the new behavior be implemented without imposing noticeable
overhead on programs that don't use it?

3. Is the new behavior useful?

Most discussions advocating making undefined behavior well defined
ignore number 3. That's the problem, for example, with the assertion
that integer overflow is usually harmless. It doesn't tell you what you
can do with it.

--

Pete Becker
Roundhouse Consulting, Ltd.
May 20 '06 #14

"Alf P. Steinbach" <al***@start.no> skrev i meddelandet
news:4d*************@individual.net...
* Bo Persson:

How good is a standard that makes C++ impossible to implement on
that kind of hardware [Univac / Unisys]?
These olde beasties are finally being shut down. And you think
someone is going to write a C++ program for them? Hah.


No, but I don't think we should ban implementations on alternate
hardware, by specifying the language in too much detail.

So, if we ban one's complement integers, what about 9 bit bytes, 36 or
48 bit words, non-IEEE floating point, or EBCDIC character sets?

On our mainframes, IBM had to add special Java processors to have it
run reasonably efficiently. Should we opt for C++ add-on hardware as
well, or should we allow it to run efficiently on existing machines?

Just for this tiny detail.


The basic substrate the language is built in, does matter.


Ok, so I find one sentence, "The representations of integral types
shall define values by use of a pure
binary numeration system.", out of a 1000 page document a tiny detail.
Bo Persson
May 20 '06 #15
* Bo Persson:
"Alf P. Steinbach" <al***@start.no> skrev i meddelandet
news:4d*************@individual.net...
* Bo Persson:
How good is a standard that makes C++ impossible to implement on
that kind of hardware [Univac / Unisys]?

These olde beasties are finally being shut down. And you think
someone is going to write a C++ program for them? Hah.


No, but I don't think we should ban implementations on alternate
hardware, by specifying the language in too much detail.

So, if we ban one's complement integers, what about 9 bit bytes, 36 or
48 bit words, non-IEEE floating point, or EBCDIC character sets?


Good question, because it illuminates the basic issue.

Namely, that

* It's not the case that support for deterministic, well-defined
behavior is in conflict with support for implementation-
defined behavior.

That conflict is one that's made up, out of thin air, by using words
such as "ban", or if not actually using them, thinking them.

One simply should not try to shoehorn conflicting behaviors into the
same, uh, gorble (I made up that word, the 'int' type is an example).

And the C solution for having guaranteed 8-bit bytes and so on, happily
coexisting with implementation defined sizes, is precisely to use
different gorbles for different behaviors: good old 'int' for
implementation-defined size, and types from <stdint.h> for fixed sizes.

On our mainframes, IBM had to add special Java processors to have it
run reasonably efficiently. Should we opt for C++ add-on hardware as
well, or should we allow it to run efficiently on existing machines?


See above: it's a false dichotomy.

Just for this tiny detail.

The basic substrate the language is built in, does matter.


Ok, so I find one sentence, "The representations of integral types
shall define values by use of a pure
binary numeration system.", out of a 1000 page document a tiny detail.


Ah, but you have essentially pointed out that a great many problems with
current C++, such as character set handling and guaranteed value ranges,
stem from C++ having adopted the same general solution everywhere:
shoehorning all those possible different behaviors into one gorble, and
then by necessity defining that gorble vaguely enough that in some
simple situations it works, and otherwise, just be non-portable...

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
May 20 '06 #16
"Alf P. Steinbach" <al***@start.no> wrote in message
news:4d*************@individual.net...
Most C++ implementations employ two's complement representation of signed
integers, with no trapping.

That means arithmetic modulo 2^n, just as with unsigned integers.

Many, including me, have argued that the C++ standard should stop
supporting the ENIAC, MANIAC and NUSSE computers, and require two's
complement no-trapping representation.


This is not the only issue.

I remember, from many years ago, a major argument that arose within AT&T
between two organizations, one of which was supplying a compiler and the
other of which was using it. The dispute arose around a statement of the
form

if ((x = y - z) < 0) { ... }

The compiler generated machine code that did the following:

Copy y into a register
Subtract z from the register
Copy the register into memory
If the condition code did not indicate that the last arithmetic
computation
yielded a negative result, jump over the code in braces.

Looks straightforward enough, right? But if y-z overflowed, the condition
code indicated that the result of the last operation was an overflow, which
is different from a negative result. Therefore the code in braces was not
executed, even if the result after the overflow happened to be negative.

In other words, if an overflow occurred, it was possible that x might be
negative, but the code in braces was still not executed.

The programmers tried to rewrite the code this way:

x = y - z;
if (x < 0) { ... }

and found that the compiler generated exactly the same code: The optimizer
realized that the if statement could be reached only from the assignment, so
it assumed that the hardware was correctly reflecting the results of the
last computation.

The developers claimed that this state of affairs reflected a compiler bug.
If you test whether a variable is negative, and the test fails, it is a bug
if the variable is subsequently found to be negative.

The compiler people claimed that the overflow resulted in undefined
behavior, so all bets were off. Moreover, if they were to fix this "bug",
it would result in generating a needless extra instruction in a wide variety
of contexts, merely to defend against programming techniques that no one
should be using anyway.

Ultimately, the compiler people won; and their philosophy has persisted to
this day.
May 20 '06 #17
Pete Becker wrote:
I thought the nearest toilet could explode.
I don't see how that could happen.


Be imaginative. An x10 "firecracker" controller and some plastic explosive?
Suppose we wrote the program to _prevent_ the controller from triggering,
such that certain undefined behaviors cause it to trigger. Don't answer;
this is not the point.
When the standard says that the behavior of a program that uses some code
construct is undefined it means that the standard doesn't tell you what
happens when you do it. That's all. It doesn't require that bad things
happen, and it certainly doesn't say what something that's "harmless"
actually does.


If we had a C++ compiler that had "ISO Compliant" stamped on its box, and if
we write a program and run it, and if the nearest toilet explodes, we might
trace this to either of two situations. Either we wrote undefined behavior
and the compiler took the option of derailing, or we wrote defined behavior
that exposed a bug in the compiler. So then the family of whoever was on the
toilet at the time will have these options:

- sue us, because the code was undefined, so the compiler
implementors are not to blame

- sue the compiler implementors, because the code was
defined, so we are not to blame

I'm aware this is a slightly different question. But isn't this an example
of the _legal_ implications of the ISO Standard?
Is overflow implementations-defined as safe-to-read-but-garbage-bits?


It's not implementation-defined behavior. It's undefined behavior. The
difference is that in the former case, the compiler's documentation has to
tell you what it does.


Yay. Now get on Alf's branch of the thread and argue the implementor should
not specify in their manual that 2s complement notation will lead to
such-and-so overflow situations. ;-)

--
Phlip
http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!!
May 20 '06 #18
* Andrew Koenig:
"Alf P. Steinbach" <al***@start.no> wrote in message
news:4d*************@individual.net...
Most C++ implementations employ two's complement representation of signed
integers, with no trapping.

That means arithmetic modulo 2^n, just as with unsigned integers.

Many, including me, have argued that the C++ standard should stop
supporting the ENIAC, MANIAC and NUSSE computers, and require two's
complement no-trapping representation.


This is not the only issue.

I remember, from many years ago, a major argument that arose within AT&T
between two organizations, one of which was supplying a compiler and the
other of which was using it. The dispute arose around a statement of the
form

if ((x = y - z) < 0) { ... }

The compiler generated machine code that did the following:

Copy y into a register
Subtract z from the register
Copy the register into memory
If the condition code did not indicate that the last arithmetic
computation
yielded a negative result, jump over the code in braces.

Looks straightforward enough, right? But if y-z overflowed, the condition
code indicated that the result of the last operation was an overflow, which
is different from a negative result. Therefore the code in braces was not
executed, even if the result after the overflow happened to be negative.

In other words, if an overflow occurred, it was possible that x might be
negative, but the code in braces was still not executed.

The programmers tried to rewrite the code this way:

x = y - z;
if (x < 0) { ... }

and found that the compiler generated exactly the same code: The optimizer
realized that the if statement could be reached only from the assignment, so
it assumed that the hardware was correctly reflecting the results of the
last computation.

The developers claimed that this state of affairs reflected a compiler bug.
If you test whether a variable is negative, and the test fails, it is a bug
if the variable is subsequently found to be negative.

The compiler people claimed that the overflow resulted in undefined
behavior, so all bets were off. Moreover, if they were to fix this "bug",
it would result in generating a needless extra instruction in a wide variety
of contexts, merely to defend against programming techniques that no one
should be using anyway.

Ultimately, the compiler people won; and their philosophy has persisted to
this day.


I quoted it in full because it's a nice story...

Assuming the language was C or C++, the problem here was that with
signed types the expression ((x = y - z) < 0) had, and has, undefined
behavior for certain values of y and z, so that /no matter/ what the
compiler folks did, they could end up being "wrong" wrt. the intention.

Given a language that left this as undefined behavior, the compiler
folks were IMO right to reject a request to define it, which would have
made the code depend on a particular compiler instead of a simple but ugly
rewrite using unsigned types and casts.
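
A sketch of that kind of rewrite (assuming the intent is the sign of the
wrapped, two's-complement-style result, which is what the original code was
in effect relying on):

#include <limits.h>

/* Well-defined stand-in for:  if ((x = y - z) < 0) { ... }
   The subtraction is done in unsigned arithmetic, which is required to
   wrap modulo 2^N instead of overflowing; "negative" then means the
   wrapped result lies in the upper half of the unsigned range. */
int diff_is_negative(int y, int z, int* x)
{
    unsigned ud = (unsigned)y - (unsigned)z;  /* never overflows */
    *x = (int)ud;  /* implementation-defined (not undefined) when
                      ud > INT_MAX; on two's complement machines it
                      stores the value the original code expected */
    return ud > (unsigned)INT_MAX;
}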

But the issue wouldn't have existed, if instead the expression was
well-defined for all values. ;-) Then, one could write natural code
that one could be sure was correct. Not merely fast.

So in spite of the compiler folks being right, the language definition
folks (a third group) were IMO not: with 20-20 hindsight we can now see
that they were optimizing prematurely and thereby fostering evil usage.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
May 21 '06 #19

"Alf P. Steinbach" <al***@start.no> skrev i meddelandet
news:4d*************@individual.net...
* Bo Persson:

So, if we ban one's complement integers, what about 9 bit bytes, 36
or 48 bit words, non-IEEE floating point, or EBCDIC character sets?
Good question, because it illuminates the basic issue.

Namely, that

* It's not the case that support for deterministic, well-defined
behavior is in conflict with support for implementation-
defined behavior.

That conflict is one that's made up, out of thin air, by using words
such as "ban", or if not actually using them, thinking them.


Here I think "ban" is a synonym for "making an implementation totally
inefficient".

Forcing 36 bit one's complement hardware to behave as if it were 32
bit two's complement is just an extreme example. Especially if the
problem manifests itself only in overflow situations.

Ah, but you have essentially pointed out that a great many problems
with current C++, such as character set handling and guaranteed
value ranges, stem from C++ having adopted the same general solution
everywhere: shoehorning all those possible different behaviors into
one gorble, and then by necessity defining that gorble vaguely
enough that in some simple situations it works, and otherwise, just
be non-portable...


This is another view on 'being flexible'.

I am sure the standards committee didn't make up these rules just for
fun, but were well aware that specifying seemingly tiny details would
make it impossible to implement the language on a wide range of
hardware. By not specifying some variations ('being vague'), you don't
limit yourself to:

8 bit bytes
2^n bytes per datatype
two's complement integers
no pad bits
'a' == 0x20
floating point with x bit exponent and y bit mantissa
non-segmented memory
byte addressed machines
trapping on dereferencing a null pointer
non-trapping on underflow
etc.
what happens on divide-by-zero?
what happens when using invalid pointers?
etc.
etc.
For each of these 'improvements' in portability, you get one tiny
detail that makes it hard to implement the language on one particular
piece of hardware. That might make the effects portable, but not the
run time performance if you have to compensate for your hardware
(unless you build a special Java^H^H^H^H C++ co-processor).
Bo Persson
May 21 '06 #20
* Bo Persson:
"Alf P. Steinbach" <al***@start.no> skrev i meddelandet
news:4d*************@individual.net...
* Bo Persson:
So, if we ban one's complement integers, what about 9 bit bytes, 36
or 48 bit words, non-IEEE floating point, or EBCDIC character sets?

Good question, because it illuminates the basic issue.

Namely, that

* It's not the case that support for deterministic, well-defined
behavior is in conflict with support for implementation-
defined behavior.

That conflict is one that's made up, out of thin air, by using words
such as "ban", or if not actually using them, thinking them.


Here I think "ban" is a synonym for "making an implementation totally
inefficient".

Forcing 36 bit one's complement hardware to


Also, words like "forcing". ;-)

That's a conflict that doesn't exist.

It's made up.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
May 21 '06 #21
In article <4d*************@individual.net>, bo*@gmb.dk
says...

[ ... ]
I am sure the standards committee didn't make up these rules just for
fun, but were well aware that specifying seemingly tiny details would
make it impossible to implement the language on a wide range of
hardware.
There's really quite a bit more to the situation than
that.
By not specifying some variations ('being vague'), you don't
limit yourself to:

8 bit bytes
2^n bytes per datatype
two's complement integers
no pad bits
'a' == 0x20


[ ... ]

There's a lot more at stake here than just a minor
tradeoff between ease of writing portable code and ease
of writing a compiler that produces efficient code.

One basic intent of C++ is that it should support
essentially any kind of programming that C did/does. One
of the things for which C (and therefore C++) is intended
to be used is system programming, such as implementing
operating systems.

Just for one minor example, consider the result of
mandating the sizes of types. If you were going to do
that, you'd almost certainly mandate them as powers of
two. If you do that, however, you make it essentially
impossible to any longer use C++ to write something like
the OS (or any other "bare metal" code) for almost any
machine with an unusual word size.

Contrary to some people's beliefs, such unusual word
sizes are NOT merely strange leftovers from a bygone era,
nor is there any reasonable likelihood that such machines
are going to go away anytime soon. Consider, for example,
the specs on the TI TMS320C3x DSPs:
32-bit floating point
24-bit fixed point
40-bit registers
Likewise, the Motorola DSP 563xx series:
24-bit addresses
24-bit fixed point data
56-bit accumulators

It's not particularly difficult to find more examples
either.

To summarize: the fundamental question here is primarily
whether you want an applications programming language or
a systems programming language. There is certainly room
in the world for applications programming languages --
but that's never been the intent of C++.

--
Later,
Jerry.

The universe is a figment of its own imagination.
May 21 '06 #22
* Jerry Coffin:

One basic intent of C++ is that it should support
essentially any kind of programming that C did/does. One
of the things for which C (and therefore C++) is intended
to be used is system programming, such as implementing
operating systems.

Just for one minor example, consider the result of
mandating the sizes of types. If you were going to do
that, you'd almost certainly mandate them as powers of
two. If you do that, however, you make it essentially
impossible to any longer use C++ to write something like
the OS (or any other "bare metal" code) for almost any
machine with an unusual word size.

Contrary to some people's beliefs, such unusual word
sizes are NOT merely strange leftovers from a bygone era,
nor is there any reasonable likelihood that such machines
are going to go away anytime soon. Consider, for example,
the specs on the TI TMS320C3x DSPs:
32-bit floating point
24-bit fixed point
40-bit registers
Likewise, the Motorola DSP 563xx series:
24-bit addresses
24-bit fixed point data
56-bit accumulators

It's not particularly difficult to find more examples
either.


I do understand why I have to repeat the same umpteen times in the same
subthread.

But for the record, once more: there is no conflict, it's a false,
construed dichotomy.

Furthermore, as mentioned, C manages to support fixed size types, and
also as mentioned, that will probably also be supported in C++0x, so the
question about fixed size types is really moot. :-)

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
May 21 '06 #23
In article <4d*************@individual.net>,
al***@start.no says...

[ ... ]
I do understand why I have to repeat the same umpteen times in the same
subthread.
I suspect you meant you _don't_ understand. I can explain
it easily: it's a natural consequence of the fact that
you're mostly wrong.
But for the record, once more: there is no conflict, it's a false,
construed dichotomy.

Furthermore, as mentioned, C manages to support fixed size types, and
also as mentioned, that will probably also be supported in C++0x, so the
question about fixed size types is really moot. :-)


C supports fixed-size types -- sort of -- and only
optionally at that. There's no problem with C++ doing the
same, but the portability gains are minimal at best.

--
Later,
Jerry.

The universe is a figment of its own imagination.
May 21 '06 #24
* Jerry Coffin:
In article <4d*************@individual.net>,
al***@start.no says...

[ ... ]
I do understand why I have to repeat the same umpteen times in the same
subthread.


I suspect you meant you _don't_ understand. I can explain
it easily: it's a natural consequence of the fact that
you're mostly wrong.


If you believe that, it would be prudent to argue your case (whatever it
is) rather than resorting to an infantile accusation.

But for the record, once more: there is no conflict, it's a false,
construed dichotomy.

Furthermore, as mentioned, C manages to support fixed size types, and
also as mentioned, that will probably also be supported in C++0x, so the
question about fixed size types is really moot. :-)


C supports fixed-size types -- sort of -- and only
optionally at that. There's no problem with C++ doing the
same, but the portability gains are minimal at best.


There's no "sort of": C supports fixed-size types.

They're optional as they should be.

The portability gains are substantial: without standardization, each
application would have to provide portability on its own.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
May 21 '06 #25
In article <4d*************@individual.net>,
al***@start.no says...

[ ... ]
There's no "sort of": C supports fixed-size types.
By now, you've probably already realized what complete
nonsense this was, in context, but just in case you've
missed the obvious...

What C has are typedefs, and typedefs are only sort of
types. Admittedly, within the C context, most of the
difference isn't visible -- but in C++ it is far more often.
Consider, for example:

// If int32_t happens to be a typedef for plain int on a given
// implementation, these two definitions have the same signature and
// collide instead of forming an overload set; where int32_t is, say,
// long, they are two distinct overloads.
int32_t x(int32_t a) {
// ...
}

int x(int a) {
// ...
}

This may work part of the time, but it certainly isn't
portable.

What C provides are only sort of types. The difference
between what's provided and a real type is usually
negligible in C, but becomes much more prominent in C++.
They're optional as they should be.

The portability gains are substantial: without standardization, each
application would have to provide portability on its own.


Code that really needs to be portable can't depend on
their being present -- so it still has to provide the
portability on its own.

--
Later,
Jerry.

The universe is a figment of its own imagination.
May 21 '06 #26
* Jerry Coffin:
In article <4d*************@individual.net>,
al***@start.no says...

[ ... ]
There's no "sort of": C supports fixed-size types.
By now, you've probably already realized what complete
nonsense this was, in context, but just in case you've
missed the obvious...


The error/misconception lies in the word "obvious".

What C has are typedefs, and typedefs are only sort of
types. Admittedly, within the C context, most of the
difference isn't visible -- but in C++ it is far more often.
Consider, for example:

int32_t x(int32_t a) {
// ...
}

int x(int a) {
// ...
}

This may work part of the time, but it certainly isn't
portable.

You're arguing that because the C solution doesn't meet your arbitrary
C++ requirements, it's not portable: that conclusion does not follow.

And your arbitrary C++ requirements are not met by types such as
std::size_t or std::ptr_diff, and they're portable: that's a direct
counter-example (or two).

In short, the argument you present is (1) bereft of logic, and (2) if it
were valid, would make existing C++ standard types non-portable.

What C provides are only sort of types. The difference
between what's provided and a real type is usually
negligible in C, but becomes much more prominent in C++.
They're optional as they should be.

The portability gains are substantial: without standardization, each
application would have to provide portability on its own.


Code that really needs to be portable can't depend on
their being present -- so it still has to provide the
portability on its own.


That's nonsense.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
May 21 '06 #27
In article <4d*************@individual.net>,
al***@start.no says...

[ ... ]
You're arguing that because the C solution doesn't meet your arbitrary
C++ requirements, it's not portable: that conclusion does not follow.

And your arbitrary C++ requirements are not met by types such as
std::size_t or std::ptr_diff, and they're portable: that's a direct
counter-example (or two).
Not really -- these are typedefs oriented toward an
_intent_, which makes them entirely different. To use
your example, overloading ptrdiff_t and whatever its base
type might be simply doesn't make sense, because even if
they have the same representation, they have different
uses.

That's not at all the case with int32_t and such -- these
are relatively ordinary integer types, without a
fundamentally different intent from short, int, long,
etc.
In short, the argument you present is (1) bereft of logic, and (2) if it
were valid, would make existing C++ standard types non-portable.


Not even close to accurate.
What C provides are only sort of types. The difference
between what's provided and a real type is usually
negligible in C, but becomes much more prominent in C++.
They're optional as they should be.

The portability gains are substantial: without standardization, each
application would have to provide portability on its own.


Code that really needs to be portable can't depend on
their being present -- so it still has to provide the
portability on its own.


That's nonsense.


Oh, how I wish you were right!

--
Later,
Jerry.

The universe is a figment of its own imagination.
May 21 '06 #28
* Jerry Coffin:
In article <4d*************@individual.net>,
al***@start.no says...

[ ... ]
You're arguing that because the C solution doesn't meet your arbitrary
C++ requirements, it's not portable: that conclusion does not follow.

And your arbitrary C++ requirements are not met by types such as
std::size_t or std::ptr_diff, and they're portable: that's a direct
counter-example (or two).
Not really -- these are typedefs oriented toward an
_intent_, which makes them entirely different.


I'm not going to discuss hypothetical intents and their even more
hypothetical effects.

To use
your example, overloading ptrdiff_t and whatever its base
type might be simply doesn't make sense, because even if
they have the same representation, they have different
uses.
It may be you're right about ptrdiff_t (sorry about the earlier typo), because
I've never needed to overload that.

On the other hand I have needed to overload on std::size_t. One example is
a bug in Visual C++ where it spews out warnings about passing a
std::size_t to standard streams. Another example is an output function
that should choose the appropriate printf format specifier for the type
of argument, noting that std::size_t may or may not be a typedef of some
other built-in type (yes, there is a portable solution ;-)).
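
(For instance, along these lines; a sketch that sidesteps the question of
which built-in type std::size_t really is, assuming only that the values fit
in unsigned long:)

#include <cstdio>
#include <cstddef>

// Print a std::size_t without knowing which built-in type it is a
// typedef for: widen it to unsigned long, whose format specifier is
// fixed, instead of guessing between "%u", "%lu" and friends.
void print_size(std::size_t n)
{
    std::printf("%lu\n", static_cast<unsigned long>(n));
}

int main()
{
    print_size(sizeof(double));
    return 0;
}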

So your "doesn't make sense" doesn't make sense.

More fundamentally, it doesn't make sense to discuss a limitation of
adopting a C solution as-is in C++, as a limitation of the C solution:
it is a limitation of (the C solution + a chosen context and language
for which it was not designed). It's not even a limitation per se. It
is a requirement that you place on a C++ solution, and there's nothing
technically that prevents that requirement from being met.

In short, your argument here has (A) a false premise, and (B) is
irrelevant anyway.

That's not at all the case with int32_t and such -- these
are relatively ordinary integer types, without a
fundamentally different intent from short, int, long,
etc.


I'm not going to discuss hypothetical intents and their even more
hypothetical effects.

In short, the argument you present is (1) bereft of logic, and (2) if it
were valid, would make existing C++ standard types non-portable.


Not even close to accurate.


I demonstrated (1) and (2). If you disagree, and want me or others to
See The Light, please supply better counter-arguments. Above you tried
to attack (2), but failed as noted in (A) and (B).

What C provides are only sort of types. The difference
between what's provided and a real type is usually
negligible in C, but becomes much more prominent in C++.

They're optional as they should be.

The portability gains are substantial: without standardization, each
application would have to provide portability on its own.
Code that really needs to be portable can't depend on
their being present -- so it still has to provide the
portability on its own.

That's nonsense.


Oh, how I wish you were right!


Just ask if you wonder why. :-)
--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
May 21 '06 #29
Pete Becker wrote:
Greg wrote:
Dereferencing a NULL
pointer would be an obvious example of a behavior undefined by the
Standard - but which nonetheless has a behavior defined in almost every
implementation.
Really? What can I expect to happen when I dereference a null pointer
with MSVC 7.1? And where is it documented? After I've done this, what is
the state of my program?


Dereferencing a NULL pointer is typically defined by the applicable
architecture, not the compiler. Dereferencing a NULL pointer on Windows
XP at any rate is a certain EXCEPTION_ACCESS_VIOLATION.

Unless explicitly handled, the state of your program will be
"terminated."

Microsoft has extensive developer documentation at msdn.microsoft.com
with information about these various runtime errors.
Integer overflow would be another example of a
widely-defined operation.


Really? How is it defined for MSVC 7.1? And where is it documented? Has
Microsoft promised that whatever this behavior is, it will never be changed?


In this case the underlying CPU determines the outcome. So I would
consult the Intel manuals for the governing behavior - but I would
expect the values to wrap around.

The C++ standard makes no guarantee that it will not change in the
future. No standard guarantees against change in the future. Standards
are only ever good for the present. And the Intel instruction set and
architecture is a standard of Intel's. So whatever the integer overflow
behavior is in this case, it is not due to happenstance. So the
likelihood that it would ever change in the future is very small.
I'm not disputing that you can often figure out what happens in
particular cases. But that's not a sufficient basis for saying that it's
well defined for that compiler. Unless the compiler's specification
tells you exactly what you can expect, you're guessing. That's often
appropriate, but it doesn't make what you're doing well defined.


Sure it does. The language, the compiler, the runtime together define
the set of governing behavior for a program. It doesn't really matter
to a programmer whether it's the C++ standard, the compiler or the OS
that defines NULL pointer dereferences as a memory access violation. No
matter who has defined it - it is the standard behavior as far as that
program is concerned.

Greg

May 21 '06 #30

"Alf P. Steinbach" <al***@start.no> skrev i meddelandet
news:4d*************@individual.net...

But for the record, once more: there is no conflict, it's a false,
construed dichotomy.

Furthermore, as mentioned, C manages to support fixed size types,
and also as mentioned, that will probably also be supported in
C++0x, so the question about fixed size types is really moot. :-)


C99 only manages to support fixed size types for hardware where they
fit exactly. The C99 standard requires int32_t to be defined on
implementations which have 32 bit integers, two's complement, no pad
bits.

On other implementations, the typedef is absent.

How portable is that?

Bo Persson
May 22 '06 #31
* Bo Persson:
"Alf P. Steinbach" <al***@start.no> skrev i meddelandet
news:4d*************@individual.net...
But for the record, once more: there is no conflict, it's a false,
construed dichotomy.

Furthermore, as mentioned, C manages to support fixed size types,
and also as mentioned, that will probably also be supported in
C++0x, so the question about fixed size types is really moot. :-)


C99 only manages to support fixed size types for hardware where they
fit exactly. The C99 standard requires int32_t to be defined on
implementations which have 32 bit integers, two's complement, no pad
bits.

On other implementations, the typedef is absent.

How portable is that?


Maximum.
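
The absence of the typedef is itself a portable, testable signal. A sketch,
assuming only what C99 promises (exact-width types optional, least-width
types mandatory):

#include <stdint.h>

/* INT32_MAX is defined exactly when int32_t is, so code can test for it
   and fall back to int_least32_t, which every implementation must
   provide (at least 32 bits, any representation). */
#ifdef INT32_MAX
typedef int32_t my_int32;        /* exactly 32 bits, two's complement */
#else
typedef int_least32_t my_int32;  /* at least 32 bits, always present */
#endif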

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
May 22 '06 #32
Pete Becker wrote:
Mirek Fidler wrote:

Sometimes this makes me think that we would need some sort of
"substandard" that "defines" some of those undefined behaviours - as
most code in fact needs to use some relatively harmless
"undefineds" to exist (e.g. memory allocator routines or GC), or can
use them to improve performance (memcpy of non-PODs) - and
it would still be quite an advantage for portability to know that
a certain platform supports such a "substandard".


Sure, you can kill performance and make compiler writers work harder in
order to improve "portability" for a small fraction of the code that
people write. Java did it with their floating-point math, and had to
undo it.


With all respect, what I propose is a little bit different. E.g. GCC,
Intel and Microsoft compilers already support such a "substandard",
without any changes to their code (they all support modulo-2^N integers
with harmless overflows, flat memory model, destructive moves of non-PODs).

BTW, speaking about floating point maths, all of them also contain
switches to speed up FP at the cost of dropping standard compliance - so
your argument is pretty void here (standard is already too restrictive
for optimal FP).

Mirek
May 23 '06 #33
Mirek Fidler wrote:

BTW, speaking about floating point maths, all of them also contain
switches to speed up FP at the cost of dropping standard compliance - so
your argument is pretty void here (standard is already too restrictive
for optimal FP).


On the contrary: it demonstrates exactly what I said: that a standard
that imposes excessive restrictions won't be followed.

--

Pete Becker
Roundhouse Consulting, Ltd.
May 23 '06 #34
