Style question: Use always signed integers or not?

Tomás Ó hÉilidhe

On 2008-06-07 12:43, Juha Nieminen wrote:

I was once taught that if some integral value can never have negative
values, it's a good style to use an 'unsigned' type for that: It's
informative, self-documenting, and you are not wasting half of the value
range for values which you will never be using.

I agreed with this, and started to always use 'unsigned' whenever
negative values wouldn't make any sense. I did this for years.

However, I slowly changed my mind: Doing this often causes more
problems than it's worth. A good example:

If you have an image class, its width and height properties can
*never* get negative values. They will always be positive. So it makes
sense to use 'unsigned' for the width and height, doesn't it? What
problems could that ever create?

Well, assume that you are drawing images on screen, by pixel
coordinates, and these images can be partially (or completely) outside
the screen. For example, the left edge coordinate of the image to be
drawn might have a negative x value (for example drawing a 100x100 image
at coordinates <-20, 50>). Since the coordinates are signed and the
dimensions of the image are unsigned, this may cause signed-unsigned
mixup. For example this:

if(x - width/2 < 1) ...

where 'x' is a signed integer, gives *different* results depending on
whether 'width' is signed or unsigned, with certain values of those
variables (for example x=2 and width=10).

Wow, that was certainly an eye-opener. I had assumed that in this case
both values would be promoted to some larger type (signed long) which
could accurately represent both values (the signed and the unsigned int)
but apparently not.
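To make the difference concrete, here is a minimal sketch of the comparison with 'width' declared both ways. The two function names are made up for illustration and do not come from the original post.

```cpp
// Hypothetical helpers for illustration only.
// Both evaluate the same expression; only the type of 'width' differs.

bool left_of_one_signed(int x, int width)
{
    // Plain int arithmetic: 2 - 10/2 is -3, and -3 < 1 holds.
    return x - width / 2 < 1;
}

bool left_of_one_unsigned(int x, unsigned int width)
{
    // 'x' is converted to unsigned int before the subtraction, so a
    // mathematically negative result wraps around to a very large value,
    // which is never less than 1.
    return x - width / 2 < 1;
}
```

With x=2 and width=10 the first function returns true and the second returns false, which is the malfunction described above.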

This is a defect in the standard in my opinion since it allows the
correct action to be taken for types smaller than int:

#include <iostream>

int main()
{
unsigned short width = 10;
short x = 2;
std::cout << (x - width);
return 0;
}

--
Erik Wikström

Jun 27 '08 #3

Chris Forone

Erik Wikström wrote:

This is a defect in the standard in my opinion since it allows the
correct action to be taken for types smaller than int:

#include <iostream>

int main()
{
unsigned short width = 10;
short x = 2;
std::cout << (x - width);
return 0;
}

is -8 (gcc 3.4.5/adam riese)...

Jun 27 '08 #4

Darío Griffo

On Jun 7, 7:43 am, Juha Nieminen <nos...@thanks.invalid> wrote:

I was once taught that if some integral value can never have negative
values, it's a good style to use an 'unsigned' type for that: It's
informative, self-documenting, and you are not wasting half of the value
range for values which you will never be using.

I agreed with this, and started to always use 'unsigned' whenever
negative values wouldn't make any sense. I did this for years.

However, I slowly changed my mind: Doing this often causes more
problems than it's worth. A good example:

If you have an image class, its width and height properties can
*never* get negative values. They will always be positive. So it makes
sense to use 'unsigned' for the width and height, doesn't it? What
problems could that ever create?

Well, assume that you are drawing images on screen, by pixel
coordinates, and these images can be partially (or completely) outside
the screen. For example, the left edge coordinate of the image to be
drawn might have a negative x value (for example drawing a 100x100 image
at coordinates <-20, 50>). Since the coordinates are signed and the
dimensions of the image are unsigned, this may cause signed-unsigned
mixup. For example this:

if(x - width/2 < 1) ...

where 'x' is a signed integer, gives *different* results depending on
whether 'width' is signed or unsigned, with certain values of those
variables (for example x=2 and width=10). The intention is to treat
'width' here as a signed value, but if it isn't, the comparison will
malfunction (without explicitly casting 'width' to a signed value). This
may well go completely unnoticed because compilers might not even give
any warning (for example gcc doesn't).

Thus at some point I started to *always* use signed integers unless
there was a very good reason not to. (Of course this sometimes causes
small annoyances because STL containers return an unsigned value for
their size() functions, but that's usually not a big problem.)

It would be interesting to hear other opinions on this subject.

It's a good moment to re-read Stroustrup's The C++ Programming
Language.
He talks about the usual arithmetic conversions for binary operators.
Your example: if(x - width/2 < 1) ...
with x as int and width as unsigned, yields an unsigned value on the
left-hand side.
There are many more basic conversion rules, but I think this is the
one that applies here.
Until today I had assumed that the compiler automatically converts
width to signed before making the comparison, but that is not true.
About your question: I still think that if a value will never be
negative, it should be unsigned, for the reasons you gave. The image
example is valid for the first argument: the width and height
properties will always be non-negative. Calculating where you're going
to draw them is another matter. Where you are going to draw is a
coordinate, and coordinates certainly can be negative, so they have to
be signed values. But we were mixing the concept of a coordinate with
that of a width and height. Both are related at drawing time, but they
are not the same thing.

Darío

Jun 27 '08 #5

On Jun 7, 11:43 am, Juha Nieminen <nos...@thanks.invalid> wrote:

If you have an image class, its width and height properties can
*never* get negative values. They will always be positive. So it makes
sense to use 'unsigned' for the width and height, doesn't it? What
problems could that ever create?

Well, assume that you are drawing images on screen, by pixel
coordinates, and these images can be partially (or completely) outside
the screen. For example, the left edge coordinate of the image to be
drawn might have a negative x value (for example drawing a 100x100 image
at coordinates <-20, 50>).

I agree that this is a case where you'd consider using signed integer
types, but I'm still a steadfast unsigned man. My major peeve with
signed integer types is their undefined behaviour upon overflow.

(Of course the less pretty solution is to use casts in your code)

Jun 27 '08 #6

Paavo Helde

Tomás Ó hÉilidhe <to*@lavabit.com> wrote:

types, but I'm still a steadfast unsigned man. My major peeve with
signed integer types is their undefined behaviour upon overflow.

You mean that you are fond of the following code having no undefined
behavior? Sorry, I have not encountered a need for this kind of defined
behavior!

#include <iostream>

int main() {
unsigned int a = 2150000000u;
unsigned int b = 2160000000u;
unsigned int c = (a+b)/2;

std::cout << "a=" << a << "\nb=" << b <<
"\nand their average\nc=" << c << "\n";
}

Output (32-bit unsigned ints):

a=2150000000
b=2160000000
and their average
c=7516352

The unsigned integers in C/C++ are very specific cyclic types with
strange overflow and wrapping rules. IMHO, these should be used primarily
in some very specific algorithms needing such cyclic types.
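For what it's worth, the wrapped average above can be avoided without leaving unsigned arithmetic. The following is only a sketch; 'midpoint' is a made-up name, and the sample values assume unsigned int has at least 32 bits.

```cpp
// Hypothetical helper, not from the original post.
// Computes the average of two unsigned values without forming a + b,
// so the intermediate result can never wrap.
unsigned int midpoint(unsigned int a, unsigned int b)
{
    // Subtract the smaller value from the larger one (cannot wrap),
    // then add half of the gap to the smaller value.
    return a <= b ? a + (b - a) / 2
                  : b + (a - b) / 2;
}
```

For a=2150000000 and b=2160000000 this gives 2155000000 rather than 7516352.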

Regards
Paavo

Jun 27 '08 #7

Daniel T.

Juha Nieminen <no****@thanks.invalid> wrote:

It would be interesting to hear other opinions on this subject.

From Stroustrup:

The unsigned integer types are ideal for uses that treat storage as a
bit array. Using an unsigned instead of an int to gain one more bit
to represent positive integers is almost never a good idea. Attempts
to ensure that some values are positive by declaring variables
unsigned will typically be defeated by the implicit conversion rules.

I especially like this quote from Gavin Deane:

Both signed and unsigned arithmetic in C++ behave just like real
world integer arithmetic until you reach the minimum and maximum
values for those types. The difference is that for signed, the
correlation between the real world and C++ breaks down at + or - a
very big number, whereas for unsigned, the correlation between the
real world and C++ breaks down at zero and a (different) very big
number.

In other words, if I use unsigned, I am very close to the edge of the
domain in which C++ arithmetic corresponds to the real world. If I
use signed, I am comfortably in the middle of that domain and the
extremities where I have to start coding for special cases are far,
far away from any number that I am likely to ever want to use for an
age or a length or a size, or I am likely to ever end up with by
doing arithmetic with those quantities.
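A small sketch of what "close to the edge" means in practice: with unsigned sizes, zero is already the edge. The helper names below are invented for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: index of the last element, computed in unsigned
// arithmetic. For an empty vector, 0 - 1 wraps to the largest size_t value.
std::size_t last_index_unsigned(const std::vector<int>& v)
{
    return v.size() - 1;
}

// Hypothetical helper: the same computation in a signed type.
// For an empty vector the result is simply -1, which is easy to test for.
std::ptrdiff_t last_index_signed(const std::vector<int>& v)
{
    return static_cast<std::ptrdiff_t>(v.size()) - 1;
}
```

The signed version assumes the size fits in std::ptrdiff_t, which holds for any vector one is likely to meet but is still an assumption.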

Jun 27 '08 #8

On 2008-06-07 16:47, Chris Forone wrote:

Erik Wikström wrote:

This is a defect in the standard in my opinion since it allows the
correct action to be taken for types smaller than int:

#include <iostream>

int main()
{
unsigned short width = 10;
short x = 2;
std::cout << (x - width);
return 0;
}

is -8 (gcc 3.4.5/adam riese)...

Exactly, but if you use int instead of short you get 4294967288, because
the unsigned int is not promoted to a signed long.
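The two cases can be checked at compile time. This is a sketch that assumes a C++11 compiler (for static_assert and decltype) and a platform where int is wider than short; the function names are made up.

```cpp
#include <type_traits>

// Assumption: int is wider than short, so both 'short' and
// 'unsigned short' are promoted to (signed) int before the subtraction.
static_assert(std::is_same<decltype(short(2) - static_cast<unsigned short>(10)),
                           int>::value,
              "short operands are expected to promote to int");

// With int operands there is no promotion step: the int operand is
// converted to unsigned int by the usual arithmetic conversions.
static_assert(std::is_same<decltype(2 - 10u), unsigned int>::value,
              "int - unsigned int has type unsigned int");

// Hypothetical helpers mirroring the two programs discussed above.
int diff_short(short x, unsigned short width) { return x - width; }
unsigned int diff_int(int x, unsigned int width) { return x - width; }
```

diff_short(2, 10) yields -8, while diff_int(2, 10) yields the wrapped value (4294967288 when unsigned int has 32 bits).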

--
Erik Wikström

Jun 27 '08 #9

Daniel Pitts

Paavo Helde wrote:

The unsigned integers in C/C++ are very specific cyclic types with
strange overflow and wrapping rules. IMHO, these should be used primarily
in some very specific algorithms needing such cyclic types.

Regards
Paavo

I don't have a copy of the standard, but does the standard actually
define unsigned integral types as having that overflow behavior? Or is
that just the "most common case"?

I'm not questioning to make a point, I really would like to know the answer.

Thanks,
Daniel.

--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>

Jun 27 '08 #10

Daniel Pitts wrote:

Paavo Helde wrote:

The unsigned integers in C/C++ are very specific cyclic types with
strange overflow and wrapping rules. IMHO, these should be used primarily
in some very specific algorithms needing such cyclic types.

[snip]

The wrapping isn't all that strange.

I don't have a copy of the standard, but does the standard actually
define unsigned integral types as having that overflow behavior? Or is
that just the "most common case"?

I'm not questioning to make a point, I really would like to know the
answer.

Overflow for signed arithmetic types is undefined behavior [5/5]. Unsigned
integer types have arithmetic mod 2^N, where N is the bit length [3.9.1/4].
Best

Kai-Uwe Bux

Jun 27 '08 #11

Paavo Helde

Daniel Pitts <ne******************@virtualinfinity.net> wrote:

I don't have a copy of the standard, but does the standard actually
define unsigned integral types as having that overflow behavior? Or is
that just the "most common case"?

I'm not questioning to make a point, I really would like to know the
answer.

Thanks,
Daniel.

AFAIK unsigned arithmetic is specified exactly by the standard. This means, for
example, that a debug implementation cannot detect and assert on the overflow, but has to
produce the wrapped result instead:

<quote>

3.9.1/4:

Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n
where n is the number of bits in the value representation of that particular size of
integer.

Footnote: This implies that unsigned arithmetic does not overflow because a result that
cannot be represented by the resulting unsigned integer type is reduced modulo the
number that is one greater than the largest value that can be represented by the
resulting unsigned integer
type.

</quote>

In the above example one should also consider the effects of integral promotion - the
internal calculations are done in unsigned int, which means that a similar example with
unsigned shorts or chars would appear to work correctly. However, I am not sure where
the standard says that 3.9.1/4 can be violated by promoting the operands to a larger
type. By considering 3.9.1/4 only it seems that one should have:

unsigned char a=128u, b=128u, c=(a+b)/2; assert(c==0);

Fortunately, this is not the case with my compilers ;-)
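A sketch of both behaviours side by side, assuming 8-bit chars and an int wider than char. The function names are invented for this example.

```cpp
// Hypothetical helper: the expression as written above. Both operands are
// promoted to int, so 128 + 128 = 256 is representable and 256 / 2 = 128.
unsigned char average_promoted(unsigned char a, unsigned char b)
{
    return static_cast<unsigned char>((a + b) / 2);
}

// Hypothetical helper: forces the sum back into unsigned char before
// dividing. With 8-bit chars, 256 is reduced modulo 2^8 to 0.
unsigned char average_wrapped(unsigned char a, unsigned char b)
{
    unsigned char sum = static_cast<unsigned char>(a + b);
    return static_cast<unsigned char>(sum / 2);
}
```

For a = b = 128, the first returns 128 and the second returns 0, so the modulo rule only shows up once the result is converted back to the narrow type.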

Cheers
Paavo

Jun 27 '08 #12

Paavo Helde wrote:

In the above example one should also consider the effects of integral
promotion - the internal calculations are done in unsigned int, which
means that a similar example with unsigned shorts or chars would appear to
work correctly. However, I am not sure where the standard says that
3.9.1/4 can be violated by promoting the operands to a larger type.

[5.9]

By considering 3.9.1/4 only it seems that one should have:

unsigned char a=128u, b=128u, c=(a+b)/2; assert(c==0);

Fortunately, this is not the case with my compilers ;-)

I am not so sure whether we are really fortunate to have integral
promotions.
Best

Kai-Uwe Bux

Jun 27 '08 #13

In article <1D************@read4.inet.fi>, no****@thanks.invalid says...

I was once taught that if some integral value can never have negative
values, it's a good style to use an 'unsigned' type for that: It's
informative, self-documenting, and you are not wasting half of the value
range for values which you will never be using.

I agreed with this, and started to always use 'unsigned' whenever
negative values wouldn't make any sense. I did this for years.

However, I slowly changed my mind: Doing this often causes more
problems than it's worth. A good example:

There was a rather long thread on this very subject a few years ago:

http://groups.google.com/group/comp....wse_frm/thread
/840f37368aefda4e/85b1ac5149a6bd6f#85b1ac5149a6bd6f

--
Later,
Jerry.

The universe is a figment of its own imagination.

Jun 27 '08 #14

James Kanze

On Jun 7, 12:43 pm, Juha Nieminen <nos...@thanks.invalid> wrote:

I was once taught that if some integral value can never have
negative values, it's a good style to use an 'unsigned' type
for that: It's informative, self-documenting, and you are not
wasting half of the value range for values which you will
never be using.

It would be good style to use a cardinal integral type, if C++
had such. For better or for worse, the unsigned types in C++
are not really a good abstraction of cardinals (they use modulo
arithmetic), and the implicit conversion rules between signed
and unsigned cause all sorts of problems.

The result is: the "standard" integral type in C++ is "int".
Any other type should only be used to fulfill a very specific
need. And the unsigned types should be avoided except where you
need the modulo arithmetic, or you are actually dealing with
bits.

Up to a point. Even more important is to avoid mixing signed
and unsigned (again, because of the conversion rules). Which
means that if you're stuck using a library (like the standard
library) which uses unsigned, you should usually use unsigned
when interfacing it. Which leads to a horrible mixture in your
own code, but the alternatives seem worse.

[...]

It would be interesting to hear other opinions on this
subject.

The problem here is that there is a difference between theory
and practice, mainly because of all of the implicit and
unchecked conversions, but also because the unsigned types do
not really model a cardinal as well as they should.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jun 27 '08 #15

Juha Nieminen

Erik Wikström wrote:

Exactly, but if you use int instead of short you get 4294967288, because
the unsigned int is not promoted to a signed long.

What would be the difference between promoting an unsigned int to a
signed int vs. to a signed long in an architecture where int and long
are the same thing (ie. basically all 32-bit systems)?

Jun 27 '08 #16

Juha Nieminen

James Kanze wrote:

Up to a point. Even more important is to avoid mixing signed
and unsigned (again, because of the conversion rules). Which
means that if you're stuck using a library (like the standard
library) which uses unsigned, you should usually use unsigned
when interfacing it. Which leads to a horrible mixture in your
own code, but the alternatives seem worse.

I find myself constantly writing code like this:

if(size_t(amount) >= table.size())
table.resize(amount+1);

and:

int Class::elementsAmount() const
{
return int(table.size());
}

I don't like those explicit casts. They are awkward and feel
dangerous, but sometimes there just isn't any way around them.
Especially in the latter case there is a danger of overflow if the table
is large enough, but...

It's a bit of a dilemma.
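One way to make the second cast less dangerous is to route it through a single checked helper. This is only a sketch: 'checked_size_to_int' is a made-up name, not a standard function, and it assumes std::size_t is at least as wide as int.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper: converts a container size to int, but refuses
// values that do not fit instead of silently producing a wrong number.
int checked_size_to_int(std::size_t n)
{
    if (n > static_cast<std::size_t>(std::numeric_limits<int>::max()))
        throw std::overflow_error("size does not fit in int");
    return static_cast<int>(n);
}
```

elementsAmount() could then return checked_size_to_int(table.size()), which keeps the cast in one place and turns the overflow case into an exception.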

Jun 27 '08 #17

On 2008-06-08 12:16, Juha Nieminen wrote:

Erik Wikström wrote:
>Exactly, but if you use int instead of short you get 4294967288, because
the unsigned int is not promoted to a signed long.

What would be the difference between promoting an unsigned int to a
signed int vs. to a signed long in an architecture where int and long
are the same thing (ie. basically all 32-bit systems)?

None, but then the limitation would (partially) be in the platform and
not in the language. What I complain about is the fact that integer
promotion is specified for types "smaller" than int, but not for
"larger" types. Considering that many desktop machines now easily can
handle types larger than int (which is often 32 bits even on 64-bit
machines) this seems a bit short-sighted to me. I can see no reason to
allow integer promotion for all integer types.

--
Erik Wikström

Jun 27 '08 #18

On 2008-06-08 12:57, Erik Wikström wrote:

On 2008-06-08 12:16, Juha Nieminen wrote:
>Erik Wikström wrote:
>>Exactly, but if you use int instead of short you get 4294967288, because
the unsigned int is not promoted to a signed long.

What would be the difference between promoting an unsigned int to a
signed int vs. to a signed long in an architecture where int and long
are the same thing (ie. basically all 32-bit systems)?

None, but then the limitation would (partially) be in the platform and
not in the language. What I complain about is the fact that integer
promotion is specified for types "smaller" than int, but not for
"larger" types. Considering that many desktop machines now easily can
handle types larger than int (which is often 32 bits even on 64-bit
machines) this seems a bit short-sighted to me. I can see no reason to
allow integer promotion for all integer types.

I meant: "I can see no reason to *not* allow integer promotion for all
integer types."

--
Erik Wikström

Jun 27 '08 #19

In article <Xn*************************@216.196.97.131>, no****@ebi.ee
says...

[ ... ]

AFAIK the unsigned arithmetics is specified exactly by the standard. This means for
example that a debug implementation cannot detect and assert the overflow, but has to
produce the wrapped result instead:

Unsigned arithmetic is defined in all but one respect: the sizes of the
integer types. IOW, you know how overflow is handled when it does
happen, but you don't (portably) know when it'll happen.

--
Later,
Jerry.


Jun 27 '08 #20

Jerry Coffin wrote:

In article <Xn*************************@216.196.97.131>, no****@ebi.ee
says...

[ ... ]

>AFAIK the unsigned arithmetics is specified exactly by the standard. This
means for example that a debug implementation cannot detect and assert
the overflow, but has to produce the wrapped result instead:

Unsigned arithmetic is defined in all but one respect: the sizes of the
integer types. IOW, you know how overflow is handled when it does
happen, but you don't (portably) know when it'll happen.

a) There is <limits>, which will tell you portably the bounds of built-in
types.

b) With unsigned integers, you can check for overflow easily:

unsigned int a = ...;
unsigned int b = ...;
unsigned int sum = a + b;   // wraps modulo 2^N instead of overflowing
if ( sum < a ) {
std::cout << "overflow happened.\n";
}

It's somewhat nice that you can check for the overflow _after_ you did the
addition (this does not necessarily work with signed types). Also, the
check is usually a bit cheaper than for signed types (in the case of
addition, subtraction, and division a single comparison is enough; I did
not think too hard about multiplication).
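As a sketch of the contrast between the two cases (the function names are made up for illustration): the unsigned check can look at the result, while the signed check has to run before the addition, because the overflowing addition itself would be undefined behaviour.

```cpp
#include <limits>

// Hypothetical helper: unsigned addition is reduced modulo 2^N, so a
// wrapped sum is always smaller than either operand.
bool unsigned_add_wraps(unsigned int a, unsigned int b)
{
    unsigned int sum = a + b;   // well defined even when it wraps
    return sum < a;
}

// Hypothetical helper: the signed test must not perform a + b at all.
// It compares against the limits from <limits> instead.
bool signed_add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > std::numeric_limits<int>::max() - b;
    return a < std::numeric_limits<int>::min() - b;
}
```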
Best

Kai-Uwe Bux

Jun 27 '08 #21

In article <g2**********@aioe.org>, jk********@gmx.net says...

[ ... ]

a) There is <limits>, which will tell you portably the bounds of built-in
types.

Yes, but 1) it doesn't guarantee that the size you need will be present,
and 2) even if the size you need is present, it may not be represented
completely accurately.

For example, on one (admittedly old) compiler, <limits.h> said that
SCHAR_MIN was -127 -- but this was on a twos-complement machine where
the limit was really -128. At the time the committee discussed it, and
at least from what I heard, agreed that this didn't fit what they
wanted, but DID conform with the requirements of the standard.

Even when/if <limits.h> does contain the correct data, and the right
size is present, you can end up with a rather clumsy ladder to get the
right results.

#if (unsigned char is the right size)
typedef unsigned char typeIneed;
#elif (unsigned short is the right size)
typedef unsigned short typeIneed;
#elif (unsigned int is the right size)
typedef unsigned int typeIneed;
#elif (unsigned long is the right size)
typedef unsigned long typeIneed;
#else
#error correct size not present?
#endif

Fortunately, you most often care about sizes like 16 and 32 bits. C99
and C++ 0x allow you to get sizes like them much more easily, using
uintXX_t.
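With those headers the ladder above shrinks to a single typedef. A minimal sketch, assuming a C++11 compiler that provides <cstdint> and the optional exact-width type std::uint32_t; 'wrap_increment' is a made-up function used only to show the wrap point.

```cpp
#include <cstdint>
#include <limits>

// The exact-width types are optional in the standard, so this line fails
// to compile on a platform that has no 32-bit unsigned type.
typedef std::uint32_t typeIneed;

static_assert(std::numeric_limits<typeIneed>::digits == 32,
              "typeIneed is expected to have exactly 32 value bits");

// Hypothetical helper: because the width is fixed, the point at which the
// value wraps is the same on every platform that compiles this code.
typeIneed wrap_increment(typeIneed v)
{
    return static_cast<typeIneed>(v + 1u);
}
```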

--
Later,
Jerry.


Jun 27 '08 #22

Jerry Coffin wrote:

In article <g2**********@aioe.org>, jk********@gmx.net says...

[ ... ]

Let's restore some important context: You wrote:

In article <Xn*************************@216.196.97.131>, no****@ebi.ee
says...

[ ... ]

AFAIK the unsigned arithmetics is specified exactly by the standard.
This means for example that a debug implementation cannot detect and
assert the overflow, but has to produce the wrapped result instead:

Unsigned arithmetic is defined in all but one respect: the sizes of the
integer types. IOW, you know how overflow is handled when it does
happen, but you don't (portably) know when it'll happen.

Note the claim that one cannot know when overflow will happen.

To _that_, I answered:

>a) There is <limits>, which will tell you portably the bounds of built-in
types.

And now, you say:

Yes, but 1) it doesn't guarantee that the size you need will be present,

which is completely unrelated to the question whether you can portably
detect overflow, and:

and 2) even if the size you need is present, it may not be represented
completely accurately.

which is just FUD.

For example, on one (admittedly old) compiler, <limits.h> said that
SCHAR_MIN was -127 -- but this was on a twos-complement machine where
the limit was really -128. At the time the committee discussed it, and
at least from what I heard, agreed that this didn't fit what they
wanted, but DID conform with the requirements of the standard.

Yes. And on that compiler, SCHAR_MIN _was_ -127. That _means_ that the
implementation made no guarantees about behavior when computations
reach -128 even though experiments or documentation about the computer
architecture suggest you could get there. The point is that SCHAR_MIN _defines_
what counts as overflow for the signed char type.

In short: contrary to your claim, you can portably detect overflow (and it
is particularly easy for unsigned types, where issues like the one with
SCHAR_MIN cannot happen).
Best

Kai-Uwe Bux

Jun 27 '08 #23

In article <g2**********@aioe.org>, jk********@gmx.net says...

[ ... ]

For example, on one (admittedly old) compiler, <limits.h> said that
SCHAR_MIN was -127 -- but this was on a twos-complement machine where
the limit was really -128. At the time the committee discussed it, and
at least from what I heard, agreed that this didn't fit what they
wanted, but DID conform with the requirements of the standard.

Yes. And on that compiler, SCHAR_MIN _was_ -127. That _means_ that the
implementation made no guarantees about behavior when computations
reach -128 even though experiments or documentation about the computer
architecture suggest you could get there. The point is that SCHAR_MIN _defines_
what counts as overflow for the signed char type.

The problem is, that the wording for that part of the standard has
remained essentially the same, and still doesn't require that the values
in limits.h (or equivalents inside of namespace std) be truly accurate
in telling you the limits of the implementation. The implementation
clearly DOES have to provide AT LEAST the range specified, but there is
_nothing_ to prevent it from exceeding that range.

In short: contrary to your claim, you can portably detect overflow (and it
is particularly easy for unsigned types, where issues like the one with
SCHAR_MIN cannot happen).

Yes, it CAN happen for unsigned types. An implementation could (for
example) supply 32-bit integers, but claim that UINT_MAX was precisely 4
billion. The unsigned integer in question would NOT wrap at precisely 4
billion (and, in fact, an implementation that did would not conform).

Contrary to YOUR claim, I did NOT ever claim that you can't portably
detect overflow. What I said was:

Yes, but 1) it doesn't guarantee that the size you need will be present,
and 2) even if the size you need is present, it may not be represented
completely accurately.

and:

Even when/if <limits.h> does contain the correct data, and the right
size is present, you can end up with a rather clumsy ladder to get the
right results.

The fact is that I neither said nor implied that you could not portably
detect "overflow" (i.e. wraparound on an unsigned type).

What I said, and I maintain that it's true (and you've yet to show any
evidence of any sort to the contrary) was that 1) the information in
limits.h can be misleading, and 2) when/if you want wraparound at a
specific size, you're not guaranteed that such a size will exist, and 3)
even when/if it does exist, actually finding and using it can be clumsy.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Jun 27 '08 #24

[snip]

Jerry Coffin wrote:

The fact is that I neither said nor implied that you could not portably
detect "overflow" (i.e. wraparound on an unsigned type).

Really? This is from upthread:

Jerry Coffin wrote:

In article <Xn*************************@216.196.97.131>, no****@ebi.ee
says...

[ ... ]

AFAIK unsigned arithmetic is specified exactly by the standard. This
means for example that a debug implementation cannot detect and assert
the overflow, but has to produce the wrapped result instead:

Unsigned arithmetic is defined in all but one respect: the sizes of the
integer types. IOW, you know how overflow is handled when it does
happen, but you don't (portably) know when it'll happen.

Now, if somebody impersonated you, I apologize.

Whoever made the statement

"IOW, you know how overflow is handled when it does happen, but you don't
(portably) know when it'll happen."

was mistaken. This is the statement I responded to. Snipping it away over
and over again will not change that.
Best

Kai-Uwe Bux

Jun 27 '08 #25

In article <g2**********@aioe.org>, jk********@gmx.net says...

[ ... ]

Whoever made the statement

"IOW, you know how overflow is handled when it does happen, but you don't
(portably) know when it'll happen."

was mistaken. This is the statement I responded to. Snipping it away over
and over again will not change that.

It is not what you responded to, and it IS correct. You do NOT know (a
priori) exactly when it'll happen. Nothing in limits.h changes that. I'd
quote the standard, but the problem is that this is simply a situation
in which there's nothing in the standard to quote. If you want to claim
that the number given as (for example) UINT_MAX in limits.h precisely
reflects when arithmetic with an unsigned integer will wrap around, you
have two choices: quote something from the standard that says so, or
else admit that it doesn't exist (or, of course, continue as you are
now, making wild, unsupported accusations!)

--
Later,
Jerry.

The universe is a figment of its own imagination.

Jun 27 '08 #26

Jerry Coffin wrote:

In article <g2**********@aioe.org>, jk********@gmx.net says...

[ ... ]

Whoever made the statement

"IOW, you know how overflow is handled when it does happen, but you
don't (portably) know when it'll happen."

was mistaken. This is the statement I responded to. Snipping it away over
and over again will not change that.

It is not what you responded to,

The record clearly shows that this is precisely the statement I responded
to. I would quote my response posting in full, but let me just paste a link
to Google instead (apologies: I had to break the line to please my
newsreader):

http://groups.google.com/group/comp....ee/browse_frm/
thread/52f4c25b71dee3a5/8911caf6152e4b1e?rnum=11
&_done=%2Fgroup%2Fcomp.lang.c%2B%2B%2Fbrowse_frm%2 Fthread
%2F52f4c25b71dee3a5%3F#doc_5c3e4897ec30f171

If there is any meaning to using bottom-posting as opposed to top-posting,
it should be clear that I responded to no other claim than the above.

and it IS correct. You do NOT know (a
priori) exactly when it'll happen. Nothing in limits.h changes that.

Nothing in <limits> is needed for unsigned types. The standard guarantees
that

(unsigned int)(-1)

is 2^N-1 where N is the bitlength of unsigned int [4.7/2]. The corresponding
statements for other unsigned types are true, too.

I'd
quote the standard, but the problem is that this is simply a situation
in which there's nothing in the standard to quote. If you want to claim
that the number given as (for example) UINT_MAX in limits.h precisely
reflects when arithmetic with an unsigned integer will wrap around, you
have two choices:

Formally, I did not claim that numeric_limits<unsigned int>::max()
gives the same value as (unsigned int)(-1). However, you are correct that I
mentioned <limits> in response to the above, possibly creating the
impression that this was the case.

On the other hand, the main part of my response demonstrated a check for
overflow that does not rely on <limits>.

quote something from the standard that says so, or
else admit that it doesn't exist (or, of course, continue as you are
now, making wild, unsupported accusations!)

I don't make unsupported accusations. I just restore context by quoting.
Also, I disagree with the statement that we cannot know when overflow will
happen for unsigned types. If your case for that statement relies on
<limits> being underspecified, I think the above observation from [4.7/2]
should clear things up.
Best

Kai-Uwe Bux

Jun 27 '08 #27

In article <g2**********@aioe.org>, jk********@gmx.net says...

[ ... ]

and it IS correct. You do NOT know (a
priori) exactly when it'll happen. Nothing in limits.h changes that.

Nothing in <limits> is needed for unsigned types. The standard guarantees
that

(unsigned int)(-1)

is 2^N-1 where N is the bitlength of unsigned int [4.7/2]. The corresponding
statements for other unsigned types are true, too.

The point of my statement was that you don't know N ahead of time.
Nothing you've said changes that.

[ ... ]

On the other hand, the main part of my response demonstrated a check for
overflow that does not rely on <limits>.

Yes, but it doesn't tell you ahead of time, when it'll happen, because
you don't know what the size of any specific integer type will be.

quote something from the standard that says so, or
else admit that it doesn't exist (or, of course, continue as you are
now, making wild, unsupported accusations!)

I don't make unsupported accusations. I just restore context by quoting.

You did make unsupported accusations, and you still haven't quoted a
single thing from the standard to support them. You won't either,
because they AREN'T there.

Also, I disagree with the statement that we cannot know when overflow will
happen for unsigned types. If your case for that statement relies on
<limits> being underspecified, I think the above observation from [4.7/2]
should clear things up.

Then you haven't got a clue of what you're talking about! The quote
above does NOT tell you the size of any unsigned integer type, which is
exactly what you need before you know when wraparound will happen.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Jun 27 '08 #28

Jerry Coffin wrote:

In article <g2**********@aioe.org>, jk********@gmx.net says...

[ ... ]

For example, on one (admittedly old) compiler, <limits.h> said that
SCHAR_MIN was -127 -- but this was on a twos-complement machine
where the limit was really -128. At the time the committee discussed
it, and at least from what I heard, agreed that this didn't fit what
they wanted, but DID conform with the requirements of the standard.

Yes. And on that compiler, SCHAR_MIN _was_ -127. That _means_ that the
implementation made no guarantees about behavior when computations
reach -128 even though experiments or documentation about the computer
architecture suggest you could get there. The point is that SCHAR_MIN
_defines_ what counts as overflow for the signed char type.

The problem is, that the wording for that part of the standard has
remained essentially the same, and still doesn't require that the values
in limits.h (or equivalents inside of namespace std) be truly accurate
in telling you the limits of the implementation. The implementation
clearly DOES have to provide AT LEAST the range specified, but there is
_nothing_ to prevent it from exceeding that range.

In short: contrary to your claim, you can portably detect overflow (and
it is particularly easy for unsigned types, where issues like the one
with SCHAR_MIN cannot happen).

Yes, it CAN happen for unsigned types. An implementation could (for
example) supply 32-bit integers, but claim that UINT_MAX was precisely 4
billion.

Are you sure this can happen on a standard conforming implementation?

In [18.2.1.2/4], the standard describes

numeric_limits<T>::max() throw();

as returning the "maximum finite value". In a footnote, it also requires
that UINT_MAX agrees with numeric_limits<unsigned int>::max(). In the
suggested implementation, 4.000.000.001 would be a valid unsigned int
bigger than numeric_limits<unsigned int>::max(), which contradicts the
standard. (Admittedly, I do not know whether footnotes like that are
normative. But, in any case, numeric_limits<T>::max() should give you the
maximum finite value portably according to the standard.)
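
One way to see what is at stake: under the reading above, the three ways of naming the largest unsigned int all agree. A small check, with the function name limits_agree made up for the example:

```cpp
#include <climits>
#include <limits>

// Returns true when UINT_MAX, numeric_limits<unsigned int>::max() and the
// converted value (unsigned int)(-1) all name the same number.
// limits_agree is a hypothetical name used only in this sketch.
inline bool limits_agree()
{
    const unsigned int wrapped = static_cast<unsigned int>(-1);
    return wrapped == UINT_MAX
        && wrapped == std::numeric_limits<unsigned int>::max();
}
```

An implementation that reported UINT_MAX as 4 billion for a 32-bit unsigned int would make this return false.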

The unsigned integer in question would NOT wrap at precisely 4
billion (and, in fact, an implementation that did would not conform).

Agreed, but I think the implementation would be non-conforming anyway.
[snip]
Best

Kai-Uwe Bux

Jun 27 '08 #29

Jerry Coffin wrote:

In article <g2**********@aioe.org>, jk********@gmx.net says...

[ ... ]

and it IS correct. You do NOT know (a
priori) exactly when it'll happen. Nothing in limits.h changes that.

Nothing in <limits> is needed for unsigned types. The standard guarantees
that

(unsigned int)(-1)

is 2^N-1 where N is the bitlength of unsigned int [4.7/2]. The
corresponding statements for other unsigned types are true, too.

The point of my statement was that you don't know N ahead of time.
Nothing you've said changes that.

I am utterly confused now. Clearly we have different understandings of what
it means to "know (a priori) exactly when it [overflow] will happen".

I took your words to mean that I cannot determine at compile time the upper
bound for an unsigned type. The word "portably" in your phrase:

IOW, you know how overflow is handled when it does happen, but you don't
(portably) know when it'll happen.

seemed to indicate that, and your follow up remarks on the unreliability of
limits.h strengthened that interpretation in my mind (as I got the
impression that you maintained that the portability issue arose from
possible inaccurate bounds in <limits>).

Now, it appears that you had something different in mind. I am somewhat lost
at seeing what it might be.

[ ... ]

On the other hand, the main part of my response demonstrated a check for
overflow that does not rely on <limits>.

Yes, but it doesn't tell you ahead of time, when it'll happen, because
you don't know what the size of any specific integer type will be.

quote something from the standard that says so, or
else admit that it doesn't exist (or, of course, continue as you are
now, making wild, unsupported accusations!)

I don't make unsupported accusations. I just restore context by quoting.

You did make unsupported accusations,

I maintain that

(a) all my "accusations" were supported (they were all of the form "you
claimed ...", "you snipped ..." or some such thing, and all of them were
supported by quotes) and

(b) all my unsupported claims (of which I may have made a few) do not
qualify as accusations (since they are not criticisms targeted toward
anybody).
If you felt offended by any of my remarks, I am sorry.

and you still haven't quoted a single thing from the standard to support
them. You won't either, because they AREN'T there.

How and why would I support a claim of the form "you claimed ..." by a quote
from the standard?

Also, I disagree with the statement that we cannot know when overflow
will happen for unsigned types. If your case for that statement relies on
<limits> being underspecified, I think the above observation from [4.7/2]
should clear things up.

Then you haven't got a clue of what you're talking about! The quote
above does NOT tell you the size of any unsigned integer type, which is
exactly what you need before you know when wraparound will happen.

I am pretty certain that I know what I am talking about. It might however be
something that you were not talking about and I may have misinterpreted
your claim(s).

The above quote allows you to portably(!) determine the size of any unsigned
integer type at compile time. If that is not what you mean by "(portably)
know when [overflow will happen]", I was just misunderstanding you.
Best

Kai-Uwe Bux

Jun 27 '08 #30

In article <g2**********@aioe.org>, jk********@gmx.net says...

[ ... ]

I am utterly confused now. Clearly we have different understandings of what
it means to "know (a priori) exactly when it [overflow] will happen".

I took your words to mean that I cannot determine at compile time the upper
bound for an unsigned type. The word "portably" in your phrase:

IOW, you know how overflow is handled when it does happen, but you don't
(portably) know when it'll happen.

seemed to indicate that, and your follow up remarks on the unreliability of
limits.h strengthened that interpretation in my mind (as I got the
impression that you maintained that the portability issue arose from
possible inaccurate bounds in <limits>).

Yes and no -- as I thought my mention of the ladder of #if/#elif's would
make obvious, I was less concerned with compile time than with when you
write the code. Mention had been made previously (for one example) of
cryptographic algorithms that depend on wraparound happening at some
particular size (e.g. 32-bit for SHA-1).

The problem is that (without something like uint32_t, which isn't included
in the current version of C++) you can't easily pick a type that's
guaranteed to give you that. On a currently-typical 32-bit
implementation, unsigned int or unsigned long will do the trick -- but
then again, on a 64-bit implementation (hardly a rarity anymore either)
both of those might give you 64 bits instead of 32, causing an algorithm
that depends on wraparound at 32 bits to fail (unless you explicitly add
something like '& 0xffffffff' at the appropriate places).
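
A minimal sketch of that masking, assuming only that unsigned long has at least 32 bits (which the standard guarantees). The names add32 and rotl32 are chosen for the example; the rotate is the kind of operation SHA-1 relies on.

```cpp
// Arithmetic modulo 2^32 on top of unsigned long, which may be wider.

inline unsigned long add32(unsigned long a, unsigned long b)
{
    return (a + b) & 0xFFFFFFFFUL;     // discard any bits above bit 31
}

// Rotate a 32-bit value left by n bits, for 0 < n < 32.
inline unsigned long rotl32(unsigned long x, unsigned int n)
{
    x &= 0xFFFFFFFFUL;                 // ignore stray high bits in the input
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFFUL;
}
```

On a platform where unsigned long is exactly 32 bits the masks change nothing; where it is 64 bits they restore the wraparound the algorithm expects.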

At least to me, "portability" is mostly something you build into your
code -- i.e. writing the code so it produces correct results on
essentially any conforming implementation of C++ (or at least close to
conforming). An obvious example would be code written 20 years ago that
still works with every current C++ compiler I have handy today (and any
compiler it didn't work with I'd venture to guess was either badly
broken, or for a language other than C++).

For code like this, learning about wraparound at compile-time is FAR too
late. This code works (just fine) with compilers that didn't exist when
it was written. Unfortunately, writing code that way is a pain -- the
aforementioned #if/#elif ladder being part (but only part) of the
problem.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Jun 27 '08 #31

In article <g2**********@aioe.org>, jk********@gmx.net says...

[ ... ]

Yes, it CAN happen for unsigned types. An implementation could (for
example) supply 32-bit integers, but claim that UINT_MAX was precisely 4
billion.

Are you sure this can happen on a standard conforming implementation?

In [18.2.1.2/4], the standard describes

numeric_limits<T>::max() throw();

as returning the "maximum finite value". In a footnote, it also requires
that UINT_MAX agrees with numeric_limits<unsigned int>::max(). In the
suggested implementation, 4.000.000.001 would be a valid unsigned int
bigger than numeric_limits<unsigned int>::max(), which contradicts the
standard. (Admittedly, I do not know whether footnotes like that are
normative. But, in any case, numeric_limits<T>::max() should give you the
maximum finite value portably according to the standard.)

Yes. The problem is, that the standard is fairly specific in defining
the words and what they mean. For something to constitute a requirement,
it normally needs to include a word like "shall". As it's worded right
now, you have something that looks like a requirement, and is almost
certainly intended to be one, but really isn't.

Getting things like this right is a real pain. I once sent a list of
problems like this about the then-current draft of the C++ 0x standard,
and while I spent a fair amount of time on it, I know I didn't catch all
the problems, and probably not even the majority of them. (In case you
care, it's at:

http://groups.google.com/group/comp....54efb7bacb098c

There are a couple of these changes that _might_ not reflect the
original intent -- but most of them are simply changing non-normative
wording to normative wording. One example that occurred a number of
times was using "must" instead of "shall". In (at least most of) the
cited cases, these were _clearly_ intended to place requirements on the
implementation -- but the rules given in the standard itself made it
open to a LOT of question whether they could be considered requirements
or not.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Jun 27 '08 #32

James Kanze

On Jun 8, 3:12 pm, Kai-Uwe Bux <jkherci...@gmx.net> wrote:

[...]

b) With unsigned integers, you can check for overflow easily:

unsigned int a = ...;
unsigned int b = ...;
unsigned int sum = a + b;
if ( sum < a ) {
std::cout << "overflow happened.\n";
}

Which is fine for addition, but fails for other operators, like
multiplication.
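
For multiplication, the wrapped product cannot simply be compared with an operand, but a division before the multiply does the job. A sketch, with mul_overflows as a made-up name:

```cpp
// True if a * b would wrap around in unsigned int arithmetic.
// The test runs before the multiplication: for a != 0, the product
// exceeds the maximum exactly when b > max / a (integer division).
inline bool mul_overflows(unsigned int a, unsigned int b)
{
    const unsigned int max = static_cast<unsigned int>(-1);   // 2^N - 1
    return a != 0 && b > max / a;
}
```

This costs a division per check, which is noticeably more than the single comparison that suffices for addition.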

It's somewhat nice that you can check for the overflow _after_
you did the addition (this does not necessarily work with
signed types). Also, the check is usually a bit cheaper than
for signed types (in the case of addition, subtraction, and
division a single comparison is enough; I did not think too
hard about multiplication).
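
For comparison, a signed addition has to be checked before it is performed, since the overflow itself is undefined behavior. A sketch using the INT_MAX and INT_MIN macros; signed_add_overflows is a name invented here:

```cpp
#include <climits>

// True if a + b would overflow int. Nothing is added until we know it is
// safe, because signed overflow is undefined behavior.
inline bool signed_add_overflows(int a, int b)
{
    if (b > 0) return a > INT_MAX - b;   // sum would exceed INT_MAX
    if (b < 0) return a < INT_MIN - b;   // sum would fall below INT_MIN
    return false;                        // adding zero never overflows
}
```

Compared with the unsigned sum < a test, this needs a branch on the sign of b in addition to the comparison.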

Can division overflow? The only possible overflow I can think
of is if you divide by 0 (since no integral representations I
know of support infinity), and that's undefined behavior even on
an unsigned.

My understanding of the motivation behind undefined behavior is
that it allows the implementation to do something reasonable
(crash the program, for example). Regretfully, no
implementations today seem to take advantage of this. Which is
a shame, because at the machine instruction level, it's usually
very easy to check for overflow (signed or unsigned) after the
operation, but there's no way to access these instructions from
C++.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jun 27 '08 #33

Jerry Coffin wrote:

In article <g2**********@aioe.org>, jk********@gmx.net says...

[ ... ]

Yes, it CAN happen for unsigned types. An implementation could (for
example) supply 32-bit integers, but claim that UINT_MAX was precisely 4
billion.

Are you sure this can happen on a standard conforming implementation?

In [18.2.1.2/4], the standard describes

numeric_limits<T>::max() throw();

as returning the "maximum finite value". In a footnote, it also requires
that UINT_MAX agrees with numeric_limits<unsigned int>::max(). In the
suggested implementation, 4.000.000.001 would be a valid unsigned int
bigger than numeric_limits<unsigned int>::max(), which contradicts the
standard. (Admittedly, I do not know whether footnotes like that are
normative. But, in any case, numeric_limits<T>::max() should give you the
maximum finite value portably according to the standard.)

Yes. The problem is, that the standard is fairly specific in defining
the words and what they mean. For something to constitute a requirement,
it normally needs to include a word like "shall". As it's worded right
now, you have something that looks like a requirement, and is almost
certainly intended to be one, but really isn't.

I do not entirely agree with that interpretation of the standard. It is
correct that the C standard says that "shall" denotes a requirement. I do
not see, however, that one can infer that _only_ sentences
containing "shall" can constitute a requirement. E.g., deque::size() is
defined in [23.1/6] without "shall". Should one maintain that there is no
normative requirement that it returns the number of elements in the
container?

Now, I do see differences: (a) [23.1] is a section entitled "Container
requirements" and on the other hand, (b) [18.2.1.2/4] is not even a
complete sentence. However, many return clauses in the library do not
use "shall" and Section 17.3 that could tell us how they specify
requirements is only informational. I do hesitate to conclude that return
clauses are by and large non-normative. I feel the contention that the lack
of "shall" a priori renders a sentence non-normative is too radical an
interpretation. In any case, I would consider it a very, very lame excuse
of a compiler vendor to point out that there is a "shall" missing when
confronted with the contention that his implementation of
numeric_limits<unsigned>::max() is in violation of the standard.
I will admit that I am on shaky ground here. It appears that the C++
standard incorporates those rules by reference. First, I do not even see
that they are explicitly imported from the C standard, so the above might
be completely bogus. I do see that [1.2/2] incorporates all definitions
from ISO/IEC 2382, which I don't have.

Could you provide some more details about how those "shall" rules enter the
picture with C++ and how they are worded?

Getting things like this right is a real pain. I once sent a list of
problems like this about the then-current draft of the C++ 0x standard,
and while I spent a fair amount of time on it, I know I didn't catch all
the problems, and probably not even the majority of them. (In case you
care, it's at:

http://groups.google.com/group/comp....54efb7bacb098c

There are a couple of these changes that _might_ not reflect the
original intent -- but most of them are simply changing non-normative
wording to normative wording. One example that occurred a number of
times was using "must" instead of "shall". In (at least most of) the
cited cases, these were _clearly_ intended to place requirements on the
implementation -- but the rules given in the standard itself made it
open to a LOT of question whether they could be considered requirements
or not.

That's an impressive list. It would definitely be better if the standard was
more consistent. Your efforts are to be commended.
Thanks

Kai-Uwe Bux

Jun 27 '08 #34

Jerry Coffin wrote:

In article <g2**********@aioe.org>, jk********@gmx.net says...

[ ... ]

I am utterly confused now. Clearly we have different understandings of
what it means to "know (a priori) exactly when it [overflow] will
happen".

I took your words to mean that I cannot determine at compile time the
upper bound for an unsigned type. The word "portably" in your phrase:

IOW, you know how overflow is handled when it does happen, but you
don't (portably) know when it'll happen.

seemed to indicate that, and your follow up remarks on the unreliability
of limits.h strengthened that interpretation in my mind (as I got the
impression that you maintained that the portability issue arose from
possible inaccurate bounds in <limits>).

Yes and no -- as I thought my mention of the ladder of #if/#elif's would
make obvious, I was less concerned with compile time than with when you
write the code. Mention had been made previously (for one example) of
cryptographic algorithms that depend on wraparound happening at some
particular size (e.g. 32-bit for SHA-1).

Now, I understand. Sorry for the confusion. I didn't see that SHA-1 example
so I was considering the initial statement in isolation and my
interpretation headed down a different path from the beginning.

The problem is that (without something like uint32_t, which isn't included
in the current version of C++) you can't easily pick a type that's
guaranteed to give you that. On a currently-typical 32-bit
implementation, unsigned int or unsigned long will do the trick -- but
then again, on a 64-bit implementation (hardly a rarity anymore either)
both of those might give you 64 bits instead of 32, causing an algorithm
that depends on wraparound at 32 bits to fail (unless you explicitly add
something like '& 0xffffffff' at the appropriate places).

Right. Been there, done that.

At least to me, "portability" is mostly something you build into your
code -- i.e. writing the code so it produces correct results on
essentially any conforming implementation of C++ (or at least close to
conforming). An obvious example would be code written 20 years ago that
still works with every current C++ compiler I have handy today (and any
compiler it didn't work with I'd venture to guess was either badly
broken, or for a language other than C++).

For code like this, learning about wraparound at compile-time is FAR too
late. This code works (just fine) with compilers that didn't exist when
it was written. Unfortunately, writing code that way is a pain -- the
aforementioned #if/#elif ladder being part (but only part) of the
problem.

Actually, I think it isn't all that bad. What I did once is to define my own
arithmetic 32-bit unsigned integer type (using the &0xffffffff trick). With
compile time template tricks, you can eliminate the & 0xffffffff for those
platforms where unsigned long or unsigned int happen to have 32 bits. From
then on, I just use the special type for all those algorithms that really
need 32 bits.
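
A rough sketch of such a type follows. It is an illustration only, not the code referred to above: the class name u32 and its members are invented, only a few operators are shown, and the mask is applied unconditionally rather than removed by template selection.

```cpp
// A 32-bit unsigned integer that wraps at 2^32 on any conforming platform.
// u32 is a hypothetical name; the value is stored in an unsigned long,
// which the standard guarantees to have at least 32 bits.
class u32 {
public:
    u32(unsigned long v = 0) : value_(v & mask) {}

    unsigned long get() const { return value_; }

    friend u32 operator+(u32 a, u32 b) { return u32(a.value_ + b.value_); }
    friend u32 operator*(u32 a, u32 b) { return u32(a.value_ * b.value_); }
    friend bool operator==(u32 a, u32 b) { return a.value_ == b.value_; }

private:
    static const unsigned long mask = 0xFFFFFFFFUL;
    unsigned long value_;   // always kept in the range [0, 2^32 - 1]
};
```

Because every result passes through the constructor, the mask is applied in exactly one place, and algorithms written against u32 never mention it.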
Best

Kai-Uwe Bux

Jun 27 '08 #35

Pete Becker

On 2008-06-09 12:27:49 +0200, Kai-Uwe Bux <jk********@gmx.net> said:

I do not entirely agree with that interpretation of the standard. It is
correct that the C standard says that "shall" denotes a requirement. I do
not see, however, that one can infer that _only_ sentences
containing "shall" can constitute a requirement. E.g., deque::size() is
defined in [23.1/6] without "shall". Should one maintain that there is no
normative requirement that it returns the number of elements in the
container?

ISO guidelines say that "shall" imposes requirements, and that other
similar words in general don't. The C++ standard is a bit sloppy in
this regard. In particular, library requirements often are written in
the form "x is y" when the intention is "x shall be y".

Now, I do see differences: (a) [23.1] is a section entitled "Container
requirements" and on the other hand, (b) [18.2.1.2/4] is not even a
complete sentence. However, many return clauses in the library do not
use "shall" and Section 17.3 that could tell us how they specify
requirements is only informational. I do hesitate to conclude that return
clauses are by and large non-normative. I feel the contention that the lack
of "shall" a priori renders a sentence non-normative is too radical an
interpretation. In any case, I would consider it a very, very lame excuse
of a compiler vendor to point out that there is a "shall" missing when
confronted with the contention that his implementation of
numeric_limits<unsigned>::max() is in violation of the standard.

Right. The guideline calling for "shall" is for clarity; it's not a
requirement. The standard does have an internally consistent usage, but
it's sometimes confusing. That's on my list of things to work on.

I will admit that I am on shaky ground here. It appears that the C++
standard incorporates those rules by reference. First, I do not even see
that they are explicitly imported from the C standard, so the above might
be completely bogus. I do see that [1.2/2] incorporates all definitions
from ISO/IEC 2382, which I don't have.

Actually, it's none of the above. ISO's guidelines for drafting are in
a meta-document that applies to all standards, hence, does not need to
be explicitly mentioned.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)

Jun 27 '08 #36

James Kanze

On Jun 9, 12:51 pm, Kai-Uwe Bux <jkherci...@gmx.net> wrote:

[...]

Actually, I think it isn't all that bad. What I did once is to
define my own arithmetic 32-bit unsigned integer type (using
the &0xffffffff trick). With compile time template tricks, you
can eliminate the & 0xffffffff for those platforms where
unsigned long or unsigned int happen to have 32 bits.

There's no need for any template tricks. Just specify -O (or
whatever it takes for optimization) in the command line of the
compiler. (For that matter, I expect most compilers optimize
this correctly even without -O.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jun 27 '08 #37

In article <g2**********@aioe.org>, jk********@gmx.net says...

[ ... ]

I do not entirely agree with that interpretation of the standard. It is
correct that the C standard says that "shall" denotes a requirement. I do
not see, however, that one can infer that _only_ sentences
containing "shall" can constitute a requirement. E.g., deque::size() is
defined in [23.1/6] without "shall". Should one maintain that there is no
normative requirement that it returns the number of elements in the
container?

Unfortunately, yes, at least probably.

Now, I do see differences: (a) [23.1] is a section entitled "Container
requirements" and on the other hand, (b) [18.2.1.2/4] is not even a
complete sentence. However, many return clauses in the library do not
use "shall" and Section 17.3 that could tell us how they specify
requirements is only informational. I do hesitate to conclude that return
clauses are by and large non-normative. I feel the contention that the lack
of "shall" a priori renders a sentence non-normative is too radical an
interpretation. In any case, I would consider it a very, very lame excuse
of a compiler vendor to point out that there is a "shall" missing when
confronted with the contention that his implementation of
numeric_limits<unsigned>::max() is in violation of the standard.

Yet, that's pretty much the argument that was used to justify defining
SCHAR_MIN as -127 on a two's complement machine...

The part that always struck me as odd about it was that 1) the intent
appeared quite obvious (at least to me), and 2) they almost certainly
put more work into arguing that their implementation was allowed than it
would have taken to just fix the header.
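
To make the two limits under discussion concrete, the following sketch
shows how one might check what a given implementation reports. The
function names are hypothetical and chosen only for illustration. It
says nothing about what the standard requires; it only reports what the
headers on hand actually define.

```cpp
#include <climits>
#include <limits>

// Does numeric_limits<signed char> agree with the <climits> macros?
inline bool schar_limits_agree() {
    return std::numeric_limits<signed char>::min() == SCHAR_MIN
        && std::numeric_limits<signed char>::max() == SCHAR_MAX;
}

// Does SCHAR_MIN use the full two's complement range (e.g. -128 for
// 8 bits), or does it stop one short (e.g. -127)?
inline bool schar_uses_full_range() {
    return SCHAR_MIN == -SCHAR_MAX - 1;
}

// Does numeric_limits<unsigned>::max() equal UINT_MAX, and is that the
// value obtained by converting -1 to unsigned (i.e. all value bits set)?
inline bool unsigned_max_is_all_ones() {
    return std::numeric_limits<unsigned>::max() == UINT_MAX
        && std::numeric_limits<unsigned>::max() == static_cast<unsigned>(-1);
}
```

An implementation of the kind described above, with SCHAR_MIN defined as
-127 on a two's complement machine, would return false from
schar_uses_full_range().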

The current C++ standard is based on the C95 standard. I believe in this
area C95 is identical to C89/90. Part of its definition of undefined
behavior reads:

If a "shall" or "shall not" requirement that appears outside
of a constraint is violated, the behavior is undefined.

So, at least according to that, the presence (or lack thereof) of the
word "shall" really does govern the meaning of a specific clause. OTOH,
I'll openly admit that even in that definition, it speaks of a "'shall'
or 'shall not' requirement", which implies that some other sort of
requirement is possible -- but nothing is ever said about what it means
if some other requirement is violated.

Specifically, there is nothing to say or imply that violating any other
requirement means anything about whether an implementation is conforming
or not. In the absence of such a definition, I can't see how there's any
real basis for saying it does mean anything about conformance. Under
those circumstances, I'm left with little alternative but to conclude
exactly as I said before: they look like requirements, and they were
probably intended to be requirements, but they're really not.

Looking at it from a slightly different direction, they're a bit like a
law that said "you must do this", but then went on to say "no penalty of
any sort may be imposed for violating this law." What you have left
isn't much of a law...

I will admit that I am on shaky ground here. It appears that the C++
standard incorporates those rules by reference. First, I do not even see
that they are explicitly imported from the C standard, so the above might
be completely bogus. I do see that [1.2/2] incorporates all definitions
from ISO/IEC 2382, which I don't have.

I tend to agree on that -- but without assuming they imported the
requirements from the C standard, we're left with _nothing_ in the C++
standard being normative. Unfortunately, I can't find _anything_ to
support the notion that the parts lacking a "shall" are normative,
beyond a bare assertion that "they obviously should be."

Looking at it from the other end for a moment, it's probably open to
question whether it makes any real difference in the end. AFAIK, there's
no longer anybody who attempts to do formal certification that compilers
conform to any of the language standards (in fact, I don't think any
such program has ever existed for C++). In the absence of such a
program, we're left with only market forces, which tend to be based more
on overall perceived quality of implementation than on technical details
of whether something constitutes conformance or not. I think it's fair
to say that almost anybody would consider something a lousy
implementation if it didn't implement requirements correctly, even in
the absence of "shall" or "shall not" in the requirement in the
standard. As such, it probably doesn't make much _practical_ difference
either way.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Jun 27 '08 #38

In article <a2e425cf-3dfc-473e-8fc2-f3311d1949a3@r66g2000hsg.googlegroups.com>,
ja*********@gmail.com says...

[ ... ]

I think that the current C standard does pretty much
limit what a compiler is allowed to do here. And given the time
that this has been established practice, I don't think we have
to worry too much about an error or a lack of conformance here.
(I, too, have seen SCHAR_MIN equal to -127 on a 2's complement
machine. But that was a long, long time ago; I don't think it's
an issue today, and given the tightened up wording in C99, I
would consider it an error.)

I agree that we no longer have to worry much about it -- and while I
agree that the wording in C99 has been tightened, I'm not entirely
convinced that it's quite enough to make it a requirement. OTOH, as I
said previously in this thread, I'm not entirely convinced that it
really matters either.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Jun 27 '08 #39