~0 undefined?

Does ~0 yield undefined behavior? C++03 section 5 paragraph 5 seems to
suggest so:

If during the evaluation of an expression, the result is not
mathematically defined or not in the range of representable values
for its type, the behavior is undefined [...]

The description of unary ~ (C++03 section 5.3.1 paragraph 8):

The operand of ~ shall have integral or enumeration type; the
result is the one's complement of its operand. Integral promotions
are performed. The type of the result is the type of the promoted
operand. [...]

But perhaps "one's complement" means the value that type would have with
all bits inverted, rather than the mathematical result of inverting all
bits in the binary representation. For example, on a machine with 32-bit
int, does one's complement of 0 (attempt to) have the value 2^31-1, which
can't be represented in a signed int and is thus undefined, or does it
have the value of whatever a signed int with all set bits would have (-1
on a two's complement machine)?

I used the ~0 case for simplicity; in practice, this issue might occur
when ANDing with the complement of a mask, for example n&=~0x0F to clear
the low 4 bits of n, or ~n&0x0F to find the inverted low 4 bits of n.
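
To make the two mask idioms concrete, here is a minimal sketch. It assumes unsigned operands, which sidesteps the signedness question entirely; the function names are invented for illustration and are not from any library.

```cpp
#include <cassert>

// Hypothetical helpers: with unsigned operands every step is well defined.
unsigned clear_low_nibble(unsigned n)
{
    return n & ~0x0Fu;   // clears the low 4 bits of n
}

unsigned inverted_low_nibble(unsigned n)
{
    return ~n & 0x0Fu;   // the low 4 bits of n, inverted
}

void check_mask_idioms()
{
    assert(clear_low_nibble(0x1234u) == 0x1230u);
    assert(inverted_low_nibble(0x1234u) == 0x0Bu);   // 0x4 is 0100, inverted 1011
}
```

The open question in this post is what happens when the same expressions are written with plain int operands, as in n&=~0x0F.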

Oct 20 '08 #1

Victor Bazarov

blargg wrote:

Does ~0 yield undefined behavior? C++03 section 5 paragraph 5 seems to
suggest so:

>If during the evaluation of an expression, the result is not
mathematically defined or not in the range of representable values
for its type, the behavior is undefined [...]

The description of unary ~ (C++03 section 5.3.1 paragraph 8):

>The operand of ~ shall have integral or enumeration type; the
result is the one's complement of its operand. Integral promotions
are performed. The type of the result is the type of the promoted
operand. [...]

But perhaps "one's complement" means the value that type would have with
all bits inverted, rather than the mathematical result of inverting all
bits in the binary representation. For example, on a machine with 32-bit
int, does one's complement of 0 (attempt to) have the value 2^31-1, which
can't be represented in a signed int and is thus undefined,

Uh... Sorry, could you perhaps elaborate, why (2^31 - 1) can't be
represented? Or did you mean (2^32 - 1)?

If the resulting value is greater than can be represented in 'int', the
compiler will create the code to promote it first to 'unsigned', then to
'long', then to 'unsigned long', IIRC. So, if ~0 cannot for some reason
be represented in an int, it might become the (unsigned){all bits set}
value.

or does it
have the value of whatever a signed int with all set bits would have (-1
on a two's complement machine)?

That's what I'd expect.

I used the ~0 case for simplicity; in practice, this issue might occur
when ANDing with the complement of a mask, for example n&=~0x0F to clear
the low 4 bits of n, or ~n&0x0F to find the inverted low 4 bits of n.

Actually, on 2's complement, we use -1 for the "all bits set"...
Perhaps we should switch to ~0 (more portable?)

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Oct 20 '08 #2

James Kanze

On Oct 20, 7:51 pm, blargg....@gishpuppy.com (blargg) wrote:

Does ~0 yield undefined behavior?

No.

C++03 section 5 paragraph 5 seems to suggest so:

If during the evaluation of an expression, the result is not
mathematically defined or not in the range of representable
values for its type, the behavior is undefined [...]

The description of unary ~ (C++03 section 5.3.1 paragraph 8):

The operand of ~ shall have integral or enumeration type;
the result is the one's complement of its operand. Integral
promotions are performed. The type of the result is the type
of the promoted operand. [...]

But perhaps "one's complement" means the value that type would
have with all bits inverted, rather than the mathematical
result of inverting all bits in the binary representation.

It's not really that clear what to expect on a machine not using
2's complement, but at the worst, it's unspecified or
implementation defined---not undefined behavior. (In general, I
would recommend avoiding ~, | and & on signed types.)

For example, on a machine with 32-bit int, does one's
complement of 0 (attempt to) have the value 2^31-1, which
can't be represented in a signed int and is thus undefined, or
does it have the value of whatever a signed int with all set
bits would have (-1 on a two's complement machine)?

The wording is a bit sloppy, but what it doubtlessly means is
that you get a value with all bits set to one (in the specified
type). What that value is, of course, is probably
implementation dependent; it is -1 on a 2's complement machine,
but could very easily be 0 elsewhere.

I used the ~0 case for simplicity; in practice, this issue
might occur when ANDing with the complement of a mask, for
example n&=~0x0F to clear the low 4 bits of n, or ~n&0x0F to
find the inverted low 4 bits of n.

As long as the sign bit is 0, the behavior should be well
defined, with no ambiguities.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Oct 20 '08 #3

blargg

In article <gd**********@news.datemas.de>, Victor Bazarov
<v.********@comAcast.net> wrote:

blargg wrote:
Does ~0 yield undefined behavior? C++03 section 5 paragraph 5 seems to
suggest so:

If during the evaluation of an expression, the result is not
mathematically defined or not in the range of representable values
for its type, the behavior is undefined [...]
The description of unary ~ (C++03 section 5.3.1 paragraph 8):

The operand of ~ shall have integral or enumeration type; the
result is the one's complement of its operand. Integral promotions
are performed. The type of the result is the type of the promoted
operand. [...]
But perhaps "one's complement" means the value that type would have with
all bits inverted, rather than the mathematical result of inverting all
bits in the binary representation. For example, on a machine with 32-bit
int, does one's complement of 0 (attempt to) have the value 2^31-1, which
can't be represented in a signed int and is thus undefined,

Uh... Sorry, could you perhaps elaborate, why (2^31 - 1) can't be
represented? Or did you mean (2^32 - 1)?

Yeah, (2^32 - 1); I noticed just after I posted.

If the resulting value is greater than can be represented in 'int', the
compiler will create the code to promote it first to 'unsigned', then to
'long', then to 'unsigned long', IIRC. So, if ~0 cannot for some reason
be represented in an int, it might become the (unsigned){all bits set}
value.

Not as I understand it, where this only occurs when selecting what type a
literal will be. If what you described were the case, the type of an
expression would depend on its run-time value, for example if i were an
int, the type of the expression i+1 would be an int unless i contained
INT_MAX, where it would be of type unsigned int. This is clearly not the
case, since C++ is statically-typed.
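
A minimal sketch of that point, assuming a C++11 compiler (decltype and static_assert are not available in C++03, so this is an illustration rather than code that would have compiled at the time). The function name is made up for the example.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical check: expression types are fixed at compile time,
// whatever value the operands hold at run time.
void check_static_types()
{
    int i = 0;
    (void)i;   // only used in unevaluated operands below

    // The type of i + 1 is int, regardless of the run-time value of i.
    static_assert(std::is_same<decltype(i + 1), int>::value, "i + 1 is int");

    // Choosing a type by value is a rule for literals only, applied at compile time.
    static_assert(std::is_same<decltype(0), int>::value, "0 is int");
    static_assert(std::is_same<decltype(0U), unsigned>::value, "0U is unsigned int");
    static_assert(std::is_same<decltype(0L), long>::value, "0L is long");

    assert(sizeof(i + 1) == sizeof(int));
}
```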

or does it
have the value of whatever a signed int with all set bits would have (-1
on a two's complement machine)?

That's what I'd expect.

The problem is that most compilers implement conversion of a value to a
signed int as a no-op, that is, simply to reinterpret the bits as being in
two's complement.

I used the ~0 case for simplicity; in practice, this issue might occur
when ANDing with the complement of a mask, for example n&=~0x0F to clear
the low 4 bits of n, or ~n&0x0F to find the inverted low 4 bits of n.

Actually, on 2's complement, we use -1 for the "all bits set"...
Perhaps we should switch to ~0 (more portable?)

It seems to me that ~0 is actually less-portable. As far as I know,
converting -1 to an unsigned type is guaranteed to give a value with all
bits set, since that conversion is guaranteed to give you a two's
complement representation in the unsigned result, even if the machine
doesn't use such a representation (C++03 section 4.7 paragraph 2).
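
As a sketch of that guarantee: conversion to an unsigned type is defined modulo 2^N, so -1 becomes the largest value of the destination type whatever the signed representation is. The function names are hypothetical.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: -1 converted to unsigned long is ULONG_MAX by definition.
unsigned long all_bits_ulong()
{
    return static_cast<unsigned long>(-1);
}

void check_minus_one_conversion()
{
    assert(static_cast<unsigned char>(-1) == UCHAR_MAX);
    assert(static_cast<unsigned>(-1) == UINT_MAX);
    assert(all_bits_ulong() == ULONG_MAX);
}
```

No bitwise operator is applied to a signed operand here, so the trapping and negative-zero concerns raised elsewhere in the thread do not come up.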

Oct 20 '08 #4

Juha Nieminen

Victor Bazarov wrote:

Actually, on 2's complement, we use -1 for the "all bits set"... Perhaps
we should switch to ~0 (more portable?)

But will it work properly? Assume that in some system 'long' is a
larger type than 'int'. Will this work?

long value1 = ~0;
unsigned long value2 = ~0;

What kind of promotion chain is ~0 subjected to here? Will 'value1'
and 'value2' end up having all bits set?

Oct 20 '08 #5

Victor Bazarov

James Kanze wrote:

[..]
The wording is a bit sloppy, but what it doubtlessly means is
that you get a value with all bits set to one (in the specified
type). What that value is, of course, is probably
implementation dependent; it is -1 on a 2's complement machine,
but could very easily be 0 elsewhere.

Where? C++ only supports three representations, the 1's complement, the
2's complement, and the signed magnitude.

[..]

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Oct 20 '08 #6

Victor Bazarov

Juha Nieminen wrote:

Victor Bazarov wrote:
>Actually, on 2's complement, we use -1 for the "all bits set"... Perhaps
we should switch to ~0 (more portable?)

But will it work properly? Assume that in some system 'long' is a
larger type than 'int'. Will this work?

long value1 = ~0;
unsigned long value2 = ~0;

What kind of promotion chain is ~0 subjected to here? Will 'value1'
and 'value2' end up having all bits set?

Probably not. It is generally better to use the literals of the same
type, IOW

long value1 = ~0L;
unsigned long value2 = ~0UL;

, to avoid specifically the situations where the result depends on some
implementation-defined behaviour[s].

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Oct 20 '08 #7

James Kanze

On Oct 20, 8:38 pm, Victor Bazarov <v.Abaza...@comAcast.net> wrote:

blargg wrote:
Does ~0 yield undefined behavior? C++03 section 5 paragraph 5 seems to
suggest so:

If during the evaluation of an expression, the result is not
mathematically defined or not in the range of representable values
for its type, the behavior is undefined [...]

The description of unary ~ (C++03 section 5.3.1 paragraph 8):

The operand of ~ shall have integral or enumeration type; the
result is the one's complement of its operand. Integral promotions
are performed. The type of the result is the type of the promoted
operand. [...]

But perhaps "one's complement" means the value that type would have with
all bits inverted, rather than the mathematical result of inverting all
bits in the binary representation. For example, on a machine with 32-bit
int, does one's complement of 0 (attempt to) have the value 2^31-1, which
can't be represented in a signed int and is thus undefined,

Uh... Sorry, could you perhaps elaborate, why (2^31 - 1) can't be
represented? Or did you mean (2^32 - 1)?

If the resulting value is greater than can be represented in
'int', the compiler will create the code to promote it first
to 'unsigned', then to 'long', then to 'unsigned long', IIRC.
So, if ~0 cannot for some reason be represented in an int, it
might become the (unsigned){all bits set} value.

No. That's the way the compiler behaves for integral literal
for an octal or hexadecimal constant. (For a decimal constant,
the results will never be unsigned.) In this case, the integral
literal is 0---which can't possibly overflow anything, and so
has type int. What we have here is an expression, with an
operator applied to an int. What blargg is doubtlessly
referring to is the statement in §5 that "If during the
evaluation of an expression, the result is not mathematically
defined or not in the range of representable values for its
type, the behavior is undefined, unless such an expression
appears where an integral constant expression is required
(5.19), in which case the program is ill-formed."

The problem here is that the "one's complement" operation
doesn't really define a numeric result, but rather a
manipulation on the underlying representation. So I don't think
that this statement can be applied: the ~ operator changes the
bits in the representation, and the "result" is whatever value
the changed bits happen to represent. Except that it's not
really too clear what that means, either; what happens if the
changed bits would be a trapping representation? (E.g. a 1's
complement machine that traps on negative 0's.)

Because of such issues, I tend to avoid using ~, | or & on
signed integral types.

or does it

have the value of whatever a signed int with all set bits
would have (-1 on a two's complement machine)?

That's what I'd expect.

That's doubtlessly what was intended. On a two's complement
machine. Now try it on a one's complement machine which traps
negative 0's.

The C standard has cleared this up considerably. According to
the C99 standard:

If the implementation supports negative zeros, they
shall be generated only by:
-- the &, |, ^, ~, <<, and >> operators with arguments
that produce such a value;
-- the +, -, *, /, and % operators where one argument
is a negative zero and the result is zero;
-- compound assignment operators based on the above
cases.
It is unspecified whether these cases actually generate
a negative zero or a normal zero, and whether a negative
zero becomes a normal zero when stored in an object.

If the implementation does not support negative zeros,
the behavior of the &, |, ^, ~, <<, and >> operators
with arguments that would produce such a value is
undefined.

The second paragraph above is particularly significant: ~0
*is* undefined behavior on an implementation which doesn't
support negative zeros. (Note that the text immediately
preceding the above makes it clear that it is talking about
negative zero representations in one's complement or signed
magnitude; the "doesn't support negative zeros" only applies
in the case where they exist in the representation.)

I used the ~0 case for simplicity; in practice, this
issue might occur when ANDing with the complement of a
mask, for example n&=~0x0F to clear the low 4 bits of n,
or ~n&0x0F to find the inverted low 4 bits of n.

Actually, on 2's complement, we use -1 for the "all bits
set"... Perhaps we should switch to ~0 (more portable?)

If you're worried about bits, the *only* way you can be sure
of anything where the highest bit might not be 0 is to use
unsigned types. For signed types, ~0 can result in
undefined behavior. (In other words, ~0 is not portable, ~0U
is. As is -1, if that's what you want.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Oct 20 '08 #8

blargg

In article
<cd**********************************@w24g2000prd.googlegroups.com>, James
Kanze <ja*********@gmail.com> wrote:

blargg wrote:
>Does ~0 yield undefined behavior? C++03 section 5 paragraph 5 seems to
suggest so:

>>If during the evaluation of an expression, the result is not
mathematically defined or not in the range of representable values
for its type, the behavior is undefined [...]

The description of unary ~ (C++03 section 5.3.1 paragraph 8):

>>The operand of ~ shall have integral or enumeration type; the
result is the one's complement of its operand. Integral promotions
are performed. The type of the result is the type of the promoted
operand. [...]

But perhaps "one's complement" means the value that type would have with
all bits inverted, rather than the mathematical result of inverting all
bits in the binary representation. For example, on a machine with 32-bit
int, does one's complement of 0 (attempt to) have the value 2^31-1, which
can't be represented in a signed int and is thus undefined,

[...]

The problem here is that the "one's complement" operation doesn't
really define a numeric result, but rather a manipulation on the
underlying representation. So I don't think that this statement
[C++03 section 5 paragraph 5] can be applied: the ~ operator
changes the bits in the representation, and the "result" is
whatever value the changed bits happen to represent. Except that
it's not really too clear what that means, either; what happens if
the changed bits would be a trapping representation? (E.g. a 1's
complement machine that traps on negative 0's.)

So you're saying that n = ~n, where n is an int, could be implemented as

for ( size_t i = 0; i < sizeof n; ++i )
    reinterpret_cast<unsigned char*>(&n)[i] ^= (unsigned char) -1;

where it's up to the implementation as to the new value n takes on. This
would imply that the following are guaranteed to hold true, regardless of
n's signedness or sign:

~~n == n
(n & ~n) == 0
(n ^ ~n) == ~0
(n & ~0) == n
(n & ~1) == n - (n & 1)

This is the interpretation I really hope is the case.
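
For what it is worth, the five identities can be checked directly for unsigned operands, where nothing in the thread disputes them. The sketch below does only that; the function name is made up, and it says nothing about whether the same identities are guaranteed for signed n.

```cpp
#include <cassert>

// Hypothetical check: true when all five identities hold for an unsigned value.
bool identities_hold(unsigned n)
{
    return ~~n == n
        && (n & ~n) == 0u
        && (n ^ ~n) == ~0u
        && (n & ~0u) == n
        && (n & ~1u) == n - (n & 1u);
}

void check_identities()
{
    assert(identities_hold(0u));
    assert(identities_hold(1u));
    assert(identities_hold(12345u));
    assert(identities_hold(~0u));
}
```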

Because of such issues, I tend to avoid using ~, | or & on signed
integral types.

That would require ensuring all bitwise constants are unsigned, by
suffixing with a U, casting, or storing in an unsigned type before use,
which seems somewhat tedious. As in my example, even code for simply
testing the low bit would require a nasty U: n&1U.

Oct 20 '08 #9

James Kanze

On Oct 20, 9:42 pm, Victor Bazarov <v.Abaza...@comAcast.net> wrote:

James Kanze wrote:
[..]
The wording is a bit sloppy, but what it doubtlessly means
is that you get a value with all bits set to one (in the
specified type). What that value is, of course, is probably
implementation dependent; it is -1 on a 2's complement
machine, but could very easily be 0 elsewhere.

Where? C++ only supports three representations, the 1's
complement, the 2's complement, and the signed magnitude.

It's not at all clear what C++ supports. C++ took the defective
wording of C90, and modified it slightly to make it even worse.
C99 straightened it out, and does only allow three
representations. And as to where the result of ~0 would not be
-1, exactly what I said: "elsewhere [than on a 2's complement
architecture]". On 1's complement, it would be a negative 0.
(The C99 standard explicitly says that it may result in a
negative 0.) Depending on the implementation, a negative 0
either behaves exactly like a positive 0 in arithmetic operations
(but not bitwise operations), or it is undefined behavior.

So the answer to blargg's original question is, somewhat
surprisingly, that ~0 may result in undefined behavior. (Except
that since it is a constant expression, it doesn't cause
undefined behavior, but makes the program ill formed.)

More generally, the results of any of the bitwise operators --
~, |, &, ^, >> or <<, or their <op>= forms -- may result in
undefined behavior. At least according to the C99 standard; the
C++ standard doesn't really say anything meaningful about what
they do.

I currently have a paper before the committee concerning defects
in the specification of the representation of integral types,
with the proposed correction to adopt the wording from C99
(can't see any reason for C and C++ to differ here); I'll update
it to consider these issues as well. (For example, in C99, ~ is
defined as the "bitwise complement", not the 1's complement.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Oct 21 '08 #10

Juha Nieminen

Victor Bazarov wrote:

Probably not. It is generally better to use the literals of the same
type, IOW

long value1 = ~0L;
unsigned long value2 = ~0UL;

But then you run into this problem:

template<typename Integral>
void foo()
{
Integral value = ...; // what?
}

Oct 21 '08 #11

Jeff Schwab

Juha Nieminen wrote:

Victor Bazarov wrote:
>Probably not. It is generally better to use the literals of the same
type, IOW

long value1 = ~0L;
unsigned long value2 = ~0UL;

But then you run into this problem:

template<typename Integral>
void foo()
{
Integral value = ...; // what?
}

Integral value = ~Integral( );
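
One caveat with this form: for types narrower than int, the operand of ~ is promoted first, so ~Integral() is evaluated on a signed int and the result is converted back afterwards. A hedged alternative for the template case, assuming a C++11 compiler for static_assert and assuming the caller only wants unsigned types, is to ask numeric_limits for the maximum value instead. The function name is invented for the example.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Hypothetical helper: largest value of an unsigned integer type,
// i.e. every value bit set, without applying ~ to any signed operand.
template <typename Integral>
Integral all_value_bits_set()
{
    static_assert(std::numeric_limits<Integral>::is_integer,
                  "Integral must be an integer type");
    static_assert(!std::numeric_limits<Integral>::is_signed,
                  "Integral must be unsigned");
    return std::numeric_limits<Integral>::max();
}

void check_all_value_bits_set()
{
    assert(all_value_bits_set<unsigned char>() == UCHAR_MAX);
    assert(all_value_bits_set<unsigned long>() == ULONG_MAX);
}
```

Instantiating it with a signed type such as int fails to compile, which is the point of the second assertion.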

Oct 21 '08 #12

James Kanze

On Oct 20, 9:38 pm, Juha Nieminen <nos...@thanks.invalid> wrote:

Victor Bazarov wrote:
Actually, on 2's complement, we use -1 for the "all bits
set"... Perhaps we should switch to ~0 (more portable?)

But will it work properly? Assume that in some system 'long'
is a larger type than 'int'. Will this work?

    long value1 = ~0;
    unsigned long value2 = ~0;

What kind of promotion chain is ~0 subjected to here? Will
'value1' and 'value2' end up having all bits set?

Maybe. On a 2's complement machine, yes. Otherwise, the second
will always result in something else, and the first will also if
long is larger than int. Supposing the code is even legal, see
else-thread.

The promotion chain is simple: the usual one. The literal 0 has
type int. So the ~ is applied to an int, resulting in an int.
That int is then converted to the target type. For the most
common case, 2's complement, it will work, because ~0 is the
same thing as -1. When you convert -1 to a long, the value
doesn't change, and -1 has all bits set, even in a long. When
you convert it to unsigned long, the results are defined as
(ULONG_MAX + 1) - 1 (mathematically), which is, of course,
ULONG_MAX, which has all bits set. For other representations,
you'll get something else.
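
The chain described above can be sketched as follows, assuming a C++11 compiler for decltype and static_assert. Only the unsigned cases are checked at run time, since those are the ones that behave the same on every representation; the function names are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

// The complement has the type of its (promoted) operand; the target type plays no part.
static_assert(std::is_same<decltype(~0), int>::value, "~0 is an int");
static_assert(std::is_same<decltype(~0U), unsigned>::value, "~0U is an unsigned int");
static_assert(std::is_same<decltype(~0UL), unsigned long>::value, "~0UL is an unsigned long");

// ~0U is UINT_MAX; converting it to unsigned long keeps that value,
// so the upper bits stay clear when long is wider than int.
unsigned long from_unsigned_int() { return ~0U; }

// ~0UL is complemented at the full width of unsigned long.
unsigned long from_unsigned_long() { return ~0UL; }

void check_promotion_chain()
{
    assert(from_unsigned_int() == UINT_MAX);
    assert(from_unsigned_long() == ULONG_MAX);
}
```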

Practically speaking, there is no portable way of getting a
signed int with all bits set, because such a thing could be a
trapping representation, resulting in undefined behavior.
(Strictly speaking: you can easily get one: memset( &theInt, -1,
sizeof(int) ). But accessing it could result in undefined
behavior, and there's really not much use in being able to
generate a value you cannot access.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Oct 21 '08 #13

James Kanze

On Oct 20, 9:45 pm, Victor Bazarov <v.Abaza...@comAcast.net> wrote:

Juha Nieminen wrote:
Victor Bazarov wrote:
Actually, on 2's complement, we use -1 for the "all bits
set"... Perhaps we should switch to ~0 (more portable?)

But will it work properly? Assume that in some system 'long'
is a larger type than 'int'. Will this work?

    long value1 = ~0;
    unsigned long value2 = ~0;

What kind of promotion chain is ~0 subjected to here? Will
'value1' and 'value2' end up having all bits set?

Probably not. It is generally better to use the literals of
the same type, IOW

    long value1 = ~0L;
    unsigned long value2 = ~0UL;

, to avoid specifically the situations where the result
depends on some implementation-defined behaviour[s].

But whether ~0L is a legal expression is already implementation
defined. (Other than that, I agree with your statement. Say
what you mean, and mean what you say. If you want a long, it's
0L, and if you want an unsigned long, it's 0UL.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Oct 21 '08 #14

James Kanze

On Oct 21, 3:33 pm, Juha Nieminen <nos...@thanks.invalid> wrote:

Victor Bazarov wrote:
Probably not. It is generally better to use the literals of
the same type, IOW

    long value1 = ~0L;
    unsigned long value2 = ~0UL;

But then you run into this problem:

template<typename Integral>
void foo()
{
    Integral value = ...; // what?
}

When, and doing what? Why would you want to initialize a
possibly signed int with all bits set, not knowing what value it
corresponds to, or even if it is a legal representation?

If you want the maximum value, that's what
std::numeric_limits<T>::max() is for.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Oct 21 '08 #15

Stephen Horne

On Tue, 21 Oct 2008 14:42:07 -0700 (PDT), James Kanze
<ja*********@gmail.com> wrote:

>But whether ~0L is a legal expression is already implementation
defined.

This seems odd.

To me, the bitwise operators manipulate bits - not (except as a
side-effect) the integers that those bits represent. In this
interpretation, the result of ~x is always well defined (all the bits
are toggled) - only the interpretation of the result in integer terms
is uncertain.

That is, I can imagine ~0L being interpreted as either -1 (2s
complement) or as -0x7FFFFFFF (one possible 32-bit sign-magnitude
form), but the bit pattern is the same either way.

If I only care about the bit pattern, I would expect the integer
interpretation of that bit pattern to be irrelevant - I wouldn't
expect to have to care whether it is signed or not, 2s complement or
sign-magnitude or whatever.

Oh well.

Oct 22 '08 #16

Richard Herring

In message <ns********************************@4ax.com>, Stephen Horne
<sh********@blueyonder.co.uk> writes

>On Tue, 21 Oct 2008 14:42:07 -0700 (PDT), James Kanze
<ja*********@gmail.com> wrote:

>>But whether ~0L is a legal expression is already implementation
defined.

This seems odd.

To me, the bitwise operators manipulate bits - not (except as a
side-effect) the integers that those bits represent. In this
interpretation, the result of ~x is always well defined (all the bits
are toggled) - only the interpretation of the result in integer terms
is uncertain.

That is, I can imagine ~0L being interpreted as either -1 (2s
complement) or as -0x7FFFFFFF (one possible 32-bit sign-magnitude
form), but the bit pattern is the same either way.

If I only care about the bit pattern, I would expect the integer
interpretation of that bit pattern to be irrelevant - I wouldn't
expect to have to care whether it is signed or not, 2s complement or
sign-magnitude or whatever.

But if you only care about the bit pattern, why would you be
representing it using a signed quantity in the first place?

>
Oh well.

--
Richard Herring

Oct 22 '08 #17

Juha Nieminen

James Kanze wrote:

On Oct 21, 3:33 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
>Victor Bazarov wrote:
>>Probably not. It is generally better to use the literals of
the same type, IOW

>> long value1 = ~0L;
unsigned long value2 = ~0UL;

> But then you run into this problem:

>template<typename Integral>
void foo()
{
Integral value = ...; // what?
}

When, and doing what? Why would you want to initialize a
possibly signed int with all bits set, not knowing what value it
corresponds to, or even if it is a legal representation?

Fine, wait for the next standard and add a template assertion to that
which makes sure that Integral is an unsigned type.

That wasn't really my point.

Oct 22 '08 #18

Juha Nieminen

James Kanze wrote:

But whether ~0L is a legal expression is already implementation
defined.

Could you quote the part of the standard which says that applying the
bitwise not operator to a signed integral is implementation defined?

Oct 22 '08 #19

James Kanze

On Oct 22, 11:38 am, Stephen Horne <sh006d3...@blueyonder.co.uk>
wrote:

On Tue, 21 Oct 2008 14:42:07 -0700 (PDT), James Kanze

<james.ka...@gmail.com> wrote:
But whether ~0L is a legal expression is already
implementation defined.

This seems odd.

To me, the bitwise operators manipulate bits - not (except as
a side-effect) the integers that those bits represent.

Yep. And that's what makes it undefined. By manipulating bits,
you can generate an illegal or unsupported bit pattern.
(Unsigned integers aren't allowed to have any, at least not that
you can manipulate with any C++ operators.)

In this interpretation, the result of ~x is always well
defined (all the bits are toggled) - only the interpretation
of the result in integer terms is uncertain.

And if the result is a trapping representation?

That is, I can imagine ~0L being interpreted as either -1 (2s
complement) or as -0x7FFFFFFF (one possible 32-bit
sign-magnitude form), but the bit pattern is the same either
way.

If I only care about the bit pattern, I would expect the
integer interpretation of that bit pattern to be irrelevant -
I wouldn't expect to have to care whether it is signed or not,
2s complement or sign-magnitude or whatever.

Signed integers can have trapping representations.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Oct 22 '08 #20

blargg

In article
<bb**********************************@v53g2000hsa.googlegroups.com>, James
Kanze <ja*********@gmail.com> wrote:

On Oct 22, 11:38 am, Stephen Horne <sh006d3...@blueyonder.co.uk>
wrote:
On Tue, 21 Oct 2008 14:42:07 -0700 (PDT), James Kanze

<james.ka...@gmail.com> wrote:
>But whether ~0L is a legal expression is already
>implementation defined.

This seems odd.

To me, the bitwise operators manipulate bits - not (except as
a side-effect) the integers that those bits represent.

Yep. And that's what makes it undefined. By manipulating bits,
you can generate an illegal or unsupported bit pattern.

OK, I get it now. Operator ~ is no different than <<, in that it can set
the sign bit even when all input values are positive (|, &, and ^ don't
suffer from this). It's unfortunate, but apparently unavoidable on
sign-magnitude architectures that disallow negative zero (with good
reason). So apparently, ~ applied to signed integers (even constants) is
worthy of a compiler warning in any code one wants to be portable, even
something as simple as this where i is always positive:

i &= ~1; // portable workaround: i -= i & 1

Oct 22 '08 #21

James Kanze

On Oct 22, 10:47 pm, blargg....@gishpuppy.com (blargg) wrote:

In article
<bbd0a53c-8f54-4541-88da-19a3ef282...@v53g2000hsa.googlegroups.com>, James

[...]

So apparently, ~ applied to signed integers (even constants)
is worthy of a compiler warning in any code one wants to be
portable, even something as simple as this where i is always
positive:

    i &= ~1; // portable workaround: i -= i & 1

Portable workaround:
i &= ~1U.
But what are the semantics of this if i is signed? Does it ever
make sense to do this if i is signed?

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Oct 22 '08 #22

blargg

In article
<bf**********************************@u46g2000hsc.googlegroups.com>, James
Kanze <ja*********@gmail.com> wrote:

On Oct 22, 10:47 pm, blargg....@gishpuppy.com (blargg) wrote:
In article
<bbd0a53c-8f54-4541-88da-19a3ef282...@v53g2000hsa.googlegroups.com>, James
[...]
So apparently, ~ applied to signed integers (even constants)
is worthy of a compiler warning in any code one wants to be
portable, even something as simple as this where i is always
positive:

i &= ~1; // portable workaround: i -= i & 1

Portable workaround:
i &=~1U.

Nifty. That is treated as

i = (int) ((unsigned) i & ~1U)

Since we are only dealing with positive i, the unsigned intermediate
result will be positive, so the conversion back to int won't overflow it.
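
Spelled out as a function, a minimal sketch of that expansion looks like this. The name is made up, and the precondition that i is non-negative is an assumption carried over from the discussion, not something the language enforces.

```cpp
#include <cassert>

// Hypothetical helper. Precondition (assumed): i >= 0.
// The complement is applied to an unsigned operand, and the masked result
// is no larger than i, so it always fits back into an int.
int clear_low_bit(int i)
{
    assert(i >= 0);
    return static_cast<int>(static_cast<unsigned>(i) & ~1U);
}

void check_clear_low_bit()
{
    assert(clear_low_bit(7) == 6);
    assert(clear_low_bit(6) == 6);
    assert(clear_low_bit(0) == 0);
    assert(clear_low_bit(7) == 7 - (7 & 1));   // same as the arithmetic workaround
}
```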

But what are the semantics of this if i is signed? Does it ever
make sense to do this if i is signed?

Maybe one wants to make a positive signed value even, or just clear the
low bit for whatever reason, even though the value isn't just a set of
bits.

Oct 23 '08 #23
