Bytes IT Community

"Interesting" C behaviours

In the last few days, I have discovered two "interesting" behaviours of
the C language, both of which are apparently correct. Could someone
please explain the reasoning behind them?

1. The operators '^', '&', and '|' have lower precedence than '==',
'!=', '>=', etc. I discovered this when the statement "if (array1[i] ^
array2[i] == 0xff)" failed to do what I expected. To me, it doesn't
make any sense to give the bitwise operators lower precedence than the
comparison operators. I can't see any situation where someone would want
to perform a bitwise operation on a truth value, but that is what the
language specifies for the above expression.

2. In C99, the expression 'a%b' where a<0 and b>0 will return a negative
result (K&R lets this be machine dependent). While it is technically
correct (-1 is congruent to 6 mod 7), it isn't exactly what most people
expect. Mathematically, the congruence classes modulo n are usually
written [0], [1], ..., [n-1]; the value -1 would be a member of the
congruence class [n-1]. So, for certain operations, I have to check
whether a number is negative and, if so, add the modulus to the residue
to get a sensible result. (For example, when subtracting two struct
timevals and passing the result to select(): select() will barf if the
tv_usec field is negative, at least on Linux, so I have to set it to
(a.tv_usec - b.tv_usec) % 1000000, and then add 1000000 if the result is
negative.)

Rennie
Nov 14 '05 #1
6 Replies



"Rennie deGraaf" <ca.ucalgary.cpsc@degraaf> wrote:
> In the last few days, I have discovered two "interesting" behaviours of
> the C language, both of which are apparently correct. Could someone
> please explain the reasoning behind them?
>
> 1. The operators '^', '&', and '|' have lower precedence than '==',
> '!=', '>=', etc.
You'd have to ask K and R about this. The problem is that the language was
designed by two people in a back room somewhere, and once set, the
precedence rules are very difficult to change.
> 2. In C99, the expression 'a%b' where a<0 and b>0 will return a negative
> result (K&R lets this be machine dependent).

That would probably be driven by architecture. The idea is that a % b would
compile to a single machine instruction (on K and R's original platform).
Just a guess, but I suspect this is the motivation.
Nov 14 '05 #2

In article <sBUpd.360758$%k.84068@pd7tw2no>
Rennie deGraaf <ca.ucalgary.cpsc@degraaf> wrote:
> In the last few days, I have discovered two "interesting" behaviours of
> the C language, both of which are apparently correct. Could someone
> please explain the reasoning behind them?
>
> 1. The operators '^', '&', and '|' have lower precedence than '==',
> '!=', '>=', etc. I discovered this when the statement "if (array1[i] ^
> array2[i] == 0xff)" failed to do what I expected. To me, it doesn't
> make any sense to give the bitwise operators lower precedence than the
> comparison operators. I can't see any situation where someone would want
> to perform a bitwise operation on a truth value, but that is what the
> language specifies for the above expression.
Dennis Ritchie has noted that, in Primeval C, there were no "&&" and
"||" operators at all, and the single "&" and single "|" operators
were overloaded, so that:

result = f() | g();

would call both f() and g(), and bitwise-OR the results together to
store in the variable "result", while:

if (f() | g())

would call f() first, and if the result was nonzero, would omit the
call to g() entirely (i.e., what || does now). Likewise:

result = f() & g();

always called both, while:

if (f() & g())

called g() only if f() returned a nonzero value. Stopping as soon
as the result is known is called "short-circuit behavior". Presumably
the operators' priorities were set based on the short-circuit
versions, which probably occurred more often.

At some point, it was deemed bogus to have a single operator for
two entirely separate meanings, so the short-circuit logical
operators were separated out into new && and || operators. But by
then there were perhaps a few dozens of kilobytes :-) of source
code, so the operator parsing was left unchanged.
> 2. In C99, the expression 'a%b' where a<0 and b>0 will return a negative
> result (K&R lets this be machine dependent). While it is technically
> correct (-1 is congruent to 6 mod 7), it isn't exactly what most people
> expect.
I think you have overly high expectations of "most people" here. :-)
> Mathematically, the congruence classes modulo n are usually
> expressed as [0], [1], ... [n-1].


True; but the "%" operator is really the "remainder after division"
operator. The goal is to have ((a/b)*b + (a%b)) == a, whenever b != 0.
To have ((-3) % 7) == 4, we would have to have ((-3) / 7) == -1,
but 0 is the most common result today for this division. Thus,
machines with "remainder after divide" instructions mostly produce
-3 here, and machines without such an instruction require computing
(a % b) via (a - (a / b) * b) in the first place.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #3

Malcolm wrote:
> "Rennie deGraaf" <ca.ucalgary.cpsc@degraaf> wrote
>
>> 2. In C99, the expression 'a%b' where a<0 and b>0 will return a negative
>> result (K&R lets this be machine dependent).
>
> That would probably be driven by architecture. The idea is that a % b would
> compile to a single machine instruction (on K and R's original platform).
> Just a guess, but I suspect this is the motivation.


It was architecture dependent in K&R, but C99 standardized it (see
http://home.tiscalinet.ch/t_wolf/tw/...html#Semantics,
section 25). I know that the x86 idiv instruction spits out negative
residues, but is the fact that a common architecture does something
unusual a reason to standardize a programming language on an unusual
behaviour? It would make more sense to me to work around the
architecture when it does something weird.

GCC, for instance, frequently doesn't even compile a%b to a single idiv
instruction - it compiles it to a big mess of imul, shift, and leal
instructions.

Rennie
Nov 14 '05 #4

Rennie deGraaf <ca.ucalgary.cpsc@degraaf> wrote:
> In the last few days, I have discovered two "interesting" behaviours of
> the C language, both of which are apparently correct. Could someone
> please explain the reasoning behind them?
>
> 1. The operators '^', '&', and '|' have lower precedence than '==',
> '!=', '>=', etc. I discovered this when the statement "if (array1[i] ^
> array2[i] == 0xff)" failed to do what I expected. To me, it doesn't
> make any sense to give the bitwise operators lower precedence than the
> comparison operators. I can't see any situation where someone would want
> to perform a bitwise operation on a truth value, but that is what the
> language specifies for the above expression.
I believe it is mostly historical baggage from the earliest C
compilers.


> 2. In C99, the expression 'a%b' where a<0 and b>0 will return a negative
> result (K&R lets this be machine dependent). While it is technically
> correct (-1 is congruent to 6 mod 7), it isn't exactly what most people
> expect. Mathematically, the congruence classes modulo n are usually
> expressed as [0], [1], ... [n-1]. The value -1 would be a member of the
> congruence class [n-1]. So, when I'm performing certain operations, I
> have to check if a number is negative, and if so, add the modulus to the
> residue to get a sensible result. (For example, when subtracting two
> struct timevals, and passing the result to select(). select() will barf
> if the tv_usec field is negative, at least on Linux, so I have to set it
> to (a.tv_usec-b.tv_usec)%1000000, and then add 1000000 if the result is
> negative.)


It should always be true that (a/b)*b+(a%b) == a, if b is non-zero.
When you have negative values involved the behaviour of the '%'
operator therefore depends on the behaviour of the '/' operator.
In C89 it was implementation-dependent which way '/' truncated when it
had negative operands. In C99 this was fixed as being towards zero.
I believe this change was made for compatibility with Fortran (which
has always done it the same way as C99 does.)
--
<Insert your favourite quote here.>
Erik Trulsson
er******@student.uu.se
Nov 14 '05 #5

In article <lJVpd.356735$nl.170987@pd7tw3no>,
Rennie deGraaf <ca.ucalgary.cpsc@degraaf> wrote:
> GCC, for instance, frequently doesn't even compile a%b to a single idiv
> instruction - it compiles it to a big mess of imul, shift, and leal
> instructions.


Because it is faster.
Nov 14 '05 #6

Erik Trulsson wrote:
> Rennie deGraaf <ca.ucalgary.cpsc@degraaf> wrote:
> (snip)
>> 2. In C99, the expression 'a%b' where a<0 and b>0 will return a negative
>> result (K&R lets this be machine dependent). While it is technically
>> correct (-1 is congruent to 6 mod 7), it isn't exactly what most people
>> expect.
It is what most hardware designers expect.
>> So, when I'm performing certain operations, I
>> have to check if a number is negative, and if so, add the modulus to the
>> residue to get a sensible result.


I do it by adding the modulus and an additional %:

a=(x%y+y)%y;

It is unknown if the branch penalty is more or less than the
cost of the extra divide. It is less typing, anyway.
> It should always be true that (a/b)*b+(a%b) == a, if b is non-zero.
> When you have negative values involved the behaviour of the '%'
> operator therefore depends on the behaviour of the '/' operator.
> In C89 it was implementation-dependent which way '/' truncated when it
> had negative operands. In C99 this was fixed as being towards zero.
> I believe this change was made for compatibility with Fortran (which
> has always done it the same way as C99 does.)


Well, there is a chicken and egg problem. Pretty much all
twos complement machines do it that way. I am not sure at
all what ones complement machines do. It might be, though,
that machines do it that way because Fortran does it that
way. From the Fortran 66 standard:

"The function MOD or AMOD(a1,a2) is defined as a1-[a1/a2]*a2
where [x] is the integer whose magnitude does not exceed
the magnitude of x, and whose sign is the same as x."

Note that many early Fortran machines were sign magnitude
or ones complement machines, and that C89 supports both
sign magnitude and ones complement arithmetic.
(I am not sure about C99.)

-- glen

Nov 14 '05 #7
