469,934 Members | 2,607 Online

# function to count number of set bits

Hi, I'm wondering if anyone knows if the following function will
function properly as a set-bit counter on non 2s complement machines
(as K&R2 implies).

| int bitcount(unsigned x)
| {
| int count;
|
| for(count = 0; x != 0; count++, x &= (x-1))
| ;
|
| return count;
| }

I can't think of a reason why this would fail on a 1s complement or
sign-mag machine (and can't find a non 2s compliment machine to try it
on). Is it portable as far as C is concerned?

And also, if I declare a signed int in the main program and want to set
the msb to 1, can I do this (32bit ints)?:

int b = 0x8000000;
/* is the 0x80000000 taken as an unsigned long constant or a signed
int? */

int count = bitcount(b);
/* is this undefined- trying to send a negative int to bitcount
function? */

Nov 14 '05 #1
26 7912
G Patel wrote:
Hi, I'm wondering if anyone knows if the following function will
function properly as a set-bit counter on non 2s complement machines
(as K&R2 implies).

| int bitcount(unsigned x)
| {
| int count;
|
| for(count = 0; x != 0; count++, x &= (x-1))
| ;
|
| return count;
| }

I can't think of a reason why this would fail on a 1s complement or
sign-mag machine (and can't find a non 2s compliment machine to try it
on). Is it portable as far as C is concerned?
It makes no difference what representation the machine
uses for negative integers, because `x' cannot be negative!
I see no portability or conformance problems in the code.
And also, if I declare a signed int in the main program and want to set
the msb to 1, can I do this (32bit ints)?:

int b = 0x8000000;
You're missing a zero here, I believe.
/* is the 0x80000000 taken as an unsigned long constant or a signed
int? */
Neither: Given the assumption of 32 bits, the constant
(with all seven zeroes) will be an `unsigned int'. Since the
value of this constant is too large for plain `int', trying to
convert it to `int' either produces an implementation-defined
result or raises an implementation-defined signal. On most 32-bit
implementations, `b' will be initialized to INT_MIN, -2147483648 --
but the Standard doesn't actually guarantee this.
int count = bitcount(b);
/* is this undefined- trying to send a negative int to bitcount
function? */

No: conversion the other way, from signed to unsigned,
is well-defined. However, the conversion might change the
number of one-bits in the representation. For example, the
representation of -1 on a signed magnitude machine has two
one-bits. Converting this to `unsigned' produces the value
UINT_MAX, which has at least sixteen one-bits. Avoiding such
surprises is one of the reasons to stick to `unsigned' types
for bit-fiddling.

--
Eric Sosman
es*****@acm-dot-org.invalid
Nov 14 '05 #2
G Patel wrote:
Hi, I'm wondering if anyone knows if the following function will
function properly as a set-bit counter on non 2s complement machines
(as K&R2 implies).

| int bitcount(unsigned x)
| {
| int count;
|
| for(count = 0; x != 0; count++, x &= (x-1))
| ;
|
| return count;
| }

Yes it will work.

Heres two counters for 32 bit integers:

int Count (unsigned int w)
{
w = (0x55555555 & w) + (0x55555555 & (w>> 1));
w = (0x33333333 & w) + (0x33333333 & (w>> 2));
w = (0x0f0f0f0f & w) + (0x0f0f0f0f & (w>> 4));
w = (0x00ff00ff & w) + (0x00ff00ff & (w>> 8));
w = (0x0000ffff & w) + (0x0000ffff & (w>>16));
return w;
}

/* slightly faster */
int Count (unsigned int w)
{
const unsigned int all1 = ~0;
const unsigned int mask1h = all1 / 3 << 1;
const unsigned int mask2l = all1 / 5;
const unsigned int mask4l = all1 / 17;
w -= (mask1h & w) >> 1;
w = w + (w >> 4) & mask4l;
w += w >> 8;
w += w >> 16;
return w & 0xff;
}

gtoomey

Nov 14 '05 #3

Eric Sosman wrote:
G Patel wrote:
Hi, I'm wondering if anyone knows if the following function will
function properly as a set-bit counter on non 2s complement machines (as K&R2 implies).

| int bitcount(unsigned x)
| {
| int count;
|
| for(count = 0; x != 0; count++, x &= (x-1))
| ;
|
| return count;
| }

I can't think of a reason why this would fail on a 1s complement or
sign-mag machine (and can't find a non 2s compliment machine to try it on). Is it portable as far as C is concerned?

It makes no difference what representation the machine
uses for negative integers, because `x' cannot be negative!
I see no portability or conformance problems in the code.
And also, if I declare a signed int in the main program and want to set the msb to 1, can I do this (32bit ints)?:

int b = 0x8000000;

You're missing a zero here, I believe.
/* is the 0x80000000 taken as an unsigned long constant or a signed
int? */

Neither: Given the assumption of 32 bits, the constant
(with all seven zeroes) will be an `unsigned int'. Since the
value of this constant is too large for plain `int', trying to
convert it to `int' either produces an implementation-defined
result or raises an implementation-defined signal. On most 32-bit
implementations, `b' will be initialized to INT_MIN, -2147483648 --
but the Standard doesn't actually guarantee this.

Thank you. Now I get it, I went and checked the C89 draft and found:

The type of an integer constant is the first of the corresponding list
in which its value can be represented. Unsuffixed decimal: int, long
int, unsigned long int; unsuffixed octal or hexadecimal: int, unsigned
int, long int, unsigned long int; suffixed by the letter u or U:
unsigned int, unsigned long int; suffixed by the letter l or L: long
int, unsigned long int; suffixed by both the letters u or U and l or L:
unsigned long int .

But I'm wondering how the "-" plays into any of this (for negative
constants). Is the "-" part of the constant or is it just the constant
with the - unary operator imparted on it.

How would one explain these constants:

-0xF , -077, -56
No mention of "-" that go in front of constants in the standard. How
does the "-" affect the constant (I imagine it just takes all the
signed types in each list out of the picture).
[I wonder what the authors of my C textbook was thinking when he named
it: Teach yourself C in 24 hours ... they couldn't be more wrong]

Nov 14 '05 #4
"G Patel" <ga********@gmail.com> writes:
But I'm wondering how the "-" plays into any of this (for negative
constants). Is the "-" part of the constant or is it just the constant
with the - unary operator imparted on it.

The latter.
--
Just another C hacker.
Nov 14 '05 #5

Ben Pfaff wrote:
"G Patel" <ga********@gmail.com> writes:
But I'm wondering how the "-" plays into any of this (for negative
constants). Is the "-" part of the constant or is it just the constant with the - unary operator imparted on it.

The latter.
--
Just another C hacker.

Ok, so the - isn't part of the actual constant, only acts as a - unary
operator. But what about constants that "fall" into a unsigned type.
Wouldn't the whole expression be a sort of negative unsigned constant?

Nov 14 '05 #6
"G Patel" <ga********@gmail.com> writes:
Ben Pfaff wrote:
"G Patel" <ga********@gmail.com> writes:
> But I'm wondering how the "-" plays into any of this (for negative
> constants). Is the "-" part of the constant or is it just the constant > with the - unary operator imparted on it.

The latter.

Ok, so the - isn't part of the actual constant, only acts as a - unary
operator. But what about constants that "fall" into a unsigned type.
Wouldn't the whole expression be a sort of negative unsigned constant?

Arithmetic in unsigned types "wraps around". A negative value in
an unsigned type is changed into a positive one by adding the
type's maximum value plus one until it is in range. Thus, -1u is
equal to UINT_MAX.
--
"It wouldn't be a new C standard if it didn't give a
new meaning to the word `static'."
--Peter Seebach on C99
Nov 14 '05 #7

Ben Pfaff wrote:
"G Patel" <ga********@gmail.com> writes:
But I'm wondering how the "-" plays into any of this (for negative
constants). Is the "-" part of the constant or is it just the constant with the - unary operator imparted on it.

The latter.
--
Just another C hacker.

I don't think that is true. I think the "-" sign in front of an
otherwise unadorned arithmetic constant is inherently part of the
constant. If we considered the "-" to be an "add on" that works just
as a unary - operator, then we would have to perform UAC/Promotion on
assignments (where otherwise, an assignment has no promotion/UAC, just
straight conversion to the type of the left operands).
Where in the standard did you read this?

Nov 14 '05 #8
"Kobu" <ko********@gmail.com> writes:
Ben Pfaff wrote:
"G Patel" <ga********@gmail.com> writes:
> But I'm wondering how the "-" plays into any of this (for negative
> constants). Is the "-" part of the constant or is it just the constant > with the - unary operator imparted on it.
The latter.
--
Just another C hacker.

I don't think that is true. I think the "-" sign in front of an
otherwise unadorned arithmetic constant is inherently part of the
constant.

You're mistaken; see below.
If we considered the "-" to be an "add on" that works just
as a unary - operator, then we would have to perform UAC/Promotion on
assignments (where otherwise, an assignment has no promotion/UAC, just
straight conversion to the type of the left operands).
Do you have an example where this matters? (Note that there are no
integer constants for types shorter than signed or unsigned int.)
Where in the standard did you read this?

C99 6.4.4.1 defines the syntax of an integer constant; it doesn't

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #9
"Kobu" <ko********@gmail.com> writes:
otherwise unadorned arithmetic constant is inherently part of the
constant. If we considered the "-" to be an "add on" that works just
as a unary - operator, then we would have to perform UAC/Promotion on
assignments (where otherwise, an assignment has no promotion/UAC, just
straight conversion to the type of the left operands).
Where in the standard did you read this?

Where did you find anything in the standard that talks about
negative constants? The grammar shows `-' as a unary operator,
not as part of a primary expression.

I have no idea why you think this has anything to do with
assignment.
--
"If I've told you once, I've told you LLONG_MAX times not to
exaggerate."
--Jack Klein
Nov 14 '05 #10

Keith Thompson wrote:
"Kobu" <ko********@gmail.com> writes:
Ben Pfaff wrote:
"G Patel" <ga********@gmail.com> writes:

> But I'm wondering how the "-" plays into any of this (for negative > constants). Is the "-" part of the constant or is it just the

constant
> with the - unary operator imparted on it.

The latter.
--
Just another C hacker.

I don't think that is true. I think the "-" sign in front of an
otherwise unadorned arithmetic constant is inherently part of the
constant.

You're mistaken; see below.
If we considered the "-" to be an "add on" that works just as a unary - operator, then we would have to perform UAC/Promotion on assignments (where otherwise, an assignment has no promotion/UAC, just straight conversion to the type of the left operands).

Do you have an example where this matters? (Note that there are no
integer constants for types shorter than signed or unsigned int.)

textbook.
I've been enlightened.

Nov 14 '05 #11
Keith Thompson wrote:
"Kobu" <ko********@gmail.com> writes:
.... snip ...
If we considered the "-" to be an "add on" that works just as a
unary - operator, then we would have to perform UAC/Promotion on
assignments (where otherwise, an assignment has no promotion/UAC,
just straight conversion to the type of the left operands).

Do you have an example where this matters? (Note that there are no
integer constants for types shorter than signed or unsigned int.)

Look at the way INT_MIN is usually defined in limits.h. For a 16
bit 2's complement int machine, it would normally be (-INT_MAX -
1). If it were written as -32768 it would create an overflow on
input.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
Nov 14 '05 #12
On Tue, 01 Feb 2005 11:44:36 -0500, Eric Sosman wrote:

CBFalconer wrote:

....
Look at the way INT_MIN is usually defined in limits.h. For a 16
bit 2's complement int machine, it would normally be (-INT_MAX -
1). If it were written as -32768 it would create an overflow on
input.

Not an overflow, exactly, but a result with the wrong
type and the wrong value. Since the constant is too large
for `int' but within range for an `unsigned int' it would
have the latter type; in effect it would be 32768u. The
"negation" would, under the rules of unsigned arithmetic,
reduce modulo 65536u to yield the value 32768u again. Thus
you'd have the, er, "anomalous" condition INT_MIN > INT_MAX!

Note that `(int)-32768' wouldn't work, since the value
must be a constant expression and constant expressions can't
contain cast operators.

Casts are allowed in constant expressions although there are some
restrictions which don't apply here. The problem with (int)-32768 where
INT_MAX is 32767 and UINT_MAX is 65535 is that it is equivalent to
(int)(32768U) i.e. you are trying to convert to a signed integer type a
value that is not representable in that type. That's undefined in C90 and
in C99 you get an implementation-defined value or signal.

Lawrence
Nov 14 '05 #13
Lawrence Kirby <lk****@netactive.co.uk> writes:
On Tue, 01 Feb 2005 11:44:36 -0500, Eric Sosman wrote:
CBFalconer wrote:

...
Look at the way INT_MIN is usually defined in limits.h. For a 16
bit 2's complement int machine, it would normally be (-INT_MAX -
1). If it were written as -32768 it would create an overflow on
input.

Not an overflow, exactly, but a result with the wrong
type and the wrong value. Since the constant is too large
for `int' but within range for an `unsigned int' it would
have the latter type; in effect it would be 32768u. The
"negation" would, under the rules of unsigned arithmetic,
reduce modulo 65536u to yield the value 32768u again. Thus
you'd have the, er, "anomalous" condition INT_MIN > INT_MAX!

Note that `(int)-32768' wouldn't work, since the value
must be a constant expression and constant expressions can't
contain cast operators.

Casts are allowed in constant expressions although there are some
restrictions which don't apply here. The problem with (int)-32768 where
INT_MAX is 32767 and UINT_MAX is 65535 is that it is equivalent to
(int)(32768U) i.e. you are trying to convert to a signed integer type a
value that is not representable in that type. That's undefined in C90 and
in C99 you get an implementation-defined value or signal.

And if the implementation-defined value happens to be -32768, the
implementation is free to use it as the definition of INT_MIN. (There
are good reasons not to do so. Keeping <limits.h> in sync with the
compiler might be non-trivial, and you might as well have a portable
definition of INT_MIN if you can.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #14
Ben Pfaff wrote:
"Kobu" <ko********@gmail.com> writes:

otherwise unadorned arithmetic constant is inherently part of the
constant. If we considered the "-" to be an "add on" that works just
as a unary - operator, then we would have to perform UAC/Promotion on
assignments (where otherwise, an assignment has no promotion/UAC, just
straight conversion to the type of the left operands).
Where in the standard did you read this?

Where did you find anything in the standard that talks about
negative constants? The grammar shows `-' as a unary operator,
not as part of a primary expression.

#define EOF (-1)

Is '-' part of a primary expression?

#define LONG_MIN (-2147483647L-1L)

I must be missing something. Help me here. Buy me a beer in Menlo Park.

--
Joe Wright mailto:jo********@comcast.net
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Nov 14 '05 #15
Joe Wright <jo********@comcast.net> writes:
Ben Pfaff wrote:
"Kobu" <ko********@gmail.com> writes:
otherwise unadorned arithmetic constant is inherently part of the
constant. If we considered the "-" to be an "add on" that works just
as a unary - operator, then we would have to perform UAC/Promotion on
assignments (where otherwise, an assignment has no promotion/UAC, just
straight conversion to the type of the left operands).
Where in the standard did you read this?

Where did you find anything in the standard that talks about
negative constants? The grammar shows `-' as a unary operator,
not as part of a primary expression.

#define EOF (-1)

Is '-' part of a primary expression?

#define LONG_MIN (-2147483647L-1L)

Both of these are primary expressions because they're parenthesized.

Without the parentheses, neither -1 nor -2147483647L-1L is a primary
expression.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #16
Joe Wright <jo********@comcast.net> writes:
Ben Pfaff wrote:
"Kobu" <ko********@gmail.com> writes:

Where did you find anything in the standard that talks about
negative constants? The grammar shows `-' as a unary operator,
not as part of a primary expression.

#define EOF (-1)

Is '-' part of a primary expression?

#define LONG_MIN (-2147483647L-1L)

1 primary-expression:
identifier
constant
string-literal
( expression )

You're trying to make some kind of pedantic point by saying that
you parenthesized it, therefore it's a primary expression. But
you know what my point is: there is no alternative listed as
`- constant', nor is the negative sign part of `constant'
itself (except perhaps as part of an exponent, etc.). There is
no such thing as a negative constant, only a constant to which
you apply a negating unary operator.
--
"In My Egotistical Opinion, most people's C programs should be indented six
feet downward and covered with dirt." -- Blair P. Houghton
Nov 14 '05 #17

Ben Pfaff wrote:
[...] There is
no such thing as a negative constant, only a constant to which
you apply a negating unary operator.

Three pettifogging counter-examples (two implementation-
specific):

enum { NEGATIVE = -42, POSITIVE = 42 };
/* Henceforth, `NEGATIVE' is a negative constant. */

'\x99'
/* A constant whose value may be negative on some
* implementations, depending on the value of CHAR_BIT
* and the signedness of `char'.
*/

'ß'
/* A constant whose value may be negative on some
* implementations (and which might be utterly rejected
* by some others, since the character is not in the set
* mandated by the Standard).
*/

--
Er*********@sun.com

Nov 14 '05 #18
Eric Sosman <er*********@sun.com> writes:
Ben Pfaff wrote:
[...] There is
no such thing as a negative constant, only a constant to which
you apply a negating unary operator.

Three pettifogging counter-examples (two implementation-
specific):

Okay, let me try again. There is no such thing as a negative
integer-constant or floating-constant, but you may apply a
negating unary operator to obtain a negative value.

Anybody want to dispute *that*?
--
"You call this a *C* question? What the hell are you smoking?" --Kaz
Nov 14 '05 #19
>Lawrence Kirby <lk****@netactive.co.uk> writes:
Casts are allowed in constant expressions although there are some
restrictions which don't apply here. The problem with (int)-32768 where
INT_MAX is 32767 and UINT_MAX is 65535 is that it is equivalent to
(int)(32768U) i.e. you are trying to convert to a signed integer type a
value that is not representable in that type. That's undefined in C90 and
in C99 you get an implementation-defined value or signal.

In article <ln************@nuthaus.mib.org>
Keith Thompson <ks***@mib.org> wrote:And if the implementation-defined value happens to be -32768, the
implementation is free to use it as the definition of INT_MIN. (There
are good reasons not to do so. Keeping <limits.h> in sync with the
compiler might be non-trivial, and you might as well have a portable
definition of INT_MIN if you can.)

There is another good reason not to use a cast here. Suppose
INT_MIN is in fact -32768 (numerically), and the implementation
uses a cast in the "#define" for INT_MIN. Then consider:

#if INT_MIN < (-2147483647 - 1)
/* int is at least 32 bits */
...
#endif

When the compiler sees this during the preprocessing phase in which
keywords have no meaning (because there are only pp-tokens, not
yet any tokens), we get:

#if ((some_pp_token)-32768) < (-2147483647 - 1)

which is syntactically invalid and requires a diagnostic. My C99
draft says, in part:

[#1] The values given below shall be replaced by constant
expressions suitable for use in #if preprocessing
directives.

(C89 has similar if not identical wording.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #20
Lawrence Kirby wrote:
On Tue, 01 Feb 2005 11:44:36 -0500, Eric Sosman wrote:
CBFalconer wrote:

Look at the way INT_MIN is usually defined [...]

Note that `(int)-32768' wouldn't work, since the value
must be a constant expression and constant expressions can't
contain cast operators.

Casts are allowed in constant expressions although there are some
restrictions which don't apply here. The problem with (int)-32768 where
INT_MAX is 32767 and UINT_MAX is 65535 is that it is equivalent to
(int)(32768U) i.e. you are trying to convert to a signed integer type a
value that is not representable in that type. That's undefined in C90 and
in C99 you get an implementation-defined value or signal.

Sorry; sloppy language on my part. Casts are indeed
permitted in "constant expressions" (6.6), but they are
forbidden in the special form of "constant expression" the
preprocessor can evaluate (6.10.1). INT_MIN and friends
must be evaluable by the preprocessor (5.2.4.2.1), and so
cannot be defined with casts.

--
Eric Sosman
es*****@acm-dot-org.invalid
Nov 14 '05 #21

G Patel wrote:
Hi, I'm wondering if anyone knows if the following function will
function properly as a set-bit counter on non 2s complement machines
(as K&R2 implies).

| int bitcount(unsigned x)
| {
| int count;
|
| for(count = 0; x != 0; count++, x &= (x-1))
| ;
|
| return count;
| }

I can't think of a reason why this would fail on a 1s complement or
sign-mag machine (and can't find a non 2s compliment machine to try it on). Is it portable as far as C is concerned?
If the intention is to call the function from either signed or unsigned
int, then the code will not work on non-2s complement machiens for
signed ints.

Say for example you have -0 on a sign-mag or 1s comp. machine in an int
variable y:

Even though the y is not devoid of set bits on those machines, once the
call is made to the function and the variable is converted from signed
-> unsigned, C's rules will whipe out all the set bits (via -0 +
UINT_MAX + 1). Your function will return 0 as the # of set bits
because of the conversion that took place on the argument before
function entry. This is an example of C's bias towards 2's complement
machines (which is understandable because 2's complement is the "best"
way to represent negative integers!).

So the authors of K&R were probably warning you about calling the
function with signed ints (not portable).

Nov 14 '05 #22
[Given a bit-counting function that takes an unsigned int or unsigned
long parameter...]

Big K <ki*********@gmail.com> wrote:
If the intention is to call the function from either signed or unsigned
int, then the code will not work on non-2s complement machiens for
signed ints.
Well, that depends on what one *wants*. :-)
Say for example you have -0 on a sign-mag or 1s comp. machine in an int
variable y:

Even though the y is not devoid of set bits on those machines, once the
call is made to the function and the variable is converted from signed
-> unsigned, C's rules will whipe out all the set bits (via -0 +
UINT_MAX + 1). Your function will return 0 as the # of set bits
because of the conversion that took place on the argument before
function entry.
Indeed it will. And yet, on such a machine, if -0 is produced by
ordinary arithmetic (as opposed to being used to detect uninitialized
variables, for instance), C requires that x==y be true even if x
is -0 and y is +0. That is:

x = <expression 1>;
y = <expression 2>;
if (x == y && bitcount(x) != bitcount(y))
puts("how peculiar!");

should never print the message, even if x is -0 and y is +0. So
you might *want* bitcount() to return 0 in both cases.

On the other hand, if you want to inspect the representation,
rather than the value, of a variable, you can still write that:

int representational_bit_count(unsigned char *p, int size);
#define RBC(var) \
representational_bit_count((unsigned char *)&(var), sizeof(var))
...
if (x == y && RBC(x) != RBC(y))
puts("while x==y, their representations differ");

Note, however, that two identical and nonnegative values can still
have differing representations, if some of the bits in the
This is an example of C's bias towards 2's complement
machines (which is understandable because 2's complement is the "best"
way to represent negative integers!).

While *I* am biased towards 2'sC :-) I am not so sure C really is.
C gives you the ability to inspect representations (via pointers
and "unsigned char *"); it is up to you, the programmer, not to
abuse it. :-)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #23

G Patel wrote:
Hi, I'm wondering if anyone knows if the following function will
function properly as a set-bit counter on non 2s complement machines
(as K&R2 implies).

K&R2 was implying that the expression x &= (x-1) does not remove the
rightmost set bit on all implementations - negative 'even' integers in
sign-mag form will be left unchanged by x &= (x-1)

| int bitcount(unsigned x)
| {
| int count;
|
| for(count = 0; x != 0; count++, x &= (x-1))
| ;
|
| return count;
| }

I can't think of a reason why this would fail on a 1s complement or
sign-mag machine (and can't find a non 2s compliment machine to try it on). Is it portable as far as C is concerned?

The function itself is portable, because unsigned numbers have no
issues.

And also, if I declare a signed int in the main program and want to set the msb to 1, can I do this (32bit ints)?:

int b = 0x8000000;
/* is the 0x80000000 taken as an unsigned long constant or a signed
int? */

int count = bitcount(b);
/* is this undefined- trying to send a negative int to bitcount
function? */

Nov 14 '05 #24

Chris Torek wrote:
[Given a bit-counting function that takes an unsigned int or unsigned
long parameter...]

Big K <ki*********@gmail.com> wrote:
If the intention is to call the function from either signed or unsignedint, then the code will not work on non-2s complement machiens for
signed ints.
Well, that depends on what one *wants*. :-)
Say for example you have -0 on a sign-mag or 1s comp. machine in an intvariable y:

Even though the y is not devoid of set bits on those machines, once thecall is made to the function and the variable is converted from signed-> unsigned, C's rules will whipe out all the set bits (via -0 +
UINT_MAX + 1). Your function will return 0 as the # of set bits
because of the conversion that took place on the argument before
function entry.

Indeed it will. And yet, on such a machine, if -0 is produced by
ordinary arithmetic (as opposed to being used to detect uninitialized
variables, for instance), C requires that x==y be true even if x
is -0 and y is +0. That is:

x = <expression 1>;
y = <expression 2>;
if (x == y && bitcount(x) != bitcount(y))
puts("how peculiar!");

should never print the message, even if x is -0 and y is +0. So
you might *want* bitcount() to return 0 in both cases.

On the other hand, if you want to inspect the representation,
rather than the value, of a variable, you can still write that:

int representational_bit_count(unsigned char *p, int size);
#define RBC(var) \
representational_bit_count((unsigned char *)&(var),

sizeof(var)) So what is truly the universal way of counting the respresentational
bits on any platform? I see that you (Chris) use a char pointer
somehow to cycle through the bytes, but will this catch all the extra
bits used for handling overflow detection etc?
...
if (x == y && RBC(x) != RBC(y))
puts("while x==y, their representations differ");

Note, however, that two identical and nonnegative values can still
have differing representations, if some of the bits in the
This is an example of C's bias towards 2's complement
machines (which is understandable because 2's complement is the "best"way to represent negative integers!).

While *I* am biased towards 2'sC :-) I am not so sure C really is.
C gives you the ability to inspect representations (via pointers
and "unsigned char *"); it is up to you, the programmer, not to
abuse it. :-)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
Reading email is like searching for food in the garbage, thanks to

spammers.

Anyone know a truly universal function for counting all the setbits in
a signed or unsigned number? I presume the receiving function would
have to accept the parameter into a large signed integer (signed long
long ?).

Nov 14 '05 #25

G Patel wrote:
Chris Torek wrote:

On the other hand, if you want to inspect the representation,
rather than the value, of a variable, you can still write that:

int representational_bit_count(unsigned char *p, int size);
#define RBC(var) \
representational_bit_count((unsigned char *)&(var), sizeof(var))

So what is truly the universal way of counting the respresentational
bits on any platform? I see that you (Chris) use a char pointer
somehow to cycle through the bytes, but will this catch all the extra
bits used for handling overflow detection etc?

I think Chris was thinking along the lines of the following
function(which will spit out the C abstract machine level bits (in each
byte) for any object sent to it):
..#include <stdio.h>
..#include <limits.h>
..
..#define RBC(var) repbitcount((unsigned char *)&var, sizeof(var))
..
..int repbitcount(unsigned char *p, int size)
..{
.. int count = 0;
.. int byte;
.. while(size)
.. {
.. byte = CHAR_BIT - 1;
.. while(byte >= 0)
.. {
.. if(*p & 1 << byte)
.. {
.. putchar('1'); /* remove putchar if necessary */
.. count++;
.. }
.. else putchar('0'); /* remove putchar if necessary */
..
.. byte--;
.. }
.. putchar('\n'); /* remove putchar if necessary */
.. size--;
.. p++;
.. }
..}
...
if (x == y && RBC(x) != RBC(y))
puts("while x==y, their representations differ");

Note, however, that two identical and nonnegative values can still
have differing representations, if some of the bits in the
This is an example of C's bias towards 2's complement
machines (which is understandable because 2's complement is the "best"way to represent negative integers!).

While *I* am biased towards 2'sC :-) I am not so sure C really is.
C gives you the ability to inspect representations (via pointers
and "unsigned char *"); it is up to you, the programmer, not to
abuse it. :-)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277

2603 email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to

spammers.

Anyone know a truly universal function for counting all the setbits

in a signed or unsigned number? I presume the receiving function would
have to accept the parameter into a large signed integer (signed long
long ?).

Any bits in a representation not part of C's abstract machine model
(like the the extra bits to detect overflow on some implementations),
can not be portable accessed by a function (to answer your earlier
question).

Nov 14 '05 #26

Luke Wu wrote:
G Patel wrote:
Chris Torek wrote:

On the other hand, if you want to inspect the representation,
rather than the value, of a variable, you can still write that:

int representational_bit_count(unsigned char *p, int size);
#define RBC(var) \
representational_bit_count((unsigned char *)&(var), sizeof(var))

So what is truly the universal way of counting the respresentational bits on any platform? I see that you (Chris) use a char pointer
somehow to cycle through the bytes, but will this catch all the extra bits used for handling overflow detection etc?

I think Chris was thinking along the lines of the following
function(which will spit out the C abstract machine level bits (in

each byte) for any object sent to it):
.#include <stdio.h>
.#include <limits.h>
.
.#define RBC(var) repbitcount((unsigned char *)&var, sizeof(var))
.
.int repbitcount(unsigned char *p, int size)
.{
. int count = 0;
. int byte;
. while(size)
. {
. byte = CHAR_BIT - 1;
. while(byte >= 0)
. {
. if(*p & 1 << byte)
. {
. putchar('1'); /* remove putchar if necessary */
. count++;
. }
. else putchar('0'); /* remove putchar if necessary */
.
. byte--;
. }
. putchar('\n'); /* remove putchar if necessary */
. size--;
. p++;
. }
.}

You forgot to return count.

Nov 14 '05 #27

### This discussion thread is closed

Replies have been disabled for this discussion.