
Bit-fields and integral promotion

P: n/a
Suppose I'm using an implementation where an int is 16 bits.
In the program below, what function is called in the first case,
and what is called in the second case?
Also, if there is a difference between C89 and C99, I would
like to know.
I have tried with different compilers, and I see some differences.
Before I file a bug report with my C vendor, I would like to
know what the correct behavior is.

struct S
{
    unsigned int a:4;
    unsigned int b:16;
};

void foo(void);
void bar(void);

int main(void)
{
    struct S s;
    s.a = 0;
    s.b = 0;

    if (s.a - 5 < 0)
        foo();
    else
        bar();

    if (s.b - 5 < 0)
        foo();
    else
        bar();

    return 0;
}
Carsten Hansen
Nov 14 '05 #1
112 Replies


P: n/a
Carsten Hansen wrote:

Suppose I'm using an implementation where an int is 16 bits.
In the program below, what function is called in the first case,
and what is called in the second case?
Also, if there is a difference between C89 and C99, I would
like to know.
I have tried with different compilers, and I see some differences.
Before I file a bug report with my C vendor, I would like to
know what the correct behavior is.

struct S
{
unsigned int a:4;
unsigned int b:16;
};

void foo(void);
void bar(void);

int main(void)
{
struct S s;
s.a = 0;
s.b = 0;

if (s.a - 5 < 0) foo();
else bar();
if (s.b - 5 < 0) foo();
else bar();
return 0;
}


(quote slightly reformatted for clarity)

foo and bar, respectively. An unsigned can never be less than 0.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
Nov 14 '05 #2

P: n/a
> Suppose I'm using an implementation where an int is 16 bits.

That doesn't change anything. The problem isn't the "16" in there
but rather the "unsigned". Unsigned variables can never be negative
(they have no sign). A signed variable can be either positive or
negative, though it can't hold as large a number as an unsigned
variable (explained more at the bottom).

> In the program below what function is called in the first case,
> and what is called in the second case?

Looks like bar() bar() to me, which means that either I or Falconer is
spacing out. If it can never be negative, then it will never be less
than 0, so the first statement for both of these should be false and
bar() will be called.

> Also, if there is a difference between C89 and C99, I would
> like to know.

I'm a lightweight on the specifications, so no comment.

> I have tried with different compilers, and I see some differences.


If different compilers are giving different results, my best bet would
be that some are doing what they should do and not allowing an unsigned
value to be less than 0, whereas others are trying to be intelligent
and correct the programmer's mistakes invisibly. While I wouldn't say
that the second "intelligent" compiler is buggy, I certainly wouldn't
want to use it. Bugs are best corrected by you rather than made to
disappear by some particular quirk of your development environment;
otherwise, the instant your environment changes, everything comes to
a halt.

And back to the topic of signed versus unsigned, the way that these
values are represented is exactly the same:

16 bit value
1111111111111111 = 65535
1111111111111111 = -1

The only way for your computer to know which one you meant is when you
specify "int" or "unsigned int." Based on that it will treat the same
exact bits in two entirely different ways.
To be specific, an unsigned number grows like this:

0 = 0
1 = 1
10 = 2
11 = 3
100 = 4
101 = 5
110 = 6
....
1111111111111101 = 65533
1111111111111110 = 65534
1111111111111111 = 65535

A signed number is the same up until the 16th bit

0 = 0
1 = 1
10 = 2
11 = 3
....
111111111111101 = 32765 //15 bits
111111111111110 = 32766
111111111111111 = 32767
1000000000000000 = -32768 //16 bits
1000000000000001 = -32767
1000000000000010 = -32766
....
1111111111111101 = -3
1111111111111110 = -2
1111111111111111 = -1

In the end, the word "hello" <- right there is just a bunch of ones and
zeroes. It is all just a matter of how you tell your computer it should
interpret it.
0010 1101 0100 0011 0110 1000 0111 0010 0110 1001 0111 0011
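
To make the "same bits, two interpretations" point concrete, here is a minimal sketch. The function name as_signed16 is made up for illustration, and it assumes the implementation provides the exact-width types int16_t and uint16_t from <stdint.h> (where they exist, int16_t is two's complement).

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the 16 bits of an unsigned value as a signed 16-bit value.
 * memcpy copies the bit pattern unchanged, so only the interpretation
 * differs between the argument and the result. */
int16_t as_signed16(uint16_t bits)
{
    int16_t out;
    memcpy(&out, &bits, sizeof out);
    return out;
}
```

Called with 0xFFFF (65535 as unsigned) this returns -1, and called with 0x8000 (32768 as unsigned) it returns -32768, matching the two tables above.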

Nov 14 '05 #3

P: n/a
"Chris Williams" <th********@yahoo.co.jp> wrote in message
news:11**********************@c13g2000cwb.googlegr oups.com...
Looks like bar() bar() to me which means that either I or Falconer is
spacing out. If it can never be negative, then it will never be less
than 0 so the first statement for both of these should be false and
bar() will be called.


What can never be negative? s.a can't, but s.a-5 can.
6.7.2.1p9: "A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits." The type of s.a is a 4-bit
unsigned type. Since all the values of such a type can be represented by
int, the integer promotions convert it to int rather than to unsigned int,
and the value of s.a-5 is -5 rather than UINT_MAX-4.
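
A minimal sketch of that first test, wrapped in a function so the outcome can be inspected. The name first_test_calls_foo is hypothetical. The result shown in the comment assumes a compiler that takes the reading described above (4-bit unsigned field promotes to int); elsewhere in this thread a compiler is reported to behave differently.

```c
struct S {
    unsigned int a : 4;
    unsigned int b : 16;
};

/* Returns 1 if the first test in the question would call foo(),
 * 0 if it would call bar(). */
int first_test_calls_foo(void)
{
    struct S s;
    s.a = 0;
    s.b = 0;
    /* If s.a promotes to int, this is -5 < 0, which is true.
     * If it were treated as unsigned int, it would be UINT_MAX-4 < 0,
     * which is false. */
    return s.a - 5 < 0;
}
```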
Nov 14 '05 #4

P: n/a
CBFalconer wrote:
Carsten Hansen wrote:

Suppose I'm using an implementation where an int is 16 bits.
In the program below, what function is called in the first case,
and what is called in the second case?
Also, if there is a difference between C89 and C99, I would
like to know.
I have tried with different compilers, and I see some differences.
Before I file a bug report with my C vendor, I would like to
know what the correct behavior is.

struct S
{
unsigned int a:4;
unsigned int b:16;
};

void foo(void);
void bar(void);

int main(void)
{
struct S s;
s.a = 0;
s.b = 0;

if (s.a - 5 < 0) foo();
else bar();
if (s.b - 5 < 0) foo();
else bar();
return 0;
}


(quote slightly reformatted for clarity)

foo and bar, respectively. An unsigned can never be less than 0.


I was wrong - bar and bar. For some reason I assumed the a field
was specified as int. Chris Williams made me look again.

Nov 14 '05 #5

P: n/a

"CBFalconer" <cb********@yahoo.com> wrote in message
news:41***************@yahoo.com...
CBFalconer wrote:
Carsten Hansen wrote:

Suppose I'm using an implementation where an int is 16 bits.
In the program below, what function is called in the first case,
and what is called in the second case?
Also, if there is a difference between C89 and C99, I would
like to know.
I have tried with different compilers, and I see some differences.
Before I file a bug report with my C vendor, I would like to
know what the correct behavior is.

struct S
{
unsigned int a:4;
unsigned int b:16;
};

void foo(void);
void bar(void);

int main(void)
{
struct S s;
s.a = 0;
s.b = 0;

if (s.a - 5 < 0) foo();
else bar();
if (s.b - 5 < 0) foo();
else bar();
return 0;
}


(quote slightly reformatted for clarity)

foo and bar, respectively. An unsigned can never be less than 0.


I was wrong - bar and bar. For some reason I assumed the a field
was specified as int. Chris Williams made me look again.


Assume you are using an implementation with a 32-bit int, and you change
the width of b to 32. Does that change your answer?

gcc, Intel's compiler, Comeau's compiler and Metrowerks compiler all
give the answers
foo
bar

whereas Microsoft's compiler gives the answer
bar
bar

Carsten Hansen
Nov 14 '05 #6

P: n/a
CBFalconer wrote:
I was wrong - bar and bar. For some reason I assumed the a field
was specified as int. Chris Williams made me look again.


Since all the possible values of a four-bit unsigned bit-field can be
represented by an int, I'd expect it to get promoted to int, rather than
unsigned. What's my mistake?
Nov 14 '05 #7

P: n/a
"Carsten Hansen" <ha******@worldnet.att.net> wrote in message
news:ZE*******************@bgtnsc04-news.ops.worldnet.att.net...

"CBFalconer" <cb********@yahoo.com> wrote in message
news:41***************@yahoo.com...
CBFalconer wrote:
> Carsten Hansen wrote:
>>
>> Suppose I'm using an implementation where an int is 16 bits.
>> In the program below, what function is called in the first case,
>> and what is called in the second case?
>> Also, if there is a difference between C89 and C99, I would
>> like to know.
>> I have tried with different compilers, and I see some differences.
>> Before I file a bug report with my C vendor, I would like to
>> know what the correct behavior is.
>>
>> struct S
>> {
>> unsigned int a:4;
>> unsigned int b:16;
>> };
>>
>> void foo(void);
>> void bar(void);
>>
>> int main(void)
>> {
>> struct S s;
>> s.a = 0;
>> s.b = 0;
>>
>> if (s.a - 5 < 0) foo();
>> else bar();
>> if (s.b - 5 < 0) foo();
>> else bar();
>> return 0;
>> }
>
> (quote slightly reformatted for clarity)
>
> foo and bar, respectively. An unsigned can never be less than 0.
> I was wrong - bar and bar. For some reason I assumed the a field
> was specified as int. Chris Williams made me look again.


Assume you are using an implementation with 32-bits int, and you change
the width of b to be 32, does that change your answer?


That should not change the answer. Remember that the
field is unsigned, regardless of its bit width. Therefore,
the subtraction expression is unsigned.

> gcc, Intel's compiler, Comeau's compiler and Metrowerks compiler all
> give the answers
> foo
> bar

Check their documentation to see if they support
unsigned bit fields. They may be ignoring the
"unsigned" qualifier.

> whereas Microsoft's compiler gives the answer
> bar
> bar


That's what I would expect from a compiler that
supports unsigned bit fields.
Nov 14 '05 #8

P: n/a
Wojtek Lerch wrote:
"Chris Williams" <th********@yahoo.co.jp> wrote in message
Looks like bar() bar() to me which means that either I or Falconer is
spacing out. If it can never be negative, then it will never be less
than 0 so the first statement for both of these should be false and
bar() will be called.


What can never be negative? s.a can't, but s.a-5 can.

6.7.2.1p9: "A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits." The type of s.a is a 4-bit
unsigned type. Since all the values of such a type can be represented by
int, the integer promotions convert it to int rather than to unsigned int,
and the value of s.a-5 is -5 rather than UINT_MAX-4.


Aha. The installation can interpret a bit field as either signed
or unsigned, but I believe needs to document which choice it has
taken. Thus either original behaviour can be legitimate, depending
on the system documentation. This could be used as a test for what
the documentation should say.

Nov 14 '05 #9

P: n/a
"Wojtek Lerch" <Wo******@yahoo.ca> wrote in message
news:35*************@individual.net...
"Chris Williams" <th********@yahoo.co.jp> wrote in message
news:11**********************@c13g2000cwb.googlegr oups.com...
>> Looks like bar() bar() to me which means that either I or Falconer is
>> spacing out. If it can never be negative, then it will never be less
>> than 0 so the first statement for both of these should be false and
>> bar() will be called.
>
> What can never be negative? s.a can't, but s.a-5 can.
> 6.7.2.1p9: "A bit-field is interpreted as a signed or unsigned integer type
> consisting of the specified number of bits."

That's apparently what the standard says.

> The type of s.a is a 4-bit unsigned type. Since all the values of such a type
> can be represented by int, the integer promotions convert it to int rather
> than to unsigned int, and the value of s.a-5 is -5 rather than UINT_MAX-4.


You added that part.

Are you saying that, say, an 8-bit unsigned char can
be promoted to a wider signed int? Please quote the
standard that says unsigned integers can be widened
to signed integers.
Nov 14 '05 #10

P: n/a

Wojtek Lerch wrote:
CBFalconer wrote:
I was wrong - bar and bar. For some reason I assumed the a field
was specified as int. Chris Williams made me look again.
Since all the possible values of a four-bit unsigned bit-field can be

represented by an int, I'd expect it to get promoted to int, rather than unsigned. What's my mistake?


My compiler runs foo and foo (quite different from others'). On my
system, ints are 32 bits; since a signed int can hold all possible
values of a 16-bit field, I am presuming it's integrally promoting s.a
and s.b to signed int within the if conditions. Then subtracting 5
(a signed int constant) yields -5, which is less than 0.

I guess that's where I get foo, foo from. Then again, I just learned
the term "integral promotion" yesterday.

Nov 14 '05 #11

P: n/a
"xarax" <xa***@email.com> wrote in message
news:EW**************@newsread1.news.pas.earthlink .net...
"Wojtek Lerch" <Wo******@yahoo.ca> wrote in message
news:35*************@individual.net...
>> 6.7.2.1p9: "A bit-field is interpreted as a signed or unsigned integer
>> type consisting of the specified number of bits."
>
> That's apparently what the standard says.
>
>> The type of s.a is a 4-bit unsigned type. Since all the values of such a
>> type can be represented by int, the integer promotions convert it to int
>> rather than to unsigned int, and the value of s.a-5 is -5 rather than
>> UINT_MAX-4.
>
> You added that part.

The part outside the quotation marks? Yes, of course.

> Are you saying that, say, an 8-bit unsigned char can
> be promoted to a wider signed int? Please quote the
> standard that says unsigned integers can be widened
> to signed integers.

Yes.

6.3.1.1p2: "If an int can represent all values of the original type, the
value is converted to an int; otherwise, it is converted to an unsigned int.
These are called the integer promotions."
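
A small sketch of that rule applied to unsigned char. The function names are invented for illustration, and the results in the comments assume the usual situation where int can represent every unsigned char value (UCHAR_MAX <= INT_MAX).

```c
#include <stddef.h>

/* Size of the object itself: always 1 by definition. */
size_t size_of_uchar(void)
{
    unsigned char c = 0;
    return sizeof c;
}

/* Size of the expression c - 5: c is promoted before the subtraction,
 * so the expression has the size of the promoted type. */
size_t size_of_uchar_minus_five(void)
{
    unsigned char c = 0;
    return sizeof(c - 5);
}

/* Value of c - 5 when c is 0: -5 if c was promoted to (signed) int. */
int uchar_minus_five(void)
{
    unsigned char c = 0;
    return c - 5;
}
```

Under the stated assumption, size_of_uchar_minus_five() equals sizeof(int) and uchar_minus_five() is -5, so an unsigned type has indeed been widened to a signed one.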
Nov 14 '05 #12

P: n/a

xarax wrote:
"Wojtek Lerch" <Wo******@yahoo.ca> wrote in message
news:35*************@individual.net...
"Chris Williams" <th********@yahoo.co.jp> wrote in message
news:11**********************@c13g2000cwb.googlegr oups.com...
Looks like bar() bar() to me which means that either I or Falconer is spacing out. If it can never be negative, then it will never be less than 0 so the first statement for both of these should be false and bar() will be called.
What can never be negative? s.a can't, but s.a-5 can.
6.7.2.1p9: "A bit-field is interpreted as a signed or unsigned integer type consisting of the specified number of bits."


That's apparently what the standard says.
The type of s.a is a 4-bit unsigned type. Since all the values of such a type can be represented by int, the integer promotions convert it to int rather than to unsigned int, and the value of s.a-5 is -5 rather than

UINT_MAX-4.
You added that part.

Are you saying that, say, an 8-bit unsigned char can
be promoted to a wider signed int? Please quote the
standard that says unsigned integers can be widened
to signed integers.


Like I said in my other post, there is total confusion around integral
promotions and arithmetic conversions. All 5 of the 5 C programmers at
my workplace don't understand it, which says a lot (about them and the
topic). I bet Chris Torek is the only one who truly understands it.

Nov 14 '05 #13

P: n/a
"CBFalconer" <cb********@yahoo.com> wrote in message
news:41***************@yahoo.com...
Aha. The installation can interpret a bit field as either signed
or unsigned, but I believe needs to document which choice it has
taken. Thus either original behaviour can be legitimate, depending
on the system documentation. This could be used as a test for what
the documentation should say.


No, the implementation only has choice when the bit-field is declared as
plain "int":

6.7.2p5: "Each of the comma-separated sets designates the same type, except
that for bit-fields, it is implementation-defined whether the specifier int
designates the same type as signed int or the same type as unsigned int."
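
One way to see which choice an implementation made for plain "int" bit-fields is sketched below. The struct and function names are hypothetical. It sets every bit of the struct and reads the fields back; the values in the comments assume two's complement.

```c
#include <string.h>

struct Fields {
    int          plain : 3;  /* signedness is implementation-defined */
    signed int   s     : 3;  /* always signed */
    unsigned int u     : 3;  /* always unsigned */
};

/* Set every bit in the struct, so each 3-bit field holds binary 111. */
static void fill_ones(struct Fields *f)
{
    memset(f, 0xFF, sizeof *f);
}

/* -1 if plain int bit-fields are signed here, 7 if they are unsigned. */
int plain_field_value(void)
{
    struct Fields f;
    fill_ones(&f);
    return f.plain;
}

int signed_field_value(void)
{
    struct Fields f;
    fill_ones(&f);
    return f.s;   /* -1 on two's complement */
}

int unsigned_field_value(void)
{
    struct Fields f;
    fill_ones(&f);
    return f.u;   /* 7 */
}
```

Only plain_field_value() can differ between implementations; the other two are fixed by the declared type.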
Nov 14 '05 #14

P: n/a
Wojtek Lerch wrote:
CBFalconer wrote:
I was wrong - bar and bar. For some reason I assumed the a field
was specified as int. Chris Williams made me look again.


Since all the possible values of a four-bit unsigned bit-field can
be represented by an int, I'd expect it to get promoted to int,
rather than unsigned. What's my mistake?


IMO the field was specified as unsigned int, so the 4 bits are
expressed as such a type. Then the arithmetic with an int is done,
and arithmetic between unsigned and signed requires that the signed
be promoted to unsigned. Thus the result is unsigned.

Nov 14 '05 #15

P: n/a

CBFalconer wrote:
Wojtek Lerch wrote:
CBFalconer wrote:
I was wrong - bar and bar. For some reason I assumed the a field
was specified as int. Chris Williams made me look again.


Since all the possible values of a four-bit unsigned bit-field can
be represented by an int, I'd expect it to get promoted to int,
rather than unsigned. What's my mistake?


IMO the field was specified as unsigned int, so the 4 bits are
expressed as such a type. Then the arithmetic with an int is done,
and arithmetic between unsigned and signed requires that the signed
be promoted to unsigned. Thus the result is unsigned.


But isn't that only true if we're converting between types of the same
width?
In this case, the 4-bit field is to be converted to an int. Since an int
can hold any value a 4-bit field can, doesn't the integral promotion
default to an int (signed)?

Nov 14 '05 #16

P: n/a
In message <11**********************@z14g2000cwz.googlegroups .com>
"TTroy" <ti*****@gmail.com> wrote:

CBFalconer wrote:
Wojtek Lerch wrote:
CBFalconer wrote:

> I was wrong - bar and bar. For some reason I assumed the a field
> was specified as int. Chris Williams made me look again.

Since all the possible values of a four-bit unsigned bit-field can
be represented by an int, I'd expect it to get promoted to int,
rather than unsigned. What's my mistake?


IMO the field was specified as unsigned int, so the 4 bits are
expressed as such a type. Then the arithmetic with an int is done,
and arithmetic between unsigned and signed requires that the signed
be promoted to unsigned. Thus the result is unsigned.


But isn't that only true if we're converting between types of the same
width. In this case,the 4-bit field is to be converted to an int. Since an
int can hold any value a 4-bit field can, doesn't the integral promotion
default to an int(signed)?


Section 6.3.1.1p2:

"The following may be used in an expression wherever an int or unsigned
int may be used:

- An object or expression with an integer type whose integer
conversion rank is less than the rank of int and unsigned int.
- A bit-field of type _Bool, int, signed int or unsigned int.

If an int can represent all values of the original type, the value
is converted to an int; otherwise it is converted to an unsigned int.
These are called the integer promotions."

The ambiguity arises from what the "original type" is, and hence what "all
values" are. In the case of the 4-bit unsigned bitfield, is it of type
unsigned int, so all values are 0..UINT_MAX, or are all values 0..15?

I'd always understood it to be the latter interpretation, so it promotes
to int. A quick check of our compiler agrees with this - it promotes to int,
unless in pcc compatibility mode where it promotes to unsigned int.

This may be supported by 6.7.2.1p9 which states:

"A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits."

That at least gives wording strong enough to allow the bit-field to have
a distinct type with range 0..15 for the purposes of 6.3.1.1p2.
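
For anyone who wants to ask their own compiler which interpretation it takes, here is a rough probe. It needs C11 for _Generic, which postdates this discussion, and the macro, struct and function names are all made up for the sketch. It also assumes unsigned int has no padding bits, so that a field of sizeof(unsigned int) * CHAR_BIT bits is allowed.

```c
#include <limits.h>

/* 1 if (expr) + 0 has type int, 0 if unsigned int, -1 otherwise.
 * Adding 0 forces the integer promotions to be applied to expr. */
#define PROMOTES_TO_INT(expr) \
    _Generic((expr) + 0, int: 1, unsigned int: 0, default: -1)

struct Probe {
    unsigned int narrow : 4;
    unsigned int full   : sizeof(unsigned int) * CHAR_BIT;
};

/* 1 under the "values are 0..15" reading, 0 under "0..UINT_MAX". */
int narrow_field_promotes_to_int(void)
{
    struct Probe p = {0, 0};
    return PROMOTES_TO_INT(p.narrow);
}

/* A full-width unsigned field cannot fit in int under either reading. */
int full_field_promotes_to_int(void)
{
    struct Probe p = {0, 0};
    return PROMOTES_TO_INT(p.full);
}
```

A compiler that treats the 4-bit field as a type with range 0..15 gives 1 for the narrow field and 0 for the full-width one.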

--
Kevin Bracey, Principal Software Engineer
Tematic Ltd Tel: +44 (0) 1223 503464
182-190 Newmarket Road Fax: +44 (0) 1728 727430
Cambridge, CB5 8HE, United Kingdom WWW: http://www.tematic.com/
Nov 14 '05 #17

P: n/a
CBFalconer wrote:
Wojtek Lerch wrote:
CBFalconer wrote:

I was wrong - bar and bar. For some reason I assumed the a field
was specified as int. Chris Williams made me look again.


Since all the possible values of a four-bit unsigned bit-field can
be represented by an int, I'd expect it to get promoted to int,
rather than unsigned. What's my mistake?

IMO the field was specified as unsigned int, so the 4 bits are
expressed as such a type. Then the arithmetic with an int is done,
and arithmetic between unsigned and signed requires that the signed
be promoted to unsigned. Thus the result is unsigned.

6.5.6p4, describing subtraction: "If both operands have arithmetic type,
the usual arithmetic conversions are performed on them."

6.3.1.8p1, defining the usual arithmetic conversions: "... Otherwise,
the integer promotions are performed on both operands. ..."

6.3.1.1p2, defining the integer promotions, for among other things, bit
fields of type unsigned int: "... If an int can represent all values of
the original type, the value is converted to an int; otherwise, it is
converted to an unsigned int. These are called the _integer promotions_."
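
The order of those steps matters: the promotions come first, and the signed operand is converted to unsigned only if one operand is still unsigned int afterwards. A minimal sketch with invented function names, assuming int can hold every unsigned char value:

```c
/* int compared with unsigned int: the int operand is converted to
 * unsigned int, so -1 becomes UINT_MAX and the comparison is false. */
int minus_one_less_than_unsigned(void)
{
    int i = -1;
    unsigned int u = 1;
    return i < u;
}

/* int compared with unsigned char: the unsigned char is promoted to
 * int first, so both operands are int and -1 < 1 is true. */
int minus_one_less_than_uchar(void)
{
    int i = -1;
    unsigned char c = 1;
    return i < c;
}
```

Some compilers warn about the signed/unsigned comparison in the first function; that warning is about exactly this conversion.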

What am I missing?
Nov 14 '05 #18

P: n/a
"CBFalconer" <cb********@yahoo.com> wrote in message
news:41***************@yahoo.com...
Wojtek Lerch wrote:
CBFalconer wrote:
I was wrong - bar and bar. For some reason I assumed the a field
was specified as int. Chris Williams made me look again.


Since all the possible values of a four-bit unsigned bit-field can
be represented by an int, I'd expect it to get promoted to int,
rather than unsigned. What's my mistake?


IMO the field was specified as unsigned int, so the 4 bits are
expressed as such a type. Then the arithmetic with an int is done,
and arithmetic between unsigned and signed requires that the signed
be promoted to unsigned. Thus the result is unsigned.


The definition of integer promotion says quite specifically that for integers
whose conversion rank is less than the rank of int, as well as for
bit-fields, whether the value is converted to int or unsigned int depends on
whether int can represent all the values of the "original" type. So I guess
the question is what exactly the "original" type is for bit-fields.

Unfortunately, the standard doesn't seem pedantically consistent about the
type of bit-fields. In 6.7.2.1p9, it says that "a bit-field is interpreted
as a signed or unsigned integer type consisting of the specified number of
bits". On the other hand, 6.7.2.1p4 says, "A bit-field shall have a type
that is a qualified or unqualified version of _Bool, signed int, unsigned
int, or some other implementation-defined type", and 6.3.1.1p2 also talks
about "a bit-field of type _Bool, int, signed int, or unsigned int". Why
does the latter make the distinction between "int" and "signed int" while
the former does not? If a four-bit field is declared as "int" on an
implementation that makes it unsigned, does that mean that the bit-field is
"of type int" but "has type unsigned int" and is "interpreted" as a four-bit
unsigned type? Which of those three different types is the "original" type
that the definition of integer promotions refers to?...
Nov 14 '05 #19

P: n/a
Gah. Knew I shouldn't have posted. Or at least not without first
plugging it into a compiler to verify I wasn't talking out of ...my
ear.

Apologies,
Chris

Nov 14 '05 #20

P: n/a
Kevin Bracey wrote:
"TTroy" <ti*****@gmail.com> wrote:
CBFalconer wrote:
Wojtek Lerch wrote:
CBFalconer wrote:

> I was wrong - bar and bar. For some reason I assumed the a field
> was specified as int. Chris Williams made me look again.

Since all the possible values of a four-bit unsigned bit-field can
be represented by an int, I'd expect it to get promoted to int,
rather than unsigned. What's my mistake?

IMO the field was specified as unsigned int, so the 4 bits are
expressed as such a type. Then the arithmetic with an int is
done, and arithmetic between unsigned and signed requires that
the signed be promoted to unsigned. Thus the result is unsigned.


But isn't that only true if we're converting between types of the
same width. In this case,the 4-bit field is to be converted to an
int. Since an int can hold any value a 4-bit field can, doesn't
the integral promotion default to an int(signed)?


Section 6.3.1.1p2:

"The following may be used in an expression wherever an int or
unsigned int may be used:

- An object or expression with an integer type whose integer
conversion rank is less than the rank of int and unsigned int.
- A bit-field of type _Bool, int, signed int or unsigned int.

If an int can represent all values of the original type, the value
is converted to an int; otherwise it is converted to an unsigned
int. These are called the integer promotions."

The ambiguity arises from what the "original type" is, and hence
what "all values" are. In the case of the 4-bit unsigned bitfield,
is it of type unsigned int, so all values are 0..UINT_MAX, or are
all values 0..15?

I'd always understood it to be the latter interpretation, so it
promotes to int. A quick check of our compiler agrees with this -
it promotes to int, unless in pcc compatibility mode where it
promotes to unsigned int.

This may be supported by 6.7.2.1p9 which states:

"A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits."

That at least gives wording strong enough to allow the bit-field to
have a distinct type with range 0..15 for the purposes of 6.3.1.1p2.


Maybe the way to attack it is by how any sane code generator
designer would go about it. The first thing to do is to get the
memory block holding the thing in question into a register. The
next is to shift that register so that the field in question is
right justified. Now the question arises of what to do with the
unspecified bits. They may either be masked off to 0 (i.e. the
field was unsigned) or jammed to copies of the left hand bit of the
original field (i.e. the field was signed, assuming 2's
complement). For 1's complement things are the same here, and for
sign-magnitude different (closer to unsigned treatment, but move
the sign bit over to its proper place).
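
The shift-then-mask-or-sign-extend step can be sketched in portable C as below. This only illustrates the idea on a 32-bit word; it is not a claim about what any particular compiler emits. The function names are made up, and callers are assumed to pass 1 <= width <= 32, shift < 32 and shift + width <= 32.

```c
#include <stdint.h>

/* Unsigned field: shift right, then mask the upper bits off to 0. */
uint32_t extract_unsigned(uint32_t word, unsigned shift, unsigned width)
{
    uint32_t mask = (width >= 32u) ? UINT32_MAX
                                   : ((UINT32_C(1) << width) - 1u);
    return (word >> shift) & mask;
}

/* Signed field (two's complement): as above, then copy the field's
 * top bit into all the higher bits. */
int32_t extract_signed(uint32_t word, unsigned shift, unsigned width)
{
    uint32_t v = extract_unsigned(word, shift, width);
    uint32_t sign = UINT32_C(1) << (width - 1u);
    uint32_t ext = (v ^ sign) - sign;   /* sign extension, done unsigned */

    /* Convert to int32_t without relying on an out-of-range conversion. */
    if (ext & UINT32_C(0x80000000))
        return -(int32_t)(~ext) - 1;
    return (int32_t)ext;
}
```

For the word 0xF0 with shift 4 and width 4, extract_unsigned gives 15 and extract_signed gives -1: the same four bits, read both ways.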

After that we have an entity in a register, which is assumedly the
most convenient size that is normally used for ints, signed or
unsigned, and we can proceed from there. It seems to me that most
designers would opt for the unsigned version, because it is simpler
to process, and they are allowed to.

That means that the signed/unsigned characteristic of the bit field
is propagated into any expressions using it.

It also means that wherever given the choice, the designer will
make a bit field unsigned because it means less processing and less
chance of overflows and consequent UB.

Nov 14 '05 #21

P: n/a
CBFalconer wrote:
Wojtek Lerch wrote:
"Chris Williams" <th********@yahoo.co.jp> wrote in message

Looks like bar() bar() to me which means that either I or Falconer is
spacing out. If it can never be negative, then it will never be less
than 0 so the first statement for both of these should be false and
bar() will be called.


What can never be negative? s.a can't, but s.a-5 can.

6.7.2.1p9: "A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits." The type of s.a is a 4-bit
unsigned type. Since all the values of such a type can be represented by
int, the integer promotions convert it to int rather than to unsigned int,
and the value of s.a-5 is -5 rather than UINT_MAX-4.

Aha. The installation can interpret a bit field as either signed
or unsigned, but I believe needs to document which choice it has
taken. Thus either original behaviour can be legitimate, depending
on the system documentation. This could be used as a test for what
the documentation should say.

#include <stdio.h>

struct S
{
    unsigned int a:4;
    unsigned int b:16;
};

void foo(void) {
    puts("foo");
}

void bar(void) {
    puts("bar");
}

int main(void)
{
    struct S s;
    s.a = 0;
    s.b = 0;

    unsigned char c = 0;
    unsigned u = 0;

    if (s.a - 5 < 0)
        foo();
    else
        bar();

    if (s.b - 5 < 0)
        foo();
    else
        bar();

    if (c - 5 < 0)
        foo();
    else
        bar();

    if (u - 5 < 0)
        foo();
    else
        bar();

    return 0;
}
My output is..

foo
foo
foo
bar

...and it must be so with any C compiler. It's called Integral Promotion.

From K&R2 A6.1 pp 197

"A character, short integer or an integer bit-field, all either
signed or not, or an object of enumeration type, may be used in an
expression wherever an integer may be used. If an int can represent all
the values of the original type, then the value is converted to int;
otherwise the value is converted to unsigned int."

In the first three cases the values of the narrower types are converted
to int before 5 is subtracted from its value yielding a negative result.
In the fourth case the maximum value of u cannot be represented by an
int and so remains unsigned and therefore positive.
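
Whether "an int can represent all the values" depends on how wide int is on the implementation at hand, so the test can be written out as a small helper. This is only a sketch: the function name is hypothetical, and it assumes the reading in which the values of an N-bit unsigned field are 0 to 2^N - 1.

```c
#include <limits.h>

/* Returns 1 if every value of an unsigned bit-field of the given width
 * (0 .. 2^width - 1) can be represented by int on this implementation. */
int unsigned_field_fits_in_int(unsigned width)
{
    unsigned long long max_value;

    if (width >= sizeof(unsigned long long) * CHAR_BIT)
        max_value = ULLONG_MAX;          /* avoid an over-wide shift */
    else
        max_value = (1ULL << width) - 1ULL;

    return max_value <= (unsigned long long)INT_MAX;
}
```

With a 32-bit int this returns 1 for widths 4 and 16. With a 16-bit int, as in the original question, INT_MAX is 32767, so a 16-bit unsigned field would not fit and the second test would be expected to go the other way under this reading.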

--
Joe Wright mailto:jo********@comcast.net
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Nov 14 '05 #22

P: n/a
"CBFalconer" <cb********@yahoo.com> wrote in message
news:41***************@yahoo.com...
Kevin Bracey wrote:

"A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits."

That at least gives wording strong enough to allow the bit-field to
have a distinct type with range 0..15 for the purposes of 6.3.1.1p2.
Maybe the way to attack it is by how any sane code generator
designer would go about it. [...]


Not really. This isn't about what a sane implementor is likely to do. It's
more about how much the C standard promises to forgive an insane
implementor...

.... After that we have an entity in a register, which is assumedly the
most convenient size that is normally used for ints, signed or
unsigned, and we can proceed from there. It seems to me that most
designers would opt for the unsigned version, because it is simpler
to process, and they are allowed to.


The only thing they are allowed to do is decide whether a bit-field declared as
plain "int" is signed or unsigned. Once that's been decided, you can forget
about plain "int".

After you've gone through the shifting and masking or whatever and you have
the bit pattern in a register, you don't get to decide, again, whether to
interpret it as a signed int or an unsigned int. The standard says that it
depends on whether the range of the "original type" fits into the range of
int. The text is unclear on what the "original type" means, but doesn't
sound as if it were meant to give implementors a choice here.
Nov 14 '05 #23

P: n/a
On Fri, 28 Jan 2005 21:01:52 GMT, CBFalconer <cb********@yahoo.com>
wrote in comp.lang.c:
Kevin Bracey wrote:
"TTroy" <ti*****@gmail.com> wrote:
CBFalconer wrote:
Wojtek Lerch wrote:
> CBFalconer wrote:
>
>> I was wrong - bar and bar. For some reason I assumed the a field
>> was specified as int. Chris Williams made me look again.
>
> Since all the possible values of a four-bit unsigned bit-field can
> be represented by an int, I'd expect it to get promoted to int,
> rather than unsigned. What's my mistake?

IMO the field was specified as unsigned int, so the 4 bits are
expressed as such a type. Then the arithmetic with an int is
done, and arithmetic between unsigned and signed requires that
the signed be promoted to unsigned. Thus the result is unsigned.

But isn't that only true if we're converting between types of the
same width. In this case,the 4-bit field is to be converted to an
int. Since an int can hold any value a 4-bit field can, doesn't
the integral promotion default to an int(signed)?

Section 6.3.1.1p2:

"The following may be used in an expression wherever an int or
unsigned int may be used:

- An object or expression with an integer type whose integer
conversion rank is less than the rank of int and unsigned int.
- A bit-field of type _Bool, int, signed int or unsigned int.

If an int can represent all values of the original type, the value
is converted to an int; otherwise it is converted to an unsigned
int. These are called the integer promotions."

The ambiguity arises from what the "original type" is, and hence
what "all values" are. In the case of the 4-bit unsigned bitfield,
is it of type unsigned int, so all values are 0..UINT_MAX, or are
all values 0..15?

I'd always understood it to be the latter interpretation, so it
promotes to int. A quick check of our compiler agrees with this -
it promotes to int, unless in pcc compatibility mode where it
promotes to unsigned int.

This may be supported by 6.7.2.1p9 which states:

"A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits."

That at least gives wording strong enough to allow the bit-field to
have a distinct type with range 0..15 for the purposes of 6.3.1.1p2.


Maybe the way to attack it is by how any sane code generator
designer would go about it. The first thing to do is to get the
memory block holding the thing in question into a register. The
next is to shift that register so that the field in question is
right justified. Now the question arises of what to do with the
unspecified bits. They may either be masked off to 0 (i.e. the
field was unsigned) or jammed to copies of the left hand bit of the
original field (i.e. the field was signed, assuming 2's
complement). For 1's complement things are the same here, and for
sign-magnitude different (closer to unsigned treatment, but move
the sign bit over to its proper place).

After that we have an entity in a register, which is assumedly the
most convenient size that is normally used for ints, signed or
unsigned, and we can proceed from there. It seems to me that most
designers would opt for the unsigned version, because it is simpler
to process, and they are allowed to.


Up to here, you're doing OK.

That means that the signed/unsigned characteristic of the bit field
is propagated into any expressions using it.

Now you've stumbled.

Think about it, you have just copied a storage unit full of bits into
a register, and perhaps right shifted that register to place the bit
field in the least significant bits of that register. Because the bit
field is defined as unsigned, you fill all the higher bits of the
registers with 0, most likely with a bitwise AND. If the bit field
had been signed and contained a positive value, you would have done
the same.

So now you have the value of the bit field in an int-sized or larger
register. Is it in a signed int register, or an unsigned int
register? Er, there aren't such things, it's just in an int-sized
register. Perhaps in 1948 there was a vacuum tube and relay based
computer that actually had, and needed, separate signed integer and
unsigned integer registers, but I rather doubt it.

So we still have the value of the bit field, right justified with
leading 0 bits, in an integer register. Is it signed or unsigned?
Impossible to tell at this point, since the object representation of a
positive value in a signed integer type is required by the standard to
be absolutely identical to that of the same value in the corresponding
unsigned integer type.

So we have a value in a register that is either a signed or unsigned
int, impossible to tell from looking at the bits. Whether the C
object type of that register is signed int or unsigned int depends on
what happens next at the object code level:

1. On some architectures and for some operations, it depends on what
processor instructions are executed using the contents of the
register. I seem to remember some processors that had different
instructions for signed and unsigned operations such as multiply and
divide, but I could be wrong.

2. Far more commonly, the actual significance of whether the register
was signed or unsigned only happens after the operation instruction is
performed, and affects the interpretation of the result, or even of
whether or not the result is defined.

So there is no difference in overhead whatsoever in converting the
unsigned bit field to either a signed or unsigned int.
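The point that the bits in the register do not say whether the value is signed can be sketched in C. This is only an illustration, assuming a two's complement machine where int and unsigned int have the same size and no padding bits; the function names are made up for the example and come from nobody's compiler.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Subtract 5 in unsigned arithmetic: the result wraps modulo UINT_MAX + 1. */
unsigned int sub5_unsigned(unsigned int v)
{
    return v - 5u;
}

/* Subtract 5 in signed arithmetic. */
int sub5_signed(int v)
{
    return v - 5;
}

/* 1 if the unsigned result for 0 is the wrapped value UINT_MAX - 4. */
int unsigned_result_wraps(void)
{
    return sub5_unsigned(0u) == UINT_MAX - 4u;
}

/* 1 if the two objects hold exactly the same bytes. */
int same_bits(unsigned int u, int i)
{
    assert(sizeof u == sizeof i);  /* assumption of this sketch */
    return memcmp(&u, &i, sizeof u) == 0;
}
```

Under those assumptions, same_bits(sub5_unsigned(0u), sub5_signed(0)) reports identical bytes, yet only the signed result compares less than 0. The difference lies in how the result is interpreted, not in the subtraction itself.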
It also means that wherever given the choice, the designer will
make a bit field unsigned because it means less processing and less
chance of overflows and consequent UB.


That can be stated much more simply and succinctly as:

The programmer should use unsigned bit fields when only positive
values need to be stored, and signed bit fields when both positive and
negative values are used.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 14 '05 #24

P: n/a
Joe Wright wrote:
CBFalconer wrote:
Wojtek Lerch wrote:
"Chris Williams" <th********@yahoo.co.jp> wrote in message
Looks like bar() bar() to me which means that either I or Falconer is
spacing out. If it can never be negative, then it will never be less
than 0 so the first statement for both of these should be false and
bar() will be called.
What can never be negative? s.a can't, but s.a-5 can.

6.7.2.1p9: "A bit-field is interpreted as a signed or unsigned
integer type consisting of the specified number of bits." The type
of s.a is a 4-bit unsigned type. Since all the values of such a type
can be represented by int, the integer promotions convert it to int
rather than to unsigned int, and the value of s.a-5 is -5 rather
than UINT_MAX-4.

Aha. The installation can interpret a bit field as either signed
or unsigned, but I believe needs to document which choice it has
taken. Thus either original behaviour can be legitimate, depending
on the system documentation. This could be used as a test for what
the documentation should say.

#include <stdio.h>

struct S
{
unsigned int a:4;
unsigned int b:16;
};

void foo(void) {
puts("foo");
}

void bar(void) {
puts("bar");
}

int main(void)
{
struct S s;
s.a = 0;
s.b = 0;

unsigned char c = 0;
unsigned u = 0;

if (s.a - 5 < 0)
foo();
else
bar();

if (s.b - 5 < 0)
foo();
else
bar();

if (c - 5 < 0)
foo();
else
bar();

if (u - 5 < 0)
foo();


(u - 5 < 0) will always be false.
else
bar();


<snip>
Regards,
Jonathan.

--
"We must do something. This is something. Therefore, we must do this."
- Keith Thompson
Nov 14 '05 #25

P: n/a
Jack Klein wrote:
CBFalconer <cb********@yahoo.com>
Kevin Bracey wrote:
"TTroy" <ti*****@gmail.com> wrote:
CBFalconer wrote:
> Wojtek Lerch wrote:
>> CBFalconer wrote:
>>
>>> I was wrong - bar and bar. For some reason I assumed the a field
>>> was specified as int. Chris Williams made me look again.
>>
>> Since all the possible values of a four-bit unsigned bit-field can
>> be represented by an int, I'd expect it to get promoted to int,
>> rather than unsigned. What's my mistake?
>
> IMO the field was specified as unsigned int, so the 4 bits are
> expressed as such a type. Then the arithmetic with an int is
> done, and arithmetic between unsigned and signed requires that
> the signed be promoted to unsigned. Thus the result is unsigned.

But isn't that only true if we're converting between types of the
same width. In this case,the 4-bit field is to be converted to an
int. Since an int can hold any value a 4-bit field can, doesn't
the integral promotion default to an int(signed)?

Section 6.3.1.1p2:

"The following may be used in an expression wherever an int or
unsigned int may be used:

- An object or expression with an integer type whose integer
conversion rank is less than the rank of int and unsigned int.
- A bit-field of type _Bool, int, signed int or unsigned int.

If an int can represent all values of the original type, the value
is converted to an int; otherwise it is converted to an unsigned
int. These are called the integer promotions."

The ambiguity arises from what the "original type" is, and hence
what "all values" are. In the case of the 4-bit unsigned bitfield,
is it of type unsigned int, so all values are 0..UINT_MAX, or are
all values 0..15?

I'd always understood it to be the latter interpretation, so it
promotes to int. A quick check of our compiler agrees with this -
it promotes to int, unless in pcc compatibility mode where it
promotes to unsigned int.

This may be supported by 6.7.2.1p9 which states:

"A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits."

That at least gives wording strong enough to allow the bit-field to
have a distinct type with range 0..15 for the purposes of 6.3.1.1p2.
Maybe the way to attack it is by how any sane code generator
designer would go about it. The first thing to do is to get the
memory block holding the thing in question into a register. The
next is to shift that register so that the field in question is
right justified. Now the question arises of what to do with the
unspecified bits. They may either be masked off to 0 (i.e. the
field was unsigned) or jammed to copies of the left hand bit of the
original field (i.e. the field was signed, assuming 2's
complement). For 1's complement things are the same here, and for
sign-magnitude different (closer to unsigned treatment, but move
the sign bit over to its proper place).

After that we have an entity in a register, which is assumedly the
most convenient size that is normally used for ints, signed or
unsigned, and we can proceed from there. It seems to me that most
designers would opt for the unsigned version, because it is simpler
to process, and they are allowed to.


Up to here, you're doing OK.
That means that the signed/unsigned characteristic of the bit field
is propagated into any expressions using it.


Now you've stumbled.

Think about it, you have just copied a storage unit full of bits
into a register, and perhaps right shifted that register to place
the bit field in the least significant bits of that register.
Because the bit field is defined as unsigned, you fill all the
higher bits of the registers with 0, most likely with a bitwise
AND. If the bit field had been signed and contained a positive
value, you would have done the same.


No, you've missed the complications involved in assuming the bit
field to be signed. That means the other bits have to be set to
copies of the fields sign bit, in either 1's or 2's complement
machines. For sign magnitude the appropriate bit has to be
exchanged with the sign bit, after zeroing the extra bits. These
manipulations are more complex than the unsigned version (which
simply zeroes some bits), and thus to be avoided. Laziness is a
virtue here.
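The two ways of filling the upper bits described above can be written out by hand. This is a rough sketch, assuming two's complement fields and a field narrower than unsigned int; the helper names and the (word, shift, width) interface are invented for illustration and are not what any particular code generator emits.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits assumed for unsigned int in this sketch. */
#define WORD_BITS (sizeof(unsigned int) * CHAR_BIT)

/* Unsigned field: shift it down to bit 0 and mask the upper bits to zero. */
unsigned int extract_unsigned(unsigned int word, unsigned int shift,
                              unsigned int width)
{
    assert(width >= 1 && width < WORD_BITS && shift + width <= WORD_BITS);
    return (word >> shift) & ((1u << width) - 1u);
}

/* Signed field: same extraction, then copy the field's top bit upwards.
   Flipping the sign bit and subtracting its weight does this portably. */
int extract_signed(unsigned int word, unsigned int shift, unsigned int width)
{
    unsigned int v = extract_unsigned(word, shift, width);
    unsigned int sign = 1u << (width - 1u);
    return (int)(v ^ sign) - (int)sign;
}
```

For the word 0xF0 and a 4-bit field at bit 4, the unsigned extraction gives 15 and the signed one gives -1. The signed path costs one extra XOR and subtract here, which is the extra work being argued about.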

In all cases we now have a bit pattern in a register, and external
type knowledge saying whether that pattern describes a signed or
unsigned integer. That external knowledge comes from the original
declaration of the bit field. That knowledge also governed whether
or not to go through the sign-extension gyrations described above.

All further processing is done as if the reworked register content
had been loaded in one fell swoop from somewhere, together with the
un/signed type knowledge.

Having gotten here with our sane code generator implementor, I
maintain we now have the right clue about how to handle the usual
arithmetic conversions on the bit field. We now base them on the
original declaration as un/signed, because that minimizes the
work. This is the final clue as to what the standard should say,
were it to say anything, which so far it does not AFAICT. We do
not base it on the range of values the bit field can hold.

.... snip ...
The programmer should use unsigned bit fields when only positive
values need to be stored, and signed bit fields when both positive
and negative values are used.


We are not trying to constrain the programmer, we are trying to
interpret what s/he actually wrote.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
Nov 14 '05 #26

P: n/a
"TTroy" <ti*****@gmail.com> wrote in message
news:11*********************@f14g2000cwb.googlegro ups.com...

xarax wrote:
"Wojtek Lerch" <Wo******@yahoo.ca> wrote in message
news:35*************@individual.net...
> "Chris Williams" <th********@yahoo.co.jp> wrote in message
> news:11**********************@c13g2000cwb.googlegr oups.com...
>> Looks like bar() bar() to me which means that either I or Falconer is
>> spacing out. If it can never be negative, then it will never be less
>> than 0 so the first statement for both of these should be false and
>> bar() will be called.
>
> What can never be negative? s.a can't, but s.a-5 can.
>
>
> 6.7.2.1p9: "A bit-field is interpreted as a signed or unsigned integer type
> consisting of the specified number of bits."


That's apparently what the standard says.

> The type of s.a is a 4-bit unsigned type. Since all the values of such a type
> can be represented by int, the integer promotions convert it to int rather
> than to unsigned int, and the value of s.a-5 is -5 rather than UINT_MAX-4.

You added that part.

Are you saying that, say, an 8-bit unsigned char can
be promoted to a wider signed int? Please quote the
standard that says unsigned integers can be widened
to signed integers.


Like I said in my other post, there is total confusion around integral
promotions and arithmetic conversions. All 5 of the 5 C programmers at
my workplace don't understand it, which says a lot (about them and the
topic). I bet Chris Torek is the only one who truly understands it.


You're saying that the standard allows an unsigned
integer type to be demoted to signed integer type?
Nov 14 '05 #27

P: n/a
"xarax" <xa***@email.com> wrote in message
news:bi***************@newsread1.news.pas.earthlin k.net...
You're saying that the standard allows an unsigned
integer type to be demoted to signed integer type?


"If an int can represent all values of the original type, the value is
converted to an int; otherwise, it is converted to an unsigned int. These
are called the integer promotions." (6.3.1.1p2)
Nov 14 '05 #28

P: n/a
xarax wrote:
"TTroy" <ti*****@gmail.com> wrote in message
news:11*********************@f14g2000cwb.googlegro ups.com...

xarax wrote: ....
Are you saying that, say, an 8-bit unsigned char can
be promoted to a wider signed int? Please quote the
standard that says unsigned integers can be widened
to signed integers.
.... You're saying that the standard allows an unsigned
integer type to be demoted to signed integer type?


No, we're saying that the standard REQUIRES an unsigned type whose
values can all be represented by an int to be PROmoted to an int. See
section 6.3.1.1p2.
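For ordinary types the promoted type can be checked at compile time. The sketch below uses _Generic, which needs a C11 compiler and so postdates the C89/C99 question asked here; the macro and function names are hypothetical. It deliberately covers only unsigned char and unsigned int, since what a bit-field operand yields is the very point in dispute.

```c
#include <assert.h>
#include <limits.h>

/* The sketch assumes int can hold every unsigned char value. */
static_assert(UCHAR_MAX <= INT_MAX, "unsigned char must fit in int");

/* Classify the type of x after the usual promotions:
   1 = int, 2 = unsigned int, 0 = anything else. */
#define PROMOTED_KIND(x) _Generic((x) + 0, int: 1, unsigned int: 2, default: 0)

int kind_of_unsigned_char(void)
{
    unsigned char c = 0;
    return PROMOTED_KIND(c);
}

int kind_of_unsigned_int(void)
{
    unsigned int u = 0;
    return PROMOTED_KIND(u);
}
```

kind_of_unsigned_char() returns 1 and kind_of_unsigned_int() returns 2: the unsigned char becomes a signed int because int can represent all of its values, exactly as 6.3.1.1p2 says.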

Nov 14 '05 #29

P: n/a
On Sat, 29 Jan 2005 11:26:02 GMT, CBFalconer <cb********@yahoo.com>
wrote in comp.std.c:
Jack Klein wrote:
CBFalconer <cb********@yahoo.com>
Kevin Bracey wrote:
"TTroy" <ti*****@gmail.com> wrote:
> CBFalconer wrote:
>> Wojtek Lerch wrote:
>>> CBFalconer wrote:

Maybe the way to attack it is by how any sane code generator
designer would go about it. The first thing to do is to get the
memory block holding the thing in question into a register. The
next is to shift that register so that the field in question is
right justified. Now the question arises of what to do with the
unspecified bits. They may either be masked off to 0 (i.e. the
field was unsigned) or jammed to copies of the left hand bit of the
original field (i.e. the field was signed, assuming 2's
complement). For 1's complement things are the same here, and for
sign-magnitude different (closer to unsigned treatment, but move
the sign bit over to its proper place).

After that we have an entity in a register, which is assumedly the
most convenient size that is normally used for ints, signed or
unsigned, and we can proceed from there. It seems to me that most
designers would opt for the unsigned version, because it is simpler
to process, and they are allowed to.

Up to here, you're doing OK.

That means that the signed/unsigned characteristic of the bit field
is propagated into any expressions using it.


Now you've stumbled.

Think about it, you have just copied a storage unit full of bits
into a register, and perhaps right shifted that register to place
the bit field in the least significant bits of that register.
Because the bit field is defined as unsigned, you fill all the
higher bits of the registers with 0, most likely with a bitwise
AND. If the bit field had been signed and contained a positive
value, you would have done the same.


No, you've missed the complications involved in assuming the bit
field to be signed. That means the other bits have to be set to
copies of the fields sign bit, in either 1's or 2's complement
machines. For sign magnitude the appropriate bit has to be
exchanged with the sign bit, after zeroing the extra bits. These
manipulations are more complex than the unsigned version (which
simply zeroes some bits), and thus to be avoided. Laziness is a
virtue here.


The gyrations involved in sign extending negative signed bit fields
on non 2's complement platforms are relevant, and no different than
those such a platform must go through to convert an ordinary signed
integer type with a negative value to a wider signed type. If there
are any such monstrosities still in existence with current C compiler
support, they pay for the obsolete architecture.

In all cases we now have a bit pattern in a register, and external
type knowledge saying whether that pattern describes a signed or
unsigned integer. That external knowledge comes from the original
declaration of the bit field. That knowledge also governed whether
or not to go through the sign-extension gyrations described above.

All further processing is done as if the reworked register content
had been loaded in one fell swoop from somewhere, together with the
un/signed type knowledge.

What does the "signed/unsigned" knowledge have to do with it? On most
architectures, chars have fewer bits than ints and UCHAR_MAX <
INT_MAX. So in an expression involving an unsigned char, you wind up
with the same thing, namely a narrower bit field filling an int size
object, and the knowledge of whether the object it came from was
signed or unsigned. Despite the fact that the value originated in an
unsigned char, the int-sized object must be treated as signed.

Having gotten here with our sane code generator implementor, I
maintain we now have the right clue about how to handle the usual
arithmetic conversions on the bit field. We now base them on the
original declaration as un/signed, because that minimizes the
work. This is the final clue as to what the standard should say,
were it to say anything, which so far it does not AFAICT. We do
not base it on the range of values the bit field can hold.

Admittedly it is unfortunate that the standard does not specifically
mention bit fields in describing the usual integer conversions, and
hopefully that can be rectified in a TC or later version.

But since the standard selected what they call "value preserving" over
"sign preserving" operation, it would be seriously inconsistent and a
real source of problems if an 8-bit unsigned char promoted to a signed
int but an 8-bit unsigned bit field promoted to an unsigned int. That
would be rather absurd, don't you think?

... snip ...

The programmer should use unsigned bit fields when only positive
values need to be stored, and signed bit fields when both positive
and negative values are used.

We are not trying to constrain the programmer, we are trying to
interpret what s/he actually wrote.


Ah, you snipped your particular statement that my comment addressed,
so I am putting it back in:
It also means that wherever given the choice, the designer will
make a bit field unsigned because it means less processing and less
chance of overflows and consequent UB.


I misinterpreted your meaning, so my comment doesn't apply. I was
thrown off by what I think is some incompleteness in your wording. I
think what you meant to say by "make a bit field unsigned" would be
better conveyed by the words "make an unsigned bit field promote to
unsigned int".

But despite the omission from the standard, it seems silly to think
that the compiler designer is given a choice here. Since all other
promotions and conversions are rather scrupulously defined, I find it
hard to believe that the intent was to leave the results of using an
unsigned bit field in an expression implementation defined. In fact,
nothing is implementation-defined unless the standard specifically
states that it is implementation-defined.

In fact, given the lack of mention, using the value of a bit field in
an expression technically produces undefined behavior based on the
wording of the standard today.

--
Jack Klein
Nov 14 '05 #30

P: n/a
"Jack Klein" <ja*******@spamcop.net> wrote in message
news:ho********************************@4ax.com...
In fact, given the lack of mention, using the value of a bit field in
an expression technically produces undefined behavior based on the
wording of the standard today.


It's not exactly a lack of mention: 6.3.1.1p2 does mention bit-fields, and
it's quite clear that it attempts to define a behaviour. The only thing
that is not clear is whether the "original type" is meant to refer to the type
the bit-field is "interpreted as" according to 6.7.2.1p9, or simply to the
type specifier used in the declaration of the field, as the list in
6.3.1.1p2 seems to suggest.

In short, it's not undefined behaviour. It's unclearly defined behaviour.
:-/
Nov 14 '05 #31

P: n/a
In article <1h********************************@4ax.com>,
Jack Klein <ja*******@spamcop.net> wrote:
That can be stated much more simply and succinctly as:

The programmer should use unsigned bit fields when only positive
values need to be stored, and signed bit fields when both positive and
negative values are used.


Since this is about the only place where there is a difference between
"signed int" and "int", I would say

The programmer should use bit fields using the type "unsigned int" when
only positive values need to be stored, and bit fields using the type
"signed int" when both positive and negative values are used, but
_never_ bit fields using the type "int" on its own, because then you
don't know whether the bitfield will actually be signed or unsigned.
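Whether a plain "int" bit-field is signed on a given compiler can be probed at run time. A minimal sketch, with made-up struct and function names; it stores a small value and asks whether the field reads back negative. Only the answer for the plain field is implementation-defined.

```c
#include <assert.h>

struct fields {
    int          plain : 3;  /* signedness is implementation-defined */
    signed int   s     : 3;  /* always signed */
    unsigned int u     : 3;  /* always unsigned */
};

/* Each probe stores v and returns 1 if the field reads back negative. */
int plain_reads_negative(int v)
{
    struct fields f = {0, 0, 0};
    assert(v >= -3 && v <= 3);  /* fits a signed 3-bit field */
    f.plain = v;
    return f.plain < 0;
}

int signed_reads_negative(int v)
{
    struct fields f = {0, 0, 0};
    assert(v >= -3 && v <= 3);
    f.s = v;
    return f.s < 0;
}

int unsigned_reads_negative(int v)
{
    struct fields f = {0, 0, 0};
    f.u = v;  /* a negative v is reduced modulo 8 */
    return f.u < 0;
}
```

signed_reads_negative(-1) is always 1 and unsigned_reads_negative(-1) is always 0; plain_reads_negative(-1) tells you which of the two your compiler chose for plain "int".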

Independent from that, the integer promotions will promote any signed
bitfield to "int", and any unsigned bit field with fewer bits than type
"int" will be promoted to "int" as well. But that won't help if you
tried to store a negative value into a bitfield that was specified as
"int" if the compiler decided to make it unsigned int.
Nov 14 '05 #32

P: n/a
"Wojtek Lerch" <Wo******@yahoo.ca> wrote in message
news:36*************@individual.net...
"Jack Klein" <ja*******@spamcop.net> wrote in message
news:ho********************************@4ax.com...
In fact, given the lack of mention, using the value of a bit field in
an expression technically produces undefined behavior based on the
wording of the standard today.


It's not exactly a lack of mention: 6.3.1.1p2 does mention bit-fields, and
it's quite clear that it attempts to define a behaviour. The only thing that
is not clear is whether the "original type" is meant to refer to the type the
bit-field is "interpreted as" according to 6.7.2.1p9, or simply to the type
specifier used in the declaration of the field, as the list in 6.3.1.1p2 seems
to suggest.

In short, it's not undefined behaviour. It's unclearly defined behaviour. :-/


I think it's a question of whether bit-fields
of any length less than an int are actually
promoted at all, or whether the compiler is
allowed to treat a bit-field as a full-sized
integer (even though the source bits may be
narrower than an int).

Is a bit-field that is declared as

"unsigned int foo:CHAR_BIT;"

treated the same as an unsigned char, or
is it "unsigned int" that happens to be
loaded from a memory location that is only CHAR_BIT
wide? Are bit-fields defined in the standard
as "promoted", or are only "ordinary" integer
types "promoted" to int?

Nov 14 '05 #33

P: n/a
xarax wrote:
.... snip ...
Is a bit-field that is declared as

"unsigned int foo:CHAR_BIT;"

treated the same as an unsigned char, or
is it "unsigned int" that happens to be
loaded from a memory location that is only CHAR_BIT
wide? Are bit-fields defined in the standard
as "promoted", or are only "ordinary" integer
types "promoted" to int?


If you google back in this thread for my contributions, you will
see I favor the second alternative, and my reasons therefore.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Nov 14 '05 #34

P: n/a
In message <41***************@yahoo.com>
CBFalconer <cb********@yahoo.com> wrote:
Kevin Bracey wrote:
The ambiguity arises from what the "original type" is, and hence
what "all values" are. In the case of the 4-bit unsigned bitfield,
is it of type unsigned int, so all values are 0..UINT_MAX, or are
all values 0..15?

I'd always understood it to be the latter interpretation, so it
promotes to int. A quick check of our compiler agrees with this -
it promotes to int, unless in pcc compatibility mode where it
promotes to unsigned int.

This may be supported by 6.7.2.1p9 which states:

"A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits."

That at least gives wording strong enough to allow the bit-field to
have a distinct type with range 0..15 for the purposes of 6.3.1.1p2.
Maybe the way to attack it is by how any sane code generator
designer would go about it. The first thing to do is to get the
memory block holding the thing in question into a register. The
next is to shift that register so that the field in question is
right justified. Now the question arises of what to do with the
unspecified bits. They may either be masked off to 0 (i.e. the
field was unsigned) or jammed to copies of the left hand bit of the
original field (i.e. the field was signed, assuming 2's
complement). For 1's complement things are the same here, and for
sign-magnitude different (closer to unsigned treatment, but move
the sign bit over to its proper place).

After that we have an entity in a register, which is assumedly the
most convenient size that is normally used for ints, signed or
unsigned, and we can proceed from there. It seems to me that most
designers would opt for the unsigned version, because it is simpler
to process, and they are allowed to.


All very well, but how is this any different to the case of loading
an unsigned char? The standard states that an unsigned char is promoted
to int (assuming int is bigger than char). In the following structure,
why should a and b be treated differently?

struct
{
unsigned char a;
unsigned b:8;
} x;

It seems logical to me that bitfields should follow the same basic "value-
preserving" promotion rules as other sub-integer types. The handling
involved is basically the same, even though it may require a bit more manual
work in the code generator. Mind you, I predominantly work on a CPU
with no hardware support for signed char, so even the standard types can
require manual sign-extension.

The fact that bitfields may require a bit more work than native widths
doesn't strike me as sufficient reason to have different semantics.
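The comparison between the two members can be made concrete with a run-time check. This is a sketch using the struct from this post, under the assumption that int is wider than char; the function names are invented. The standard fixes the answer for the unsigned char member, while the answer for the bit-field is the open question and is left for the reader's compiler to report.

```c
#include <assert.h>
#include <limits.h>

/* The sketch assumes int can hold every unsigned char value. */
static_assert(UCHAR_MAX <= INT_MAX, "unsigned char must fit in int");

struct pair {
    unsigned char a;
    unsigned b : 8;
};

/* 1 if x.a - 1 is negative for x.a == 0, i.e. x.a was promoted to int. */
int member_goes_negative(void)
{
    struct pair x = {0, 0};
    return x.a - 1 < 0;
}

/* Same test on the 8-bit bit-field: 1 means it was promoted to int,
   0 means it was treated as unsigned int. */
int bitfield_goes_negative(void)
{
    struct pair x = {0, 0};
    return x.b - 1 < 0;
}
```

member_goes_negative() is 1 on any such implementation. A compiler that follows the value-preserving reading returns 1 from bitfield_goes_negative() as well; one that keeps the declared unsignedness returns 0.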
That means that the signed/unsigned characteristic of the bit field
is propagated into any expressions using it.

It also means that wherever given the choice, the designer will
make a bit field unsigned because it means less processing and less
chance of overflows and consequent UB.


I kind of agree - certainly, our compiler makes bitfields unsigned by
default. Nonetheless, unsigned bitfields smaller than 31 bits still
promote to int, because that's what we believe the standard requires.

Regardless of the formal type, the unsignedness can still be remembered. But
if you're going to bother doing that, you can do much better than just
remember an "unsigned" (ie top bit clear) flag - our compiler is capable of
tracking the number of significant low order bits in an expression.

For example, after promotion, x.b above is of type int, but the compiler
knows the value of the expression occupies 8 bits (if interpreted as
unsigned), or 9 bits (if interpreted as signed) [call it 8u/9s for short].

This information then propagates through expressions:

x.a [8u/9s]
x.b [8u/9s]
x.a+x.b [9u/10s]
x.a-x.b [32u/10s] {result could be negative, thus 32 unsigned bits}
x.a^x.b [8u/9s]
x.a*x.b [16u/17s]
x.a & 0xffff [8u/9s]

A good use of this significant bit information is to skip
narrowing/sign-extending stages when storing back into a narrow type; our
architecture has no instructions specifically for this purpose when working
in registers.
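The propagation rules behind that table can be sketched as simple functions on bit counts. This is not the compiler's actual implementation, only a guess at the arithmetic for non-negative operands; the function names are made up, and subtraction (which can go negative) is left out.

```c
#include <assert.h>

static unsigned max_u(unsigned a, unsigned b) { return a > b ? a : b; }
static unsigned min_u(unsigned a, unsigned b) { return a < b ? a : b; }

/* a + b can carry into one bit above the wider operand. */
unsigned bits_add(unsigned a, unsigned b) { return max_u(a, b) + 1u; }

/* a * b needs at most the sum of the two widths. */
unsigned bits_mul(unsigned a, unsigned b) { return a + b; }

/* a ^ b cannot set a bit above the wider operand. */
unsigned bits_xor(unsigned a, unsigned b) { return max_u(a, b); }

/* a & b cannot set a bit above the narrower operand. */
unsigned bits_and(unsigned a, unsigned b) { return min_u(a, b); }

/* A non-negative value needs one more bit, the sign bit, when read as signed. */
unsigned bits_as_signed(unsigned ubits)
{
    assert(ubits >= 1u);
    return ubits + 1u;
}
```

With both operands at 8 unsigned bits, these give 9 for the sum, 16 for the product and 8 for the XOR, and bits_and(8, 16) gives 8 for the mask with 0xffff, matching the unsigned counts in the table.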

--
Kevin Bracey, Principal Software Engineer
Tematic Ltd Tel: +44 (0) 1223 503464
182-190 Newmarket Road Fax: +44 (0) 1728 727430
Cambridge, CB5 8HE, United Kingdom WWW: http://www.tematic.com/
Nov 14 '05 #35

P: n/a
Kevin Bracey wrote:
.... snip ...
All very well, but how is this any different to the case of loading
an unsigned char? The standard states that an unsigned char is
promoted to int (assuming int is bigger than char). In the following
structure, why should a and b be treated differently?

struct
{
unsigned char a;
unsigned b:8;
} x;


Because a char, of any flavor, occupies a complete addressable
unit, and can be loaded and stored without affecting anything else
(at least as far as C is concerned). That does not apply to
bitfields, which may spread over byte demarcations (but not over
int demarcations). In addition, a char is a defined type. A
bitfield isn't. C does not have subranges and close typing.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
Nov 14 '05 #36

P: n/a
On Fri, 28 Jan 2005 16:37:52 +0000, xarax wrote:

....
That should not change the answer. Remember that the
field is unsigned, regardless of its bit width. Therefore,
the subtraction expression is unsigned.

gcc, Intel's compiler, Comeau's compiler and Metrowerks compiler all
give the answers
foo
bar


Check their documentation to see if they support
unsigned bit fields. They may be ignoring the
"unsigned" qualifier.


If they are C compilers they are required to support unsigned bit-fields.

Lawrence
Nov 14 '05 #37

P: n/a
In message <41***************@yahoo.com>
CBFalconer <cb********@yahoo.com> wrote:
Kevin Bracey wrote:
... snip ...

All very well, but how is this any different to the case of loading
an unsigned char? The standard states that an unsigned char is
promoted to int (assuming int is bigger than char). In the following
structure, why should a and b be treated differently?

struct
{
unsigned char a;
unsigned b:8;
} x;


Because a char, of any flavor, occupies a complete addressable
unit, and can be loaded and stored without affecting anything else
(at least as far as C is concerned). That does not apply to
bitfields, which may spread over byte demarcations (but not over
int demarcations).


I'd agree that there's a potential difference in the implementation (although
I'm sure there are implementations that load & store chars like a bitfield,
and others that can load aligned bitfields like x.b as a char). But is that
sufficient reason to cause such a significant difference in the semantics?

The very decision to adopt "value-preserving" promotions was to minimise
unexpected behaviour. Having bitfields alone be "unsigned-preserving" would
be rather unexpected, surely?

I suppose it's just unexpected if you have my view of bitfields though - I
automatically think of "unsigned :8" as just being a custom sub-integer type,
akin to unsigned char. Maybe you think of it as an unsigned int which just
happens to be not allocated all its bits.

In addition, a char is a defined type. A bitfield isn't.


I think 6.7.2.1p9 disagrees with you. The bitfield's type may not have a
name, but it is a type.

--
Kevin Bracey, Principal Software Engineer
Tematic Ltd Tel: +44 (0) 1223 503464
182-190 Newmarket Road Fax: +44 (0) 1728 727430
Cambridge, CB5 8HE, United Kingdom WWW: http://www.tematic.com/
Nov 14 '05 #38

P: n/a
On Fri, 28 Jan 2005 09:22:00 -0500, Wojtek Lerch wrote:
"Chris Williams" <th********@yahoo.co.jp> wrote in message
news:11**********************@c13g2000cwb.googlegroups.com...
Looks like bar() bar() to me which means that either I or Falconer is
spacing out. If it can never be negative, then it will never be less
than 0 so the first statement for both of these should be false and
bar() will be called.


What can never be negative? s.a can't, but s.a-5 can.
6.7.2.1p9: "A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits." The type of s.a is a 4-bit
unsigned type. Since all the values of such a type can be represented by
int, the integer promotions convert it to int rather than to unsigned int,
and the value of s.a-5 is -5 rather than UINT_MAX-4.


This seems to make bit-field specifications part of C's type system,
unless you want to say that "is interpreted as" means something other than
"is". I would really hate to have to go there. However if bit-fieldness is
a property of type it is rather unfortunate that this isn't mentioned in
6.2.5 (the definition of C's types). If it isn't a property of type then
for example the conversion rule specified in 6.3.1.3p2 has problems e.g.
when trying to assign the value 16 to a 4 bit unsigned bit-field:

"Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in
the new type until the value is in the range of the new type."
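
A minimal sketch of what that rule yields if the width is taken to be part of the type: storing 16 into a 4-bit unsigned bit-field reduces the value modulo 16. The struct and helper names are made up for illustration and do not come from the thread.

```c
#include <assert.h>

/* Hypothetical struct for illustration: one 4-bit unsigned bit-field. */
struct nibble {
    unsigned int a : 4;
};

/* Store v into the 4-bit field and read it back.
   If the field's type is "4-bit unsigned", the maximum value is 15,
   so 6.3.1.3p2 reduces v modulo 16. */
static unsigned int store_in_nibble(unsigned int v)
{
    struct nibble n;
    n.a = v;            /* conversion to the bit-field's type happens here */
    return n.a;
}

/* Self-check of the wrap-around described above. */
static void check_nibble_wrap(void)
{
    assert(store_in_nibble(15u) == 15u);  /* already in range */
    assert(store_in_nibble(16u) == 0u);   /* 16 - 16 */
    assert(store_in_nibble(21u) == 5u);   /* 21 - 16 */
}
```

If the width were not part of the type, "the maximum value that can be represented in the new type" would be UINT_MAX and the rule would not explain the result 0.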

There is enough of the standard that assumes that bit-field width is a
type property to make it difficult to interpret it otherwise. If it is
then consider 6.3.1.1p2:

"If an int can represent all values of the original type, the value is
converted to an int; otherwise, it is converted to an unsigned int."

Here "original type" presumably means the unpromoted type or the type of
our original value i.e. what we are promoting from. If bit-field width is
part of a type then the width must be significant in determining whether
the value is promoted to int or unsigned int.

In which case I agree that s.a - 5 must evaluate to -5 in the original
example.
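
A small sketch of that conclusion, assuming a compiler that applies 6.3.1.1p2 with the bit-field width as part of the type and an unsigned int with no padding bits. The struct and function names are hypothetical; the full-width field stands in for b:16 on the 16-bit implementation from the original question.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical struct: a narrow field, and a field as wide as unsigned int
   (assumes unsigned int has no padding bits). */
struct widths {
    unsigned int narrow : 4;
    unsigned int full   : sizeof(unsigned int) * CHAR_BIT;
};

/* int can represent 0..15, so the narrow field promotes to int and
   0 - 5 is the int value -5. */
static int narrow_minus_five_is_negative(void)
{
    struct widths w = {0, 0};
    return w.narrow - 5 < 0;
}

/* int cannot represent every value of the full-width field, so it
   promotes to unsigned int and the subtraction wraps around. */
static int full_minus_five_is_negative(void)
{
    struct widths w = {0, 0};
    return w.full - 5 < 0;
}

/* Self-check of the two outcomes under value-preserving promotion. */
static void check_promotion(void)
{
    assert(narrow_minus_five_is_negative() == 1);
    assert(full_minus_five_is_negative() == 0);
}
```

Under this reading the first test in the original program calls foo() and the second calls bar().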

Lawrence
Nov 14 '05 #39

P: n/a
"CBFalconer" <cb********@yahoo.com> wrote in message
news:41***************@yahoo.com...
[big snip]

What are you trying to say?

That bit-fields declared with plain int should be unsigned?
That bit-fields should be "unsigned preserving"?
Both? Something else?

Promoting a signed bit-field to (signed) int requires extra effort to copy
or move the sign bit, but promoting an unsigned bit-field to (signed) int
does not.
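
To make the "extra effort" concrete, here is a sketch of what a code generator might do by hand when a field is pulled out of a word. It assumes two's complement and 0 < width < the number of bits in unsigned int; the helper names are invented for the example.

```c
#include <assert.h>

/* Extract `width` bits starting at bit `shift` as an unsigned value.
   Assumes 0 < width < number of bits in unsigned int. */
static unsigned int extract_unsigned(unsigned int word,
                                     unsigned int shift,
                                     unsigned int width)
{
    unsigned int mask = (1u << width) - 1u;
    return (word >> shift) & mask;       /* shift and mask, nothing more */
}

/* Read the same bits as a two's complement field: one extra step
   is needed to propagate the field's sign bit. */
static int extract_signed(unsigned int word,
                          unsigned int shift,
                          unsigned int width)
{
    unsigned int raw  = extract_unsigned(word, shift, width);
    unsigned int sign = 1u << (width - 1u);
    /* Flip the sign bit, then subtract its weight to sign-extend. */
    return (int)(raw ^ sign) - (int)sign;
}

/* Self-check on a 4-bit field stored in bits 4..7. */
static void check_extract(void)
{
    assert(extract_unsigned(0xF0u, 4u, 4u) == 15u);
    assert(extract_signed(0xF0u, 4u, 4u) == -1);
    assert(extract_signed(0x70u, 4u, 4u) == 7);
}
```

The unsigned result is already a non-negative value that fits in int, so treating it as int afterwards costs nothing further.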

Alex
Nov 14 '05 #40

P: n/a
On Mon, 31 Jan 2005 14:07:25 +0000, Kevin Bracey wrote:
> In message <41***************@yahoo.com>
> CBFalconer <cb********@yahoo.com> wrote:
>> Kevin Bracey wrote:
>>> ... snip ...
>>>
>>> All very well, but how is this any different to the case of loading
>>> an unsigned char? The standard states that an unsigned char is
>>> promoted to int (assuming int is bigger than char). In the following
>>> structure, why should a and b be treated differently?
>>>
>>> struct
>>> {
>>> unsigned char a;
>>> unsigned b:8;
>>> } x;
>>
>> Because a char, of any flavor, occupies a complete addressable
>> unit, and can be loaded and stored without affecting anything else
>> (at least as far as C is concerned).

It IS the case that bit-fields can be loaded and stored without affecting
anything else, as far as C is concerned, even if the implementation has
to jump through hoops to achieve it.

>> That does not apply to
>> bitfields, which may spread over byte demarcations (but not over
>> int demarcations).

The statement is also true of ints. The main thing about bitfields is that
they don't necessarily occupy a whole number of bytes, i.e. a
particular byte may be used by more than one distinct object.

> I'd agree that there's a potential difference in the implementation
> (although I'm sure there are implementations that load & store chars
> like a bitfield, and others that can load aligned bitfields like x.b as
> a char). But is that sufficient reason to cause such a significant
> difference in the semantics?

IMO implementation details are not a big issue here; they don't make much
difference one way or the other. The issue is how you interpret the C
standard. It is just one case of the overall value-preserving nastiness.

> The very decision to adopt "value-preserving" promotions was to minimise
> unexpected behaviour.

In which case it singularly failed. Having unsigned types switch to signed
types through non-obvious rules is what most people find "unexpected". :-)

> Having bitfields alone be "unsigned-preserving" would
> be rather unexpected, surely?

Not if you viewed the type of the bit-field as unsigned int to start with.
In that case having a value of type unsigned int "promote" to int would be
odd indeed. However, as per my other post, it appears that shorter unsigned
int bitfields do promote to int.

> I suppose it's just unexpected if you have my view of bitfields though -
> I automatically think of "unsigned :8" as just being a custom
> sub-integer type, akin to unsigned char. Maybe you think of it as an
> unsigned int which just happens to be not allocated all its bits.

Bit-fieldness does seem to be a second class property of types.

>> In addition, a char is a defined type. A bitfield isn't.
>
> I think 6.7.2.1p9 disagrees with you. The bitfield's type may not have a
> name, but it is a type.

Whatever is meant by "is interpreted as". Is it a type (modifier) or isn't
it? If bit-fields are part of the type system why aren't they mentioned in
6.2.5?

Lawrence

Nov 14 '05 #41

P: n/a
Jack Klein wrote:

....lot of snippage...
Having gotten here with our sane code generator implementor, I
maintain we now have the right clue about how to handle the usual
arithmetic conversions on the bit field. We now base them on the
original declaration as un/signed, because that minimizes the
work. This is the final clue as to what the standard should say,
were it to say anything, which so far it does not AFAICT. We do
not base it on the range of values the bit field can hold.

Admittedly it is unfortunate that the standard does not specifically
mention bit fields in describing the usual integer conversions, and
hopefully that can be rectified in a TC or later version.

But since the standard selected what they call "value preserving"
over "sign preserving" operation, it would be seriously inconsistent and a
real source of problems if an 8-bit unsigned char promoted to a signed int
but an 8-bit unsigned bit field promoted to an unsigned int. That would be
rather absurd, don't you think?


One could claim that it is equally absurd to have an unsigned bit field
of width 15 promoted to int and one of width 16 to unsigned int
(following value-preservation rules on an example 16-Bit architecture).
With your argumentation, the type which is used to define the bitfield
is ignored and the smallest value preserving integer is promoted to;
this would lead to e.g. signed, unsigned, signed for 15,16,17-width
bitfields, would it not?
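
The 15-versus-16 flip can be written down as a tiny model of 6.3.1.1p2, under the reading that the width is part of the field's type. This is only a sketch: the function name is invented, and int_bits is assumed to be the full width of int including the sign bit, with no padding bits.

```c
#include <assert.h>

/* Model of 6.3.1.1p2 for an unsigned bit-field of field_bits bits.
   int_bits is the width of int including its sign bit (assumption:
   no padding bits). Returns 1 for "promotes to int", 0 for
   "promotes to unsigned int". */
static int unsigned_field_promotes_to_int(unsigned int field_bits,
                                          unsigned int int_bits)
{
    /* int has int_bits - 1 value bits, so it can hold every value of
       the field only if the field is strictly narrower than int. */
    return field_bits < int_bits;
}

/* Self-check for the 16-bit example discussed above. */
static void check_model(void)
{
    assert(unsigned_field_promotes_to_int(15u, 16u) == 1);
    assert(unsigned_field_promotes_to_int(16u, 16u) == 0);
}
```

The same arithmetic already applies to unsigned short on implementations where short and int have the same width.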
... snip ...

The programmer should use unsigned bit fields when only positive
values need to be stored, and signed bit fields when both positive and
negative values are used.


We are not trying to constrain the programmer, we are trying to
interpret what s/he actually wrote.


Ah, you snipped your particular statement that my comment addressed,
so I am putting it back in:
It also means that wherever given the choice, the designer will
make a bit field unsigned because it means less processing and less
chance of overflows and consequent UB.


I misinterpreted your meaning, so my comment doesn't apply. I was
thrown off by what I think is some incompleteness in your wording. I
think what you meant to say by "make a bit field unsigned" would be
better conveyed by the words "make an unsigned bit field promote to
unsigned int".

But despite the omission from the standard, it seems silly to think
that the compiler designer is given a choice here. Since all other
promotions and conversions are rather scrupulously defined, I find it
hard to believe that the intent was to leave the results of using an
unsigned bit field in an expression implementation defined. In fact,
nothing is implementation-defined unless the standard specifically
states that it is implementation-defined.

In fact, given the lack of mention, using the value of a bit field in
an expression technically produces undefined behavior based on the
wording of the standard today.


[IMHO]
The committee avoided (consciously or not) explicitly saying "look, we
defined value preservation in most cases but with bitfields we must
revert to sign-preservation. Please don't feel fooled because in most
cases this is what you as programmers actually want". Value
preservation as you propose it burdens programmers with another
cascade of promotions with weird architecture-dependent implications.
OTOH it would be consistent with the rest of C: it does the non-obvious
;)
[/IMHO]

Mark

Nov 14 '05 #42

P: n/a
Alex Fraser wrote:
"CBFalconer" <cb********@yahoo.com> wrote in message

[big snip]

What are you trying to say?

That bit-fields declared with plain int should be unsigned?
That bit-fields should be "unsigned preserving"?
Both? Something else?

Promoting a signed bit-field to (signed) int requires extra effort
to copy or move the sign bit, but promoting an unsigned bit-field
to (signed) int does not.


True. A priori we don't know whether or not the sign bit of a bit
field is set or not. Therefore the initial expansion into a
register should prefer to be into an unsigned representation,
barring explicit information otherwise. That information can only
come from the designation of the bit field as being signed.

The point is "what type do we have after extracting the field into
a register". I claim that we have either a signed or unsigned
integer, depending solely on the type designation of the bit field,
and not on its size. Once that is settled we have clear rules for
any further promotions, operations, and potential overflows.

Nov 14 '05 #43

P: n/a
Kevin Bracey wrote:
CBFalconer <cb********@yahoo.com> wrote:
.... snip ...
In addition, a char is a defined type. A bitfield isn't.


I think 6.7.2.1p9 disagrees with you. The bitfield's type may not
have a name, but it is a type.


Are you thinking of this (from N869)?

[#8] A bit-field shall have a type that is a qualified or
unqualified version of _Bool, signed int, or unsigned int.
A bit-field is interpreted as a signed or unsigned integer
type consisting of the specified number of bits.95) If the
value 0 or 1 is stored into a nonzero-width bit-field of
type _Bool, the value of the bit-field shall compare equal
to the value stored.

If so, I think it bolsters my view. (_Bool being a 1 bit type and
new to C99).
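
The _Bool sentence in that paragraph can be tried out with a short sketch. It assumes a C99 compiler; the struct and helper names are invented for the example. The nonzero-becomes-1 behaviour comes from the general conversion to _Bool, not from the quoted paragraph itself.

```c
#include <assert.h>

/* Hypothetical struct: a _Bool bit-field next to a 1-bit unsigned one. */
struct flags {
    _Bool        flag : 1;
    unsigned int bit  : 1;
};

/* Store v in the _Bool field: conversion to _Bool maps any nonzero
   value to 1, so the stored 0 or 1 compares equal when read back. */
static int store_in_bool_field(int v)
{
    struct flags f = {0, 0};
    f.flag = v;
    return f.flag;
}

/* Store v in the 1-bit unsigned field: if the field's type is a
   1-bit unsigned type, the value is reduced modulo 2. */
static int store_in_unsigned_field(unsigned int v)
{
    struct flags f = {0, 0};
    f.bit = v;
    return f.bit;
}

/* Self-check of the difference between the two fields. */
static void check_flags(void)
{
    assert(store_in_bool_field(1) == 1);
    assert(store_in_bool_field(2) == 1);      /* nonzero becomes 1 */
    assert(store_in_unsigned_field(2u) == 0); /* 2 mod 2 */
}
```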

Nov 14 '05 #44

P: n/a
Mark Piffer wrote:
.... snip ...
[IMHO]
The committee avoided (consciously or not) explicitly saying "look,
we defined value preservation in most cases but with bitfields we
must revert to sign-preservation. Please don't feel fooled because
in most cases this is what you as programmers actually want". Value
preservation as you propose it burdens programmers with
another cascade of promotions with weird architecture-dependent
implications. OTOH it would be consistent with the rest of C: it
does the non-obvious ;)
[/IMHO]


I think that, together with my description of a sane implementor,
pretty well settles it. Now the thing is to get some verbiage into
the standard saying so.

It would suffice to say that any bitfield is treated as having the
specified type for any further operations.

Nov 14 '05 #45

P: n/a
CBFalconer wrote:

Kevin Bracey wrote:

... snip ...

All very well, but how is this any different to the case of loading
an unsigned char?


Because a char, of any flavor, occupies a complete addressable
unit, and can be loaded and stored without affecting anything else
(at least as far as C is concerned). That does not apply to
bitfields, which may spread over byte demarcations (but not over
int demarcations).


It's implementation-defined whether bitfields are spread over integer
"units" or not. The Borland 16-bit C compiler did indeed spread
bitfields across byte and word boundaries as long as they were fully
contained in 2 or less bytes, taking advantage of the 80x86 ability of
non-aligned memory access. A struct containing 7 9-bit bitfields was
allocated 8 8-bit bytes.
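
For anyone who wants to see what their own compiler does with that struct, a sketch follows. The struct and function names are made up; the only thing that can be relied on is the lower bound, since the packing itself is implementation-defined.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Seven 9-bit fields: 63 bits of payload, as in the Borland example. */
struct seven_nines {
    unsigned int f0 : 9;
    unsigned int f1 : 9;
    unsigned int f2 : 9;
    unsigned int f3 : 9;
    unsigned int f4 : 9;
    unsigned int f5 : 9;
    unsigned int f6 : 9;
};

/* Size in bytes chosen by this implementation (implementation-defined). */
static size_t seven_nines_size(void)
{
    return sizeof(struct seven_nines);
}

/* Smallest number of bytes that could hold the 63 payload bits. */
static size_t seven_nines_min_bytes(void)
{
    return (7u * 9u + CHAR_BIT - 1u) / CHAR_BIT;
}

/* Self-check: no layout can be smaller than the payload. */
static void check_layout(void)
{
    assert(seven_nines_size() >= seven_nines_min_bytes());
}
```

A compiler that lets fields straddle units can reach the 8-byte minimum when CHAR_BIT is 8; one that keeps each field inside a 32-bit unit fits three fields per unit and needs three units, i.e. 12 bytes.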

Thad
Nov 14 '05 #46

P: n/a
Thad Smith wrote:
CBFalconer wrote:
Kevin Bracey wrote:

... snip ...

All very well, but how is this any different to the case of
loading an unsigned char?


Because a char, of any flavor, occupies a complete addressable
unit, and can be loaded and stored without affecting anything else
(at least as far as C is concerned). That does not apply to
bitfields, which may spread over byte demarcations (but not over
int demarcations).


It's implementation-defined whether bitfields are spread over
integer "units" or not. The Borland 16-bit C compiler did indeed
spread bitfields across byte and word boundaries as long as they
were fully contained in 2 or less bytes, taking advantage of the
80x86 ability of non-aligned memory access. A struct containing
7 9-bit bitfields was allocated 8 8-bit bytes.


I believe there is something in the standard forbidding that.
First, bitfields larger than an int are not allowed, and they are
not allowed to cross int boundaries, IIRC.

Nov 14 '05 #47

P: n/a
"CBFalconer" <cb********@yahoo.com> wrote in message
news:41***************@yahoo.com...
Alex Fraser wrote:
"CBFalconer" <cb********@yahoo.com> wrote in message

[big snip]

What are you trying to say?

That bit-fields declared with plain int should be unsigned?
That bit-fields should be "unsigned preserving"?
Both? Something else?

Promoting a signed bit-field to (signed) int requires extra effort
to copy or move the sign bit, but promoting an unsigned bit-field
to (signed) int does not.

True. A priori we don't know whether or not the sign bit of a bit
field is set or not. Therefore the initial expansion into a
register should prefer to be into an unsigned representation,
barring explicit information otherwise. [...]


So bit-fields declared with plain int should be unsigned, because otherwise
all use will need the extra effort I mentioned (and you described before)?

The point is "what type do we have after extracting the field into
a register". I claim that we have either a signed or unsigned
integer, depending solely on the type designation of the bit field,
and not on its size.


Does that mean you think the standard says promotion of bit-fields is
"unsigned preserving"? That appears to be a minority view.

Alex
Nov 14 '05 #48

P: n/a
CBFalconer wrote:
....
The point is "what type do we have after extracting the field into
a register". I claim that we have either a signed or unsigned
integer, depending solely on the type designation of the bit field,
and not on its size. Once that is settled we have clear rules for
any further promotions, operations, and potential overflows.


The standard specifies that what you have is an int, not an unsigned
int, if all the values of the original type can be represented in an
int, regardless of whether the original type was signed or unsigned.
That's certainly true for unsigned int:8, and on many implementations
it's also true for unsigned int:30. On most implementations it applies
to unsigned short, and on almost every implementation it applies to
unsigned char.

Are you claiming that for "unsigned int i:8", the original type is
"unsigned int"? I'll agree that 6.3.1.1p2 is unclear about that issue.
However, do you really want "unsigned int i:8" to be handled by
different rules than "unsigned char c", even on a machine where
CHAR_BIT==8?
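
One way to ask a compiler which rule it applies is to look at the type after promotion. The sketch below assumes a C11 compiler for _Generic and an int wider than unsigned char; the macro, struct and function names are invented for the example.

```c
#include <assert.h>

/* Hypothetical struct: an unsigned char next to an 8-bit unsigned field. */
struct pair {
    unsigned char c;
    unsigned int  i : 8;
};

/* Unary + applies the integer promotions; _Generic (C11) then reports
   the promoted type: 1 for int, 0 for unsigned int, -1 for anything else. */
#define PROMOTES_TO_INT(x) \
    _Generic(+(x), int: 1, unsigned int: 0, default: -1)

static int char_promotes_to_int(void)
{
    struct pair p = {0, 0};
    return PROMOTES_TO_INT(p.c);
}

static int field_promotes_to_int(void)
{
    struct pair p = {0, 0};
    return PROMOTES_TO_INT(p.i);
}

/* Self-check: under value-preserving promotion both members behave alike. */
static void check_pair(void)
{
    assert(char_promotes_to_int() == 1);
    assert(field_promotes_to_int() == 1);
}
```

A compiler following the "unsigned preserving" reading for bit-fields would report 0 for the second function while still reporting 1 for the first.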

Counter arguments:
6.7.2.1p3 refers to "... the type that is specified if the colon and
expression are omitted." This implies that it would be a different type
than is specified when the colon and expression are present. If that
weren't the case, it could have simply said "... the specified type".

6.7.2.1p8 says "A bit-field is interpreted as a signed or unsigned
integer type consisting of the specified number of bits." It might not
be a named type, and therefore can't be the subject of an explicit cast,
but it is its own type, and that type has a different number of bits
than it would have if the colon and the expression were absent.
Nov 14 '05 #49

P: n/a
CBFalconer <cb********@yahoo.com> writes:
[...]
I believe there is something in the standard forbidding that.
First, bitfields larger than an int are not allowed, and they are
not allowed to cross int boundaries, IIRC.


I think you're mistaken.

C99 6.7.2.1p10 says:

An implementation may allocate any addressable storage unit
large enough to hold a bit-field. If enough space remains,
a bit-field that immediately follows another bit-field in
a structure shall be packed into adjacent bits of the same
unit. If insufficient space remains, whether a bit-field that
does not fit is put into the next unit or overlaps adjacent
units is implementation-defined. The order of allocation of
bit-fields within a unit (high-order to low-order or low-order
to high-order) is implementation-defined. The alignment of the
addressable storage unit is unspecified.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #50
