Implementation Independent Way to Move LSByte of Char to MSB of Int, etc

no spam

What is the implementation independent way of moving the least significant
byte of unsigned char to the most significant byte of unsigned int?

And the least significant word (if not a word, have the preprocessor force
an error) of unsigned int to the most significant word of unsigned long?

Nov 14 '05 #1


Mike Wahler

"no spam" <no@spam.com> wrote in message
news:Aa*************@news.ntplx.net...

What is the implementation independent way of moving the least significant
byte
of unsigned char
This is the easy part. All the character types have a size of
one byte by definition. So the least significant byte and
most significant byte are the same.

to the most significant byte of unsigned int?
unsigned char c = 42;
unsigned int i = 0;
*(unsigned char *)&i = c;

Note that this is only well defined for unsigned integer types;
signed integer types can have representations where putting
arbitrary bit patterns into them could produce, e.g., a
'trap' representation.

And the least significant word (if not a word, have the preprocessor force
an error) of unsigned int to the most significant word of unsigned long?

C does not define the concept of 'word'.

-Mike

Nov 14 '05 #2

Mike Wahler

"Mike Wahler" <mk******@mkwahler.net> wrote in message
news:qB*****************@newsread3.news.pas.earthl ink.net...

"no spam" <no@spam.com> wrote in message
news:Aa*************@news.ntplx.net...
What is the implementation independent way of moving the least significant byte
of unsigned char

This is the easy part. All the character types have a size of
one byte by definition. So the least significant bye and
most significant byte are the same.

to the most significant byte of unsigned int?

unsigned char c = 42;
unsigned int i = 0;
*(unsigned char *)&i = c;

This won't necessarily assign to the most significant byte
of the unsigned integer; it depends upon the underlying platform's
representation of the integer. The above stores the char value
in the first byte (in memory) of its representation.

This might or might not be what you want.
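
As a rough illustration of the point above (a sketch only; the helper name `store_first_byte` is invented here and does not come from the posts), the same store can be wrapped in a function whose return value shows which end of the integer received the byte:

```c
#include <limits.h>

/* Hypothetical helper: store c into the first byte (lowest address)
 * of an unsigned int that starts out as zero, then return its value. */
unsigned int store_first_byte(unsigned char c)
{
    unsigned int i = 0;

    *(unsigned char *)&i = c;   /* writes the byte at the lowest address */
    return i;
}
```

On a typical little-endian machine `store_first_byte(42)` returns 42, so the byte landed in the least significant position. On a big-endian machine without padding bits it returns 42 shifted into the top byte instead.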

-Mike

Nov 14 '05 #3

no spam

"Mike Wahler" <mk******@mkwahler.net> wrote in message
news:WH*****************@newsread3.news.pas.earthl ink.net...

"Mike Wahler" <mk******@mkwahler.net> wrote in message
news:qB*****************@newsread3.news.pas.earthl ink.net...

"no spam" <no@spam.com> wrote in message
news:Aa*************@news.ntplx.net...
What is the implementation independent way of moving the least significant byte
of unsigned char

This is the easy part. All the character types have a size of
one byte by definition. So the least significant bye and
most significant byte are the same.

to the most significant byte of unsigned int?

unsigned char c = 42;
unsigned int i = 0;
*(unsigned char *)&i = c;

This won't necessarily assign to the most significant byte
of the unsigned integer, it depends upon the underlying platform's
representation of the integer. The above stores the char value
in the first byte (in memory) of its representation.

This might or might not be what you want.

-Mike

Does C define the concept of "most significant" and "least significant"?

Nov 14 '05 #4

xarax

"no spam" <no@spam.com> wrote in message news:j6*************@news.ntplx.net...

"Mike Wahler" <mk******@mkwahler.net> wrote in message
news:WH*****************@newsread3.news.pas.earthl ink.net...
"Mike Wahler" <mk******@mkwahler.net> wrote in message
news:qB*****************@newsread3.news.pas.earthl ink.net...

"no spam" <no@spam.com> wrote in message
news:Aa*************@news.ntplx.net...
> What is the implementation independent way of moving the least significant
> byte
> of unsigned char

This is the easy part. All the character types have a size of
one byte by definition. So the least significant bye and
most significant byte are the same.
> to the most significant byte of unsigned int?

unsigned char c = 42;
unsigned int i = 0;
*(unsigned char *)&i = c;

This won't necessarily assign to the most significant byte
of the unsigned integer, it depends upon the underlying platform's
representation of the integer. The above stores the char value
in the first byte (in memory) of its representation.

This might or might not be what you want.

-Mike

Does C define the concept of "most significant" and "least significant"?

Maybe something like this:

#include <limits.h>

void lsb_to_msb(unsigned int *out, unsigned char in)
{
    /* Assumes unsigned int has no padding bits. */
    const unsigned int shift = CHAR_BIT * (sizeof *out - 1);
    const unsigned int mask = ~((unsigned int)UCHAR_MAX << shift);

    *out &= mask;                      /* clear msb */
    *out |= (unsigned int)in << shift; /* set msb; the cast avoids shifting a promoted int */
}

Nov 14 '05 #5

Eric Sosman

no spam wrote:

What is the implementation independent way of moving the least significant
byte of unsigned char to the most significant byte of unsigned int?
#include <limits.h>
unsigned char uc = ...;
unsigned int ui;

/* Almost right: works on every machine I've ever
* run across, but is not actually guaranteed by
* the Standard
*/
ui = (unsigned int)uc << (CHAR_BIT * (sizeof ui - 1));

/* Best "completely portable" solution I've thought of;
* allows for padding bits in unsigned int
*/
ui = uc;
for (unsigned int n = UINT_MAX >> CHAR_BIT; n > 0; n >>= 1)
ui <<= 1;

Maybe there's a way to calculate (UINT_MAX + 1)/(UCHAR_MAX + 1)
without risking zero in the numerator and/or denominator, but I
haven't figured one out. (Note that UCHAR_MAX == ULLONG_MAX is
permitted by the Standard.)
And the least significant word (if not a word, have the preprocessor force
an error) of unsigned int to the most significant word of unsigned long?

What is a "word?" The C Standard uses the term mostly to
refer to its own content, twice to refer to "words in a line
of text," and once in connection with floating-point numbers;
it is never used in connection with an unsigned int.

--
Er*********@sun.com

Nov 14 '05 #6

S.Tobias

no spam <no@spam.com> wrote:

Does C define the concept of "most significant" and "least significant"?

No, or at least none that I know of. "Most/least significant byte"
concepts are taken from machine representation of multi-byte integers,
and integer operations in ISO-C are defined in terms of values
and mathematical operations, and they don't depend on any representation
(however we know that integers consist of bits that have a few specific
properties).
In an integer value bits might be scattered randomly throughout
the whole object mixed together with some padding bits; which byte
should be called MSB/LSB?

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`

Nov 14 '05 #7

Lawrence Kirby

On Wed, 12 Jan 2005 16:36:33 -0500, Eric Sosman wrote:

no spam wrote:
What is the implementation independent way of moving the least significant
byte of unsigned char to the most significant byte of unsigned int?
#include <limits.h>
unsigned char uc = ...;
unsigned int ui;

/* Almost right: works on every machine I've ever
* run across, but is not actually guaranteed by
* the Standard
*/
ui = (unsigned int)uc << (CHAR_BIT * (sizeof ui - 1));

/* Best "completely portable" solution I've thought of;
* allows for padding bits in unsigned int
*/
ui = uc;
for (unsigned int n = UINT_MAX >> CHAR_BIT; n > 0; n >>= 1)
ui <<= 1;

If sizeof(unsigned int) is 1 the right shift results in undefined
behaviour. You could do something like

static int shift_width = -1;
unsigned char uc;
unsigned ui;

if (shift_width < 0) {
unsigned testbit = UCHAR_MAX + 1U;

for (shift_width = 0; testbit != 0; shift_width++, testbit <<= 1)
;
}

ui = (unsigned)uc << shift_width;

If you need to set the top byte in an existing unsigned int value

value = (value & ~((unsigned)UCHAR_MAX << shift_width)) | ui;

The mask value is also a constant that can be set up once.
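
Pulled together, the fragments above might look like the following sketch. The function names `top_byte_shift` and `set_top_byte` are made up for illustration, and the cached static is not thread safe; treat it as one possible arrangement rather than a definitive one.

```c
#include <limits.h>

/* Number of bits to shift an unsigned char left so that it occupies the
 * top value bits of an unsigned int. Found by counting how many times
 * UCHAR_MAX + 1 can be doubled before it wraps to zero, so padding bits
 * and UCHAR_MAX == UINT_MAX are both handled. */
static int top_byte_shift(void)
{
    static int shift_width = -1;   /* computed once, then cached */

    if (shift_width < 0) {
        unsigned testbit = UCHAR_MAX + 1U;   /* 0 if UCHAR_MAX == UINT_MAX */

        for (shift_width = 0; testbit != 0; shift_width++, testbit <<= 1)
            ;
    }
    return shift_width;
}

/* Replace the top byte of value with uc, leaving the lower bits alone. */
unsigned set_top_byte(unsigned value, unsigned char uc)
{
    int shift = top_byte_shift();
    unsigned mask = (unsigned)UCHAR_MAX << shift;

    return (value & ~mask) | ((unsigned)uc << shift);
}
```

The shift count is always smaller than the width of unsigned int, because the loop stops as soon as one more doubling would wrap to zero.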

Lawrence


Nov 14 '05 #8

Eric Sosman

Lawrence Kirby wrote:

On Wed, 12 Jan 2005 16:36:33 -0500, Eric Sosman wrote:
no spam wrote:
What is the implementation independent way of moving the least significant
byte of unsigned char to the most significant byte of unsigned int?

#include <limits.h>
unsigned char uc = ...;
unsigned int ui;

/* Almost right: works on every machine I've ever
* run across, but is not actually guaranteed by
* the Standard
*/
ui = (unsigned int)uc << (CHAR_BIT * (sizeof ui - 1));

/* Best "completely portable" solution I've thought of;
* allows for padding bits in unsigned int
*/
ui = uc;
for (unsigned int n = UINT_MAX >> CHAR_BIT; n > 0; n >>= 1)
ui <<= 1;

If sizeof(unsigned int) is 1 the right shift results in undefined
behaviour.

Oh, drat. You're right: my attempt to be "completely
portable" merely traded one error for another.

Personally, I prefer the first of the two erroneous
forms as "less likely to get caught" ...

--
Eric Sosman
es*****@acm-dot-org.invalid

Nov 14 '05 #9

CBFalconer

Eric Sosman wrote:

Lawrence Kirby wrote:
On Wed, 12 Jan 2005 16:36:33 -0500, Eric Sosman wrote:
no spam wrote:

What is the implementation independent way of moving the least
significant byte of unsigned char to the most significant byte
of unsigned int?
.... snip ...
If sizeof(unsigned int) is 1 the right shift results in undefined
behaviour.

Oh, drat. You're right: my attempt to be "completely
portable" merely traded one error for another.

Personally, I prefer the first of the two erroneous
forms as "less likely to get caught" ...

OTOH if we revise the specification to specify "8 bits" in place of
"byte" we can handle it in a portable manner:

#define UINT_BIT (CHAR_BIT * sizeof(unsigned int))

unsigned int ui;
unsigned char uc;

ui = (unsigned int)(uc & 255) << (UINT_BIT - 8);

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Nov 14 '05 #10

xarax

"CBFalconer" <cb********@yahoo.com> wrote in message
news:41***************@yahoo.com...

Eric Sosman wrote:
Lawrence Kirby wrote:
On Wed, 12 Jan 2005 16:36:33 -0500, Eric Sosman wrote:
no spam wrote:

> What is the implementation independent way of moving the least
> significant byte of unsigned char to the most significant byte
> of unsigned int?
... snip ...
If sizeof(unsigned int) is 1 the right shift results in undefined
behaviour.

Oh, drat. You're right: my attempt to be "completely
portable" merely traded one error for another.

Personally, I prefer the first of the two erroneous
forms as "less likely to get caught" ...

OTOH if we revise the specification to specify "8 bits" in place of
"byte" we can handle it in a portable manner:

#define UINT_BIT (CHAR_BIT * sizeof(unsigned int))

unsigned int ui;
unsigned char uc;

ui = (unsigned int)(uc & 255) << (UINT_BIT - 8);

You are presuming that CHAR_BIT == 8.

Nov 14 '05 #11

aegis

Lawrence Kirby wrote:

On Wed, 12 Jan 2005 16:36:33 -0500, Eric Sosman wrote:
no spam wrote:
What is the implementation independent way of moving the least significant byte of unsigned char to the most significant byte of unsigned
int?
#include <limits.h>
unsigned char uc = ...;
unsigned int ui;

/* Almost right: works on every machine I've ever
* run across, but is not actually guaranteed by
* the Standard
*/
ui = (unsigned int)uc << (CHAR_BIT * (sizeof ui - 1));

/* Best "completely portable" solution I've thought of;
* allows for padding bits in unsigned int
*/
ui = uc;
for (unsigned int n = UINT_MAX >> CHAR_BIT; n > 0; n >>= 1)
ui <<= 1;
If sizeof(unsigned int) is 1 the right shift results in undefined
behaviour. You could something like

Why would N >> X cause undefined behavior?
where N is some object and X is a value equal in width
to that object.

static int shift_width = -1;
unsigned char uc;
unsigned ui;

if (shift_width < 0) {
unsigned testbit = UCHAR_MAX + 1U;
What if UCHAR_MAX == UINT_MAX? then you would get zero

for (shift_width = 0; testbit != 0; shift_width++, testbit <<= 1) ;
and then this condition would fail
}

ui = (unsigned)uc << shift_width;
and you would shift left by a negative one?
could you clarify this? shifting left by negative one
does not make sense to me.

If you need to set the top byte in an existing unsigned int value

if UINT_MAX == UCHAR_MAX then there is no top byte, right?
They would both be a single byte.

--
aegis

Nov 14 '05 #12

Eric Sosman

aegis wrote:

Why would N >> X cause undefined behavior?
where N is some object and X is a value equal in width
to that object.

Answer #1: Because the Standard says so, in section
6.5.7 paragraph 3.

Answer #2: Some machines' instruction sets are unable
to express shift amounts equal to or greater than the operand width.
For example, an instruction to left-shift a 32-bit value
by X bits might encode X in a five-bit field, making it
impossible to perform a 32-bit shift in one instruction.

Observation: Answer #2 is probably the motivation
behind Answer #1 ...
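
For code that cannot rule out a full-width shift, one option is to test the count first. The following is only a sketch under that assumption; the helper name `shr_checked` is hypothetical and the choice of 0 as the result for oversized counts is a design decision, not something the Standard prescribes.

```c
#include <limits.h>

/* Hypothetical helper: right shift that yields 0 instead of invoking
 * undefined behaviour when the count reaches the operand's width.
 * The width is derived from UINT_MAX, so padding bits do not matter. */
unsigned int shr_checked(unsigned int x, unsigned int count)
{
    unsigned int width = 0;
    unsigned int probe;

    for (probe = UINT_MAX; probe != 0; probe >>= 1)
        width++;                 /* one iteration per value bit */

    return count < width ? x >> count : 0u;
}
```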

--
Er*********@sun.com

Nov 14 '05 #13

CBFalconer

xarax wrote:

"CBFalconer" <cb********@yahoo.com> wrote in message
Eric Sosman wrote:
Lawrence Kirby wrote:
On Wed, 12 Jan 2005 16:36:33 -0500, Eric Sosman wrote:
> no spam wrote:
>
>> What is the implementation independent way of moving the least
>> significant byte of unsigned char to the most significant byte
>> of unsigned int?
>

... snip ...

If sizeof(unsigned int) is 1 the right shift results in undefined
behaviour.

Oh, drat. You're right: my attempt to be "completely
portable" merely traded one error for another.

Personally, I prefer the first of the two erroneous
forms as "less likely to get caught" ...

OTOH if we revise the specification to specify "8 bits" in place of
"byte" we can handle it in a portable manner:

#define UINT_BIT (CHAR_BIT * sizeof(unsigned int))

unsigned int ui;
unsigned char uc;

ui = (unsigned int)(uc & 255) << (UINT_BIT - 8);

You are presuming that CHAR_BIT == 8.

No I am not. Read the revised specification, and then the code.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Nov 14 '05 #14

Peter Nilsson

no spam wrote:

What is the implementation independent way of moving the least
significant byte of unsigned char to the most significant byte
of unsigned int?
There isn't one. The following will place an unsigned char in the
highest bits of an unsigned int...

#include <limits.h>

#define move_uc_to_umsb(u,uc) \
((((unsigned )(u )) & (-1u >> (CHAR_BIT - 1) >> 1) ) \
|(((unsigned char)(uc)) * ((-1u >> (CHAR_BIT - 1) >> 1) + 1)))
And the least significant word (if not a word, have the preprocessor
force an error) of unsigned int to the most significant word of
unsigned long?

Define what you mean by 'word'.

Whilst the above probably does what you want, it sounds like you're
trying to do something inherently implementation specific.

--
Peter

Nov 14 '05 #15

dandelion

"Eric Sosman" <er*********@sun.com> wrote in message
news:cs**********@news1brm.Central.Sun.COM...

aegis wrote:

Why would N >> X cause undefined behavior?
where N is some object and X is a value equal in width
to that object.

Answer #1: Because the Standard says so, in section
6.5.7 paragraph 3.

Answer #2: Some machines' instruction sets are unable
to express shift amounts greater than the operand width.
For example, an instruction to left-shift a 32-bit value
by X bits might encode X in a five-bit field, making it
impossible to perform a 32-bit shift in one instruction.

Observation: Answer #2 is probably the motivation
behind Answer #1 ...

That, and the fact that such a shift (in either direction) inevitably
results in 0, I presume. The operation seems pointless, which is
(I suspect) the reason many HW vendors don't support it.

I would not call it an "observation" though.

Nov 14 '05 #16

Lawrence Kirby

On Thu, 13 Jan 2005 11:56:12 -0800, aegis wrote:

Lawrence Kirby wrote:
On Wed, 12 Jan 2005 16:36:33 -0500, Eric Sosman wrote:
....
If sizeof(unsigned int) is 1 the right shift results in undefined
behaviour. You could something like

Why would N >> X cause undefined behavior?
where N is some object and X is a value equal in width
to that object.

As others have noted this is because the standard says so.

static int shift_width = -1;
unsigned char uc;
unsigned ui;

if (shift_width < 0) {
unsigned testbit = UCHAR_MAX + 1U;

What if UCHAR_MAX == UINT_MAX? then you would get zero

for (shift_width = 0; testbit != 0; shift_width++, testbit <<= 1)
;

and then this condition would fail

Which results in a final value of 0 for shift_width, which is
appropriate. The first expression in a for () loop is always executed
once, when the loop is entered.

ui = (unsigned)uc << shift_width;

and you would shift left by a negative one? could you clarify this?
shifting left by negative one does not make sense to me.

Shifting by a negative value is an error. However that doesn't happen.

If you need to set the top byte in an existing unsigned int value

if UINT_MAX == UCHAR_MAX then there is no top byte, right? They would
both be a single byte.

Correct which means that the byte data is already in the correct place so
no shifting is required.

Lawrence

Nov 14 '05 #17

Allan Bruce

"no spam" <no@spam.com> wrote in message
news:Aa*************@news.ntplx.net...

What is the implementation independent way of moving the least significant
byte of unsigned char to the most significant byte of unsigned int?

And the least significant word (if not a word, have the preprocessor force
an error) of unsigned int to the most significant word of unsigned long?

If you know how your platform stores the data or can determine it
programmatically then you can convert from Big Endian to Little Endian (or
vice versa) with the following function:

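void ReverseBytes(void *xbBytes, int xiNumBytes)
{
    int loop;
    char *lT1 = (char *)xbBytes;   /* walks up from the first byte */
    char *lT2 = (char *)xbBytes;   /* walks down from the last byte */
    char lT3;
    lT2 += xiNumBytes-1;

    for (loop=0; loop<xiNumBytes/2; loop++)
    {
        /* swap the two bytes, then move both pointers inward */
        lT3 = *lT2;
        *lT2 = *lT1;
        *lT1 = lT3;
        lT1++;
        lT2--;
    }
}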
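
Determining the byte order programmatically could be done along these lines. This is a sketch with an invented helper name, `is_little_endian`, and it assumes sizeof(unsigned int) > 1 and a plain big- or little-endian layout with no padding bits, none of which the Standard guarantees.

```c
/* Hypothetical helper: returns 1 if the byte at the lowest address of an
 * unsigned int holds its least significant bits (little endian), else 0.
 * Assumes sizeof(unsigned int) > 1 and no mixed-endian layout. */
int is_little_endian(void)
{
    unsigned int probe = 1u;

    return *(unsigned char *)&probe == 1;
}
```

A caller would then invoke ReverseBytes only when the result differs from the byte order the data is supposed to have.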

Allan

Nov 14 '05 #18

pete

dandelion wrote:

"Eric Sosman" <er*********@sun.com> wrote in message
news:cs**********@news1brm.Central.Sun.COM...
aegis wrote:

Why would N >> X cause undefined behavior?
where N is some object and X is a value equal in width
to that object.

Answer #1: Because the Standard says so, in section
6.5.7 paragraph 3.

Answer #2: Some machines' instruction sets are unable
to express shift amounts greater than the operand width.
For example, an instruction to left-shift a 32-bit value
by X bits might encode X in a five-bit field, making it
impossible to perform a 32-bit shift in one instruction.

Observation: Answer #2 is probably the motivation
behind Answer #1 ...

That and the fact that such a shift (in either direction) inevitably
results in 0, i presume. The operation seems pointless. which is
(i suspect) the motivation behind not supporting it for many HW
vendors.

I would not call it an "observation" though.

One common form of undefined behavior
for the above mentioned 5 bit field,
is that for (u >> x), you wind up shifting by (x % 31) bits.

--
pete

Nov 14 '05 #19

Eric Sosman

dandelion wrote:

"Eric Sosman" <er*********@sun.com> wrote in message
news:cs**********@news1brm.Central.Sun.COM...
aegis wrote:
Why would N >> X cause undefined behavior?
where N is some object and X is a value equal in width
to that object.
Answer #1: Because the Standard says so, in section
6.5.7 paragraph 3.

Answer #2: Some machines' instruction sets are unable
to express shift amounts greater than the operand width.
For example, an instruction to left-shift a 32-bit value
by X bits might encode X in a five-bit field, making it
impossible to perform a 32-bit shift in one instruction.

Observation: Answer #2 is probably the motivation
behind Answer #1 ...

That and the fact that such a shift (in either direction) inevitably
results in 0, i presume. The operation seems pointless. which is
(i suspect) the motivation behind not supporting it for many HW
vendors.

Shifting by zero bits is even more pointless, but that
doesn't seem to have prompted instruction-set designers to
omit the operation (unless they also omit all multi-bit
shifts; I've used machines whose only shift instructions
were single-bit shifts).

Bit positions in CPU instructions are usually a scarce
resource, because the machine can usually be made faster if
its instructions require less memory (it takes fewer cycles
to fetch and decode an instruction that occupies one word
than an instruction requiring three). Given the scarcity of
instruction bits, a designer faced with encoding a shift
distance that "should almost always" lie between 1..31 will
be unlikely to allocate a six-bit field; the "spare" bit can
probably be put to more effective use. And that, I think, is
the motivation for hardware ceilings on shift counts.

(Since the zero-bit shift also seems useless, I imagine a
designer might decide to use the opcode that resembles "shift
by zero" to denote some entirely different operation. I don't
know whether any have done so, but I imagine it might complicate
the instruction decode process and require a bunch of extra
silicon -- it's probably easier to allow the pointless zero-bit
shift than to detect it and recycle its code space for other
purposes.)
I would not call it an "observation" though.

All right, how about "conjecture?" Or would you prefer
"damfoolishness?"

--
Eric Sosman
es*****@acm-dot-org.invalid

Nov 14 '05 #20

Eric Sosman

pete wrote:

dandelion wrote:
"Eric Sosman" <er*********@sun.com> wrote in message
news:cs**********@news1brm.Central.Sun.COM...
aegis wrote:

Why would N >> X cause undefined behavior?
where N is some object and X is a value equal in width
to that object.

Answer #1: Because the Standard says so, in section
6.5.7 paragraph 3.

Answer #2: Some machines' instruction sets are unable
to express shift amounts greater than the operand width.
For example, an instruction to left-shift a 32-bit value
by X bits might encode X in a five-bit field, making it
impossible to perform a 32-bit shift in one instruction.

Observation: Answer #2 is probably the motivation
behind Answer #1 ...

That and the fact that such a shift (in either direction) inevitably
results in 0, i presume. The operation seems pointless. which is
(i suspect) the motivation behind not supporting it for many HW
vendors.

I would not call it an "observation" though.

One common form of undefined behavior
for the above mentioned 5 bit field,
is that for (u >> x), you wind up shifting by (x % 31) bits.

Haven't seen that one. I think you mean either
`x & 31' or `x % 32'.

--
Eric Sosman
es*****@acm-dot-org.invalid

Nov 14 '05 #21

pete

Eric Sosman wrote:

pete wrote:

One common form of undefined behavior
for the above mentioned 5 bit field,
is that for (u >> x), you wind up shifting by (x % 31) bits.

Haven't seen that one. I think you mean either
`x & 31' or `x % 32'.

That's what I think too.

--
pete

Nov 14 '05 #22

dandelion

<snip>

Shifting by zero bits is even more pointless, but that
doesn't seem to have prompted instruction-set designers to
omit the operation (unless they also omit all multi-bit
shifts; I've used machines whose only shift instructions
were single-bit shifts).
True. However, I have two slight remarks:

* Shift by zero does not exceed the architecture's bounds.
* Checking for shift by zero requires an explicit check, which requires
extra silicon, which would imply some sort of fault condition being set,
which requires even more silicon...

Limiting the maximum shift only requires truncating an operand, which
requires *less* silicon than passing the full monty.

<snip>
And that, I think, is the motivation for hardware ceilings on shift counts.

Agreed.

<snip>

Ok. I wrote the above without reading the rest of your post, but we seem to
be of one opinion on this.

I would not call it an "observation" though.

All right, how about "conjecture?" Or would you prefer
"damfoolishness?"

Well, calling it "damnfoolishness" would be inconsistent on my part, since
I fully agreed with the post in the first place. "Conjecture" does fit the
bill, but personally I would go for "hunch", "guess" or something like that.

Nov 14 '05 #23

Lawrence Kirby

On Fri, 14 Jan 2005 09:10:30 -0500, Eric Sosman wrote:

dandelion wrote:

....

That and the fact that such a shift (in either direction) inevitably
results in 0, i presume. The operation seems pointless. which is
(i suspect) the motivation behind not supporting it for many HW
vendors.

Shifting by zero bits is even more pointless,

It is for an instruction with a constant shift count, not for one with a
variable count. The count may legitimately be calculated as zero at
runtime. Similarly the situation with large shifts resulting in zero makes
some sense for a variable shift count but is less commonly needed than a
zero shift.

Lawrence

Nov 14 '05 #24

bd

Eric Sosman wrote:

[...]

Maybe there's a way to calculate (UINT_MAX + 1)/(UCHAR_MAX + 1)
without risking zero in the numerator and/or denominator, but I
haven't figured one out. (Note that UCHAR_MAX == ULLONG_MAX is
permitted by the Standard.)

Since UCHAR_MAX <= UINT_MAX <= ULLONG_MAX...

#if UCHAR_MAX == ULLONG_MAX
# define UINT_UCHAR_RATIO 1
#else
# define UINT_UCHAR_RATIO \
(((unsigned long long)UINT_MAX + 1) / \
((unsigned long long)UCHAR_MAX + 1))
#endif


Nov 14 '05 #25

Eric Sosman

bd wrote:

Eric Sosman wrote:
[...]
Maybe there's a way to calculate (UINT_MAX + 1)/(UCHAR_MAX + 1)
without risking zero in the numerator and/or denominator, but I
haven't figured one out. (Note that UCHAR_MAX == ULLONG_MAX is
permitted by the Standard.)

Since UCHAR_MAX <= UINT_MAX <= ULLONG_MAX...

#if UCHAR_MAX == ULLONG_MAX
# define UINT_UCHAR_RATIO 1
#else
# define UINT_UCHAR_RATIO \
(((unsigned long long)UINT_MAX + 1) / \
((unsigned long long)UCHAR_MAX + 1))
#endif

If UINT_MAX == ULLONG_MAX, this yields a ratio
of zero.

--
Eric Sosman
es*****@acm-dot-org.invalid

Nov 14 '05 #26

Peter Nilsson

Eric Sosman wrote:

bd wrote:
Eric Sosman wrote:
[...]
Maybe there's a way to calculate (UINT_MAX + 1)/(UCHAR_MAX + 1)
without risking zero in the numerator and/or denominator, but I
haven't figured one out. (Note that UCHAR_MAX == ULLONG_MAX is
permitted by the Standard.)

Since UCHAR_MAX <= UINT_MAX <= ULLONG_MAX...

#if UCHAR_MAX == ULLONG_MAX
# define UINT_UCHAR_RATIO 1
#else
# define UINT_UCHAR_RATIO \
(((unsigned long long)UINT_MAX + 1) / \
((unsigned long long)UCHAR_MAX + 1))
#endif

If UINT_MAX == ULLONG_MAX, this yields a ratio
of zero.

I'm not sure why you need this ratio, but it's trivial to calculate...
(UINT_MAX/2+1) / (UCHAR_MAX/2+1)

--
Peter

Nov 14 '05 #27

Eric Sosman

Peter Nilsson wrote:

Eric Sosman wrote:
bd wrote:
Eric Sosman wrote:
[...]

Maybe there's a way to calculate (UINT_MAX + 1)/(UCHAR_MAX + 1)
without risking zero in the numerator and/or denominator, but I
haven't figured one out. [...]

I'm not sure why you need this ratio, but it's trivial to calculate...
(UINT_MAX/2+1) / (UCHAR_MAX/2+1)

<Self-administers a dope slap> Duhh ...

The ratio was to be used to solve the problem posed
(oh, so long ago) at the beginning of this thread: Write
fully portable code to left-adjust an `unsigned char' in
an `unsigned int'. Everybody (self included) got caught
up in trying to come up with a recipe using shifts, and
ran afoul of one or more of padding bits, shift counts too
wide for shifted operand, or uncommon relationships among
the various Uxxx_MAX values. The correct solutions offered
thus far involve ugly loops; "ugly" because they perform
a calculation at run time that any particular compiler
could have done during compilation.

Having the ratio available makes the problem easy:

unsigned char uc = ...;
unsigned int ui = uc
* ((UINT_MAX/2+1)/(UCHAR_MAX/2+1));

"Mainstream" implementations will calculate the ratio
as 2**N (N >= 1) and probably replace the multiplication
with a shift, while "exotic" implementations will get a
ratio of unity and probably eliminate the multiplication
altogether.
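
Written out as a complete unit, the ratio approach might look like the sketch below. The macro name `UINT_UCHAR_RATIO` is borrowed from earlier in the thread and the function name `left_adjust` is made up for illustration.

```c
#include <limits.h>

/* (UINT_MAX + 1) / (UCHAR_MAX + 1), computed without overflow by halving
 * both terms first. Both terms are powers of two, so the division is exact. */
#define UINT_UCHAR_RATIO ((UINT_MAX / 2 + 1) / (UCHAR_MAX / 2 + 1))

/* Left-adjust uc in an unsigned int: uc ends up in the top value bits. */
unsigned int left_adjust(unsigned char uc)
{
    return uc * UINT_UCHAR_RATIO;
}
```

Because the ratio is an unsigned constant expression, the multiplication is carried out in unsigned arithmetic and the largest product, UCHAR_MAX times the ratio, still fits in an unsigned int.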

--
Eric Sosman
es*****@acm-dot-org.invalid

Nov 14 '05 #28

Peter Nilsson

Eric Sosman wrote:

Peter Nilsson wrote:
...
I'm not sure why you need this ratio, but it's...
(UINT_MAX/2+1) / (UCHAR_MAX/2+1)
The ratio was to be used to solve the problem posed
(oh, so long ago) at the beginning of this thread: Write
fully portable code to left-adjust an `unsigned char' in
an `unsigned int'. Everybody (self included) got caught
up in trying to come up with a recipe using shifts, and
ran afoul of one or more of padding bits, shift counts too
wide for shifted operand, or uncommon relationships among
the various Uxxx_MAX values. The correct solutions offered
thus far involve ugly loops; "ugly" because they perform
a calculation at run time that any particular compiler
could have done during compilation.

Perhaps your newsserver didn't receive the early offering...

#include <limits.h>

#define move_uc_to_umsb(u,uc) \
((((unsigned )(u )) & (-1u >> (CHAR_BIT - 1) >> 1) ) \
|(((unsigned char)(uc)) * ((-1u >> (CHAR_BIT - 1) >> 1) + 1)))
Having the ratio available makes the problem easy:

unsigned char uc = ...;
unsigned int ui = uc
* ((UINT_MAX/2+1)/(UCHAR_MAX/2+1));

"Mainstream" implementations will calculate the ratio
as 2**N (N >= 1) and probably replace the multiplication
with a shift, while "exotic" implementations will get a
ratio of unity and probably eliminate the multiplication
altogether.

I don't think many people know about the x >> (N - 1) >> 1 option
of avoiding the undefined behaviour when N might be the width of x.
[Assuming that x '>>' N == 0 is the preferred option in such cases.]
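
A minimal sketch of that two-step shift in isolation (the function name `low_bits_mask` is hypothetical):

```c
#include <limits.h>

/* UINT_MAX >> CHAR_BIT, written as two shifts so that the result is 0
 * (rather than undefined) on an implementation where unsigned int is
 * exactly CHAR_BIT bits wide. Each individual shift count is smaller
 * than the width of unsigned int. */
unsigned int low_bits_mask(void)
{
    return UINT_MAX >> (CHAR_BIT - 1) >> 1;
}
```

On an implementation where unsigned int is wider than a byte this gives the same value as UINT_MAX >> CHAR_BIT, which is the mask for everything below the top byte.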
--
Peter

Nov 14 '05 #29

Dave Thompson

On Fri, 14 Jan 2005 09:10:30 -0500, Eric Sosman
<es*****@acm-dot-org.invalid> wrote:
<snip: shift by >= width is UB>

Shifting by zero bits is even more pointless, but that
doesn't seem to have prompted instruction-set designers to
omit the operation (unless they also omit all multi-bit
shifts; I've used machines whose only shift instructions
were single-bit shifts).
And the PDP-8 (and 12?) had rotate-by-1 and by-2, only; "through" the
Link ~ carry bit, which could be cleared or set to produce an
effective shift. Well, and (at least on some models?) byte-swap for
6-bit bytes.

<snip> (Since the zero-bit shift also seems useless, I imagine a
designer might decide to use the opcode that resembles "shift
by zero" to denote some entirely different operation. <snip>

Not 'entirely different', but on Tandem^WCompaq^WHP NonStop,
{,Dbl}{Log,Arith}{Left,Right}Shift operand 1 to 31 meant a constant
shift while 0 meant variable shift determined by (and consuming) an
additional register-stack value, which may itself be 0.

- David.Thompson1 at worldnet.att.net

Nov 14 '05 #30
