
Endianness: why does this code not change value on BE machine?

Ok,

I've searched this group for Big/Little endian issues, so don't kill me;
I know endianness issues have been discussed a thousand times. But my
question is a bit different:

I've seen the following function several times. It converts data stored
in Big Endian (BE) format into host native format (LE on LE machines,
BE on BE machines):

/* - this code swaps the bytes on a Little Endian Machine
   - this code returns 'data' unmodified on a Big Endian Machine.
*/
static short getShortBE (char *data)
{
    return (short) ((data[0] << 8) | data[1]);
}

I can understand why this function swaps bytes on a LE machine, but
why doesn't it alter 'data' on a BE machine? I've tried to understand
it with diagrams and everything, but my brain went just crazy! Can
anyone give me a simple explanation please? :) I'm really curious...

And how would a function look like which does the "opposite": swap
bytes on a BE machine, don't change values on a LE machine (that is,
"read and convert LE data")? :) Hope I can answer the 2nd question
myself when I understand the 1st question... :)

Here's the complete test program:

#include <stdlib.h>
#include <stdio.h>

/* This function is useful when reading and converting data which is
   stored in Big Endian Format.
   - this code swaps the bytes on a Little Endian Machine
   - this code returns 'data' unmodified on a Big Endian Machine.
*/
static short getShortBE (char *data)
{
    return (short) ((data[0] << 8) | data[1]);
}

int main (int argc, char **argv)
{
    int a;
    char c;
    short data1;
    short data2;
    char *d;

    (void)argc;
    (void)argv;

    /* Look at the first byte in memory of a known integer. */
    a = 0x01020304;
    c = ((char *)&a)[0];

    if (c == 1)
    {
        fprintf (stdout, "Integer a: %x - first byte: %x (MSB) -> Big endian machine.\n", a, c);
    }
    else if (c == 4)
    {
        fprintf (stdout, "Integer a: %x - first byte: %x (LSB) -> Little endian machine.\n", a, c);
    }
    else
    {
        fprintf (stdout, "Integer a: %x - first byte: %x -> A weirdo machine.\n", a, c);
    }

    /* Show the in-memory bytes of a short before and after the call. */
    data1 = 0x0102;
    d = ((char *)&data1);

    fprintf (stdout, "Data[0]: %x - Data[1]: %x\n",
             *d, *(d + 1));

    data2 = getShortBE ((char *)&data1);

    d = ((char *)&data2);
    fprintf (stdout, "After GET: Data[0]: %x - Data[1]: %x\n",
             *d, *(d + 1));

    return 0;
}

Output:
-------

On Sun:
./Endian

Integer a: 1020304 - first byte: 1 (MSB) -> Big endian machine.
Data[0]: 1 - Data[1]: 2
After GET: Data[0]: 1 - Data[1]: 2

On Linux (i386):

Integer a: 1020304 - first byte: 4 (LSB) -> Little endian machine.
Data[0]: 2 - Data[1]: 1
After GET: Data[0]: 1 - Data[1]: 2

Thanks, Oliver
Nov 14 '05 #1
12 Replies



"Oliver Knoll" <tk****@bluewin.ch> wrote

/* - this code swaps the bytes on a Little Endian Machine
- this code returns 'data' unmodified on a Big Endian Machine.
*/
static short getShortBE (char *data)
{
return (short) ((data[0] << 8) | data[1]);
}

I can understand why this function swaps bytes on a LE machine, but
why doesn't it alter 'data' on a BE machine? I've tried to understand
it with diagrams and everything, but my brain went just crazy! Can
anyone give me a simple explanation please? :) I'm really curious...

Firstly, arbitrary data should be unsigned char. Plain char is for actual
text.

Don't be fooled by the << operator. It suggests that data is being shifted
"leftwards" in memory, but in fact it always moves less significant bits to
a more significant position, whatever the byte order in memory.
Your function therefore takes an arbitrary stream of bytes, and treats the
first one as the top eight bits and the second one as the lower eight bits
of a 16-bit number.
Incidentally, it will not work as expected if CHAR_BIT is not eight, which it
is not always. short isn't necessarily sixteen bits, either.

And how would a function look like which does the "opposite": swap
bytes on a BE machine, don't change values on a LE machine (that is,
"read and convert LE data")? :) Hope I can answer the 2nd question
myself when I understand the 1st question... :)

So you can treat your arbitrary byte stream as little endian simply by
placing the first byte in the least-significant position, and shifting up
the second byte to the most significant position.

Nov 14 '05 #2

In article <b2**************************@posting.google.com >,
Oliver Knoll <tk****@bluewin.ch> wrote:
static short getShortBE (char *data)
{
return (short) ((data[0] << 8) | data[1]);
}

I can understand why this function swaps bytes on a LE machine, but
why doesn't it alter 'data' on a BE machine? I've tried to understand
it with diagrams and everything, but my brain went just crazy! Can
anyone give me a simple explanation please? :) I'm really curious...


Well, one way to see it is that this function doesn't depend on
endianness at all, so it will give the same result on a big- or
little-endian machine. But just accessing the data as a short will
give different results on the two.

-- Richard
Nov 14 '05 #3

tk****@bluewin.ch (Oliver Knoll) wrote:
/* - this code swaps the bytes on a Little Endian Machine
- this code returns 'data' unmodified on a Big Endian Machine.
*/
static short getShortBE (char *data)
{
return (short) ((data[0] << 8) | data[1]);
       ^^^^^^^ useless cast

Implementation-defined behaviour, due to sign-extension. You should
use unsigned data types for bit manipulation, e.g.:

#include <limits.h>

short getShortBE(void *data)
{
    unsigned char *p = data;
    return ( p[0] << CHAR_BIT ) | p[1];
}

I can understand why this function swaps bytes on a LE machine, but
why doesn't it alter 'data' on a BE machine?

Explain in your own words why it works on an LE machine, and
then it should be obvious why it does nothing on BE.

And how would a function look like which does the "opposite": swap
bytes on a BE machine, don't change values on a LE machine (that is,
"read and convert LE data")?


Switch the '0' with the '1'.
Nov 14 '05 #4

On 12 Sep 2004 05:00:49 -0700, tk****@bluewin.ch (Oliver Knoll) wrote:
Ok,

I've searched this group for Big/Little endian issues, don't kill me,
I know endianness issues have been discussed a 1000 times. But my
question is a bit different:

I've seen the following function several times,

where?

it converts data stored
in Big Endian (BE) format into host native format (LE on LE machines,
BE on BE machines):

it is wrong because if a char "x" is 8 bit then x<<8 == 0

/* - this code swaps the bytes on a Little Endian Machine
- this code returns 'data' unmodified on a Big Endian Machine.
*/
static short getShortBE (char *data)
{
return (short) ((data[0] << 8) | data[1]);
}

#include <limits.h>

/* Suppose 2*sizeof(char) == sizeof(short) */
unsigned short GetShort(unsigned char *data)
{
    unsigned short u = data[0];
    return (u << CHAR_BIT) | data[1];
}
Nov 14 '05 #5


On Mon, 13 Sep 2004, RoSsIaCrIiLoIA wrote:

On 12 Sep 2004 05:00:49 -0700, tk****@bluewin.ch (Oliver Knoll) wrote:

I've seen the following function several times,


where?
it converts data stored
in Big Endian (BE) format into host native format (LE on LE machines,
BE on BE machines):


it is wrong because if a char "x" is 8 bit then x<<8 == 0


No, it's not. Usual arithmetic promotions apply. Stop spreading
misinformation; that's bad.

-Arthur

Nov 14 '05 #6

"Malcolm" <ma*****@55bank.freeserve.co.uk> wrote in message news:<ci**********@news5.svr.pol.co.uk>...
"Oliver Knoll" <tk****@bluewin.ch> wrote

/* - this code swaps the bytes on a Little Endian Machine
- this code returns 'data' unmodified on a Big Endian Machine.
*/
static short getShortBE (char *data)
{
return (short) ((data[0] << 8) | data[1]);
}

I can understand why this function swaps bytes on a LE machine, but
why doesn't it alter 'data' on a BE machine? I've tried to understand
...

Firstly arbitrary data should be unsigned char. Plain char is for actual
text.

Thanks :) Good point. (In the real world I'm using such ugly stuff
like Q_UINT16 and the like though - I guess I've never seen a library
(Qt in this case) which doesn't define its own "datatypes" to ensure
correct byte sizes ;)

Don't be fooled by the << operator. This suggests that data is being shifted
"leftwards" in memory, but in fact it always moves less significant bits to
the more significant position.

Ahh, that's exactly the explanation I was looking for; my brain was
stuck on this "bits go to the left" idea. It makes perfect sense now,
thanks a lot!

...
Incidentally it will not work as expected if CHAR_BIT is not eight, which it
is not always. short isn't necessarily sixteen bits, either.


I've taken char and short and was naively assuming them to be 8 and 16
bit for illustration purposes.

Thanks, Oliver
Nov 14 '05 #7

On Mon, 13 Sep 2004 03:13:24 -0400 (EDT), "Arthur J. O'Dwyer"
<aj*@nospam.andrew.cmu.edu> wrote:
On Mon, 13 Sep 2004, RoSsIaCrIiLoIA wrote:

On 12 Sep 2004 05:00:49 -0700, tk****@bluewin.ch (Oliver Knoll) wrote:

I've seen the following function several times,

where?

it converts data stored
in Big Endian (BE) format into host native format (LE on LE machines,
BE on BE machines):


it is wrong because if a char "x" is 8 bit then x<<8 == 0


No, it's not. Usual arithmetic promotions apply. Stop spreading
misinformation; that's bad.

static short getShortBE (char *data)
{
return (short) ((data[0] << 8) | data[1]);
}


Where is the "arithmetic promotion"? I don't see how it can be OK:
data[0] is a char and << 8 is applied to it (which seems an error to me),
then it is ORed with data[1], and only then comes the promotion to short.
[If it were return (short) (data[0]<<8) | data[1]; you would be right,]
but (short) ((data[0]<<8) | data[1])
seems to me, if *data is 8 bits, to equal (short) (data[1]).


Nov 14 '05 #8


Oliver Knoll wrote:
...
/* - this code swaps the bytes on a Little Endian Machine
- this code returns 'data' unmodified on a Big Endian Machine.
*/
static short getShortBE (char *data)
{
return (short) ((data[0] << 8) | data[1]);
}

I can understand why this function swaps bytes on a LE machine, but
why doesn't it alter 'data' on a BE machine? I've tried to understand
it with diagrams and everything, but my brain went just crazy! Can
anyone give me a simple explanation please? :) I'm really curious...
...


I think it is rather obvious. Assuming that 'data' passed to this
function actually points to a two-byte integral value (say, 'short' in
your implementation, with CHAR_BIT equal to 8), on a LE machine 'data[0]' is
the low-order byte of that value and 'data[1]' is the high-order byte. Doing

(data[0] << 8) | data[1]

will indeed swap these bytes, i.e. move the former low-order byte into
the high-order position and vice versa.

On a BE machine 'data[0]' is a high-order byte and 'data[1]' is a
low-order byte of a two-byte integral value. Doing

(data[0] << 8) | data[1]

will simply re-construct the original value, i.e. high-order byte is
moved to high-order position and low-order byte is placed in low-order
position. The value remains unchanged.

--
Best regards,
Andrey Tarasevich
Nov 14 '05 #11

RoSsIaCrIiLoIA wrote:
...
it converts data stored
in Big Endian (BE) format into host native format (LE on LE machines,
BE on BE machines):


it is wrong because if a char "x" is 8 bit then x<<8 == 0
...


No. In C, the operands of the << operator are first subjected to the
integer promotions. In this case a value of type 'char' will be promoted
to a value of type 'int'. The << operator will be applied to an 'int'
operand, not to a 'char' operand. That's why what you are saying is
incorrect.

--
Best regards,
Andrey Tarasevich
Nov 14 '05 #12

On Mon, 13 Sep 2004 15:29:05 -0700, Andrey Tarasevich
<an**************@hotmail.com> wrote:
RoSsIaCrIiLoIA wrote:
...
it converts data stored
in Big Endian (BE) format into host native format (LE on LE machines,
BE on BE machines):


it is wrong because if a char "x" is 8 bit then x<<8 == 0
...


No. In C language operands of << operators are first subjected to
integer promotions. In this case a value of type 'char' will be promoted
to a value of type 'int'. The << operator will be applied to an 'int'
operand, not to a 'char' operand. That's why what you are saying is
incorrect.


Ok

unsigned short getShortBE (unsigned char* data)
{
    return (data[0] << (unsigned short) 8) | data[1];
}
Nov 14 '05 #13
