Byte ordering and array access

Benjamin M. Stocks

Hello all,
I've heard differing opinions on this and would like a definitive
answer once and for all. Suppose I have an array of 4 1-byte values
where index 0 is the least significant byte of a 4-byte value. Can I use
the arithmetic shift operators to hide the endianness of the
underlying processor when assembling a native 4-byte value, as
follows:

unsigned int integerValue;
unsigned char byteArray[4];

/* byteArray is populated elsewhere, least significant byte in index 0,
guaranteed */

integerValue = (unsigned int)byteArray[0] |
((unsigned int)byteArray[1] << 8) |
((unsigned int)byteArray[2] << 16) |
((unsigned int)byteArray[3] << 24);

So if byteArray[0] was 0x78, byteArray[1] was 0x56, byteArray[2] was
0x34 and byteArray[3] was 0x12 then would integerValue be 0x12345678 no
matter the endian-ness of the processor?

Thanks in advance,

Ben

Feb 8 '06 #1


Vladimir S. Oka

Benjamin M. Stocks wrote:

Hello all,
I've heard differing opinions on this and would like a definitive
answer on this once and for all. If I have an array of 4 1-byte values
where index 0 is the least signficant byte of a 4-byte value. Can I use
the arithmatic shift operators to hide the endian-ness of the
underlying processor when assembling a native 4-byte value like
follows:

unsigned int integerValue;
unsigned char byteArray[4];

/* byteArray is populated elsewhere, least signficant byte in index 0,
guaranteed */

integerValue = (unsigned int)byteArray[0] |
((unsigned int)byteArray[1] << 8) |
((unsigned int)byteArray[2] << 16) |
((unsigned int)byteArray[3] << 24);

So if byteArray[0] was 0x78, byteArray[1] was 0x56, byteArray[2] was
0x34 and byteArray[3] was 0x12 then would integerValue be 0x12345678 no
matter the endian-ness of the processor?

Yes, you can do this. The Standard states that shift operations are defined
in terms of the /value/ of the thing you're shifting, not its bit
representation. You can think of shifts in terms of multiplication and
division, if you will.

--
BR, Vladimir

Feb 8 '06 #2

Eric Sosman

Benjamin M. Stocks wrote On 02/08/06 10:39,:

Hello all,
I've heard differing opinions on this and would like a definitive
answer on this once and for all. If I have an array of 4 1-byte values
where index 0 is the least signficant byte of a 4-byte value. Can I use
the arithmatic shift operators to hide the endian-ness of the
underlying processor when assembling a native 4-byte value like
follows:

unsigned int integerValue;
unsigned char byteArray[4];

/* byteArray is populated elsewhere, least signficant byte in index 0,
guaranteed */

integerValue = (unsigned int)byteArray[0] |
((unsigned int)byteArray[1] << 8) |
((unsigned int)byteArray[2] << 16) |
((unsigned int)byteArray[3] << 24);

So if byteArray[0] was 0x78, byteArray[1] was 0x56, byteArray[2] was
0x34 and byteArray[3] was 0x12 then would integerValue be 0x12345678 no
matter the endian-ness of the processor?

Yes, this is fine. Well, almost: I can see two
potential portability problems:

- C guarantees that a char has at least eight bits,
but permits it to have more. Depending on just
what you mean by "1-byte values," you might want
to replace 8,16,24 by CHAR_BIT, 2*CHAR_BIT, and
3*CHAR_BIT (the CHAR_BIT macro is in <limits.h>).

- C guarantees that an int has at least sixteen bits,
but does not promise that it has as many as 32
(or 4*CHAR_BIT). That is, an int may be too narrow
to hold four bytes. This could mess things up in
two ways: first, you obviously can't use a two-pound sack
to hold four pounds of ... well, better unsaid.
Second, shifts are only guaranteed to work if the
shift distance is strictly less than the number of
bits in the value shifted, so if int is sixteen bits
wide both the 16- and 24-bit shifts are undefined.
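
The two caveats above can be folded into a small helper. This is only a sketch under stated assumptions: the name assemble_le32 is made up for illustration, the input is taken to be four 8-bit quantities (any extra bits in a wider char are masked off), and unsigned long is used because it is guaranteed to have at least 32 bits.

```c
/* Hypothetical helper, not from the thread: builds a 32-bit value from
   four octets, least significant octet first. unsigned long is at least
   32 bits wide on every conforming implementation, so the shifts by
   16 and 24 are always defined. */
static unsigned long assemble_le32(const unsigned char b[4])
{
    return  ((unsigned long)(b[0] & 0xFFu))
         | (((unsigned long)(b[1] & 0xFFu)) << 8)
         | (((unsigned long)(b[2] & 0xFFu)) << 16)
         | (((unsigned long)(b[3] & 0xFFu)) << 24);
}
```

If "1-byte values" instead means full C chars that may be wider than 8 bits, the shift counts would become CHAR_BIT multiples and the result type would need to be checked against 4*CHAR_BIT bits.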

--
Er*********@sun.com

Feb 8 '06 #3

Rod Pemberton

"Benjamin M. Stocks" <st*****@ieee.org> wrote in message
news:11**********************@f14g2000cwb.googlegroups.com...

Hello all,
Ignoring your question here:
unsigned int integerValue;
unsigned char byteArray[4];

/* byteArray is populated elsewhere, least signficant byte in index 0,
guaranteed */

integerValue = (unsigned int)byteArray[0] |
((unsigned int)byteArray[1] << 8) |
((unsigned int)byteArray[2] << 16) |
((unsigned int)byteArray[3] << 24);

You'll probably need or want to declare 'integerValue' as a more definitive
type: 'unsigned long' or 'uint32_t'.
Rod Pemberton

Feb 8 '06 #4

CBFalconer

"Benjamin M. Stocks" wrote:

I've heard differing opinions on this and would like a definitive
answer on this once and for all. If I have an array of 4 1-byte values
where index 0 is the least signficant byte of a 4-byte value. Can I use
the arithmatic shift operators to hide the endian-ness of the
underlying processor when assembling a native 4-byte value like
follows:

unsigned int integerValue;
unsigned char byteArray[4];

/* byteArray is populated elsewhere, least signficant byte in index 0,
guaranteed */

integerValue = (unsigned int)byteArray[0] |
((unsigned int)byteArray[1] << 8) |
((unsigned int)byteArray[2] << 16) |
((unsigned int)byteArray[3] << 24);

So if byteArray[0] was 0x78, byteArray[1] was 0x56, byteArray[2] was
0x34 and byteArray[3] was 0x12 then would integerValue be 0x12345678 no
matter the endian-ness of the processor?

That is basically the correct way. However you should allow for
the values of CHAR_BIT and sizeof(int). If we assume that your
input data is always in units of 8 bits (i.e. if CHAR_BIT is
greater than 8, you just don't use the extra bit(s)), you could
use:

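#include <stddef.h>

#define ASIZE sizeof(unsigned int) /* no trailing semicolon in a macro */

unsigned int uival;
unsigned char bytes[ASIZE];
size_t i;
...
/* bytes[0] is the least significant byte, so fold in from the top index down */
for (uival = 0, i = ASIZE; i-- > 0; ) {
uival = 256 * uival + (bytes[i] & 0xff);
}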

The use of multiplication and unsigned values avoids any unwanted
overflow behaviour, and the 0xff mask ensures only 8 bits are used
per byte. Now the objective of endianness independence is achieved
safely.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>

Feb 8 '06 #5

stathis gotsis

"Benjamin M. Stocks" <st*****@ieee.org> wrote in message
news:11**********************@f14g2000cwb.googlegroups.com...

Hello all,
I've heard differing opinions on this and would like a definitive
answer on this once and for all. If I have an array of 4 1-byte values
where index 0 is the least signficant byte of a 4-byte value. Can I use
the arithmatic shift operators to hide the endian-ness of the
underlying processor when assembling a native 4-byte value like
follows:

unsigned int integerValue;
unsigned char byteArray[4];

/* byteArray is populated elsewhere, least signficant byte in index 0,
guaranteed */

integerValue = (unsigned int)byteArray[0] |
((unsigned int)byteArray[1] << 8) |
((unsigned int)byteArray[2] << 16) |
((unsigned int)byteArray[3] << 24);

So if byteArray[0] was 0x78, byteArray[1] was 0x56, byteArray[2] was
0x34 and byteArray[3] was 0x12 then would integerValue be 0x12345678 no
matter the endian-ness of the processor?

May I ask a question on this? Can the endianness of the processor affect
the "<<" shifting direction? From the replies I assume it does. I need an
example where this operator shifts to the right.

Feb 10 '06 #6

Vladimir S. Oka

stathis gotsis wrote:

"Benjamin M. Stocks" <st*****@ieee.org> wrote in message
news:11**********************@f14g2000cwb.googlegroups.com...
Hello all,
I've heard differing opinions on this and would like a definitive
answer on this once and for all. If I have an array of 4 1-byte
values where index 0 is the least signficant byte of a 4-byte value.
Can I use the arithmatic shift operators to hide the endian-ness of
the underlying processor when assembling a native 4-byte value like
follows:

unsigned int integerValue;
unsigned char byteArray[4];

/* byteArray is populated elsewhere, least signficant byte in index
0, guaranteed */

integerValue = (unsigned int)byteArray[0] |
((unsigned int)byteArray[1] << 8) |
((unsigned int)byteArray[2] << 16) |
((unsigned int)byteArray[3] << 24);

So if byteArray[0] was 0x78, byteArray[1] was 0x56, byteArray[2] was
0x34 and byteArray[3] was 0x12 then would integerValue be 0x12345678
no matter the endian-ness of the processor?

May i ask a question on this? Can the endian-ness of the processor
affect the "<<" shifting direction? From the replies i assume it does.
I need an example where this operator shifts to the right.

No, it does not affect shift "direction". It may help if you think of
shifts as repeated integer divisions/multiplications by 2 (that's how
the Standard defines them -- they work on /values/ not representations).
Endianness only affects how values are stored in memory (their bit
representation, if you will). IOW, before performing the shift, a C
program reads the operand's representation, figures out the /value/,
performs the shift (i.e. division/multiplication), and if required
converts value back to representation, and stores it back.

I don't think you can construct the representation/endianness
combination that will "reverse" shifts. Or, I'm not at my creative best
at the moment (a distinct possibility -- it's Friday evening, I should
be in a pub).
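
To make the "shifts work on values" point concrete, here is a minimal sketch. The function name shl_by_multiplying is invented for illustration; it only restates the rule for unsigned operands and in-range shift counts, and nothing in it looks at how the value is laid out in memory.

```c
/* Hypothetical illustration: for an unsigned operand and an in-range
   count, x << n equals x multiplied by 2 n times, reduced modulo
   UINT_MAX + 1. Byte order never enters into it. */
static unsigned int shl_by_multiplying(unsigned int x, unsigned int n)
{
    while (n-- > 0)
        x *= 2u;   /* unsigned arithmetic wraps around, it never overflows */
    return x;
}
```

Comparing shl_by_multiplying(0x56u, 8) with 0x56u << 8 should give the same result on a little-endian and a big-endian machine alike.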

--
BR, Vladimir

Every improvement in communication makes the bore more terrible.
-- Frank Moore Colby

Feb 10 '06 #7

pete

Benjamin M. Stocks wrote:

Hello all,
I've heard differing opinions on this and would like a definitive
answer on this once and for all. If I have an array of 4 1-byte values
where index 0 is the least signficant byte of a 4-byte value.
Can I use
the arithmatic shift operators to hide the endian-ness of the
underlying processor when assembling a native 4-byte value like
follows:

unsigned int integerValue;
unsigned char byteArray[4];

/* byteArray is populated elsewhere, least signficant byte in index 0,
guaranteed */

integerValue = (unsigned int)byteArray[0] |
((unsigned int)byteArray[1] << 8) |
((unsigned int)byteArray[2] << 16) |
((unsigned int)byteArray[3] << 24);

So if byteArray[0] was 0x78, byteArray[1] was 0x56, byteArray[2] was
0x34 and byteArray[3] was 0x12 then would integerValue
be 0x12345678 no
matter the endian-ness of the processor?

Yes, as long as INT_MAX is large enough.
You could also do it this way:

integerValue = byteArray[0]
+ byteArray[1] * 0x100LU
+ byteArray[2] * 0x10000LU
+ byteArray[3] * 0x1000000LU;

--
pete

Feb 10 '06 #8

pete

pete wrote:

Benjamin M. Stocks wrote:
unsigned int integerValue; then would integerValue be 0x12345678

Yes, as long as INT_MAX is large enough.

UINT_MAX

--
pete

Feb 10 '06 #9

stathis gotsis

"Vladimir S. Oka" <no****@btopenworld.com> wrote in message
news:ds**********@nwrdmz03.dmz.ncs.ea.ibs-infra.bt.com...

stathis gotsis wrote:
May i ask a question on this? Can the endian-ness of the processor
affect the "<<" shifting direction? From the replies i assume it does.
I need an example where this operator shifts to the right.

No, it does not affect shift "direction". It may help if you think of
shifts as repeated integer divisions/multiplications by 2 (that's how
Standard defines them -- they work on /values/ not representations).
Endianness only affect how values are stored in memory (their bit
representation, if you will). IOW, before performing the shift, C
program reads operand's representation, figures out the /value/,
performs shifting (i.e. division/multiplication), and if required
converts value back to representation, and stores it back.

I don't think you can construct the representation/endinanness
combination that will "reverse" shifts. Or, I'm not at my creative best
at the moment (a distinct possibility -- it's Friday evening, I should
be in a pub).

Well, yes there is a clear distinction between values and representations,
so my question was pointless anyway. But suppose we have 2-byte integers and
let int a=0xABCD. In one representation that could be: ABCD and in another:
DCBA. In terms of representations, one could say that in the first one "<<"
operator shifts left and in the second to the right. But that cannot happen
in the real world, right?

Furthermore, let's take the OP. The program needs to evaluate the following
expression: ((unsigned int)byteArray[1] << 8). I assume that means that
byteArray[1] should move to the place of the next most significant byte in
the unsigned int word. So that shifting could be "left" or "right" depending
on endianness.

Feb 10 '06 #10

pete

stathis gotsis wrote:

But suppose we have 2-byte integers and
let int a=0xABCD. In one representation that could be: ABCD
and in another: DCBA.

Leftness and rightness have to do with significance,
as in: the least significant byte is the rightmost.
They have nothing to do with the addresses of the bytes.

Assuming CHAR_BIT equals 8 and
assuming that by DCBA, you mean
the lower byte will have value 0xDC and
the higher byte will have value 0xBA:
that's wrong.

One byte will have value 0xCD
and the other will have 0xAB.

--
pete

Feb 10 '06 #11

Jordan Abel

On 2006-02-10, pete <pf*****@mindspring.com> wrote:

stathis gotsis wrote:
But suppose we have 2-byte integers and
let int a=0xABCD. In one representation that could be: ABCD
and in another: DCBA.

Leftness and Rightness has to do with significance,
as in: The least significant byte is the right most.
It has nothing to do with addresses of the bytes.

Assuming CHAR_BIT equals 8 and
assuming that by DCBA, you mean
the lower byte will have value 0xDC and
the higher byte will have value 0xBA:
that's wrong.

One byte will have value 0xCD
and the other will have 0xAB.

I don't believe this is actually guaranteed. Though, processors which
play silly games with the endianness of units other than bytes are rare
indeed [I think I once owned a graphing calculator that did that].

Feb 10 '06 #12

pete

Jordan Abel wrote:

On 2006-02-10, pete <pf*****@mindspring.com> wrote:
stathis gotsis wrote:
But suppose we have 2-byte integers and
let int a=0xABCD. In one representation that could be: ABCD
and in another: DCBA.
Leftness and Rightness has to do with significance,
as in: The least significant byte is the right most.
It has nothing to do with addresses of the bytes.

Assuming CHAR_BIT equals 8 and
assuming that by DCBA, you mean
the lower byte will have value 0xDC and
the higher byte will have value 0xBA:
that's wrong.

One byte will have value 0xCD
and the other will have 0xAB.

I don't believe this is actually guaranteed.

It is.

Though, processors which
play silly games with the endian-ness of units
other than bytes are rare
indeed [i think i once owned a graphing calulator that did that]

Regardless of what the processor actually does,
in a C program,
it has to make objects look like bytes of bits.

In a C program you can examine any byte of any object,
except register class objects,
as an object of type unsigned char.
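
A minimal sketch of that kind of examination, with invented helper names (bytes_of_uint and uint_from_bytes are not from the thread). It makes no claim about which value each byte holds, only that the bytes can be read as unsigned char and copied back into an object of the same type.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the object representation of an unsigned
   int into buf, lowest address first, and returns the byte count.
   buf must have room for sizeof(unsigned int) bytes. */
static size_t bytes_of_uint(unsigned int value, unsigned char *buf)
{
    const unsigned char *p = (const unsigned char *)&value;
    size_t i;

    for (i = 0; i < sizeof value; i++)
        buf[i] = p[i];
    return sizeof value;
}

/* Copies those bytes back into a fresh object of the same type. */
static unsigned int uint_from_bytes(const unsigned char *buf)
{
    unsigned int value;

    memcpy(&value, buf, sizeof value);
    return value;
}
```

Printing buf[0], buf[1], ... after a call shows the implementation's layout; the round trip through uint_from_bytes returns the original value whatever that layout is.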

--
pete

Feb 11 '06 #13

Jordan Abel

On 2006-02-11, pete <pf*****@mindspring.com> wrote:

Jordan Abel wrote:

On 2006-02-10, pete <pf*****@mindspring.com> wrote:
> stathis gotsis wrote:
>
>> But suppose we have 2-byte integers and
>> let int a=0xABCD. In one representation that could be: ABCD
>> and in another: DCBA.
>
> Leftness and Rightness has to do with significance,
> as in: The least significant byte is the right most.
> It has nothing to do with addresses of the bytes.
>
> Assuming CHAR_BIT equals 8 and
> assuming that by DCBA, you mean
> the lower byte will have value 0xDC and
> the higher byte will have value 0xBA:
> that's wrong.
>
> One byte will have value 0xCD
> and the other will have 0xAB.
I don't believe this is actually guaranteed.

It is.

Where?

Regardless of what the processor actually does, in a C program, it has
to make objects look like bytes of bits.

Yeah, but the values of the bytes are not defined, except that you can
copy them out and back into another object of the same type.

In a C program you can examine any byte of any object,
except register class objects,
as a object of unsigned char.

Sure, but it doesn't guarantee _what_ values there are. There's no
reason 0xABCD couldn't be three bytes: 0x5 0x2D 0x8D. Padding bits and
such. Nothing is guaranteed about the physical order in which bits are
interpreted in an integer vs in an unsigned char

Feb 11 '06 #14

stathis gotsis

"pete" <pf*****@mindspring.com> wrote in message
news:43**********@mindspring.com...

stathis gotsis wrote:
But suppose we have 2-byte integers and
let int a=0xABCD. In one representation that could be: ABCD
and in another: DCBA.
Leftness and Rightness has to do with significance,
as in: The least significant byte is the right most.
It has nothing to do with addresses of the bytes.

Assuming that more significant bytes are to the "left" of less significant
ones, I come to the conclusion that the "<<" operator always shifts to the
"left". I think it is a matter of convention anyway.

Assuming CHAR_BIT equals 8 and
assuming that by DCBA, you mean
the lower byte will have value 0xDC and
the higher byte will have value 0xBA:
that's wrong.

One byte will have value 0xCD
and the other will have 0xAB.

Well, yes that is the common case. I will take your word on the
non-existence of the opposite.

Feb 11 '06 #15

pete

Jordan Abel wrote:

Nothing is guaranteed about the physical order in which bits are
interpreted in an integer vs in an unsigned char

You think a two byte int object with a value of 3,
might have one bit set in each byte?

Maybe, I don't know.

--
pete

Feb 11 '06 #16

stathis gotsis

"Vladimir S. Oka" <no****@btopenworld.com> wrote in message
news:11**********************@g14g2000cwa.googlegroups.com...

Yes, you can do this. Standard states that shift operations are defined
in terms of the /value/ of the thing you're shifting, not it's bit
representation. You can think of shifts in terms of multiplication and
division, if you will.

Does that apply for negative (signed) operands as well? I think K&R2 says
that is implementation specific, shifting negatives can be implemented
logically (sticking to the bit representation) or arithmetically (sticking
to real value). Please comment on that.

Feb 11 '06 #17

Vladimir S. Oka

stathis gotsis wrote:

"Vladimir S. Oka" <no****@btopenworld.com> wrote in message
news:11**********************@g14g2000cwa.googlegroups.com...
Yes, you can do this. Standard states that shift operations are
defined in terms of the /value/ of the thing you're shifting, not
it's bit representation. You can think of shifts in terms of
multiplication and division, if you will.
Does that apply for negative (signed) operands as well? I think K&R2
says that is implementation specific, shifting negatives can be
implemented logically (sticking to the bit representation) or
arithmetically (sticking to real value). Please comment on that.

In short, the Standard specifies that, if the right operand is negative (or
>= the width of the promoted left operand), you get Undefined Behaviour (6.5.7.3).

If the left operand is negative, for left shift (<<) you get U.B.
(6.5.7.4), but for right shift (>>) the result is implementation-defined
(6.5.7.5).

I think full C&V should answer all your dilemmas (note how the Standard talks
both of bits /and/ arithmetic operations):

6.5.7 Bitwise shift operators

6.5.7.2 Constraints
Each of the operands shall have integer type.

6.5.7.3 Semantics
The integer promotions are performed on each of the operands. The type
of the result is that of the promoted left operand. If the value of the
right operand is negative or is greater than or equal to the width of
the promoted left operand, the behavior is undefined.

6.5.7.4
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits
are filled with zeros. If E1 has an unsigned type, the value of the
result is E1 x 2^E2, reduced modulo one more than the maximum value
representable in the result type. If E1 has a signed type and
nonnegative value, and E1 x 2^E2 is representable in the result type,
then that is the resulting value; otherwise, the behavior is undefined.

6.5.7.5
The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has
an unsigned type or if E1 has a signed type and a nonnegative value,
the value of the result is the integral part of the quotient of E1
divided by the quantity, 2 raised to the power E2. If E1 has a signed
type and a negative value, the resulting value is
implementation-defined.
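
As a rough illustration of the 6.5.7.3 rule quoted above, the sketch below refuses shift counts that would be undefined. The names uint_width and checked_shl are invented for this example. The width is counted from UINT_MAX rather than taken from sizeof, so that padding bits, if an implementation has any, are not counted.

```c
#include <limits.h>

/* Number of value bits in unsigned int, counted from UINT_MAX. */
static int uint_width(void)
{
    unsigned int m = UINT_MAX;
    int w = 0;

    while (m != 0) {
        w++;
        m >>= 1;
    }
    return w;
}

/* Hypothetical helper: stores value << count in *out and returns 1,
   or returns 0 without shifting when the count is negative or not
   less than the width of unsigned int. */
static int checked_shl(unsigned int value, int count, unsigned int *out)
{
    if (count < 0 || count >= uint_width())
        return 0;
    *out = value << count;   /* unsigned: result is reduced modulo UINT_MAX + 1 */
    return 1;
}
```

Because the operand is unsigned, bits shifted off the top are simply discarded, which matches the "reduced modulo" wording of 6.5.7.4.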

--
BR, Vladimir

If you've done six impossible things before breakfast, why not round it
off with dinner at Milliway's, the restaurant at the end of the
universe.

Feb 11 '06 #18

Chris Torek

In article <ds***********@ulysses.noc.ntua.gr>
stathis gotsis <st***********@hotmail.com> wrote:

Well, yes there is a clear distinction between values and representations,
so my [original] question was pointless anyway.

Indeed. :-)

But suppose we have 2-byte integers [and standard 8-bit bytes]
and let int a=0xABCD. In one representation that could be: ABCD and
in another: DCBA.

This is getting close to the heart of the issue (with "endianness"
being "the issue" in question).

"Endianness" is an artifact that arises when some entity takes a
whole -- such as the value 0xABCD -- and splits it into parts.
Here, you have allowed someone(s) to split it into two parts, "AB"
and "CD", and then scatter those two parts about your room, where
the cat can subsequently gnaw on them.

The question you should ask yourself is: who is this entity that
is splitting up your whole, and why are you giving him, her, or it
permission to do so? What will he/she/it do with the pieces? Who
or what will re-assemble them later, and will all the various
entities doing this splitting-up and re-assembling cooperate?

If *you* do the splitting-up yourself:

unsigned char split[4];
unsigned long value;

split[0] = (value >> 24) & 0xff;
split[1] = (value >> 16) & 0xff;
split[2] = (value >> 8) & 0xff;
split[3] = value & 0xff;

and *you* do the re-assembling later:

value = (unsigned long)split[0] << 24;
value |= (unsigned long)split[1] << 16;
value |= (unsigned long)split[2] << 8;
value |= (unsigned long)split[3];

will you co-operate with yourself? Will that guarantee that you
get the proper value back?
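
The two fragments above can be wrapped into a pair of functions to try the round trip. This is a sketch only; the names split_be32 and join_be32 are invented here, and the value is assumed to fit in 32 bits.

```c
/* Hypothetical wrappers around the fragments above: split a 32-bit
   value into four octets, most significant first, and join them again. */
static void split_be32(unsigned long value, unsigned char out[4])
{
    out[0] = (unsigned char)((value >> 24) & 0xff);
    out[1] = (unsigned char)((value >> 16) & 0xff);
    out[2] = (unsigned char)((value >> 8) & 0xff);
    out[3] = (unsigned char)(value & 0xff);
}

static unsigned long join_be32(const unsigned char in[4])
{
    unsigned long value;

    value  = (unsigned long)in[0] << 24;
    value |= (unsigned long)in[1] << 16;
    value |= (unsigned long)in[2] << 8;
    value |= (unsigned long)in[3];
    return value;
}
```

Since both directions are written in terms of values, the order of the octets in the array is fixed by this code and not by the processor.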

In terms of representations, one could say ...

In the Olden Daze, computer memory was stored in little magnetic
donuts called "cores" (see <http://en.wikipedia.org/wiki/Core_memory>).
You could actually point to the individual donuts holding each
individual bit in memory. Depending on the architecture (core
memory was often stored in "planes" for speed), it is quite reasonable
to expect that each bit of a single word would be stored in a
different circuit board in the computer. If you had an 18 or 36
bit word (those being common word sizes at the time), any given
value was stored in 18 or 36 different locations, none particularly
being "left" or "right" hand sided.

Even today, the actual bit layout on any given DRAM card "stick"
may be spread out, so that the chips holding your values may not
be particularly sort-able into "left" and "right" (they may be
mixed together, and/or "up" and "down"). You never notice because
you are unable -- at least without a logic probe -- to observe the
bits being split up and reassembled. A single entity (the memory
controller on the particular card) is responsible for the splitting-up
and re-assembling, and it always cooperates with itself.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Feb 11 '06 #19

stathis gotsis

"Chris Torek" <no****@torek.net> wrote in message
news:ds*********@news1.newsguy.com...

In article <ds***********@ulysses.noc.ntua.gr>
stathis gotsis <st***********@hotmail.com> wrote:
Well, yes there is a clear distinction between values and representations,
so my [original] question was pointless anyway.
Indeed. :-)

Thank you for taking the time to clarify these issues.

But suppose we have 2-byte integers [and standard 8-bit bytes]
and let int a=0xABCD. In one representation that could be: ABCD and
in another: DCBA.

This is getting close to the heart of the issue (with "endiannness"
being "the issue" in question).

"Endianness" is an artifact that arises when some entity takes a
whole -- such as the value 0xABCD -- and splits it into parts.
Here, you have allowed someone(s) to split it into two parts, "AB"
and "CD", and then scatter those two parts about your room, where
the cat can subsequently gnaw on them.

Well, in my previous example I allowed someone to split the 2-byte whole
into 4-bit entities. I was wondering if that can happen in real-life
systems. Is the byte the atom (the smallest entity that cannot be further
split) in the context of endianness? If it is, then my example resides in
the field of imagination.

The question you should ask yourself is: who is this entity that
is splitting up your whole, and why are you giving him, her, or it
permission to do so? What will he/she/it do with the pieces? Who
or what will re-assemble them later, and will all the various
entities doing this splitting-up and re-assembling cooperate?

If *you* do the splitting-up yourself:

unsigned char split[4];
unsigned long value;

split[0] = (value >> 24) & 0xff;
split[1] = (value >> 16) & 0xff;
split[2] = (value >> 8) & 0xff;
split[3] = value & 0xff;

and *you* do the re-assembling later:

value = (unsigned long)split[0] << 24;
value |= (unsigned long)split[1] << 16;
value |= (unsigned long)split[2] << 8;
value |= (unsigned long)split[3];

will you co-operate with yourself? Will that guarantee that you
get the proper value back?

I think I will get the proper value back. In this example, C will hide any
implementation-specific representation. I think with this expression:
(value >> n*8) & 0xff;
we get the (n+1)th least significant byte, regardless of endianness. Is that
true? Can you show me an example where C reveals endianness?
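
One common way C can reveal the byte order is to look at an object's bytes through an unsigned char pointer. The sketch below contrasts that with the value-based expression; both function names are invented for illustration.

```c
/* Looks at the representation: returns the byte stored at the lowest
   address of an unsigned int holding the given value. The answer
   depends on the implementation's byte order. */
static unsigned int lowest_address_byte(unsigned int value)
{
    const unsigned char *p = (const unsigned char *)&value;

    return p[0];
}

/* Works on the value: returns its low-order 8 bits on every
   implementation, whatever the byte order. */
static unsigned int low_order_byte(unsigned int value)
{
    return value & 0xffu;
}
```

On a typical little-endian machine lowest_address_byte(1u) is expected to give 1, and on a typical big-endian machine with a multi-byte int it is expected to give 0, while low_order_byte(1u) is 1 everywhere.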

In terms of representations, one could say ...

In the Olden Daze, computer memory was stored in little magnetic
donuts called "cores" (see <http://en.wikipedia.org/wiki/Core_memory>).
You could actually point to the individual donuts holding each
individual bit in memory. Depending on the architecture (core
memory was often stored in "planes" for speed), it is quite reasonable
to expect that each bit of a single word would be stored in a
different circuit board in the computer. If you had an 18 or 36
bit word (those being common word sizes at the time), any given
value was stored in 18 or 36 different locations, none particularly
being "left" or "right" hand sided.
Even today, the actual bit layout on any given DRAM card "stick"
may be spread out, so that the chips holding your values may not
be particularly sort-able into "left" and "right" (they may be
mixed together, and/or "up" and "down"). You never notice because
you are unable -- at least without a logic probe -- to observe the
bits being split up and reassembled. A single entity (the memory
controller on the particular card) is responsible for the splitting-up
and re-assembling, and it always cooperates with itself.

That was food for thought but I think you went too low-level. Yes, memory
hides all internal implementation details, collects the 8 bits of a byte,
which may be scattered on the chip, and gives the byte. I believe that the
real question is whether we can access pieces of data smaller than bytes in
a real memory. If we cannot, then all possible processor-specific
endiannesses are just the ways we can put two or more bytes in some piece
of memory.

Feb 12 '06 #20

Robin Haigh

"stathis gotsis" <st***********@hotmail.com> wrote in message
news:ds**********@ulysses.noc.ntua.gr...

That was food for thought but i think you went too low-level. Yes, memory
hides all internal implementation details, collects the 8-bits of a byte,
which maybe scattered on the chip, and gives the byte. I believe that the
real question is whether we can access pieces of data smaller than bytes in a real memory? If we cannot then all possible processor-specific
endianness-es are the ways we can put two or more bytes in some memory
piece.

You need two different orderings before you can discuss how they relate to
each other.

When you store a 16-bit unsigned integer value into 2 bytes of
byte-addressable memory (and this didn't arise before byte-addressing), by
common custom and convention (but no absolute rule) you encode it base-256,
i.e. the byte values you store will be x/256 and x%256.

On that assumption, you now have an ordering by significance -- one byte is
the "big" byte -- and also an ordering by memory address, so you can talk
about which byte (by significance) is the low-address byte, i.e. endianness.

With the bits involved in bitwise operations, you have an ordering by
significance, but only that. There's no low-address end or left-hand end or
any other positional description. You can certainly access the LSB, but
every way of doing so refers to it by significance, essentially. So you
can't talk about relative bit-ordering, because you can't see anything for
it to be relative to.

Of course this changes when you serialise the bits in a byte onto a serial
communications line. Then, you do have another ordering, so the hardware
does have to agree on the bit-endianness and reassemble the byte values as
transmitted. But, unlike the cpu vendors, the bus and network vendors (by
some miracle) do have this all sorted out, and we don't actually get to see
bit-swapped bytes, so we treat it as a non-issue. The danger that you fear
was potentially real, but has been averted.

The terms "left-shift" and "right-shift" are motivated by the fact that in
America and many other countries, when numbers are written down in
place-value notation, we write the big end on the left. If numbers were
normally written the other way round, e.g. 000,000,1 for a million, the
names would have been reversed. This hasn't got anything to do with cpu
architecture.

--
RSH

Feb 12 '06 #21

Chris Torek

>"Chris Torek" <no****@torek.net> wrote in message

news:ds*********@news1.newsguy.com...
"Endianness" is an artifact that arises when some entity takes a
whole -- such as the value 0xABCD -- and splits it into parts.
Here, you have allowed someone(s) to split it into two parts, "AB"
and "CD", and then scatter those two parts about your room, where
the cat can subsequently gnaw on them.

In article <ds**********@ulysses.noc.ntua.gr>,
stathis gotsis <st***********@hotmail.com> wrote:
Well, in my previous example i allowed someone to split the 2-byte whole,
into 4-bit entities. I was wondering if that can happen in real-life
systems. Is the byte the atom (the smallest entity that cannot be further
split) in the context of endianness? If it is, then my example resides in
the field of imagination.

So, the question becomes, is there a con "struct" in C that will
allow you to ask someone/something to split a value into, say,
"two:4" bit pieces? :-)

struct S {
unsigned int a:4, b:4, c:4, d:4;
} x = { 7, 1, 8, 10 };

The next question might be: "who is doing the splitting?" (There
are two possible answers. Either the compiler is doing it all on
its own, as directed by whoever wrote that compiler; or the compiler
is doing it with some assistance from the CPU. In the first case,
the compiler-writer chooses the endian-ness. In the second, the
compiler-writer colludes with the chip-maker to choose the
endianness. Note that even chip designers sometimes change their
minds: the numbering of bits on the 680x0 is different for the Bxxx
instructions on the original 68000, and the BFxxx instructions that
were added to the 68020 or 68030 [I forget which].)

The question you should ask yourself is: who is this entity that
is splitting up your whole, and why are you giving him, her, or it
permission to do so? What will he/she/it do with the pieces? Who
or what will re-assemble them later, and will all the various
entities doing this splitting-up and re-assembling cooperate?

If *you* do the splitting-up yourself:

unsigned char split[4];
unsigned long value;

split[0] = (value >> 24) & 0xff;
split[1] = (value >> 16) & 0xff;
split[2] = (value >> 8) & 0xff;
split[3] = value & 0xff;

and *you* do the re-assembling later:

value = (unsigned long)split[0] << 24;
value |= (unsigned long)split[1] << 16;
value |= (unsigned long)split[2] << 8;
value |= (unsigned long)split[3];

will you co-operate with yourself? Will that guarantee that you
get the proper value back?

I think i will get the proper value back.

Indeed you will.
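
The two fragments above can be wrapped into a pair of functions and checked against each other. A minimal sketch: the names split32 and join32 are invented here, and only the low 32 bits of the unsigned long are handled.

```c
/* Store the low 32 bits of value into split[0..3], most significant
   byte first, using only shifts and masks on the value. */
void split32(unsigned long value, unsigned char split[4])
{
    split[0] = (unsigned char)((value >> 24) & 0xff);
    split[1] = (unsigned char)((value >> 16) & 0xff);
    split[2] = (unsigned char)((value >> 8) & 0xff);
    split[3] = (unsigned char)(value & 0xff);
}

/* Rebuild the value from four bytes laid out by split32(). */
unsigned long join32(const unsigned char split[4])
{
    unsigned long value;

    value  = (unsigned long)split[0] << 24;
    value |= (unsigned long)split[1] << 16;
    value |= (unsigned long)split[2] << 8;
    value |= (unsigned long)split[3];
    return value;
}
```

Because neither function looks at how unsigned long is stored in memory, join32 applied to the output of split32 gives back the original value (below 2^32) whatever the byte order of the machine.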

The same applies to bitfields.

C's bitfields are very tempting to the embedded-systems programmer
writing, e.g., a SCSI or USB IO system (it turns out that USB is
essentially "SCSI over serial lines", as far as protocol goes
anyway). SCSI -- and hence USB -- disk commands and responses
are full of sub-byte fields. C's bitfields *appear* to map to
SCSI bitfields ... but if you use them for this, you give up all
control to the compiler and/or CPU, and those may not arrange your
fields the way you intended.

If you write out explicit shift-and-mask code, it will work on
every system that is actually capable of supporting the hardware.
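
As an illustration of shift-and-mask code for sub-byte fields, here is a sketch for a single byte. The layout (a 3-bit "lun" in bits 7..5 and a 5-bit "flags" field in bits 4..0) and all the names are hypothetical, chosen only for this example and not taken from any SCSI or USB specification.

```c
/* Hypothetical layout, for illustration only:
   bits 7..5 = 3-bit "lun", bits 4..0 = 5-bit "flags". */
#define LUN_SHIFT  5
#define LUN_MASK   0x07u
#define FLAGS_MASK 0x1fu

/* Build the byte from its two fields; out-of-range bits are discarded. */
unsigned char pack_byte(unsigned lun, unsigned flags)
{
    return (unsigned char)(((lun & LUN_MASK) << LUN_SHIFT) |
                           (flags & FLAGS_MASK));
}

/* Extract the 3-bit field from bits 7..5. */
unsigned unpack_lun(unsigned char b)
{
    return ((unsigned)b >> LUN_SHIFT) & LUN_MASK;
}

/* Extract the 5-bit field from bits 4..0. */
unsigned unpack_flags(unsigned char b)
{
    return (unsigned)b & FLAGS_MASK;
}
```

Here the programmer, not the compiler, decides which bits each field occupies, so the byte that goes out on the wire is the same on every platform.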
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Feb 12 '06 #22

Joe Wright

Robin Haigh wrote:

"stathis gotsis" <st***********@hotmail.com> wrote in message
news:ds**********@ulysses.noc.ntua.gr...
That was food for thought but i think you went too low-level. Yes, memory
hides all internal implementation details, collects the 8-bits of a byte,
which may be scattered on the chip, and gives the byte. I believe that the
real question is whether we can access pieces of data smaller than bytes in
a real memory? If we cannot then all possible processor-specific
endianness-es are the ways we can put two or more bytes in some memory
piece.

You need two different orderings before you can discuss how they relate to
each other.

When you store a 16-bit unsigned integer value into 2 bytes of
byte-addressable memory (and this didn't arise before byte-addressing), by
common custom and convention (but no absolute rule) you encode it base-256,
i.e. the byte values you store will be x/256 and x%256.

On that assumption, you now have an ordering by significance -- one byte is
the "big" byte -- and also an ordering by memory address, so you can talk
about which byte (by significance) is the low-address byte, i.e. endianness.

With the bits involved in bitwise operations, you have an ordering by
significance, but only that. There's no low-address end or left-hand end or
any other positional description. You can certainly access the LSB, but
every way of doing so refers to it by significance, essentially. So you
can't talk about relative bit-ordering, because you can't see anything for
it to be relative to.

Of course this changes when you serialise the bits in a byte onto a serial
communications line. Then, you do have another ordering, so the hardware
does have to agree on the bit-endianness and reassemble the byte values as
transmitted. But, unlike the cpu vendors, the bus and network vendors (by
some miracle) do have this all sorted out, and we don't actually get to see
bit-swapped bytes, so we treat it as a non-issue. The danger that you fear
was potentially real, but has been averted.
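
The serial-line case can be modelled in a few lines. This is a toy sketch, assuming 8-bit bytes and representing the "wire" as an array with one element per bit; the function names are invented for the example.

```c
/* Put the 8 low bits of byte on the "wire", least significant bit first. */
void send_lsb_first(unsigned byte, unsigned char wire[8])
{
    int i;

    for (i = 0; i < 8; i++)
        wire[i] = (unsigned char)((byte >> i) & 1u);
}

/* Receiver that agrees with the sender: first bit seen is the LSB. */
unsigned recv_lsb_first(const unsigned char wire[8])
{
    unsigned byte = 0;
    int i;

    for (i = 0; i < 8; i++)
        byte |= (unsigned)wire[i] << i;
    return byte;
}

/* Receiver that disagrees: it treats the first bit seen as the MSB. */
unsigned recv_msb_first(const unsigned char wire[8])
{
    unsigned byte = 0;
    int i;

    for (i = 0; i < 8; i++)
        byte = (byte << 1) | wire[i];
    return byte;
}
```

Sending 0xB2 and receiving with recv_lsb_first returns 0xB2, while recv_msb_first returns the bit-reversed value 0x4D. Only once there is a second ordering (time on the wire) does a bit order exist for the two ends to agree on.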
The terms "left-shift" and "right-shift" are motivated by the fact that in
America and many other countries, when numbers are written down in
place-value notation, we write the big end on the left. If numbers were
normally written the other way round, e.g. 000,000,1 for a million, the
names would have been reversed. This hasn't got anything to do with cpu
architecture.

Very good Robin. You have nailed it well. Our conventional number system
is said to be 'Arabic'. This is not because we might recognize the digit
4 in Arabic, but because numbers are written right to left, low order
first with place and value reserved for the concept of zero.

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---

Feb 12 '06 #23

stathis gotsis

"Robin Haigh" <ec*****@leeds.ac.uk> wrote in message
news:ds**********@news8.svr.pol.co.uk...

You need two different orderings before you can discuss how they relate to
each other.

When you store a 16-bit unsigned integer value into 2 bytes of
byte-addressable memory (and this didn't arise before byte-addressing), by
common custom and convention (but no absolute rule) you encode it base-256,
i.e. the byte values you store will be x/256 and x%256.

On that assumption, you now have an ordering by significance -- one byte is
the "big" byte -- and also an ordering by memory address, so you can talk
about which byte (by significance) is the low-address byte, i.e. endianness.

Yes, i was curious if there are other encodings in real systems, other than
this common one, leading to other possibilities for endianness.

With the bits involved in bitwise operations, you have an ordering by
significance, but only that. There's no low-address end or left-hand end or
any other positional description. You can certainly access the LSB, but
every way of doing so refers to it by significance, essentially. So you
can't talk about relative bit-ordering, because you can't see anything for
it to be relative to.

So, i come to the conclusion that shifting operations hide endianness from
the programmer. Maybe one could reveal endianness this way?

#include <stdio.h>

int main(void)
{
int i=0;
unsigned int a = 0xabcdabcd;
unsigned char *b;
b=(unsigned char *)&a;

while (i<sizeof(a))
{
printf("%d byte: %x\n",i+1,b[i]);
i++;
}

return 0;
}

Of course this changes when you serialise the bits in a byte onto a serial
communications line. Then, you do have another ordering, so the hardware
does have to agree on the bit-endianness and reassemble the byte values as
transmitted. But, unlike the cpu vendors, the bus and network vendors (by
some miracle) do have this all sorted out, and we don't actually get to see
bit-swapped bytes, so we treat it as a non-issue. The danger that you fear
was potentially real, but has been averted.

The terms "left-shift" and "right-shift" are motivated by the fact that in
America and many other countries, when numbers are written down in
place-value notation, we write the big end on the left. If numbers were
normally written the other way round, e.g. 000,000,1 for a million, the
names would have been reversed. This hasn't got anything to do with cpu
architecture.

Yes, that is clear to me now.

Feb 12 '06 #24

Rod Pemberton

"stathis gotsis" <st***********@hotmail.com> wrote in message
news:ds***********@ulysses.noc.ntua.gr...

So, i come to the conclusion that shifting operations hide endianness from
the programmer. Maybe one could reveal endianness this way?

#include <stdio.h>

int main(void)
{
int i=0;
unsigned int a = 0xabcdabcd;
unsigned char *b;
b=(unsigned char *)&a;

while (i<sizeof(a))
{
printf("%d byte: %x\n",i+1,b[i]);
i++;
}

return 0;
}

There are two standard methods to determine endianness. See the code below.
There are big-endian and little-endian machines. Old 16-bit little-endian
machines (VAX, PDP-11) became middle-endian in a 32-bit world. Will the
same happen to little-endian 32-bit Intel CPUs in a 64-bit world? And what
will they call it? Middle-middle-endianness?

Wiki on Endianness
http://en.wikipedia.org/wiki/Endianness

Rod Pemberton
----

#include <stdio.h>
union { long Long; char Char[sizeof(long)]; } u;

int main (void)
{
/* Method 1 */
int x = 1;

if ( *(char *)&x == 1)
printf("Register addressing is right-to-left,"
" LSB stored in memory first, \n "
"or little endian - memory addressing "
"decreases from MSB to LSB.\n");
else
printf("Register addressing is left-to-right,"
" MSB stored in memory first, \n "
"or big endian - memory addressing "
"increases from MSB to LSB .\n");

/* Method 2 */
u.Long = 1;
if (u.Char[0] == 1)
printf("Register addressing is right-to-left,"
" LSB stored in memory first, \n "
"or little endian - memory addressing "
"decreases from MSB to LSB.\n");
else if (u.Char[sizeof(long)-1] == 1)
printf("Register addressing is left-to-right, "
"MSB stored in memory first, \n "
"or big endian - memory addressing "
"increases from MSB to LSB .\n");
else printf("Addressing is strange\n");

return(0);
}

/* MSB - most significant byte */
/* LSB - least significant byte */
/* */
/* little endian - the endian (i.e.,LSB) is at the little address in memory */
/* memory addressing decreases from MSB to LSB */
/* LSB stored in memory first */
/* register addressing is right-to-left */
/* big endian - the endian (i.e.,LSB) is at the big address in memory */
/* memory addressing increases from MSB to LSB */
/* MSB stored in memory first */
/* register addressing is left-to-right */
/* */
/* big little endian order */
/* 0123 0123 memory addresses 0,1,2,3 */
/* ASDF ASDF memory order, character A stored at 0, etc... */
/* */
/* ASDF FDSA register order, MSB...LSB */
/* M..L M..L MSB is bits 24-31 */
/* S..S S..S LSB is bits 0-7 */
/* B..B B..B */
/* */
/* big endian order improves string processing and */
/* eliminates the need for special string instructions */
/* thereby reducing the instruction set for the cpu (RISC) */
/* but arithmetic and branching need more circuitry to adjust */
/* for the changing location of the LSB depending on data size */
/* Words, or double words can be loaded into a register */
/* and the string byte ordering remains the same */
/* little endian order improves arithmetic and branching by */
/* keeping the LSB in the same location independent of data */
/* size and eliminates the need for extra math circuitry */
/* but needs string instructions which creates larger */
/* instruction set for the cpu (CISC) */
/* Words, or double words loaded into a register using */
/* integer instructions (non string instructions) reverses */
/* the string byte order i.e., AS -> SA, ASDF -> FDSA */
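
A third variant, sketched here under the assumption of 8-bit bytes and a C99 <stdint.h>, copies a known 32-bit pattern into a byte array and also reports orders that are neither big nor little. The names byte_order and detect_byte_order are invented for this example.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

#if CHAR_BIT != 8
#error "this sketch assumes 8-bit bytes"
#endif

enum byte_order { ORDER_LITTLE, ORDER_BIG, ORDER_OTHER };

/* Classify how this platform lays out a 32-bit unsigned value in memory.
   uint32_t has no padding bits, so each byte of the pattern appears
   exactly once among the four bytes of its object representation. */
enum byte_order detect_byte_order(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char raw[sizeof probe];

    memcpy(raw, &probe, sizeof raw);

    if (raw[0] == 0x04 && raw[1] == 0x03 && raw[2] == 0x02 && raw[3] == 0x01)
        return ORDER_LITTLE;
    if (raw[0] == 0x01 && raw[1] == 0x02 && raw[2] == 0x03 && raw[3] == 0x04)
        return ORDER_BIG;
    return ORDER_OTHER;   /* e.g. a middle-endian layout */
}
```

Using four distinct byte values rather than the single value 1 lets the test tell a mixed order apart from the two common ones.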

Feb 12 '06 #25

Kenny McCormack

In article <43******@news.bea.com>,
Rod Pemberton <do*********@sorry.bitbucket.cmm> wrote:

"stathis gotsis" <st***********@hotmail.com> wrote in message
news:ds***********@ulysses.noc.ntua.gr...

So, i come to the conclusion that shifting operations hide endianness from
the programmer. Maybe one could reveal endianness this way?

.... There are two standard methods to determine endianness. See the code below.
There are big-endian and little-endian machines. Old 16-bit little-endian
machines (VAX, PDP-11) became middle-endian in a 32-bit world. Will the
same happen to little-endian 32-bit Intel CPUs in a 64-bit world? And what
will they call it? Middle-middle-endianness?

Not portable. Can't discuss it here. Blah, blah, blah.

(Like mental illness and union SREGS...)

Feb 12 '06 #26

Keith Thompson

Joe Wright <jo********@comcast.net> writes:
[...]

Very good Robin. You have nailed it well. Our conventional number system
is said to be 'Arabic'. This is not because we might recognize the digit
4 in Arabic, but because numbers are written right to left, low order
first with place and value reserved for the concept of zero.

<OT>
Yes, our numbering system is called "Arabic" (or "Hindu-Arabic") --
but it's precisely because the Europeans adopted the system from the
Arabs, including the appearance of the digits.

Google "Hindu-Arabic numbers" for details.

Our confusion over big-endian vs. little-endian numeric
representations probably goes back to the fact that Arabic is written
right-to-left, most European languages are written left-to-right, but
Europe adopted Arabic numbers without changing the order in which
they're written. (I'm not 100% certain on that last point.)
</OT>

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Feb 12 '06 #27

stathis gotsis

"Rod Pemberton" <do*********@sorry.bitbucket.cmm> wrote in message
news:43******@news.bea.com...

There are two standard methods to determine endianness. See the code below.
There are big-endian and little-endian machines. Old 16-bit little-endian
machines (VAX, PDP-11) became middle-endian in a 32-bit world. Will the
same happen to little-endian 32-bit Intel CPUs in a 64-bit world? And what
will they call it? Middle-middle-endianness?

Wiki on Endianness
http://en.wikipedia.org/wiki/Endianness

#include <stdio.h>
union { long Long; char Char[sizeof(long)]; } u;

int main (void)
{
/* Method 1 */
int x = 1;

if ( *(char *)&x == 1)
printf("Register addressing is right-to-left,"
" LSB stored in memory first, \n "
"or little endian - memory addressing "
"decreases from MSB to LSB.\n");
else
printf("Register addressing is left-to-right,"
" MSB stored in memory first, \n "
"or big endian - memory addressing "
"increases from MSB to LSB .\n");

/* Method 2 */
u.Long = 1;
if (u.Char[0] == 1)
printf("Register addressing is right-to-left,"
" LSB stored in memory first, \n "
"or little endian - memory addressing "
"decreases from MSB to LSB.\n");
else if (u.Char[sizeof(long)-1] == 1)
printf("Register addressing is left-to-right, "
"MSB stored in memory first, \n "
"or big endian - memory addressing "
"increases from MSB to LSB .\n");
else printf("Addressing is strange\n");

return(0);
}

/* MSB - most significant byte */
/* LSB - least significant byte */
/* */
/* little endian - the endian (i.e.,LSB) is at the little address in memory */
/* memory addressing decreases from MSB to LSB */
/* LSB stored in memory first */
/* register addressing is right-to-left */
/* big endian - the endian (i.e.,LSB) is at the big address in memory */
/* memory addressing increases from MSB to LSB */
/* MSB stored in memory first */
/* register addressing is left-to-right */
/* */
/* big little endian order */
/* 0123 0123 memory addresses 0,1,2,3 */
/* ASDF ASDF memory order, character A stored at 0, etc... */
/* */
/* ASDF FDSA register order, MSB...LSB */
/* M..L M..L MSB is bits 24-31 */
/* S..S S..S LSB is bits 0-7 */
/* B..B B..B */
/* */
/* big endian order improves string processing and */
/* eliminates the need for special string instructions */
/* thereby reducing the instruction set for the cpu (RISC) */
/* but arithmetic and branching need more circuitry to adjust */
/* for the changing location of the LSB depending on data size */
/* Words, or double words can be loaded into a register */
/* and the string byte ordering remains the same */
/* little endian order improves arithmetic and branching by */
/* keeping the LSB in the same location independent of data */
/* size and eliminates the need for extra math circuitry */
/* but needs string instructions which creates larger */
/* instruction set for the cpu (CISC) */
/* Words, or double words loaded into a register using */
/* integer instructions (non string instructions) reverses */
/* the string byte order i.e., AS -> SA, ASDF -> FDSA */

Thank you very much Rod.

Feb 12 '06 #28

Joe Wright

Keith Thompson wrote:

Joe Wright <jo********@comcast.net> writes:
[...]
Very good Robin. You have nailed it well. Our conventional number system
is said to be 'Arabic'. This is not because we might recognize the digit
4 in Arabic, but because numbers are written right to left, low order
first with place and value reserved for the concept of zero.

<OT>
Yes, our numbering system is called "Arabic" (or "Hindu-Arabic") --
but it's precisely because the Europeans adopted the system from the
Arabs, including the appearance of the digits.

Google "Hindu-Arabic numbers" for details.

Our confusion over big-endian vs. little-endian numeric
representations probably goes back to the fact that Arabic is written
right-to-left, most European languages are written left-to-right, but
Europe adopted Arabic numbers without changing the order in which
they're written. (I'm not 100% certain on that last point.)
</OT>

<OT>
If you look at the ten Arabic numerals in Arabic, you will not recognize
many if any. And none of it has to do with endianness. That's a
Lilliputian thing about eggs and which end of them to open. :-)
</OT>

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---

Feb 13 '06 #29

Robin Haigh

"stathis gotsis" <st***********@hotmail.com> wrote in message
news:ds***********@ulysses.noc.ntua.gr...

"Robin Haigh" <ec*****@leeds.ac.uk> wrote in message
news:ds**********@news8.svr.pol.co.uk...
You need two different orderings before you can discuss how they relate to
each other.

When you store a 16-bit unsigned integer value into 2 bytes of
byte-addressable memory (and this didn't arise before byte-addressing), by
common custom and convention (but no absolute rule) you encode it base-256,
i.e. the byte values you store will be x/256 and x%256.

On that assumption, you now have an ordering by significance -- one byte is
the "big" byte -- and also an ordering by memory address, so you can talk
about which byte (by significance) is the low-address byte, i.e. endianness.

Yes, i was curious if there are other encodings in real systems, other than
this common one, leading to other possibilities for endianness.

The issue that the standard goes out of its way to mention is padding bits.
Presumably there was a reason for this. Padding bits are hidden in the
integer value, but exposed in the byte-values.

So, once you've decided to allow for padding bits, it would be tortuous and
pointless to try to say anything else about byte-encoding. Arbitrary
padding makes the issue much more general -- byte-swapping and bit-shuffling
are reduced to special cases. The byte-values can be much more complex
functions of the integer value being encoded, so you may as well just say
they can be any reversible function, regardless of what hardware is out
there. Code will be portable so long as it doesn't try to access the bytes
of the object representation by address (other than for simple copying), and
it won't if it does.

With the bits involved in bitwise operations, you have an ordering by
significance, but only that. There's no low-address end or left-hand end or
any other positional description. You can certainly access the LSB, but
every way of doing so refers to it by significance, essentially. So you
can't talk about relative bit-ordering, because you can't see anything for
it to be relative to.

So, i come to the conclusion that shifting operations hide endianness from
the programmer.

I would have said that endianness disappears when you fetch values from
byte-addressed memory into the processor. For an operation not to "hide
endianness", you would have to have some way of saying which is the "left"
or perhaps "leading" end independently of significance.

Maybe one could reveal endianness this way?

#include <stdio.h>

int main(void)
{
int i=0;
unsigned int a = 0xabcdabcd;
unsigned char *b;
b=(unsigned char *)&a;

while (i<sizeof(a))
{
printf("%d byte: %x\n",i+1,b[i]);
i++;
}

return 0;
}

Yes, you can do that. The output will be platform-dependent, or worse if
there are padding bits. You will get clues (though not totally unambiguous
information) about the way your platform splits an unsigned int into bytes
and the order of those bytes in memory.

You won't learn anything about the bitwise physical storage of byte-values
in memory bytes or unsigned values in the processor. Bits are only exposed
to the programmer as powers of two ordered by significance. (Obviously
these "value bits" correspond to hardware bits in all normal hardware. But
the standard doesn't actually specify the hardware or depend on it being
normal -- on a really bizarre architecture, e.g. not based on powers of 2,
the real bits would have to be hidden and the value bits of the standard
would have to be emulated)

Highly non-portable code uses the above type-punning method to write binary
values into fixed external data formats such as network packets. As several
people said earlier, you can do the equivalent job portably using either
arithmetic operations or shift/mask operations.
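
The arithmetic route might look like the sketch below, which writes a value into a fixed external format with the least significant byte first (the layout from the original question). The names put_u32_le and get_u32_le are made up for this example, and only values below 2^32 are handled.

```c
/* Write the low 32 bits of v into out[0..3], least significant byte
   first, as four base-256 digits obtained with % and /. */
void put_u32_le(unsigned char out[4], unsigned long v)
{
    int i;

    for (i = 0; i < 4; i++) {
        out[i] = (unsigned char)(v % 256);
        v /= 256;
    }
}

/* Read the value back by accumulating the digits from the most
   significant one (in[3]) down to the least significant one (in[0]). */
unsigned long get_u32_le(const unsigned char in[4])
{
    unsigned long v = 0;
    int i;

    for (i = 3; i >= 0; i--)
        v = v * 256 + in[i];
    return v;
}
```

Nothing here takes the address of the integer, so the bytes produced depend only on the value and on the chosen external format, not on the platform's object representation.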

--
RSH

Feb 13 '06 #30

CBFalconer

Keith Thompson wrote:

.... snip ...
Our confusion over big-endian vs. little-endian numeric
representations probably goes back to the fact that Arabic is
written right-to-left, most European languages are written
left-to-right, but Europe adopted Arabic numbers without
changing the order in which they're written. (I'm not 100%
certain on that last point.) </OT>

I am five and thirty percent sure of something or other. I.e.
European languages have not been devoid of endian wars in the past.

--
"The power of the Executive to cast a man into prison without
formulating any charge known to the law, and particularly to
deny him the judgement of his peers, is in the highest degree
odious and is the foundation of all totalitarian government
whether Nazi or Communist." -- W. Churchill, Nov 21, 1943

Feb 13 '06 #31

CBFalconer

Robin Haigh wrote:

.... snip ...
Highly non-portable code uses the above type-punning method to
write binary values into fixed external data formats such as
network packets. As several people said earlier, you can do the
equivalent job portably using either arithmetic operations or
shift/mask operations.

That's because shift operations are defined in terms of values, and
multiplication or division by 2. That's also why you should limit
shift operations to unsigned ints.
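
For unsigned operands the equivalence can be checked directly. A small sketch, with helper names invented for the example: for an unsigned x and a shift count n smaller than the width of the type, x << n is x times 2 to the power n (reduced modulo one more than the maximum value), and x >> n is the integer part of x divided by 2 to the power n.

```c
/* Multiply x by 2, n times; unsigned arithmetic wraps around just as
   an unsigned left shift discards the bits shifted out at the top. */
unsigned mul_pow2(unsigned x, unsigned n)
{
    while (n--)
        x *= 2u;
    return x;
}

/* Divide x by 2, n times, discarding the remainder each time. */
unsigned div_pow2(unsigned x, unsigned n)
{
    while (n--)
        x /= 2u;
    return x;
}
```

With signed operands the picture is less tidy: left-shifting a negative value is undefined, and right-shifting one gives an implementation-defined result, which is the reason for sticking to unsigned types.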

--
"The power of the Executive to cast a man into prison without
formulating any charge known to the law, and particularly to
deny him the judgement of his peers, is in the highest degree
odious and is the foundation of all totalitarian government
whether Nazi or Communist." -- W. Churchill, Nov 21, 1943

Feb 13 '06 #32

Keith Thompson

CBFalconer <cb********@yahoo.com> writes:

Keith Thompson wrote:

... snip ...

Our confusion over big-endian vs. little-endian numeric
representations probably goes back to the fact that Arabic is
written right-to-left, most European languages are written
left-to-right, but Europe adopted Arabic numbers without
changing the order in which they're written. (I'm not 100%
certain on that last point.) </OT>

I am five and thirty percent sure of something or other. I.e.
European languages have not been devoid of endian wars in the past.

Yoda fan club member of I am.

Or, to quote a bumper sticker I saw a reference to some years ago:

4TH [HEART] IF HONK THEN

Once I get that time machine I keep talking about, I'm going to get
the original CPU designers together and get them to settle on one
consistent endianness. I don't care which, just pick one.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Feb 13 '06 #33

ozbear

On Sun, 12 Feb 2006 23:21:33 GMT, Keith Thompson <ks***@mib.org>
wrote:

Joe Wright <jo********@comcast.net> writes:
[...]
Very good Robin. You have nailed it well. Our conventional number system
is said to be 'Arabic'. This is not because we might recognize the digit
4 in Arabic, but because numbers are written right to left, low order
first with place and value reserved for the concept of zero.

<OT>
Yes, our numbering system is called "Arabic" (or "Hindu-Arabic") --
but it's precisely because the Europeans adopted the system from the
Arabs, including the appearance of the digits.

Google "Hindu-Arabic numbers" for details.

Our confusion over big-endian vs. little-endian numeric
representations probably goes back to the fact that Arabic is written
right-to-left, most European languages are written left-to-right, but
Europe adopted Arabic numbers without changing the order in which
they're written. (I'm not 100% certain on that last point.)
</OT>

Slightly backwards...while Arabic text is written right to left,
numbers in Arabic are written left to right.

oz
--
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Feb 13 '06 #34
