Working with Endianness 
March 14th, 2006, 03:05 PM
Let's say you want to write fully portable code that will be writing files
or sending data, and the data is text encoded as 16-bit Unicode.
Endianness comes into play. I'm writing code at the moment which determines
the endianness of the architecture, and then converts values to big-endian if
they need to be converted.
At the moment my code is based around:
typedef unsigned short uint16;
union {
uint16 entire_integral_value;
unsigned char bytes[2];
};
I set the value of each individual byte, and then test the value of
"entire_integral_value" to see what way it's arranged in memory.
I realise that it's Undefined Behaviour to store a value in a union via one
type, and then access it via another type, but it's the only idea I have at
the moment.
Any other ways of determining endianness?
(By the way, I think it's bullshit to not be able to access an unsigned
integral type's bytes via a union...)
-Tomás
March 14th, 2006, 03:15 PM
Re: Working with Endianness
Tomás <NULL@NULL.NULL> wrote:
> typedef unsigned short uint16;
>
> union {
> uint16 entire_integral_value;
> unsigned char bytes[2];
> };
>
> I set the value of each individual byte, and then test the value of
> "entire_integral_value" to see what way it's arranged in memory.
>
> I realise that it's Undefined Behaviour to store a value in a union
> via one type, and then access it via another type, but it's the only
> idea I have at the moment.
>
> Any other ways of determining endianness?
Copying the contents of an int into a large enough array of unsigned
char (or char?) using memcpy (and accessing the char array afterwards)
should be valid. Don't quote me on that tho .. it's just in my mind
having read about this before.
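A minimal sketch of that memcpy idea, for illustration only. The helper name is_little_endian is made up, and the sketch assumes unsigned short is wider than one byte:

```cpp
#include <cstring>   // std::memcpy

// Hypothetical helper: returns true when the least significant byte of a
// multi-byte integer is stored at the lowest address.
bool is_little_endian()
{
    const unsigned short value = 0x0102;  // assumption: wider than one byte
    unsigned char bytes[sizeof value];

    // Copy the object representation into a byte array. Reading the array
    // afterwards is well defined, unlike reading an inactive union member.
    std::memcpy(bytes, &value, sizeof value);

    return bytes[0] == 0x02;
}
```

The check only distinguishes "low byte first" from everything else, so it says nothing useful about exotic mixed byte orders.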
hth
--
jb
(reply address in rot13, unscramble first)
March 14th, 2006, 03:25 PM
Re: Working with Endianness
Tomás wrote:
> Let's say you want to write fully portable code that will be writing files
> or sending data, and the data is text encoded as 16-bit Unicode.
Do you mean UTF-16 or UCS-2 or something else?
> Endianness comes into play. I'm writing code at the moment which
> determines the endianness of the architecture, and then converts values to
> big-endian if they need to be converted.
Text that uses encodings with two or more bytes should start with a byte
order mark.
> At the moment my code is based around:
>
> typedef unsigned short uint16;
>
> union {
> uint16 entire_integral_value;
> unsigned char bytes[2];
> };
>
> I set the value of each individual byte, and then test the value of
> "entire_integral_value" to see what way it's arranged in memory.
>
> I realise that it's Undefined Behaviour to store a value in a union via
> one type, and then access it via another type, but it's the only idea I
> have at the moment.
>
> Any other ways of determining endianness?
For the data itself, I think there is not much you can do other than rely on
the byte order mark or let the user specify the endianness somehow.
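A rough sketch of the byte order mark check, assuming the input is already known to be UTF-16 (the sequence FF FE is also the start of a UTF-32 little-endian mark, which this does not try to tell apart). The names ByteOrder and detect_utf16_bom are invented for the example:

```cpp
#include <cstddef>

enum class ByteOrder { Big, Little, Unknown };

// Hypothetical helper: inspects the first two bytes of a UTF-16 stream.
ByteOrder detect_utf16_bom(const unsigned char* data, std::size_t size)
{
    if (size < 2)
        return ByteOrder::Unknown;
    if (data[0] == 0xFE && data[1] == 0xFF)
        return ByteOrder::Big;      // U+FEFF stored high byte first
    if (data[0] == 0xFF && data[1] == 0xFE)
        return ByteOrder::Little;   // U+FEFF stored low byte first
    return ByteOrder::Unknown;      // no mark: the caller has to be told
}
```

When the result is Unknown, the byte order has to come from somewhere else, for example a user-supplied option.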
> (By the way, I think it's bullshit to not be able to access an unsigned
> integral type's bytes via a union...)
Well, a union was never meant for something like this. And there is always
reinterpret_cast.
March 14th, 2006, 03:35 PM
Re: Working with Endianness
Jakob Bieling wrote:
> Tomás <NULL@NULL.NULL> wrote:
>
>>typedef unsigned short uint16;
>>
>>union {
>> uint16 entire_integral_value;
>> unsigned char bytes[2];
>>};
>>
>>I set the value of each individual byte, and then test the value of
>>"entire_integral_value" to see what way it's arranged in memory.
>>
>>I realise that it's Undefined Behaviour to store a value in a union
>>via one type, and then access it via another type, but it's the only
>>idea I have at the moment.
>>
>>Any other ways of determining endianness?
>
> Copying the contents of an int into a large enough array of unsigned
> char (or char?) using memcpy (and accessing the char array afterwards)
> should be valid. Don't quote me on that tho .. it's just in my mind
> having read about this before.
Yep, I see no problem with that either. The union trick OTOH is illegal.
V
--
Please remove capital As from my address when replying by mail
March 14th, 2006, 06:15 PM
Re: Working with Endianness
Tomás wrote:
> Let's say you want to write fully portable code that will be writing files
> or sending data, and the data is text encoded as 16-bit Unicode.
Why not choose UTF-8 then? That way you don't have to deal with byte
order here. BTW, UTF-8 is a de facto standard on Linux and, I guess,
other Unixes, so your fully portable code is portable across them
from the start.
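To illustrate why byte order drops out, here is a small sketch that turns one code point into UTF-8 bytes using only shifts and masks. The helper name encode_utf8_bmp is hypothetical, and it assumes the input lies in U+0000..U+FFFF and is not a surrogate:

```cpp
#include <cstddef>

// Hypothetical helper: encodes one code point in U+0000..U+FFFF
// (surrogates excluded by assumption) into out; returns the byte count.
std::size_t encode_utf8_bmp(unsigned int cp, unsigned char out[3])
{
    if (cp < 0x80) {                      // one byte: 0xxxxxxx
        out[0] = static_cast<unsigned char>(cp);
        return 1;
    }
    if (cp < 0x800) {                     // two bytes: 110xxxxx 10xxxxxx
        out[0] = static_cast<unsigned char>(0xC0 | (cp >> 6));
        out[1] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
        return 2;
    }
    // three bytes: 1110xxxx 10xxxxxx 10xxxxxx
    out[0] = static_cast<unsigned char>(0xE0 | (cp >> 12));
    out[1] = static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F));
    out[2] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
    return 3;
}
```

The output is a sequence of single bytes in a fixed order, so the same file is produced on any architecture.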
March 16th, 2006, 11:15 AM
Re: Working with Endianness
The following code snippet may help to determine endianness:
int x = 1;
/* Look at the byte stored at the lowest address of x. */
if (*(char *)&x == 1)
    printf("little-endian\n");
else
    printf("big-endian\n");
Thanx,
Rama
March 17th, 2006, 11:05 PM
Re: Working with Endianness
#define LITTLE_ENDIAN 0
#define BIG_ENDIAN 1
int machineEndianness()
{
    int i = 1;
    char *p = (char *) &i;
    if (p[0] == 1) // Lowest address contains the least significant byte
        return LITTLE_ENDIAN;
    else
        return BIG_ENDIAN;
}
It's not very efficient, but still works fine.
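For the original goal of writing 16-bit values in big-endian order, one possible alternative is to skip detection entirely and build the bytes with shifts, which behave the same on every architecture. This is only a sketch; store_be16 and load_be16 are invented names, and uint16 is the typedef from the first post:

```cpp
typedef unsigned short uint16;  // same typedef as in the first post

// Hypothetical helper: writes the low 16 bits of value, high byte first.
void store_be16(uint16 value, unsigned char out[2])
{
    out[0] = static_cast<unsigned char>((value >> 8) & 0xFF);
    out[1] = static_cast<unsigned char>(value & 0xFF);
}

// Hypothetical helper: reads two bytes stored high byte first.
uint16 load_be16(const unsigned char in[2])
{
    return static_cast<uint16>((in[0] << 8) | in[1]);
}
```

Shifts operate on values rather than on memory layout, so no runtime test of the machine is needed.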