How does C handle issues arising out of Endianness?

Hi,
If I am right Endianness is CPU related. I do not know if the
question is right in itself but if it is then how does C handle issues
arising out of Endianness.

I understand that if we pass structures using sockets across platforms,
we need to take care of Endianness issues at the application level. But
for example, for the code using bitwise AND to figure out if a number
is odd or even, how does C know the LSB position?

Thanks,
IC

Dec 17 '06 #1
In************@gmail.com wrote:
> Hi,
> If I am right Endianness is CPU related. I do not know if the
> question is right in itself but if it is then how does C handle issues
> arising out of Endianness.
>
> I understand that if we pass structures using sockets across platforms,
> we need to take care of Endianness issues at the application level. But
> for example, for the code using bitwise AND to figure out if a number
> is odd or even, how does C know the LSB position?

C relies on the implementor to define each operator for each native data
type for each platform. For an example, you could look up the
gcc/config/*/*.md (machine description) files.
Standard C has rules against data type punning under which your odd/even
code would break with a change of endianness. C can't necessarily
prevent you from breaking those rules.
Dec 17 '06 #2
Thanks. I will check it out.

> C relies on the implementor to define each operator for each native data
> type for each platform.

So why does it not do the same with structs? Why should the programmer
take care of it while passing it across platforms? Is it more of a
"rationale" related question?

Thanks,
IC

Dec 17 '06 #3
In************@gmail.com wrote:
> Hi,
> If I am right Endianness is CPU related. I do not know if the
> question is right in itself but if it is then how does C handle issues
> arising out of Endianness.

By ignoring them.

> I understand that if we pass structures using sockets across platforms,
> we need to take care of Endianness issues at the application level. But
> for example, for the code using bitwise AND to figure out if a number
> is odd or even, how does C know the LSB position?

On any particular implementation, the LSB of the unknown
value being tested is in the same position as the LSB of the
constant 1 you are ANDing with it. Problem solved.
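
To make that concrete, here is a minimal sketch (the name is_odd is made up for illustration, it is not from any library). The AND works on the value of n, so nothing in it depends on how the bytes of n are laid out in memory. It takes an unsigned argument on the assumption that you want to stay clear of the representation of negative numbers.

```c
#include <stdbool.h>

/* Hypothetical helper: tests the least significant bit of the VALUE n.
   The constant 1u and n are both values of the same type, so their
   least significant bits line up on every implementation. */
bool is_odd(unsigned int n)
{
    return (n & 1u) != 0;
}
```

Called as is_odd(10u) this yields false and as is_odd(7u) it yields true, on big-endian and little-endian machines alike.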

Problems can occur when you exchange data between dissimilar
implementations, because they may disagree about endianness. They
may disagree about other matters of representation, too: one
platform might represent an int with sixteen bits while the other
uses thirty-two, one might use IEEE floating-point while the other
uses the S/360 format, the two might insert padding in structures
differently, and so on. Endianness is just one of a number of
representational issues you must consider when communicating
between different systems.

One approach that has proven widely useful is to invent a
"wire format" for the data to be exchanged, a format that does
not depend on the peculiarities of the machines. Each machine
then needs two routines: One to read "wire format" and convert
it to native representation, and one to convert the native form
to "wire format." For obvious reasons, many extremely popular
"wire formats" use textual representations: If you want to send
the value forty-two, you transmit the two characters '4' and '2',
possibly followed by a delimiter like '\n' or ';' or some such.
This doesn't solve every possible problem (because the encoding
of characters can also vary from machine to machine), but it solves
a great many of them and usually leaves a fairly tractable remnant
to deal with.
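
As a rough sketch of the two routines for a textual wire format, assuming a single long per message terminated by '\n' (the names encode_long and decode_long are invented for this example):

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write value as decimal text followed by '\n'.
   Returns the number of characters written (including the '\n'),
   or -1 if the buffer is too small. */
int encode_long(char *buf, size_t size, long value)
{
    int n = snprintf(buf, size, "%ld\n", value);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Hypothetical helper: parse decimal text that must end in '\n'.
   Returns 0 on success, -1 on malformed or out-of-range input. */
int decode_long(const char *text, long *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(text, &end, 10);
    if (end == text || *end != '\n' || errno == ERANGE)
        return -1;
    *out = v;
    return 0;
}
```

Neither routine looks at the bytes of the long, so the sender and the receiver may disagree about endianness and integer width without harm, as long as the value fits in a long on the receiving side.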

--
Eric Sosman
es*****@acm-dot-org.invalid
Dec 17 '06 #4
>> I understand that if we pass structures using sockets across platforms,
>> we need to take care of Endianness issues at the application level. But
>> for example, for the code using bitwise AND to figure out if a number
>> is odd or even, how does C know the LSB position?
>
> On any particular implementation, the LSB of the unknown
> value being tested is in the same position as the LSB of the
> constant 1 you are ANDing with it. Problem solved.

Thanks. Now that you have explained it that was pretty stupid of me.

Are shift operators better examples of the question I have?

As in the following snippet (please do let me know if I need to follow
any norms while adding code snippets.)
-------
int x = 10;
int y;

y = x << 2;
-------

Thanks,
IC

Dec 17 '06 #5
In************@gmail.com wrote:
> Hi,
> If I am right Endianness is CPU related. I do not know if the
> question is right in itself but if it is then how does C handle issues
> arising out of Endianness.
>
> I understand that if we pass structures using sockets across
> platforms, we need to take care of Endianness issues at the
> application level. But for example, for the code using bitwise AND to
> figure out if a number is odd or even, how does C know the LSB
> position?

C doesn't, but the implementation creator did.


Brian
Dec 17 '06 #6

<In************@gmail.com> wrote in message
news:11*********************@t46g2000cwa.googlegroups.com...
>>> I understand that if we pass structures using sockets across platforms,
>>> we need to take care of Endianness issues at the application level. But
>>> for example, for the code using bitwise AND to figure out if a number
>>> is odd or even, how does C know the LSB position?
>>
>> On any particular implementation, the LSB of the unknown
>> value being tested is in the same position as the LSB of the
>> constant 1 you are ANDing with it. Problem solved.
>
> Thanks. Now that you have explained it that was pretty stupid of me.
>
> Are shift operators better examples of the question I have?
>
> As in the following snippet (please do let me know if I need to follow
> any norms while adding code snippets.)
> -------
> int x = 10;
> int y;
>
> y = x << 2;

The shift operator assumes that the bits are arrayed from left to right,
with the most significant at the left.
This may or may not have anything to do with the physical location of the
bits in memory. *(unsigned char *)&x; will read the top byte of x, which is
probably either 10 or zero, but could be anything.
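
A small sketch of the difference, with made-up helper names: the shift is defined on the value, while looking at the object through an unsigned char pointer shows whatever layout the implementation uses.

```c
#include <stddef.h>

/* The shift operates on the value: 10 << 2 is 40 on every implementation. */
int shift_left_2(int x)
{
    return x << 2;
}

/* Hypothetical helper: copy the object representation of x into out,
   lowest address first. Which element holds the low-order bits depends
   on the implementation, which is exactly the endianness question. */
void int_bytes(int x, unsigned char out[sizeof(int)])
{
    const unsigned char *p = (const unsigned char *)&x;
    size_t i;

    for (i = 0; i < sizeof x; i++)
        out[i] = p[i];
}
```

On a typical little-endian machine int_bytes(10, b) leaves 10 in b[0]; on a typical big-endian one it ends up in b[sizeof(int) - 1]. shift_left_2(10) is 40 on both.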
--
www.personal.leeds.ac.uk/~bgy1mm
freeware games to download.
Dec 17 '06 #7
Malcolm wrote:
> *(unsigned char *)&x; will read the top byte of x

... if "top" means "lowest addressed"

--
pete
Dec 17 '06 #8

"pete" <pf*****@mindspring.com> wrote in message
> Malcolm wrote:
>> *(unsigned char *)&x; will read the top byte of x
>
> ... if "top" means "lowest addressed"

If Microsoft take over the world they might make us all store our bytes at
the little end.
--
www.personal.leeds.ac.uk/~bgy1mm
freeware games to download.
Dec 17 '06 #9

Just to add, as to how to determine the nature of Endianness,

#define LITTLE_ENDIAN 0
#define BIG_ENDIAN 1
int machineEndianness()
{
    int i = 1;
    unsigned char *p = (unsigned char *) &i;
    if (p[0] == 1) // Lowest address contains the least significant byte
        return LITTLE_ENDIAN;
    else
        return BIG_ENDIAN;
}

Dec 17 '06 #10
> If I am right Endianness is CPU related. I do not know if the
> question is right in itself but if it is then how does C handle issues
> arising out of Endianness.

It's very simple: If you do anything that depends on Endianness,
the result is undefined (or perhaps implementation-defined). The
problem is thrown into the programmer's court NOT to do that. Write
your code so it doesn't depend on endianness.

> I understand that if we pass structures using sockets across platforms,
> we need to take care of Endianness issues at the application level. But
> for example, for the code using bitwise AND to figure out if a number
> is odd or even, how does C know the LSB position?

If you view a value as a value, and not a bunch of bytes, there is
no problem. C knows which end of an int has the least significant
bit, and machine registers might not even be addressable as bytes.
The problem comes when you take a value (potentially multi-byte)
and try to convert it to or from a bunch of bytes. THEN you have
to worry about the problem that there are 24 byte-orders for 4-byte
integers, and 40320 byte-orders for 8-byte integers.
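
One way to do that conversion without ever asking what the host byte order is, sketched here under the assumption that the agreed external order is big-endian and the value is 32 bits wide (put_u32_be and get_u32_be are invented names):

```c
#include <stdint.h>

/* Hypothetical helper: store v as four octets, most significant first.
   Only shifts and masks on the value are used, so the result is the
   same whatever the host byte order is. */
void put_u32_be(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFF);
    out[1] = (unsigned char)((v >> 16) & 0xFF);
    out[2] = (unsigned char)((v >> 8) & 0xFF);
    out[3] = (unsigned char)(v & 0xFF);
}

/* Hypothetical helper: rebuild the value from four octets,
   most significant first. */
uint32_t get_u32_be(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Because the code picks the order itself, it works unchanged on big-endian, little-endian and PDP-11-style hosts.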
Dec 17 '06 #11
> Just to add, as to how to determine the nature of Endianness,
>
> #define LITTLE_ENDIAN 0
> #define BIG_ENDIAN 1

There are 24 possible byte-orders for a 4-byte integer.
Where are the other 22 defines?

At the very least, you should have a NON_ENDIAN define for
neither little-endian nor big-endian. PDP-11s are real.

> int machineEndianness()
> {
>     int i = 1;
>     unsigned char *p = (unsigned char *) &i;
>     if (p[0] == 1) // Lowest address contains the least significant byte
>         return LITTLE_ENDIAN;
>     else
>         return BIG_ENDIAN;
> }
Dec 17 '06 #12
go****@hammy.burditt.org (Gordon Burditt) writes:
>> Just to add, as to how to determine the nature of Endianness,
>>
>> #define LITTLE_ENDIAN 0
>> #define BIG_ENDIAN 1
>
> There are 24 possible byte-orders for a 4-byte integer.
> Where are the other 22 defines?

Helpful.

> At the very least, you should have a NON_ENDIAN define for
> neither little-endian nor big-endian. PDP-11s are real.
>
>> int machineEndianness()
>> {
>>     int i = 1;
>>     unsigned char *p = (unsigned char *) &i;
>>     if (p[0] == 1) // Lowest address contains the least significant byte
>>         return LITTLE_ENDIAN;
>>     else
>>         return BIG_ENDIAN;
>> }

There seems to be an "it doesn't matter in C" answer appearing here which
is as incorrect as it is misleading. Eric seems to have been the only
one to give an answer.

Many systems communicate using C and it isn't too uncommon for endian
issues to crop up.

C does not "take care of it" if bytes or streams of bytes are thrown
down a wire.

The programmer does have to reassemble data accordingly - especially
with user defined structures, packing etc.
Dec 18 '06 #13
Richard wrote:
> [..] C does not "take care of it" if bytes or streams of bytes are thrown
> down a wire.
>
> The programmer does have to reassemble data accordingly - especially
> with user defined structures, packing etc.

...and an implementation of htonl (host to network long) and friends
may be useful, if available on the current system.
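
For illustration, a sketch of how htonl and ntohl might be used, assuming a POSIX system where they are declared in <arpa/inet.h> (the wrappers store_net32 and load_net32 are hypothetical names for this example):

```c
#include <arpa/inet.h> /* htonl, ntohl on POSIX systems */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: put a 32-bit host value into a 4-octet buffer
   in network byte order (big-endian). */
void store_net32(unsigned char out[4], uint32_t host_value)
{
    uint32_t net = htonl(host_value);
    memcpy(out, &net, sizeof net);
}

/* Hypothetical helper: read a 32-bit value in network byte order
   from a 4-octet buffer and return it in host order. */
uint32_t load_net32(const unsigned char in[4])
{
    uint32_t net;
    memcpy(&net, in, sizeof net);
    return ntohl(net);
}
```

memcpy is used rather than a pointer cast so that the buffer does not have to be suitably aligned for a uint32_t.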

/Mattan
Dec 18 '06 #14
go****@hammy.burditt.org (Gordon Burditt) writes:
>> Just to add, as to how to determine the nature of Endianness,
>>
>> #define LITTLE_ENDIAN 0
>> #define BIG_ENDIAN 1
>
> There are 24 possible byte-orders for a 4-byte integer.
> Where are the other 22 defines?
>
> At the very least, you should have a NON_ENDIAN define for
> neither little-endian nor big-endian. PDP-11s are real.

That's true in principle. In real life, though, there are only two or
three possible endiannesses: big-endian, little-endian, and
PDP-11-endian -- and you're not likely to run into the last of these.

And you also have to allow for the possibility that you don't *have*
4-byte integers. On some DSPs, for example, an int is one byte (and a
byte is at least 16 bits); on such a system, int has no endianness.

It's a good idea to check explicitly for both big-endian and
little-endian, but it's probably not necessary to handle other cases
other than by bailing out. For example:

#include <limits.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
#if CHAR_BIT != 8
#error "CHAR_BIT != 8, I'm not prepared to cope with that."
#endif
    unsigned char arr[4] = { 0x12, 0x34, 0x56, 0x78 };
    uint32_t n;

    /* memcpy avoids the alignment and aliasing problems of *(uint32_t*)arr */
    memcpy(&n, arr, sizeof n);
    if (n == 0x12345678) {
        printf("big-endian\n");
    }
    else if (n == 0x78563412) {
        printf("little-endian\n");
    }
    else {
        fprintf(stderr, "Unable to determine endianness, n == 0x%lx\n",
                (unsigned long)n);
        exit(EXIT_FAILURE);
    }

    return 0;
}

Adjust as needed if your system doesn't support <stdint.h>.

(The first time I tried this, I had forgotten to #include <limits.h>.
CHAR_BIT quietly expanded to 0, giving me a very unexpected result.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Dec 18 '06 #15
"Mattan" <ne******@spoofaddress.com> wrote in message
news:PP*******************@newsb.telia.net...
> Richard wrote:
>> [..] C does not "take care of it" if bytes or streams of bytes are
>> thrown down a wire.
>>
>> The programmer does have to reassemble data accordingly - especially
>> with user defined structures, packing etc.
>
> ...and an implementation of htonl (host to network long) and friends
> may be useful, if available on the current system.

Any system which has sockets available should have htonl() et al as
well.

To forestall any complaints that sockets are OT here, note that the same
exact issue exists when you try to write any object to a file in binary
mode. You have to define file/wire formats when working with binary
data, and that includes the number of bits and endianness for each
field. In the sockets world, the unit of transport is the octet (always
8 bits), not the byte (which varies in size), and "network byte order"
is defined as big-endian. File formats have no such conventions. Using
the same convention as sockets makes your life easier if you're on a
system that has sockets available (which is nearly all, these days)
since you get ntohl() et al for free, but a huge number of file formats
(and non-IETF network protocols) from the DOS/Windows world use
little-endian storage.
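
For a format that specifies little-endian fields, the same idea applies with the octets taken in the other order. A minimal sketch, assuming 16-bit and 32-bit unsigned fields (get_u16_le and get_u32_le are names invented here, not part of any library):

```c
#include <stdint.h>

/* Hypothetical helper: decode a 16-bit little-endian field,
   least significant octet first. */
uint16_t get_u16_le(const unsigned char in[2])
{
    return (uint16_t)((unsigned)in[0] | ((unsigned)in[1] << 8));
}

/* Hypothetical helper: decode a 32-bit little-endian field. */
uint32_t get_u32_le(const unsigned char in[4])
{
    return  (uint32_t)in[0]        | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}
```

Read the field from the file into an unsigned char buffer first and decode it afterwards; the decoding then gives the same answer on a big-endian host as on a little-endian one.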

Text, of course, is the safest format for interchange, provided you know
what encoding is used for the characters. Unfortunately, one still has
to deal with EBCDIC vs ASCII and all the various multibyte encodings for
Unicode, so figuring out how to read a text file with the right encoding
has become as much a challenge as dealing with binary data -- and slower
to boot. The only remaining advantage is that it's easier for humans to
debug (or, in the case of files, modify) the messages.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking
--
Posted via a free Usenet account from http://www.teranews.com

Dec 18 '06 #16
In article <11**********************@80g2000cwy.googlegroups.com>
In************@gmail.com <In************@gmail.com> wrote:
> If I am right Endianness is CPU related.

Others have already discussed most of the practical issues. I would
like to point out that endianness is not really "CPU related" at all
though.

Suppose you are getting ready to move from one apartment to another.
Your friend has offered you *free* use of his small pickup truck,
so that you need not rent a huge van.

There is one problem: your bed will not fit, fully assembled, into
the pickup.

Fortunately, your bed comes apart, into three pieces: headboard,
middle section, and footboard. Each of those pieces will, by
itself, fit in the truck. So you take the bed apart:

    ||||
    ||||                     |||
    ||||    =============    |||
    ||||    =============    |||
  headboard middle section footboard

At the other end, your friend will reassemble the bed while you
drive back to get more stuff. You bring him the headboard, then
the footboard, then the middle, because that was the easiest way
to take them out:

    ||||
    ||||    |||
    ||||    |||    =============
    ||||    |||    =============

Then you drive back to your old place to get more stuff.

Your friend, for some reason, believes that you delivered the
footboard first, then the middle, then the headboard. So he connects
the pieces in that order. But you delivered the headboard first,
so he put that where the footboard goes, then you delivered the
footboard, which he put in the middle, and last, you delivered the
middle, which he put at the head:

                    ||||
                 |||||||
    =============|||||||
    =============|||||||

Your bed is no longer use-able, until you take it apart and
re-reassemble it in the correct order. The problem is that you
and your friend failed to agree on "endianness". (Well, that, and
your friend is about as smart as a typical computer: he only does
what you tell him, instead of what you meant.[%]) But there is no
CPU in sight. So where did the "endianness" come from?

It came from disagreement between various entities -- in this case,
you and your friend -- that dis-assembled something (here, your
bed), then re-assembled it, but did not connect the same pieces in
the same way. To avoid the problem, you must make sure that all
entities involved in disassembly and reassembly agree as to which
sub-parts go where.

If you (and of course your friends too) never break a whole object
up into parts, the problem never occurs. (Transport the bed as a
single unit, it arrives as a single unit, still in "bed" shape.)
The problem occurs only when you *do* break something into parts.
Even then, it occurs only if you put it back together in some other
way. If you and your friends can all agree on some basic,
un-break-able sub-unit -- such as, say, the 8-bit byte -- and you
make sure never to give out anything "too big" so that your friends
have to break them up, *you* can control the order of breaking-up
and re-assembling, and therefore guarantee that the re-assembly
always follows the same sequencing rules as the breaking-up.
-----
[%] The student programmer's lament:

I really hate this darn* machine
I wish that they would sell it
It never does quite what I want
But only what I tell it.

* or other suitable one-syllable word
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Dec 18 '06 #17
In************@gmail.com wrote:
> Thanks. I will check it out.

Check what out? Please leave relevant context.

>> C relies on the implementor to define each operator for each native data
>> type for each platform.
>
> So why does it not do the same with structs? Why should the programmer
> take care of it while passing it across platforms? Is it more of a
> "rationale" related question?

C takes care of structs *on the same platform*. The C standard does not
address cross-platform issues so it's the programmer's problem.

Note you not only have endian problems, but also fundamental types'
sizes, floating point representations, character sets and struct
padding may all vary. Pointers are a complete no-no.
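
As a sketch of what that means in practice, here is a made-up record written field by field into a fixed layout instead of being sent as a raw struct. The struct, the 5-octet layout and the function names are all assumptions for the example:

```c
#include <stdint.h>

/* Hypothetical record; in memory the compiler may insert padding
   between the fields, so sizeof(struct reading) is not portable. */
struct reading {
    unsigned char sensor;
    uint32_t      value;
};

/* Layout chosen for this sketch: 1 octet sensor, then 4 octets value,
   most significant octet first. No padding, no host byte order. */
enum { READING_WIRE_SIZE = 5 };

void reading_pack(unsigned char out[READING_WIRE_SIZE], const struct reading *r)
{
    out[0] = r->sensor;
    out[1] = (unsigned char)((r->value >> 24) & 0xFF);
    out[2] = (unsigned char)((r->value >> 16) & 0xFF);
    out[3] = (unsigned char)((r->value >> 8) & 0xFF);
    out[4] = (unsigned char)(r->value & 0xFF);
}

void reading_unpack(struct reading *r, const unsigned char in[READING_WIRE_SIZE])
{
    r->sensor = in[0];
    r->value = ((uint32_t)in[1] << 24) | ((uint32_t)in[2] << 16)
             | ((uint32_t)in[3] << 8)  |  (uint32_t)in[4];
}
```

The external form is always five octets, whatever padding or byte order the compiler uses for struct reading in memory.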

That's just what I thought of off the top of my head; there will be
other stuff.

Take a look at XDR, ASN.1 and XML for portable data formats.

--
Nick Keighley

Unicode is an international standard character set that can be used
to write documents in almost any language you're likely to speak,
learn or encounter in your lifetime, barring alien abduction.
(XML in a Nutshell)

Dec 18 '06 #18
On 18 Dec 2006 08:28:19 GMT, Chris Torek <no****@torek.net> wrote:
> In article <11**********************@80g2000cwy.googlegroups.com>
> In************@gmail.com <In************@gmail.com> wrote:
>> If I am right Endianness is CPU related.
>
> Others have already discussed most of the practical issues. I would
> like to point out that endianness is not really "CPU related" at all
> though.
>
> Suppose you are getting ready to move from one apartment to another. <snip>
> Fortunately, your bed comes apart, into three pieces: <snip>
> At the other end, your friend will reassemble the bed while you
> drive back to get more stuff. <snip>
> Your friend, for some reason, believes that you delivered the
> [pieces in a different order and reassembles obviously wrongly]
> The problem is that you
> and your friend failed to agree on "endianness". (Well, that, and
> your friend is about as smart as a typical computer: he only does
> what you tell him, instead of what you meant.[%]) <snip>

It's a good thing you're the one driving; I'd hate to see what this
Turing-machine-brained friend does when faced with say a bent or
obscured traffic control sign. (Aside: I lived near Boston back about
1980 when the originally Californian law, allowing by default right
turn after stop at a red light if no traffic, was adopted -- or at
least its adoption 'encouraged' -- Federally as a gasoline saving
measure. So the city went around putting up 'no turn on red' signs
pretty much everywhere. One intersection near me was already signed
'no left turn' AND 'no right turn' and they added 'no turn on red'!)

> If you (and of course your friends too) never break a whole object
> up into parts, the problem never occurs. (Transport the bed as a
> single unit, it arrives as a single unit, still in "bed" shape.)

If it and everything else in the same load is tied down adequately;
otherwise it may arrive in an arbitrary but substantial number of
pieces, none bed-shaped, and probably not reassemble-able at all. FWIW
_this_ problem rarely happens with userlevel computer data; although
it can and does occur in hardware, devices and systems mostly are
designed with error detection and correction features (parity, CRC,
LRC, VRC, EDC, ECC, etc.) which lead the program to see either (1)
correct data as sent/stored/whatever or (2) no data at all, sometimes
but not always with a more-or-less specific error indicator.

<snip rest>

- David.Thompson1 at worldnet.att.net
Jan 3 '07 #19
