Big-endian, little-endian and sizeof() in different systems

Javier

Hello people,
I'm recoding a library that I made a few months ago, and now that I'm
reading what I wrote I have some questions.

My program reads black and white images from a bitmap (BMP 24bpp
without compression). It has its words and dwords stored in
little-endian, so I do a conversion to big-endian when reading full
words or dwords.
I have done this because my system is big-endian.
But now... what if one compiles the library on a little-endian system?

And... I use char (which I have read is equal to unsigned short
int) as 'byte'.
And this is the other question: is sizeof(char) always a 'byte'?
How can I define byte, word and dword (8, 16, 32 bits) without making
the assumption that sizeof(char) is a byte (8 bits)?

Thanks.

Jun 15 '07 #1


dasjotre

On 15 Jun, 14:57, Javier <jjeron...@gmail.com> wrote:

> Hello people,
> I'm recoding a library that I made a few months ago, and now that I'm
> reading what I wrote I have some questions.
>
> My program reads black and white images from a bitmap (BMP 24bpp
> without compression). It has its words and dwords stored in
> little-endian, so I do a conversion to big-endian when reading full
> words or dwords.

Look at the hton* and ntoh* functions, which are mostly used in networking.

> I have done this because my system is big-endian.
> But now... what if one compiles the library on a little-endian system?

If you use hton* and ntoh* when reading the files, you will have no
problem.

> And... I use char (which I have read is equal to unsigned short
> int) as 'byte'.
> And this is the other question: is sizeof(char) always a 'byte'?
> How can I define byte, word and dword (8, 16, 32 bits) without making
> the assumption that sizeof(char) is a byte (8 bits)?

sizeof(char) is always 1.
You could use stdint.h. It's a C header (Boost has cstdint.hpp too).
It defines fixed-width types like intXX_t, where XX is the number of
bits.

regards

DS

Jun 15 '07 #2

Andre Kostur

Javier <jj*******@gmail.com> wrote in news:1181915876.026011.283520@q66g2000hsg.googlegroups.com:

> Hello people,
> I'm recoding a library that I made a few months ago, and now that I'm
> reading what I wrote I have some questions.
>
> My program reads black and white images from a bitmap (BMP 24bpp
> without compression). It has its words and dwords stored in
> little-endian, so I do a conversion to big-endian when reading full
> words or dwords.
> I have done this because my system is big-endian.
> But now... what if one compiles the library on a little-endian system?

Drifting somewhat off-topic (endianness is a platform-specific issue).
Many systems have some sort of include file which will define a macro
which will tell you the endianness of the platform that you're compiling
for. Using that knowledge you can construct a function which converts
from little-endian to the endianness of the platform that you're on.
(I'd suggest using the hton* and ntoh* family of functions, but those go
between host and big-endian.) So anytime you need to be concerned about
the endianness, you can pass the value to your function to convert it
from little-endian to host-endian (which means that on some platforms
your function does nothing, and on others it does the byte flip).
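A rough sketch of such a conversion function, for illustration only. It assumes a compiler that predefines __BYTE_ORDER__ (GCC and Clang do; other compilers need their own check), 8-bit bytes, and a uint32_t type. The names byte_swap32 and little_to_host32 are made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reverses the four bytes of a 32-bit value.
inline std::uint32_t byte_swap32(std::uint32_t v)
{
    return ((v & 0x000000FFu) << 24)
         | ((v & 0x0000FF00u) <<  8)
         | ((v & 0x00FF0000u) >>  8)
         | ((v & 0xFF000000u) >> 24);
}

// Converts a value whose bytes were copied straight from a little-endian
// file into the host's representation.
// Assumption: __BYTE_ORDER__ is the GCC/Clang predefined macro.
inline std::uint32_t little_to_host32(std::uint32_t v)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    return v;                 // host already matches the file: nothing to do
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return byte_swap32(v);    // host is big-endian: flip the bytes
#else
#   error "Byte order unknown: add a check for this platform"
#endif
}

inline void check_little_to_host32()
{
    // 0x12345678 as it would appear in a little-endian file.
    const unsigned char stored[4] = { 0x78, 0x56, 0x34, 0x12 };
    std::uint32_t raw;
    std::memcpy(&raw, stored, sizeof raw);   // bytes in file order
    assert(little_to_host32(raw) == 0x12345678u);
}
```

The check gives the same result on little-endian and big-endian hosts, which is the point of hiding the flip behind one function.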

> And... I use char (which I have read is equal to unsigned short
> int) as 'byte'.

Where did you read that? All the standard says:

sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)

(And certain minimum range constraints.) On most platforms that I've dealt
with, sizeof(char) is 1, and sizeof(short) is 2.

> And this is the other question: is sizeof(char) always a 'byte'?
> How can I define byte, word and dword (8, 16, 32 bits) without making
> the assumption that sizeof(char) is a byte (8 bits)?

Use platform-specific includes. (Or more recent C headers, IIRC). Or
find some sort of portability layer library. Some compilers define
things like uint8_t, uint16_t and the like. Or libraries such as ACE
define ACE_UINT32, ACE_UINT64, and that sort of thing.

Jun 15 '07 #3

dasjotre

On 15 Jun, 16:53, Andre Kostur <nntps...@kostur.net> wrote:

> Where did you read that? All the standard says:
>
> sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)
>
> (And certain minimum range constraints) In most platforms that I've dealt
> with, sizeof(char) is 1, and sizeof(short) is 2.

sizeof(char) must be 1 regardless of actual implementation.

regards

DS

Jun 15 '07 #4

James Kanze

On Jun 15, 3:57 pm, Javier <jjeron...@gmail.com> wrote:

> I'm recoding a library that I made a few months ago, and now that I'm
> reading what I wrote I have some questions.
>
> My program reads black and white images from a bitmap (BMP 24bpp
> without compression). It has its words and dwords stored in
> little-endian, so I do a conversion to big-endian when reading full
> words or dwords.
> I have done this because my system is big-endian.
> But now... what if one compiles the library on a little-endian system?

The endian-ness of the internal representation shouldn't make a
difference. You deal with values, not with physical
representation. Basically, to read little endian, you use
something like:

    uint32_t
    read( uint8_t const* buffer )
    {
        return (uint32_t( buffer[ 0 ] )      )
            || (uint32_t( buffer[ 1 ] ) <<  8)
            || (uint32_t( buffer[ 2 ] ) << 16)
            || (uint32_t( buffer[ 3 ] ) << 24) ;
    }

Works regardless of the byte order. (I've seen at least 3
different byte orders for 32 bit integers.)

> And... I use char (which I have read is equal to unsigned short
> int) as 'byte'.

That's generally not true. On most machines today (there are a
few exceptions), char is 8 bits; short must be at least 16.
Also, very often, char is signed. I tend to avoid it for that
reason as well; shifting signed values doesn't always work as
expected.
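To illustrate the signedness point: a minimal sketch of reading a 16-bit little-endian value from a plain char buffer, converting each char to unsigned char before shifting. The name read_le16 is made up for this example, and the code assumes 8-bit bytes and a uint16_t type.

```cpp
#include <cassert>
#include <cstdint>

// Reads a 16-bit value stored least-significant byte first.
// Each char is converted to unsigned char first, so a byte such as 0x92
// is treated as 146 rather than as a negative number on platforms where
// plain char is signed.
inline std::uint16_t read_le16(const char* p)
{
    const unsigned char lo = static_cast<unsigned char>(p[0]);
    const unsigned char hi = static_cast<unsigned char>(p[1]);
    return static_cast<std::uint16_t>(lo | (hi << 8));
}

inline void check_read_le16()
{
    const char stored[2] = { static_cast<char>(0x34), static_cast<char>(0x92) };
    assert(read_le16(stored) == 0x9234u);
}
```

Without the conversion, a signed char holding 0x92 would be promoted to a negative int, and the shift and OR would not give 0x9234.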

> And this is the other question: is sizeof(char) always a 'byte'?

That's the definition in the standard: char is a byte.
Sizeof(char) is guaranteed to be 1. As I said above, on most
machines today, it is 8 bits. The standard requires at least 8
bits, although in the past, 6 and 7 bit bytes were common (as
were 9 and 10 bits). From what I have heard, some DSPs define
char to have 32 bits, with all of the integral types having a
sizeof of 1. That is also legal.
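A small sketch of how these guarantees can be checked in code. CHAR_BIT from <climits> gives the number of bits in a char; the helper name bits_in is made up for this example, and static_assert and constexpr assume C++11 or later.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Guaranteed by the standard on every conforming implementation.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

// Number of bits in the storage of T (this counts any padding bits too).
template <typename T>
constexpr std::size_t bits_in()
{
    return sizeof(T) * CHAR_BIT;
}

inline void check_bits_in()
{
    assert(bits_in<char>() == CHAR_BIT);
    assert(bits_in<short>() >= 16);   // short must hold at least 16 bits
}
```

On a DSP of the kind described above, bits_in<char>() would be 32 while sizeof(char) is still 1.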

> How can I define byte, word and dword (8, 16, 32 bits) without making
> the assumption that sizeof(char) is a byte (8 bits)?

How portable do you want to be? C has a header, <stdint.h>,
which conditionally defines a certain number of integral types
with fixed, exact length, i.e. uint8_t is an unsigned integral
type with exactly 8 bits, int32_t is a signed, 2's complement
integral type with exactly 32 bits, etc. If the underlying
hardware doesn't support the type, it is not defined.
Regrettably, support for this header seems to be rather spotty.
But it's not too difficult to knock up your own version; put it
in an isolated, system-dependent directory, where you know that
you have to adapt it each time you port to a new machine.

As I said, however, the presence of the definitions is
conditioned on the existence of the actual types. Not every
machine around today uses 8 bit bytes, and not every machine
uses 2's complement. Still, for many applications, portability
to only those machines that do is a quite acceptable
restriction.
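For example, a minimal sketch of such definitions, assuming the implementation provides <cstdint> with the exact-width types. The namespace bmp and the names byte, word and dword are made up for this example; they follow the 8/16/32-bit meaning from the original question, not the IBM usage.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Fails to compile on machines without 8-bit bytes, which makes the
// portability restriction explicit instead of silent.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

namespace bmp {                     // hypothetical namespace
    typedef std::uint8_t  byte;     // exactly 8 bits
    typedef std::uint16_t word;     // exactly 16 bits
    typedef std::uint32_t dword;    // exactly 32 bits
}

inline void check_bmp_types()
{
    assert(sizeof(bmp::byte)  * CHAR_BIT ==  8);
    assert(sizeof(bmp::word)  * CHAR_BIT == 16);
    assert(sizeof(bmp::dword) * CHAR_BIT == 32);
}
```

If a platform lacks one of the exact-width types, the typedef simply does not compile, and this is the one header to adapt.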

(And BTW: a word is normally 32 bits, and a dword 64. 16 bits
is a hword, at least in IBM-speak.)

--
James Kanze (Gabi Software) email: ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jun 16 '07 #5

Gavin Deane

On 16 Jun, 11:57, James Kanze <james.ka...@gmail.com> wrote:

> The endian-ness of the internal representation shouldn't make a
> difference. You deal with values, not with physical
> representation. Basically, to read little endian, you use
> something like:
>
>     uint32_t
>     read( uint8_t const* buffer )
>     {
>         return (uint32_t( buffer[ 0 ] )      )
>             || (uint32_t( buffer[ 1 ] ) <<  8)
>             || (uint32_t( buffer[ 2 ] ) << 16)
>             || (uint32_t( buffer[ 3 ] ) << 24) ;
>     }

Did you mean | instead of || there?
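With the bitwise | the function does combine the four bytes. With the logical || it can only return 0 or 1, since || yields a bool. A sketch of the version with |, assuming <cstdint> is available; the name read_le32 is made up here to avoid clashing with the POSIX read().

```cpp
#include <cassert>
#include <cstdint>

// Reads a 32-bit value stored least-significant byte first.
// It works on values only, so the host's own byte order never matters.
inline std::uint32_t read_le32(const std::uint8_t* buffer)
{
    return  static_cast<std::uint32_t>(buffer[0])
         | (static_cast<std::uint32_t>(buffer[1]) <<  8)
         | (static_cast<std::uint32_t>(buffer[2]) << 16)
         | (static_cast<std::uint32_t>(buffer[3]) << 24);
}

inline void check_read_le32()
{
    // 0x12345678 as it would appear in a little-endian file.
    const std::uint8_t stored[4] = { 0x78, 0x56, 0x34, 0x12 };
    assert(read_le32(stored) == 0x12345678u);
}
```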

Gavin Deane

Jun 16 '07 #6

Gennaro Prota

On Fri, 15 Jun 2007 13:57:56 -0000, Javier wrote:

> Hello people,
> I'm recoding a library that I made a few months ago, and now that I'm
> reading what I wrote I have some questions.
>
> My program reads black and white images from a bitmap (BMP 24bpp
> without compression). It has its words and dwords stored in
> little-endian, so I do a conversion to big-endian when reading full
> words or dwords.
> I have done this because my system is big-endian.
> But now... what if one compiles the library on a little-endian system?

As James Kanze pointed out you don't have to worry about the internal
representation used by your C++ implementation, only the external
representation of the values. Unfortunately, that's a point that few
people seem to understand (after I explained it in the corresponding
talk page, for instance, someone still added a totally bogus
"determining the byte order" example to the Endianness entry of the
English Wikipedia).

If the GNU GPL version 2 isn't a problem for you then you may find
this useful:
<http://breeze.svn.sourceforge.net/viewvc/breeze/trunk/breeze/endianness/endian_codec.hpp>

(Since my library aims at being generally useful, any feedback is much
appreciated. NOTE: I haven't committed the file width.hpp yet. If you
are dealing with unsigned types only, then you can implement it as
    #include "breeze/meta/constant.hpp"
    #include <limits>

    namespace breeze {
    namespace meta {

        template< typename T >
        class width
            : public constant< T, std::numeric_limits< T >::digits >
        {
        };
    }
    }

If needed, you can also add a #include <cstddef> and this
specialization

    template< typename T, std::size_t n >
    class width< T[ n ] >
        : public constant< std::size_t, n * width< T >::value >
    {
    };

which will allow you to work with built-in arrays as well. Well, this
is untested, I just typed it in the newsreader window, but it should
work :-))
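For readers without the breeze headers, a self-contained sketch of the same idea. It is an adaptation, not the library's code: std::integral_constant (C++11) stands in for breeze's constant, and the value type is std::size_t throughout. As above, it is meant for unsigned types only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>

// Width in bits of an unsigned integer type, taken from numeric_limits.
template <typename T>
struct width
    : std::integral_constant<std::size_t, std::numeric_limits<T>::digits>
{
};

// Built-in arrays: n elements, each width<T> bits wide.
template <typename T, std::size_t n>
struct width<T[n]>
    : std::integral_constant<std::size_t, n * width<T>::value>
{
};

static_assert(width<std::uint32_t>::value == 32, "uint32_t has 32 value bits");
static_assert(width<std::uint8_t[4]>::value == 32, "4 x 8 bits");
```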

--
Gennaro Prota -- Need C++ expertise? I'm available
https://sourceforge.net/projects/breeze/
(replace 'address' with 'name.surname' to mail)

Jun 16 '07 #7
