Please someone test this on a Big-Endian System

I want to see if this code works the way it should on a big-endian
system. Also, if anyone has any ideas on how to determine this at
compile time so that I use the right decoding or encoding functions, I
would greatly appreciate the help.

Thanks,
Ché
#include <iostream>

int main( int argc, char* argv[] )
{
    // Default system to little endian
    bool isLittleEndian = true;

    // Check whether this platform is big-endian or little endian
    wchar_t a = L'a';
    unsigned char* testChar = reinterpret_cast<unsigned char*>( &a );

    // Big Endian should display nothing on output here
    std::cout << (unsigned char*) testChar << std::endl;

    if( testChar == 0 )
    {
        isLittleEndian = false;

        // Big Endian should display "Big Endian Success" here
        std::cout << "Big Endian Success" << std::endl;
    }

    return 0;
}

Jul 3 '06 #1

It might work, but it might also not. You are assuming that char and
wchar_t are different types. This may not always be the case. You also assume
that enough high bits of L'a' are zero to make a big-endian system think a
char* pointing to a wchar_t actually points to an empty string. Then you are
using reinterpret_cast in a way that is undefined (or unspecified?)
behaviour (casting between pointers to unrelated types always is). And
finally, a pointer to a local variable will never be 0, so "testChar == 0" will
never be true, no matter which byte order the system is using (if any).

To test for endianness you should

1) Test whether CHAR_BIT (defined in <climits>) is equal to 8. Endianness
is only defined for systems internally using octets. If CHAR_BIT is not
equal to 8, you cannot access octets on that system, at least not in an easy
way.

2) Test whether sizeof(wchar_t) == 2. Endianness is only defined for pairs of
octets. So, if wchar_t is not a pair of octets, you have to think about
something else.

3) Assign a well-known value to a wchar_t variable. (L'a' is not a well-known
value. There is a good chance that it will be 0x0061, but it might be
something completely different.) Use something like 0xFEFF instead. (0xFEFF
is the Unicode byte-order mark, but other values will do, too.) Then get the
values of the two chars (octets) occupying the same space as the variable and
compare them with 0xFE and 0xFF:

wchar_t wc = 0xFEFF;
unsigned char const* cp = reinterpret_cast<unsigned char*>(&wc);
if (cp[0] == 0xFE && cp[1] == 0xFF)
{
    // Big-Endian
}
else if (cp[0] == 0xFF && cp[1] == 0xFE)
{
    // Little-Endian
}
else
{
    // Something completely different
}

Alas, that code also depends on a cast of pointer to unrelated types.
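
One way to sidestep the pointer cast is to copy the bytes out with std::memcpy. The following is only a minimal sketch of the three steps above, not code from the thread: it assumes a fixed-width std::uint16_t in place of wchar_t, and the names ByteOrder, DetectByteOrder16 and HostIsLittleEndian16 are made up for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <cstring>

// Hypothetical names for illustration; they are not from the thread.
enum class ByteOrder { Big, Little, Unknown };

// Step 1: the test only makes sense when a byte is an octet.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

inline ByteOrder DetectByteOrder16()
{
    // Step 2: std::uint16_t (assumed here instead of wchar_t) is exactly 16 bits.
    const std::uint16_t probe = 0xFEFF;
    static_assert(sizeof(probe) == 2, "probe must occupy two octets");

    // Step 3: copy the object representation out rather than casting pointers.
    unsigned char bytes[sizeof(probe)];
    std::memcpy(bytes, &probe, sizeof(probe));

    if (bytes[0] == 0xFE && bytes[1] == 0xFF) return ByteOrder::Big;
    if (bytes[0] == 0xFF && bytes[1] == 0xFE) return ByteOrder::Little;
    return ByteOrder::Unknown;
}

// Convenience wrapper for callers that only handle the two usual orders.
inline bool HostIsLittleEndian16()
{
    const ByteOrder order = DetectByteOrder16();
    assert(order != ByteOrder::Unknown);
    return order == ByteOrder::Little;
}
```

Calling DetectByteOrder16() once at start-up and caching the result should be enough, since the answer cannot change while the program runs.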

But why do you need to know the endianness of the system your program runs on?
Usually you only have to convert from one byte order to another when
you are reading from an external source (file, network connection) or
writing to such a destination. And in those situations you can easily convert
between the external format and the format used in the program without knowing
the byte order of the system itself. You only have to know the external byte
order. Then you can convert in a portable way.

To read a Unicode (UTF-16) string, read the string into an array of bytes
(unsigned char will probably be a good choice on most systems, but add a
test that CHAR_BIT is really equal to 8). Then convert pairs of those octets
into values of a type large enough to hold a UTF-16 code unit:

if (ExternalFormatIsLittleEndian)
{
    for (int i = 0; i < BytesRead; i += 2)
        internalString[i / 2] = externalString[i] + 256 * externalString[i + 1];
}
else
{
    for (int i = 0; i < BytesRead; i += 2)
        internalString[i / 2] = externalString[i] * 256 + externalString[i + 1];
}

Before you write internal data to an external destination, you must of course
convert your internal representation to the external one, but again you can
do so without knowing the internal byte order. You only have to know how
bytes should be arranged outside your program.
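
The writing direction can be sketched the same way. This is a hedged illustration rather than code from the thread: EncodeUnits and DecodeUnits are hypothetical helper names, 16-bit code units are assumed to be held in std::uint16_t, and bytes are assumed to be octets. Only shifts and masks on values are used, so the host byte order never enters the picture.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: serialise 16-bit code units in the requested
// external byte order, using value arithmetic only.
inline std::vector<unsigned char> EncodeUnits(const std::vector<std::uint16_t>& units,
                                              bool externalIsLittleEndian)
{
    std::vector<unsigned char> bytes;
    bytes.reserve(units.size() * 2);
    for (std::size_t i = 0; i < units.size(); ++i)
    {
        const unsigned char low  = static_cast<unsigned char>(units[i] & 0xFF);
        const unsigned char high = static_cast<unsigned char>(units[i] >> 8);
        bytes.push_back(externalIsLittleEndian ? low : high);
        bytes.push_back(externalIsLittleEndian ? high : low);
    }
    return bytes;
}

// Hypothetical helper: the inverse of EncodeUnits.
inline std::vector<std::uint16_t> DecodeUnits(const std::vector<unsigned char>& bytes,
                                              bool externalIsLittleEndian)
{
    assert(bytes.size() % 2 == 0); // code units come in pairs of octets
    std::vector<std::uint16_t> units;
    units.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i < bytes.size(); i += 2)
    {
        const unsigned low  = externalIsLittleEndian ? bytes[i] : bytes[i + 1];
        const unsigned high = externalIsLittleEndian ? bytes[i + 1] : bytes[i];
        units.push_back(static_cast<std::uint16_t>(low | (high << 8)));
    }
    return units;
}
```

Encoding a buffer and decoding it again with the same flag should give back the original code units on any host, which makes a convenient self-check.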

HTH
Heinz

Jul 3 '06 #2

I really appreciate your help. There was at least one silly mistake, as
I copied and added the code to main without testing. You are
completely correct on some of the issues that I was unaware of. My
desire to do this was born out of uncertainty. I want to write
portable Unicode handling functions that work directly with, say,
a const wchar_t* literal such as L"Hello World" that a person types
casually into C++, without worry.

Thank you for your help.

Jul 3 '06 #3
ThazKool posted:

#include <iostream>

int main( int argc, char* argv[] )
{
    // Default system to little endian
    bool isLittleEndian = true;

    // Check whether this platform is big-endian or little endian
    wchar_t a = L'a';
    unsigned char* testChar = reinterpret_cast<unsigned char*>( &a );

    // Big Endian should display nothing on output here
    std::cout << (unsigned char*) testChar << std::endl;

Oh good lord Jesus no!

There are perfectly portable ways of doing this, and this is not one of
them!

Check out some code I posted recently on comp.std.c++

http://groups.google.ie/group/comp.s...4a21366?hl=en&
--

Frederick Gotham
Jul 3 '06 #4

> Oh good lord Jesus no!

Please do not use the name of the Lord in vein.
Jul 3 '06 #5

"Gernot Frisch" <Me@Privacy.net> wrote in message
news:4g*************@individual.net...
>> Oh good lord Jesus no!
>
> Please do not use the name of the Lord in vein.

In "vein"? As in intravenously? I think you meant "in vain", as in "unless
you REALLY mean it!" :-)

-Howard
Jul 3 '06 #6
Gernot Frisch posted:

>> Oh good lord Jesus no!
>
> Please do not use the name of the Lord in vein.

Sorry.

I myself am not religious, and so have no quibble with such exclamations.

However, I realise that contributors to the group may be religious, and as
I have no desire to offend any of you, I will refrain from any such future
religious references, be they positive or negative.

--

Frederick Gotham
Jul 3 '06 #7

For me, I am spiritual and not blinded by religion. Your exclamation,
from what I gather, was said with love and tolerance for the group and
the curious. It was not associated with any resentment or other ill
spirits. If anything, I am blessed by your contribution. Being
spiritual, I will not harbor any resentment for your actions. Even
though I am Catholic, I know that many of my brethren harbor resentment
toward what they judge as unholy. The only thing I can say is do as Jesus
does. If Jesus were here, he would probably only say "Why not?", and
after seeing your contribution he would be pleased. There has been no
harm done here. I appreciate your code, Frederick, and I have a
question for you.

I shortened some of it to test for endianness. Your code is more robust
and gives you the actual byte order. Is the byte order
something that I should be concerned about, or can I get away with my
shortened code? I guess since I am only dealing with wchar_t in my
code I will be OK. Also, I only want this code to work on compilers
where sizeof( wchar_t ) == 2. I am not concerned about the rest.
Are there any CPUs that have non-sequential byte orders out there? I am
curious, and thanks once again.

template<typename T>
inline bool IsLittleEndian()
{
    // Set only the most significant byte: a 32-bit type shifts left by 24.
    // The shift is done on a T so that types wider than int work too.
    T testType = static_cast<T>( static_cast<T>( 1 ) << ( sizeof( T ) * 8 - 8 ) );

    // Get the first (lowest-addressed) byte of the type
    const unsigned char *firstByte =
        reinterpret_cast<const unsigned char*>( &testType );

    // Little endian stores the most significant byte last,
    // so the first byte is zero
    return *firstByte == 0;
}

template<typename T>
inline bool IsBigEndian()
{
    // Set only the most significant byte: a 32-bit type shifts left by 24.
    // The shift is done on a T so that types wider than int work too.
    T testType = static_cast<T>( static_cast<T>( 1 ) << ( sizeof( T ) * 8 - 8 ) );

    // Get the first (lowest-addressed) byte of the type
    const unsigned char *firstByte =
        reinterpret_cast<const unsigned char*>( &testType );

    // Big endian stores the most significant byte first,
    // so the first byte is non-zero
    return *firstByte != 0;
}

Jul 3 '06 #8
ThazKool posted:
> The only thing I can say is do as Jesus does. If Jesus were here, he
> would probably only say "Why not?", and after seeing your contribution
> he would be pleased.

I was with you up until the point where you made things subjective by
injecting your own religious beliefs.

People from all over the world view this newsgroup; people of different
nationalities, different religions, different cultures, different socio-
economic standings -- and they shouldn't have to read Christianity-specific
praise akin to your praising of Jesus above, nor should they have had to
read my original exclamation.

My own view is that the newsgroup should be kept free of religion PERIOD --
that means no religion-related exclamations (e.g. Jesus No!), no
religion-specific greetings (e.g. As-Salamu Alaykum).

If you're religious / spiritual / philosophical, then that's great -- but
please keep it to yourself on this newsgroup. In communication channels
like this one, such things divide more people than they unite.

> I shortened some of it to test for endianness. Your code is more robust
> and gives you the actual byte order. Is the byte order
> something that I should be concerned about, or can I get away with my
> shortened code?

If an unsigned integer consists of 4 bytes, then the number of possible
arrangements is the factorial of 4, i.e. 4!, which is 24.

My own code allows for all 24 arrangements. (Actually, it allows for any
number of bytes too, and thus any number of arrangements.)
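
To make the idea of "arrangements" concrete, here is a minimal sketch that reports which significance rank each stored byte of an unsigned has. It is not the code referred to above (that code is not reproduced in this thread); ByteArrangement is a hypothetical name, and the sketch assumes 8-bit bytes and an unsigned type without padding bits.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstring>
#include <limits>
#include <vector>

// Hypothetical helper. Element i of the result is the significance rank
// (0 = least significant) of the byte stored at address offset i.
inline std::vector<std::size_t> ByteArrangement()
{
    static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");
    static_assert(sizeof(unsigned) * CHAR_BIT ==
                      static_cast<std::size_t>(std::numeric_limits<unsigned>::digits),
                  "this sketch assumes unsigned has no padding bits");

    const std::size_t n = sizeof(unsigned);

    // Give the byte of rank k the value k + 1, e.g. 0x04030201 for 4 bytes.
    unsigned probe = 0;
    for (std::size_t k = 0; k < n; ++k)
        probe |= static_cast<unsigned>(k + 1) << (k * CHAR_BIT);

    // Copy out the object representation and read the ranks back.
    unsigned char bytes[sizeof(unsigned)];
    std::memcpy(bytes, &probe, sizeof(probe));

    std::vector<std::size_t> ranks(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        const std::size_t value = bytes[i];
        assert(value >= 1 && value <= n);
        ranks[i] = value - 1;
    }
    return ranks;
}
```

With a 4-byte unsigned, a little-endian machine would yield 0, 1, 2, 3 and a big-endian machine 3, 2, 1, 0; any of the other 22 permutations would indicate a mixed byte order.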

The code could be simplified if it only had to distinguish between Big-
endian and Little-endian.

First though, two things must be assured:

(1) The integer type contains no padding.
(2) The integer type consists of at least 2 bytes.

I shall use boost's static assert to make sure of these things (
http://www.boost.org/doc/html/boost_staticassert.html ).

#include <climits>   // CHAR_BIT
#include <iostream>
#include <limits>
#include <boost/static_assert.hpp>

enum Endianness { BigEndian, LittleEndian };

Endianness DetermineEndianness()
{
    /* First, ensure that there's no padding: */

    BOOST_STATIC_ASSERT( sizeof(unsigned) * CHAR_BIT
                         == std::numeric_limits<unsigned>::digits );

    /* Now ensure that there's at least 2 bytes: */

    BOOST_STATIC_ASSERT( sizeof(unsigned) >= 2 );

    /* Now it's safe to play! */

    unsigned i = 1;

    /* The first byte is 1 (LittleEndian) on a little-endian machine
       and 0 (BigEndian) on a big-endian one. */
    return static_cast<Endianness>( reinterpret_cast<char&>(i) );
}

int main()
{
    std::cout << "This machine is: ";

    switch( DetermineEndianness() )
    {
    case LittleEndian:

        std::cout << "Little-endian.\n";
        break;

    case BigEndian:

        std::cout << "Big-endian.\n";
    }
}

> I guess since I am only dealing with wchar_t in my
> code I will be OK.

No, no, no.

> Also, I only want this code to work on compilers
> where sizeof( wchar_t ) == 2.

Fair enough, but as I have demonstrated, there's no need to go the
non-portable route.

> Are there any CPUs that have non-sequential byte orders out there?

Yes, and here's info on them:

http://en.wikipedia.org/wiki/Endianness
--

Frederick Gotham
Jul 3 '06 #9

I agree with you wholeheartedly. Separation of church from state;
separation of church from code. Any suggestion can be taken or left
alone. Spirituality is an attribute an atheist, an agnostic, or a
religious fanatic can have. It is not exclusive to the religious, as
most people think. I have researched the real meanings of these words
where most have not. I apologize for not properly directing my
Catholic, Jesus comment. It was directed at the one who blasted you.
That one did not realize that he was not following his own doctrine. I
disagree, though, with keeping spiritual principles out of the newsgroup.
I do agree with keeping religion out, and you are correct
that it divides more than it unifies. Spiritual principles of love
and tolerance unite people, such as "Help your brother coders and show
your appreciation when they help you". This benefits the group, but
most people don't understand the difference between spirituality and
religion. They think that they are one and the same, and they are not.
Well, I do have to really and truly thank you. You have given me great
wisdom that I have had difficulty ascertaining from web browsing. I am
indebted to you.

Much thanks and much appreciation,
Ché

Jul 4 '06 #10
