#include <string.h>
int i; /* 4-byte == 4-char */
char data[] = { 0x78, 0x56, 0x34, 0x12 };
int main(void)
{
    memcpy(&i, data, 4);
    /*
     * Thinking about endianness, what can be said about
     * the value of i according to the C-spec?
     */
    return 0;
}
/* Thanks for listening! Case */
Nov 14 '05
"Case" <no@no.no> wrote in message
> #include <string.h>
> int i; /* 4-byte == 4-char */
> char data[] = { 0x78, 0x56, 0x34, 0x12 };
> int main()
> {
> memcpy(&i, data, 4);
> /*
> * Thinking about endianness, what can be said about
> * the value of i according to the C-spec?
> */
> }
> /* Thanks for listening! Case */
> How many different values can i have given code above? With value I mean a number at C level, not implementation level.
In terms of existing implementations, probably about a dozen. Usually
numbers will be big- or little-endian and in two's complement notation, so
for practical purposes the answer is two. However, you could run into
non-two's-complement machines, machines where there are 9 bits in a byte,
and all sorts of other wonderful variations.
Christian Bau wrote:
> In article <40*********************@news.xs4all.nl>, Case <no@no.no> wrote:
>> #include <string.h>
>> int i; /* 4-byte == 4-char */
>> char data[] = { 0x78, 0x56, 0x34, 0x12 };
>> int main()
>> {
>> memcpy(&i, data, 4);
>> /*
>> * Thinking about endianness, what can be said about
>> * the value of i according to the C-spec?
>> */
>> }
> Nothing.
I agree.
I believe what is missing in all of the
discussions is what endianness _is_.
In simple terms, it is the relationship between the CPU
and its memory. The above code example will, on any
architecture/platform it's run on, ALWAYS do the
following (assuming sizeof (int) == 4 for sake of argument):
*((char *)(&i) + 0) = data[ 0 ];
*((char *)(&i) + 1) = data[ 1 ];
*((char *)(&i) + 2) = data[ 2 ];
*((char *)(&i) + 3) = data[ 3 ];
However, how the CPU interprets the bytes now contained
in the variable "i" is where the concept of its endianness
comes in. An Intel CPU will see the ordering of the
bytes _differently_ than a SPARC CPU (or a 68040, etc.).
The code snippet will produce identical results _in
memory_ on all architectures where sizeof (int) is four;
however, there is nothing to say that each architecture
will interpret the arrangement of the bytes in the same way.
See the htonl() man page, etc., for more details.
HTH...
Stephen
Case wrote:
| Lew Pitcher wrote:
|
|> -----BEGIN PGP SIGNED MESSAGE-----
|> Hash: SHA1
|>
|> Case wrote:
|>
|>> #include <string.h>
|>>
|>> int i; /* 4-byte == 4-char */
|>> char data[] = { 0x78, 0x56, 0x34, 0x12 };
|>>
|>> int main()
|>> {
|>> memcpy(&i, data, 4);
|>
|> First off, sizeof(i) may not be equal to 4. So, this may or may not do
|> what you expect it to do.
|
|
| Yes, I know. That's why I said i is '4-byte == 4-char'.
No. sizeof(int) is 4 if the *compiler* says it is. Your word doesn't count
here at all. And we haven't seen anything from the compiler to indicate that
sizeof(int) == 4
|>
|>> /*
|>> * Thinking about endianness, what can be said about
|>> * the value of i according to the C-spec?
|>> */
|>
|>
|> Nothing can be said about the value of i.
|> 1) you may or may not have set the value of i to a known quantity. If
|> sizeof(i) is greater than 4, then you didn't set i's storage completely,
|> and if sizeof(i) is less than 4, then some of your initialization was not
|> used to set i (and overwrote something else instead)
|
|
| It's 4 as I said (see above).
See above. It's not 4 on your word.
| And, doesn't the C standard say that
| 'global' data (as i is) is initialized to 0?!
So? We're not talking about /before/ you memcpy(). We're talking about /after/
you memcpy()
Think of it this way. If, unlike you, your compiler believes that
sizeof(int) == 2, then your memcpy() of 4 bytes over a 2-byte int just wiped
out two additional bytes somewhere. Your int only holds the first two bytes of
the 4 byte array that you used to init with, and that value might be
interpreted /either/ in big-endian /or/ little-endian format.
OTOH, if (unlike you) your compiler believes that sizeof(int) == 8, then your
memcpy() of 4 bytes over an 8-byte int only placed data into four of the eight
bytes. The other four bytes are not touched. So, we now have an int in which
four bytes are known quantities, but that can be interpreted in one of 8! ways
(big-endian and little-endian being two of those ways). So, even knowing the 4
bytes (and by inference from the rules, all 8 bytes) we can't tell what the
value of your int is.
|> 2) the standard doesn't specify how an integer is to map into a
|> character array.
|> It doesn't specify a particular endianness for integers.
|
|
--
Lew Pitcher
Master Codewright & JOAT-in-training | GPG public key available on request
Registered Linux User #112576 (http://counter.li.org/)
Slackware - Because I know what I'm doing.
Lew Pitcher wrote:
> Case wrote:
>> Yes, I know. That's why I said i is '4-byte == 4-char'.
> No. sizeof(int) is 4 if the *compiler* says it is. Your word doesn't count here at all. And we haven't seen anything from the compiler to indicate that sizeof(int) == 4
Yes, you are correct. All I meant was: 'Assuming that my compiler sees
an int as a 4-byte entity and a char as a 1-byte entity, what is the
result of ...' BTW, why doesn't anyone question the sizeof char in
my example? Is char perhaps *silently* assumed to be a byte?
Assuming my question is clear now, how should I have coded my example
unambiguously (without the use of comments)?
Martin Dickopp wrote:
> Case <no@no.no> writes:
>> /* Thinking about endianness, what can be said about the value of i according to the C-spec? */
> A signed integer has a sign bit, a number of value bits (each of which has a value that is an integral power of two), and possibly padding bits. The standard does not impose any rule how the bits have to be arranged. For example, in the special case of `int' having 31 value bits and no padding bits, there are 263130836933693530167218012160000000 (== 32!) possibilities how to arrange the bits. Three are particularly popular among implementors, so that they have special names: little, big, and mixed endian. The remaining 263130836933693530167218012159999997 don't have any endianness. Therefore, not much can be said about the value of `i' from the perspective of the C standard.
>> How many different values can i have given code above?
> If type `int' has 31 value bits and no padding bits, and bytes have 8 bits, then `i' will have 13 one-bits and 19 zero-bits. The number of values with this property is given by the binomial coefficient "32 choose 13", which is 347373600. That's how many different values `i' can have.
So this means that bit ordering, as defined in the C spec, can be
completely different for int and char (and other basic types)?
>> With value I mean a number at C level, not implementation level.
> I don't know what you mean by "C level" or "implementation level".
At "C level" the bits have a fixed position, for example 0x00000001
can be used to get least significant bit (bit 0) of a 4 byte int;
at implementation level there are (as I understand it from you) 32
possible positions this bit could be.
Case <no@no.no> wrote:
> Lew Pitcher wrote:
>> Case wrote:
>>> Yes, I know. That's why I said i is '4-byte == 4-char'.
>> No. sizeof(int) is 4 if the *compiler* says it is. Your word doesn't count here at all. And we haven't seen anything from the compiler to indicate that sizeof(int) == 4
> Yes, you are correct. All I meant was: 'Assuming that my compiler sees an int as a 4-byte entity and a char as a 1-byte entity, what is the result of ...' BTW, why doesn't anyone question the sizeof char in my example? Is char perhaps *silently* assumed to be a byte?
No. It is _explicitly_ defined to be one byte by the Standard.
Richard
[ BTW, please learn to snip. ]
Case <no@no.no> writes:
> BTW, why doesn't anyone question the sizeof char in my example? Is char perhaps *silently* assumed to be a byte?
Yes, `char' *always* has a size of one byte, so `sizeof(char) == 1' is
always true. However, a byte can have more than 8 bits.
Note that my other answer to you in this thread deals with the special
case that seems to apply to your implementation: 8 bit bytes, 4 byte
`int's with no padding bits.
Martin
--
,--. Martin Dickopp, Dresden, Germany ,= ,-_-. =.
/ ,- ) http://www.zero-based.org/ ((_/)o o(\_))
\ `-' `-'(. .)`-'
`-. Debian, a variant of the GNU operating system. \_/
Richard Bos wrote:
....snip... [ BTW, please learn to snip. ]
Thanks for the info about char size.
Kees
Case <no@no.no> writes:
> So this means that bit ordering, as defined in the C spec, can be completely different for int and char (and other basic types)?
Yes. Although in reality, I have never seen a machine which didn't
either use big endian, little endian, or mixed endian bit order, the
C standard certainly allows others.
>>> With value I mean a number at C level, not implementation level.
>> I don't know what you mean by "C level" or "implementation level".
> At "C level" the bits have a fixed position, for example 0x00000001 can be used to get least significant bit (bit 0) of a 4 byte int; at implementation level there are (as I understand it from you) 32 possible positions this bit could be.
I see. These are usually referred to as "value" and "representation",
respectively. Note that the `memcpy' call sets the /representation/
of `i'.
Martin
--
,--. Martin Dickopp, Dresden, Germany ,= ,-_-. =.
/ ,- ) http://www.zero-based.org/ ((_/)o o(\_))
\ `-' `-'(. .)`-'
`-. Debian, a variant of the GNU operating system. \_/
Richard Bos wrote:
> Case <no@no.no> wrote:
>> Is char perhaps *silently* assumed to be a byte?
> No. It is _explicitly_ defined to be one byte by the Standard.
<sarcasm> Well, that's really going to clear up the OP's confusion.
In C, a byte is a unit of storage large enough to hold a char. By this
definition, similar to that used in the Standard, sizeof(char) == 1
The meaning that many people incorrectly associate with `byte' actually
belongs with `octet'; the latter just happens to be a common choice for
size of the former.
Applying the sizeof operator directly to the `char' type is not harmful
but it is indicative of a grave misunderstanding of the meaning of byte
or character in C, and thus throws doubt on the correctness of all uses
of sizeof by that programmer.
--
++acr@,ka"