endian problem, please help

hi,

it's not really an endian problem. I think I must
be missing something else ...

The problem can be reduced to the different
results of the following two code segments
(cut and pasted verbatim):

1.

*width = (unsigned char) fgetc(fp) +
256 * (unsigned char) fgetc(fp) +
65536 * (unsigned char) fgetc(fp) +
16777216L * (unsigned char) fgetc(fp);

yields *width == 131072, which should have been 512, since
fp points to the byte sequence "00 02 00 00", while

2.
*width =(unsigned char) fgetc(fp);
*width += 256 * (unsigned char) fgetc(fp);
*width += 65536 * (unsigned char) fgetc(fp);
*width += 16777216L * (unsigned char) fgetc(fp);

results in the expected value 512.

What am I missing? It's driving me ....... :(

Thanks for any hint!

Dec 10 '06 #1
On Dec 10, 4:11 pm, "kolmogo...@gmail.com" <kolmogo...@gmail.com>
wrote:
hi,

it's not really an endian problem. I think I must
be missing something else ...

The problem can be reduced to different
results of the following two segments of codes:
(cut and pasted verbatim)

1.

*width = (unsigned char) fgetc(fp) +
256 * (unsigned char) fgetc(fp) +
65536 * (unsigned char) fgetc(fp) +
16777216L * (unsigned char) fgetc(fp);

yields *width == 131072 which should have been 512 for
fp points to the byte sequence of "00 02 00 00" while

2.
*width =(unsigned char) fgetc(fp);
*width += 256 * (unsigned char) fgetc(fp);
*width += 65536 * (unsigned char) fgetc(fp);
*width += 16777216L * (unsigned char) fgetc(fp);

results in the expected value 512.

What am I missing? It's driving me ....... :(

Thanks for any hint!
I'm not 100% sure, but it could be the order of evaluation (first
result is 2*65536). From C89 draft:
"Except as indicated by the syntax{27} or otherwise specified later
(for the function-call operator () , && , || , ?: , and comma
operators), the order of evaluation of subexpressions and the order in
which side effects take place are both unspecified."
--
WYCIWYG - what you C is what you get

Dec 10 '06 #2
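The effect described in the quoted passage can be reproduced without a file. The sketch below is illustrative only: `next_value`, `combined_in_one_expression`, and `combined_in_statements` are hypothetical names, and `next_value` merely stands in for `fgetc()` as a function whose result depends on how many times it has been called.

```c
static int counter = 0;

/* Hypothetical stand-in for fgetc(): returns 1, 2, 3, ... on successive calls. */
static int next_value(void)
{
    return ++counter;
}

/* All three calls sit in one expression, so the compiler may
 * run them in any order it likes. */
int combined_in_one_expression(void)
{
    counter = 0;
    return next_value() + 10 * next_value() + 100 * next_value();
}

/* Separate statements force the calls to run top to bottom. */
int combined_in_statements(void)
{
    int x;

    counter = 0;
    x = next_value();
    x += 10 * next_value();
    x += 100 * next_value();
    return x;
}
```

`combined_in_statements()` always returns 321. `combined_in_one_expression()` may return 321, 123, or any other arrangement of the three digits, depending on the compiler and its options.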
ko********@gmail.com wrote:
hi,

it's not really an endian problem. I think I must
be missing something else ...

The problem can be reduced to different
results of the following two segments of codes:
(cut and pasted verbatim)

1.

*width = (unsigned char) fgetc(fp) +
256 * (unsigned char) fgetc(fp) +
65536 * (unsigned char) fgetc(fp) +
16777216L * (unsigned char) fgetc(fp);

yields *width == 131072 which should have been 512 for
fp points to the byte sequence of "00 02 00 00" while
There's no guarantee that the fgetc() call on the first line gets
called first. It might on some systems, but the calls are allowed to
occur in any order, and on your system, it so happens that that order
is not what you want. Your version with four separate statements does
not have this problem since statements are not allowed to be reordered
(except when the compiler knows it doesn't matter for the result).

Dec 10 '06 #3
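One way to make the read order explicit, sketched under the assumption that the value is stored as four little-endian bytes: fetch each byte in its own statement, then combine the bytes in an expression that has no side effects. The name `read_u32_le` and its return convention are hypothetical, not something from the thread.

```c
#include <stdio.h>

/* Hypothetical helper: reads four bytes in file order, then combines them
 * as a little-endian value. Returns 0 on success, -1 on EOF or read error. */
int read_u32_le(FILE *fp, unsigned long *out)
{
    int b0, b1, b2, b3;

    /* Each statement ends in a sequence point, so the reads happen in order. */
    b0 = fgetc(fp);
    b1 = fgetc(fp);
    b2 = fgetc(fp);
    b3 = fgetc(fp);
    if (b0 == EOF || b1 == EOF || b2 == EOF || b3 == EOF)
        return -1;

    /* No side effects here, so evaluation order no longer matters. */
    *out = (unsigned long)b0
         | ((unsigned long)b1 << 8)
         | ((unsigned long)b2 << 16)
         | ((unsigned long)b3 << 24);
    return 0;
}
```

For the byte sequence "00 02 00 00" this stores 512 in `*out`, whichever order the compiler would have chosen for the single-expression version.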
In article <11**********************@l12g2000cwl.googlegroups.com>,
ko********@gmail.com <ko********@gmail.com> wrote:
>hi,

it's not really an endian problem. I think I must
be missing something else ...

The problem can be reduced to different
results of the following two segments of codes:
(cut and pasted verbatim)
Why not use fread()?

Dec 10 '06 #4
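A rough sketch of what the fread() suggestion could look like, assuming the file stores the value as four little-endian bytes. Reading into an `unsigned char` buffer and combining the bytes by position avoids both the evaluation-order issue and any dependence on the host's byte order. The name `fread_u32_le` is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads a 4-byte little-endian value with one fread() call.
 * Returns 0 on success, -1 if fewer than four bytes could be read. */
int fread_u32_le(FILE *fp, unsigned long *out)
{
    unsigned char buf[4];

    if (fread(buf, 1, sizeof buf, fp) != sizeof buf)
        return -1;

    /* buf[0] is the first byte in the file, regardless of host byte order. */
    *out = (unsigned long)buf[0]
         | ((unsigned long)buf[1] << 8)
         | ((unsigned long)buf[2] << 16)
         | ((unsigned long)buf[3] << 24);
    return 0;
}
```

Because the bytes are combined arithmetically instead of being copied into an integer object, no separate endian conversion step is needed on either kind of host.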

Kenny McCormack wrote:
In article <11**********************@l12g2000cwl.googlegroups.com>,
ko********@gmail.com <ko********@gmail.com> wrote:
hi,

it's not really an endian problem. I think I must
be missing something else ...

The problem can be reduced to different
results of the following two segments of codes:
(cut and pasted verbatim)

Why not use fread()?
Thanks for all prompt answers!

So, it was an incorrect assumption about
the evaluation order, right?

Yes, I sometimes do use fread() followed by an endian
conversion if necessary.

In case the object I'm reading is always written in
either big or little endian, I thought I could save
the endian conversion code this way.

So, do you think my corrected (debugged) version
is ok? I mean, I'd like to know how real experts would do
this.

Dec 10 '06 #5
On Dec 10, 5:07 pm, "kolmogo...@gmail.com" <kolmogo...@gmail.com>
wrote:
<snip>
So, do you think the (debugging) correct version of mine
is ok? I mean, I'd like to kow how would real experts do
this?
I don't know how real experts would do it, but I'd check for errors
returned by fgetc() as well. If it fails, it will return EOF, so check
for that, but _before_ converting to unsigned char. From the IRIX 5.3
man page:
WARNING
If the integer value returned by getc, getchar, or fgetc is stored into a
character variable and then compared against the integer constant EOF,
the comparison may never succeed, because sign-extension of a character
on widening to integer is machine-dependent.
--
WYCIWYG - what you C is what you get

Dec 10 '06 #6
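The man page warning can be illustrated with a small sketch. The result of `fgetc()` is kept in an `int` and compared with `EOF` before any narrowing; if it were stored in a `char` first, a 0xFF data byte could be mistaken for `EOF` where `char` is signed, and `EOF` could never match where `char` is unsigned. The helper name `count_ff_bytes` is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes equal to 0xFF before end of file. */
long count_ff_bytes(FILE *fp)
{
    int c;            /* int, not char: must hold every byte value plus EOF */
    long count = 0;

    while ((c = fgetc(fp)) != EOF) {            /* test against EOF first */
        unsigned char byte = (unsigned char)c;  /* narrow only afterwards */
        if (byte == 0xFF)
            count++;
    }
    return count;
}
```

A file containing the bytes 01 FF 02 FF is read to its end and yields a count of 2; the 0xFF bytes do not cut the loop short.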

matevzb wrote:
On Dec 10, 5:07 pm, "kolmogo...@gmail.com" <kolmogo...@gmail.com>
wrote:
<snip>
So, do you think the (debugging) correct version of mine
is ok? I mean, I'd like to kow how would real experts do
this?
I don't know how real experts would do it, but I'd check for errors
returned by fgetc() as well. If it fails, it will return EOF, so check
for that, but _before_ converting to unsigned char. From the IRIX 5.3
man page:
WARNING
If the integer value returned by getc, getchar, or fgetc is stored into a
character variable and then compared against the integer constant EOF,
the comparison may never succeed, because sign-extension of a character
on widening to integer is machine-dependent.
I envy the manpages you have on IRIX. Wonderful advice. Thanks.

I don't know if I'll be blamed for including non-standard things
(or should I better use long instead of int32_t for maximum
portability) but I'm calling

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

int32_t fget_int32_le(FILE *fp)
{
    uint32_t x, weight;   /* unsigned, so the top byte cannot overflow */
    int c, i;

    for (x=0, weight=1, i=0; i<4; i++)
    {
        c = fgetc(fp);        /* outside assert(), which vanishes under NDEBUG */
        assert ( EOF != c );  /* TODO: to be handled */
        x += weight * (uint32_t) c;
        weight *= 256;
    }
    /*
    x =(unsigned char) fgetc(fp);
    x += 256 * (unsigned char) fgetc(fp);
    x += 65536 * (unsigned char) fgetc(fp);
    x += 16777216L * (unsigned char) fgetc(fp);
    */
    return (int32_t) x;
}

Dec 10 '06 #7
On Dec 10, 6:05 pm, "kolmogo...@gmail.com" <kolmogo...@gmail.com>
wrote:
<snip>
I envy the manpages you have on IRIX. Wonderful advices. Thanks.
<OT>I usually check different systems, but at home I have an IRIX at
hand, so that's the first one to check. The same man pages are
available at
http://techpubs.sgi.com/library/tpl/...db=man&pth=ALL.
You can also google for "man getchar", it usually yields good results
(for different systems)</OT>
I don't know if I'll be blamed for including non-standard things
(or should I better use long instead of int32_t for maximum
portability) but I'm calling

#include <stdint.h>
<stdint.h> and <inttypes.h> are C99-compliant, so you can use them (I'm
not sure how portably, though). int32_t however is a POSIX extension
(http://www.opengroup.org/onlinepubs/.../stdint.h.html)
--
WYCIWYG - what you C is what you get

Dec 10 '06 #8
"ko********@gmail.com" wrote:
>
The problem can be reduced to different results of the following
two segments of codes: (cut and pasted verbatim)

1.
*width = (unsigned char) fgetc(fp) +
256 * (unsigned char) fgetc(fp) +
65536 * (unsigned char) fgetc(fp) +
16777216L * (unsigned char) fgetc(fp);

yields *width == 131072 which should have been 512 for
fp points to the byte sequence of "00 02 00 00" while

2.
*width =(unsigned char) fgetc(fp);
*width += 256 * (unsigned char) fgetc(fp);
*width += 65536 * (unsigned char) fgetc(fp);
*width += 16777216L * (unsigned char) fgetc(fp);

results in the expected value 512.

What am I missing? It's driving me ....... :(
You don't show a complete program, so it's all pure guesswork.
First, you don't need the casts. fgetc returns an int, which is
the unsigned char equivalent of the input char. So get rid of
them. Casts are usually wrong anyhow.

Second, you don't show the type of width. I suspect you have
undefined behaviour due to overflows. The value of "65536 *
fgetc(fp)" and "256 * fgetc(fp)" can exceed the size of an int.
Width should be an unsigned long. 65536 should be 65536L. 256
should be 256L.

You also have the problem of unspecified order of fgetc calls.
Putting it together, you have, with width an unsigned long:

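int ch, n;
unsigned long width, i;

i = 1; width = 0;
/* Count the bytes: with a 32-bit unsigned long, i wraps to 0 after the
   fourth multiplication, so a test on i alone would not end the loop. */
for (n = 0; (n < 4) && (EOF != (ch = fgetc(fp))); n++) {
    width += i * ch; i *= 256;
}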

which won't blow up if fgetc ever returns EOF.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Dec 10 '06 #9
"matevzb" <ma*****@gmail.com> writes:
[...]
<stdint.h> and <inttypes.h> are C99-compliant, so you can use them (I'm
not sure how portably, though). int32_t however is a POSIX extension
(http://www.opengroup.org/onlinepubs/.../stdint.h.html)
int32_t is a standard (but optional) typedef in <stdint.h>, introduced
in C99. (Apparently it's required in POSIX.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Dec 10 '06 #10
On Dec 11, 1:08 am, Keith Thompson <k...@mib.org> wrote:
"matevzb" <mate...@gmail.com> writes:
[...]
<stdint.h> and <inttypes.h> are C99-compliant, so you can use them (I'm
not sure how portably, though). int32_t however is a POSIX extension
(http://www.opengroup.org/onlinepubs/...h.html)
int32_t is a standard (but optional) typedef in <stdint.h>, introduced
in C99. (Apparently it's required in POSIX.)

--
Keith Thompson (The_Other_Keith) k...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Oops, my bad. I incorrectly assumed that since POSIX requires it and
specifies it as an extension to ISO, it wasn't specified in the
Standard. Must check the Standard more often...
--
WYCIWYG - what you C is what you get

Dec 11 '06 #11
"ko********@gmail.com" wrote:
>
hi,

it's not really an endian problem. I think I must
be missing something else ...

The problem can be reduced to different
results of the following two segments of codes:
(cut and pasted verbatim)

1.

*width = (unsigned char) fgetc(fp) +
256 * (unsigned char) fgetc(fp) +
65536 * (unsigned char) fgetc(fp) +
16777216L * (unsigned char) fgetc(fp);

yields *width == 131072 which should have been 512 for
fp points to the byte sequence of "00 02 00 00" while

2.
*width =(unsigned char) fgetc(fp);
*width += 256 * (unsigned char) fgetc(fp);
*width += 65536 * (unsigned char) fgetc(fp);
*width += 16777216L * (unsigned char) fgetc(fp);

results in the expected value 512.

What am I missing? It's driving me ....... :(
It's called (or at least, related to) "sequence points".

There is no guarantee that the four fgetc's in (1) will be
called in the order you want. In fact, it appears that your
compiler is using the exact opposite order.

In (2), you have placed a sequence point between each of the
fgetc calls, and therefore guarantee the order in which they
get called.

This is really no different than:

printf("%d %d %d %d\n",fgetc(fp),fgetc(fp),fgetc(fp),fgetc(fp));
or
int i=0;
printf("%d %d %d\n",i++,i++,i++);

(Though I'm not sure if your code invokes UB, as does my second
example, or is simply "implementation defined".)

--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody | www.hvcomputer.com | #include |
| kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
Don't e-mail me at: <mailto:Th*************@gmail.com>

Dec 11 '06 #12
