
different struct sizes

P: n/a
I have this weird problem with the JPEG library that I am using. There
is a function that is being called and one of the parameters being
passed is the length of a structure. Inside this function, the length
of the structure being passed is compared to the length of the same
structure calculated inside the function. Now because these values do
not match, the program exits.
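
The check described above can be sketched in a few lines of C. This is only an illustration, not libjpeg's code: `struct ctx`, `lib_create`, and `LIB_VERSION` are hypothetical names. The caller passes `sizeof` as computed in its own translation unit, and the library compares that against the size it computed when it was compiled.

```c
#include <stddef.h>
#include <stdio.h>

#define LIB_VERSION 62  /* hypothetical version number */

/* Hypothetical context structure shared through a header. */
struct ctx {
    int  width;
    int  height;
    char flags[4];
};

/* Returns 0 on success, -1 on a version or size mismatch. */
int lib_create(struct ctx *c, int version, size_t structsize)
{
    if (version != LIB_VERSION) {
        fprintf(stderr, "version mismatch: library %d, caller %d\n",
                LIB_VERSION, version);
        return -1;
    }
    if (structsize != sizeof(struct ctx)) {
        fprintf(stderr, "struct mismatch: library thinks size is %lu, "
                        "caller expects %lu\n",
                (unsigned long) sizeof(struct ctx),
                (unsigned long) structsize);
        return -1;
    }
    c->width = c->height = 0;
    return 0;
}
```

If the caller and the library see different definitions of the structure (different headers, configuration macros, or packing options), the two `sizeof` values disagree and the guard fires.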

======================

This is the prototype for this function:

jpeg_CreateDecompress((cinfo), JPEG_LIB_VERSION, (size_t) sizeof(struct
jpeg_decompress_struct));

======================

Now this is a portion of this function's code:

GLOBAL(void) jpeg_CreateDecompress (j_decompress_ptr cinfo, int version,
                                    size_t structsize)
{
  int i;

  /* Guard against version mismatches between library and caller. */
  cinfo->mem = NULL; /* so jpeg_destroy knows mem mgr not called */
  if (version != JPEG_LIB_VERSION)
    ERREXIT2(cinfo, JERR_BAD_LIB_VERSION, JPEG_LIB_VERSION, version);
  if (structsize != SIZEOF(struct jpeg_decompress_struct))
    ERREXIT2(cinfo, JERR_BAD_STRUCT_SIZE,
             (int) SIZEOF(struct jpeg_decompress_struct), (int) structsize);

======================

This is the error produced by this function:

JPEG parameter struct mismatch: library thinks size is 452, caller
expects 404

======================

Would anyone have any idea as to why something like this would happen?

Thanks


Nov 6 '06 #1
45 Replies


P: n/a
Borked Pseudo Mailed wrote:
> I have this weird problem with the JPEG library that I am using. There
> is a function that is being called and one of the parameters being
> passed is the length of a structure. Inside this function, the length
> of the structure being passed is compared to the length of the same
> structure calculated inside the function. Now because these values do
> not match, the program exits.
>
> ======================
>
> This is the prototype for this function:
>
> jpeg_CreateDecompress((cinfo), JPEG_LIB_VERSION, (size_t) sizeof(struct
> jpeg_decompress_struct));

Odd looking prototype...

> ======================
>
> Now this is a portion of this function's code:
>
> GLOBAL(void) jpeg_CreateDecompress (j_decompress_ptr cinfo, int version,
> size_t structsize)
> {
> int i;
>
> /* Guard against version mismatches between library and caller. */
> cinfo->mem = NULL; /* so jpeg_destroy knows mem mgr not called
> */
> if (version != JPEG_LIB_VERSION)
> ERREXIT2(cinfo, JERR_BAD_LIB_VERSION, JPEG_LIB_VERSION, version);
> if (structsize != SIZEOF(struct jpeg_decompress_struct))
> ERREXIT2(cinfo, JERR_BAD_STRUCT_SIZE,(int) SIZEOF(struct
> jpeg_decompress_struct), (int) structsize);

What is SIZEOF()?

> ======================
>
> This is the error produced by this function:
>
> JPEG parameter struct mismatch: library thinks size is 452, caller
> expects 404
>
> ======================
>
> Would anyone have any idea as to why something like this would happen?

Have the library and your code been compiled with the same compiler and,
if used, the same alignment options?
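
To illustrate how alignment options alone can change a structure's size, here is a small sketch. The struct names are made up for the example, and `#pragma pack` is a widely supported compiler extension (GCC, Clang, MSVC) rather than standard C, so treat this as an assumption about the toolchain.

```c
#include <stddef.h>

/* Layout as seen with the compiler's default alignment. */
struct rec_default {
    char tag;
    int  value;
};

/* The same members compiled under 1-byte packing. */
#pragma pack(push, 1)
struct rec_packed {
    char tag;
    int  value;
};
#pragma pack(pop)

/* Number of padding bytes the default layout adds over the packed one. */
size_t padding_bytes(void)
{
    return sizeof(struct rec_default) - sizeof(struct rec_packed);
}
```

A library built with one setting and a caller built with the other would pass different `sizeof` values for what looks like the same structure in the header.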

--
Ian Collins.
Nov 6 '06 #2

P: n/a
You are using a library compiled for a platform other than the one you
are using. You need libraries compiled for your platform (same hardware
and compiler options).

Certain platforms use 4 bytes memory for "char" and "short int"
variables. When you use 1 byte for char and 2 bytes for short int on
your platform, this kind of problem is reported.

Regards,
Nagaraj L

On Nov 6, 6:58 am, Borked Pseudo Mailed <nob...@pseudo.borked.net>
wrote:
I have this weird problem with the JPEG library that I am using. There
is a function that is being called and one of the parameters being
passed is the length of a structure. Inside this function, the length
of the structure being passed is compared to the length of the same
structure calculated inside the function. Now because these values do
not match, the program exits.

======================

This is the prototype for this function:

jpeg_CreateDecompress((cinfo), JPEG_LIB_VERSION, (size_t) sizeof(struct
jpeg_decompress_struct));

======================

Now this is a portion of this function's code:

GLOBAL(void) jpeg_CreateDecompress (j_decompress_ptr cinfo, int version,
size_t structsize)
{
int i;

/* Guard against version mismatches between library and caller. */
cinfo->mem = NULL; /* so jpeg_destroy knows mem mgr not called
*/
if (version != JPEG_LIB_VERSION)
ERREXIT2(cinfo, JERR_BAD_LIB_VERSION, JPEG_LIB_VERSION, version);
if (structsize != SIZEOF(struct jpeg_decompress_struct))
ERREXIT2(cinfo, JERR_BAD_STRUCT_SIZE,(int) SIZEOF(struct
jpeg_decompress_struct), (int) structsize);

======================

This is the error produced by this function:

JPEG parameter struct mismatch: library thinks size is 452, caller
expects 404

======================

Would anyone have any idea as to why something like this would happen?

Thanks

Nov 6 '06 #3

P: n/a
Nagaraj L <na******@gmail.com> wrote:
> Certain platforms use 4 bytes memory for "char" and "short int"

Any platform that uses 4 bytes of memory for "char" variables is
non-conforming. sizeof(char) is 1.

--
C. Benson Manica | I *should* know what I'm talking about - if I
cbmanica(at)gmail.com | don't, I need to know. Flames welcome.
Nov 6 '06 #4

P: n/a
Christopher Benson-Manica <at***@otaku.freeshell.org> writes:
> Nagaraj L <na******@gmail.com> wrote:
>> Certain platforms use 4 bytes memory for "char" and "short int"
>
> Any platform that uses 4 bytes of memory for "char" variables is
> non-conforming. sizeof(char) is 1.

sizeof(char) is 1 by definition, that doesn't mean that it's one byte

--
Simias
Nov 6 '06 #5

P: n/a
Simias <no****@nowhere.com> wrote:
> Christopher Benson-Manica <at***@otaku.freeshell.org> writes:
>> Nagaraj L <na******@gmail.com> wrote:
>>> Certain platforms use 4 bytes memory for "char" and "short int"
>>
>> Any platform that uses 4 bytes of memory for "char" variables is
>> non-conforming. sizeof(char) is 1.
>
> sizeof(char) is 1 by definition, that doesn't mean that it's one byte

It does in an ISO C context. One byte need not be one octet, however,
and historically it hasn't always been.
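
The distinction can be checked from within C itself. The sketch below uses only `<limits.h>`; the helper names `bits_in` and `byte_is_octet` are made up for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Converts a size reported by sizeof (in C bytes) to bits.
   A C byte is CHAR_BIT bits wide, and CHAR_BIT is at least 8. */
size_t bits_in(size_t size_in_bytes)
{
    return size_in_bytes * CHAR_BIT;
}

/* Nonzero when a C byte on this implementation is exactly one octet. */
int byte_is_octet(void)
{
    return CHAR_BIT == 8;
}
```

`sizeof(char)` is 1 on every conforming implementation; what varies is `CHAR_BIT`, so `bits_in(sizeof(char))` is 8 on most desktop systems and may be 16 or 32 on some DSPs.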

Richard
Nov 6 '06 #6

P: n/a
Christopher Benson-Manica wrote:
> Nagaraj L <na******@gmail.com> wrote:
>> Certain platforms use 4 bytes memory for "char" and "short int"
>
> Any platform that uses 4 bytes of memory for "char" variables is
> non-conforming. sizeof(char) is 1.

Nagaraj's terminology is flawed. He could have written 'octets' in
place of bytes. This is quite likely on some conforming systems.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Nov 6 '06 #7

P: n/a
Simias <no****@nowhere.com> writes:
> Christopher Benson-Manica <at***@otaku.freeshell.org> writes:
>> Nagaraj L <na******@gmail.com> wrote:
>>> Certain platforms use 4 bytes memory for "char" and "short int"
>>
>> Any platform that uses 4 bytes of memory for "char" variables is
>> non-conforming. sizeof(char) is 1.
>
> sizeof(char) is 1 by definition, that doesn't mean that it's one byte

Yes, it certainly does; that's the C standard's definition of "byte".

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 6 '06 #8

P: n/a
In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:
>> sizeof(char) is 1 by definition, that doesn't mean that it's one byte
>
> Yes, it certainly does; that's the C standard's definition of "byte".

Unfortunately any platform will have its own definition of byte, and
though these coincide 99.9% of the time, the difficult cases will be
the ones where it doesn't. The fact that something is true of C's
definition of byte is not very useful when the issue is how it maps
onto the implementation - you can't expect every word to be used with
C's definition in that context.

-- Richard

--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
Nov 6 '06 #9

P: n/a
ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:
In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:
>>sizeof(char) is 1 by definition, that doesn't mean that it's one byte

Yes, it certainly does; that's the C standard's definition of "byte".

Unfortunately any platform will have its own definition of byte, and
though these coincide 99.9% of the time, the difficult cases will be
the ones where it doesn't. The fact that something is true of C's
definition of byte is not very useful when the issue is how it maps
onto the implementation - you can't expect every word to be used with
C's definition in that context.
In this newsgroup, I can and I do expect words defined in the C
standard to be used in accordance with the way the C standard defines
them. Anyone using the word "byte" here in some other sense needs to
say so.

The statement was:

sizeof(char) is 1 by definition, that doesn't mean that it's one byte

If the poster had said:

sizeof(char) is 1 by definition, that doesn't mean that it's one byte
(as a particular platform defines the word "byte")

I would have had no objection.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 6 '06 #10

P: n/a
Keith Thompson wrote:
ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:
>>In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:
>>>>sizeof(char) is 1 by definition, that doesn't mean that it's one byte

Yes, it certainly does; that's the C standard's definition of "byte".

Unfortunately any platform will have its own definition of byte, and
though these coincide 99.9% of the time, the difficult cases will be
the ones where it doesn't. The fact that something is true of C's
definition of byte is not very useful when the issue is how it maps
onto the implementation - you can't expect every word to be used with
C's definition in that context.


In this newsgroup, I can and I do expect words defined in the C
standard to be used in accordance with the way the C standard defines
them. Anyone using the word "byte" here in some other sense needs to
say so.

The statement was:

sizeof(char) is 1 by definition, that doesn't mean that it's one byte

If the poster had said:

sizeof(char) is 1 by definition, that doesn't mean that it's one byte
(as a particular platform defines the word "byte")

I would have had no objection.
Important distinction. When I ported lcc-win32 to a DSP, each
character took two bytes (16 bits) because the machine could not
address odd bytes. Still, sizeof(char) was 1 of course.

In that environment sizeof(char) == sizeof(short) == sizeof(int).
Only longs were 32 bits.

Nov 6 '06 #11

P: n/a
jacob navia <ja***@jacob.remcomp.fr> writes:
Keith Thompson wrote:
>ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:
>>>In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:

>sizeof(char) is 1 by definition, that doesn't mean that it's one byte

Yes, it certainly does; that's the C standard's definition of "byte".

Unfortunately any platform will have its own definition of byte, and
though these coincide 99.9% of the time, the difficult cases will be
the ones where it doesn't. The fact that something is true of C's
definition of byte is not very useful when the issue is how it maps
onto the implementation - you can't expect every word to be used with
C's definition in that context.
In this newsgroup, I can and I do expect words defined in the C
standard to be used in accordance with the way the C standard defines
them. Anyone using the word "byte" here in some other sense needs to
say so.
The statement was:
sizeof(char) is 1 by definition, that doesn't mean that it's one
byte
If the poster had said:
sizeof(char) is 1 by definition, that doesn't mean that it's one
byte
(as a particular platform defines the word "byte")
I would have had no objection.

Important distinction. When I ported lcc-win32 to a DSP, each
character took two bytes (16 bits) because the machine could not
address odd bytes. Still, sizeof(char) was 1 of course.
Yes, it's an important distinction, and you're blurring it.

Each character took *one* byte (16 bits). One byte happened to be two
octets.
In that environment sizeof(char) == sizeof(short) == sizeof(int).
Only longs were 32 bits.
Perfectly legal, of course. It could cause some interesting problems
with stdio, since it's difficult to distinguish between EOF and a
character that happens to have the same value. But a DSP would
presumably have a freestanding implementation, so stdio support isn't
required.
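
Whether the EOF ambiguity can arise at all is something a program can test at compile time. The following is a rough sketch; the function name `eof_is_unambiguous` is invented for the example.

```c
#include <limits.h>
#include <stdio.h>

/* Returns 1 when every unsigned char value fits in a nonnegative int,
   so no value returned by fgetc() for real data can equal EOF, which
   is negative. Returns 0 otherwise. */
int eof_is_unambiguous(void)
{
#if UCHAR_MAX <= INT_MAX
    return 1;
#else
    return 0;   /* e.g. when sizeof(int) == 1 and char is 16 bits wide */
#endif
}
```

On mainstream hosted platforms with 8-bit char and a wider int this returns 1; on an implementation like the DSP described above, where sizeof(char) == sizeof(int), it would return 0.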

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 7 '06 #12

P: n/a
Note: It should be referred to as octet(s).

A character is 1 byte all the time. But it takes 2 octets on, for example,
Intel, ST10, and many other platforms. It is decided by the
processor.

Certain processors require even addressing for address alignment. Some
processors require x4 alignments ( Here 1 character takes 1 byte but in
memory it takes 4 octets). The 3 octets are not used. char still uses 8
bits or 1 byte only.

Nov 7 '06 #13

P: n/a
"Nagaraj L" <na******@gmail.com> writes:
Note: It should be referred to as octet(s).

A character is 1 byte all the time. But it takes 2 octets on, for example,
Intel, ST10, and many other platforms. It is decided by the
processor.

Certain processors require even addressing for address alignment. Some
processors require x4 alignments ( Here 1 character takes 1 byte but in
memory it takes 4 octets). The 3 octets are not used. char still uses 8
bits or 1 byte only.
A C implementation *must* allow char objects to be stored at odd byte
addresses. It can choose to align all single declared char objects,
or even char struct members, at even addresses if that makes access
easier or faster, but there can be no padding between array elements:

char arr[2];
/* either arr[0] or arr[1] is at an odd byte address */

If the hardware doesn't allow this (or makes it too expensive), then
the implementation can make bytes bigger than 8 bits.
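
The difference between struct members (which may be padded) and array elements (which may not) can be seen in a short sketch. The names `struct with_char` and `char_array_stride` are illustrative only.

```c
#include <stddef.h>

struct with_char {
    char c;     /* the implementation may insert padding after this member */
    int  n;
};

/* Distance, in elements, between consecutive elements of a char array.
   Because sizeof(char) is 1, this is also the distance in bytes. */
ptrdiff_t char_array_stride(void)
{
    char arr[2];
    return &arr[1] - &arr[0];
}
```

`offsetof(struct with_char, n)` is often 4 on common platforms because of padding, but `sizeof(char[2])` is always 2 and the stride is always 1.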

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 7 '06 #14

P: n/a
Keith Thompson said:
jacob navia <ja***@jacob.remcomp.fr> writes:
<snip>
>In that environment sizeof(char) == sizeof(short) == sizeof(int).
Only longs were 32 bits.

Perfectly legal, of course.
And perfectly common in the DSP universe.
It could cause some interesting problems
with stdio, since it's difficult to distinguish between EOF and a
character that happens to have the same value. But a DSP would
presumably have a freestanding implementation, so stdio support isn't
required.
Even if stdio support is provided, I don't think it actually matters very
much about the EOF issue, in practical terms. Yes, 0xFFFF (or 0xFFFFFFFF,
or however many bits you're dealing with) can be seen as either EOF or a
character value, but (a) there's no problem when CHAR_BIT < 16, and (b)
when CHAR_BIT /is/ 16 or higher, the chances of real world data containing
a genuine character with the maximum possible value are pretty low. You'd
have to work pretty hard to find a counter-example, I think.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Nov 7 '06 #15

P: n/a
Richard Heathfield wrote:
Even if stdio support is provided, I don't think it actually matters very
much about the EOF issue, in practical terms. Yes, 0xFFFF (or 0xFFFFFFFF,
or however many bits you're dealing with) can be seen as either EOF or a
character value, but (a) there's no problem when CHAR_BIT < 16, and (b)
when CHAR_BIT /is/ 16 or higher, the chances of real world data containing
a genuine character with the maximum possible value are pretty low. You'd
have to work pretty hard to find a counter-example, I think.
I don't think it's hard at all. Just be sure not to limit your search
to text files.

Nov 7 '06 #16

P: n/a
Harald van Dĳk said:
Richard Heathfield wrote:
>Even if stdio support is provided, I don't think it actually matters very
much about the EOF issue, in practical terms. Yes, 0xFFFF (or 0xFFFFFFFF,
or however many bits you're dealing with) can be seen as either EOF or a
character value, but (a) there's no problem when CHAR_BIT < 16, and (b)
when CHAR_BIT /is/ 16 or higher, the chances of real world data
containing a genuine character with the maximum possible value are pretty
low. You'd have to work pretty hard to find a counter-example, I think.

I don't think it's hard at all. Just be sure not to limit your search
to text files.
Okay, let me put that another way - if you want to *avoid* the problem, you
probably can do so without too much hassle by designing your system
accordingly. If you are seeking out the problem, however, then of course
you're going to find it. Humans are good at finding problems. :-)

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Nov 7 '06 #17

P: n/a
Richard Heathfield wrote:
Harald van Dĳk said:
Richard Heathfield wrote:
Even if stdio support is provided, I don't think it actually matters very
much about the EOF issue, in practical terms. Yes, 0xFFFF (or 0xFFFFFFFF,
or however many bits you're dealing with) can be seen as either EOF or a
character value, but (a) there's no problem when CHAR_BIT < 16, and (b)
when CHAR_BIT /is/ 16 or higher, the chances of real world data
containing a genuine character with the maximum possible value are pretty
low. You'd have to work pretty hard to find a counter-example, I think.
I don't think it's hard at all. Just be sure not to limit your search
to text files.

Okay, let me put that another way - if you want to *avoid* the problem, you
probably can do so without too much hassle by designing your system
accordingly. If you are seeking out the problem, however, then of course
you're going to find it. Humans are good at finding problems. :-)
Oh, I actually mean you're very likely to run into it by accident
unless you specifically avoid the issue (which I can agree may not be
hard). At the very least, if a system allows you to upload code (which
does not usually happen with DSPs (I think), but does happen with other
embedded systems), and the code checks for EOF, the code is extremely
likely to contain EOF itself.

Nov 7 '06 #18

P: n/a
Harald van Dĳk <tr*****@gmail.com> wrote:
Richard Heathfield wrote:
Even if stdio support is provided, I don't think it actually matters very
much about the EOF issue, in practical terms. Yes, 0xFFFF (or 0xFFFFFFFF,
or however many bits you're dealing with) can be seen as either EOF or a
character value, but (a) there's no problem when CHAR_BIT < 16, and (b)
when CHAR_BIT /is/ 16 or higher, the chances of real world data containing
a genuine character with the maximum possible value are pretty low. You'd
have to work pretty hard to find a counter-example, I think.

I don't think it's hard at all. Just be sure not to limit your search
to text files.
If it is known that a program's input is not text but (possibly) binary,
the wise programmer doesn't use fgetc() in the first place. He uses
fread(), which doesn't have this problem even when CHAR_MAX == INT_MAX.
Similar things can be said about <ctype.h>, whose functions are
typically only used on text, not on unknown data that could be either
text or binary.
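
A minimal sketch of the fread() approach follows. The function name `count_bytes` and the buffer size are arbitrary choices for the example. fread() reports how many items it read, so end of input is never signalled through a value that could also be data.

```c
#include <stdio.h>

/* Counts the bytes remaining in a stream using fread().
   Returns the count, or -1 if a read error occurred. */
long count_bytes(FILE *fp)
{
    unsigned char buf[256];
    size_t n;
    long total = 0;

    /* A return of 0 means end of file or error, never a data value. */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long) n;

    return ferror(fp) ? -1L : total;
}
```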

Richard
Nov 7 '06 #19

P: n/a
Richard Bos wrote:
Harald van Dĳk <tr*****@gmail.com> wrote:
I don't think it's hard at all. Just be sure not to limit your search
to text files.

If it is known that a program's input is not text but (possibly) binary,
the wise programmer doesn't use fgetc() in the first place. He uses
fread(), which doesn't have this problem even when CHAR_MAX == INT_MAX.
I can agree that that's (almost) always a better idea.
Similar things can be said about <ctype.h>, whose functions are
typically only used on text, not on unknown data that could be either
text or binary.
Well yeah, but the is* functions are never (counterexamples are
welcome) useful for binary data, but fgetc() is merely a less than
optimal solution.

Nov 7 '06 #20

P: n/a
Harald van Dĳk <tr*****@gmail.com> wrote:
Richard Bos wrote:
Similar things can be said about <ctype.h>, whose functions are
typically only used on text, not on unknown data that could be either
text or binary.

Well yeah, but the is* functions are never (counterexamples are
welcome) useful for binary data,
I can think of only one reason: a strings-like utility program. That's
system-level enough not to make this a problem.

Richard
Nov 7 '06 #21

P: n/a
2006-11-07 <45*************@news.xs4all.nl>,
Richard Bos wrote:
Harald van Dĳk <tr*****@gmail.com> wrote:
>Richard Bos wrote:
Similar things can be said about <ctype.h>, whose functions are
typically only used on text, not on unknown data that could be either
text or binary.

Well yeah, but the is* functions are never (counterexamples are
welcome) useful for binary data,

I can think of only one reason: a strings-like utility program. That's
system-level enough not to make this a problem.
How is strings system-level?

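#include <ctype.h>
#include <stdio.h>

#define N 4   /* minimum run length worth printing */

int main(int argc, char **argv) {
    int c;
    int i = 0;          /* characters buffered so far; N+1 while a run is being printed */
    char buf[N];
    FILE *myfile = ...; /* stream left unspecified in the original post */

    /* Parenthesize the assignment: != binds tighter than =. */
    while ((c = getc(myfile)) != EOF) {
        if (isprint(c) || isspace(c)) {
            if (i < N) {
                buf[i++] = c;
                if (i == N) { i++; fwrite(buf, 1, N, stdout); }
            } else {
                putchar(c);
            }
        } else {
            if (i == N + 1)
                putchar('\n');  /* terminate the run just printed */
            i = 0;              /* a non-text byte ends any run */
        }
    }
    putchar('\n');
    return 0;
}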
Nov 7 '06 #22

P: n/a
In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:
>In this newsgroup, I can and I do expect words defined in the C
standard to be used in accordance with the way the C standard defines
them.
Well, you're often going to be disappointed.

The newsgroup isn't the *only* context determining how a word should
be interpreted. Usenet isn't a standards document, it's a
conversation.
>Anyone using the word "byte" here in some other sense needs to
say so.

The statement was:

sizeof(char) is 1 by definition, that doesn't mean that it's one byte
And he said it in the context of an up-thread statement

Certain platforms use 4 bytes memory for "char" and "short int"
variables

Why not start by assuming that the person is talking sense? (Of
course, you may have to change your assumption later.) Since "byte"
corresponds to "char" in C standards-speak, it would make no sense for
him to say "that doesn't mean that it's one byte" if he had meant the
C-standards-speak usage of "byte". Assuming that he is talking sense
removes the ambiguity, and your response therefore seems excessively
pedantic.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
Nov 7 '06 #23

P: n/a
In article <3N********************@bt.com>,
Richard Heathfield <in*****@invalid.invalid> wrote:
>Even if stdio support is provided, I don't think it actually matters very
much about the EOF issue, in practical terms. Yes, 0xFFFF (or 0xFFFFFFFF,
or however many bits you're dealing with) can be seen as either EOF or a
character value, but (a) there's no problem when CHAR_BIT < 16, and (b)
when CHAR_BIT /is/ 16 or higher, the chances of real world data containing
a genuine character with the maximum possible value are pretty low. You'd
have to work pretty hard to find a counter-example, I think.
And anyone designing such a character set today would be nuts.
Unicode for example makes 0xFFFF be an explicit non-character,
because of its likely use for purposes such as EOF.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
Nov 7 '06 #24

P: n/a
Richard Tobin wrote:
>
In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:
In this newsgroup, I can and I do expect words defined in the C
standard to be used in accordance with the way the C standard defines
them.

Well, you're often going to be disappointed.

The newsgroup isn't the *only* context determining how a word should
be interpreted. Usenet isn't a standards document, it's a
conversation.
Anyone using the word "byte" here in some other sense needs to
say so.

The statement was:

sizeof(char) is 1 by definition,
that doesn't mean that it's one byte

And he said it in the context of an up-thread statement

Certain platforms use 4 bytes memory for "char" and "short int"
variables

Why not start by assuming that the person is talking sense?
Because it makes no sense.

N869
6.5.3.4 The sizeof operator
The sizeof operator yields the size (in bytes) of its operand

--
pete
Nov 7 '06 #25

P: n/a
ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:
In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:
>>In this newsgroup, I can and I do expect words defined in the C
standard to be used in accordance with the way the C standard defines
them.

Well, you're often going to be disappointed.
I often am, but somehow I muddle through.
The newsgroup isn't the *only* context determining how a word should
be interpreted. Usenet isn't a standards document, it's a
conversation.
>>Anyone using the word "byte" here in some other sense needs to
say so.

The statement was:

sizeof(char) is 1 by definition, that doesn't mean that it's one byte

And he said it in the context of an up-thread statement

Certain platforms use 4 bytes memory for "char" and "short int"
variables

Why not start by assuming that the person is talking sense? (Of
course, you may have to change your assumption later.) Since "byte"
corresponds to "char" in C standards-speak, it would make no sense for
him to say "that doesn't mean that it's one byte" if he had meant the
C-standards-speak usage of "byte". Assuming that he is talking sense
removes the ambiguity, and your response therefore seems excessively
pedantic.
If we're not going to use the C standard's definition of "byte" in
comp.lang.c, where are we going to use it? The point of defining
terms is to have a common vocabulary so we can discuss things without
talking past each other.

I understand that the word "byte" is often used differently outside
the context of the C programming language. People these days often
use it as a synonym for "octet", though that's inconsistent with the
original meaning of the word (which predates C).

The statement was:

sizeof(char) is 1 by definition, that doesn't mean that it's one byte

but that's exactly what it means, since sizeof yields the size of its
operand *in bytes*. That's not pedantry, it's simple correctness.

The alternative is to explicitly qualify every usage of the word
"byte" with either "(meaning 8 bits)", or "(in the sense defined in
the C standard)", or whatever, and to do the same for every other
technical word that may have more than one meaning.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 7 '06 #26

P: n/a
Mark L Pappin wrote:
rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
Harald van Dĳk <tr*****@gmail.com> wrote:
Richard Heathfield wrote:
the chances of real world data containing a genuine character with
the maximum possible value are pretty low. You'd have to work
pretty hard to find a counter-example, I think.
I don't think it's hard at all. Just be sure not to limit your
search to text files.
If it is known that a program's input is not text but (possibly)
binary, the wise programmer doesn't use fgetc() in the first
place. He uses fread(),

Or, checks feof() after fgetc() returns something that equals EOF.
Don't forget to call ferror() too in that case.
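
The feof()/ferror() check can be sketched as follows. The wrapper name `read_char` and its return convention are invented for this example.

```c
#include <stdio.h>

/* Reads one character. Returns 1 and stores it in *out on success,
   0 at a genuine end of file, and -1 on a read error. */
int read_char(FILE *fp, int *out)
{
    int c = fgetc(fp);

    if (c == EOF) {
        if (feof(fp))
            return 0;   /* really the end of the file */
        if (ferror(fp))
            return -1;  /* a read error occurred */
        /* Neither indicator is set: c is a data value that happens to
           equal EOF, which is only possible when UCHAR_MAX > INT_MAX. */
    }
    *out = c;
    return 1;
}
```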

Nov 7 '06 #27

P: n/a
On Tue, 07 Nov 2006 01:10:05 +0100, in comp.lang.c , jacob navia
<ja***@jacob.remcomp.fr> wrote:
>Important distinction. When I ported lcc-win32 to a DSP, each
character took two bytes (16 bits)
Strictly speaking, it took two OCTETS.
>because the machine could not
address odd bytes.
octets...
>Still, sizeof(char) was 1 of course.
And it was still one byte, I'm afraid.

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Nov 7 '06 #28

P: n/a
>Even if stdio support is provided, I don't think it actually matters very
>much about the EOF issue, in practical terms. Yes, 0xFFFF (or 0xFFFFFFFF,
or however many bits you're dealing with) can be seen as either EOF or a
character value, but (a) there's no problem when CHAR_BIT < 16, and (b)
when CHAR_BIT /is/ 16 or higher, the chances of real world data containing
a genuine character with the maximum possible value are pretty low. You'd
have to work pretty hard to find a counter-example, I think.
The chance of real-world data containing any particular unusual
combination of data that makes the program malfunction is pretty
much a certainty if there is anything to be gained by writing MALICIOUS
code and the program accepts data from the real world.

Read some Microsoft security reports (viruses are not limited to
Microsoft but their code is the biggest target due to market share).
What are the chances of some of the exploits showing up BY ACCIDENT?
Pretty much zero. What are the chances of the exploits showing up
MALICIOUSLY? Near certainty.

Nov 8 '06 #29

P: n/a

Christopher Benson-Manica wrote:
Nagaraj L <na******@gmail.com> wrote:
Certain platforms use 4 bytes memory for "char" and "short int"

Any platform that uses 4 bytes of memory for "char" variables is
non-conforming. sizeof(char) is 1.

--
C. Benson Manica | I *should* know what I'm talking about - if I
cbmanica(at)gmail.com | don't, I need to know. Flames welcome.
Nov 8 '06 #30

P: n/a
Keith Thompson wrote:
>
A C implementation *must* allow char objects to be stored at odd byte
addresses. It can choose to align all single declared char objects,
or even char struct members, at even addresses if that makes access
easier or faster, but there can be no padding between array elements:

char arr[2];
/* either arr[0] or arr[1] is at an odd byte address */

If the hardware doesn't allow this (or makes it too expensive), then
the implementation can make bytes bigger than 8 bits.
How do you define "odd byte address" ?

I don't see why you can't have 16-bit bytes, at addresses 0, 2, 4, ...

Nov 8 '06 #31

P: n/a
Old Wolf said:
Keith Thompson wrote:
>>
A C implementation *must* allow char objects to be stored at odd byte
addresses. It can choose to align all single declared char objects,
or even char struct members, at even addresses if that makes access
easier or faster, but there can be no padding between array elements:

char arr[2];
/* either arr[0] or arr[1] is at an odd byte address */

If the hardware doesn't allow this (or makes it too expensive), then
the implementation can make bytes bigger than 8 bits.

How do you define "odd byte address" ?

I don't see why you can't have 16-bit bytes, at addresses 0, 2, 4, ...
16-bit bytes are not a problem, but of course they will be at addresses 0,
1, 2, 3...

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: normal service will be restored as soon as possible. Please do not
adjust your email clients.
Nov 8 '06 #32

P: n/a
Richard Heathfield <in*****@invalid.invalid> writes:
Old Wolf said:
>I don't see why you can't have 16-bit bytes, at addresses 0, 2, 4, ...

16-bit bytes are not a problem, but of course they will be at addresses 0,
1, 2, 3...
I don't think that a strictly conforming program could tell the
difference. The representation of pointers doesn't have to be
flat, and there's no reason that subtraction of pointers to char
can't divide by 2. (Is there?)
--
"Programmers have the right to be ignorant of many details of your code
and still make reasonable changes."
--Kernighan and Plauger, _Software Tools_
Nov 8 '06 #33

P: n/a
Ben Pfaff said:
Richard Heathfield <in*****@invalid.invalid> writes:
>Old Wolf said:
>>I don't see why you can't have 16-bit bytes, at addresses 0, 2, 4, ...

16-bit bytes are not a problem, but of course they will be at addresses
0, 1, 2, 3...

I don't think that a strictly conforming program could tell the
difference. The representation of pointers doesn't have to be
flat, and there's no reason that subtraction of pointers to char
can't divide by 2. (Is there?)
If you have a pointer to the first character in a string, and a pointer to
the second character in the same string, those pointers are required to
differ by 1 (because arrays are stored contiguously, and sizeof(char) is
guaranteed to be 1). In an architecture where the concepts of "even
address" and "odd address" are meaningful, it seems to me that one or other
of those pointers *must* store an odd address.
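
A minimal sketch of the guarantee described above, assuming nothing beyond standard C. The helper names are made up for the example.

```c
#include <stddef.h>

/* Hypothetical helper: distance, in elements, between two chars.
   Both pointers must point into (or one past) the same array. */
ptrdiff_t char_distance(const char *first, const char *second)
{
    return second - first;
}

/* Distance between the first two characters of a string. */
ptrdiff_t first_two_distance(void)
{
    const char s[] = "hi";
    return char_distance(&s[0], &s[1]);
}
```

The result is 1 whatever the hardware does underneath, because pointer subtraction is defined in terms of array elements and sizeof(char) is 1.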

What am I missing?

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: normal service will be restored as soon as possible. Please do not
adjust your email clients.
Nov 8 '06 #34

P: n/a
Richard Heathfield <in*****@invalid.invalid> writes:
Ben Pfaff said:
>Richard Heathfield <in*****@invalid.invalid> writes:
>>Old Wolf said:

I don't see why you can't have 16-bit bytes, at addresses 0, 2, 4, ...

16-bit bytes are not a problem, but of course they will be at addresses
0, 1, 2, 3...

I don't think that a strictly conforming program could tell the
difference. The representation of pointers doesn't have to be
flat, and there's no reason that subtraction of pointers to char
can't divide by 2. (Is there?)

If you have a pointer to the first character in a string, and a pointer to
the second character in the same string, those pointers are required to
differ by 1 (because arrays are stored contiguously, and sizeof(char) is
guaranteed to be 1).
We may be talking past one another. I agree that the result of
subtracting pointers must be 1 in this case. But the numbers
that represent the pointers involved in the subtraction could
differ by 2, or by 4, or by 623.
In an architecture where the concepts of "even
address" and "odd address" are meaningful, it seems to me that one or other
of those pointers *must* store an odd address.
I don't think the concepts of even and odd addresses would be
meaningful in the architecture that I'm envisioning.
Potentially, all pointers could be represented by odd numbers, or
by even numbers, or the bit with value 1 could be the parity of
the rest of the bits, or whatever.
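
A rough sketch of the distinction, assuming the implementation provides the optional type uintptr_t. The struct and function names are invented for the example. Only the first field has a value fixed by the standard; the second is whatever the implementation's pointer-to-integer conversion produces.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type for illustration. */
struct distances {
    ptrdiff_t by_subtraction; /* required to be 1 for adjacent chars */
    uintptr_t by_conversion;  /* implementation-defined; often 1, not required */
};

struct distances compare_distances(void)
{
    static const char s[] = "ab";
    struct distances d;

    /* Pointer arithmetic counts elements. */
    d.by_subtraction = &s[1] - &s[0];

    /* Integer arithmetic on converted pointers counts whatever the
       implementation's mapping says it counts. */
    d.by_conversion = (uintptr_t)(const void *)&s[1]
                    - (uintptr_t)(const void *)&s[0];
    return d;
}
```

On a flat-address machine both fields will normally be 1, but a program can only rely on the first.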
--
int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.\
\n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
);while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
);}return 0;}
Nov 8 '06 #35

P: n/a
"Old Wolf" <ol*****@inspire.net.nz> writes:
Keith Thompson wrote:
>A C implementation *must* allow char objects to be stored at odd byte
addresses. It can choose to align all single declared char objects,
or even char struct members, at even addresses if that makes access
easier or faster, but there can be no padding between array elements:

char arr[2];
/* either arr[0] or arr[1] is at an odd byte address */

If the hardware doesn't allow this (or makes it too expensive), then
the implementation can make bytes bigger than 8 bits.

How do you define "odd byte address" ?

I don't see why you can't have 16-bit bytes, at addresses 0, 2, 4, ...
Actually, I don't define "odd byte address" -- but the standard does
implicitly use the concept without defining it. C99 3.2 defines the
term "alignment":

alignment

requirement that objects of a particular type be located on
storage boundaries with addresses that are particular multiples of
a byte address

Strictly speaking, 0, 2, 4, ... are not addresses; they're integers.
(void*)0, (void*)2, (void*)4, ... are addresses, but the standard
tells us nothing about their representation. On some systems I've
used (Cray vector systems), machine addresses point to 64-bit words,
but CHAR_BIT==8, and void* and char* have extra offset information.
A pointer object whose representation looks like an odd integer might
point to what I'd call an even byte address, and a pointer object
whose representation looks like an even integer might point to what
I'd call an odd byte address.

My assumption is that if p is an even address, then p+1 is an odd
address (assuming p is a character pointer), regardless of the
representation; I think that's the only model that's consistent with
the standard's concept of "alignment".
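
As a small illustration of alignment in these terms, here is a sketch that uses the C11 _Alignof operator, which postdates this thread. The struct and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical struct for illustration. */
struct sample {
    char tag;
    int  value;
};

/* Alignment requirement of char, in bytes. It has to be 1: sizeof(char)
   is 1 and array elements are contiguous. */
size_t char_alignment(void)
{
    return _Alignof(char);
}

/* Nonzero if the int member sits at an offset that is a multiple
   of int's alignment requirement. */
int value_is_aligned(void)
{
    return offsetof(struct sample, value) % _Alignof(int) == 0;
}
```

Any padding the compiler inserts between tag and value exists to satisfy the second check. How much padding is needed varies between compilers and options, which is one way the same struct can end up with two different sizes.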

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 8 '06 #36

P: n/a
Ben Pfaff said:

<snip>
We may be talking past one another.
Always possible. But we seem to be converging now.
I agree that the result of
subtracting pointers must be 1 in this case. But the numbers
that represent the pointers involved in the subtraction could
differ by 2, or by 4, or by 623.
Pathologically, yes, I suppose you're correct. At least, my instinct says
it's complete nonsense, but I don't see any conforming way to establish a
mapping between the object representation of a pointer and the actual
memory address that it represents. And so my clc experience contradicts my
instincts, and tells me that it's time to fold.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: normal service will be restored as soon as possible. Please do not
adjust your email clients.
Nov 8 '06 #37

P: n/a
Ben Pfaff <bl*@cs.stanford.edu> writes:
Richard Heathfield <in*****@invalid.invalid> writes:
>Ben Pfaff said:
>>Richard Heathfield <in*****@invalid.invalid> writes:

Old Wolf said:

I don't see why you can't have 16-bit bytes, at addresses 0, 2, 4, ...

16-bit bytes are not a problem, but of course they will be at addresses
0, 1, 2, 3...

I don't think that a strictly conforming program could tell the
difference. The representation of pointers doesn't have to be
flat, and there's no reason that subtraction of pointers to char
can't divide by 2. (Is there?)

If you have a pointer to the first character in a string, and a pointer to
the second character in the same string, those pointers are required to
differ by 1 (because arrays are stored contiguously, and sizeof(char) is
guaranteed to be 1).

We may be talking past one another. I agree that the result of
subtracting pointers must be 1 in this case. But the numbers
that represent the pointers involved in the subtraction could
differ by 2, or by 4, or by 623.
[...]

What "numbers that represent the pointers" are you referring to?
Pointers aren't necessarily represented as numbers, and the results of
conversions between pointer and integer types are
implementation-defined.
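
A minimal sketch of what can and cannot be relied on, assuming uintptr_t is available (it is optional in C99). The function name is made up for the example.

```c
#include <stdint.h>

/* Round-trip a pointer through an integer type. */
int round_trip_preserves(void *p)
{
    uintptr_t n = (uintptr_t)p;  /* the value of n is implementation-defined */
    void *q = (void *)n;         /* converting back must compare equal to p */
    return q == p;
}
```

The round trip is guaranteed for a valid void pointer; what the intermediate integer looks like, including whether it is odd or even, is not.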

But see my discussion of "alignment" elsethread.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 8 '06 #38

P: n/a
Keith Thompson <ks***@mib.org> writes:
Ben Pfaff <bl*@cs.stanford.edu> writes:
>Richard Heathfield <in*****@invalid.invalid> writes:
>>Ben Pfaff said:

Richard Heathfield <in*****@invalid.invalid> writes:

Old Wolf said:
>
>I don't see why you can't have 16-bit bytes, at addresses 0, 2, 4, ...
>
16-bit bytes are not a problem, but of course they will be at addresses
0, 1, 2, 3...

I don't think that a strictly conforming program could tell the
difference. The representation of pointers doesn't have to be
flat, and there's no reason that subtraction of pointers to char
can't divide by 2. (Is there?)

If you have a pointer to the first character in a string, and a pointer to
the second character in the same string, those pointers are required to
differ by 1 (because arrays are stored contiguously, and sizeof(char) is
guaranteed to be 1).

We may be talking past one another. I agree that the result of
subtracting pointers must be 1 in this case. But the numbers
that represent the pointers involved in the subtraction could
differ by 2, or by 4, or by 623.
[...]

What "numbers that represent the pointers" are you referring to?
Pointers aren't necessarily represented as numbers, and the results of
conversions between pointer and integer types are
implementation-defined.
Every byte of memory represents a number. Because a pointer is
made out of bytes, it can also be said to be represented by a
number, formed by concatenating bits. For many implementations,
this number is meaningful.
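
A short sketch of reading those bytes, using only memcpy and unsigned char. The function name and its buffer convention are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Copy the object representation of a pointer into out.
   Returns the number of bytes written, or 0 if out is too small. */
size_t pointer_bytes(const void *p, unsigned char *out, size_t out_size)
{
    if (out_size < sizeof p)
        return 0;
    /* Any object may be inspected as an array of unsigned char. */
    memcpy(out, &p, sizeof p);
    return sizeof p;
}
```

A caller can print the buffer in hex to see the representation. Which byte is most significant, and whether the bytes form a flat address at all, is up to the implementation.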
--
"What is appropriate for the master is not appropriate for the novice.
You must understand the Tao before transcending structure."
--The Tao of Programming
Nov 9 '06 #39

P: n/a
Ben Pfaff <bl*@cs.stanford.edu> writes:
Keith Thompson <ks***@mib.org> writes:
[...]
>What "numbers that represent the pointers" are you referring to?
Pointers aren't necessarily represented as numbers, and the results of
conversions between pointer and integer types are
implementation-defined.

Every byte of memory represents a number. Because a pointer is
made out of bytes, it can also be said to be represented by a
number, formed by concatenating bits. For many implementations,
this number is meaningful.
Ok, but it's not *necessarily* meaningful, and any meaning it might
have isn't what I meant when I referred to even and odd byte
addresses.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 9 '06 #40

P: n/a
Ben Pfaff <bl*@cs.stanford.edu> wrote:
Richard Heathfield <in*****@invalid.invalid> writes:
Ben Pfaff said:
Richard Heathfield <in*****@invalid.invalid> writes:

Old Wolf said:

I don't see why you can't have 16-bit bytes, at addresses 0, 2, 4, ...

16-bit bytes are not a problem, but of course they will be at addresses
0, 1, 2, 3...

I don't think that a strictly conforming program could tell the
difference. The representation of pointers doesn't have to be
flat, and there's no reason that subtraction of pointers to char
can't divide by 2. (Is there?)
If you have a pointer to the first character in a string, and a pointer to
the second character in the same string, those pointers are required to
differ by 1 (because arrays are stored contiguously, and sizeof(char) is
guaranteed to be 1).

We may be talking past one another. I agree that the result of
subtracting pointers must be 1 in this case. But the numbers
that represent the pointers involved in the subtraction could
differ by 2, or by 4, or by 623.
Those numbers can then only reside at a level that is completely
inaccessible to the ISO C program.
In an architecture where the concepts of "even
address" and "odd address" are meaningful, it seems to me that one or other
of those pointers *must* store an odd address.

I don't think the concepts of even and odd addresses would be
meaningful in the architecture that I'm envisioning.
Potentially, all pointers could be represented by odd numbers, or
by even numbers, or the bit with value 1 could be the parity of
the rest of the bits, or whatever.
Potentially, all objects could be represented by postits stuck to the
legs of carrier pigeons, and pointers by the name of the pigeon to which
the relevant postit is stuck. But to a C program, the difference between
one byte address and the next is still 1, and not "The bit of chicken
wire between Clara and Pete".

Richard
Nov 9 '06 #41

P: n/a
Jordan Abel <ra****@random.yi.org> wrote:
2006-11-07 <45*************@news.xs4all.nl>,
Richard Bos wrote:
"Harald van Dĳk" <tr*****@gmail.com> wrote:
Richard Bos wrote:
Similar things can be said about <ctype.h>, whose functions are
typically only used on text, not on unknown data that could be either
text or binary.

Well yeah, but the is* functions are never (counterexamples are
welcome) useful for binary data,
I can think of only one reason: a strings-like utility program. That's
system-level enough not to make this a problem.

How is strings system-level?
It's a program which one finds supplied by one's OS, and one may expect
people who write OS utility programs not to be fazed by oddities in
character sets or unusual integer sizes.
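
For readers who have not met it, the core of a strings-like utility can be sketched as follows. This is an illustration with an invented function name, not the code of any real strings implementation: it counts runs of printable characters in arbitrary binary data.

```c
#include <ctype.h>
#include <stddef.h>

/* Count runs of at least min_len printable characters in a byte buffer. */
size_t count_printable_runs(const unsigned char *data, size_t len,
                            size_t min_len)
{
    size_t runs = 0;
    size_t current = 0;  /* length of the run in progress */

    for (size_t i = 0; i < len; i++) {
        /* data[i] is an unsigned char value, so it is safe for isprint. */
        if (isprint(data[i])) {
            current++;
        } else {
            if (current > 0 && current >= min_len)
                runs++;
            current = 0;
        }
    }
    if (current > 0 && current >= min_len)
        runs++;  /* a run that reaches the end of the buffer */
    return runs;
}
```

Taking the data as unsigned char is the important detail: passing a plain char with a negative value to an is* function is undefined behaviour, and binary input is where such values turn up.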

Richard
Nov 9 '06 #42

P: n/a
rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
Ben Pfaff <bl*@cs.stanford.edu> wrote:
>Richard Heathfield <in*****@invalid.invalid> writes:
Ben Pfaff said:

Richard Heathfield <in*****@invalid.invalid> writes:

Old Wolf said:

I don't see why you can't have 16-bit bytes, at addresses 0, 2, 4, ...

16-bit bytes are not a problem, but of course they will be at addresses
0, 1, 2, 3...

I don't think that a strictly conforming program could tell the
difference. The representation of pointers doesn't have to be
flat, and there's no reason that subtraction of pointers to char
can't divide by 2. (Is there?)

If you have a pointer to the first character in a string, and a pointer to
the second character in the same string, those pointers are required to
differ by 1 (because arrays are stored contiguously, and sizeof(char) is
guaranteed to be 1).

We may be talking past one another. I agree that the result of
subtracting pointers must be 1 in this case. But the numbers
that represent the pointers involved in the subtraction could
differ by 2, or by 4, or by 623.

Those numbers can then only reside at a level that is completely
inaccessible to the ISO C program.
That is not true. ISO C programs are able to access the bytes
that comprise a pointer value.
--
int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.\
\n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
);while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
);}return 0;}
Nov 9 '06 #43

P: n/a
2006-11-09 <87************@blp.benpfaff.org>,
Ben Pfaff wrote:
rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
>Ben Pfaff <bl*@cs.stanford.edu> wrote:
>>Richard Heathfield <in*****@invalid.invalid> writes:

Ben Pfaff said:

Richard Heathfield <in*****@invalid.invalid> writes:

Old Wolf said:

I don't see why you can't have 16-bit bytes, at addresses 0, 2, 4, ...

16-bit bytes are not a problem, but of course they will be at addresses
0, 1, 2, 3...

I don't think that a strictly conforming program could tell the
difference. The representation of pointers doesn't have to be
flat, and there's no reason that subtraction of pointers to char
can't divide by 2. (Is there?)

If you have a pointer to the first character in a string, and a pointer to
the second character in the same string, those pointers are required to
differ by 1 (because arrays are stored contiguously, and sizeof(char) is
guaranteed to be 1).

We may be talking past one another. I agree that the result of
subtracting pointers must be 1 in this case. But the numbers
that represent the pointers involved in the subtraction could
differ by 2, or by 4, or by 623.

Those numbers can then only reside at a level that is completely
inaccessible to the ISO C program.

That is not true. ISO C programs are able to access the bytes
that comprise a pointer value.
Yes, but they can't tell what number or numbers said bytes represent.
Nov 9 '06 #44

P: n/a
Jordan Abel <ra****@random.yi.org> writes:
2006-11-09 <87************@blp.benpfaff.org>,
Ben Pfaff wrote:
>rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
>>Ben Pfaff <bl*@cs.stanford.edu> wrote:

Richard Heathfield <in*****@invalid.invalid> writes:

Ben Pfaff said:

Richard Heathfield <in*****@invalid.invalid> writes:

Old Wolf said:
>
>I don't see why you can't have 16-bit bytes, at addresses 0, 2, 4, ...
>
16-bit bytes are not a problem, but of course they will be at addresses
0, 1, 2, 3...

I don't think that a strictly conforming program could tell the
difference. The representation of pointers doesn't have to be
flat, and there's no reason that subtraction of pointers to char
can't divide by 2. (Is there?)

If you have a pointer to the first character in a string, and a pointer to
the second character in the same string, those pointers are required to
differ by 1 (because arrays are stored contiguously, and sizeof(char) is
guaranteed to be 1).

We may be talking past one another. I agree that the result of
subtracting pointers must be 1 in this case. But the numbers
that represent the pointers involved in the subtraction could
differ by 2, or by 4, or by 623.

Those numbers can then only reside at a level that is completely
inaccessible to the ISO C program.

That is not true. ISO C programs are able to access the bytes
that comprise a pointer value.

Yes, but they can't tell what number or numbers said bytes represent.
It's only relevant to talk about odd or even addresses, in my
opinion, if they do have some meaning.
--
"The fact that there is a holy war doesn't mean that one of the sides
doesn't suck - usually both do..."
--Alexander Viro
Nov 9 '06 #45

P: n/a
On 8 Nov 2006 14:25:26 -0800, in comp.lang.c , "Old Wolf"
<ol*****@inspire.net.nz> wrote:
>Keith Thompson wrote:
>>
A C implementation *must* allow char objects to be stored at odd byte
addresses. It can choose to align all single declared char objects,
or even char struct members, at even addresses if that makes access
easier or faster, but there can be no padding between array elements:

char arr[2];
/* either arr[0] or arr[1] is at an odd byte address */

If the hardware doesn't allow this (or makes it too expensive), then
the implementation can make bytes bigger than 8 bits.

How do you define "odd byte address" ?

I don't see why you can't have 16-bit bytes, at addresses 0, 2, 4, ...
The point is, a byte is defined in C as the smallest uniquely
addressable object. Whether you choose to number adjacent objects
0,1,2... or 0,2,4 or pi, e, tau,.. is entirely up to you, but the
spacing between them is still unity....
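
A quick sketch of that spacing, measured the only way a C program can measure it, through pointer arithmetic. The function names are invented for the example.

```c
#include <stddef.h>

/* Distance in bytes between two adjacent char elements. */
ptrdiff_t adjacent_char_spacing(void)
{
    char c[2] = {0, 0};
    return &c[1] - &c[0];
}

/* Distance in bytes between two adjacent int elements,
   measured by viewing the array through char pointers. */
ptrdiff_t adjacent_int_spacing(void)
{
    int a[2] = {0, 0};
    return (char *)&a[1] - (char *)&a[0];
}
```

The first is always 1 and the second is always sizeof(int), however the hardware chooses to label the storage underneath.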

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Nov 9 '06 #46
