Memory alignment

Why Tea

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;

Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)

some_struct_t *my_struct;
my_struct = malloc(sizeof(some_struct_t));

BTW, this is on a PowerPC architecture.

/Why Tea

Oct 3 '08 #1

Fred

On Oct 3, 10:47 am, Why Tea <ytl...@gmail.com> wrote:

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];

} some_struct_t;

Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)

some_struct_t *my_struct;
my_struct = malloc(sizeof(some_struct_t));

BTW, this is on a PowerPC architecture.

s will get one byte of space.
The struct itself may or may not get additional bytes or padding,
but writing to s[1] is an error.
--
Fred Kleinschmidt.

Oct 3 '08 #2

Why Tea

On Oct 3, 12:18 pm, Fred <fred.l.kleinschm...@boeing.com> wrote:

On Oct 3, 10:47 am, Why Tea <ytl...@gmail.com> wrote:

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];

} some_struct_t;

Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)

some_struct_t *my_struct;
my_struct = malloc(sizeof(some_struct_t));

BTW, this is on a PowerPC architecture.

s will get one byte of space.
The struct itself may or may not get additional bytes or padding,

Why not if alignment is done at 16 or 32 bit boundary?

but writing to s[1] is an error.

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

Oct 3 '08 #3

Antoninus Twink

On 3 Oct 2008 at 18:33, Why Tea wrote:

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

Yes, of course, if there are two or more padding bytes at the end of the
struct then that's your memory to write to, whether it's on the stack if
my_struct is an automatic variable, or on the heap if you got the memory
from malloc().

You should be aware that the regulars here aren't interested in your
wish for a pragmatic answer that's true in practise: they'll just
bombard you with hypothetical answers that are true in theory.

Oct 3 '08 #4

danmath06

On Oct 3, 2:47 pm, Why Tea <ytl...@gmail.com> wrote:

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];

} some_struct_t;

Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)

Each element in the structure has the size you specified in the
declaration; padding adds space between elements, or at the end, to
adjust alignment. s holds only one character.

For example, there might be 2 bytes of space between k and m so that m
starts on a 32-bit aligned address. You could avoid this by keeping
the integers together:

typedef struct some_struct
{
    int i;
    int m;
    short k;
    char s[1];

} some_struct_t;

This might result in only one extra byte at the end.

You can always use sizeof(struct some_struct) to know the size of the
structure and guess how many bytes were added as padding.
offsetof(struct some_struct, s) will return the offset of s in the
structure (you will need: #include <stddef.h>). You can use offsetof()
to find out the position of every element in the struct and determine
where the struct has been padded.
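
The sizeof/offsetof technique above can be sketched as follows. This is only an illustration: the name some_struct_reordered and the three helper functions are made up for the example, and the amounts of padding they report depend entirely on the compiler and target.

```c
#include <stddef.h>  /* offsetof, size_t */

/* Member order from the original question. */
struct some_struct {
    int   i;
    short k;
    int   m;
    char  s[1];
};

/* Reordered variant that keeps the two ints together. */
struct some_struct_reordered {
    int   i;
    int   m;
    short k;
    char  s[1];
};

/* Padding bytes inserted between k and m in the original layout. */
size_t gap_after_k(void)
{
    return offsetof(struct some_struct, m)
         - (offsetof(struct some_struct, k) + sizeof(short));
}

/* Padding bytes after s, at the end of the original struct. */
size_t tail_padding(void)
{
    return sizeof(struct some_struct)
         - (offsetof(struct some_struct, s) + sizeof(char[1]));
}

/* Padding bytes after s, at the end of the reordered struct. */
size_t tail_padding_reordered(void)
{
    return sizeof(struct some_struct_reordered)
         - (offsetof(struct some_struct_reordered, s) + sizeof(char[1]));
}
```

On a common target with 4-byte int and 2-byte short one would usually expect gap_after_k() to report 2, but that is an observation about typical ABIs, not something the language promises. Whatever the helpers report, s itself is still exactly one byte.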

Oct 3 '08 #5

danmath06

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

It will, but if you add a new element at the end of the structure later
on it might not; you might overwrite the next element.

Why would you want to declare a 1 char array to store 2 anyway?

Oct 3 '08 #6

Why Tea

On Oct 3, 12:48 pm, danmat...@gmail.com wrote:

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

It will, but if you add a new element at the end of the structure later
on it might not; you might overwrite the next element.

Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of the memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

The same my_struct_t is used throughout the code for
signal sending. If s[] is used to carry binary data, the
size is specified by an int preceding s[]. I'd be
interested to hear comments from the experts about
this approach.

Oct 3 '08 #7

Ian Collins

Why Tea wrote:

On Oct 3, 12:48 pm, danmat...@gmail.com wrote:

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

It will, but if you add a new element at the end of the structure later
on it might not; you might overwrite the next element.

Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of the memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

The same my_struct_t is used throughout the code for
signal sending. If s[] is used to carry binary data, the
size is specified by an int preceding s[]. I'd be
interested to hear comments from the experts about
this approach.

There isn't much to say, it's known as the "struct hack" and fairly
common. There's probably some decent reference on the web if you google
for it.

--
Ian Collins.

Oct 3 '08 #8

Eric Sosman

Why Tea wrote:

On Oct 3, 12:48 pm, danmat...@gmail.com wrote:

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

It will, but if you add a new element at the end of the structure later
on it might not; you might overwrite the next element.

Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of the memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

The same my_struct_t is used throughout the code for
signal sending. If s[] is used to carry binary data, the
size is specified by an int preceding s[]. I'd be
interested to hear comments from the experts about
this approach.

You're full of questions, Why Tea, and that's a good thing.
But has it occurred to you that other people have asked some of
these same questions? Has it occurred to you to look for a FAQ
at some likely-sounding site like, oh, <http://www.c-faq.com/>?
If you get lucky and find some useful material at such a site,
I'd suggest not limiting yourself only to reading Question 2.6
(for instance), but perusing some of the others as well.

--
Er*********@sun.com

Oct 3 '08 #9

Ben Bacarisse

Why Tea <yt****@gmail.com> writes:

On Oct 3, 12:48 pm, danmat...@gmail.com wrote:

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

It will, but if you add a new element at the end of the structure later
on it might not; you might overwrite the next element.

Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of the memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

This is called the "struct hack". It has been formalised in C99 so if
you can use C99 then all will be well.

The same my_struct_t is used throughout the code for
signal sending. If s[] is used to carry binary data, the
size is specified by an int preceding s[]. I'd be
interested to hear comments from the experts about
this approach.

It is considered to be "a bit dodgy" (that is the technical term) but
it generally works. I am not sure there is really much more to say
about it though I get the feeling I will be proved very much wrong
about that!

--
Ben.

Oct 3 '08 #10

Lowell Gilbert

Why Tea <yt****@gmail.com> writes:

On Oct 3, 12:48 pm, danmat...@gmail.com wrote:

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

It will, but if you add a new element at the end of the structure later
on it might not; you might overwrite the next element.

Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of the memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

or even more likely, something more like
my_struct = malloc(sizeof(my_struct_t) + strlen(my->struct->s));

The same my_struct_t is used throughout the code for
signal sending. If s[] is used to carry binary data, the
size is specified by an int preceding s[]. I'd be
interested to hear comments from the experts about
this approach.

But that's different, because you allocated more space. Even without
worrying about the alignments, you *know* that the structure is big
enough for the data you're putting into it. There's no reason to play
around with the padding; just allocate the space you need.

--
Lowell Gilbert, embedded/networking software engineer
http://be-well.ilk.org/~lowell/

Oct 3 '08 #11

Eric Sosman

Why Tea wrote:

On Oct 3, 12:18 pm, Fred <fred.l.kleinschm...@boeing.com> wrote:
On Oct 3, 10:47 am, Why Tea <ytl...@gmail.com> wrote:

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;

Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)

some_struct_t *my_struct;
my_struct = malloc(sizeof(some_struct_t));

BTW, this is on a PowerPC architecture.

s will get one byte of space.
The struct itself may or may not get additional bytes or padding,

Why not if alignment is done at 16 or 32 bit boundary?

Because you don't know for sure what the alignment
requirement is.

but writing to s[1] is an error.

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

It's not OK at all: It tries to cram three bytes into a
one-byte space. If the struct has two or more padding bytes
at the end you might get away with it. If the struct has no
padding you might get away with it if it just happens that
there's nothing important right after the struct. The "What?
Me worry?" school of programming has no lack of adherents;
you can read all about them in CERT bulletins.
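
If the intent really is to store a string in the array as declared, a copy that checks the size first avoids relying on padding at all. A minimal sketch, in which set_s_checked is a hypothetical helper name and the struct is the one from the start of the thread:

```c
#include <stddef.h>  /* size_t */
#include <string.h>  /* strlen, memcpy */

typedef struct some_struct {
    int   i;
    short k;
    int   m;
    char  s[1];
} some_struct_t;

/*
 * Copy src into dst->s only if it fits, terminating '\0' included.
 * Returns 1 on success and 0 if the copy would overflow the array.
 */
int set_s_checked(some_struct_t *dst, const char *src)
{
    size_t need = strlen(src) + 1;   /* +1 for the terminating '\0' */

    if (need > sizeof dst->s)
        return 0;                    /* would write past the end of s */
    memcpy(dst->s, src, need);
    return 1;
}
```

With s declared as char s[1], only the empty string fits; "AB" needs three bytes and is rejected rather than spilled into whatever follows s.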

When you go home tonight, get yourself a cane or a medium-
sized tree branch or an umbrella or something of that sort.
Close your eyes before you walk through the door, and keep
them closed while walking several paces into the room. Then
swing the cane or whatever vigorously about yourself, striking
in every direction, including overhead. It's possible that
you won't smash anything ...

--
Er*********@sun.com

Oct 3 '08 #12

Why Tea

On Oct 3, 1:47 pm, Eric Sosman <Eric.Sos...@sun.com> wrote:

Why Tea wrote:
On Oct 3, 12:48 pm, danmat...@gmail.com wrote:
But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

It will, but if you add a new element at the end of the structure later
on it might not; you might overwrite the next element.

Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of the memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

The same my_struct_t is used throughout the code for
signal sending. If s[] is used to carry binary data, the
size is specified by an int preceding s[]. I'd be
interested to hear comments from the experts about
this approach.

You're full of questions, Why Tea, and that's a good thing.
But has it occurred to you that other people have asked some of
these same questions? Has it occurred to you to look for a FAQ
at some likely-sounding site like, oh, <http://www.c-faq.com/>?
If you get lucky and find some useful material at such a site,
I'd suggest not limiting yourself only to reading Question 2.6
(for instance), but perusing some of the others as well.

--
Eric.Sos...@sun.com

Eric, thanks for the gentle nudge. I did google, but I didn't
know it was called "struct hack". It's hard to find the
right thing if you don't know the right term. Thanks to all
who took the time to reply.

Oct 3 '08 #13

jameskuyper

Why Tea wrote:

On Oct 3, 12:48 pm, danmat...@gmail.com wrote:

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

It will, but if you add a new element at the end of the structure later
on it might not; you might overwrite the next element.

Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of the memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

The same my_struct_t is used throughout the code for
signal sending. If s[] is used to carry binary data, the
size is specified by an int preceding s[]. I'd be
interested to hear comments from the experts about
this approach.

In C90, this was technically illegal, but as a practical matter it
worked on most (all?) implementations. It was sufficiently useful that
a modified version of the concept was added as an extension to many
compilers, but with the difference that the array was declared with a
size of 0, rather than 1. In C99, a modified version of the concept
was finally standardized, under the name "flexible array members". For
the C99 version, you should use:

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[];
} some_struct_t;

With all three versions of this concept, it's required that the array
be declared at the end of the structure. The best way to handle the
allocation is as follows:

my_struct = malloc(offsetof(some_struct_t, s) +
    MY_PAYLOAD_STRING_SIZE*sizeof my_struct->s[0]);

Using offsetof() rather than sizeof() makes a difference for the C90
version, because the size of the struct includes enough room for one
element of the array, whereas offsetof() does not. That means that
with sizeof you'd be reserving room for at least 1 more array element
than you need to (unless MY_PAYLOAD_STRING_SIZE does not include the
terminating null character). For C99, sizeof(some_struct_t) and
offsetof(some_struct_t, s) typically give the same result for this
struct, although in general sizeof may be larger because of trailing
padding.
Using sizeof my_struct->s[0] protects against the possibility that you
might change the element type of s. It also makes it easier to
verify that the allocation is correct. If the length of the flexible array
is stored in the struct, as is usually the case, I'd recommend filling
in that member, and using the value of that member instead of
MY_PAYLOAD_STRING_SIZE. Again, the main advantage of this is that it
makes it easier for a reader to verify that the code is correct.
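
Putting these recommendations together for the C99 form might look like the sketch below. The type name message_t, its len member and the function message_new are assumptions made for the example; they are not from the code being discussed. The sketch also uses sizeof rather than offsetof() for the base size, which may allocate slightly more than strictly needed.

```c
#include <stdint.h>  /* SIZE_MAX */
#include <stdlib.h>  /* malloc */
#include <string.h>  /* memcpy */

/* Hypothetical message type; the flexible array member must come last. */
typedef struct message {
    int    i;
    short  k;
    size_t len;   /* number of payload bytes stored in s */
    char   s[];   /* C99 flexible array member */
} message_t;

/*
 * Allocate a message carrying a copy of len bytes from payload.
 * Returns NULL on failure; the caller releases it with free().
 */
message_t *message_new(const void *payload, size_t len)
{
    message_t *msg;

    if (len > SIZE_MAX - sizeof *msg)
        return NULL;                  /* size computation would overflow */

    msg = malloc(sizeof *msg + len * sizeof msg->s[0]);
    if (msg == NULL)
        return NULL;

    msg->i = 0;
    msg->k = 0;
    msg->len = len;                   /* record the length in the struct */
    if (len > 0)
        memcpy(msg->s, payload, len);
    return msg;
}
```

A caller storing a string would pass strlen(text) + 1 as len so that the terminator travels with the payload.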

Oct 3 '08 #14

Keith Thompson

Why Tea <yt****@gmail.com> writes:

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;

Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)

some_struct_t *my_struct;
my_struct = malloc(sizeof(some_struct_t));

BTW, this is on a PowerPC architecture.

You can assume that s itself will be allocated exactly 1 byte. Any
padding following it is part of the struct, but not part of s.

In general, compilers are allowed to insert arbitrary padding between
members or after the last member of a struct. The purpose of this
padding is generally to meet alignment requirements, but the standard
doesn't place any restrictions on how little or how much padding is
used, as long as the members can be accessed.

So no, you can't portably make any assumptions about how much padding
is added after s. But you generally shouldn't need to. If you'll let
us know what you're trying to do, we can probably help you do it
*without* making any (or too many) non-portable assumptions. If
you're trying to use the "struct hack", see question 2.6 in the
comp.lang.c FAQ, <http://c-faq.com/>.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Oct 3 '08 #15

Eric Sosman

Why Tea wrote:

On Oct 3, 1:47 pm, Eric Sosman <Eric.Sos...@sun.com> wrote:
[...]
I'd suggest not limiting yourself only to reading Question 2.6
(for instance), but perusing some of the others as well.

Eric, thanks for the gentle nudge. I did google, but I didn't
know it was called "struct hack". It's hard to find the
right thing if you don't know the right term. [...]

... which is why I suggest browsing the rest of the FAQ, too. No
need to memorize it, but get an idea of what's in it and what kind
of terminology describes it -- then next time you're puzzled you'll
have a better chance of finding an answer, or at least have the
vocabulary that lets you frame your question better.

--
Er*********@sun.com

Oct 3 '08 #16

Keith Thompson

Why Tea <yt****@gmail.com> writes:

On Oct 3, 1:47 pm, Eric Sosman <Eric.Sos...@sun.com> wrote:

[...]

You're full of questions, Why Tea, and that's a good thing.
But has it occurred to you that other people have asked some of
these same questions? Has it occurred to you to look for a FAQ
at some likely-sounding site like, oh, <http://www.c-faq.com/>?
If you get lucky and find some useful material at such a site,
I'd suggest not limiting yourself only to reading Question 2.6
(for instance), but perusing some of the others as well.

Eric, thanks for the gentle nudge. I did google, but I didn't
know it was called "struct hack". It's hard to find the
right thing if you don't know the right term. Thanks to all
who took the time to reply.

Fair enough. I *know* it's called the "struct hack", and I've read
question 2.6 before, but I had trouble finding it myself, since the
answer to 2.6 doesn't use the phrase "struct hack". I happened to
remember that the URL includes the string "structhack", so I was able
to use a Google advanced search to find it.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Oct 3 '08 #17

Keith Thompson

Lowell Gilbert <lg******@be-well.ilk.org> writes:

Why Tea <yt****@gmail.com> writes:
On Oct 3, 12:48 pm, danmat...@gmail.com wrote:

[...]

Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of the memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

or even more likely, something more like
my_struct = malloc(sizeof(my_struct_t) + strlen(my->struct->s));

+ 1. The length returned by strlen() doesn't include the terminating '\0'.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Oct 3 '08 #18

Keith Thompson

ja*********@verizon.net writes:
[...]

In C90, this was technically illegal, but as a practical matter it
worked on most (all?) implementations. It was sufficiently useful that
a modified version of the concept was added as an extension to many
compilers, but with the difference that the array was declared with a
size of 0, rather than 1. In C99, a modified version of the concept
was finally standardized, under the name "flexible array members". For
the C99 version, you should use:

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[];
} some_struct_t;

With all three versions of this concept, it's required that the array
be declared at the end of the structure. The best way to handle the
allocation is as follows:

my_struct = malloc(offsetof(some_struct_t, s) +
    MY_PAYLOAD_STRING_SIZE*sizeof my_struct->s[0]);

Using offsetof() rather than sizeof() makes a difference for the C90
version, because the size of the struct includes enough room for one
element of the array, whereas offsetof() does not.

[...]

I'm not convinced that's safe. If the string is very short, you might
allocate fewer than sizeof(some_struct_t) bytes. That could, in some
circumstances, result in accessing memory that's within range of a
some_struct_t object, but outside the range of what was actually
allocated.

On the other hand, if you're doing this then you're not going to be
accessing the some_struct_t object as a whole. On the other other
hand, the compiler might be allowed to generate code that does so.
It's probably not going to cause any visible problems in practice,
especially since the memory allocated by malloc() will probably be big
enough to contain a my_struct_t object anyway, but it makes me
nervous. I'd rather risk allocating a byte or so too much than too
little, unless memory space is *really* critical and I'm willing to be
unwarrantedly chummy with the compiler (paraphrasing Dennis Ritchie).
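
One way to keep the offsetof() computation while addressing this worry is never to request less than sizeof(some_struct_t). A minimal sketch, assuming the C90-style struct from the start of the thread; the function name some_struct_alloc_size is invented for the example:

```c
#include <stddef.h>  /* offsetof, size_t */

typedef struct some_struct {
    int   i;
    short k;
    int   m;
    char  s[1];   /* C90 "struct hack" placeholder */
} some_struct_t;

/*
 * Number of bytes to request for an object that carries n bytes in s.
 * The result is never smaller than sizeof(some_struct_t), so the whole
 * declared struct always lies inside the allocation.
 */
size_t some_struct_alloc_size(size_t n)
{
    size_t exact = offsetof(some_struct_t, s) + n * sizeof(char);

    return exact < sizeof(some_struct_t) ? sizeof(some_struct_t) : exact;
}
```

For short payloads this costs at most the few bytes of trailing padding; for longer ones it yields the same size as the plain offsetof() formula.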

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Oct 3 '08 #19

Why Tea

On Oct 3, 2:10 pm, Keith Thompson <ks...@mib.org> wrote:

Why Tea <ytl...@gmail.com> writes:
typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;

Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)

some_struct_t *my_struct;
my_struct = malloc(sizeof(some_struct_t));

BTW, this is on a PowerPC architecture.

You can assume that s itself will be allocated exactly 1 byte. Any
padding following it is part of the struct, but not part of s.

In general, compilers are allowed to insert arbitrary padding between
members or after the last member of a struct. The purpose of this
padding is generally to meet alignment requirements, but the standard
doesn't place any restrictions on how little or how much padding is
used, as long as the members can be accessed.

So no, you can't portably make any assumptions about how much padding
is added after s. But you generally shouldn't need to. If you'll let
us know what you're trying to do, we can probably help you do it
*without* making any (or too many) non-portable assumptions. If
you're trying to use the "struct hack", see question 2.6 in the
comp.lang.c FAQ, <http://c-faq.com/>.

Thanks Keith. I'll try the c-faq first next time. I just had
a look at 2.6 of c-faq, I'm surprised that what's written there
was exactly a discussion I had with a colleague. Taking the
code from the faq:

#include <stdlib.h>
#include <string.h>

#define MAXSIZE 100

struct name {
    int namelen;
    char namestr[MAXSIZE];
};

struct name *makename(char *newname)
{
    struct name *ret =
        malloc(sizeof(struct name)-MAXSIZE+strlen(newname)+1);
                /* +1 for \0 */
    if(ret != NULL) {
        ret->namelen = strlen(newname);
        strcpy(ret->namestr, newname);
    }

    return ret;
}

The argument from my colleague was about the extra
byte (+1 for \0), why is it needed as padding is
always there? He assumed 32-bit alignment and
MAXSIZE=1 in our discussion. Is his argument always
right?

Oct 3 '08 #20

Eric Sosman

Keith Thompson wrote:

Lowell Gilbert <lg******@be-well.ilk.org> writes:
Why Tea <yt****@gmail.com> writes:
On Oct 3, 12:48 pm, danmat...@gmail.com wrote:

[...]

Why would you want to declare a 1 char array to store 2 anyway?
Good question. This is found in some real embedded
code to make more efficient use of the memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);
or even more likely, something more like
my_struct = malloc(sizeof(my_struct_t) + strlen(my->struct->s));

+ 1. The length returned by strlen() doesn't include the terminating '\0'.

The struct has `char s[1]' as its last element[*], and the
size of that element is already included in sizeof(my_struct_t).
[*] Or so I assume. We started with a some_struct_t, but
morphed to my_struct_t somewhere in mid-discussion without ever
seeing the type defined. Given the reference to "the last s[1]"
I think we can count on it still being present, though.
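
To make the arithmetic concrete, here is a sketch of the allocation under discussion. The definition of my_struct_t is an assumption (the thread never shows it) and simply mirrors some_struct_t; the function name and the use of m as a length field are likewise invented for illustration. It relies on the struct hack itself, which strictly speaking is not guaranteed by the standard for an array declared with one element.

```c
#include <stdlib.h>  /* malloc */
#include <string.h>  /* strlen, memcpy */

/* Assumed layout of my_struct_t, mirroring some_struct_t. */
typedef struct my_struct {
    int   i;
    short k;
    int   m;      /* used here as the string length (an assumption) */
    char  s[1];   /* struct-hack placeholder, already 1 byte long */
} my_struct_t;

/* Allocate a my_struct_t holding a copy of text; NULL on failure. */
my_struct_t *my_struct_from_string(const char *text)
{
    size_t len = strlen(text);
    /* sizeof(my_struct_t) already counts s[0], which covers the '\0'. */
    my_struct_t *p = malloc(sizeof(my_struct_t) + len);

    if (p == NULL)
        return NULL;
    p->i = 0;
    p->k = 0;
    p->m = (int)len;
    memcpy(p->s, text, len + 1);     /* copies the terminator as well */
    return p;
}
```

So with char s[1] the extra + 1 is not needed for correctness, though adding it costs a single byte and does no harm.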

--
Er*********@sun.com

Oct 3 '08 #21

CBFalconer

Why Tea wrote:

Fred <fred.l.kleinschm...@boeing.com> wrote:
Why Tea <ytl...@gmail.com> wrote:

typedef struct some_struct {
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;

... snip ...

but writing to s[1] is an error.

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

Because you declared s to hold one byte. You have no idea what the
system is doing with any extra memory that it had to supply, or
even if it had to supply anything extra.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

Oct 3 '08 #22

lawrence.jones

Antoninus Twink <no****@nospam.invalid> wrote:

Yes, of course, if there are two or more padding bytes at the end of the
struct then that's your memory to write to, whether it's on the stack if
my_struct is an automatic variable, or on the heap if you got the memory
from malloc().

No, the padding bytes are most definitely *not* yours to write to.
Although most implementations will let you get away with it, there are
implementations that do careful memory bounds checking and won't.
--
Larry Jones

My life needs a rewind/erase button. -- Calvin

Oct 3 '08 #23

Ben Pfaff

la************@siemens.com writes:

Antoninus Twink <no****@nospam.invalid> wrote:

Yes, of course, if there are two or more padding bytes at the end of the
struct then that's your memory to write to, whether it's on the stack if
my_struct is an automatic variable, or on the heap if you got the memory
from malloc().

No, the padding bytes are most definitely *not* yours to write to.

Really?

struct { int a; } b;
memset(&b, 0, sizeof b); /* Bang? */
--
char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long b[]
={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
=b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
2:{i++;if(i)break;else default:continue;if(0)case 1:putchar(a[i&15]);break;}}}

Oct 3 '08 #24

Eric Sosman

la************@siemens.com wrote:

Antoninus Twink <no****@nospam.invalid> wrote:

Yes, of course, if there are two or more padding bytes at the end of the
struct then that's your memory to write to, whether it's on the stack if
my_struct is an automatic variable, or on the heap if you got the memory
from malloc().

No, the padding bytes are most definitely *not* yours to write to.
Although most implementations will let you get away with it, there are
implementations that do careful memory bounds checking and won't.

There's also the issue of copying one such struct to another
by assignment, or passing a struct instance as a function argument
or returning one as a function value. The implementation is not
obliged to preserve the values of padding bytes during these
operations.
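
The same concern applies to data stored beyond the declared members with the struct hack or a flexible array member: plain assignment is only required to carry the declared members. A sketch of copying such an object explicitly, assuming a hypothetical packet_t type with a C99 flexible array member (none of these names come from the thread):

```c
#include <stdlib.h>  /* malloc, size_t */
#include <string.h>  /* memcpy */

/* Hypothetical message type using a C99 flexible array member. */
typedef struct packet {
    size_t len;   /* number of bytes stored in s */
    char   s[];   /* payload, allocated together with the header */
} packet_t;

/*
 * Duplicate a packet, payload included. A plain "*copy = *src" is only
 * required to transfer the declared members, so the payload (like any
 * padding) is not guaranteed to come along.
 */
packet_t *packet_clone(const packet_t *src)
{
    size_t total = sizeof *src + src->len;
    packet_t *copy = malloc(total);

    if (copy != NULL)
        memcpy(copy, src, total);    /* header and payload in one go */
    return copy;
}
```

The design choice is simply to copy by the allocated size recorded in the object rather than by the size of the type.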

--
Er*********@sun.com

Oct 3 '08 #25

Keith Thompson

Why Tea <yt****@gmail.com> writes:

On Oct 3, 2:10 pm, Keith Thompson <ks...@mib.org> wrote:
Why Tea <ytl...@gmail.com> writes:
typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;

Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)

some_struct_t *my_struct;
my_struct = malloc(sizeof(some_struct_t));

BTW, this is on a PowerPC architecture.

You can assume that s itself will be allocated exactly 1 byte. Any
padding following it is part of the struct, but not part of s.

In general, compilers are allowed to insert arbitrary padding between
members or after the last member of a struct. The purpose of this
padding is generally to meet alignment requirements, but the standard
doesn't place any restrictions on how little or how much padding is
used, as long as the members can be accessed.

So no, you can't portably make any assumptions about how much padding
is added after s. But you generally shouldn't need to. If you'll let
us know what you're trying to do, we can probably help you do it
*without* making any (or too many) non-portable assumptions. If
you're trying to use the "struct hack", see question 2.6 in the
comp.lang.c FAQ, <http://c-faq.com/>.

Thanks Keith. I'll try the c-faq first next time. I just had
a look at 2.6 of c-faq, I'm surprised that what's written there
was exactly a discussion I had with a colleague. Taking the
code from the faq:

#include <stdlib.h>
#include <string.h>

#define MAXSIZE 100

struct name {
    int namelen;
    char namestr[MAXSIZE];
};

struct name *makename(char *newname)
{
    struct name *ret =
        malloc(sizeof(struct name)-MAXSIZE+strlen(newname)+1);
                /* +1 for \0 */
    if(ret != NULL) {
        ret->namelen = strlen(newname);
        strcpy(ret->namestr, newname);
    }

    return ret;
}

The argument from my colleague was about the extra
byte (+1 for \0), why is it needed as padding is
always there? He assumed 32-bit alignment and
MAXSIZE=1 in our discussion. Is his argument always
right?

It may happen to be true that it will work on all implementations. A
compiler will *probably* add enough padding to the end of the struct
to make it work. Even if not, a runtime library's malloc()
implementation will *probably* add enough padding to the allocated
space. And even if not, you're likely (but by no means certain) to be
able to get away with accessing memory just one byte beyond what's
been allocated, depending on what happens to be there. Finally, most
C implementations won't go out of their way to prevent you from
accessing memory beyond the declared bounds of an array (and if yours
does, you can't use the struct hack in the first place).

But the line of reasoning that lets you assume that you don't need to
explicitly allocate enough space for the terminating '\0' is
convoluted, weak, and system-specific. If I were writing the code,
I'd just allocate the byte and be done with it. Otherwise, every time
a problem shows up, I'd have to spend time confirming that the failure
to allocate that byte isn't the cause. It doesn't cost much to do it
right.
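
A minimal sketch of "just allocate the byte", assuming the FAQ's struct with a one-element placeholder array. The helper name_alloc_size is a hypothetical name, not something from the FAQ. It counts the '\0' explicitly and never allocates less than sizeof(struct name), so no guess about trailing padding enters the arithmetic. The struct hack itself is still not strictly conforming, as the FAQ notes.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct name {
    int namelen;
    char namestr[1];   /* placeholder; the real storage is allocated below */
};

/* Hypothetical helper: bytes to request for a name of length len,
   counting the terminating '\0' explicitly. */
static size_t name_alloc_size(size_t len)
{
    size_t need = offsetof(struct name, namestr) + len + 1;
    /* Never request less than a whole struct name. */
    return need < sizeof(struct name) ? sizeof(struct name) : need;
}

struct name *makename(const char *newname)
{
    size_t len = strlen(newname);
    struct name *ret = malloc(name_alloc_size(len));
    if (ret != NULL) {
        ret->namelen = (int)len;
        memcpy(ret->namestr, newname, len + 1);  /* copies the '\0' too */
    }
    return ret;
}
```

Whether the implementation happens to pad the struct no longer matters to the caller: the byte for the terminator is requested either way.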

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Oct 3 '08 #26

CBFalconer

la************@siemens.com wrote:

Antoninus Twink <no****@nospam.invalid> wrote:

>Yes, of course, if there are two or more padding bytes at the end
of the struct then that's your memory to write to, whether it's
on the stack if my_struct is an automatic variable, or on the heap
if you got the memory from malloc().

No, the padding bytes are most definitely *not* yours to write to.
Although most implementations will let you get away with it, there
are implementations that do careful memory bounds checking and won't.

Bear in mind that Twink is a troll, whose objective is to disrupt
this newsgroup.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

Oct 3 '08 #27

Keith Thompson

Eric Sosman <Er*********@sun.com> writes:

Keith Thompson wrote:
>Lowell Gilbert <lg******@be-well.ilk.org> writes:
>>Why Tea <yt****@gmail.com> writes:
On Oct 3, 12:48 pm, danmat...@gmail.com wrote:
[...]
>>>>Why would you want to declare a 1 char array to store 2 anyway?
Good question. This is found in some real embedded
code to make more efficient use of the memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);
or even more likely, something more like
my_struct = malloc(sizeof(my_struct_t) + strlen(my->struct->s));
+ 1. The length returned by strlen() doesn't include the
terminating '\0'.

The struct has `char s[1]' as its last element[*], and the
size of that element is already included in sizeof(my_struct_t).

So it does.

Seeing strlen() in a computation of how much memory to allocate sets
off my alarm bells. If I were going to write something like that in
real code, it would be heavily commented.

[...]

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Oct 4 '08 #28

Why Tea

On Oct 4, 4:40 am, Antoninus Twink <nos...@nospam.invalid> wrote:

On 3 Oct 2008 at 18:33, Why Tea wrote:

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

Yes, of course, if there are two or more padding bytes at the end of the
struct then that's your memory to write to, whether it's on the stack if
my_struct is an automatic variable, or on the heap if you got the memory
from malloc().

You should be aware that the regulars here aren't interested in your
wish for a pragmatic answer that's true in practise: they'll just
bombard you with hypothetical answers that are true in theory.

Thanks Antoninus. I appreciate your answer. Eric S. is very
knowledgeable and I respect that. But to say "It's not OK at all"
categorically without considering cases when it COULD be OK
can give a wrong impression of the problem.

For example, if there is a system crash in an embedded system and
the dump shows it was due to a wild pointer. When you see a piece
of code that uses the struct hack (a term I just learned) and it
forgets to include the '\0' in the size allocated, you can't really
swear by the bible that it causes the crash as it over steps the
memory allocated with strcpy. I understood it's bad and dangerous
programming, but the question is can we be 100% sure that an
existing code like that causes memory corruption? Based on the
little I know, I don't think it's a simple answer of a YES or NO.
That's why I think Antoninus has given a more accurate answer.

can't swear by the bible
"It's not OK at all"

Oct 4 '08 #29

Keith Thompson

Why Tea <yt****@gmail.com> writes:

On Oct 4, 4:40 am, Antoninus Twink <nos...@nospam.invalid> wrote:

[snip]

Thanks Antoninus. I appreciate your answer. Eric S. is very
knowledgeable and I respect that. But to say "It's not OK at all"
categorically without considering cases when it COULD be OK
can give a wrong impression of the problem.

For example, if there is a system crash in an embedded system and
the dump shows it was due to a wild pointer. When you see a piece
of code that uses the struct hack (a term I just learned) and it
forgets to include the '\0' in the size allocated, you can't really
swear by the bible that it causes the crash as it over steps the
memory allocated with strcpy. I understood it's bad and dangerous
programming, but the question is can we be 100% sure that an
existing code like that causes memory corruption? Based on the
little I know, I don't think it's a simple answer of a YES or NO.
That's why I think Antoninus has given a more accurate answer.

can't swear by the bible
"It's not OK at all"

Um, did you read my answer, in which I specifically acknowledged that
you can probably get away with accessing padding bytes but also
explained why it's a bad idea to depend on it?

There is nothing "pragmatic" about encouraging you to assume that
there are a certain number of padding bytes at the end of a structure,
and that you can safely use them for whatever you want.

"Antoninus Twink" is a troll. Please do us all a favor and ignore
him.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Oct 4 '08 #30

Richard Heathfield

Keith Thompson said:

Why Tea <yt****@gmail.com> writes:

<snip>

>I understood it's bad and dangerous
programming, but the question is can we be 100% sure that an
existing code like that causes memory corruption? Based on the
little I know, I don't think it's a simple answer of a YES or NO.

It is a very simple, but two-fold, answer - YES, you can be sure that
writing into memory you don't own causes memory corruption; and NO, you
can't be sure that the effect of this corruption will always be
noticeable. When you write into memory you don't own, the behaviour of
your program is undefined, and the rules of C no longer apply - so
anything can happen, including (but by no means limited to) what you
expected to happen.

>That's why I think Antoninus has given a more accurate answer.

His answer is wrong. (See below.)

>can't swear by the bible
"It's not OK at all"

Um, did you read my answer, in which I specifically acknowledged that
you can probably get away with accessing padding bytes but also
explained why it's a bad idea to depend on it?

There is nothing "pragmatic" about encouraging you to assume that
there are a certain number of padding bytes at the end of a structure,
and that you can safely use them for whatever you want.

"Antoninus Twink" is a troll. Please do us all a favor and ignore
him.

It is clear from the above that at least some newbies are *not* ignoring
the technically incompetent answers provided by Mr Twink. Note that
"technically incompetent" and "trollish" are not the same thing. The
problem with Mr Twink (and it is not a problem that is unique to him) is
that he's *both*. The regular contributors to this group know full well
that he's a troll. He is in many killfiles (including mine). And therefore
he can often get away with spouting any old rubbish without being
challenged, and thus newbies can be misled into following his "advice" (as
appears to have happened in this case). Hanlon's Razor suggests that this
technical rubbish that Mr Twink posts is best explained by incompetence
rather than malice.

So the killfile "solution" is problematic, because it allows trolls like Mr
Twink the freedom to give stupid advice to up-lapping newbies with little
or no risk of being corrected. Unfortunately, the non-killfile "solution"
is also problematic, because it raises the overall temperature of the
group. If, for example, I were to remove Mr Twink from my killfile, I know
from experience that within a week there'd be a flame war several hundreds
of articles long.

The best solution would be for all the trolls to either:

(a) become C experts - of which there seems little or no hope; or
(b) stop posting C advice - of which there seems little or no
hope, although ISTR that Kenny McCormack, at least, has
the grace to realise that he doesn't know spit about C
and consequently limits his articles to obnoxiousness and
misconceived attempts at irony; or
(c) stop trolling - of which there seems to be no hope whatsoever.

Since none of these solutions is going to happen, we are left with "to
killfile or not to killfile", and - as we have seen - each has its
problems.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Oct 4 '08 #31

CBFalconer

Richard Heathfield wrote:

.... snip ...

The best solution would be for all the trolls to either:

(a) become C experts - of which there seems little or no hope; or
(b) stop posting C advice - of which there seems little or no
hope, although ISTR that Kenny McCormack, at least, has
the grace to realise that he doesn't know spit about C
and consequently limits his articles to obnoxiousness and
misconceived attempts at irony; or
(c) stop trolling - of which there seems to be no hope whatsoever.

Since none of these solutions is going to happen, we are left with
"to killfile or not to killfile", and - as we have seen - each has
its problems.

There is also:

(d) Await a quote from a reply to the troll, and reply to that.
That works adequately with a killfile.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

Oct 4 '08 #32

Antoninus Twink

On 4 Oct 2008 at 5:02, Keith Thompson wrote:

There is nothing "pragmatic" about encouraging you to assume that
there are a certain number of padding bytes at the end of a structure,
and that you can safely use them for whatever you want.

Why don't you get off your high horse for a minute and try to separate
the two issues in your mind?

The question you're addressing is: Is it OK to make certain padding
assumptions about structs, and based on those assumptions to write to
memory beyond the last field in the struct? The answer is, of course
that's not a good idea in general, and I'd never advise anyone to do it.

But that wasn't what the OP was asking. He was talking about a specific
compiler on a specific system, where he'd verified that there were some
number of padding bytes at the end of the struct. He asked, in that
specific situation, whether writing into those padding bytes could cause
his program to blow up?

The pragmatic answer is no, unless (as someone else pointed out) the
compiler comes with some elaborate bounds-checking "feature", a
possibility which for all practical purposes can be ignored, because
it's vanishingly unlikely.

struct foo *p = malloc(sizeof(struct foo));
char *q = (char *) p;
q[sizeof(struct foo) - 1]=0;

Are you seriously saying that there's any real-world system that will
blow up here if it happens that there's padding at the end of struct
foo?

Oct 4 '08 #33

Kenny McCormack

In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> got a little confused (as is the norm
for him) and wrote something else. What he meant to write was:
....

>"Antoninus Twink" is one of the few people here who tells the truth.
This pisses us off and so we call him (and others like him) a "troll".

Please do us all a favor (because this newsgroup is the only thing many
of us have that passes for a social life) and pretend that you too are
as dumb as we are.

Corrections done. No thanks are necessary (but cash is always accepted).

Oct 4 '08 #34

Kenny McCormack

In article <sl*******************@nospam.invalid>,
Antoninus Twink <no****@nospam.invalid> wrote:
....

>Are you seriously saying that there's any real-world system that will
blow up here if it happens that there's padding at the end of struct
foo?

Real world systems are OT here. Surely you know that by now.

Oct 4 '08 #35

Eric Sosman

Why Tea wrote:

[...]
#define MAXSIZE 100

struct name {
int namelen;
char namestr[MAXSIZE];
};

struct name *makename(char *newname)
{
struct name *ret =
malloc(sizeof(struct name)-MAXSIZE+strlen(newname)+1);
/* +1 for \0 */
[...]

The argument from my colleague was about the extra
byte (+1 for \0), why is it needed as padding is
always there? He assumed 32-bit alignment and
MAXSIZE=1 in our discussion. Is his argument always
right?

No, because his assumption of four-byte alignment is
not always right. Assuming that the Sun shines blue is
not a good way to reason about the color of the sky.
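
Rather than assuming an alignment, the layout can be measured. The sketch below uses the struct from the start of the thread (with the comma after k corrected to a semicolon) and a hypothetical helper, trailing_padding, that reports how many bytes follow s on whatever implementation compiles it.

```c
#include <stddef.h>

typedef struct some_struct {
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;

/* Bytes between the end of s and the end of the struct,
   as chosen by this particular implementation. */
static size_t trailing_padding(void)
{
    return sizeof(some_struct_t)
         - (offsetof(some_struct_t, s) + sizeof(((some_struct_t *)0)->s));
}
```

With a 4-byte int and a 2-byte short laid out in the usual way this would come out as 3, but that is an observation about one ABI, not something the language promises.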

--
Eric Sosman
es*****@ieee-dot-org.invalid

Oct 4 '08 #36

Richard Tobin

In article <sl*******************@nospam.invalid>,
Antoninus Twink <no****@nospam.invalid> wrote:

>Yes, of course, if there are two or more padding bytes at the end of the
struct then that's your memory to write to, whether it's on the stack if
my_struct is an automatic variable, or on the heap if you got the memory
from malloc().

I think it's guaranteed to be safe to write to them through a char
pointer, but you can't rely on the value you write staying the same.
For example, a little-endian system writing a 16-bit unsigned short
value into an 8-bit unsigned char followed by padding might just write
the whole short, clobbering the first padding byte.

And as has already been pointed out, structure assignment may not
copy padding.
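
One practical consequence, sketched below with a hypothetical struct rec: since padding contents are not preserved by member stores or structure assignment, two structs should be compared member by member rather than with memcmp. The names rec_equal and padding_demo are made up for illustration.

```c
#include <string.h>

struct rec {
    int x;
    char y;   /* probably followed by padding, but that is up to the compiler */
};

/* Compare member by member; padding bytes never take part. */
static int rec_equal(const struct rec *a, const struct rec *b)
{
    return a->x == b->x && a->y == b->y;
}

/* Returns 1 if two records with equal members compare equal,
   whatever their padding bytes happen to contain. */
static int padding_demo(void)
{
    struct rec a, b;
    memset(&a, 0x00, sizeof a);   /* any padding starts out as 0x00 here */
    memset(&b, 0xFF, sizeof b);   /* and as 0xFF here */
    a.x = b.x = 7;
    a.y = b.y = 'y';
    /* memcmp(&a, &b, sizeof a) might report a difference at this point,
       because it also looks at padding, whose values are unspecified
       after the member stores above. */
    return rec_equal(&a, &b);
}
```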

-- Richard
--
Please remember to mention me / in tapes you leave behind.

Oct 4 '08 #37

Richard Tobin

In article <sl*******************@nospam.invalid>,
Antoninus Twink <no****@nospam.invalid> wrote:

>struct foo *p = malloc(sizeof(struct foo));
char *q = (char *) p;
q[sizeof(struct foo) - 1]=0;

Are you seriously saying that there's any real-world system that will
blow up here if it happens that there's padding at the end of struct
foo?

I don't think this particular case is a real-world vs theoretical issue.
It's always legal to write malloc()ed memory through a byte pointer.

-- Richard

--
Please remember to mention me / in tapes you leave behind.

Oct 4 '08 #38

Malcolm McLean

"Ben Pfaff" <bl*@cs.stanford.edu> wrote in message news:

la************@siemens.com writes:

No, the padding bytes are most definitely *not* yours to write to.

Really?

struct { int a; } b;
memset(&b, 0, sizeof b); /* Bang? */

I think you've hit on a glitch in the standard there.
Use of memset() to zero out memory is well-established, but could lead to
writing to padding bytes, which strictly isn't allowed. Which leads to the
issue of whether a custom "zero-memory", just a hand-coded memset() with a
hard zero, would lead to UB, whilst memset() doesn't. That's a nonsense
rule.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Oct 4 '08 #39

Keith Thompson

"Malcolm McLean" <re*******@btinternet.com> writes:

"Ben Pfaff" <bl*@cs.stanford.edu> wrote in message news:
>la************@siemens.com writes:
>>No, the padding bytes are most definitely *not* yours to write to.

Really?

struct { int a; } b;
memset(&b, 0, sizeof b); /* Bang? */

I think you've hit on a glitch in the standard there.
Use of memset() to zero out memory is well-established, but could lead
to writing to padding bytes, which strictly isn't allowed. Which leads
to the issue of whether a custom "zero-memory", just a hand-coded
memset() with a hard zero, would lead to UB, whilst memset()
doesn't. That's a nonsense rule.

I don't see a glitch. You're permitted to access any object as if it
were an array of unsigned char, which is what memset (or a hand-coded
equivalent) does.

That includes accessing padding bytes. For example:

struct foo { int x; char y; } obj;

if (sizeof obj > offsetof(struct foo, y) + 1) {

/* struct foo has one or more padding bytes at the end */
/* We can do what we like with those padding bytes. */
((unsigned char*)&obj)[sizeof obj - 1] = 42;

/* But updating obj.y might clobber the padding bytes. */
obj.y = 'y';
/* We don't know what value the padding byte now has. */
}

And if you happen to know, after carefully reading your
implementation's documentation, that there are one or more padding
bytes at the end of struct foo, then you can get away with writing to
them. But then your code will be non-portable -- and it could very
easily break if the declaration of struct foo is changed during
maintenance.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Oct 4 '08 #40

Malcolm McLean

"Keith Thompson" <ks***@mib.org> wrote in message

"Malcolm McLean" <re*******@btinternet.com> writes:
>"Ben Pfaff" <bl*@cs.stanford.edu> wrote in message news:
>>la************@siemens.com writes:
No, the padding bytes are most definitely *not* yours to write to.

Really?

struct { int a; } b;
memset(&b, 0, sizeof b); /* Bang? */

I think you've hit on a glitch in the standard there.
Use of memset() to zero out memory is well-established, but could lead
to writing to padding bytes, which strictly isn't allowed. Which leads
to the issue of whether a custom "zero-memory", just a hand-coded
memset() with a hard zero, would lead to UB, whilst memset()
doesn't. That's a nonsense rule.

I don't see a glitch. You're permitted to access any object as if it
were an array of unsigned char, which is what memset (or a hand-coded
equivalent) does.

Consider this

struct foo
{
char *ptr;
/* we've got a few padding bytes here */
};

Now we happen to know that null is all zeroes on our particular machine, so

struct foo x;
memset(&x, 0, sizeof(struct foo));

is OK.

However C0C0C0C0 is the pointer trap representation. So

memset(&x, 0xC0, sizeof(struct foo));

will cause a program termination, probably when the structure is accessed,
even though ptr isn't written through or read from.

But the question is, can the padding bytes have a similar trap
representation? If so, can it be all bits zero, and so can Ben's example
blow up?

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Oct 4 '08 #41

Harald van Dĳk

On Sat, 04 Oct 2008 18:56:19 +0100, Malcolm McLean wrote:

>[...]
However C0C0C0C0 is the pointer trap representation. So

memset(&x, 0xC0, sizeof(struct foo));

will cause a program termination, probably when the structure is
accessed, even though ptr isn't written through or read from.

"The value of a structure or union object is never a trap representation,
even though the value of a member of the structure or union object may be
a trap representation."

You're allowed to set a pointer to C0C0C0C0 the way you do, and you're
allowed to do pretty much anything using a structure containing that
pointer, so long as you don't look at that pointer specifically.

But the question is, can the padding bytes have a similar trap
representation? If so, can it be all bits zero, and so can Ben's example
blow up?

No, padding bytes cannot affect whether the bits represent a value.

Oct 4 '08 #42

Keith Thompson

"Malcolm McLean" <re*******@btinternet.com> writes:

"Keith Thompson" <ks***@mib.org> wrote in message
>"Malcolm McLean" <re*******@btinternet.com> writes:
>>"Ben Pfaff" <bl*@cs.stanford.edu> wrote in message news:
la************@siemens.com writes:
No, the padding bytes are most definitely *not* yours to write to.

Really?

struct { int a; } b;
memset(&b, 0, sizeof b); /* Bang? */

I think you've hit on a glitch in the standard there.
Use of memset() to zero out memory is well-established, but could lead
to writing to padding bytes, which strictly isn't allowed. Which leads
to the issue of whether a custom "zero-memory", just a hand-coded
memset() with a hard zero, would lead to UB, whilst memset()
doesn't. That's a nonsense rule.

I don't see a glitch. You're permitted to access any object as if it
were an array of unsigned char, which is what memset (or a hand-coded
equivalent) does.

Consider this

struct foo
{
char *ptr;
/* we've got a few padding bytes here */
};

Now we happen to know that null is all zeroes on our particular machine, so

struct foo x;
memset(&x, 0, sizeof(struct foo));

is OK.

However C0C0C0C0 is the pointer trap representation. So

memset(&x, 0xC0, sizeof(struct foo));

will cause a program termination, probably when the structure is
accessed, even though ptr isn't written through or read from.

The latter won't cause a program termination *unless* we access x.ptr.
This is just a simple case of type punning; it doesn't really have
anything to do with padding bytes.

But the question is, can the padding bytes have a similar trap
representation? If so, can it be all bits zero, and so can Ben's
example blow up?

No. n1256 6.2.6.1p6:

When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object
representation that correspond to any padding bytes take
unspecified values. The value of a structure or union object is
never a trap representation, even though the value of a member of
the structure or union object may be a trap representation.

(There's a change bar on the last sentence; I think it was added post-C99.)
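
A small sketch of the point above: the bytes of a struct can be filled and inspected through unsigned char without ever reading the pointer member as a pointer. The function name fill_and_check is hypothetical, and 0xC0 is simply the fill value used in the example being discussed.

```c
#include <string.h>

struct foo {
    char *ptr;
};

/* Fill x with 0xC0 bytes and read them back through unsigned char,
   without ever evaluating x.ptr. Returns 1 if every byte matches. */
static int fill_and_check(void)
{
    struct foo x;
    const unsigned char *bytes = (const unsigned char *)&x;
    size_t i;

    memset(&x, 0xC0, sizeof x);
    for (i = 0; i < sizeof x; i++) {
        if (bytes[i] != 0xC0)
            return 0;
    }
    return 1;
}
```

Only an access that reads x.ptr as a char * would run into a trap representation, if the implementation has one for that bit pattern.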

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Oct 4 '08 #43

Martien Verbruggen

On Sat, 04 Oct 2008 09:57:00 -0700,
Keith Thompson <ks***@mib.org> wrote:

struct foo { int x; char y; } obj;

if (sizeof obj > offsetof(struct foo, y) + 1) {

/* struct foo has one or more padding bytes at the end */
/* We can do what we like with those padding bytes. */
((unsigned char*)&obj)[sizeof obj - 1] = 42;

/* But updating foo.y might clobber the padding bytes. */
obj.y = 'y';
/* We don't know what value the padding byte now has. */
}

And if you happen to know, after carefully reading your
implementation's documentation, that there are one or more padding
bytes at the end of struct foo, then you can get away with writing to

If you've used the above test, would you still need to read your
compiler's documentation? Isn't the test enough?

IOW, why did you include the phrase "after carefully reading your
implementation's documentation", rather than leave it unqualified?
Martien
--
|
Martien Verbruggen | "In a world without fences,
| who needs Gates?"
|

Oct 4 '08 #44

Keith Thompson

Martien Verbruggen <mg**@tradingpost.com.au> writes:

On Sat, 04 Oct 2008 09:57:00 -0700,
Keith Thompson <ks***@mib.org> wrote:

struct foo { int x; char y; } obj;

if (sizeof obj > offsetof(struct foo, y) + 1) {

/* struct foo has one or more padding bytes at the end */
/* We can do what we like with those padding bytes. */
((unsigned char*)&obj)[sizeof obj - 1] = 42;

/* But updating foo.y might clobber the padding bytes. */
obj.y = 'y';
/* We don't know what value the padding byte now has. */
}

And if you happen to know, after carefully reading your
implementation's documentation, that there are one or more padding
bytes at the end of struct foo, then you can get away with writing to

If you've used the above test, would you still need to read your
compiler's documentation? Isn't the test enough?

IOW, why did you include the phrase "after carefully reading your
implementation's documentation", rather than leave it unqualified?

Unclear writing on my part. The "And if you happen to know ..." part
was intended to refer to accessing the padding bytes in general
(without the "if"), not specifically to the code above.

The whole idea is frankly a bit silly. If you want to access bytes
within a struct, why on Earth would you not declare members to cover
those bytes? (The struct hack isn't an exception to this; it
deliberately accesses bytes that may be outside the struct, but within
a block allocated by malloc.) And the proposed "if" just makes it
sillier; what are you going to do if there isn't any padding at the
end, and why not just do that unconditionally? (That's a generic
"you".)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Oct 5 '08 #45

Why Tea

On Oct 4, 4:34 pm, Richard Heathfield <r...@see.sig.invalid> wrote:

Keith Thompson said:

Why Tea <ytl...@gmail.com> writes:

<snip>

I understood it's bad and dangerous
programming, but the question is can we be 100% sure that an
existing code like that causes memory corruption? Based on the
little I know, I don't think it's a simple answer of a YES or NO.

It is a very simple, but two-fold, answer - YES, you can be sure that
writing into memory you don't own causes memory corruption; and NO, you
can't be sure that the effect of this corruption will always be
noticeable. When you write into memory you don't own, the behaviour of
your program is undefined, and the rules of C no longer apply - so
anything can happen, including (but by no means limited to) what you
expected to happen.

Thanks Richard. I understood what you said. I'd like to apologize for
asking more questions. But I really would like to get to the
bottom of this.

If the memory is corrupted, wouldn't the system eventually crash
if you run it long enough? If so, then we can be sure that the
corruption will be noticeable.

I went back to c-faq to read 2.6 many times again. I'll paste
the code here for easy reference.

#include <stdlib.h>
#include <string.h>

struct name {
    int namelen;
    char namestr[1];
};

struct name *makename(char *newname)
{
    struct name *ret =
        malloc(sizeof(struct name)-1 + strlen(newname)+1);
    /* -1 for initial [1]; +1 for \0 */
    if(ret != NULL) {
        ret->namelen = strlen(newname);
        strcpy(ret->namestr, newname);
    }
    return ret;
}

Although not specifically stated, padding is likely
to occur for namestr. So strcpy must have written into
the padding bytes. The faq says "... has deemed that
it is not strictly conforming with the C Standard,
although it does seem to work under all known
implementations...".

I know it's bad and it shouldn't be done. But when
I look at tens of thousands of lines of code written
by someone else, and much of it makes use of this
hack, what can we conclude? Perhaps it does work,
just like the faq says.

I appreciate all of you who took the time to answer
my questions. Not all of us work with C everyday,
that's why we ask question here. Of course we try
to google and ask our colleagues before turning to
the group, but it doesn't always work - as this
"struct hack" indicated. It doesn't help to answer
the question with an almighty attitude, again as
this "struct hack" has indicated, although the
consensus is not to do it, but no one can say for
sure if the system will eventually die or crash.
Very often, we ask a question because we badly
need help and we know there are many knowledgeable
and competent people here. Thanks again for your
time.

That's why I think Antoninus has given a more accurate answer.

His answer is wrong. (See below.)

can't swear by the bible
"It's not OK at all"

Um, did you read my answer, in which I specifically acknowledged that
you can probably get away with accessing padding bytes but also
explained why it's a bad idea to depend on it?

There is nothing "pragmatic" about encouraging you to assume that
there are a certain number of padding bytes at the end of a structure,
and that you can safely use them for whatever you want.

Oct 6 '08 #46

Richard Heathfield

Why Tea said:

<snip>

If the memory is corrupted, wouldn't the system eventually crash
if you run it long enough?

The C Standard does not guarantee this (either way).

If so, then we can be sure that the
corruption will be noticeable.

I went back to c-faq to read 2.6 many times again. I'll paste
the code here for easy reference.

#include <stdlib.h>
#include <string.h>

struct name {
    int namelen;
    char namestr[1];
};

struct name *makename(char *newname)
{
    struct name *ret =
        malloc(sizeof(struct name)-1 + strlen(newname)+1);
    /* -1 for initial [1]; +1 for \0 */
    if(ret != NULL) {
        ret->namelen = strlen(newname);
        strcpy(ret->namestr, newname);
    }
    return ret;
}

Although not specifically stated, padding is likely
to occur for namestr. So strcpy must have written into
the padding bytes.

Well, actually strcpy has written into memory that you allocated via
malloc.

The faq says "... has deemed that
it is not strictly conforming with the C Standard,
although it does seem to work under all known
implementations...".

dmr once called it "unwarranted chumminess with the implementation".

I know it's bad and it shouldn't be done. But when
I look at tens of thousands of lines of code written
by someone else and many of them make use of this
hack. What can we conclude?

We can conclude that lots of people do bad things that shouldn't be done.

Perhaps it does work, just like the faq says.

As far as I'm aware, nobody has ever found an implementation on which the
struct hack (i.e. the hack by which you allocate more storage than the
structure actually needs) doesn't work. That is not the same as saying
that it's okay to write into memory you don't own. It isn't and you
shouldn't. Whether you do or not is your concern, but you can't blame the
implementation if it all goes wrong.
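
For comparison, C99 added flexible array members, which give the same layout with the standard's blessing. The sketch below is an assumption about how the FAQ's makename might look on a C99 compiler; it is not code from the FAQ or from this thread.

```c
#include <stdlib.h>
#include <string.h>

struct name {
    int namelen;
    char namestr[];   /* C99 flexible array member: no [1] placeholder */
};

struct name *makename(const char *newname)
{
    size_t len = strlen(newname);
    /* sizeof(struct name) does not count namestr; +1 is for the '\0'. */
    struct name *ret = malloc(sizeof(struct name) + len + 1);
    if (ret != NULL) {
        ret->namelen = (int)len;
        memcpy(ret->namestr, newname, len + 1);
    }
    return ret;
}
```

Here every byte strcpy or memcpy touches was requested from malloc explicitly, so the question of who owns the padding never comes up.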

<snip>

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Oct 6 '08 #47

Nick Keighley

On 4 Oct, 06:34, Richard Heathfield <r...@see.sig.invalid> wrote:

Keith Thompson said:
Why Tea <ytl...@gmail.com> writes:

<snip>

"Antoninus Twink" is a troll. Please do us all a favor and ignore
him.

It is clear from the above that at least some newbies are *not* ignoring
the technically incompetent answers provided by Mr Twink. Note that
"technically incompetent" and "trollish" are not the same thing. The
problem with Mr Twink (and it is not a problem that is unique to him) is
that he's *both*. The regular contributors to this group know full well
that he's a troll. He is in many killfiles (including mine). And therefore
he can often get away with spouting any old rubbish without being
challenged, and thus newbies can be misled into following his "advice" (as
appears to have happened in this case).

which is why I don't have him kill filed.

<snip>

So the killfile "solution" is problematic, because it allows trolls like Mr
Twink the freedom to give stupid advice to up-lapping newbies with little
or no risk of being corrected. Unfortunately, the non-killfile "solution"
is also problematic, because it raises the overall temperature of the
group. If, for example, I were to remove Mr Twink from my killfile, I know
from experience that within a week there'd be a flame war several hundreds
of articles long.

my rule of two (or sometimes 3) hopefully keeps me from this.
If my protagonist and I have repeated the same position three
times, then I declare it a stalemate and retire from the discussion.
The analogy is with the similar chess rule.

I usually try to avoid replying to Twink but instead reply
to his repliers. Sometimes he's so potentially misleading
I feel I have to respond for the sake of the lurkers.

<snip>

Kenny's easier as he's just tedious.

--
Nick Keighley

Oct 6 '08 #48

Keith Thompson

Why Tea <yt****@gmail.com> writes:

On Oct 4, 4:34 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
Keith Thompson said:

Why Tea <ytl...@gmail.com> writes:

<snip>

I understood it's bad and dangerous
programming, but the question is can we be 100% sure that an
existing code like that causes memory corruption? Based on the
little I know, I don't think it's a simple answer of a YES or NO.

It is a very simple, but two-fold, answer - YES, you can be sure that
writing into memory you don't own causes memory corruption; and NO, you
can't be sure that the effect of this corruption will always be
noticeable. When you write into memory you don't own, the behaviour of
your program is undefined, and the rules of C no longer apply - so
anything can happen, including (but by no means limited to) what you
expected to happen.

Thanks Richard. I understood what you said. I'd like to apologize for
asking more questions. But I really would like to get to the
bottom of this.

If the memory is corrupted, wouldn't the system eventually crash
if you run it long enough? If so, then we can be sure that the
corruption will be noticeable.

Maybe, maybe not. There are no guarantees, one way or the other.

For example, if by "corrupting memory" you clobber the value of some
other variable, maybe it's a variable that isn't used again, so it has
no visible effect on the program. Or maybe you clobber memory that's
outside the object you're trying to access, but also outside any other
object. Or maybe you set some variable to a value that happens to be
correct.

There are any number of ways you can corrupt memory with no visible
effect. The risk is that the effect could become visible at the least
convenient possible time -- say, when your software has been deployed
to customers, or when you're demonstrating it to somebody important,
or years later when all the people who are familiar with the code have
left the company. Such is the nature of undefined behavior.
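
To make the first case concrete, here is a small simulation of
"clobbering a variable that isn't used again". It is only a sketch:
struct record, its fields and total_after_overrun are made-up names, and
the overwrite is deliberately done through an unsigned char pointer over
the whole object so that the simulation itself stays well-defined. A
real overrun gives no such guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: only 'total' is ever read by the "program". */
struct record {
    char tag[4];
    int scratch;   /* clobbered below, but never read again */
    int total;     /* the only field anyone looks at */
};

/* Simulate an overrun of 'tag': overwrite every byte from the start of
   the record up to (but not including) 'total'. Accessing the object
   through unsigned char* keeps this simulation well-defined. */
int total_after_overrun(void)
{
    struct record r;
    unsigned char *bytes = (unsigned char *)&r;

    memset(&r, 0, sizeof r);
    r.total = 42;

    /* The overwritten region is larger than 'tag' itself. */
    assert(offsetof(struct record, total) > sizeof r.tag);
    memset(bytes, 0xFF, offsetof(struct record, total));

    return r.total;   /* untouched, so the damage stays invisible */
}
```

The function still returns the value stored in total, even though
scratch has been trashed. Nothing in the visible behaviour reveals the
damage until some later change starts reading scratch.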

I went back to c-faq to read 2.6 many times again. I'll paste
the code here for easy reference.

#include <stdlib.h>
#include <string.h>

struct name {
    int namelen;
    char namestr[1];
};

struct name *makename(char *newname)
{
    struct name *ret =
        malloc(sizeof(struct name)-1 + strlen(newname)+1);
                /* -1 for initial [1]; +1 for \0 */
    if(ret != NULL) {
        ret->namelen = strlen(newname);
        strcpy(ret->namestr, newname);
    }
    return ret;
}

Although not specifically stated, padding is likely
to occur for namestr. So strcpy must have written into
the padding bytes. The faq says "... has deemed that
it is not strictly conforming with the C Standard,
although it does seem to work under all known
implementations...".

Proper use of the struct hack does *not* depend on padding bytes. It
writes outside the bounds of the array, and of the struct that
contains it, but *within* the bounds of the chunk of memory allocated
by malloc. For this to work, you need an implementation that doesn't
do bounds checking; almost all existing implementations qualify. (In
fact, since the struct hack is a common trick, a compiler that broke
it would probably fail in the marketplace.)
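
One way to see that padding plays no part is to size the allocation
from the offset of the array rather than from sizeof. The following is
a sketch only: name_alloc_size and makename_checked are hypothetical
names, not part of the FAQ code, and the write past namestr[1] is still
the struct hack, with the caveats that implies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct name {
    int namelen;
    char namestr[1];   /* struct hack: really "at least 1" chars */
};

/* Bytes needed for a struct name holding a string of length len.
   Computed from the offset of namestr, so any trailing padding in
   sizeof(struct name) is not relied upon. */
size_t name_alloc_size(size_t len)
{
    size_t need = offsetof(struct name, namestr) + len + 1; /* +1 for '\0' */
    /* never ask for less than the struct itself */
    return need < sizeof(struct name) ? sizeof(struct name) : need;
}

struct name *makename_checked(const char *newname)
{
    size_t len = strlen(newname);
    size_t total = name_alloc_size(len);
    struct name *ret = malloc(total);

    if (ret != NULL) {
        /* the last byte written lies inside the malloc'd block */
        assert(offsetof(struct name, namestr) + len + 1 <= total);
        ret->namelen = (int)len;
        memcpy(ret->namestr, newname, len + 1);
    }
    return ret;
}
```

Every byte written lies between the start of the block and
name_alloc_size(len), whatever the compiler decided to do about padding
at the end of struct name.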

On the other hand, there's some risk that an optimizing compiler could
cause problems. Since violating array bounds invokes undefined
behavior, an optimizing compiler is allowed to *assume* that you
haven't done so, even if it doesn't generate code for explicit
run-time bounds checks. But again, the struct hack is common enough
that you should be ok.
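
Where C99 is available, the flexible array member avoids the question
altogether, because the language itself says the array may extend to
the end of the allocation. A minimal sketch, assuming a C99 compiler;
struct fname and make_fname are illustrative names only:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct fname {
    size_t namelen;
    char namestr[];    /* C99 flexible array member: must be last */
};

struct fname *make_fname(const char *newname)
{
    size_t len;
    struct fname *ret;

    assert(newname != NULL);
    len = strlen(newname);

    /* sizeof *ret does not count any elements of namestr,
       so add room for the characters and the '\0' */
    ret = malloc(sizeof *ret + len + 1);
    if (ret != NULL) {
        ret->namelen = len;
        memcpy(ret->namestr, newname, len + 1);
    }
    return ret;
}
```

The caller frees the result with a single free(), exactly as with the
older form.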

I know it's bad and it shouldn't be done. But when
I look at tens of thousands of lines of code written
by someone else and many of them make use of this
hack. What can we conclude? Perhaps it does work,
just like the faq says.

The struct hack itself *probably* violates the rules of the language,
but it's generally supported -- and C99 explicitly supports it in a
different form. Code that assumes the presence of padding bytes, on
the other hand, is more dangerous. For example, if your declaration
changes from this:
    struct name {
        int namelen;
        char namestr[1];
    };
to this:
    struct name {
        int namelen;
        short something;
        unsigned char something_else;
        char namestr[1];
    };
then it's likely (given 4-byte int, 2-byte short, and, of course,
1-byte char) that the structure will be 8 bytes with *no* padding.
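
Rather than guessing, the trailing padding of each layout can be
measured. This is a sketch: name_v1 and name_v2 are made-up tags for
the two declarations above, and TAIL_PADDING and the helper functions
are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct name_v1 {
    int namelen;
    char namestr[1];
};

struct name_v2 {
    int namelen;
    short something;
    unsigned char something_else;
    char namestr[1];
};

/* Trailing padding = total size minus the end of the last member. */
#define TAIL_PADDING(type, last) \
    (sizeof(type) - (offsetof(type, last) + sizeof(((type *)0)->last)))

size_t tail_padding_v1(void) { return TAIL_PADDING(struct name_v1, namestr); }
size_t tail_padding_v2(void) { return TAIL_PADDING(struct name_v2, namestr); }

/* Checks that hold on any conforming implementation: the struct is
   at least large enough to contain its last member. */
void check_layouts(void)
{
    assert(sizeof(struct name_v1) >= offsetof(struct name_v1, namestr) + 1);
    assert(sizeof(struct name_v2) >= offsetof(struct name_v2, namestr) + 1);
}
```

With 4-byte int and 2-byte short, one would expect tail_padding_v1() to
be nonzero and tail_padding_v2() to be zero, but the numbers are the
implementation's to choose. Code that writes into the "spare" bytes of
the first layout silently breaks once the second is adopted.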

There might be some confusion here. I haven't gone back to the
original article, and I'm not certain that the code you originally
posted actually assumed the existence of padding bytes rather than
just making ordinary use of the struct hack.

[...]

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Oct 6 '08 #49

Nick Keighley

On 4 Oct, 09:57, Antoninus Twink <nos...@nospam.invalid> wrote:

On 4 Oct 2008 at 5:02, Keith Thompson wrote:

There is nothing "pragmatic" about encouraging you to assume that
there are a certain number of padding bytes at the end of a structure,
and that you can safely use them for whatever you want.

Why don't you get off your high horse for a minute and try to separate
the two issues in your mind?

The question you're addressing is: Is it OK to make certain padding
assumptions about structs, and based on those assumptions to write to
memory beyond the last field in the struct? The answer is, of course
that's not a good idea in general, and I'd never advise anyone to do it.

But that wasn't what the OP was asking. He was talking about a specific
compiler on a specific system,

and perhaps he should have asked on a specific news group

where he'd verified that there were some
number of padding bytes at the end of the struct. He asked, in that
specific situation, whether writing into those padding bytes could cause
his program to blow up?

The pragmatic answer is no,

only if "pragmatic" means "wrong". The problem is his program is now
non-portable. This non-portability *could* involve different compilers
on the same platform. Or different versions of the same compiler.
Or changes to flag settings of the compiler (particularly
optimisation flags).

It's quite easy for you to end up writing to bytes that
don't belong to you. And that's an accident waiting to happen.

unless (as someone else pointed out) the
compiler comes with some elaborate bounds-checking "feature", a
possibility which for all practical purposes can be ignored, because
it's vanishingly unlikely.

struct foo *p = malloc(sizeof(struct foo));
char *q = (char *) p;
q[sizeof(struct foo) - 1]=0;

Are you seriously saying that there's any real-world system that will
blow up here if it happens that there's padding at the end of struct
foo?

No, no-one knows of one. The standard committee still doesn't think
highly of the "struct hack", of which this is a variant.
--
Nick Keighley
"Almost every species in the universe has an irrational fear of the
dark.
But they're wrong- cos it's not irrational. It's Vashta Nerada."
The Doctor

Oct 6 '08 #50
