Memory alignment

typedef struct some_struct
{
int i;
short k;
int m;
char s[1];
} some_struct_t;

Assuming 16-bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)

some_struct_t *my_struct;
my_struct = malloc(sizeof(some_struct_t));

BTW, this is on a PowerPC architecture.
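
One way to check rather than assume is to ask the compiler. The sketch below is illustrative only: it reuses the struct from the question, and the helper names bytes_from_s_to_end and print_layout are made up for this example. The numbers it reports depend on the compiler, its flags, and the target, so nothing here guarantees 4 or 8 bytes.

```c
#include <stddef.h>
#include <stdio.h>

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;

/* Bytes from the start of s to the end of the struct: the one declared
   element plus whatever tail padding this compiler chose to add. */
size_t bytes_from_s_to_end(void)
{
    return sizeof(some_struct_t) - offsetof(some_struct_t, s);
}

/* Print the layout so it can be compared across compilers and flags. */
void print_layout(void)
{
    printf("sizeof(some_struct_t) = %zu\n", sizeof(some_struct_t));
    printf("offsetof(s)           = %zu\n", offsetof(some_struct_t, s));
    printf("bytes from s to end   = %zu\n", bytes_from_s_to_end());
}
```

Calling print_layout() on each target of interest shows what that particular build does; a different compiler or packing option may give a different answer.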

/Why Tea
Oct 3 '08
On 3 Oct, 20:57, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
Why Tea <ytl...@gmail.com> writes:
On Oct 3, 12:48 pm, danmat...@gmail.com wrote:
But strcpy(my->struct->s, "AB"); is OK if there is
padding. Isn't it not? Please go easy on me here.
I just want to understand how this really works...
It will, but if you add a new element at the end of the structure later
on it might not; you might overwrite the next element.
Why would you want to declare a 1 char array to store 2 anyway?
Good question. This is found in some real embedded
code to make more efficient use of the memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:
my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

This is called the "struct hack". It has been formalised in C99 so if
you can use C99 then all will be well.
um. I thought the position of the struct hack was the same
on C99 as C90. What *did* change was the addition of
VLAs that were intended to remove the need for TSH

The same my_struct_t is used throughout the code for
signal sending. If s[] is used to carry binary data, the
size is specified by an int preceding s[]. I'd be
interested to hear comments from the experts about
this approach.

It is considered to be "a bit dodgy" (that is the technical term) but
it generally works. I am not sure there is really much more to say
about it though I get the feeling I will be proved very much wrong
about that!
I think on a reasonably sane embedded system it was almost
certain it would work. Obviously you run some tests. I've used
TSH heavily.

--
Nick Keighley

"If, indeed the subatomic energy in the stars is being freely
used to maintain their great furnaces, it seems to bring a little
nearer to fulfillment our dreams of controlling this latent
power for the well-being of the human race - or for its suicide."
Arthur S. Eddington "The Internal Constitution of the Stars" 1926
Oct 6 '08 #51
On 6 Oct 2008 at 7:44, Nick Keighley wrote:
On 4 Oct, 09:57, Antoninus Twink <nos...@nospam.invalid> wrote:
>where he'd verified that there were some
number of padding bytes at the end of the struct. He asked, in that
specific situation, whether writing into those padding bytes could cause
his program to blow up?

The pragmatic answer is no,

only if "pragmatic" means "wrong". The problem is his program is now
non-portable. This non-portability *could* involve different compilers
on the same platform. Or different versions of the same compiler. Or
changes to flag settings of the compiler (particularly optimisation
flags).
The OP knows that perfectly well, and it isn't what he was asking:
>>if there is a system crash in an embedded system and the dump shows
it was due to a wild pointer. When you see a piece of code that uses
the struct hack (a term I just learned) and it forgets to include
the '\0' in the size allocated, you can't really swear by the bible
that it causes the crash as it over steps the memory allocated with
strcpy. I understood it's bad and dangerous programming, but the
question is can we be 100% sure that an existing code like that
causes memory corruption?
The answer is NO, we can't be 100% sure that struct-hack-like code
caused the problem. In fact, we can be 99% sure that it didn't.

Oct 6 '08 #52
Why Tea wrote:
....
If the memory is corrupted, wouldn't the system eventually crash
if you run it long enough? If so, then we can be sure that the
corruption will be noticeable.
No. Instead of crashing, the system could also get stuck in an infinite
loop. However, the truly insidious possibility is that your program will
continue apparently normally and exit without showing any obvious signs.
That doesn't mean that it worked correctly; it might produce output that
is subtly wrong in some way, or it might produce a catastrophic error,
but with only a 1% chance of triggering the catastrophe during any
particular run of the program. It could run a long time, generating lots
of subtly erroneous data that will require a lot of work to fix, before
you even notice the problem.

....
I know it's bad and it shouldn't be done. But when
I look at tens of thousands of lines of code written
by someone else and many of them make use of this
hack. What can we conclude? Perhaps it does work,
just like the faq says.
The struct hack does work, on most C90 systems, and I would not
recommend worrying too much about the possibility of it failing unless
and until you find that it fails on a particular system that you need to
port it to. However, don't make any important decisions based upon the
assumption that it can't fail - in principle it can, and you need to
remember that.

C99 flexible arrays will work on any fully conforming implementation of
C99, and on many implementations that fall short of full conformance.
I'd recommend using flexible arrays rather than the struct hack, if
you're able to restrict the portability of your program to those
implementations that support it, and to complain about the ones that
don't support it. As a practical matter, any implementation that lets
you declare a flexible array member without generating a diagnostic will
almost certainly support correct use of that member. Thus, you'll learn
at compile time, rather than run time, whether or not you'll be able to
use it safely.

However, flexible array members are still not universally supported. The
difference is, if the struct hack fails, the implementor can point to
sections of the standard that allow it to fail. If flexible arrays are
not supported, you can point to sections of C99 standard that require
them to be. That won't necessarily make it any easier to convince the
implementor to change their implementation, but it is a stronger argument.
Oct 6 '08 #53
Nick Keighley wrote:
On 3 Oct, 20:57, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
....
>This is called the "struct hack". It has been formalised in C99 so if
you can use C99 then all will be well.

um. I thought the position of the struct hack was the same
on C99 as C90. What *did* change was the addition of
VLAs that were intended to remove the need for TSH
No, the relevant change was not VLAs, but flexible array members. They
work almost exactly like the struct hack, except that instead of
declaring a specific length for the array, the length is left
unspecified. The key point is that the behavior of flexible array
members is defined by the standard, the behavior when using the struct
hack is not.
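
For comparison, here is a minimal sketch of the c-faq question 2.6 makename example rewritten with a C99 flexible array member. It assumes a compiler that supports C99; using size_t for namelen and memcpy in place of strcpy are choices made for this sketch, not part of the FAQ code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct name {
    size_t namelen;
    char namestr[];   /* C99 flexible array member: no declared length */
};

struct name *makename(const char *newname)
{
    assert(newname != NULL);
    size_t len = strlen(newname);
    /* sizeof *ret does not count any elements of namestr, so add
       len + 1 for the characters and the terminating '\0'. */
    struct name *ret = malloc(sizeof *ret + len + 1);
    if (ret != NULL) {
        ret->namelen = len;
        memcpy(ret->namestr, newname, len + 1);
    }
    return ret;
}
```

The allocation arithmetic loses the "-1 for initial [1]" adjustment, and indexing namestr within the allocated space is defined by the standard rather than merely tolerated.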
Oct 6 '08 #54
On Oct 5, 10:22 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
Why Tea said:

<snip>
If the memory is corrupted, wouldn't the system eventually crash
if you run it long enough?

The C Standard does not guarantee this (either way).
If so, then we can be sure that the
corruption will be noticeable.
I went back to c-faq to read 2.6 many times again. I'll paste
the code here for easy reference.
#include <stdlib.h>
#include <string.h>
struct name {
  int namelen;
  char namestr[1];
};
struct name *makename(char *newname)
{
  struct name *ret =
      malloc(sizeof(struct name)-1 + strlen(newname)+1);
                    /* -1 for initial [1]; +1 for \0 */
  if(ret != NULL) {
    ret->namelen = strlen(newname);
    strcpy(ret->namestr, newname);
  }
  return ret;
}
Although not specifically stated, padding is likely
to occur for namestr. So strcpy must have written into
the padding bytes.

Well, actually strcpy has written into memory that you allocated via
malloc.
Does "strcpy(ret->namestr, newname);" really copy data
into the memory malloc'ed? I have a mental picture of
this for namestr:

Byte #1 is declared, 2-4 could be padding assuming
32 bit alignment.
1 2 3 4 5 6 7 8 9 10 11 ...
|<->|<---- malloc'ed ...--->

Doesn't 'strcpy(ret->namestr, "Richard");' become
this?
1 2 3 4 5 6 7 8 9
R i c h a r d \0
Perhaps it does work, just like the faq says.

As far as I'm aware, nobody has ever found an implementation on which the
struct hack (i.e. the hack by which you allocate more storage than the
structure actually needs) doesn't work. That is not the same as saying
that it's okay to write into memory you don't own. It isn't and you
shouldn't. Whether you do or not is your concern, but you can't blame the
implementation if it all goes wrong.
Understood.

Oct 6 '08 #55
Keith Thompson <ks***@mib.org> wrote:
>
Fair enough. I *know* it's called the "struct hack", and I've read
question 2.6 before, but I had trouble finding it myself, since the
answer to 2.6 doesn't use the phrase "struct hack".
That's the kind of thing that makes creating a good index difficult.
It's also why the index in the C standard contains terms that don't
appear anywhere else in the document (including "struct hack", I'm
glad to say).
--
Larry Jones

I wonder if you can refuse to inherit the world. -- Calvin
Oct 6 '08 #56
Why Tea <yt****@gmail.com> writes:
On Oct 5, 10:22 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>Why Tea said:
<snip>
#include <stdlib.h>
#include <string.h>
struct name {
  int namelen;
  char namestr[1];
};
struct name *makename(char *newname)
{
  struct name *ret =
      malloc(sizeof(struct name)-1 + strlen(newname)+1);
                    /* -1 for initial [1]; +1 for \0 */
  if(ret != NULL) {
    ret->namelen = strlen(newname);
    strcpy(ret->namestr, newname);
  }
  return ret;
}
Although not specifically stated, padding is likely
to occur for namestr. So strcpy must have written into
the padding bytes.

Well, actually strcpy has written into memory that you allocated via
malloc.

Does "strcpy(ret->namestr, newname);" really copy data
into the memory malloc'ed?
Yes.
I have a mental picture of
this for namestr:

Byte #1 is declared, 2-4 could be padding assuming
32 bit alignment.
1 2 3 4 5 6 7 8 9 10 11 ...
|<->|<---- malloc'ed ...--->
It is all malloced, including some other bytes that are used for
namelen.
Doesn't 'strcpy(ret->namestr, "Richard");' become
this?
1 2 3 4 5 6 7 8 9
R i c h a r d \0
Yes, if by this you mean that the letters get put in the first (and
only) byte of namestr and then in consecutive following bytes. The
fact that some of these bytes might be due to sizeof *ret being >
sizeof ret->namelen + 1 does not mean they are not malloced or that you
can't write to them.

The main problems with the struct hack come from maintaining the
code. Everyone using the structure has to know that the size is "fake"
and that it can't be zeroed and copied like other structures.
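
To illustrate the maintenance point, here is a hedged sketch of what copying such an object involves. The helpers name_total_size and copyname are hypothetical names introduced for this example; the allocation formula is the one from the FAQ code quoted above, and memcpy is used in place of strcpy as a choice made for the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct name {
    int namelen;
    char namestr[1];   /* struct hack: the real length is namelen + 1 */
};

/* Allocation size used by the FAQ code: sizeof(struct name) - 1 for the
   declared element, plus len + 1 for the characters and the '\0'. */
size_t name_total_size(size_t len)
{
    return sizeof(struct name) - 1 + len + 1;
}

struct name *makename(const char *newname)
{
    assert(newname != NULL);
    size_t len = strlen(newname);
    struct name *ret = malloc(name_total_size(len));
    if (ret != NULL) {
        ret->namelen = (int)len;
        memcpy(ret->namestr, newname, len + 1);
    }
    return ret;
}

/* Copy the whole allocation. A plain "*dst = *src" would copy only
   sizeof(struct name) bytes and silently drop the rest of the string. */
struct name *copyname(const struct name *src)
{
    assert(src != NULL);
    size_t total = name_total_size((size_t)src->namelen);
    struct name *dst = malloc(total);
    if (dst != NULL)
        memcpy(dst, src, total);
    return dst;
}
```

Every piece of code that zeroes, copies, or stores one of these objects has to go through size-aware helpers like these, which is the maintenance burden being described.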

BTW, I'd number these from 0 rather than 1 just to be consistent with
offsets and C's array indexes.

--
Ben.
Oct 6 '08 #57
Keith Thompson <ks***@mib.org> writes:
Eric Sosman <Er*********@sun.com> writes:
>Keith Thompson wrote:
>>Lowell Gilbert <lg******@be-well.ilk.org> writes:
Why Tea <yt****@gmail.com> writes:
On Oct 3, 12:48 pm, danmat...@gmail.com wrote:
[...]
>Why would you want to declare a 1 char array to store 2 anyway?
Good question. This is found in some real embedded
code to make more efficient use of the memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:
>
my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);
or even more likely, something more like
my_struct = malloc(sizeof(my_struct_t) + strlen(my->struct->s));
+ 1. The length returned by strlen() doesn't include the
terminating '\0'.

The struct has `char s[1]' as its last element[*], and the
size of that element is already included in sizeof(my_struct_t).

So it does.

Seeing strlen() in a computation of how much memory to allocate sets
off my alarm bells. If I were going to write something like that in
real code, it would be heavily commented.
Sure.

What I was really trying to point out was that the malloc() length
won't be the same all the time, or else you wouldn't use the struct
hack to begin with.

--
Lowell Gilbert, embedded/networking software engineer
http://be-well.ilk.org/~lowell/
Oct 6 '08 #58
la************@siemens.com writes:
Antoninus Twink <no****@nospam.invalid> wrote:

Yes, of course, if there are two or more padding bytes at the end of the
struct then that's your memory to write to, whether it's on the stack if
my_struct is an automatic variable, or on the heap if you got the memory
from malloc().

No, the padding bytes are most definitely *not* yours to write to.
Although most implementations will let you get away with it, there are
implementations that do careful memory bounds checking and won't.
I'm sure someone with Larry Jones's credentials must be saying
something by this, but I'll be darned if I know what it is. Surely
padding bytes must be available for writing to, at least as unsigned
char, so functions with a qsort-like interface can be written.
Oct 9 '08 #59
Tim Rentsch <tx*@alumnus.caltech.edu> writes:
la************@siemens.com writes:
[...]
>No, the padding bytes are most definitely *not* yours to write to.
Although most implementations will let you get away with it, there are
implementations that do careful memory bounds checking and won't.

I'm sure someone with Larry Jones's credentials must be saying
something by this, but I'll be darned if I know what it is. Surely
padding bytes must be available for writing to, at least as unsigned
char, so functions with a qsort-like interface can be written.
The context was a discussion of the struct hack. For example, given:

struct h {
int i;
char arr[1];
};
struct h obj;

Assume:
sizeof(int)==4
sizeof(struct h)==8
offsetof(struct h, arr)==4.

Then:

obj.arr[0] is perfectly ok

obj.arr[1] is potentially problematic in several ways:

1. Accessing obj.arr beyond its bounds is UB (but in practice will
cause problems only if the implementation does bounds checking).

2. Assuming no array bounds checking, accessing a nonexistent element
of obj.arr that happens to overlay a padding byte probably won't
cause problems in practice, but if the struct declaration is
changed, or if the code is compiled under a different
implementation, there might not be a padding byte there.

But if, rather than a single declared object obj, you're accessing the
arr member of a struct h object allocated via malloc(), where malloc()
was carefully called to allocate enough extra memory for however many
elements of arr you need, then ptr->arr[1] is almost certainly ok.
It's strictly UB, and it would break on a strict bounds-checking
implementation, but the struct hack is commonly used in practice.
It's safe to assume (though not guaranteed by the standard) that there
will at least be a way to turn off any bounds checking if you need to.

Treating the structure as an array of unsigned char, and accessing
those bytes, including any padding bytes, is ok.

Assuming that there are one or more padding bytes is non-portable
(and, in my opinion, unnecessary and poor style).
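
As a sketch of the "array of unsigned char" case, here is the kind of byte-wise swap a qsort-like routine performs when it knows only the element size. The function name swap_bytes is made up for this example. It touches every byte of each object, padding included, without ever indexing arr past its declared bound.

```c
#include <assert.h>
#include <stddef.h>

struct h {
    int i;
    char arr[1];
};

/* Swap two objects of the same size one byte at a time, treating each
   as an array of unsigned char. Padding bytes are swapped along with
   the members; their values are unspecified but reading them is fine. */
void swap_bytes(void *a, void *b, size_t size)
{
    assert(a != NULL && b != NULL);
    unsigned char *pa = a;
    unsigned char *pb = b;
    for (size_t n = 0; n < size; n++) {
        unsigned char tmp = pa[n];
        pa[n] = pb[n];
        pb[n] = tmp;
    }
}
```

A call such as swap_bytes(&x, &y, sizeof x) for two struct h objects stays within the bounds of each object, which is different from relying on obj.arr[1] landing in a padding byte.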

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Oct 9 '08 #60
Keith Thompson <ks***@mib.org> writes:
Tim Rentsch <tx*@alumnus.caltech.edu> writes:
la************@siemens.com writes:
[...]
No, the padding bytes are most definitely *not* yours to write to.
Although most implementations will let you get away with it, there are
implementations that do careful memory bounds checking and won't.
I'm sure someone with Larry Jones's credentials must be saying
something by this, but I'll be darned if I know what it is. Surely
padding bytes must be available for writing to, at least as unsigned
char, so functions with a qsort-like interface can be written.

The context was a discussion of the struct hack. For example, given:

struct h {
int i;
char arr[1];
};
struct h obj;

[...]
Ahhh, okay. Accessing obj.arr[1] is always undefined behavior,
whether (struct h) has padding bytes or not.
Oct 10 '08 #61
Treating the structure as an array of unsigned char, and accessing
those bytes, including any padding bytes, is ok.
Thanks Keith. I wonder why it took 60 messages for someone
to make a statement as concise as this :)
Assuming that there are one or more padding bytes is non-portable
(and, in my opinion, unnecessary and poor style).
Sure.
Oct 10 '08 #62
On Oct 10, 6:10 pm, Why Tea <ytl...@gmail.com> wrote:
[ Keith Thompson wrote this ]
Treating the structure as an array of unsigned char, and accessing
those bytes, including any padding bytes, is ok.

Thanks Keith. I wonder why it took 60 messages for someone
to make a statement as concise as this :)
Assuming that there are one or more padding bytes is non-portable
(and, in my opinion, unnecessary and poor style).

Sure.

Please don't snip attribution lines, ie the part that says "Keith
Thompson wrote:" or similar.
I restored it.
Oct 10 '08 #63
On 10 Oct 2008 at 15:10, Why Tea wrote:
Thanks Keith. I wonder why it took 60 messages for someone to make a
statement as concise as this :)
Because rather than giving a simple answer to a simple question, the
"regulars" would rather turn every thread into a drama by pretending to
misunderstand what everyone else means, and going off into word games
and angels-on-the-head-of-a-pin arguments for post after post.

Oct 10 '08 #64
In article <sl*******************@nospam.invalid>,
Antoninus Twink <no****@nospam.invalid> wrote:
>On 10 Oct 2008 at 15:10, Why Tea wrote:
>Thanks Keith. I wonder why it took 60 messages for someone to make a
statement as concise as this :)

Because rather than giving a simple answer to a simple question, the
"regulars" would rather turn every thread into a drama by pretending to
misunderstand what everyone else means, and going off into word games
and angels-on-the-head-of-a-pin arguments for post after post.
Yes, and for most of them, it's the only thing resembling a life that
they will ever know.

Just once, I'd like a statement from the regs as to why they waste their
lives like this.

Oct 10 '08 #65
On Fri, 03 Oct 2008 13:43:30 -0700, Keith Thompson <ks***@mib.org>
wrote:
Lowell Gilbert <lg******@be-well.ilk.org> writes:
Why Tea <yt****@gmail.com> writes:
On Oct 3, 12:48 pm, danmat...@gmail.com wrote:
[...]
>Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of the memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);
or even more likely, something more like
my_struct = malloc(sizeof(my_struct_t) + strlen(my->struct->s));
Not my->struct->s; that's a syntax error.

Guessing you (PP) meant my_struct->s, no. You need to compute the
length BEFORE allocating; and THEN set my_struct, and fill in
my_struct->whatever. Something much more like:
my_struct = malloc (sizeof(my_struct_t) + strlen(source_str) );
or the equivalent but clc-preferred
my_struct = malloc (sizeof *my_struct + strlen(source_str) );
+ 1. The length returned by strlen() doesn't include the terminating '\0'.
But if you use the C89-struct-hack version, with s[1], the sizeof the
struct already includes room for at least one byte, maybe more. I have
been known to write, for clarity(?!):
... malloc (sizeof(struct_t) -1 +strlen(source_str) +1 )
and sometimes to get stupid compilers to optimize even:
... malloc (sizeof(struct_t) -1 +1 +strlen(source_str) )

Or as already noted elsethread you use the offsetof variant:
... malloc (offsetof(struct_t,s) +strlen(source_str) +1 )

And that's not even considering the case where you don't need .s to be
nullterminated, typically because its length is stored elsewhere.
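
The size computations listed above can be compared side by side. This is only a sketch: struct_t here is a stand-in with a hypothetical int len member in front of s, and the function names size_via_sizeof and size_via_offsetof are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    int len;
    char s[1];   /* C89 struct hack placeholder */
} struct_t;

/* sizeof variant: -1 for the declared s[1], +1 for the '\0'.
   Any tail padding in struct_t is included in the result. */
size_t size_via_sizeof(const char *source_str)
{
    assert(source_str != NULL);
    return sizeof(struct_t) - 1 + strlen(source_str) + 1;
}

/* offsetof variant: counts from the start of s, so tail padding is
   not included. */
size_t size_via_offsetof(const char *source_str)
{
    assert(source_str != NULL);
    return offsetof(struct_t, s) + strlen(source_str) + 1;
}
```

The two results differ by exactly the tail padding after s, if the compiler adds any. The offsetof form can therefore yield an allocation smaller than sizeof(struct_t) for short strings, which is worth keeping in mind if the object is ever copied or zeroed as a whole struct.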

- formerly david.thompson1 || achar(64) || worldnet.att.net
Oct 13 '08 #66
vi******@gmail.com writes:
On Oct 10, 6:10 pm, Why Tea <ytl...@gmail.com> wrote:
>[ Keith Thompson wrote this ]
Treating the structure as an array of unsigned char, and accessing
those bytes, including any padding bytes, is ok.

Thanks Keith. I wonder why it took 60 messages for someone
to make a statement as concise as this :)
Assuming that there are one or more padding bytes is non-portable
(and, in my opinion, unnecessary and poor style).

Sure.


Please don't snip attribution lines, ie the part that says "Keith
Thompson wrote:" or similar.
I restored it.
*chuckle*

Vippstar probably doesn't even realise the irony and humour in his
reply. Good to see not much has changed the past few weeks!
Oct 15 '08 #67
