FAQ 2.6

Christopher Benson-Manica

"I came across some code that declared a structure like this:

struct name {
int namelen;
char namestr[1];
};

and then did some tricky allocation to make the namestr array act like it had
several elements."

Exactly what kind of "tricky allocation" is the FAQ talking about?

--
Christopher Benson-Manica | Jumonji giri, for honour.
ataru(at)cyberspace.org |

Nov 13 '05 #1

Matt Gregory

Christopher Benson-Manica wrote:

"I came across some code that declared a structure like this:

struct name {
int namelen;
char namestr[1];
};

and then did some tricky allocation to make the namestr array act like it had
several elements."

Exactly what kind of "tricky allocation" is the FAQ talking about?

You can go:

struct name *n = malloc(sizeof(int) + 50);

and that conveniently gives you a

struct name {
int namelen;
char namestr[50];
};

variable since there's no bounds checking in C. At least, that's how I
do it. I suppose that's wrong though because of field padding, but it's
never bothered me as long as you keep all the fields before namestr[]
32 bits on a 32-bit machine.

Matt Gregory

Nov 13 '05 #2

Eric Sosman

Christopher Benson-Manica wrote:

"I came across some code that declared a structure like this:

struct name {
int namelen;
char namestr[1];
};

and then did some tricky allocation to make the namestr array act like it had
several elements."

Exactly what kind of "tricky allocation" is the FAQ talking about?

Allocating memory for the struct "plus:"

struct name *ptr;
ptr = malloc(offsetof(struct name, namestr) + strlen(name) + 1);
assert (ptr != NULL);
ptr->namelen = strlen(name);
strcpy(ptr->namestr, name);

This dodge has long been known as "the struct hack," and its
use has been described as relying on "unwarranted chumminess
with the compiler." C99 introduced a new syntax to allow the
same end to be accomplished in a supported way.

If you'll forgive an observation based on this question
and on the earlier "Contrived casting situation" thread: stop
mucking around with this stuff, at least for now. C makes it
easy to peek under the hood, and beginners are understandably
interested in doing so -- knowing what goes on in the engine
makes one feel more like an expert. But it doesn't make one
a better driver! Begin by learning how to steer, how to use
the pedals, how to make good judgements in traffic. Returning
to C terms, this means you should concentrate on the values
you're manipulating, not on how the computer represents
them. Studying the representations too closely leads you into
pitfalls like "knowing" that an `int' is four eight-bit bytes,
or that a `void*' and a `long' are interchangeable. Don't go
there.

Now, once you're really secure in what we might call the
"official" or "abstract" C language, it may then make sense to
pay attention to how different implementations represent the
values your program uses -- you'll be a slightly better driver
if you know how to replace a broken fan belt, and you'll be a
slightly better C programmer if you understand the different
ways in which various C implementations arrange memory. But
such knowledge is really necessary relatively rarely (fan belts
are fairly durable), and partial knowledge can be dangerous
(changing the belt with the motor running invokes Undefined
Behavior). Stick with the essentials and forget the tricks --
for the nonce, at least.

--
Er*********@sun.com

Nov 13 '05 #3

Christopher Benson-Manica

Eric Sosman <Er*********@sun.com> spoke thus:

This dodge has long been known as "the struct hack," and its
use has been described as relying on "unwarranted chumminess
with the compiler." C99 introduced a new syntax to allow the
same end to be accomplished in a supported way.

If I may ask, what is the new syntax?

If you'll forgive an observation based on this question
and on the earlier "Contrived casting situation" thread: stop
mucking around with this stuff, at least for now. C makes it

Heh, no offense taken... Fortunately, I don't intend to write *real* code
that looks like that ;) I was just curious... That's why I'm here!

--
Christopher Benson-Manica | Jumonji giri, for honour.
ataru(at)cyberspace.org |

Nov 13 '05 #4

The Real OS/2 Guy

On Tue, 16 Sep 2003 17:17:38 UTC, Matt Gregory
<ms********@earthlink.net> wrote:

Christopher Benson-Manica wrote:
"I came across some code that declared a structure like this:

struct name {
int namelen;
char namestr[1];
};

and then did some tricky allocation to make the namestr array act like it had
several elements."

Exactly what kind of "tricky allocation" is the FAQ talking about?

You can go:

struct name *n = malloc(sizeof(int) + 50);

You mean
struct name *n = malloc(sizeof(*n) + 50);
This gives you room for the int and a \0-terminated string of 50
chars (an array of 51), AND it includes any padding that the compiler
may or may not insert between the members of the struct.

What happens when you have to insert another member into the struct? You
have to scan through 3 million lines of code to find every place where
you used the wrong formula to determine the space to allocate for
that struct, and correct it. If you miss ONE place, you get UB.

Allocate the number of bytes the whole struct occupies (the compiler knows
the size) + the number of bytes you have to store in the dummy string
field, and let the compiler do the needed math correctly.

and that conveniently gives you a

struct name {
int namelen;
char namestr[50];
};

variable since there's no bounds checking in C.

No, not in all circumstances. If your compiler needs to insert padding
bytes between namelen and namestr, you get <number of padding bytes>
too few bytes reserved. That means you will address space that is NOT
reserved for the struct when you try to store a string of 50 bytes,
AND you've forgotten to reserve a byte for the \0 character that ends
the string.

At least, that's how I do it. I suppose that's wrong though because of field padding, but it's
never bothered me as long as you keep all the fields before namestr[]
32 bits on a 32-bit machine.

Why give wrong tips when the right one can be given with less typing?

--
Tschau/Bye
Herbert

eComStation 1.1 Deutsch Beta ist verügbar

Nov 13 '05 #5

Eric Sosman

Christopher Benson-Manica wrote:

Eric Sosman <Er*********@sun.com> spoke thus:
This dodge has long been known as "the struct hack," and its
use has been described as relying on "unwarranted chumminess
with the compiler." C99 introduced a new syntax to allow the
same end to be accomplished in a supported way.

If I may ask, what is the new syntax?

Usenet is a poor channel for basic reference information.
The current Standard in PDF format costs a whopping US$18 from
ANSI, and there've been some recent postings about a printed
version for somewhat more. It's too bad (although necessary)
that the Standard is written in Standardese, a language known
more for its prescriptive than its expository power -- but
until and unless somebody comes up with a description in a
human tongue, it's the only C99 reference available. AFAIK.

--
Er*********@sun.com

Nov 13 '05 #6

The Real OS/2 Guy

On Tue, 16 Sep 2003 17:32:36 UTC, Eric Sosman <Er*********@sun.com>
wrote:

Christopher Benson-Manica wrote:

"I came across some code that declared a structure like this:

struct name {
int namelen;
char namestr[1];
};

and then did some tricky allocation to make the namestr array act like it had
several elements."

Exactly what kind of "tricky allocation" is the FAQ talking about?
Allocating memory for the struct "plus:"

struct name *ptr;
ptr = malloc(offsetof(struct name, namestr) + strlen(name) + 1);

I don't have the FAQ at hand right now, but I don't think this _bug_ will
be in there.
malloc(*ptr+strlen(name)) or maybe malloc(sizeof(struct
name)+strlen(name))
is the right answer.

Whenever the compiler decides it has to insert some padding between
the int and the char[], you lose. Whenever you have to change the
struct by inserting some new member before the incomplete array, you
lose too.
Ask the compiler how much memory it needs to place a struct somewhere
(sizeof(struct x)) and add the bare length of the string to store in the
incomplete array. Room for an empty string is included ([1]) anyway.

assert (ptr != NULL);

and the 1 GB of data you have spent the last 5 days calculating is
lost because the CRT crashes the application immediately - only
because you used a debug function in production code. Or, when the
compiler has eliminated the assert, the next access through the NULL
pointer does it for you.

assert is not designed to check data the way you are trying to. assert()
is designed to find programmer errors: it is there to catch mistakes
a programmer can make, not errors produced by the environment the app
runs under. Lack of memory is nothing one can catch with assert().

It should be
if (!ptr) { do some error handling, at least try to save pending
data; exit(exitcode) or return errorcode; }
--
Tschau/Bye
Herbert

eComStation 1.1 Deutsch Beta ist verügbar

Nov 13 '05 #7

Christopher Benson-Manica

Eric Sosman <Er*********@sun.com> spoke thus:

struct name {
int namelen;
char namestr[1];
};
Allocating memory for the struct "plus:" struct name *ptr;
ptr = malloc(offsetof(struct name, namestr) + strlen(name) + 1);
assert (ptr != NULL);
ptr->namelen = strlen(name);
strcpy(ptr->namestr, name);

Okay. I think my real question is, why the heck would one use this cheap hack
(that *is* what it is, right?) instead of

struct name {
int namelen;
char *namestr;
};

and just allocate space for the string separately?

--
Christopher Benson-Manica | Jumonji giri, for honour.
ataru(at)cyberspace.org |

Nov 13 '05 #8

Eric Sosman

The Real OS/2 Guy wrote:

On Tue, 16 Sep 2003 17:32:36 UTC, Eric Sosman <Er*********@sun.com>
wrote:
Christopher Benson-Manica wrote:

"I came across some code that declared a structure like this:

struct name {
int namelen;
char namestr[1];
};

and then did some tricky allocation to make the namestr array act like it had
several elements."

Exactly what kind of "tricky allocation" is the FAQ talking about?
Allocating memory for the struct "plus:"

struct name *ptr;
ptr = malloc(offsetof(struct name, namestr) + strlen(name) + 1);

I've the FAQ not by hand now, but I don't think that this _bug_ will
be in there.

If there's a _bug_ present, it's still escaping my notice.

malloc(*ptr+strlen(name)) or maybe malloc(sizeof(struct
name)+strlen(name))
is the right answer.

The first suggestion won't compile, because the `+' operator
cannot use a struct type as an operand. The second will work,
but may allocate more space than necessary.

Whenever the compiler determines it has to do some paddings between
the int and the char[] you'll lost. Whenever you've to change the
struct by inserting some new member before the incomplete array you'll
lost too.

I think you've failed to understand what `offsetof' does.

Ask the compiler how much memory it needs to place a struct somewhere
(sizeof(struct x) and add the naked size of the string to store in the
incomplete array. Place for an empty string is included [1] anyway.
assert (ptr != NULL);

and your 1 GB data that stands around you've calculated the last 5
days on is loose because the CRT crashes the application immediately -
only because you'd uses a debug function in golden code or when the
compiler has eliminated the assert the next access to the NULL pointer
does it for you.

assert is not designed to check data as you tries. assert() is
designed to find programmer errors as it has to check to catch errors
a programmer can do, not errors the environment the app runs under
produces. Lack of memory is nothing one can catch with assert().

Agreed. I wrote the assert() line as shorthand for "error
checking omitted" -- and quite a savings it produced, too!

--
Er*********@sun.com

Nov 13 '05 #9

Arthur J. O'Dwyer

On Tue, 16 Sep 2003, Christopher Benson-Manica wrote:

Eric Sosman <Er*********@sun.com> spoke thus:
struct name {
int namelen;
char namestr[1];
};

Allocating memory for the struct "plus:"

struct name *ptr;
ptr = malloc(offsetof(struct name, namestr) + strlen(name) + 1);
assert (ptr != NULL);
ptr->namelen = strlen(name);
strcpy(ptr->namestr, name);

Okay. I think my real question is, why the heck would one use
this cheap hack (that *is* what it is, right?) instead of

struct name {
int namelen;
char *namestr;
};

and just allocate space for the string separately?

Speed. One allocation versus two allocations.
Unwarranted chumminess with the implementation.
:-)

Generally, such hacks are done in the name of
"efficiency", and usually end up obfuscating or
deportabilizing the code, for no gain.

My $.02,
-Arthur

Nov 13 '05 #10

Richard Heathfield

Christopher Benson-Manica wrote:

Eric Sosman <Er*********@sun.com> spoke thus:
struct name {
int namelen;
char namestr[1];
};
Allocating memory for the struct "plus:"

struct name *ptr;
ptr = malloc(offsetof(struct name, namestr) + strlen(name) + 1);
assert (ptr != NULL);
ptr->namelen = strlen(name);
strcpy(ptr->namestr, name);

Okay. I think my real question is, why the heck would one use this cheap
hack (that *is* what it is, right?)

Yes, it's a cheap hack; dmr calls it "unwarranted chumminess with the
implementation".

instead of

struct name {
int namelen;
char *namestr;
};

and just allocate space for the string separately?

The payoff is a single malloc instead of two, and contiguous data storage.
The price is tackiness, and of course it's not difficult to concoct
implementations on which the cheap hack would fail.

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #11

Eric Sosman

Christopher Benson-Manica wrote:

Eric Sosman <Er*********@sun.com> spoke thus:
struct name {
int namelen;
char namestr[1];
};

Allocating memory for the struct "plus:"

struct name *ptr;
ptr = malloc(offsetof(struct name, namestr) + strlen(name) + 1);
assert (ptr != NULL);
ptr->namelen = strlen(name);
strcpy(ptr->namestr, name);

Okay. I think my real question is, why the heck would one use this cheap hack
(that *is* what it is, right?) instead of

struct name {
int namelen;
char *namestr;
};

and just allocate space for the string separately?

Because malloc() and friends usually need to add some
bookkeeping data for each allocated area. If the items
you allocate are small and numerous, the bookkeeping
overhead can become important, and can be worth minimizing.

--
Er*********@sun.com

Nov 13 '05 #12

The Real OS/2 Guy

On Tue, 16 Sep 2003 20:35:32 UTC, Eric Sosman <Er*********@sun.com>
wrote:

The Real OS/2 Guy wrote:

On Tue, 16 Sep 2003 17:32:36 UTC, Eric Sosman <Er*********@sun.com>
wrote:
Christopher Benson-Manica wrote:
>
> "I came across some code that declared a structure like this:
>
> struct name {
> int namelen;
> char namestr[1];
> };
>
> and then did some tricky allocation to make the namestr array act like it had
> several elements."
>
> Exactly what kind of "tricky allocation" is the FAQ talking about?

Allocating memory for the struct "plus:"

struct name *ptr;
ptr = malloc(offsetof(struct name, namestr) + strlen(name) + 1);
I've the FAQ not by hand now, but I don't think that this _bug_ will
be in there.

If there's a _bug_ present, it's still escaping my notice.
malloc(*ptr+strlen(name)) or maybe malloc(sizeof(struct
name)+strlen(name))
is the right answer.

The first suggestion won't compile,

Typo: it should have read malloc(sizeof(*ptr)+.....

because the `+' operator cannot use a struct type as an operand. The second will work,
but may allocate more space than necessary.

No, it allocates exactly the needed memory. Yes, since a struct may have
some padding bytes at its end so that the next address a pointer to
struct can point to is one the CPU can access without failing.

Whenever the compiler determines it has to do some paddings between
the int and the char[] you'll lost. Whenever you've to change the
struct by inserting some new member before the incomplete array you'll
lost too.

I think you've failed to understand what `offsetof' does.

No, but why make code complicated? Is building a char pointer from a
struct, adding the byte offset of a member, and then casting it to
another type better than simply using ptr->name
or ptr->namestr or ptr->newmember?

Write a program that has to access that struct in 3,000 translation
units in different places, then insert a new member into the struct,
then find every place in every translation unit that accesses the struct.
Is a month of retesting the whole application worth it to save some
bytes?

You might do this if you have to obfuscate the sources - but if you
were my employee (or an employee of any employer I've ever worked for)
you would be fired before you had finished a single translation unit.

Using a byte array without any structure would be cleaner then. Why
do you use a struct? You need/want to document your code! Juggling
pointer casts from type to type only to get access to a member
of a struct is crazy.

Even though it is a common trick to put an array[1] as a placeholder for
a variable-length string at the end of a struct to save the space for
a pointer (and the subsequent malloc()s to get a dynamic address for a
dynamically sized array), there is no need at all to misuse the whole
struct and play with pointer casts, do an addition, and cast the
pointer back, only to get at a single member of that struct.

If you like such games then you should read/store each
int/short/char/float/pointer and all other datatypes byte by byte,
using an unnamed array of bytes to store them - but not a struct. Then
you can win some more bytes by simply allocating one big block of
memory and doing the full housekeeping on it by hand. Then you can be
sure to avoid any padding bytes. But the readability of your code
shrinks to zero, nil, void, nada. And the code will be completely
unmaintainable. And you spend in code the volume you save in data,
and the time the program needs to do its job increases significantly.
In the extreme, it would need several times as long to load or save
information as it needs to do the work it is designed for.

--
Tschau/Bye
Herbert

eComStation 1.1 Deutsch Beta ist verügbar

Nov 13 '05 #13

The Real OS/2 Guy

On Tue, 16 Sep 2003 20:34:14 UTC, Christopher Benson-Manica
<at***@nospam.cyberspace.org> wrote:

Eric Sosman <Er*********@sun.com> spoke thus:
struct name {
int namelen;
char namestr[1];
};
Allocating memory for the struct "plus:"
struct name *ptr;
ptr = malloc(offsetof(struct name, namestr) + strlen(name) + 1);

malloc(sizeof(*ptr) + strlen.......

assert (ptr != NULL);
ptr->namelen = strlen(name);
strcpy(ptr->namestr, name);

Okay. I think my real question is, why the heck would one use this cheap hack
(that *is* what it is, right?) instead of

It is simply to save
1. time-intensive calls to malloc(). You need ONE instead of TWO
   malloc()s.
2. malloc itself needs some bytes to manage the chunks it gives to
   the caller.
3. when you constantly malloc() and free() little chunks of memory,
   it can happen that the space malloc manages gets heavily fragmented
   and the next call that requires a bigger chunk can't be fulfilled
   because malloc can't find a chunk that is big enough. So even when
   some GB are free, malloc() must tell you that it has not enough
   memory for you.
4. memory.

When you have to allocate lots of small strings, say about 5 to 20
bytes each, the size of the pointer you also need is a significant
cost. So it makes sense to concatenate small requests into one bigger one.

struct name {
int namelen;
char *namestr;
};

and just allocate space for the string separately?

On a 32-bit machine a pointer costs 4 bytes. And when your string is
on average only 8 bytes long, you have to use 50% more memory. That
can make it simply impossible to hold all the data in memory, whereas
char namestr[1] saves the pointer (so you get 1/4 more structs in the
same amount of memory). O.k., in practice the struct will hold some
more members, so the difference is not sooo big - but still big enough
to do it.

When your requirement for such lists is relatively small, so that you can
give away some unused bytes, AND the maximum size you have to use for
namestr is exactly defined, you can simply use
struct name {
int namelen;
char namestr[51];
};

But when 50 chars is the worst case and your names are 20 bytes long
on average, you give away about 30 bytes for nothing.

--
Tschau/Bye
Herbert

eComStation 1.1 Deutsch Beta ist verügbar

Nov 13 '05 #14

Arthur J. O'Dwyer

On Tue, 16 Sep 2003, The Real OS/2 Guy wrote:

Eric Sosman wrote:
The Real OS/2 Guy wrote:
Eric Sosman wrote:
> Christopher Benson-Manica wrote:
> >
> > struct name {
> > int namelen;
> > char namestr[1];
> > };
> >
> > Exactly what kind of "tricky allocation" [...]
>
> Allocating memory for the struct "plus:"
>
> struct name *ptr;
> ptr = malloc(offsetof(struct name, namestr) + strlen(name) + 1);
malloc(sizeof *ptr + strlen(name)) or maybe
malloc(sizeof(struct name) + strlen(name))
is the right answer. [ > Now that the first one is fixed, they'll both work, ]
but may allocate more space than necessary.

No it allocates exactly the needed memory. Yes, as a struct may have
some padding bytes at its end to get the next address a pointer to
struct can point to without failing on the CPU.

Of course not; and yes, that's exactly why; respectively.
Consider the following hypothetical memory layouts:

A) 4 bytes of int, 1 byte of char
B) 4 bytes of int, 1 byte of char, 3 bytes of padding
C) 4 bytes of int, 3 bytes of padding, 1 byte of char

Now, whose method works?

                 A     B     C
Eric's method   yes   yes   yes
Herb's method   yes    *    yes

The * marks the case in which Herb's method allocates three
more bytes of memory than necessary. Note that Eric's
method handles this case just fine.

Whenever the compiler determines it has to do some paddings between
the int and the char[] you'll lost. Whenever you've to change the
struct by inserting some new member before the incomplete array you'll
lost too.

I think you've failed to understand what `offsetof' does.

No, but why make code [complicated]? Is building a char pointer from a
struct, adding the difference in bytes to an member of a struct
casting it then to another type better than using simply the ptr->name
or ptr->namestr or ptr->newmember?

Yes, I think you don't understand what offsetof does.

ptr = malloc(offsetof(struct name, namestr) + strlen(name) + 1);

This code takes a compile-time constant and adds it to the length
of string 'name', just like yours does. The only difference is in
the choice of constant; this one's correct, yours isn't.

Write a program that has to access that struct in 3.000 transation
units in different places, then insert a new member into the struct,
then find any place in any translation unit that accesses the struct.
Is [a month] of retesting the whole application then worth [it] to
save some bytes?

Not really. However, haven't we been saying all along that the
"struct hack" isn't a good idea anyway, just to "save some bytes"?
Note also that Eric's method works, even if the programmer later
decides to change

struct foo {
int i;
char namestr[1];
};

to

struct foo {
int j;
int k;
char bar[100];
char namestr[1];
};

As long as the "struct hack" is applicable ('namestr' remains the
last item in the struct, with a dimension of 1), it will work.
[snip garbage]
When you likes such games then you should read/store each
int/short/char/float/pointer and all other datatypes byte by byte,
using an unnamed array of bytes to store them - but not a struct.

[This is always the case, IMHO. Designing a portable floating-point
format could be tricky, though. And it doesn't have much to do
with memory allocation.]

-Arthur

Nov 13 '05 #15

Matt Gregory

The Real OS/2 Guy wrote:

What is when you have to insert another member into the struct? You
have to scan through 3 millions lines of code to find any point where
you'd used the wrong formula to determine the space you have to
allocate that struct and correct it. When you misses ONE place then
you gets UB.

Well, I personally wouldn't use this technique like that. I would wrap
it in a module for an ADT or something and make an interface for it.

Allocate the number of bytes whole struct grasps (the compiler knows
the size) + the number of bytes you have to store in the dummy string
field and let the compiler do the needed math correctly.

and that conveniently gives you a

struct name {
int namelen;
char namestr[50];
};

variable since there's no bounds checking in C.

No, not in all circumstances. If your comiler needs to insert padding
bytes between namelen and namestr you gets <number of padding bytes>
too less bytes reserved. Means you will address space that is NOT
reserved for the struct when you tries to store a string of 50 bytes
AND you've forgotten to reserve a byte for the \0 charater that ends
the string.
At least, that's how I
do it. I suppose that's wrong though because of field padding, but it's
never bothered me as long as you keep all the fields before namestr[]
32 bits on a 32-bit machine.

Why gives you wrong tips when with less typing the right can be given?

Well, I thought it better demonstrated the logic behind the concept
without all the kludges. I assumed the OP wanted to think about the
technique as opposed to copying and pasting code from usenet.

Matt Gregory

Nov 13 '05 #16

Jack Klein

On Tue, 16 Sep 2003 19:49:14 +0000 (UTC), "The Real OS/2 Guy"
<os****@pc-rosenau.de> wrote in comp.lang.c:

On Tue, 16 Sep 2003 17:17:38 UTC, Matt Gregory
<ms********@earthlink.net> wrote:
Christopher Benson-Manica wrote:
"I came across some code that declared a structure like this:

struct name {
int namelen;
char namestr[1];
};

and then did some tricky allocation to make the namestr array act like it had
several elements."

Exactly what kind of "tricky allocation" is the FAQ talking about?

You can go:

struct name *n = malloc(sizeof(int) + 50);

You means
struct name *n = malloc(sizeof(*n) + 50);
This gives you room for the int and a \0 termintated string of 50
chars. (an array of 51) AND it includes any padding that may or may
not the compiler include between the members of the struct.

What is when you have to insert another member into the struct? You
have to scan through 3 millions lines of code to find any point where
you'd used the wrong formula to determine the space you have to
allocate that struct and correct it. When you misses ONE place then
you gets UB.

You don't even have to miss ONE place, it's undefined behavior
regardless. And almost always unnecessary trickery. I'd be willing
to bet that in 99% of the places where it's used in place of a pointer
member pointing to a separately malloc'ed buffer it doesn't have a
noticeable performance impact on the program anyway.

Still, it is undefined, which is why the latest standard provides a
similar mechanism that is specifically defined differently.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq

Nov 13 '05 #17

The Real OS/2 Guy

On Tue, 16 Sep 2003 23:37:35 UTC, "Arthur J. O'Dwyer"
<aj*@andrew.cmu.edu> wrote:

On Tue, 16 Sep 2003, The Real OS/2 Guy wrote:

Eric Sosman wrote:
The Real OS/2 Guy wrote:
> Eric Sosman wrote:
> > Christopher Benson-Manica wrote:
> > >
> > > struct name {
> > > int namelen;
> > > char namestr[1];
> > > };
> > >
> > > Exactly what kind of "tricky allocation" [...]
> >
> > Allocating memory for the struct "plus:"
> >
> > struct name *ptr;
> > ptr = malloc(offsetof(struct name, namestr) + strlen(name) + 1);
> malloc(sizeof *ptr + strlen(name)) > or maybe
> malloc(sizeof(struct name) + strlen(name))
> is the right answer.
[ > Now that the first one is fixed, they'll both work, ] but may allocate more space than necessary.
No it allocates exactly the needed memory. Yes, as a struct may have
some padding bytes at its end to get the next address a pointer to
struct can point to without failing on the CPU.

Of course not; and yes, that's exactly why; respectively.
Consider the following hypothetical memory layouts:

A) 4 bytes of int, 1 byte of char
B) 4 bytes of int, 1 byte of char, 3 bytes of padding
C) 4 bytes of int, 3 bytes of padding, 1 byte of char

Now, whose method works?

                 A     B     C
Eric's method   yes   yes   yes
Herb's method   yes    *    yes

The * marks the case in which Herb's method allocates three
more bytes of memory than necessary. Note that Eric's
method handles this case just fine.

Wrong, because this would never occur! That is because a char[] will
always start at an aligned address - and you describe a way to give
the array an unaligned address.

What is when you have another member in the struct?

Your project requires the struct to change to

struct name {
struct NAME *pNect;
unsigned short activity : 3;
unsigned short extInfo : 1;
unsigned short reserved : 3;
unsigned short io :1;
int namelen;
char namestr[1];
};

Every place where you try to access one of the members in an obscure
way must be changed to get name and namestr accessed right, whereas
only 2 or 3 places need access to the new members.

What happens when your environment requires 8-byte alignment? Then
you'll have

D) 4 bytes of int, 4 bytes padding, 1 byte of char[], 7 bytes padding

sizeof(int) = 4
sizeof(struct) = 16

And you can never address the namestring at all using the pointer
arithmetic the compiler has built in.

Never make code obscure! Keep it simple. Let the compiler do any dirty
work under the covers for you - it is designed to do so.

> Whenever the compiler determines it has to do some paddings between
> the int and the char[] you'll lost. Whenever you've to change the
> struct by inserting some new member before the incomplete array you'll
> lost too.

I think you've failed to understand what `offsetof' does.
No, but why make code [complicated]? Is building a char pointer from a
struct, adding the difference in bytes to an member of a struct
casting it then to another type better than using simply the ptr->name
or ptr->namestr or ptr->newmember?

Yes, I think you don't understand what offsetof does.

ptr = malloc(offsetof(struct name, namestr) + strlen(name) + 1);

I have known about offsetof since before you finished the third grade
at school. You don't have to obfuscate your code more than necessary.

why not simply
ptr = malloc(sizeof(struct name) + strlen(name));
It is clear, secure code - and at least it is more maintainable.

It's much shorter to type, and it reduces the danger that you forget to
write the unneeded offsetof(). You do NOT have to think about the extra
byte you need for the string terminator - because it is already
included in the result of sizeof(struct name).

You can't save the space of a single char because malloc will still
round the size you request up so that the chunk ends just before the
next available aligned address. The only thing you achieve is to
obfuscate the code.

This code takes a compile-time constant and adds it to the length
of string 'name', just like yours does. The only difference is in
the choice of constant; this one's correct, yours isn't.

False. Try to determine if 'Mayr' is the surname in an array of 1984
members of the struct by both methods. But don't use an implementation
defined macro (offsetof is implementation defined), don't define your
own, use only ANSI C to do so.

--
Tschau/Bye
Herbert

eComStation 1.1 Deutsch Beta ist verügbar

Nov 13 '05 #18

The Real OS/2 Guy

On Wed, 17 Sep 2003 02:06:58 UTC, Matt Gregory
<ms********@earthlink.net> wrote:

The Real OS/2 Guy wrote:

What happens when you have to insert another member into the struct? You
have to scan through 3 million lines of code to find every place where
you used the wrong formula to determine the space you have to
allocate for that struct and correct it. If you miss ONE place then
you get UB.

Well, I personally wouldn't use this technique like that. I would wrap
it in a module for an ADT or something and make an interface for it.

Sure, but when you have only limited memory to store data but lots of
dynamic arrays of different struct types, you use this hack to get the
smallest possible memory footprint. On the other hand, you can't do that
with a single function because there are too many parameters that differ
from struct type to struct type - and even the name of the variable array
differs.

I've written many programs where memory usage was a critical factor,
so the detailed specification of the program called for using
this hack to save memory.

Clean code is a must in every fail-safe environment. Hacking around with
offsetof where there is no real need for it makes the code obscure.

Beyond that, I even wrote my own replacement for malloc to win back most
of the bytes malloc requires for its own bookkeeping, because the length
of a string or even the size of a struct is often no larger than the
number of bytes malloc requires for itself per chunk. So I fit some
structs and strings into the space that malloc (stdlib) alone needs.

In its debug form it detects illegal pointers passed to my free() (and it
also detects repeated calls with already freed pointers).

In its production form it uses only ONE byte for bookkeeping of used chunks
and four bytes in the free list when the requested chunk is less than 61
bytes, or 2/4 bytes for chunks less than 1021 bytes. A set of
string-specific functions allows strings to use the bytes that would
otherwise be left unusable in order to deliver aligned addresses, because
a string can start at an unaligned address anyway. This yields more usable
memory than any hack on the hack will ever give.

Internally it uses malloc() to obtain a number of large blocks from which
it hands the smallest possible mini/midi-sized chunks to the application.
It is designed so that the underlying malloc() can be replaced with the
OS-defined equivalent, but that extension is not implemented yet. The
environment I mostly work under has an application address space of only
512 MB, and although there is a barely sufficiently documented way to use
extended memory, malloc() does not know of it. So for really memory-hungry
applications one has to use the nearly undocumented system
functions - making it really unwieldy for an app.

--
Tschau/Bye
Herbert

eComStation 1.1 German beta is available

Nov 13 '05 #19

Kevin Easton

Arthur J. O'Dwyer <aj*@andrew.cmu.edu> wrote:

On Tue, 16 Sep 2003, Christopher Benson-Manica wrote:

Eric Sosman <Er*********@sun.com> spoke thus:
>
>> struct name {
>> int namelen;
>> char namestr[1];
>> };
>
> Allocating memory for the struct "plus:"
>
> struct name *ptr;
> ptr = malloc(offsetof(struct name, namestr) + strlen(name) + 1);
> assert (ptr != NULL);
> ptr->namelen = strlen(name);
> strcpy(ptr->namestr, name);

Okay. I think my real question is, why the heck would one use
this cheap hack (that *is* what it is, right?) instead of

struct name {
    int namelen;
    char *namestr;
};

and just allocate space for the string separately?

Speed. One allocation versus two allocations.
Unwarranted chumminess with the implementation.
:-)

You don't have to have two allocations. The same old:

struct name *n = malloc(sizeof *n + 50);

allocation suffices - you just follow it up with:

n->namelen = 50;
n->namestr = (char *)n + sizeof *n;
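
Spelled out a little more fully, the idea above might look like the following sketch. The helper name name_new() and the use of memcpy() are assumptions made for illustration, and the length comes from the string rather than the fixed 50 used above.

```c
#include <stdlib.h>
#include <string.h>

struct name {
    int namelen;
    char *namestr;   /* points just past the struct, into the same block */
};

/* Hypothetical helper: one malloc() holds both the struct and the string. */
struct name *name_new(const char *s)
{
    size_t len = strlen(s);
    struct name *n = malloc(sizeof *n + len + 1);   /* +1 for the '\0' */

    if (n == NULL)
        return NULL;
    n->namelen = (int)len;
    n->namestr = (char *)n + sizeof *n;   /* char has no alignment needs */
    memcpy(n->namestr, s, len + 1);       /* copies the terminator too */
    return n;
}
```

A single free(n) releases both the struct and the string, since they live in one block. The cost compared with the struct hack is the extra pointer stored in every object.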

- Kevin.

Nov 13 '05 #20

Kevin D. Quitt

On Wed, 17 Sep 2003 02:32:33 GMT, Jack Klein <ja*******@spamcop.net>
wrote:

You don't even have to miss ONE place, it's undefined behavior
regardless. And almost always unnecessary trickery. I'd be willing
to bet that in 99% of the places where it's used in place of a pointer
member pointing to a separately malloc'ed buffer it doesn't have a
noticeable performance impact on the program anyway.

Try building and working with network packets. Besides, the struct hack is
supported in C99, 6.7.2.1:

16 As a special case, the last element of a structure with more than one
named member may have an incomplete array type; this is called a flexible
array member. With two exceptions, the flexible array member is ignored.
First, the size of the structure shall be equal to the offset of the last
element of an otherwise identical structure that replaces the flexible
array member with an array of unspecified length.106) Second, when a . (or
->) operator has a left operand that is (a pointer to) a structure with a
flexible array member and the right operand names that member, it behaves
as if that member were replaced with the longest array (with the same
element type) that would not make the structure larger than the object
being accessed; the offset of the array shall remain that of the flexible
array member, even if this would differ from that of the replacement
array. If this array would have no elements, it behaves as if it had one
element but the behavior is undefined if any attempt is made to access
that element or to generate a pointer one past it.

17 EXAMPLE Assuming that all array members are aligned the same, after the
declarations:

struct s { int n; double d[]; };
struct ss { int n; double d[1]; };

the three expressions:

sizeof (struct s)
offsetof(struct s, d)
offsetof(struct ss, d)

have the same value. The structure struct s has a flexible array member d.
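
Applied to the struct from the original question, the C99 form might be used roughly as in the sketch below. The helper name_new() is hypothetical and not part of the quoted standard text; a C99 or later compiler is assumed.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct name {
    int namelen;
    char namestr[];   /* C99 flexible array member: must be the last member */
};

/* Hypothetical helper: allocate the struct and the string in one block. */
struct name *name_new(const char *s)
{
    size_t len = strlen(s);
    /* sizeof *p does not count namestr, so the string and its
       terminator are added explicitly. */
    struct name *p = malloc(sizeof *p + len + 1);

    if (p == NULL)
        return NULL;
    p->namelen = (int)len;
    memcpy(p->namestr, s, len + 1);
    return p;
}
```

Unlike the namestr[1] version, accessing p->namestr[i] within the allocated space is defined behavior here.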
--
#include <standard.disclaimer>
_
Kevin D Quitt USA 91387-4454 96.37% of all statistics are made up
Per the FCA, this address may not be added to any commercial mail list

Nov 13 '05 #21

Eric Sosman

The Real OS/2 Guy wrote:

On Tue, 16 Sep 2003 23:37:35 UTC, "Arthur J. O'Dwyer"
<aj*@andrew.cmu.edu> wrote:
Of course not; and yes, that's exactly why; respectively.
Consider the following hypothetical memory layouts:

A) 4 bytes of int, 1 byte of char

B) 4 bytes of int, 1 byte of char, 3 bytes of padding

C) 4 bytes of int, 3 bytes of padding, 1 byte of char
Now, whose method works?

                 A     B     C
Eric's method    yes   yes   yes
Herb's method    yes   *     yes
The * marks the case in which Herb's method allocates three
more bytes of memory than necessary. Note that Eric's
method handles this case just fine.
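
To check which layout a particular compiler uses, something like the following sketch could be compiled. It assumes the struct name from the original question; extra_bytes() is a made-up helper for illustration.

```c
#include <stddef.h>

struct name {
    int namelen;
    char namestr[1];
};

/* How many more bytes the sizeof-based formula requests than the
   offsetof-based one, for a string of any length:
     sizeof(struct name) + len   versus   offsetof(...) + len + 1 */
size_t extra_bytes(void)
{
    return sizeof(struct name) - (offsetof(struct name, namestr) + 1);
}
```

With layout B (4 bytes of int, 1 byte of char, 3 bytes of padding) this works out to 8 - (4 + 1) = 3, the three bytes marked by the * in the table. With layouts A and C it is 0.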

Wrong, because this would never occur! That is because a char[] will
always start at an aligned address - and you describe a way to give
the array an unaligned address.

Are you claiming that Case B will never occur? Every
compiler I've ever seen in the past twenty-five years has
used the Case B algorithm (sometimes adjusted for different
`int' sizes). Of course, the Standard permits any of A,B,C
and other variations as well.

What happens when you have another member in the struct?

Your project requires the struct to change to

struct name {
    struct name *pNext;
    unsigned short activity : 3;
    unsigned short extInfo : 1;
    unsigned short reserved : 3;
    unsigned short io : 1;
    int namelen;
    char namestr[1];
};
(Aside: bit-fields of `unsigned short' type are non-portable.)

All the places where you access one of the members in an obscure way
must be changed to get name and namestr accessed right, whereas only
2 or 3 places need access to the new members.

The fundamental assumption of the struct hack -- and it *is*
a hack, IMHO -- is that the "expandable" element is the last in
the struct. As long as that requirement is met, the offsetof()
formulation is correct as written and need not be changed at all
when the struct grows or shrinks.
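
As a rough illustration of that point, here is a sketch in which members have been added ahead of namestr while the size formula stays exactly as first written. The added members (next, flags) and the helper name_alloc_size() are invented for this example.

```c
#include <stddef.h>
#include <string.h>

struct name {
    struct name *next;   /* member added later */
    unsigned flags;      /* another later addition */
    int namelen;
    char namestr[1];     /* the "expandable" member must stay last */
};

/* Bytes to request from malloc() for a struct holding string s.
   The expression never mentions the members in front of namestr,
   so adding or removing them requires no change here. */
size_t name_alloc_size(const char *s)
{
    return offsetof(struct name, namestr) + strlen(s) + 1;
}
```

The compiler recomputes offsetof(struct name, namestr) whenever the struct definition changes, which is why no call site needs editing.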

What happens when your environment requires 8-byte alignment? Then
you'll have

D) 4 bytes of int, 4 bytes padding, 1 byte of char[], 7 bytes
padding

sizeof(int) = 4
sizeof(struct) = 16

And you can never address the name string at all using the pointer
arithmetic the compiler has built in.

Case D as described is certainly within the realm of
possibility, but the conditions don't imply the conclusion.
Specifically:

- It is still possible to form a perfectly good pointer
to the name string.

- It is still possible to access that same string with
an expression like `ptr->namestr'.

- ... provided, of course, that the struct hack actually
operates as expected, that is, that the undefined
behavior it invokes is the desired and usual outcome.
If the implementation does bounds-checking on the
namestr[1] array, this whole business is doomed anyhow.
Hacks are still hacks.

Never make code obvious! Keep it simple. Let the compiler do any dirty
work under cover for you - it is designed to do so.

Um, er, that's exactly what the offsetof() is for ...

> I think you've failed to understand what `offsetof' does.
I have known of offsetof since before you left the third grade of your
school.

False, unless you are prescient. C and offsetof() did not
exist at the time to which you refer.

why not simply
ptr = malloc(sizeof(struct name) + strlen(name));
It is clear, secure code - and at least it is more maintainable.

It is neither clearer nor murkier than using offsetof()
(assuming the reader understands offsetof(), of course). It
is neither more nor less secure. It is neither more nor less
maintainable. It is, however, potentially more wasteful, as
illustrated by Cases B and D.

It's much shorter to type and reduces the danger that you forget to write the
unneeded offsetof(). You do NOT have to think about the extra byte you need
for the string terminator - because it is already included in the
result of sizeof(struct name).

Shorter to type, yes. It would be shorter still if all the
identifiers were shortened to single letters and all the extra
white space were squeezed out and comments were banned. Personally,
I don't want to work in that world. Brvty's nt alws sol o' wt.

"Reduces the danger you forget to write the unneeded offsetof()"
baffles me. There is, I suppose, some slight danger that you'll
forget to write offsetof(), but I don't see how that differs from
the equally slight danger that you might forget to write sizeof(),
or for that matter malloc(). And offsetof() is "unneeded" in
exactly the same way as sizeof() is "unneeded" -- no matter which
formulation you choose, you've got to include all the parts.

As for remembering to add 1 to strlen(): a C programmer prone
to forgetting the '\0' should be encouraged to seek other avenues
of employment. It's like forgetting the difference between 5/9
and 5.0/9.0 -- a programmer who can't keep such things in mind
with essentially no effort is not a programmer at all.

You can't save the space of a single char because malloc will still
increase the size you require to be an index next through the next
available aligned address - 1.

The parsing of this sentence is tricky, but I *think* you're
arguing that malloc() will inflate the actual allocation by the
same amount no matter which method is chosen. Well, that's false.
Let's consider your own Case D, here repeated:
D) 4 bytes of int, 4 bytes padding, 1 byte of char[], 7 bytes
padding

sizeof(int) = 4
sizeof(struct) = 16
.... and see what happens with a string of five characters plus a
sixth spot for the terminator:

sizeof(struct name) + strlen(name) == 16 + 5 == 21,
which (by hypothesis) malloc() rounds up to 24.

offsetof(struct name, namestr) + strlen(name) + 1
== 8 + 5 + 1 == 14, which malloc() rounds up to 16.
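
The same arithmetic can be written out as a small sketch. The Case D figures (struct size 16, namestr at offset 8, malloc() rounding to multiples of 8) are hard-coded as assumptions taken from the hypothetical above, not measured from any real implementation, and the function names are made up.

```c
#include <stddef.h>

/* Round n up to the next multiple of align (align must be nonzero). */
size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Hypothetical Case D layout, hard-coded for illustration only. */
enum { CASE_D_SIZEOF = 16, CASE_D_OFFSET = 8, CASE_D_ALIGN = 8 };

/* Bytes actually consumed for a string of length len, per method. */
size_t sizeof_method(size_t len)
{
    return round_up(CASE_D_SIZEOF + len, CASE_D_ALIGN);
}

size_t offsetof_method(size_t len)
{
    return round_up(CASE_D_OFFSET + len + 1, CASE_D_ALIGN);
}
```

For len == 5 these give 24 and 16 respectively, matching the figures above.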

The only thing you achieve is to obfuscate
the code.

It's not obfuscated to someone who knows what offsetof() is.
This code takes a compile-time constant and adds it to the length
of string 'name', just like yours does. The only difference is in
the choice of constant; this one's correct, yours isn't.

False. Try to determine if 'Mayr' is the surname in an array of 1984
members of the struct by both methods.

/* Code for offsetof() method: */
for (i = 0; i < 1984; ++i) {
    if (strcmp(namestructs[i].namestr, "Mayr") == 0)
        return 1;
}
return 0;

/* Code for sizeof() method: */
for (i = 0; i < 1984; ++i) {
    if (strcmp(namestructs[i].namestr, "Mayr") == 0)
        return 1;
}
return 0;

I see no important differences between the two code snippets.

But don't use an implementation
defined macro (offsetof is implementation defined), don't define your
own, use only ANSI C to do so.

As requested, I avoided using offsetof(). I also avoided
using pow(), longjmp(), and qsort(), all of which were equally
relevant.

But why avoid offsetof() simply because it's "implementation-
defined?" So, too, is sizeof(). And malloc() and strlen() and
strcmp(), for that matter: the implementation must supply all of
them if it's to conform to the Standard. Why cut yourself off
from a Standard-mandated feature of the language? It's a little
like one of those silly contests to write programs without
semicolons or without curly braces: possibly interesting as an
exercise, but hardly relevant to actual programming.

--
Er*********@sun.com

Nov 13 '05 #22
