Which is the better way to declare a dynamic single-dimension array inside a struct?

On 26 Feb 2004 21:45:07 -0800, ge**********@yahoo.co.in (Geetesh)
wrote in comp.lang.c:

Recently I saw some code containing a structure definition similar
to the one below:
struct foo
{
int dummy1;
int dummy2;
int last[1];
};
This causes undefined behavior and is invalid code under all versions
of the C language standard.
In the application, the struct is always allocated at runtime using
malloc. The last member of the structure, "int last[1]", is not
actually used as an array with a single element; instead, extra memory
is allocated when allocating space for struct foo, and last is used as
an array with more than one element. My question is: what are the
advantages of using the above definition instead of the one shown below?
The advantages are that some programmers in any language are hot-shots
who think they know everything and happen on a trick that might work
with their particular compiler and think clever trickery proves that
they are good programmers.
struct foo
{
int dummy1;
int dummy2;
int *last;
};
The only advantage I can think of is that the first declaration needs a
single malloc call while the second needs two, and that with the first
declaration all the allocated memory is contiguous, which may lead to
less fragmentation and better cache utilization. My question is: is
accessing elements faster with the first definition than with the
second? If yes, why?
Thanks in advance.

You can still do a single malloc allocation:

foo *fp = malloc(sizeof *foo + how_many_characters_i_want);

/* error checking omitted */

fp->last = (char *)fp + sizeof *fp;

As to "faster", that doesn't apply when you are talking about illegal
code that produces undefined behavior.

Even when comparing two different legal methods of doing something,
the C standard does not specify the relative performance of anything.
The answer could be exactly opposite from one compiler to another.
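
A minimal sketch of the single-allocation idea above, adapted to the int
elements of the original struct. The helper name foo_create and its count
parameter are hypothetical and do not come from the thread. The sketch
assumes that sizeof(struct foo) is a multiple of the alignment of int,
which holds here because the struct itself contains int members.

```c
#include <stdint.h>
#include <stdlib.h>

struct foo
{
    int dummy1;
    int dummy2;
    int *last;
};

/* Hypothetical helper: one malloc holds the struct and `count` ints. */
struct foo *foo_create(size_t count)
{
    struct foo *fp;

    /* Guard against overflow in the size computation. */
    if (count > (SIZE_MAX - sizeof *fp) / sizeof *fp->last)
        return NULL;

    fp = malloc(sizeof *fp + count * sizeof *fp->last);
    if (fp == NULL)
        return NULL;

    fp->dummy1 = 0;
    fp->dummy2 = 0;
    /* The ints start right after the struct, inside the same block. */
    fp->last = (int *)((char *)fp + sizeof *fp);
    return fp;
}
```

With this layout a single free(fp) releases the struct and the array
together, since both live in one allocated block.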

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html

Nov 14 '05 #2

Nejat AYDIN

Jack Klein wrote:

[...]
struct foo
{
int dummy1;
int dummy2;
int *last;
};

[...] You can still do a single malloc allocation:

foo *fp = malloc(sizeof *foo + how_many_characters_i_want);
^^^                     ^^^^
struct foo *fp = malloc(sizeof *fp + how_many_characters_i_want);

Nov 14 '05 #4

Jack Klein <ja*******@spamcop.net> wrote:

On 26 Feb 2004 21:45:07 -0800, ge**********@yahoo.co.in (Geetesh)
wrote in comp.lang.c:
struct foo
{
int dummy1;
int dummy2;
int *last;
};
The only advantage I can think of is that the first declaration needs a
single malloc call while the second needs two
You can still do a single malloc allocation:

foo *fp = malloc(sizeof *foo + how_many_characters_i_want);

Um... sizeof *type?
/* error checking omitted */

fp->last = (char *)fp + sizeof *fp;

Mind you, that only works with characters; the resulting space is not
guaranteed to be properly aligned for other types. With a bit of trouble
this should be solvable; for example, I think that

struct foo {
float bar;
long baz;
mytype *ptr;
mytype dummy; /* Really _is_ a dummy, for alignment only. */
};
struct foo *fooptr=malloc(sizeof *fooptr +
desired_number_of_objects_of_type_mytype * sizeof *(fooptr->ptr));
fooptr->ptr=(mytype *)((char *)fooptr + sizeof *fooptr - sizeof *(fooptr->ptr));

should work no matter what mytype is. Note that the dummy object should
be of the base type of the pointer for this trick to work, and that it's
still a dirty piece of code and I give no guarantees; I wouldn't use
this myself. The code with two malloc()s is clearer and cleaner.
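
For comparison, a minimal sketch of the two-malloc() version, using the
struct from the original question. The helper names foo_create2 and
foo_destroy2 are hypothetical, chosen only for this illustration.

```c
#include <stdlib.h>

struct foo
{
    int dummy1;
    int dummy2;
    int *last;
};

/* Hypothetical helper: allocate the struct and its array separately. */
struct foo *foo_create2(size_t count)
{
    struct foo *fp = malloc(sizeof *fp);
    if (fp == NULL)
        return NULL;

    /* calloc gives a zero-filled array; it may return NULL for count 0. */
    fp->last = calloc(count, sizeof *fp->last);
    if (fp->last == NULL && count != 0) {
        free(fp);
        return NULL;
    }
    fp->dummy1 = 0;
    fp->dummy2 = 0;
    return fp;
}

/* Hypothetical helper: release both allocations. */
void foo_destroy2(struct foo *fp)
{
    if (fp != NULL) {
        free(fp->last);
        free(fp);
    }
}
```

The cost is a second allocation and a second free(), but no casts or
alignment reasoning are needed, and the array can later be resized with
realloc() without moving the struct.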

Richard

Nov 14 '05 #5

On Fri, 27 Feb 2004 05:45:07 UTC, ge**********@yahoo.co.in (Geetesh)
wrote:

Recently I saw some code containing a structure definition similar
to the one below:
struct foo
{
int dummy1;
int dummy2;
int last[1];
};
In the application, the struct is always allocated at runtime using
malloc. The last member of the structure, "int last[1]", is not
actually used as an array with a single element; instead, extra memory
is allocated when allocating space for struct foo, and last is used as
an array with more than one element. My question is: what are the
advantages of using the above definition instead of the one shown below?
struct foo
{
int dummy1;
int dummy2;
int *last;
};
The only advantage I can think of is that the first declaration needs a
single malloc call while the second needs two, and that with the first
declaration all the allocated memory is contiguous, which may lead to
less fragmentation and better cache utilization. My question is: is
accessing elements faster with the first definition than with the
second? If yes, why?
Thanks in advance.

It saves memory: at least the amount of memory a pointer costs.
It saves time, as two malloc() calls are not required every time to
fill a whole struct.

No, it is NOT undefined behavior as Jack Klein says. But it is
implementation defined.

Look at the APIs of your OS. The chance is high that there is at least
one API that delivers or receives this kind of struct.
--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation

Nov 14 '05 #6

On Fri, 27 Feb 2004 06:14:54 UTC, Jack Klein <ja*******@spamcop.net>
wrote:

On 26 Feb 2004 21:45:07 -0800, ge**********@yahoo.co.in (Geetesh)
wrote in comp.lang.c:
Recently I saw some code containing a structure definition similar
to the one below:
struct foo
{
int dummy1;
int dummy2;
int last[1];
};

This causes undefined behavior and is invalid code under all versions
of the C language standard.

Chapter and verse please in both ANSI C 89 and ANSI C99.

ANSI C 99 even allows <type> last[0] in this place, which makes it
clearer that this is only a placeholder to give a name to the extra
space.

You may still call this implementation defined - but NOT undefined.

--
Tschau/Bye
Herbert


Nov 14 '05 #7

"The Real OS/2 Guy" <os****@pc-rosenau.de> wrote:

On Fri, 27 Feb 2004 06:14:54 UTC, Jack Klein <ja*******@spamcop.net>
wrote:
On 26 Feb 2004 21:45:07 -0800, ge**********@yahoo.co.in (Geetesh)
wrote in comp.lang.c:
Recently I saw some code containing a structure definition similar
to the one below:
struct foo
{
int dummy1;
int dummy2;
int last[1];
};
This causes undefined behavior and is invalid code under all versions
of the C language standard.

Chapter and verse please in both ANSI C 89 and ANSI C99.

Note: it is not the declaration as such which invokes UB, but its use as
a variable-sized struct. See the original post.

In ISO (not ANSI; neither you nor I are in the United States - what have
we to do with the US Standards Institute?) C99, 6.7.2.1, and note
especially the indication of "undefined behaviour" in example #19. Since
examples aren't normative, see also 6.5.6#8, and note that the result of
adding an integer to a pointer, where the sum would point beyond one past
the end of the array object the pointer points into, is not defined,
hence remains undefined. (Note also the equivalence of array
subscripting and pointer addition, 6.5.2.1).

In C89, see 3.3.6, which defines pointer-integer addition essentially
identically to C99. Since there's no incomplete final array type in C89
structs, you won't find anything interesting in 3.5.2.1.
ANSI C 99 even allows <type> last[0] in this place, which makes it
clearer that this is only a placeholder to give a name to the extra
space.
Not quite. It allows an incomplete array - that is, one _without_ a
size, not with size 0 - as the last member of a structure. An array with
size 0 is, AFAICT, simply not allowed; and using an array with size 1 as
if it were an incomplete array is as undefined as in C89.
You may still call this implementation defined - but NOT undefined.

It _is_ undefined. Sorry <g>.

Richard

Nov 14 '05 #8

"The Real OS/2 Guy" <os****@pc-rosenau.de> wrote:

On Fri, 27 Feb 2004 05:45:07 UTC, ge**********@yahoo.co.in (Geetesh)
wrote:
struct foo
{
int dummy1;
int dummy2;
int last[1];
};
In the application, the struct is always allocated at runtime using
malloc. The last member of the structure, "int last[1]", is not
actually used as an array with a single element; instead, extra memory
is allocated when allocating space for struct foo, and last is used as
an array with more than one element.
No, it is NOT undefined behavior as Jack Klein says. But it is
implementation defined.
Yes, it is. Pointer addition beyond the end of the array is undefined.
Look at the APIs of your OS. The chance is high that there is at least
one API that delivers or receives this kind of struct.

That some OSes choose to make this kind of undefined behaviour "work"
does not mean that it suddenly is defined.

Richard

Nov 14 '05 #9

On Fri, 27 Feb 2004 11:11:47 UTC, rl*@hoekstra-uitgeverij.nl (Richard
Bos) wrote:

"The Real OS/2 Guy" <os****@pc-rosenau.de> wrote:
On Fri, 27 Feb 2004 05:45:07 UTC, ge**********@yahoo.co.in (Geetesh)
wrote:
struct foo
{
int dummy1;
int dummy2;
int last[1];
};
In the application, the struct is always allocated at runtime using
malloc. The last member of the structure, "int last[1]", is not
actually used as an array with a single element; instead, extra memory
is allocated when allocating space for struct foo, and last is used as
an array with more than one element.
No, it is NOT undefined behavior as Jack Klein says. But it is
implementation defined.

Yes, it is. Pointer addition beyond the end of the array is undefined.

So you say any action on an array allocated with malloc ends up in
undefined behavior.
Look at the APIs of your OS. The chance is high that there is at least
one API that delivers or receives this kind of struct.

That some OSes choose to make this kind of undefined behaviour "work"
does not mean that it suddenly is defined.

You mean that

int *p = malloc(4000);

stat *p1 = p + sizeof(stat) * 100;
stat *p2 = p1++;

is undefined behavior? Then please never use malloc, as it always
results in undefined behavior.

If it really were undefined behavior, then we could not have produced
the 30 million lines of code we have produced in recent years to run on
5 different OSes (Linux, AIX, HP-UX, Mac OS and OS/2), where the same
source runs unmodified, merely recompiled, on all those machines.

We should warn our customers that their code inspections and regression
checks failed 5 years ago and that all of their critical applications
are failing every minute - even though production in their
time-critical environments has been running well ever since.

You should warn every OS producer that their OS will always fail because
it requires undefined behavior, as all of them have APIs based on that
technique.

I think you should inform yourself about what pointer arithmetic can
really do for you - when you know what you are doing.

Where is undefined behavior here?

struct x {
size_t cb;
struct a *pa;
int val;
unsigned int flags;
char *sa[1000]; /* we need 3 to 999 chars here */
};

struct y {
size_t cb;
struct a *pa;
int val;
unsigned int flags;
char s[1]; /* we have to compile ANSI C 89! */
};
struct x *p1 = malloc(sizeof(struct x) * 1000); /* UB? */
.....
struct y *p2 = malloc(sizeof(struct x) + strlen(data)); /* UB on what?
*/
.....
strcpy(y->s, data); /* UB? on what? */

Show one single ANSI C 89 compiler that gives undefined behavior on
that. I can't find one.

I can see no UB in the code fragments above. But I see that every byte
address gets addressed correctly.

Tell me what the difference is between UB and implementation
defined. I see some there.

Whenever you allocate memory in the size you need - not a single byte
less - then you CAN'T get UB when you know how to handle pointer
arithmetic. There is not a single byte, even in struct y, that is UB,
because everything is well aligned and well addressed.

--
Tschau/Bye
Herbert


Nov 14 '05 #10

Kelsey Bjarnason

[snips]

On Fri, 27 Feb 2004 16:42:50 +0000, The Real OS/2 Guy wrote:

struct y {
size_t cb;
struct a *pa;
int val;
unsigned int flags;
char s[1]; /* we have to compile ANSI C 89! */
};
struct y *p2 = malloc(sizeof(struct x) + strlen(data)); /* UB on what?
*/
This, AFAICT, is not UB; you can allocate whatever size you want.

strcpy(y->s, data); /* UB? on what? */
This, however...

I *think* - not sure - that C99 offers explicit support for this. In c89,
however, the code is broken if "data" contains anything but an empty
string, since s is defined to be one byte long. A compiler which supports
bounds-checking, for example, can happily trap here, since anything other
than an empty string in data will overrun the array bounds.

Show one single ANSI C 89 compiler that gives undefined behavior on
that. I can't find one.
All of them. The compiler doesn't define UB, the standard does. That
your compiler collection happens to allow this behaviour is irrelevant.
Tell me what the difference is between UB and implementation
defined. I see some there.
UB is anything the standard either explicitly defines to be UB, or, by
failure to define in another category, leaves undefined. Notable examples
are anything which violates a "shall" clause, such as "main shall return
an int" - thus void main() is UB.

Implementation-defined behaviour is things the particular implementation
has some freedom to "play with" - shifting of signed values, IIRC, falls
into this category. However, the implementation is required to document
the behaviour; the behaviour is _defined_... but defined by the
implementation, not by the standard.
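
A small sketch of the implementation-defined case mentioned above,
right-shifting a signed value. The function name shift_right is
hypothetical. The comments state what the C standard guarantees; what a
negative operand actually yields is whatever the implementation
documents.

```c
#include <limits.h>

/*
 * Hypothetical helper illustrating the categories for `v >> n`:
 *  - v >= 0 and 0 <= n < width of int: result fully defined by the standard
 *  - v <  0 and 0 <= n < width of int: result is implementation-defined
 *  - n < 0 or n >= width of int:       undefined behaviour
 */
int shift_right(int v, int n)
{
    /* Reject shift counts that would be undefined; 0 is just a sentinel. */
    if (n < 0 || n >= (int)(sizeof v * CHAR_BIT))
        return 0;
    return v >> n;
}
```

A call such as shift_right(-8, 1) therefore compiles and runs on every
conforming implementation, but its value must be looked up in the
compiler's documentation rather than in the standard.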
Whenever you allocate memory in the size you need - not a single byte
less - then you CAN'T get UB when you know how to handle pointer
arithmetic.

The problem isn't with malloc or with pointer arithmetic; the problem is
that you're accessing something - s - which has a definite size - 1 byte -
but not staying within the limits of the object's size.

Nov 14 '05 #11

Kelsey Bjarnason

[snips]

On Fri, 27 Feb 2004 06:06:25 +0000, Jack Klein wrote:

On 26 Feb 2004 21:45:07 -0800, ge**********@yahoo.co.in (Geetesh)
wrote in comp.lang.c:
Recently I saw some code containing a structure definition similar
to the one below:
struct foo
{
int dummy1;
int dummy2;
int last[1];
};

This causes undefined behavior and is invalid code under all versions
of the C language standard.

I thought C99 brought in support for the "struct hack"?

Nov 14 '05 #12

Ben Pfaff

Kelsey Bjarnason <ke*****@lightspeed.bc.ca> writes:

[snips]

On Fri, 27 Feb 2004 06:06:25 +0000, Jack Klein wrote:
On 26 Feb 2004 21:45:07 -0800, ge**********@yahoo.co.in (Geetesh)
wrote in comp.lang.c:
Recently I saw some code containing a structure definition similar
to the one below:
struct foo
{
int dummy1;
int dummy2;
int last[1];
};

This causes undefined behavior and is invalid code under all versions
of the C language standard.

I thought C99 brought in support for the "struct hack"?

Yes, but the C99 version is expressed with empty brackets [], not
with [1].
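
A minimal sketch of the C99 form with empty brackets (a flexible array
member), applied to the struct from the original question. The helper
name foo_alloc and its count parameter are hypothetical, introduced only
for this illustration.

```c
#include <stdint.h>
#include <stdlib.h>

struct foo
{
    int dummy1;
    int dummy2;
    int last[];   /* C99 flexible array member: must be the last member */
};

/* Hypothetical helper: allocate a struct foo with room for `count` ints. */
struct foo *foo_alloc(size_t count)
{
    struct foo *fp;

    /* Guard against overflow in the size computation. */
    if (count > (SIZE_MAX - sizeof *fp) / sizeof fp->last[0])
        return NULL;

    /* sizeof *fp does not include the flexible array's elements. */
    fp = malloc(sizeof *fp + count * sizeof fp->last[0]);
    if (fp != NULL) {
        fp->dummy1 = 0;
        fp->dummy2 = 0;
    }
    return fp;
}
```

After a successful foo_alloc(n), the elements fp->last[0] through
fp->last[n - 1] may be accessed, and a single free(fp) releases
everything.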
--
"...Almost makes you wonder why Heisenberg didn't include postinc/dec operators
in the uncertainty principle. Which of course makes the above equivalent to
Schrodinger's pointer..."
--Anthony McDonald

Nov 14 '05 #13

Arthur J. O'Dwyer

On Fri, 27 Feb 2004, Kelsey Bjarnason wrote:

On Fri, 27 Feb 2004 16:42:50 +0000, The Real OS/2 Guy wrote:

struct y {
size_t cb;
struct a *pa;
int val;
unsigned int flags;
char s[1]; /* we have to compile ANSI C 89! */
};
struct y *p2 = malloc(sizeof(struct x) + strlen(data)); /* UB on what?
*/
This, AFAICT, is not UB; you can allocate whatever size you want.

strcpy(y->s, data); /* UB? on what? */
UB on the fact that perhaps
(sizeof(struct x)+strlen(data)) < (sizeof *y), for one thing. But
I'll assume that you made a typo in the malloc line, and meant to write
struct y *p2 = malloc(sizeof(struct y) + strlen(data));
                             ^^^^^^^^

(Incidentally, I'll evangelize again: The canonical c.l.c idiom
for malloc calls would avoid this bug.)
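
A short sketch of that idiom, in which the size is taken from the object
being pointed to rather than from a type name, so the two cannot drift
apart. The type struct point and the helper make_points are hypothetical
and serve only as an illustration.

```c
#include <stdint.h>
#include <stdlib.h>

struct point { double x; double y; };   /* hypothetical type, for illustration */

/* Hypothetical helper: allocate and zero `n` points. */
struct point *make_points(size_t n)
{
    struct point *p;
    size_t i;

    /* Guard against overflow in n * sizeof *p. */
    if (n > SIZE_MAX / sizeof *p)
        return NULL;

    /* sizeof *p always matches the pointee type of p, even if it changes. */
    p = malloc(n * sizeof *p);
    if (p == NULL)
        return NULL;

    for (i = 0; i < n; i++) {
        p[i].x = 0.0;
        p[i].y = 0.0;
    }
    return p;
}
```

Had the quoted code been written as malloc(sizeof *p2 + strlen(data)),
the struct x / struct y mix-up could not have occurred.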
strcpy(y->s, data);

This, however...

I *think* - not sure - that C99 offers explicit support for this.
Not really. C99 offers a *new syntax* for variable-sized arrays
in the last part of a struct, but the old C90 "struct hack" is still
not supported. (In fact, it's explicitly "un-supported," since it
explicitly invokes undefined behavior.)
In c89,
however, the code is broken if "data" contains anything but an empty
string, since s is defined to be one byte long. A compiler which supports
bounds-checking, for example, can happily trap here, since anything other
than an empty string in data will overrun the array bounds.
<large snip>
The problem isn't with malloc or with pointer arithmetic; the problem is
that you're accessing something - s - which has a definite size - 1 byte -
but not staying within the limits of the object's size.

Exactly.

-Arthur

Nov 14 '05 #14

On Fri, 27 Feb 2004 18:59:09 UTC, Kelsey Bjarnason
<ke*****@lightspeed.bc.ca> wrote:

[snips]

On Fri, 27 Feb 2004 16:42:50 +0000, The Real OS/2 Guy wrote:
struct y {
size_t cb;
struct a *pa;
int val;
unsigned int flags;
char s[1]; /* we have to compile ANSI C 89! */
};
struct y *p2 = malloc(sizeof(struct x) + strlen(data)); /* UB on what?
*/
This, AFAICT, is not UB; you can allocate whatever size you want.

strcpy(y->s, data); /* UB? on what? */

This, however...

I *think* - not sure - that C99 offers explicit support for this. In c89,
however, the code is broken if "data" contains anything but an empty
string, since s is defined to be one byte long. A compiler which supports
bounds-checking, for example, can happily trap here, since anything other
than an empty string in data will overrun the array bounds.

There is no bounds checking in the standard.

Show one single ANSI C 89 compiler that gives undefined behavior on
that. I can't find one.
All of them. The compiler doesn't define UB, the standard does. That
your compiler collection happens to allow this behaviour is irrelevant.

Where? I can't find it, so chapter and verse please.

Tell me what the difference is between UB and implementation
defined. I see some there.

UB is anything the standard either explicitly defines to be UB, or, by
failure to define in another category, leaves undefined. Notable examples
are anything which violates a "shall" clause, such as "main shall return
an int" - thus void main() is UB.

Ah, as you say, the standard explicitly defines UB - but where is the
UB that applies to this?
Implementation-defined behaviour is things the particular implementation
has some freedom to "play with" - shifting of signed values, IIRC, falls
into this category. However, the implementation is required to document
the behaviour; the behaviour is _defined_... but defined by the
implementation, not by the standard.
Whenever you allocate memory in the size you need - not a single byte
less - then you CAN'T get UB when you know how to handle pointer
arithmetic.

The problem isn't with malloc or with pointer arithmetic; the problem is
that you're accessing something - s - which has a definite size - 1 byte -
but not staying within the limits of the object's size.

Hm, the object size is defined by the size allocated through
malloc(). Sure, you fail miserably when you declare such a struct
statically - but you can't fail when you use dynamic allocation
correctly. That is the trick by which you avoid UB: because YOU define
the real size through malloc, you always get enough contiguous memory
to store the content you need in it.
--
Tschau/Bye
Herbert


Nov 14 '05 #15

On Fri, 27 Feb 2004 11:00:06 -0800, Kelsey Bjarnason
<ke*****@lightspeed.bc.ca> wrote in comp.lang.c:

[snips]

On Fri, 27 Feb 2004 06:06:25 +0000, Jack Klein wrote:
On 26 Feb 2004 21:45:07 -0800, ge**********@yahoo.co.in (Geetesh)
wrote in comp.lang.c:
Recently I saw some code containing a structure definition similar
to the one below:
struct foo
{
int dummy1;
int dummy2;
int last[1];
};

This causes undefined behavior and is invalid code under all versions
of the C language standard.

I thought C99 brought in support for the "struct hack"?

Indeed it did, but not with [1], so this code is not valid for the C99
struct with flexible array either.

--
Jack Klein

Nov 14 '05 #16

On Fri, 27 Feb 2004 09:49:30 +0000 (UTC), "The Real OS/2 Guy"
<os****@pc-rosenau.de> wrote in comp.lang.c:

On Fri, 27 Feb 2004 05:45:07 UTC, ge**********@yahoo.co.in (Geetesh)
wrote:
Recently I saw some code containing a structure definition similar
to the one below:
struct foo
{
int dummy1;
int dummy2;
int last[1];
};
In the application, the struct is always allocated at runtime using
malloc. The last member of the structure, "int last[1]", is not
actually used as an array with a single element; instead, extra memory
is allocated when allocating space for struct foo, and last is used as
an array with more than one element. My question is: what are the
advantages of using the above definition instead of the one shown below?
struct foo
{
int dummy1;
int dummy2;
int *last;
};
The only advantage I can think of is that the first declaration needs a
single malloc call while the second needs two, and that with the first
declaration all the allocated memory is contiguous, which may lead to
less fragmentation and better cache utilization. My question is: is
accessing elements faster with the first definition than with the
second? If yes, why?
Thanks in advance.

It saves memory: at least the amount of memory a pointer costs.
It saves time, as two malloc() calls are not required every time to
fill a whole struct.

No, it is NOT undefined behavior as Jack Klein says. But it is
implementation defined.

The term "implementation-defined" has a precise meaning in the C
standard, and in fact is specifically defined in the standard. The
only things which are implementation-defined in C are those which the
C standard specifically states are implementation-defined, using that
exact term, hyphen and all.

The standard cannot, and makes no attempt to, prevent compilers from
providing extensions beyond the language. Such extensions are thus
that, extensions provided by an implementation. That does not mean
that they are "implementation-defined" as far as the C language is
concerned.

--
Jack Klein

Nov 14 '05 #17