contiguity of arrays

Steve Kobes

Is this legal? Must it print 4?

int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
printf("%d\n", *(b + 3));

--Steve

Nov 14 '05
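
For readers who only want the fourth value and would rather not depend on the answer to the question above, here is a minimal sketch of a route that never forms b + 3. The helper name read_int_at is hypothetical and not from the post; it assumes the caller passes a pointer to the whole array object (for example &a) together with its size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy out the idx-th int (in storage order) from an
 * object that is obj_size bytes long. obj must point at the whole object,
 * e.g. &a. The copy is done bytewise, so no int * ever steps across a row
 * boundary.
 */
int read_int_at(const void *obj, size_t obj_size, size_t idx)
{
    int value;

    assert(obj != NULL);
    assert(idx < obj_size / sizeof(int));
    memcpy(&value, (const unsigned char *)obj + idx * sizeof(int), sizeof(int));
    return value;
}
```

With int a[2][2] = {{1, 2}, {3, 4}}, both a[1][1] and read_int_at(&a, sizeof a, 3) give 4, since arrays contain no padding between elements.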


David Hopwood

James Kuyper wrote:

David Hopwood wrote:
....
There *is* an array of the pointed-at type, containing the pointed-at
object.
Where? "aligned for type int and big enough" doesn't constitute an
array. For static or automatically allocated memory, you need an
explicit declaration to make it an array. For dynamically allocated
memory, the most recent write to the memory must have been through an
lvalue of a compatible type, to make it an array.
See the definitions of "array type" and "array of T" in
C99 6.2.5 #20.

Those definitions say that "An array type describes a contiguously
allocated non-empty set of objects with a particular member object
type". It does not say that every piece of memory that happens to fit
that description is an array.

Since it's a *definition* of "array type" and "array of T", that's
effectively what it does say.

Every type that describes such a piece of
memory is an array type, but that doesn't mean that every piece of
memory that can be described that way can be used as an array.

For that to be the case, there would have to be text that explicitly
excludes some "contiguously allocated non-empty sets of objects with
a particular member object type" from being arrays. I can't see any
such text.

--
David Hopwood <da******************@blueyonder.co.uk>

Nov 14 '05 #151

James Kuyper

"Ivan A. Kosarev" <ik******@online.ru> wrote in message news:<cj**********@news.rol.ru>...
....

Both subscript ("[]") and indirection ("*") operators are defined for
pointers, but not for arrays. With a pointer it's possible to determine how
large the pointed-to object is, but it's impossible to determine how many
elements can legally be pointed to. That is, the abstract machine can never

I think you're misusing the concept of the "abstract machine". The
abstract machine must behave in a certain way for code with
well-defined behavior; it has no restrictions on how it should behave
with regard to code with undefined behavior. When the standard leaves
behavior undefined, that generally means that it has different
behavior on different implementations. When the behavior is undefined,
it might include formatting the hard disk; the fact that the abstract
machine provides no reason why the hard disk might be formatted,
doesn't mean that a conforming implementation is prohibited from
formatting the hard disk.

It seems to me to be quite feasible for an implementation to arrange
for pointers to carry information inside them which allows an
implementation to impose range validity checks. The fact that the
abstract machine has no such mechanism doesn't make such an
implementation non-conforming. If the behavior is undefined, because
an array bound has been violated, then the implementation is free to
abort() the program, regardless of whether or not the abstract machine
has any reason for aborting it.

Nov 14 '05 #152
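
The bounds-carrying pointers described in the post above can be sketched in portable C. This is only an illustration under stated assumptions: the type checked_ptr and its three functions are hypothetical names, and where a real checking implementation would be free to abort(), this sketch returns 0 so the refusal can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "fat pointer": an address plus the bounds it was derived from. */
typedef struct {
    int *base;   /* first element of the originating array */
    size_t len;  /* number of elements in that array */
    size_t pos;  /* current offset, 0..len (len means one past the end) */
} checked_ptr;

checked_ptr checked_from(int *base, size_t len)
{
    checked_ptr p = { base, len, 0 };

    assert(base != NULL);
    return p;
}

/* Advance by n elements; returns 0 and leaves p unchanged if out of range. */
int checked_advance(checked_ptr *p, size_t n)
{
    assert(p != NULL);
    if (n > p->len - p->pos)
        return 0;
    p->pos += n;
    return 1;
}

/* Read through the pointer; returns 0 if it sits one past the end. */
int checked_load(const checked_ptr *p, int *out)
{
    assert(p != NULL && out != NULL);
    if (p->pos >= p->len)
        return 0;
    *out = p->base[p->pos];
    return 1;
}
```

A pointer made with checked_from(a[0], 2) for int a[2][2] may advance to offset 2 (one past the end of the row) but may not be read there, and an advance by 3 is refused outright.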

James Kuyper

Joe Wright <jo********@comcast.net> wrote in message news:<Da********************@comcast.com>...
....

The original example, well upthread, was ..

int a[2][2] = {{1,2},{3,4}};

.. which declares, defines and initializes a. Do we all agree that there are four int's there? Are they contiguous in memory at ascending addresses? Yes, they are.

void *v = a;
int *b = v;

Does anyone doubt that b[3] == 4 ? Are any rules broken? No. The object

Yes, and Yes.

Nov 14 '05 #153

Wojtek Lerch

"David Hopwood" <da******************@blueyonder.co.uk> wrote in message
news:bf********************@fe2.news.blueyonder.co.uk...

James Kuyper wrote:
Every type that describes such a piece of
memory is an array type, but that doesn't mean that every piece of memory
that can be described that way can be used as an array.

For that to be the case, there would have to be text that explicitly
excludes some "contiguously allocated non-empty sets of objects with
a particular member object type" from being arrays. I can't see any
such text.

I presume that this makes "struct { int a, b, c; }" an array type on most
implementations?

Nov 14 '05 #154

James Kuyper

pete <pf*****@mindspring.com> wrote in message news:<41***********@mindspring.com>...
....

I just don't understand what an array has to do
with incrementing a pointer through an object.

The standard defines the meaning of pointer addition in terms of
arrays. If there's no array of the pointed-at type containing both the
pointed-at object and the object that would be pointed at after
addition, then you can't do it legally (I'm simplifying the wording by
ignoring the one-past-the-end case, but that doesn't affect my main
point).

You might want to argue for a more general definition, but that's the
one we've got right now.

Nov 14 '05 #155
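
The uncontroversial part of that rule, including the one-past-the-end case set aside above, can be sketched as follows. The function name sum_ints is hypothetical; it assumes first points at the first of n elements of a single array of int.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n ints starting at first; all arithmetic stays inside one array. */
int sum_ints(const int *first, size_t n)
{
    const int *end;
    int total = 0;

    assert(first != NULL);
    end = first + n;  /* one past the last element: may be formed and
                         compared against, but never dereferenced */
    for (const int *p = first; p != end; ++p)
        total += *p;
    return total;
}
```

For int a[2][2], calling sum_ints(a[0], 2) and sum_ints(a[1], 2) separately keeps each pointer inside its own row, which is the reading nobody in the thread disputes.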

David Hopwood

Wojtek Lerch wrote:

"David Hopwood" <da******************@blueyonder.co.uk> wrote:
James Kuyper wrote:
Every type that describes such a piece of
memory is an array type, but that doesn't mean that every piece of memory
that can be described that way can be used as an array.

For that to be the case, there would have to be text that explicitly
excludes some "contiguously allocated non-empty sets of objects with
a particular member object type" from being arrays. I can't see any
such text.

I presume that this makes "struct { int a, b, c; }" an array type on most
implementations?

If there is no padding between the members, then yes, according to 6.2.5 #20
it is an array type as well as a structure type.

--
David Hopwood <da******************@blueyonder.co.uk>

Nov 14 '05 #156

pete

Wojtek Lerch wrote:

"pete" <pf*****@mindspring.com> wrote in message
news:41*********@mindspring.com...
Wojtek Lerch wrote:
"pete" <pf*****@mindspring.com> wrote in message
> int the_array[2][2];
> int *b = (int*)&the_array; ... (Another issue is whether
b is actually guaranteed to point to an object.
When the standard says that I can convert one pointer
type to another, it only cautions about alignment and size.
When the standard says that I can do something,
I take it to mean that that code is defined.

The standard never says what the result of the conversion is,
or that you can dereference it. It could be a special value that
doesn't point to any object but magically produces the right pointer
when converted back to the original type.

How do you figure it would convert back to the original value?
It would be nice to have a guarantee that (int*)&the_array
has the same value as (int*)(void*)&the_array, but as far as I can tell,
there's no such guarantee in the standard.

Even worse, I can't find any words in the standard that forbid the
conversion to produce a completely bogus value that is never correctly
aligned, making the entire conversion undefined. Can you?

I can't even find any words that define conversion of pointers.

N869
6.3 Conversions
[#1] Doesn't say much
[#2] "compatible types" doesn't really apply
to using a cast to convert one pointer to
another with less strict alignment requirements.

--
pete

Nov 14 '05 #157

Thomas Stegen

David Hopwood wrote:

Wojtek Lerch wrote:

I presume that this makes "struct { int a, b, c; }" an array type on most
implementations?

If there is no padding between the members, then yes, according to 6.2.5
#20
it is an array type as well as a structure type.

This seems to be an implementation-dependent detail, though. Relying on
it will not a strictly conforming program make.

So it seems to me that according to the abstract machine, it is not an
array.

--
Thomas.

Nov 14 '05 #158

Wojtek Lerch

"David Hopwood" <da******************@blueyonder.co.uk> wrote in message
news:WV********************@fe2.news.blueyonder.co.uk...

Wojtek Lerch wrote:
I presume that this makes "struct { int a, b, c; }" an array type on most
implementations?

If there is no padding between the members, then yes, according to 6.2.5
#20
it is an array type as well as a structure type.

Wouldn't that have all sorts of interesting side-effects? For instance, an
expression that has an array type decays to a pointer in most contexts.
This would make things like "foo.b" invalid, wouldn't it?...

Nov 14 '05 #159

David Hopwood

Thomas Stegen wrote:

David Hopwood wrote:
Wojtek Lerch wrote:

I presume that this makes "struct { int a, b, c; }" an array type on
most implementations?

If there is no padding between the members, then yes, according to
6.2.5 #20 it is an array type as well as a structure type.

This seems to be an implementation-dependent detail, though. Relying on
it will not a strictly conforming program make.

So what? There seems to be a pervasive misconception in this newsgroup that
the standard only imposes requirements for strictly conforming programs.
That is explicitly contradicted by C99 4 #3:

# A program that is correct in all other aspects, operating on correct data,
# containing unspecified behavior shall be a correct program and act in
# accordance with 5.1.2.3.

(where 5.1.2.3 specifies the correspondence between a program's behaviour
and that of the abstract machine, i.e. essentially all requirements of the
standard are interpreted in terms of this clause).

In the example presented by Wojtek Lerch, if someone can infer by *any* means
that there is no padding between a, b and c -- including by using pointer
equality tests at run-time, by reading implementation documentation or an
ABI standard, or simply knowing how a particular implementation lays out
structs -- then they can conclude that the C standard requires that the
contents of the struct can also be accessed as an array via a pointer of an
appropriate type (e.g. int *).

--
David Hopwood <da******************@blueyonder.co.uk>

Nov 14 '05 #160
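
One of the ways of inferring the absence of padding mentioned above can be sketched with offsetof. This does not settle whether indexing through an int * is permitted; the sketch copies with memcpy precisely so that it does not depend on that. The names struct triple, triple_is_packed and triple_to_array are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct triple { int a, b, c; };

/* 1 if this implementation lays out struct triple with no padding at all. */
int triple_is_packed(void)
{
    return offsetof(struct triple, b) == sizeof(int)
        && offsetof(struct triple, c) == 2 * sizeof(int)
        && sizeof(struct triple) == 3 * sizeof(int);
}

/* Copy the three members into out[0..2] without any int * arithmetic on t. */
void triple_to_array(const struct triple *t, int out[3])
{
    assert(t != NULL && out != NULL);
    if (triple_is_packed()) {
        /* Same bytes as int[3], so a bytewise copy is enough. */
        memcpy(out, t, 3 * sizeof(int));
    } else {
        /* Padding present: fall back to member-by-member access. */
        out[0] = t->a;
        out[1] = t->b;
        out[2] = t->c;
    }
}
```

Both branches produce the same result, so a caller never observes which layout the implementation chose.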

David Hopwood

Wojtek Lerch wrote:

"David Hopwood" <da******************@blueyonder.co.uk> wrote:
Wojtek Lerch wrote:
I presume that this makes "struct { int a, b, c; }" an array type on most
implementations?
If there is no padding between the members, then yes, according to 6.2.5
#20 it is an array type as well as a structure type.

What I should have said here was that if there is no padding, an object
of the structure type can also be accessed as an object of an array type
(e.g. int[3]). The two types are not the same.

Wouldn't that have all sorts of interesting side-effects? For instance, an
expression that has an array type decays to a pointer in most contexts.
This would make things like "foo.b" invalid, wouldn't it?...

Good point. 6.2.5 #20 is less clear than it should be, but I think it has
to be interpreted in line with the corrected statement above.

--
David Hopwood <da******************@blueyonder.co.uk>

Nov 14 '05 #161

James Kuyper

Thomas Stegen wrote:

David Hopwood wrote:
Wojtek Lerch wrote:

I presume that this makes "struct { int a, b, c; }" an array type on
most
implementations?

If there is no padding between the members, then yes, according to
6.2.5 #20
it is an array type as well as a structure type.

This seems to be an implementation-dependent detail, though. Relying on
it will not a strictly conforming program make.

Incorrect. If he were correct about that being an array, then he could
put the code that treats it as an array inside a test:

if (&a+1 == &b && &b+1 == &c)
{
    int *pi = &a;
    // code that uses 'pi'
}
else
{
    // code with identical behavior, that does not use 'pi'
}

Code like that would be pretty senseless, but I put the 'else' clause in
only to make it strictly conforming. Code that didn't have to be
strictly conforming could do different things in the different branches
of that 'if', and I could imagine some (rather poor) reasons why someone
might want to write something like that.

Nov 14 '05 #162

Wojtek Lerch

David Hopwood wrote:

Wojtek Lerch wrote:
"David Hopwood" <da******************@blueyonder.co.uk> wrote:
Wojtek Lerch wrote:

I presume that this makes "struct { int a, b, c; }" an array type on
most
implementations?

If there is no padding between the members, then yes, according to
6.2.5 #20 it is an array type as well as a structure type.

What I should have said here was that if there is no padding, an object
of the structure type can also be accessed as an object of an array type
(e.g. int[3]). The two types are not the same.

If the struct type is not an array type, the definition of "array type"
doesn't support your position, does it? Can you find words somewhere
else in the standard that allow you to access the struct as an array?

And does it matter? 6.5.6p8 defines pointer math in terms of the array
that the object *is* an element of. Not "could be". If you *could*
access something as an array but are not, an object that *would* be an
element of that array is not, is it?...

Nov 14 '05 #163

pete

pete wrote:

Wojtek Lerch wrote:

Even worse, I can't find any words in the standard that forbid the
conversion to produce a completely bogus value
that is never correctly
aligned, making the entire conversion undefined. Can you?

I can't even find any words that define conversion of pointers.

Well I can find some words, but the answer to your question, is "no".

--
pete

Nov 14 '05 #164

Wojtek Lerch

pete wrote:

I can't even find any words that define conversion of pointers.

6.3.2.3

Nov 14 '05 #165

Wojtek Lerch

pete wrote:

Wojtek Lerch wrote:
"pete" <pf*****@mindspring.com> wrote in message
news:41*********@mindspring.com...
When the standard says that I can convert one pointer
type to another, it only cautions about alignment and size.
When the standard says that I can do something,
I take it to mean that that code is defined.

The standard never says what the result of the conversion is,
or that you can dereference it. It could be a special value that
doesn't point to any object but magically produces the right pointer
when converted back to the original type.

How do you figure it would convert back to the original value?

Imagine a machine where the representation of a pointer has two extra
bits that tell the processor whether to allow reading or writing through
the pointer. When you convert &the_array to int*, the compiler turns
both bits off. When you convert it back to the original type, or to any
type in a context where the compiler isn't sure what the original type
was, it turns the two bits back on (or just the "readable" bit if the
new type points to a const-qualified type). Pointer math is only
allowed on pointers that have the "readable" bit turned on.

Nov 14 '05 #166

Wojtek Lerch

James Kuyper wrote:

void *pv = malloc(sizeof(int)*sizeof(double));
int *pi = pv;
int (*arr)[3] = pv;
pi = arr[0];

It's true that arr[0] points at the same location as (int*)pv, but

What's a "location"? They point to the same *object*, don't they?

because of the declaration of 'arr', arr[0] is allowed to be tagged with
an upper limit that will cause the expression pi[5] to abort() your

The standard never describes this "tagging" explicitly, does it?

program. (int*)pv, on the other hand, could only be tagged with an upper
limit that matched the dynamically allocated size of the entire block of
memory.

I have to say that I like that theory, but I'm not entirely sure that
the standard actually supports it. The wording of 6.5.6p8 talks about
pointer math in terms of the object the pointer points to, without any
hint that it might also depend on how the pointer was obtained. If an
object is the first element of an array of five ints, you should be able
to add four to *any* pointer that points to that object, and dereference
the result.

Nov 14 '05 #167

Dan Pop

In <2t*************@uni-berlin.de> Thomas Stegen <th***********@gmail.com> writes:

David Hopwood wrote:
Wojtek Lerch wrote:

I presume that this makes "struct { int a, b, c; }" an array type on most
implementations?

If there is no padding between the members, then yes, according to 6.2.5
#20
it is an array type as well as a structure type.

This seems to be an implementation-dependent detail, though. Relying on
it will not a strictly conforming program make.

So it seems to me that according to the abstract machine, it is not an
array.

Even if there is any padding, the struct type can be safely aliased with
an array of three int's: it has the proper alignment and it is large
enough for the purpose.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union

Nov 14 '05 #168

Dan Pop

In <8b*************************@posting.google.com> ku****@wizard.net (James Kuyper) writes:

Joe Wright <jo********@comcast.net> wrote in message news:<Vq********************@comcast.com>...
...
I don't know what you're trying to win here James, but whatever 'wording' the
Standard provides for malloc() is not at issue.

It's an issue because the block of memory pointed at by the return
from malloc() is required to be suitable to store any C object that
will fit in it. That includes an array of ints. No such guarantee
applies to statically or automatically allocated memory; it only
applies to dynamically allocated memory.

Nope. There is nothing magic about dynamically allocated memory. The
fact that the block of memory allocated by malloc is suitable to store
any C object that will fit in it is a direct consequence of another
property of that block, that is explicitly guaranteed by the standard:
the memory block is correctly aligned for any C object (whether it fits
inside or not).

ANY memory block that satisfies the definition of an array is an array,
whether it was declared as such or not. This is what makes pointer
arithmetic work inside dynamically allocated memory blocks, in the
absence of any explicit array declaration.

Consider struct {int a, b, c, d;} foo and int *p = &foo.a. p[0], p[1],
p[2] and p[3] are all legal lvalues, even if there is no array declaration
in sight and even if foo contains internal padding. And the behaviour of
the program is well defined, as long as foo is accessed only through p.

The only consistent way of interpreting the standard is that each
outermost object resides in its own address space and that pointer
arithmetic is well defined as long as the result is in the same address
space (the one-byte-after address included). Subobjects merely live in
that address space, they do not define their own address space.

Any other interpretation is going to have consistency problems, e.g.
when a character pointer is made to point inside a subobject, because
the pointer is also pointing inside the outermost object.

An array is just that, objects of type T, contiguous and at increasing addresses in memory.

An array is stored in that format, but having something in that format
doesn't make it an array. On many compilers,

int a,b,c;

will result in three contiguous locations in memory being allocated as
ints. That doesn't create an array.

This is true: each object lives in its own address space. However, make
them part of a larger object (thus having them in the same address space)
and you have an array, as in my example above.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union

Nov 14 '05 #169

Douglas A. Gwyn

David Hopwood wrote:

If there is no padding between the members, then yes, according to 6.2.5 #20
it is an array type as well as a structure type.

No, that's an incorrect reading.

Nov 14 '05 #170

James Kuyper

Wojtek Lerch wrote:

James Kuyper wrote:
void *pv = malloc(sizeof(int)*sizeof(double));
int *pi = pv;
int (*arr)[3] = pv;
pi = arr[0];

It's true that arr[0] points at the same location as (int*)pv, but

What's a "location"? They point to the same *object*, don't they?

because of the declaration of 'arr', arr[0] is allowed to be tagged
with an upper limit that will cause the expression pi[5] to abort() your

The standard never describes this "tagging" explicitly, does it?

No, it's just an example of one of the many ways that the undefined
behavior that is allowed in this case can actually occur.

program. (int*)pv, on the other hand, could only be tagged with an
upper limit that matched the dynamically allocated size of the entire
block of memory.

I have to say that I like that theory, but I'm not entirely sure that
the standard actually supports it. The wording of 6.5.6p8 talks about
pointer math in terms of the object the pointer points to, without any
hint that it might also depend on how the pointer was obtained. If an
object is the first element of an array of five ints, you should be able
to add four to *any* pointer that points to that object, and dereference
the result.

I'm arguing that the special guarantees for the return values from
malloc() make sense only if considered as overriding that wording. You
can build many different object types in a single block of dynamically
allocated memory. If they are composite types, you can even use it to
store non-overlapping pieces of different composite types at the same time.

However, I believe that when you convert a pointer at dynamically
allocated memory to a pointer to a particular type, then for all other
purposes except free(), that pointer and every pointer derived from it
should act like they are pointing at or into an array of that type. In
particular, if it's a composite type, a pointer to one of the members of
the elements of that array has the same restrictions on its offsets that
a pointer at a member of statically or automatically allocated object of
the composite type would have. Any other approach would make the
differences between dynamically allocated objects and other kinds of
objects unacceptably big.

In other words:

int b[3][3];
int (*a)[3] = malloc(9*sizeof(int));

You can only add 3 to b[0], and you can only add 3 to a[0]. To allow
adding 9 to a[0] would make too big of a difference between dynamically
and automatically allocated arrays.

Nov 14 '05 #171
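
Here is a minimal sketch of the dynamically allocated case from the post above, written so that every access stays inside one row of three ints. The names row3, grid3_new and grid3_get are hypothetical, and the fill pattern 0..8 is only there to make the result checkable.

```c
#include <assert.h>
#include <stdlib.h>

typedef int row3[3];

/* Allocate a 3x3 grid filled with 0..8 in row-major order; NULL on failure. */
row3 *grid3_new(void)
{
    row3 *g = malloc(3 * sizeof *g);  /* room for three rows of three ints */

    if (g == NULL)
        return NULL;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            g[i][j] = i * 3 + j;      /* j never leaves row i */
    return g;
}

/* Bounds-checked read of element (i, j). */
int grid3_get(row3 *g, int i, int j)
{
    assert(g != NULL);
    assert(i >= 0 && i < 3 && j >= 0 && j < 3);
    return g[i][j];
}
```

The caller releases the grid with free(g). Indexing as g[i][j] with j below 3 is valid under either reading argued in this thread, which is why the sketch uses it instead of g[0][k] with k up to 8.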

David Hopwood

Wojtek Lerch wrote:

David Hopwood wrote:
Wojtek Lerch wrote:
"David Hopwood" <da******************@blueyonder.co.uk> wrote:
Wojtek Lerch wrote:

> I presume that this makes "struct { int a, b, c; }" an array type
> on most implementations?

If there is no padding between the members, then yes, according to
6.2.5 #20 it is an array type as well as a structure type.

What I should have said here was that if there is no padding, an object
of the structure type can also be accessed as an object of an array type
(e.g. int[3]). The two types are not the same.

If the struct type is not an array type, the definition of "array type"
doesn't support your position, does it?

The standard never defines "array", but if array means "object of array type",
then it follows from 6.3.2.3 #7 that there is a valid conversion from a
pointer to the above struct, to a pointer to an array of (at least) 3 ints.

Can you find words somewhere
else in the standard that allow you to access the struct as an array?

6.3.2.3 #7. Strictly speaking, that clause doesn't say that the converted
pointer points to the objects (of types compatible with the target type)
that are stored at the same location as the original pointer -- but that is
clearly the intent. Otherwise, what is the purpose of casts between pointers
of different types?

And does it matter? 6.5.6p8 defines pointer math in terms of the array
that the object *is* an element of.

That's because there is only one such array for a given effective type.
However, there are in general many arrays that have the object as an
element, and in this case there is a conversion from a pointer-to-struct
to a pointer to an int array.

--
David Hopwood <da******************@blueyonder.co.uk>

Nov 14 '05 #172

Douglas A. Gwyn

Dan Pop wrote:

The only consistent way of interpreting the standard is that each
outermost object resides in its own address space and that pointer
arithmetic is well defined as long as the result is in the same address
space (the one-byte-after address included). Subobjects merely live in
that address space, they do not define their own address space.
Any other interpretation is going to have consistency problems, e.g.
when a character pointer is made to point inside a subobject, because
the pointer is also pointing inside the outermost object.

No, the standard does not resort to the notion of an object's
address space, but rather it guarantees what a s.c. program
can do (thus what a conforming implementation must support)
with regard to pointer arithmetic. It would be consistent
for an implementation to take advantage of the guarantees
when generating code to access a *declared* type, as
described earlier in this thread. You can't really see this
for small examples, which is why I keep urging consideration
of the case when the subarray is nearly the size that can be
spanned by an offset field of a composite address.

Nov 14 '05 #173

Wojtek Lerch

David Hopwood wrote:

Wojtek Lerch wrote:
If the struct type is not an array type, the definition of "array
type" doesn't support your position, does it?

The standard never defines "array", but if array means "object of array
type",
then it follows from 6.3.2.3 #7 that there is a valid conversion from a
pointer to the above struct, to a pointer to an array of (at least) 3 ints.

No it doesn't. Couldn't arrays of ints have stricter alignment
requirements than structs of ints or simple ints?

Can you find words somewhere else in the standard that allow you to
access the struct as an array?

6.3.2.3 #7. Strictly speaking, that clause doesn't say that the converted
pointer points to the objects (of types compatible with the target type)
that are stored at the same location as the original pointer -- but that is
clearly the intent. Otherwise, what is the purpose of casts between
pointers of different types?

I don't know, but just because there's no obvious purpose it doesn't
mean that any interpretation that gives it a purpose must necessarily be
correct.

Anyway, you can get around that by converting to a character pointer
first. The byte that the character pointer points to is guaranteed to be
the first byte of both objects.

And does it matter? 6.5.6p8 defines pointer math in terms of the
array that the object *is* an element of.

That's because there is only one such array for a given effective type.

Or none at all. If an object is declared as a struct, its effective
type is not an array type. If an allocated object hasn't been written
to yet, it has no effective type. But you're claiming that both are
arrays of more than one int for the purpose of 6.5.6p8. Pointer math
has nothing to do with the effective type of the object you're pointing to.

However, there are in general many arrays that have the object as an
element, and in this case there is a conversion from a pointer-to-struct
to a pointer to an int array.

Ah, so you mean that the first of the three ints in the struct is an
element of an array of two ints, and of an array of three ints, at the
same time? But doesn't that mean that if I take a pointer to it and add
two, then dereferencing the result is both defined and undefined at the
same time? Actually, no: it violates a "shall" in 6.5.6p8, and
therefore it is undefined. You're not going to agree with that, are you?...

Nov 14 '05 #174
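
The character-pointer route mentioned above can be sketched like this. The struct and function names are hypothetical. The sketch relies only on the rules that a pointer to a structure, suitably converted, points to its first member, and that unsigned char pointers may be subtracted within one object.

```c
#include <assert.h>
#include <stddef.h>

struct three_ints { int a, b, c; };

/* 1 if the struct and its first member begin at the same byte. */
int starts_at_same_byte(const struct three_ints *t)
{
    assert(t != NULL);
    return (const unsigned char *)t == (const unsigned char *)&t->a;
}

/* Distance in bytes from the start of the struct to member b. */
size_t byte_offset_of_b(const struct three_ints *t)
{
    assert(t != NULL);
    return (size_t)((const unsigned char *)&t->b - (const unsigned char *)t);
}
```

The value returned by byte_offset_of_b matches offsetof(struct three_ints, b), and it is at least sizeof(int) on every implementation, with equality exactly when there is no padding after a.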

Keith Thompson

Thomas Stegen <th***********@gmail.com> writes:

David Hopwood wrote:
Wojtek Lerch wrote:
I presume that this makes "struct { int a, b, c; }" an array type on most
implementations?

If there is no padding between the members, then yes, according to
6.2.5 #20
it is an array type as well as a structure type.

This seems to be an implementation-dependent detail, though. Relying on
it will not a strictly conforming program make.

So it seems to me that according to the abstract machine, it is not an
array.

If it's correct that the struct is an array if there is no padding,
then I think the following is a strictly conforming program that
produces no output:

#include <stddef.h>
#include <stdio.h>

int main(void)
{
    struct foo {
        int a;
        int b;
    } struct_obj;
    int ok = 1;
    int *ptr = &struct_obj.a;

    struct_obj.b = 12345;

    if (offsetof(struct foo, b) == sizeof(int)) {
        if (ptr[1] != 12345) {
            ok = 0;
        }
    }

    if (!ok) puts("Oops!");

    return 0;
}

It can execute different statements depending on
implementation-defined behavior (whether there is padding between a
and b), but that doesn't affect the output, which is the criterion for
strict conformance.

On the other hand, if the struct cannot be treated as an array, the
evaluation of ptr[1] invokes undefined behavior and the program is not
strictly conforming.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #175

Keith Thompson

Da*****@cern.ch (Dan Pop) writes:
[...]

Nope. There is nothing magic about dynamically allocated memory. The
fact that the block of memory allocated by malloc is suitable to store
any C object that will fit in it is a direct consequence of another
property of that block, that is explicitly guaranteed by the standard:
the memory block is correctly aligned for any C object (whether it fits
inside or not).

Hmm. I'm beginning to think you're right.

C99 7.20.3p1 says:

The pointer returned if the allocation succeeds is suitably
aligned so that it may be assigned to a pointer to any type of
object and then used to access such an object or an array of such
objects in the space allocated (until the space is explicitly
deallocated).

The most obvious reading of this is, as Dan says, that the allocated
space can be used for any type of objects *because* it's suitably
aligned, not because of any additional magic. (Of course, there's an
additional requirement that the space has to allow read/write access.)

The counterargument is that a bounds-checking fat-pointer
implementation, given

int arr[2][2];
int *ptr = &arr[0][0];

could disallow ptr[3] because the relevant array of int is only 2
elements long, but 7.20.3p1 seems to imply that the alignment and size
of the object arr are enough to make ptr[3] ok.

The language *could* have been consistently defined in a way that
makes evaluating ptr[3] invoke undefined behavior (though most
implementations would still allow it with the obvious semantics by
taking the shortcut of not storing bounds information with pointers).
It would be a good idea, IMHO, for the standard to state this more
explicitly, one way or the other. Aliasing a multidimensional array
as a one-dimensional array is probably common enough that there should
be a clearer statement of whether it's legal. Having to infer it from
a somewhat vague statement describing the semantics of the *alloc()
functions is unsatisfying.

Hmm. What does this say about the "struct hack"?

[...]

doesn't make it an array. On many compilers,

int a,b,c;

will result in three contiguous locations in memory being allocated as
ints. That doesn't create an array.

This is true: each object lives in its own address space. However, make
them part of a larger object (thus having them in the same address space)
and you have an array, as in my example above.

But it's possible to detect whether a, b, and c happen to be
contiguous; this is specifically mentioned in C99 6.5.9p6, discussing
equality operators on pointers. So one could argue that this program:

#include <stdio.h>

int main(void)
{
    int a, b, c;
    int *ptr;
    a = c = 12345;

    if (&a + 1 == &b && &b + 1 == &c) {
        ptr = &a;
        printf("ptr[2] = %d\n", ptr[2]);
    }
    else if (&c + 1 == &b && &b + 1 == &a) {
        ptr = &c;
        printf("ptr[2] = %d\n", ptr[2]);
    }
    else {
        printf("The objects are not contiguous\n");
    }

    return 0;
}

will print either "ptr[2] = 12345" or "The objects are not
contiguous", but in this case I think a bounds-checking implementation
can put its foot down and trap on the evaluation of ptr[2]. (I don't
have chapter and verse for this.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #176

Douglas A. Gwyn

There is a DR still open on this,
the resolution of which is supposed
to rely on the notion that writing
"impresses" a type on the anonymous
storage, and a s.c. program cannot
impress a different type on an object
associated with an identifier via a
declaration. I haven't yet written
the text for this, but hope to get
it done in time to be considered by
the DR review group at the upcoming
WG14 meeting.

Nov 14 '05 #177

Keith Thompson

"Douglas A. Gwyn" <DA****@null.net> writes:

There is a DR still open on this, the resolution of which is
supposed to rely on the notion that writing "impresses" a type on
the anonymous storage, and a s.c. program cannot impress a different
type on an object associated with an identifier via a declaration.
I haven't yet written the text for this, but hope to get it done in
time to be considered by the DR review group at the upcoming WG14
meeting.

Do you have a reference for this, or at least a DR number?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #178

James Kuyper

David Hopwood <da******************@blueyonder.co.uk> wrote in message news:<bf********************@fe2.news.blueyonder.co.uk>...

James Kuyper wrote: ....
Those definitions say that "An array type describes a contiguously
allocated non-empty set of objects with a particular member object
type". It does not say that every piece of memory that happens to
fit that description is an array.

Since it's a *definition* of "array type" and "array of T", that's
effectively what it does say.

No - as a definition, it says that every array type describes a piece
of memory with those characteristics. That is quite different from
saying that every piece of memory with those characteristics can be
described using an array type.

For that to be the case, there would have to be text that explicitly
excludes some "contiguously allocated non-empty sets of objects with
a particular member object type" from being arrays. I can't see any
such text.

A zip code describes a particular geographic subset of the United
States. Does that mean that every geographic subset of the United
States can be described with a zip code?

Nov 14 '05 #179

pete

James Kuyper wrote:

David Hopwood <da******************@blueyonder.co.uk> wrote in message news:<bf********************@fe2.news.blueyonder.co.uk>...
James Kuyper wrote:

...
Those definitions say that "An array type describes a contiguously
allocated non-empty set of objects with a particular member object
type". It does not say that every piece of memory that happens to
fit that description is an array.

Since it's a *definition* of "array type" and "array of T", that's
effectively what it does say.

No - as a definition, it says that every array type describes a piece
of memory with those characteristics. That is quite different from
saying that every piece of memory with those characteristics can be
described using an array type.

At best, that could only be considered
as a partial definition of an array type.

For that to be the case, there would have to be text that explicitly
excludes some "contiguously allocated non-empty sets of objects with
a particular member object type" from being arrays. I can't see any
such text.

A zip code describes a particular geographic subset of the United
States. Does that mean that every geographic subset of the United
States can be described with a zip code?

It means that
"a code which describes a particular
geographic subset of the United States"
is not the definition of "zip code"

Definitions are reversable.
If an X is defined as a red A,
then every X is a red A, and every red A is an X.
I leaned that in math.

--
pete

Nov 14 '05 #180

pete

pete wrote:

I leaned that in math.

That was the day before I learned how to spell.

--
pete

Nov 14 '05 #181

Dan Pop

In <41**************@null.net> "Douglas A. Gwyn" <DA****@null.net> writes:

Dan Pop wrote:
The only consistent way of interpreting the standard is that each
outermost object resides in its own address space and that pointer
arithmetic is well defined as long as the result is in the same address
space (the one-byte-after address included). Subobjects merely live in
that address space, they do not define their own address space.
Any other interpretation is going to have consistency problems, e.g.
when a character pointer is made to point inside a subobject, because
the pointer is also pointing inside the outermost object.

No, the standard does not resort to the notion of an object's
address space, but rather it guarantees what a s.c. program
can do (thus what a conforming implementation must support)
with regard to pointer arithmetic.

This notion is derived from what the standard actually says.

The standard doesn't explicitly define the notions of operator precedence
and associativity either, yet they can be derived from the actual text
of the standard (and even the standard itself uses them in examples and
footnotes).

It would be consistent
for an implementation to take advantage of the guarantees
when generating code to access a *declared* type, as
described earlier in this thread.

Only if the standard allowed that. Which it doesn't. 6.5.6p8 says:
"If the pointer operand points to an element of an array object" but it
doesn't require that array object to be declared as such. This is
precisely what makes pointer arithmetic work inside dynamically allocated
memory blocks. The standard doesn't have one rule for pointers
inside dynamically allocated memory and another for pointers inside
statically/automatically allocated memory. This is why it is only the outermost
object that matters when deciding whether pointer arithmetic has a well
defined result or not. If this outermost object also satisfies the
definition of an array of type T, then it is this unnamed and undeclared
array that counts when pointer arithmetic is performed. Note that
footnote 88 (original C99 numbering) is consistent with this view.

The particular case when the aliasing array is of character type is even
*explicitly* mentioned in the standard:

pointer. When a pointer to an object is converted to a pointer
to a character type, the result points to the lowest addressed
byte of the object. Successive increments of the result, up to
the size of the object, yield pointers to the remaining bytes
of the object.

You can't really see this
for small examples, which is why I keep urging consideration
of the case when the subarray is nearly the size that can be
spanned by an offset field of a composite address.

It doesn't matter: the outermost object has created an address space that
is large enough for any subarray. There is no danger of pointer
arithmetic overflow as long as the result stays within the outermost
object.
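
The character-pointer guarantee quoted earlier in this post is the one form of aliasing that nobody in the thread disputes, and it can be exercised directly. A minimal sketch, with byte_sum as a made-up helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Walk any object as an array of unsigned char, from its lowest
   addressed byte up to its size, and add up the byte values. */
unsigned long byte_sum(const void *obj, size_t size)
{
    const unsigned char *p = obj;
    unsigned long sum = 0;
    size_t i;

    assert(obj != NULL);
    for (i = 0; i < size; i++)
        sum += p[i];
    return sum;
}
```

Called as byte_sum(m, sizeof m) on a two-dimensional array m, the loop crosses the boundary between m[0] and m[1]; for character types that traversal is covered by the text quoted from the standard.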

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union

Nov 14 '05 #182

David Hopwood

James Kuyper wrote:

David Hopwood <da******************@blueyonder.co.uk> wrote:
James Kuyper wrote:

....
Those definitions say that "An array type describes a contiguously
allocated non-empty set of objects with a particular member object
type". It does not say that every piece of memory that happens to
fit that description is an array.

Since it's a *definition* of "array type" and "array of T", that's
effectively what it does say.

No - as a definition, it says that every array type describes a piece
of memory with those characteristics. That is quite different from
saying that every piece of memory with those characteristics can be
described using an array type.

So what is the necessary and sufficient definition of an array type,
then? Same question for "array".

For that to be the case, there would have to be text that explicitly
excludes some "contiguously allocated non-empty sets of objects with
a particular member object type" from being arrays. I can't see any
such text.

A zip code describes a particular geographic subset of the United
States. Does that mean that every geographic subset of the United
States can be described with a zip code?

"A zip code describes a particular geographic subset of the United
States." is not a definition in the precise sense that should be
required of definitions of technical terms in a language standard.

--
David Hopwood <da******************@blueyonder.co.uk>

Nov 14 '05 #183

Dan Pop

In <ln************@nuthaus.mib.org> Keith Thompson <ks***@mib.org> writes:

But it's possible to detect whether a, b, and c happen to be
contiguous; this is specifically mentioned in C99 6.5.9p6, discussing
equality operators on pointers. So one could argue that this program:

#include <stdio.h>

int main(void)
{
    int a, b, c;
    int *ptr;
    a = c = 12345;

    if (&a + 1 == &b && &b + 1 == &c) {
        ptr = &a;
        printf("ptr[2] = %d\n", ptr[2]);
    }
    else if (&c + 1 == &b && &b + 1 == &a) {
        ptr = &c;
        printf("ptr[2] = %d\n", ptr[2]);
    }
    else {
        printf("The objects are not contiguous\n");
    }

    return 0;
}

will print either "ptr[2] = 12345" or "The objects are not
contiguous", but in this case I think a bounds-checking implementation
can put its foot down and trap on the evaluation of ptr[2]. (I don't
have chapter and verse for this.)

This is a borderline case that requires an official judgment from the
committee. One could find wording in the standard supporting both views.
One cannot invoke bounds-checking implementations as an argument *before*
establishing that the standard unambiguously rules out the scenario in
question. This is what makes such implementations practically unfeasible.

If the program has established that the three integers are contiguous,
then they do satisfy the definition of an array, according to the
standard:

- An array type describes a contiguously allocated nonempty set
of objects with a particular member object type, called
the element type.

All objects (a, b and c) have the same type and are contiguously
allocated, so they do compose an (ad hoc) array of 3 int. We know,
from the malloc case, that an explicit declaration is not needed for this
array in order for pointer arithmetic to have well defined behaviour
inside it.

The opposing view has been amply expressed in this thread, so I'm not
going to reiterate it.

Since this issue is very much similar to the struct hack, one could easily
guess that the committee would reject my argument above. OTOH, as long
as the actual wording of the standard allowed me to build such an
argument, we *do* have a problem.
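
For reference, C99 added flexible array members (6.7.2.1) as a sanctioned replacement for the struct hack, so new code can avoid relying on either reading of the array wording. A minimal sketch, assuming a byte payload; struct packet and packet_new are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct packet {
    size_t len;
    unsigned char data[];   /* flexible array member: must come last */
};

/* Allocate a packet with room for len payload bytes and copy them in.
   Returns NULL if the size would overflow or malloc fails. */
struct packet *packet_new(const void *src, size_t len)
{
    struct packet *p;

    assert(src != NULL || len == 0);
    if (len > (size_t)-1 - sizeof *p)
        return NULL;
    p = malloc(sizeof *p + len);
    if (p == NULL)
        return NULL;
    p->len = len;
    if (len > 0)
        memcpy(p->data, src, len);
    return p;
}
```

The caller owns the result and releases it with a single free(p).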

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union

Nov 14 '05 #184

Douglas A. Gwyn

pete wrote:

Definitions are reversable.
If an X is defined as a red A,
then every X is a red A, and every red A is an X.
I leaned that in math.

Then you learned wrong.

Nov 14 '05 #185

Douglas A. Gwyn

Keith Thompson wrote:

Do you have a reference for this, or at least a DR number?

DR 236 is the primary one, although others are also relevant.

Nov 14 '05 #186

Douglas A. Gwyn

Dan Pop wrote:

The particular case when the aliasing array is of character type is even
*explicitly* mentioned in the standard:

Aliasing as array of character type is a special dispensation,
not applicable to aliasing as array of some other type.

Nov 14 '05 #187

Keith Thompson

"Douglas A. Gwyn" <DA****@null.net> writes:

pete wrote:
Definitions are reversable.
If an X is defined as a red A,
then every X is a red A, and every red A is an X.
I leaned that in math.

Then you learned wrong.

Then I guess a lot of people "learned wrong", myself included.

Would you at least agree that a definition of an X that lets you
determine that any given entity either is an X or is not an X is more
useful than a definition that doesn't do so?

In the absence of such a definition, how is one to determine whether
something is an X?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #188

Dan Pop

In <41***************@null.net> "Douglas A. Gwyn" <DA****@null.net> writes:

Dan Pop wrote:
The particular case when the aliasing array is of character type is even
*explicitly* mentioned in the standard:

Aliasing as array of character type is a special dispensation,
not applicable to aliasing as array of some other type.

The dynamic memory allocation provides the dispensation for aliasing
with other array types. Otherwise, one couldn't use dynamically
allocated arrays of arbitrary types.
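
The everyday case being appealed to looks like the sketch below: no array of int is ever declared, yet p[i] is used throughout the allocated block. make_counting_array is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate n ints and fill them with 0, 1, ..., n-1.
   Returns NULL if the size would overflow or malloc fails. */
int *make_counting_array(size_t n)
{
    int *p;
    size_t i;

    assert(n > 0);              /* malloc(0) is implementation-defined */
    if (n > (size_t)-1 / sizeof *p)
        return NULL;
    p = malloc(n * sizeof *p);
    if (p == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        p[i] = (int)i;          /* pointer arithmetic inside the block */
    return p;
}
```

The caller frees the block with free() when done.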

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union

Nov 14 '05 #189

Keith Thompson

Da*****@cern.ch (Dan Pop) writes:

In <41***************@null.net> "Douglas A. Gwyn" <DA****@null.net> writes:
Dan Pop wrote:
The particular case when the aliasing array is of character type is even
*explicitly* mentioned in the standard:

Aliasing as array of character type is a special dispensation,
not applicable to aliasing as array of some other type.

The dynamic memory allocation provides the dispensation for aliasing
with other array types. Otherwise, one couldn't use dynamically
allocated arrays of arbitrary types.

It's an open question whether the special dispensation for dynamically
allocated memory applies to declared objects.

I get the impression that the standard is not entirely self-consistent
in this area.

It would have been reasonable, IMHO, to have an explicit guarantee
that proper alignment (along with read/write access) is the only
necessary criterion for accessing an object as a given type. It might
be possible to infer such a principle from the description in 7.20.3:

The pointer returned if the allocation succeeds is suitably
aligned so that it may be assigned to a pointer to any type of
object and then used to access such an object or an array of such
objects in the space allocated (until the space is explicitly
deallocated).

but that looks more like a consequence of such a principle than a
statement of it. (Specifically, the alignment guarantee is
significant; the ability to access objects is a consequence of the
alignment and the suggested principle.)

The guarantee that any object can be aliased as an array of unsigned
char would also be a consequence of the principle (since unsigned char
has no alignment requirement above the single byte level), and could
have been relegated to a footnote.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #190

Keith Thompson

"Douglas A. Gwyn" <DA****@null.net> writes:

Keith Thompson wrote:
Do you have a reference for this, or at least a DR number?

DR 236 is the primary one, although others are also relevant.

That's available at
<http://www.open-std.org/JTC1/SC22/WG14/www/docs/dr_236.htm>.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #191

pete

Keith Thompson wrote:

"Douglas A. Gwyn" <DA****@null.net> writes:
pete wrote:
Definitions are reversable.
If an X is defined as a red A,
then every X is a red A, and every red A is an X.
I leaned that in math.

Then you learned wrong.

Then I guess a lot of people "learned wrong", myself included.

Would you at least agree that a definition of an X that lets you
determine that any given entity either is an X or is not an X is more
useful than a definition that doesn't do so?

In the absence of such a definition, how is one to determine whether
something is an X?

If every A be a B, then a baby A be a baby B ;)

--
pete

Nov 14 '05 #192

Douglas A. Gwyn

Keith Thompson wrote:

It would have been reasonable, IMHO, to have an explicit guarantee
that proper alignment (along with read/write access) is the only
necessary criterion for accessing an object as a given type.

Well, no, only for write access. Read access is constrained
by the previously written type. The rule for unions captures
this aspect of the situation.

The guarantee that any object can be aliased as an array of unsigned
char would also be a consequence of the principle (since unsigned char
has no alignment requirement above the single byte level), and could
have been relegated to a footnote.

No, it's special since array-of-char type need not have
been previously impressed upon the object being accessed
as an array of char (actually unsigned char for C99, but
let's not get into that).

Nov 14 '05 #193

Dan Pop

In <cf********************@comcast.com> "Douglas A. Gwyn" <DA****@null.net> writes:

Keith Thompson wrote:
It would have been reasonable, IMHO, to have an explicit guarantee
that proper alignment (along with read/write access) is the only
necessary criterion for accessing an object as a given type.

Well, no, only for write access. Read access is constrained
by the previously written type. The rule for unions captures
this aspect of the situation.

Actually, the rule for unions is gone in C99, being replaced by a more
general rule:

7 An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:73)

- a type compatible with the effective type of the object,

- a qualified version of a type compatible with the effective
type of the object,

- a type that is the signed or unsigned type corresponding to
the effective type of the object,

- a type that is the signed or unsigned type corresponding to
a qualified version of the effective type of the object,

- an aggregate or union type that includes one of the
aforementioned types among its members (including, recursively,
a member of a subaggregate or contained union), or

- a character type.

____________________

73) The intent of this list is to specify those circumstances
in which an object may or may not be aliased.
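
Two of the bullets above, as a minimal sketch. The function names are made up for illustration, and the code only reads a single object, so it takes no position on pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Signed/unsigned bullets: unsigned int is the unsigned type
   corresponding to int, so an int object may be read through it. */
unsigned read_int_as_unsigned(const int *ip)
{
    assert(ip != NULL);
    return *(const unsigned *)ip;
}

/* Last bullet: any object may be read through a character type. */
unsigned char first_byte(const void *obj)
{
    assert(obj != NULL);
    return *(const unsigned char *)obj;
}
```

For a non-negative int the value read through the unsigned lvalue is unchanged, since 6.2.5 gives both types the same representation for such values. Reading a float object through an unsigned lvalue, by contrast, matches none of the bullets.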

But this is not really relevant to a discussion focused on pointer
arithmetic.
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union

Nov 14 '05 #194

Dan Pop

In <ln************@nuthaus.mib.org> Keith Thompson <ks***@mib.org> writes:

Da*****@cern.ch (Dan Pop) writes:
In <41***************@null.net> "Douglas A. Gwyn" <DA****@null.net> writes:
Dan Pop wrote:
The particular case when the aliasing array is of character type is even
*explicitly* mentioned in the standard:

Aliasing as array of character type is a special dispensation,
not applicable to aliasing as array of some other type.

The dynamic memory allocation provides the dispensation for aliasing
with other array types. Otherwise, one couldn't use dynamically
allocated arrays of arbitrary types.

It's an open question whether the special dispensation for dynamically
allocated memory applies to declared objects.

The one and only special property of dynamically allocated memory is the
universal alignment.

I get the impression that the standard is not entirely self-consistent
in this area.

The standard *is* self-consistent, some of its interpretations aren't.
Including the one rejecting the struct hack.

My interpretation, posted upthread, is perfectly consistent with itself
and with the actual wording of the standard.

Other interpretations require the definition of array to be a one-way
definition (which is sheer nonsense) and make pointer arithmetic inside
dynamically allocated objects follow other (unwritten) rules than pointer
arithmetic inside declared objects.

And, for those insisting on declared arrays being the only relevant arrays
in the context of 6.5.6p8 (and thus leaving pointer arithmetic inside
dynamically allocated arrays invoking undefined behaviour), how about
the following example:

int array[3] = { 0 };
unsigned *p = (unsigned *)array;

is p[2] legal or not? p is certainly not pointing in any declared array
of unsigned int.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union

Nov 14 '05 #195

Michael Mair

Dan Pop wrote:

In <ln************@nuthaus.mib.org> Keith Thompson <ks***@mib.org> writes:

Da*****@cern.ch (Dan Pop) writes:
In <41***************@null.net> "Douglas A. Gwyn" <DA****@null.net> writes:

Dan Pop wrote:

The particular case when the aliasing array is of character type is even
*explicitly* mentioned in the standard:

Aliasing as array of character type is a special dispensation,
not applicable to aliasing as array of some other type.

The dynamic memory allocation provides the dispensation for aliasing
with other array types. Otherwise, one couldn't use dynamically
allocated arrays of arbitrary types.

It's an open question whether the special dispensation for dynamically
allocated memory applies to declared objects.

The one and only special property of dynamically allocated memory is the
universal alignment.

I get the impression that the standard is not entirely self-consistent
in this area.

The standard *is* self-consistent, some of its interpretations aren't.
Including the one rejecting the struct hack.

My interpretation, posted upthread, is perfectly consistent with itself
and with the actual wording of the standard.

Other interpretations require the definition of array to be a one-way
definition (which is sheer nonsense) and make pointer arithmetic inside
dynamically allocated objects follow other (unwritten) rules than pointer
arithmetic inside declared objects.

And, for those insisting on declared arrays being the only relevant arrays
in the context of 6.5.6p8 (and thus leaving pointer arithmetic inside
dynamically allocated arrays invoking undefined behaviour), how about
the following example:

int array[3] = { 0 };
unsigned *p = (unsigned *)array;

is p[2] legal or not? p is certainly not pointing in any declared array
of unsigned int.

Legal, as unsigned is the corresponding unsigned type to int.
Someone posted the section with accessibility related to the effective
types lately but I do not bother to look it up as I think this example
is completely beside the point.
Cheers,
Michael
--
E-Mail: Mine is a gmx dot de address.

Nov 14 '05 #196

Dan Pop

In <2t*************@uni-berlin.de> Michael Mair <Mi**********@invalid.invalid> writes:

Dan Pop wrote:
In <ln************@nuthaus.mib.org> Keith Thompson <ks***@mib.org> writes:

Da*****@cern.ch (Dan Pop) writes:

And, for those insisting on declared arrays being the only relevant arrays
in the context of 6.5.6p8 (and thus leaving pointer arithmetic inside
dynamically allocated arrays invoking undefined behaviour), how about
the following example:

int array[3] = { 0 };
unsigned *p = (unsigned *)array;

is p[2] legal or not? p is certainly not pointing in any declared array
of unsigned int.

Legal, as unsigned is the corresponding unsigned type to int.

You completely missed the point. This applies to dereferencing p itself,
but says nothing about pointer arithmetic on p.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union

Nov 14 '05 #197

Michael Mair

Legal, as unsigned is the corresponding unsigned type to int.

You completely missed the point. This applies to dereferencing p itself,
but says nothing about pointer arithmetic on p.

You are right. Maybe I have some time later on for a peek into
the standard...
Cheers
Michael
--
E-Mail: Mine is a gmx dot de address.

Nov 14 '05 #198
