Help | Site Map
Connecting Tech Pros Worldwide
 
 
LinkBack Thread Tools
  #1  
Old November 12th, 2007, 01:05 PM
Bernd Gaertner
Guest
 
Posts: n/a
Default pointer arithmetic and multi-dimensional arrays

Dear experts,

according to my interpretation of the Standard, loop 1 in the
following program is legal, while loop 2 is not (see explanation
below). This looks a bit counterintuitive, though; do you know
the truth?

Thanks a lot in advance,
Bernd Gaertner.

#include<iostream>

int main()
{
int a[2][3] = { {0, 1, 2}, {3, 4, 5} };
int* first = a[0]; // pointer to a[0][0]

// loop 1
for (int* p = first; p < first+6; ++p)
std::cout << *p; // 012345

// loop 2
for (int* p = first; p < first+6; p+=2)
std::cout << *p; // 024

return 0;
}

Explanation: [decl.array] 7-9 implies that the memory layout of
"int a[3][2]" is as for "int a[6]" - the six ints are consecutive
in memory. This means that during any iteration of loop 1, both p
and ++p are pointers to elements (or past-the-end pointers) of the
*same* three-element array, namely either a[0] or a[1]. In this
case, [expr.add] 5 guarantees well-defined behavior. In loop 2,
if p == first+2 (pointer to last element of a[0]), p+=2 points to
the second element of a[1], so p and p+=2 do not refer to the same
array. In this case, [expr.add] 5 stipulates undefined behavior.
  #2  
Old November 13th, 2007, 08:45 AM
James Kanze
Guest
 
Posts: n/a
Default Re: pointer arithmetic and multi-dimensional arrays

On Nov 12, 1:58 pm, Bernd Gaertner <gaert...@inf.ethz.chwrote:
Quote:
according to my interpretation of the Standard, loop 1 in the
following program is legal, while loop 2 is not (see explanation
below). This looks a bit counterintuitive, though; do you know
the truth?
Neither are legal, although both are likely to work on most
implementations.
Quote:
#include<iostream>
>
int main()
{
int a[2][3] = { {0, 1, 2}, {3, 4, 5} };
int* first = a[0]; // pointer to a[0][0]
More precisely: a pointer to the first element in a[0]. More
precisely, a pointer to an int which is the first element of an
array of 3 ints. Legal values for this pointer a thus first,
first+1, first+2 and first+3; the last may not be dereferenced.

In practice, the only time this will fail is with a bounds
checking implementation using fat pointers. (CenterLine once
sold such a compiler; I don't know what the current status is,
but ICS is still selling a compiler under the CenterLine mark.)
Quote:
// loop 1
for (int* p = first; p < first+6; ++p)
std::cout << *p; // 012345
Quote:
// loop 2
for (int* p = first; p < first+6; p+=2)
std::cout << *p; // 024
Quote:
return 0;
}
Quote:
Explanation: [decl.array] 7-9 implies that the memory layout of
"int a[3][2]" is as for "int a[6]" - the six ints are consecutive
in memory.
Yes, but it's not too clear what you can do with that. The
standard has been very carefully worded to allow compiler bounds
checking (even if no one, or almost no one, does it). When you
assign a[0] to first, the bounds are a[0][0]...a[0][3]. A
compiler is allowed to maintain this information with the
pointer (i.e. a fat pointer), and check it each time you modify
the pointer.
Quote:
This means that during any iteration of loop 1, both p
and ++p are pointers to elements (or past-the-end pointers) of the
*same* three-element array, namely either a[0] or a[1]. In this
case, [expr.add] 5 guarantees well-defined behavior. In loop 2,
if p == first+2 (pointer to last element of a[0]), p+=2 points to
the second element of a[1], so p and p+=2 do not refer to the same
array. In this case, [expr.add] 5 stipulates undefined behavior.
That's a different problem. Obviously, a bounds checking
implementation would fail here; supposedly, at least, there have
also been cases where it would fail in special cases without
bounds checking. (I don't think it would ever fail without
bounds checking for an on stack array of small elements like
int. It might fail if the array were dynamically allocated,
however, or if the elements were significantly larger---things
that could cause the pointer to point to memory that didn't
exist or that wasn't mapped.)

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

  #3  
Old November 13th, 2007, 08:45 AM
Bernd Gaertner
Guest
 
Posts: n/a
Default Re: pointer arithmetic and multi-dimensional arrays

Thanks a lot, for your answer, Alf! There are some things I still don't
understand, though (and yes, it's homework-related, but not my homework
but that of my students).

Alf P. Steinbach wrote:
Quote:
Quote:
>#include<iostream>
>>
>int main()
>{
> int a[2][3] = { {0, 1, 2}, {3, 4, 5} };
> int* first = a[0]; // pointer to a[0][0]
>>
> // loop 1
> for (int* p = first; p < first+6; ++p)
> std::cout << *p; // 012345
>>
> // loop 2
> for (int* p = first; p < first+6; p+=2)
> std::cout << *p; // 024
>>
> return 0;
>}
Quote:
You have no guarantee that a pointer to the first array element can be
treated as a pointer to the array. On a segmented architecture the
pointers might theoretically (although not in practice) be completely
different beast, and the two inner arrays might reside in different
segments, "you can't get there from here".
This seems to be in contradiction to [dcl.array] 1 where it is stated
that "An object of array type contains a contiguously allocated nonempty
subset of N subobjects of type T". And a twodimensional array is simply
an array where T is another array type.
Quote:
Quote:
>In loop 2,
>if p == first+2 (pointer to last element of a[0]), p+=2 points to
>the second element of a[1], so p and p+=2 do not refer to the same
>array. In this case, [expr.add] 5 stipulates undefined behavior.
>
The reasoning here seems to not be meaningful in any way, but the
conclusion is just about right.
The reasoning was based on the assumed fact that the two subarrays are
contiguoulsy allocated in memory.
Quote:
On the suspicion that this is HOMEWORK, I'll leave it as an exercise to
figure out why the second loop has formally Undefined Behavior in
addition to the UB considerations that apply to the first loop.
I figured one thing out that also applies to loop 1 (stupid me): the
behavior of the operation "first+6" is already undefined since first and
the result first+6 definitely do not point to elements (or past the end)
of the same array. But what if we loop as follows?

// loop 1
int* p = first;
for (int i=0; i<6; ++i)
std::cout << *p++; // 012345
  #4  
Old November 13th, 2007, 09:25 AM
Kai-Uwe Bux
Guest
 
Posts: n/a
Default Re: pointer arithmetic and multi-dimensional arrays

Bernd Gaertner wrote:
Quote:
Thanks a lot, for your answer, Alf! There are some things I still don't
understand, though (and yes, it's homework-related, but not my homework
but that of my students).
>
Alf P. Steinbach wrote:
Quote:
Quote:
>>#include<iostream>
>>>
>>int main()
>>{
>> int a[2][3] = { {0, 1, 2}, {3, 4, 5} };
>> int* first = a[0]; // pointer to a[0][0]
>>>
>> // loop 1
>> for (int* p = first; p < first+6; ++p)
>> std::cout << *p; // 012345
>>>
>> // loop 2
>> for (int* p = first; p < first+6; p+=2)
>> std::cout << *p; // 024
>>>
>> return 0;
>>}
>
Quote:
>You have no guarantee that a pointer to the first array element can be
>treated as a pointer to the array. On a segmented architecture the
>pointers might theoretically (although not in practice) be completely
>different beast, and the two inner arrays might reside in different
>segments, "you can't get there from here".
>
This seems to be in contradiction to [dcl.array] 1 where it is stated
that "An object of array type contains a contiguously allocated nonempty
subset of N subobjects of type T". And a twodimensional array is simply
an array where T is another array type.
That only concerns the layout of the arrays in memory. It does not concern
what you can do with pointer arithmetic. The standard does not preclude an
implementation to perform bounds checking. In particular, an implementation
is allowed to use decorated pointers that keep track of the range of
validity for pointer arithmetic through extra data not used in
dereferencing. Although this may not correspond to the intuitive notion of
a pointer, it is a perfectly valid implementation. The compiler would
deduce such ranges of validity from type information. In the above cases,
range checking would find an access violation once you increment p past the
bound of the 1D-array used to initialize it.


[snip]

Best

Kai-Uwe Bux
  #5  
Old November 13th, 2007, 11:55 AM
Bernd Gaertner
Guest
 
Posts: n/a
Default Re: pointer arithmetic and multi-dimensional arrays

Thanks for all your in-depth answers! I understand the issue much better
now, and in retrospect it was somewhat naive to expect defined behavior
for the loops in my original posting.

Best regards,
Bernd.
 

Bookmarks

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are Off
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On

What is Bytes?

We are a network of experts and professionals in IT and software development that help one another with answers to tough questions and share insights. Get the best answers to your questions from over network members.
Post your question now . . .
It's fast and it's free

Popular Articles