
Out-of-bounds Restrictions?


In another thread recently, there was discussed the accessing of array
indices which were out of bounds. Obviously, the following code is bogus:

int arr[5];

arr[9] = 4;

Take the following snippet however:

int arr[2][5];

arr[0][9] = 4;

The definition of "arr" results in a chunk of memory of size: sizeof(int)*5*2
Therefore, I would have thought that there was nothing wrong with the
accessing of the 10th element (if we look at it as a one-dimensional array
of int's). I've been told, however, that the behaviour is undefined.

This seems strange to me, especially given that we should be able to access
memory however we like in C. For example, look at the free reign we have
with the following:

int *const p = malloc(128 * sizeof*p);

p[56] = 4;

double *const pd = (double*)p;

pd[0] = 56.334;

What about the following snippet; is the behaviour undefined?

int (*p)[5][6] = malloc(5*6*sizeof*p);

(*p)[0][8] = 7;

Can the second snippet be remedied by simply adding a pointer? e.g.:

int arr[2][5];

/* arr[9] = 4; */

int *const p = (int*)&**arr;

p[9] = 4;

--

Frederick Gotham
Oct 29 '06 #1
Frederick Gotham wrote:
> In another thread recently, there was discussed the accessing of array
> indices which were out of bounds. Obviously, the following code is bogus:
>
> int arr[5];
>
> arr[9] = 4;
>
> Take the following snippet however:
>
> int arr[2][5];
>
> arr[0][9] = 4;
>
> The definition of "arr" results in a chunk of memory of size: sizeof(int)*5*2
> Therefore, I would have thought that there was nothing wrong with the
> accessing of the 10th element (if we look at it as a one-dimensional array
> of int's). I've been told, however, that the behaviour is undefined.

Correct. Formally, arr[0][9] is identical to *(arr[0] + 9),
and the sub-expression `arr[0] + 9' is invalid: it does not produce
a pointer to one of the elements of arr[0] nor to the fictitious
element just past its end (which would be arr[0][5]). That is, it
is invalid for the same reason your first example is invalid.

As a practical matter, few C implementations will catch this
error: few C implementations perform (or even can perform) bounds-
checking on array indices. However, the Standard does not forbid
bounds-checking; it does not take the step of actually sanctioning
an improper reference to perfectly good memory. (This is also the
downfall of the classic form of the "struct hack.")
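
As a rough sketch of how the "tenth int" can be reached without ever forming `arr[0] + 9`, the flat position can be split into a row and a column so that each subscript stays in range. The names `set_flat`, `ROWS` and `COLS` are made up for this illustration and do not come from the thread:

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 2, COLS = 5 };

/* Hypothetical helper: store value at flat position k of a ROWS x COLS
   array. Both subscripts stay within their own bounds, so no pointer
   outside a row (or past its one-past-the-end position) is formed. */
void set_flat(int arr[ROWS][COLS], size_t k, int value)
{
    assert(k < (size_t)ROWS * COLS);  /* reject positions outside the array */
    arr[k / COLS][k % COLS] = value;  /* row = k / COLS, column = k % COLS */
}
```

With `int arr[2][5]`, calling `set_flat(arr, 9, 4)` stores into `arr[1][4]`, which is the element the original `arr[0][9]` was aiming at.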

> This seems strange to me, especially given that we should be able to access
> memory however we like in C. For example, look at the free reign we have
> with the following:

"Free rein." Also, I'm not too sure what you mean by "should"
here: Is there a moral imperative lurking?

> int *const p = malloc(128 * sizeof*p);
>
> p[56] = 4;
>
> double *const pd = (double*)p;
>
> pd[0] = 56.334;

I'm not sure what this is supposed to illustrate. It's valid
if malloc() succeeds and if sizeof(double) <= 128 * sizeof(int).

> What about the following snippet; is the behaviour undefined?
>
> int (*p)[5][6] = malloc(5*6*sizeof*p);
>
> (*p)[0][8] = 7;

Work it through: p has the type "pointer to int[5][6]," so
(*p) has the type "int[5][6]" and (*p)[0] has the type "int[6]"
and (*p)[0][8] is an attempt to reference outside the extent of
that six-int array. Undefined, and almost certainly uncaught.

(By the way, the malloc() requests enough memory for thirty
such int[5][6] arrays, 900 ints altogether.)
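
For comparison, a sketch of an allocation sized for exactly one int[5][6]. Since `*p` already has type int[5][6], `sizeof *p` covers all thirty ints and needs no further multiplication. The typedef `grid` and the functions `alloc_grid` and `grid_set` are names invented for this example:

```c
#include <assert.h>
#include <stdlib.h>

typedef int grid[5][6];   /* hypothetical name for the array type */

/* Allocate room for exactly one int[5][6]; returns NULL on failure. */
grid *alloc_grid(void)
{
    grid *p = malloc(sizeof *p);   /* sizeof *p == 5 * 6 * sizeof(int) */
    return p;
}

/* Store v at row r, column c, keeping each subscript within its own bounds. */
void grid_set(grid *g, size_t r, size_t c, int v)
{
    assert(g != NULL && r < 5 && c < 6);
    (*g)[r][c] = v;
}
```

The caller is expected to check the result of `alloc_grid()` against NULL and to release it with `free()` when done.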

> Can the second snippet be remedied by simply adding a pointer? e.g.:
>
> int arr[2][5];
>
> /* arr[9] = 4; */
>
> int *const p = (int*)&**arr;
>
> p[9] = 4;

As far as I can see, this is all right. The cast is useless
clutter, though.

--
Eric Sosman
es*****@acm-dot-org.invalid

Oct 29 '06 #2
In article <dc*******************@news.indigo.ie>,
Frederick Gotham <fg*******@SPAM.com> wrote:

> In another thread recently, there was discussed the accessing of array
> indices which were out of bounds. Obviously, the following code is bogus:
>
> int arr[5];
>
> arr[9] = 4;
>
> Take the following snippet however:
>
> int arr[2][5];
>
> arr[0][9] = 4;
>
> The definition of "arr" results in a chunk of memory of size: sizeof(int)*5*2
> Therefore, I would have thought that there was nothing wrong with the
> accessing of the 10th element (if we look at it as a one-dimensional array
> of int's). I've been told, however, that the behaviour is undefined.

Freddy, Freddy, Freddy, whose side are you on?

You know perfectly well that the above is "undefined" because it's not
strictly kosher. As would anyone who has read this group for more than
a day or two. The basic rule about "undefined behavior" is that if you
have to ask (more precisely, if it even occurs to you to ask), then it
almost certainly is.

The fact that it works as you expect on every implementation known to
man is, of course, completely irrelevant (in the eyes of the religious
zealots).

Oct 29 '06 #3
On Sun, 29 Oct 2006 13:41:29 GMT, Frederick Gotham
<fg*******@SPAM.com> wrote:

> In another thread recently, there was discussed the accessing of array
> indices which were out of bounds. Obviously, the following code is bogus:
>
> int arr[5];
>
> arr[9] = 4;
>
> Take the following snippet however:
>
> int arr[2][5];
>
> arr[0][9] = 4;
>
> The definition of "arr" results in a chunk of memory of size: sizeof(int)*5*2
> Therefore, I would have thought that there was nothing wrong with the
> accessing of the 10th element (if we look at it as a one-dimensional array
> of int's). I've been told, however, that the behaviour is undefined.
>
> This seems strange to me, especially given that we should be able to access
> memory however we like in C. For example, look at the free reign we have
> with the following:
>
> int *const p = malloc(128 * sizeof*p);
>
> p[56] = 4;
>
> double *const pd = (double*)p;
>
> pd[0] = 56.334;

That is valid only if sizeof(double) <= 128*sizeof(int), which is very
likely but still not guaranteed.
>
> What about the following snippet; is the behaviour undefined?
>
> int (*p)[5][6] = malloc(5*6*sizeof*p);

You probably meant sizeof(int) here, since as written this allocates 30
times as much space as the point you are trying to make requires.
>
> (*p)[0][8] = 7;

Consider a very specialized processor where the compiler recognizes
that the memory allocated to p straddles a "memory hardware boundary".
All of (*p)[0] is before the boundary and the remainder is after. When
generating code for an expression where the first subscript of (*p) is
a constant, the compiler knows to generate code that references the
correct memory "segment". In the case where the first subscript is
not constant, the compiler knows to generate code to reference both
"segments" and execute only the code that references the correct
segment based on the result of a run-time test of the subscript.

You want (*p)[0][8] to mean (*p)[1][2] but the code will access the
wrong segment since the first subscript is a constant.
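
The index arithmetic behind that expectation can be written out explicitly. The sketch below (the struct `cell` and the function `normalize` are invented for illustration) carries an oversized column index over into the row index, which gives a well-defined way to name the element the out-of-range subscript was meant to reach:

```c
#include <assert.h>
#include <stddef.h>

struct cell { size_t row, col; };   /* hypothetical pair of subscripts */

/* Fold a column index that may exceed ncols into an in-range
   (row, col) pair for an array with ncols columns per row. */
struct cell normalize(size_t row, size_t col, size_t ncols)
{
    struct cell c;
    assert(ncols > 0);
    c.row = row + col / ncols;   /* whole rows skipped by the column index */
    c.col = col % ncols;         /* remaining offset inside that row */
    return c;
}
```

For the int[5][6] in question, `normalize(0, 8, 6)` gives row 1, column 2, matching the (*p)[1][2] mentioned in the reply. The caller still has to check that the resulting row lies within the array.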
>
> Can the second snippet be remedied by simply adding a pointer? e.g.:
>
> int arr[2][5];
>
> /* arr[9] = 4; */
>
> int *const p = (int*)&**arr;

Since **arr is already an int, the cast is unnecessary. Or you could
simplify to (int*)arr.
>
> p[9] = 4;

The compiler is allowed to infer that p points into arr[0]. Not a
change from the previous.
Oct 29 '06 #4
Barry Schwarz:
> > p[9] = 4;
>
> The compiler is allowed to infer that p points into arr[0]. Not a
> change from the previous.

I'm still not so sure that the compiler can decide that the memory access
is bad.

Every chunk of memory can be treated as an array of unsigned char's, as
follows:

#include <stdio.h>

int main(void)
{
    int arr[5][6][7] = {0};
    /* Is that initialisation OK with
       just the single 0 between the
       braces? */

    char unsigned const *p = (char unsigned*)arr;
    char unsigned const *const pover =
        (char unsigned*)(arr+sizeof arr/sizeof*arr);

    do printf("%u",(unsigned)*p++);
    while (pover != p);

    return 0;
}

--

Frederick Gotham
Oct 29 '06 #5
Frederick Gotham wrote:
> ...
> Take the following snippet however:
>
> int arr[2][5];
>
> arr[0][9] = 4;
>
> The definition of "arr" results in a chunk of memory of size: sizeof(int)*5*2
> Therefore, I would have thought that there was nothing wrong with the
> accessing of the 10th element (if we look at it as a one-dimensional array
> of int's). I've been told, however, that the behaviour is undefined.

That's because the behaviour _is_ undefined.

> This seems strange to me, especially given that we should be able to
> access memory however we like in C.

Why should we be able to do that? Do you have some buffer overflow
attack virus that needs to be strictly conforming? ;)

There are cases where walking over the end of an array can be
useful (e.g. struct hack in C90.) But it's not clear that being able to
do so is more beneficial in terms of efficiency gains. Note that
languages like Fortran have stricter rules than C. The stricter rules
allow for significantly _greater_ optimisation, not less.

The freedom that pointers have in C is actually a language weakness
more than a strength. Time has proven that.

--
Peter

Oct 29 '06 #6
In article <11**********************@f16g2000cwb.googlegroups.com> "Peter Nilsson" <ai***@acay.com.au> writes:
> Frederick Gotham wrote:
> ...
> > Take the following snippet however:
> >
> > int arr[2][5];
> >
> > arr[0][9] = 4;
> >
> > The definition of "arr" results in a chunk of memory of size: sizeof(int)*5*2
> > Therefore, I would have thought that there was nothing wrong with the
> > accessing of the 10th element (if we look at it as a one-dimensional array
> > of int's). I've been told, however, that the behaviour is undefined.
>
> That's because the behaviour _is_ undefined.

Can you provide a quotation from the standard? There has been a thread
about this subject a bit earlier. My conclusion was that it is permitted.
Note that 'a[i][j]' is equivalent to '*(&(*(a + i)) + j)', where the '&'
yields a pointer of type 'pointer to array 5 of int'.
In the standard indexing is allowed as long as the indexed pointer points
within an 'object'. There are two objects involved here: the complete
object and the element object. Now it could be argued that arr[0] is
a 'pointer to array 5 of int', and so the index must remain in the
range [0,5] (where the last can not be used in dereferencing). But
consider the following snippet:
int *p = &(a[0][0]);
now p is a pointer to int (and by the standard an int is, with indexing,
considered to be an array of a single element), so in that case indexing
of p should be restricted to 0 and 1.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Oct 30 '06 #7
Dik T. Winter wrote:
> In article <11**********************@f16g2000cwb.googlegroups.com> "Peter Nilsson" <ai***@acay.com.au> writes:
> > Frederick Gotham wrote:
> > ...
> > > Take the following snippet however:
> > >
> > > int arr[2][5];
> > >
> > > arr[0][9] = 4;
> > >
> > > The definition of "arr" results in a chunk of memory of size: sizeof(int)*5*2
> > > Therefore, I would have thought that there was nothing wrong with the
> > > accessing of the 10th element (if we look at it as a one-dimensional array
> > > of int's). I've been told, however, that the behaviour is undefined.
> >
> > That's because the behaviour _is_ undefined.
>
> Can you provide a quotation from the standard?

Non-normative, but:

J.2 Undefined behavior:
The behavior is undefined in the following circumstances:
[...]
- An array subscript is out of range, even if an object is apparently
accessible with the given subscript (as in the lvalue expression
a[1][7] given the declaration int a[4][5]) (6.5.6).

Oct 30 '06 #8
In article <11**********************@i42g2000cwa.googlegroups.com> "Harald van Dĳk" <tr*****@gmail.com> writes:
> Dik T. Winter wrote:
> ....
> > > That's because the behaviour _is_ undefined.
> > Can you provide a quotation from the standard?
>
> Non-normative, but:
>
> J.2 Undefined behavior:
> The behavior is undefined in the following circumstances:
> [...]
> - An array subscript is out of range, even if an object is apparently
> accessible with the given subscript (as in the lvalue expression
> a[1][7] given the declaration int a[4][5]) (6.5.6).

Indeed, I see it now also in 6.5.6 (it is different from the previous
discussion). So formally also the following:
int a[4][3];
int *p = &(a[0][0]);
p[6] = 1;
is also undefined behaviour. On the other hand, the following:
int a[4][3];
int *p = (int *)a;
p[6] = 1;
apparently is valid. Or is it not?
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Oct 31 '06 #9
I found one reference. This might be useful.

http://c-faq.com/aryptr/ary2dfunc2.html

"... according to an official interpretation, the behavior of accessing
(&array[0][0])[x] is not defined for x >= NCOLUMNS."

Oct 31 '06 #10
Dik T. Winter wrote:
> In article <11**********************@i42g2000cwa.googlegroups.com> "Harald van Dĳk" <tr*****@gmail.com> writes:
> > Dik T. Winter wrote:
> > ...
> > > > That's because the behaviour _is_ undefined.
> > >
> > > Can you provide a quotation from the standard?
> >
> > Non-normative, but:
> >
> > J.2 Undefined behavior:
> > The behavior is undefined in the following circumstances:
> > [...]
> > - An array subscript is out of range, even if an object is apparently
> > accessible with the given subscript (as in the lvalue expression
> > a[1][7] given the declaration int a[4][5]) (6.5.6).
>
> Indeed, I see it now also in 6.5.6 (it is different from the previous
> discussion). So formally also the following:
> int a[4][3];
> int *p = &(a[0][0]);
> p[6] = 1;
> is also undefined behaviour. On the other hand, the following:
> int a[4][3];
> int *p = (int *)a;
> p[6] = 1;
> apparently is valid. Or is it not?

I don't think the standard guarantees that (int *) a points to a[0][0].

(There are a lot of cases where "everybody knows" what the behaviour of
pointer conversions should be, but where the standard doesn't spell it
out. offsetof() is close to useless without relying on such behaviour.
This may or may not be one of those cases.)

Oct 31 '06 #11
Dik T. Winter:
> Note that 'a[i][j]' is equivalent to '*(&(*(a + i)) + j)'

Incorrect.

a[i][j]

is equivalent to:

( a[i] ) [j]

which is equivalent to:

*( a[i] + j )

which is equivalent to:

*( *(a + i) + j )

The addressof operator plays no part.
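
That chain of equivalences can be checked mechanically. Below is a small sketch, with a made-up function name `same_element`, that takes the address produced by each spelling and compares them. The `&` appears here only to obtain something comparable; it is not part of the subscript expansion itself:

```c
#include <assert.h>

/* Hypothetical check: returns 1 if all three spellings of the subscript
   expression designate the same element of a. The parameter a has type
   int (*)[5], just as an int[4][5] argument does after conversion. */
int same_element(int a[4][5], int i, int j)
{
    assert(i >= 0 && i < 4 && j >= 0 && j < 5);  /* keep both subscripts in range */
    int *p1 = &a[i][j];
    int *p2 = &*(a[i] + j);
    int *p3 = &*(*(a + i) + j);
    return p1 == p2 && p2 == p3;
}
```

For any in-range pair such as `same_element(a, 1, 3)` on an `int a[4][5]`, the three pointers compare equal and the function returns 1.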

--

Frederick Gotham
Nov 1 '06 #12
