
# contiguity of arrays

 P: n/a Is this legal? Must it print 4?

    int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
    printf("%d\n", *(b + 3));

--Steve Nov 14 '05 #1
197 Replies

 P: n/a Steve Kobes wrote:
> Is this legal? Must it print 4?
>
>     int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
>     printf("%d\n", *(b + 3));
>
> --Steve

Yes it is legal and will print 4. This array describes a contiguously allocated nonempty set of objects of type int. b = a[0] positions the int *b at the beginning of the array object, as would b = &a[0][0].

    #include <stdio.h>

    int main(void)
    {
        int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];

        printf("a[2][2] = {{1,2},{3,4}};\nint *b;\n"
               "Using b = a[0]. *(b + 3) = %d\n", *(b + 3));
        b = &a[0][0];
        printf("Using b = &a[0][0]. *(b + 3) = %d\n", *(b + 3));
        printf("The pointer values &a[0][0] and a[0] are%sequal\n",
               (&a[0][0] == a[0]) ? " " : " not ");
        return 0;
    }

-- Al Bowers Tampa, Fl USA mailto: xa******@myrapidsys.com (remove the x to send email) http://www.geocities.com/abowers822/ Nov 14 '05 #2

 P: n/a Chris Torek wrote:
>>> Steve Kobes wrote:
>>> Is this legal? Must it print 4?
>>>
>>>     int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
>>>     printf("%d\n", *(b + 3));
>> In article Al Bowers wrote:
>> Yes it is legal and will print 4.
> I believe there are those in comp.std.c, at least, who will disagree with you.

If it was

    int a[2][2] = {{1, 2}, {3, 4}}, *b = (int *)a;
                                         ^^^^^^^^

then I believe you could access b[3].

-- pete Nov 14 '05 #4

 P: n/a On 25 Sep 2004 17:21:37 GMT, Chris Torek wrote:
>>> Steve Kobes wrote:
>>> Is this legal? Must it print 4?
>>>
>>>     int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
>>>     printf("%d\n", *(b + 3));
>> In article Al Bowers wrote:
>> Yes it is legal and will print 4.
> I believe there are those in comp.std.c, at least, who will disagree with you.

I could not follow your discussion which followed, so I don't know exactly which point I'm about to disagree with. Instead I will try to present an argument that I believe shows the original question about legality must be answered in the affirmative.

For an array q (not declared as a function parameter), sizeof q / sizeof *q must evaluate to the number of elements in q.

For a pointer p to type T, the expression p+1 must evaluate to the same value as (T*)((char*)p + sizeof(T)).

Therefore:

sizeof a must evaluate to the same value as 2*sizeof a[0].
sizeof a[0] and sizeof a[1] both must evaluate to the same value as 2*sizeof a[0][0].
sizeof a[0][0] must evaluate to the same value as sizeof(int).
sizeof a must evaluate to the same value as 4*sizeof(int).

Since a contains 4 int and is exactly large enough to contain 4 int, these 4 int must be located at b, b+1, b+2, and b+3.

(snip) Nov 14 '05 #5

 P: n/a On 25 Sep 2004 22:48:34 GMT, in comp.lang.c , Barry Schwarz wrote:
> On 25 Sep 2004 17:21:37 GMT, Chris Torek wrote:
>>>> Steve Kobes wrote:
>>>> Is this legal? Must it print 4?
>>>>
>>>>     int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
>>>>     printf("%d\n", *(b + 3));
>>> In article Al Bowers wrote:
>>> Yes it is legal and will print 4.
>> I believe there are those in comp.std.c, at least, who will disagree with you.
> I could not follow your discussion which followed so I don't know exactly which point I'm about to disagree with. Instead I will try to present an argument that I believe shows the original question about legality must be answered in the affirmative.

(snip argument which is based on the size of the objects, and 1-d array arithmetic.)

The size argument is spurious. The compiler is allowed to tell you the object is size 4, even if all 4 members were in separate memory arrays in different solar systems. There's no limit to the internal magic the compiler can perform.

-- Mark McIntyre Nov 14 '05 #6

 P: n/a Barry Schwarz wrote:
> On 25 Sep 2004 17:21:37 GMT, Chris Torek wrote:
>>>> Steve Kobes wrote:
>>>> Is this legal? Must it print 4?
>>>>
>>>>     int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
>>>>     printf("%d\n", *(b + 3));
>>> In article Al Bowers wrote:
>>> Yes it is legal and will print 4.
>> I believe there are those in comp.std.c, at least, who will disagree with you.
>
> I could not follow your discussion which followed so I don't know exactly which point I'm about to disagree with. Instead I will try to present an argument that I believe shows the original question about legality must be answered in the affirmative.
>
> For an array q (not declared as a function parameter), sizeof q / sizeof *q must evaluate to the number of elements in q.
>
> For a pointer p to type T, the expression p+1 must evaluate to the same value as (T*)((char*)p + sizeof(T))
>
> Therefore sizeof a must evaluate to the same value as 2*sizeof a[0]
> sizeof a[0] and sizeof a[1] both must evaluate to the same value as 2*sizeof a[0][0].
> sizeof a[0][0] must evaluate to the same value as sizeof(int)
> sizeof a must evaluate to the same value as 4*sizeof(int)
>
> Since a contains 4 int and is exactly large enough to contain 4 int, these 4 int must be located at b, b+1, b+2, and b+3.

The problem is that b, with (b = a[0]), is pointed at the first element of a two element array.

If you have

    int a = 0, b = 0;

then, you can't do this

    if (a + 1 == b) { /* The condition check is valid */
        printf("%d\n", a[1]); /* Then printf call isn't valid */
    }

Whether or not there is an int object at (a + 1) is not the point. Overrunning the bounds of the object is the point.

-- pete Nov 14 '05 #7

 P: n/a pete wrote:
> Barry Schwarz wrote:
>> On 25 Sep 2004 17:21:37 GMT, Chris Torek wrote:
>>>>> Steve Kobes wrote:
>>>>> Is this legal? Must it print 4?
>>>>>
>>>>>     int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
>>>>>     printf("%d\n", *(b + 3));
>>>> In article Al Bowers wrote:
>>>> Yes it is legal and will print 4.
>>> I believe there are those in comp.std.c, at least, who will disagree with you.
>>
>> I could not follow your discussion which followed so I don't know exactly which point I'm about to disagree with. Instead I will try to present an argument that I believe shows the original question about legality must be answered in the affirmative.
>>
>> For an array q (not declared as a function parameter), sizeof q / sizeof *q must evaluate to the number of elements in q.
>>
>> For a pointer p to type T, the expression p+1 must evaluate to the same value as (T*)((char*)p + sizeof(T))
>>
>> Therefore sizeof a must evaluate to the same value as 2*sizeof a[0]
>> sizeof a[0] and sizeof a[1] both must evaluate to the same value as 2*sizeof a[0][0].
>> sizeof a[0][0] must evaluate to the same value as sizeof(int)
>> sizeof a must evaluate to the same value as 4*sizeof(int)
>>
>> Since a contains 4 int and is exactly large enough to contain 4 int, these 4 int must be located at b, b+1, b+2, and b+3.
>
> The problem is that b, with (b = a[0]), is pointed at the first element of a two element array.
>
> If you have
>
>     int a = 0, b = 0;
>
> then, you can't do this
>
>     if (a + 1 == b) { /* The condition check is valid */
>         printf("%d\n", a[1]); /* Then printf call isn't valid */
>     }

Major typos there. Should be:

    if (&a + 1 == &b) { /* The condition check is valid */
        printf("%d\n", (&a)[1]); /* Then printf call isn't valid */
    }

> Whether or not there is an int object at (a + 1) is not the point. Overrunning the bounds of the object is the point.

-- pete Nov 14 '05 #8

 P: n/a Steve Kobes wrote:
> Is this legal? Must it print 4?
>
>     int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
>     printf("%d\n", *(b + 3));

Despite the fact that this works everywhere, neither C89 nor C99 "guarantees" this. You can reference *(b+0) and *(b+1), and you can compute (b+2), but you can't reference *(b+2). It isn't "legal" to compute (b+3), much less reference *(b+3).

If this "equivalence" is important to you, I don't think that you need to be too concerned about what the standard says. Nov 14 '05 #9

 P: n/a Hiho, (snip argument which is based on the size of the objects, and 1-d array arithmetic. ) The size argument is spurious. The compiler is allowed to tell you the object is size 4, even if all 4 members were in separate memory arrays in different solar systems. Theres no limit to the internal magic the compiler can perform. True. But I have yet to see the compiler that will correctly wrap _all_ calls to memcpy() and all other functions getting size_t and pointer arguments for this case. It is much easier to build non-broken programs the other way. Cheers Michael Nov 14 '05 #10

 P: n/a E. Robert Tisdale wrote: Steve Kobes wrote: Is this legal? Must it print 4? int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0]; printf("%d\n", *(b + 3)); Despite the fact that this works everywhere, neither C89 or C99 "guarantees" this. You can reference *(b+0) and *(b+1) and you can compute (b+2) but you can't reference *(b+2). It isn't "legal" to compute (b+3) much less reference *(b+3). If this "equivalence" is important to you, I don't think that you need to be too concerned about what the standard says. Nonsense. b is an int* pointing to an array of four int's. b[2] and b[3] are absolutely legal. -- Joe Wright mailto:jo********@comcast.net "Everything should be made as simple as possible, but not simpler." --- Albert Einstein --- Nov 14 '05 #11

 P: n/a Joe Wright wrote: E. Robert Tisdale wrote: Steve Kobes wrote: Is this legal? Must it print 4? int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0]; printf("%d\n", *(b + 3)); Despite the fact that this works everywhere, neither C89 or C99 "guarantees" this. You can reference *(b+0) and *(b+1) and you can compute (b+2) but you can't reference *(b+2). It isn't "legal" to compute (b+3) much less reference *(b+3). If this "equivalence" is important to you, I don't think that you need to be too concerned about what the standard says. Nonsense. b is an int* pointing to an array of four int's. b[2] and b[3] are absolutely legal. For int myobject[4]; myobject, is an object of type array of four int. You don't have an object of that type in the previous code. -- pete Nov 14 '05 #13

 P: n/a Joe Wright wrote:
> E. Robert Tisdale wrote:
>> Steve Kobes wrote:
>>> Is this legal? Must it print 4?
>>>
>>>     int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
>>>     printf("%d\n", *(b + 3));
>> Despite the fact that this works everywhere, neither C89 or C99 "guarantees" this. You can reference *(b+0) and *(b+1) and you can compute (b+2) but you can't reference *(b+2). It isn't "legal" to compute (b+3) much less reference *(b+3). If this "equivalence" is important to you, I don't think that you need to be too concerned about what the standard says.
> Nonsense. b is an int* pointing to an array of four int's. b[2] and b[3] are absolutely legal.

No, it's not. b is an int * pointing to the first of two consecutive arrays of two ints. E. is quite correct that

a. this is not the same thing as your version,
b. strictly speaking this means that this code is not correct, and
c. this difference is theoretical; you'd be hard put to find an implementation where this fails, perhaps excepting very strict debugging implementations.

Richard Nov 14 '05 #14

 P: n/a E. Robert Tisdale wrote:
> Joe Wright wrote:
>> E. Robert Tisdale wrote:
>>> Steve Kobes wrote:
>>> Is this legal? Must it print 4?
>>>
>>>     int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
>>>     printf("%d\n", *(b + 3));
>> Despite the fact that this works everywhere, neither C89 or C99 "guarantees" this. You can reference *(b+0) and *(b+1) and you can compute (b+2) but you can't reference *(b+2). It isn't "legal" to compute (b+3) much less reference *(b+3). If this "equivalence" is important to you, I don't think that you need to be too concerned about what the standard says.
> Nonsense. b is an int* pointing to an array of four int's. b[2] and b[3] are absolutely legal.

How can it point to such an array? None has been defined in this program. C doesn't have multidimensional arrays, it has arrays of arrays, and this is one of the subtle differences between those two concepts. The named array 'a' contains two elements, which are themselves arrays. The unnamed array identified by a[0] is an array of only two elements, which are ints. The rules for the limits of pointer arithmetic are defined in terms of arrays, and according to the rules of the C language, 'a' is not an array of four integers.

The elements of the elements of 'a' must be stored in the same way as the elements of an array of four integers. As a result, on most implementations of C, code like this works exactly as you expect. However, because this code has undefined behavior, an implementation is free to implement pointers in such a fashion that it can keep track of the limits beyond which they can't be dereferenced, and abort the program if those limits are violated. In particular, when a[0] decays to a pointer, it's legal for the compiler to give that pointer dereferencing limits of a[0] and a[0]+1.

There is special wording that allows any object, including an array of arrays, to be accessed completely using pointers to unsigned char. This is what makes memcpy() usable. However, for any other type this is an issue. Nov 14 '05 #15

 P: n/a James Kuyper wrote: E. Robert Tisdale wrote: Joe Wright wrote: E. Robert Tisdale wrote: Steve Kobes wrote:> Is this legal? Must it print 4?>> int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];> printf("%d\n", *(b + 3)); There is special wording that allows any object, including an array of arrays, to be accessed completely using pointers to unsigned char. This is what makes memcpy() usable. However, for any other type this is an issue. If it was int a[2][2] = {{1, 2}, {3, 4}}, *b = (int *)a; ^^^^^^^ instead, there wouldn't be a problem accessing b[3], would there? -- pete Nov 14 '05 #16

 P: n/a >>There is special wording that allows any object,including an array of arrays,to be accessed completely using pointers to unsigned char.This is what makes memcpy() usable.However, for any other type this is an issue. If it was int a[2][2] = {{1, 2}, {3, 4}}, *b = (int *)a; ^^^^^^^ instead, there wouldn't be a problem accessing b[3], would there? There would. The same arguments hold. If you want to be sure that you use the right type of pointer, you have to work with something along the lines of int (*b)[2]. Nov 14 '05 #17

 P: n/a James Kuyper wrote:
> E. Robert Tisdale wrote:
>> Joe Wright wrote:
>>> E. Robert Tisdale wrote:
>>>> Steve Kobes wrote:
>>>> Is this legal? Must it print 4?
>>>>
>>>>     int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
>>>>     printf("%d\n", *(b + 3));
>>> Despite the fact that this works everywhere, neither C89 or C99 "guarantees" this. You can reference *(b+0) and *(b+1) and you can compute (b+2) but you can't reference *(b+2). It isn't "legal" to compute (b+3) much less reference *(b+3). If this "equivalence" is important to you, I don't think that you need to be too concerned about what the standard says.
>> Nonsense. b is an int* pointing to an array of four int's. b[2] and b[3] are absolutely legal.
> How can it point to such an array? None has been defined in this program. C doesn't have multidimensional arrays, it has arrays of arrays, and this is one of the subtle differences between those two concepts. The named array 'a' contains two elements, which are themselves arrays. The unnamed array identified by a[0] is an array of only two elements, which are ints. The rules for the limits of pointer arithmetic are defined in terms of arrays, and according to the rules of the C language, 'a' is not an array of four integers.

At a, there are four int objects, one after the other, having values 1, 2, 3 and four respectively. Looks like an array to me, even if undeclared.

> The elements of the elements of 'a' must be stored in the same way as the elements of an array of four integers. As a result, on most implementations of C, code like this works exactly as you expect. However, because this code has undefined behavior, an implementation is free to implement pointers in such a fashion that it can keep track of the limits beyond which they can't be dereferenced, and abort the program if those limits are violated. In particular, when a[0] decays to a pointer, it's legal for the compiler to give that pointer dereferencing limits of a[0] and a[0]+1.

Name one compiler that enforces such limits.

> There is special wording that allows any object, including an array of arrays, to be accessed completely using pointers to unsigned char. This is what makes memcpy() usable. However, for any other type this is an issue.

If I couldn't access a[1][0] as b[2] I would be surprised, and annoyed.

-- Joe Wright mailto:jo********@comcast.net "Everything should be made as simple as possible, but not simpler." --- Albert Einstein --- Nov 14 '05 #18

 P: n/a Joe Wright wrote: If I couldn't access a[1][0] as b[2] I would be surprised, and annoyed. But you can. The *de facto* standard is as you describe it. No compiler developer would implement the standard *de jure* unless they were suicidal. The "legal" restrictions of the standard de jure are pure sophistry. Nov 14 '05 #19

 P: n/a Michael Mair wrote:There is special wording that allows any object,including an array of arrays,to be accessed completely using pointers to unsigned char.This is what makes memcpy() usable.However, for any other type this is an issue. If it was int a[2][2] = {{1, 2}, {3, 4}}, *b = (int *)a; ^^^^^^^ instead, there wouldn't be a problem accessing b[3], would there? There would. The same arguments hold. No, the same arguments don't hold. Is this legal? Must it print 4? int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0]; printf("%d\n", *(b + 3)); The problem originally was that b had the address of the first element of a[0]. a[0] has two elements, a[0][0] and a[0][1]. The address of b[3] is outside of a[0][1], which is to say that b[3] is beyond the boundary of a[0]. For b = (int *)a; b has the address of the first element of a, converted to int *. a has two elements, a[0] and a[1]. The address of b[3] is not outside of a[1], which is to say that b[3] is not beyond the boundary of a. -- pete Nov 14 '05 #20

 P: n/a Joe Wright writes: James Kuyper wrote: [...] The elements of the elements of 'a' must be stored in the same way as the elements of an array of four integers. As a result, on most implementations pf C code like this works exactly as you expect. However, because this code has undefined behavior, an implementation is free to implement pointers in such a fashion that it can keep track of the limits beyond which they can't be dereferenced, and abort the problem if those limits are violated. In particular, when a[0] decays to a pointer, it's legal for the compiler to give that pointer dereferencing limits of a[0] and a[0]+1. Name one compiler enforces such limits. There may or may not be such a compiler. The point (especially in comp.std.c) is that any compiler is allowed to enforce such limits. -- Keith Thompson (The_Other_Keith) ks***@mib.org San Diego Supercomputer Center <*> We must do something. This is something. Therefore, we must do this. Nov 14 '05 #21

 P: n/a pete wrote in message news:<41***********@mindspring.com>... ....
> If it was
>
>     int a[2][2] = {{1, 2}, {3, 4}}, *b = (int *)a;
>                                          ^^^^^^^^
>
> instead, there wouldn't be a problem accessing b[3], would there?

The validity limits of pointer arithmetic are defined in terms of the containing array. There is one and only one array of int that contains the int object pointed at by 'b'. That array is a[0], and it only contains two integers. 'a' itself is not an array of 'int', but rather is an array of 'int[2]', and therefore is not capable of determining the validity limits for arithmetic on an 'int*'. Nov 14 '05 #22

 P: n/a Keith Thompson wrote:
> Joe Wright writes:
>> James Kuyper wrote: [...]
>>> The elements of the elements of 'a' must be stored in the same way as the elements of an array of four integers. As a result, on most implementations of C, code like this works exactly as you expect. However, because this code has undefined behavior, an implementation is free to implement pointers in such a fashion that it can keep track of the limits beyond which they can't be dereferenced, and abort the program if those limits are violated. In particular, when a[0] decays to a pointer, it's legal for the compiler to give that pointer dereferencing limits of a[0] and a[0]+1.
>> Name one compiler that enforces such limits.
> There may or may not be such a compiler.

Meaning that there *is* no such compiler.

> The point (especially in comp.std.c) is that any compiler is allowed to enforce such limits.

Correct. The standard de jure allows compiler developers to commit suicide. We expect them to have more sense than that. Nov 14 '05 #23

 P: n/a "E. Robert Tisdale" writes: Keith Thompson wrote: Joe Wright writes: [...]Name one compiler enforces such limits. There may or may not be such a compiler. Meaning that there *is* no such compiler. No, meaning that there may or may not be such a compiler. If you're assuming that I'm familiar with all existing C compilers, and that if there were one that does strict bounds checking I would know about it, your confidence is misplaced. -- Keith Thompson (The_Other_Keith) ks***@mib.org San Diego Supercomputer Center <*> We must do something. This is something. Therefore, we must do this. Nov 14 '05 #24

 P: n/a "James Kuyper" wrote in message news:8b**************************@posting.google.c om...
> pete wrote in message news:<41***********@mindspring.com>... ...
>> If it was
>>
>>     int a[2][2] = {{1, 2}, {3, 4}}, *b = (int *)a;
>>                                          ^^^^^^^^
>>
>> instead, there wouldn't be a problem accessing b[3], would there?
> The validity limits of pointer arithmetic are defined in terms of the containing array. There is one and only one only array of int that contains the int object pointed at by 'b'. That array is a[0], and it

I believe the wording is for stand-alone arrays only, and a[0] is a part of a contiguous object.

The array above is a single object. Since a[1] + 2 can be pointed to, even b + 4 can be pointed to, and b[3] can certainly be addressed. Nov 14 '05 #25

 P: n/a Joe Wright wrote: ....
> At a, there are four int objects, one after the other, having values 1, 2, 3 and four respectively. Looks like an array to me, even if undeclared.

Key word: undeclared. If it is not declared as such in the C program, it doesn't count as such for purposes of determining what the C program is allowed/required to do with it.

> Name one compiler enforces such limits.

I don't know whether there are any, though I have vague memories of a compiler that provided such checking in a special debug mode. It would certainly be too expensive for use in the default mode. However, I don't care whether any compiler actually does this; for the purposes of comp.std.c, all I care about is whether compilers are allowed to do this. This is cross-posted to comp.lang.c, where the relevant concerns are different. ....

> If I couldn't access a[1][0] as b[2] I would be surprised, and annoyed.

Your surprise and annoyance wouldn't render such a compiler non-conforming. Nov 14 '05 #26

 P: n/a "E. Robert Tisdale" wrote: Joe Wright wrote: If I couldn't access a[1][0] as b[2] I would be surprised, and annoyed. But you can. The *de facto* standard is as you describe it. No compiler developer would implement the standard *de jure* unless they were suicidal. Or designing a _deliberately_ strict implementation, for example a debugging compiler. Richard Nov 14 '05 #28

 P: n/a pete wrote:
> Michael Mair wrote:
>>>> There is special wording that allows any object, including an array of arrays, to be accessed completely using pointers to unsigned char. This is what makes memcpy() usable. However, for any other type this is an issue.
>>> If it was
>>>
>>>     int a[2][2] = {{1, 2}, {3, 4}}, *b = (int *)a;
>>>                                          ^^^^^^^^
>>>
>>> instead, there wouldn't be a problem accessing b[3], would there?
>> There would. The same arguments hold.
> No, the same arguments don't hold.

Actually, yes, they do. Observe - fit the first:

    int a[2][2] = {{1, 2}, {3, 4}},

You now have an array of two arrays of int.

    *b = a[0];

You get the address of the first of these int arrays, convert it to an int pointer, and assign it to b.

    printf("%d\n", *(b + 3));

You invoke undefined behaviour by increasing _that int pointer_ beyond its legal boundary.

Fit the second:

    int a[2][2] = {{1, 2}, {3, 4}},

You now have an array of two arrays of int.

    *b = (int *)a;

You get the address of the entire array, convert it to an int pointer, and assign it to b.

    printf("%d\n", *(b + 3));

You invoke undefined behaviour by increasing _that int pointer_ beyond its legal boundary.

Note that:
- the address of an array and the address of its first member are identical.
- the entire array is properly aligned for ints, so a conversion of its base address (or the address of its first member, which is the same except for type) to int * must succeed and give the address of the first int in the array.
- once a pointer has been converted to another pointer type, there is nothing in the Standard that allows you to deduce the original type.

IOW, if you have

    int a[2][2] = {{1, 2}, {3, 4}}, *b = (int *)a, *c = a[0];

then b==c _must_ evaluate to 1. The two pointers you get from those two conversions have exactly the same value, and exactly the same requirements. Note that this is _not_ true of a and a[0]; but this is because a has a different type than a[0]. Once you convert them both to int *, this distinction is, obviously, lost.

Richard Nov 14 '05 #29

 P: n/a James Kuyper wrote:
> Joe Wright wrote:
>> If I couldn't access a[1][0] as b[2] I would be surprised, and annoyed.
> Your surprise and annoyance wouldn't render such a compiler non-conforming.

James has correctly identified the issue: an "array" is more than just a contiguous sequence of similar objects; it also has a declared size.

It is entirely possible that a compiler can generate more efficient code if it takes advantage of the declared size; for example, if an architecture has a 64KB limit on segment size and the array is declared smaller than that, then it will not be necessary to generate code that copes with segment boundaries (e.g. loading different segment base addresses for different parts of the array).

An array of arrays is guaranteed to have the storage contiguously allocated (without extra padding), but not all elements of each array can be accessed by indexing off a pointer "based on" a pointer to a given element in a particular array.

However, on many architectures there is no noticeable speed penalty involved in supporting that usage, due to a uniform, large memory space and wide-enough pointers. Thus, some programmers have been getting away with this nonportable practice on the platforms they have used so far. Nov 14 '05 #30

 P: n/a "James Kuyper" wrote in message news:41************@saicmodis.com...
> Ivan A. Kosarev wrote:
>> ... and a[0] is a part of continious object.
> Agreed. If the validity limits of pointer arithmetic cared about contiguity, that would be a relevant argument. They don't. They're defined entirely in terms of the elements of a single array of the pointed-at type.

No type designates an object. :-) Rather, objects are regions of memory that are interpreted according to their types. Given that, it doesn't matter how we obtain a pair of pointers to compare with relational operators; if they point into a single array (that is, an *object*, not a *type*), they can be compared with a defined result.

>> The array above is a single object. Since a[1] + 2 can be pointed to,
> a[1]+2 is valid pointer value, which can be compared for equality with any valid pointer value, and compared for relative order with any other pointer that points into or one past the end of a[1]. However, it cannot

Again, since the array is a single object, a[1] + 2 and b + 4 are values that point to the same object of the same type. Nov 14 '05 #31

 P: n/a Douglas A. Gwyn wrote: Thus, some programmers have been getting away with this nonportable practice on the platforms they have used so far. If they are "getting away with" it, it's portable. Nov 14 '05 #32

 P: n/a On Tue, 28 Sep 2004 16:11:16 -0700 in comp.std.c, "E. Robert Tisdale" wrote:
> Douglas A. Gwyn wrote:
>> Thus, some programmers have been getting away with this nonportable practice on the platforms they have used so far.
> If they are "getting away with" it, it's portable.

Not unless the standard says it is!

-- Thanks. Take care, Brian Inglis Calgary, Alberta, Canada Br**********@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca) fake address use address above to reply Nov 14 '05 #33

 P: n/a In article , E.**************@jpl.nasa.gov says... Douglas A. Gwyn wrote: Thus, some programmers have been getting away with this nonportable practice on the platforms they have used so far. If they are "getting away with" it, it's portable. Not at all. It simply means they haven't tried enough platforms yet. By your usage, anyone using envp as an argument to main is writing portable code, as long as they haven't tried it on a platform where it doesn't work yet. -- Randy Howard (2reply remove FOOBAR) Nov 14 '05 #34

 P: n/a Randy Howard wrote:
> E.Robert.Tisdale wrote:
>> Douglas A. Gwyn wrote:
>>> Thus, some programmers have been getting away with this nonportable practice on the platforms they have used so far.
>> If they are "getting away with" it, it's portable.
> Not at all. It simply means they haven't tried enough platforms yet.

    > cat main.c
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char* argv[]) {
      const int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
      const size_t n = sizeof(a)/sizeof(a[0][0]);
      for (size_t j = 0; j < n; ++j)
        fprintf(stdout, "b[%zu] = %d\t", j, b[j]);
      fprintf(stdout, "\n");
      return EXIT_SUCCESS;
    }
    > gcc -Wall -std=c99 -pedantic -o main main.c
    > ./main
    b[0] = 1 b[1] = 2 b[2] = 3 b[3] = 4

This program ports to every platform with a C99 compliant compiler. Nov 14 '05 #35

 P: n/a "E. Robert Tisdale" wrote in message news:cj**********@nntp1.jpl.nasa.gov... This program ports to every platform with a C99 compliant compiler. Prove it. Nov 14 '05 #36

 P: n/a "E. Robert Tisdale" wrote: Douglas A. Gwyn wrote: Thus, some programmers have been getting away with this nonportable practice on the platforms they have used so far. If they are "getting away with" it, it's portable. Please don't post on subjects you don't understand. Thanks. Nov 14 '05 #37

 P: n/a James Kuyper wrote: pete wrote in message news:<41***********@mindspring.com>... ... If it was int a[2][2] = {{1, 2}, {3, 4}}, *b = (int *)a; ^^^^^^^ instead, there wouldn't be a problem accessing b[3], would there? The validity limits of pointer arithmetic are defined in terms of the containing array. There is one and only one only array of int that contains the int object pointed at by 'b'. That array is a[0], and it only contains two integers. 'a' itself is not an array of 'int', but rather is an array of 'int[2]', and therefore is not capable of determining the validity limits for arithmetic on an 'int*'. I don't see what difference it makes whether or not object a even contains an int type. As long as object a is as big as an int and also aligned for type int, I can access the object as (*(int*)&a) regardless if a was declared as a structure or an array of floats. With int *b = (int *)&a; b[3] isn't accessing an element of a, b[3] is accessing the memory at (int*)&a + 3, and treating it as an object of type int. -- pete Nov 14 '05 #38

 P: n/a "Douglas A. Gwyn" wrote in message news:41***************@null.net...
> James Kuyper wrote:
>> Joe Wright wrote:
>>> If I couldn't access a[1][0] as b[2] I would be surprised, and annoyed.
>> Your surprise and annoyance wouldn't render such a compiler non-conforming.
> James has correctly identified the issue: an "array" is more than just a contiguous sequence of similar objects; it also has a declared size. It is entirely possible that a compiler can generate more efficient code if it takes advantage of the declared size; for example, if an

Does this mean that pointers that result from array-to-pointer conversion somehow differ from any other pointers? If they don't, how does the Standard allow such optimizations with the former and forbid them with the latter?

(Hopefully, we will keep in mind that an abstract machine cannot refer to any optimization, including any kind of folding and propagation.) Nov 14 '05 #39

 P: n/a

Hi pete

> I don't see what difference it makes whether or not object a even contains an int type. As long as object a is as big as an int and also aligned for type int, I can access the object as (*(int*)&a) regardless if a was declared as a structure or an array of floats.
>
> With int *b = (int *)&a; b[3] isn't accessing an element of a, b[3] is accessing the memory at (int*)&a + 3, and treating it as an object of type int.

Assuming a in this case is not an array of any flavour (or that you would have used the appropriate &a[0]...[0]):

That is exactly the point! b[3] or b+3 accesses this address but it is not guaranteed that it may do so! You just might try to access memory which you do not have access to as it does not belong to the object you pointed b to...

--Michael

Nov 14 '05 #40

 P: n/a

Wojtek Lerch wrote:
> "E. Robert Tisdale" wrote:
> > This program ports to every platform with a C99 compliant compiler.
>
> Prove it.

That shouldn't be difficult to do by exhaustive search ;-)

-- David Hopwood

Nov 14 '05 #41

 P: n/a

Hi E.R.T.

>     > cat main.c
>     #include <stdio.h>
>     #include <stdlib.h>
>
>     int main(int argc, char* argv[]) {
>         const int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
>         const size_t n = sizeof(a)/sizeof(a[0][0]);
>         for (size_t j = 0; j < n; ++j)
>             fprintf(stdout, "b[%u] = %d\t", j, b[j]);
>         fprintf(stdout, "\n");
>         return EXIT_SUCCESS;
>     }
>     > gcc -Wall -std=c99 -pedantic -o main main.c
>     > ./main
>     b[0] = 1 b[1] = 2 b[2] = 3 b[3] = 4
>
> This program ports to every platform with a C99 compliant compiler.

You are aware that gcc is not C99 compliant and that gcc does not even produce C*89* compliant programs when called with "gcc -Wall -std=c89 -pedantic ...", are you?

That said: I also use mainly gcc but think it is unfit to "prove" anything.

--Michael

Nov 14 '05 #42
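For comparison, here is a minimal sketch of visiting the same four values without stepping a pointer past a[0]. The flat index is mapped back to a row and a column, so each subscript stays within its declared bound. The names ROWS, COLS, flat_get and print_flat are made up for illustration and do not appear in the thread; the output uses %zu, the C99 conversion for size_t.

```c
#include <stddef.h>
#include <stdio.h>

#define ROWS 2   /* illustrative: matches the 2x2 array in the thread */
#define COLS 2

/* Return element number k, counted in row-major (storage) order.
   The caller must keep k below ROWS * COLS. Each subscript stays
   inside its own array bound, so no pointer arithmetic leaves the
   sub-array it started in. */
int flat_get(const int a[ROWS][COLS], size_t k)
{
    return a[k / COLS][k % COLS];
}

/* Print every element in storage order. */
void print_flat(const int a[ROWS][COLS])
{
    for (size_t k = 0; k < (size_t)ROWS * COLS; ++k)
        printf("b[%zu] = %d\t", k, flat_get(a, k));
    printf("\n");
}
```

Called as print_flat(a) with the const int a[2][2] from the quoted program, this should list the four values in the same order, and it does not depend on how the disputed b[2] and b[3] accesses are resolved.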

 P: n/a

In comp.lang.c Douglas A. Gwyn wrote:
> It is entirely possible that a compiler can generate more efficient code if it takes advantage of the declared size; for example, if an architecture has a 64KB limit on segment size and the array is declared smaller than that, then it will not be necessary to generate code that copes with segment boundaries (e.g. loading different segment base addresses for different parts of the array).

    char a[2][2];

I understand that this would be the wrong way to access a[1][0]:

    a[0][2]; //wrong

> An array of arrays is guaranteed to have the storage contiguously allocated (without extra padding), but not all elements of each array can be accessed by indexing off a pointer "based on" a pointer to a given element in a particular array.

But I still don't see what should be wrong with this:

    char *b = a[0];
    b[2]; //equiv to *(b+2)

If that were not allowed, it would mean we cannot access object `a' through char*.

-- Stan Tobias sed 's/[A-Z]//g' to email

Nov 14 '05 #43

 P: n/a

Ivan A. Kosarev wrote:
> Does this mean that pointers that are results of array-to-pointer conversion and any other pointers somehow differ?

If the compiler can see the array context then it is allowed to take advantage of it, i.e. to generate code that works only for the span of the declared array, as in using a single segment base and not worrying about overflow in the offset field.

Nov 14 '05 #44

 P: n/a

S.Tobias wrote:
> But I still don't see what should be wrong with this:
>
>     char *b = a[0];
>     b[2]; //equiv to *(b+2)
>
> If that were not allowed, it would mean we cannot access object `a' through char*.

Make that unsigned char *, to be safe. The guarantee that objects can be accessed as byte arrays is a special dispensation, distinct from the question of arrays of other types. So long as a *single object* can be identified, it can indeed be walked using a byte pointer derived from any kind of pointer into the object. An array of arrays is a single object (with identifiable subobjects).

For non-byte pointers, the situation is more restricted; partly this is to allow more efficient code on some platforms (as I explained previously) and partly it is to allow type-based nonaliasing assumptions to be made by compilers (another matter of code efficiency).

It may be instructive to consider the similar issue that came up some time ago, concerning what guarantees exist when two declared objects are accidentally contiguous:

    long a[10], b[10], *p = a, *q = b;
    if (a + 10 == b)
        stmt_1
    else
        stmt_2

In stmt_1, can b[3] be accessed using a[13], or can a[6] be accessed using b[-4]? Clearly not. Then, can b[3] be accessed using p[13], or can a[6] be accessed using q[-4]? The consensus seems to be that such usage is not strictly conforming, and the compiler is not obliged to make such code work as a naive programmer might expect.

It is hard to see what could go wrong with such tiny examples, especially if you're not very familiar with segment addressing, but if you start to think in terms of arrays near, say, 64KB in size the problems may become more evident.

Nov 14 '05 #45
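The byte-access guarantee described in this post can be sketched roughly as follows. This is only an illustration, assuming the caller passes a pointer to the whole object (for example &a) together with sizeof a; read_int_at is a hypothetical helper, not something from the thread or from the standard library.

```c
#include <stddef.h>
#include <string.h>

/* Copy the k-th int stored in an object out through a byte pointer.
   `object` should point to the whole object (e.g. &a for int a[2][2])
   and `object_size` should be its sizeof. Returns 1 on success, or 0
   if k lies outside the object. */
int read_int_at(const void *object, size_t object_size, size_t k, int *out)
{
    const unsigned char *bytes = object;   /* walk the object as bytes */

    if (k >= object_size / sizeof *out)
        return 0;                          /* refuse to leave the object */

    /* memcpy moves the bytes into a real int, so no int * ever has to
       point past the end of a sub-array. */
    memcpy(out, bytes + k * sizeof *out, sizeof *out);
    return 1;
}
```

With int a[2][2] = {{1, 2}, {3, 4}}, a call such as read_int_at(&a, sizeof a, 3, &v) would be expected to store 4 in v, relying only on the contiguity of the array and on access through unsigned char.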

 P: n/a

Michael Mair wrote:
> Hi pete
>
> > I don't see what difference it makes whether or not object a even contains an int type. As long as object a is as big as an int and also aligned for type int, I can access the object as (*(int*)&a) regardless if a was declared as a structure or an array of floats.
> >
> > With int *b = (int *)&a; b[3] isn't accessing an element of a, b[3] is accessing the memory at (int*)&a + 3, and treating it as an object of type int.
>
> Assuming a in this case is not an array of any flavour (or that you would have used the appropriate &a[0]...[0]):

OK.

> That is exactly the point! b[3] or b+3 accesses this address but it is not guaranteed that it may do so! You just might try to access memory which you do not have access to as it does not belong to the object you pointed b to...

I disagree. new.c is a portable program.

    /* BEGIN new.c */
    #include <stdio.h>

    int main(void)
    {
        int a[2][2] = {{1, 2}, {3, 4}}, *b = (int *)&a;

        if (b == (int *)&a[1][1] - 3) {
            puts("There's no chance that this program "
                 "does not own the memory at b[3]");
        }
        return 0;
    }
    /* END new.c */

-- pete

Nov 14 '05 #46

 P: n/a

"Douglas A. Gwyn" wrote in message news:tb********************@comcast.com...
> Ivan A. Kosarev wrote:
> > Does this mean that pointers that are results of array-to-pointer conversion and any other pointers somehow differ?
>
> If the compiler can see the array context then it is allowed to take advantage of it, i.e. to generate

Does the "as if" rule still apply?

Nov 14 '05 #47

 P: n/a

Hi pete,

> > That is exactly the point! b[3] or b+3 accesses this address but it is not guaranteed that it may do so! You just might try to access memory which you do not have access to as it does not belong to the object you pointed b to...
>
> I disagree. new.c is a portable program.
>
>     /* BEGIN new.c */
>     #include <stdio.h>
>
>     int main(void)
>     {
>         int a[2][2] = {{1, 2}, {3, 4}}, *b = (int *)&a;
>
>         if (b == (int *)&a[1][1] - 3) {
>             puts("There's no chance that this program "
>                  "does not own the memory at b[3]");
>         }
>         return 0;
>     }
>     /* END new.c */

Well, I am not one of the "chapter and verse" types but in this case it would be nice if you could explain it to me in the words of the standard, as I do not understand how you arrive at the idea that this works portably.

"Proof by program" works only for counterexamples. So, if I had a strange compiler or a strange enough platform and we both agreed that the compiler conforms to the standard, I could prove my point by running the above and getting an alternative output (when I provide an "else"...)

Personally, I have shown my students the memory layout of a "two dimensional" array using the same means -- and always would, as it demonstrates this very well. However, I never assumed that this will work on some strange machine, too, and gave my students the "don't do this at home, children" warning...

Cheers, Michael

Nov 14 '05 #48
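One way to show the layout without indexing an int pointer past a[0] is to compare addresses as bytes. What follows is a hedged sketch, assuming the 2x2 int array from the thread; element_offset and print_layout are illustrative names only.

```c
#include <stddef.h>
#include <stdio.h>

/* Byte offset of (*whole)[i][j] from the start of the whole array.
   Both addresses are converted to unsigned char * and lie inside the
   same object, so subtracting them is well defined. The caller must
   keep i and j below 2. */
size_t element_offset(int (*whole)[2][2], size_t i, size_t j)
{
    const unsigned char *base = (const unsigned char *)whole;
    const unsigned char *elem = (const unsigned char *)&(*whole)[i][j];

    return (size_t)(elem - base);
}

/* Print where each element sits inside the array object. */
void print_layout(int (*whole)[2][2])
{
    for (size_t i = 0; i < 2; ++i)
        for (size_t j = 0; j < 2; ++j)
            printf("a[%zu][%zu] at byte offset %zu\n",
                   i, j, element_offset(whole, i, j));
}
```

Because arrays carry no padding between elements, print_layout(&a) should report offsets of 0, 1, 2 and 3 times sizeof(int), which demonstrates contiguity without relying on b[3] being a valid access.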

 P: n/a

In comp.lang.c Douglas A. Gwyn wrote:
> and partly it is to allow type-based nonaliasing assumptions to be made by compilers (another matter of code efficiency). It may be instructive to

Aha, I think I see what you mean.

>     long a[10], b[10], *p = a, *q = b;
>     if (a + 10 == b)
>         stmt_1
>     else
>         stmt_2

I think what you mean to say is that the compiler is allowed to assume that b[0] will not be changed between the first assignment and the if() condition, since p does not point within b[]:

    b[0] = 1;
    p[10] = 0;
    if (b[0])
        always_executed();
    else
        never_reached();

> It is hard to see what could go wrong with such tiny examples,

I think my example shows it.

> especially if you're not very familiar with segment addressing, but if you start to think in terms of arrays near, say, 64KB in size the problems may become more evident.

Yes, this is the second important problem. And, of course, these are the same reasons why similar things will not work for arrays of arrays.

One more question:

    int a[2][2] = {{0}};
    void *pa0, *pa;
    pa0 = a[0];
    pa = a;
    a[1][0] = 1;
    ((int*)pa0)[2] = 0; // (1)
    ((int*)pa)[2] = 0; // (2)

Is it true that after (1) the compiler *may* assume that a[1][0] is still 1, because pa0 is "based" on a[0] (and implicitly has access only to that sub-object), and after (2) it *has* *to* assume that something changed in the entire a[][] array (and hence has to re-read a[1][0]), because pa is "based" on a and has access to the whole object?

[ BTW. In my previous post I forgot to thank you for your explanations. Thanks to them I finally understood why the "struct hack" (as described in the Rationale 6.7.2.1) is not safe. ]

-- Stan Tobias sed 's/[A-Z]//g' to email

Nov 14 '05 #49

 P: n/a

"E. Robert Tisdale" wrote:
> Douglas A. Gwyn wrote:
> > Thus, some programmers have been getting away with this nonportable practice on the platforms they have used so far.
>
> If they are "getting away with" it, it's portable.

On the platform I currently use most, I can get away with

    TCHAR string[]=TEXT("This is a text");

and

    ok=(MessageBox(0, message, TEXT("Question"), MB_YESNO)!=IDYES);

Does that mean those are portable, too?

Richard

Nov 14 '05 #50
