# Pointer arithmetic question

Given the following:

    char *ptr1, *ptr2;
    size_t n;
    ptr2 = ptr1 + n;

Assuming ptr1 is a valid pointer, is the following guaranteed to be true?

    (ptr2 - ptr1) == n

What if n is greater than the size of the buffer to which ptr1 points? For example:

    char buf[10];
    char *pt = buf + 100;
    size_t n = (pt - buf);

Is n guaranteed to be 100? Or does the simple act of calculating an address off the end of the buffer (beyond the address that immediately follows the buffer) invoke UB?

-- Kenneth J. Brody | kenbrody/at\spamcop.net | www.hvcomputer.com | www.fptech.com

Jan 20 '06 #1

Kenneth Brody wrote:

> Given the following: char *ptr1, *ptr2; size_t n; ptr2 = ptr1 + n;

Undefined behaviour if n > 1. You must not dereference ptr2 if n == 1, but no overflow is generated. OK if n == 0.

> Assuming ptr1 is a valid pointer, is the following guaranteed to be true? (ptr2 - ptr1) == n
>
> What if n is greater than the size of the buffer to which ptr1 points? For example: char buf[10]; char *pt = buf + 100;

Undefined behaviour (same C&V as above). You can only go one past the end of the array, and then you must not try to dereference the resultant pointer.

> size_t n = (pt - buf);

Correct type for n is diffptr_t from stddef.h

> Is n guaranteed to be 100? Or, does the simple act of calculating an address off the end of the buffer (beyond the address that immediately follows the buffer) invoke UB?

Yes it does.

Cheers

Vladimir

PS C&V: 6.5.6.x, esp. 6.5.6.7-11.

-- My e-mail address is real, and I read it.

Jan 20 '06 #2

Vladimir S. Oka wrote:

> Kenneth Brody wrote:
> > size_t n = (pt - buf);
>
> Correct type for n is diffptr_t from stddef.h

Would've easily won fastest-fingers-first... :-(

It's ptrdiff_t from <stddef.h>, of course.

Sorry

Vladimir

Jan 20 '06 #3

Hello

Adding pointers is not just 1+1 = 2; the type of the pointer is important. Example:

    double *a, b[100];
    a = b;
    printf("b-a=%d", a-(a+1));

gives 4 because the size of double is 4 bytes. Subtraction is straightforward; addition is scaled by the size of the pointed-to type. So you can also do

    double a[100];

and now (&a[10] == a+10) gives TRUE; which type a has is not important.

Greetings

Jan 20 '06 #4

Vladimir S. Oka wrote:

> Kenneth Brody wrote:
> > Given the following: char *ptr1, *ptr2; size_t n; ptr2 = ptr1 + n;
>
> Undefined behaviour if n > 1. You must not dereference ptr2 if n == 1, but no overflow is generated. OK if n == 0.

IMHO, this is not quite correct. It depends where ptr1 is pointing to. If ptr1 is pointing to a single char object, then n must not be anything else than 0 or 1. If ptr1 is pointing somewhere into an array of char, then n is allowed to be something else than 1 or 0. You must only make sure that the resulting pointer points to a valid place in the array or one past the end. Any other location will trigger UB.

The relevant section is 6.5.6 paragraphs 7 and 8:

    For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.

    When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

Tom

Jan 20 '06 #5
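The rule quoted from 6.5.6 can be turned into a guard that refuses any addition landing beyond one past the end. This is only a sketch: `checked_advance` and its parameters are hypothetical names, and it assumes the caller knows the array's base and length and that `p` already points into the array or one past its end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the thread: advance p by n elements only if
   the result stays within [base, base + len], where base + len is the
   one-past-the-end position. Returns NULL when p + n would be undefined. */
char *checked_advance(char *base, size_t len, char *p, size_t n)
{
    /* Precondition (assumed): p points into the array or one past its end,
       so both the comparison and the subtraction below are well defined. */
    assert(p >= base && p <= base + len);

    size_t used = (size_t)(p - base);   /* elements already behind p */
    if (n > len - used)
        return NULL;                    /* would go beyond one past the end */
    return p + n;                       /* safe to compute; if it equals
                                           base + len, do not dereference */
}
```

With `char buf[10];`, the call `checked_advance(buf, sizeof buf, buf, 10)` yields `buf + 10`, while asking for an offset of 100 yields NULL without ever forming the out-of-range pointer.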

"mdler" writes:

> Adding pointers is not just 1+1 = 2; the type of the pointer is important. Example: double *a, b[100]; a = b; printf("b-a=%d", a-(a+1)); gives 4 because the size of double is 4 bytes

No. It prints -1. I assume that what you actually meant to write was something like

    printf("%td\n", &b[1] - &b[0]);

which prints 1 regardless of the value of sizeof(b[0]).

I suggest you read §6.5.6 carefully. To summarize, subtracting one pointer from another is only permitted when they both point to elements of the same array object, or one past the last element, and the result is the difference between the subscripts of the elements the pointers point to, so in effect:

    &a[i] - &a[j] == i - j

DES

-- Dag-Erling Smørgrav - de*@des.no

Jan 20 '06 #6
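To make the subscript rule concrete, here is a small sketch contrasting the element distance with the byte distance. The function names `element_distance` and `byte_distance` are made up for illustration, and both assume `i` and `j` are valid subscripts (or one past the end) of the same array.

```c
#include <stddef.h>

/* Hypothetical helper: distance in elements between a[j] and a[i].
   By 6.5.6 this is i - j, independent of sizeof(double). */
ptrdiff_t element_distance(const double *a, size_t i, size_t j)
{
    return &a[i] - &a[j];
}

/* Hypothetical helper: the same distance measured in bytes, obtained by
   viewing the elements through char pointers. */
ptrdiff_t byte_distance(const double *a, size_t i, size_t j)
{
    return (const char *)&a[i] - (const char *)&a[j];
}
```

For `double b[100];`, `element_distance(b, 1, 0)` is 1 and `element_distance(b, 0, 1)` is -1, while `byte_distance(b, 1, 0)` equals `sizeof(double)`. A `ptrdiff_t` result is printed with `%td` in C99.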

Thomas Maier-Komor wrote:

> Vladimir S. Oka wrote:
> > Kenneth Brody wrote:
> > > Given the following: char *ptr1, *ptr2; size_t n; ptr2 = ptr1 + n;
> >
> > Undefined behaviour if n > 1. You must not dereference ptr2 if n == 1, but no overflow is generated. OK if n == 0.
>
> IMHO, this is not quite correct. It depends where ptr1 is pointing to. If ptr1 is pointing to a single char object, then n must not be anything else than 0 or 1. If ptr1 is pointing somewhere into an array of char, then n is allowed to be something else than 1 or 0. You must only make sure that the resulting pointer points to a valid place in the array or one past the end. Any other location will trigger UB.

What you're saying is entirely correct (I point to same C&V). However, given what Kenneth posted, ptr1 and ptr2 point to a char object, not an array. They may be made to point to an array of char, but there's no telling whether they are in this case.

Cheers

Vladimir

Jan 20 '06 #7

In article <43**************@spamcop.net>, Kenneth Brody wrote:

> Given the following: char *ptr1, *ptr2; size_t n; ptr2 = ptr1 + n;

Stop right there! This is only allowed if it doesn't point beyond the end of the object that ptr1 points into.

> Assuming ptr1 is a valid pointer, is the following guaranteed to be true? (ptr2 - ptr1) == n

If the above condition holds, yes.

> What if n is greater than the size of the buffer to which ptr1 points?

No, and you've already gone wrong when you do the addition.

> Or, does the simple act of calculating an address off the end of the buffer (beyond the address that immediately follows the buffer) invoke UB?

Yes, exactly. Of course, it works perfectly well with natural C implementations on linear address-space machines.

-- Richard

Jan 20 '06 #8

Richard Tobin wrote:

> In article <43**************@spamcop.net>, Kenneth Brody wrote:
> > Given the following: char *ptr1, *ptr2; size_t n; ptr2 = ptr1 + n;
>
> Stop right there! This is only allowed if it doesn't point beyond the end of the object that ptr1 points into.

One past the end is allowed, too. Think of, e.g.

    while (*(ptr1++) != '\0')

Cheers

Michael

-- E-Mail: Mine is an /at/ gmx /dot/ de address.

Jan 20 '06 #9
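The loop idiom mentioned here can be wrapped in a function to show where the pointer ends up. This is a sketch only; `past_terminator` is a hypothetical name, and it assumes `s` points to a properly terminated string.

```c
#include <stddef.h>

/* Hypothetical helper: walk the string with the post-increment idiom and
   return the pointer as the loop leaves it, i.e. just past the '\0'. */
const char *past_terminator(const char *s)
{
    while (*(s++) != '\0')
        ;               /* empty body: the increment does all the work */
    return s;           /* may be one past the end of the array: valid to
                           hold and compare, not to dereference */
}
```

For `char word[3] = "hi";` the terminator sits in the last element, so the returned pointer is `word + 3`, exactly one past the end of the array, which is the case the standard permits.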

In article <43*************@individual.net>, Michael Mair wrote:

> > Stop right there! This is only allowed if it doesn't point beyond the end of the object that ptr1 points into.
>
> One past the end is allowed, too. Think of, e.g. while (*(ptr1++) != '\0')

I was considering that as pointing to the end, but thanks for clarifying.

-- Richard

Jan 20 '06 #10

Richard Tobin wrote:

> In article <43**************@spamcop.net>, Kenneth Brody wrote:
> > Given the following: char *ptr1, *ptr2; size_t n; ptr2 = ptr1 + n;
>
> Stop right there! This is only allowed if it doesn't point beyond the end of the object that ptr1 points into.

Yes, I realize that. I was wondering if the mere calculation would introduce UB, since I will never actually dereference the bad pointer. Given the replies I've seen, the answer appears to be "yes, it does".

> > Assuming ptr1 is a valid pointer, is the following guaranteed to be true? (ptr2 - ptr1) == n
>
> If the above condition holds, yes.
>
> > What if n is greater than the size of the buffer to which ptr1 points?
>
> No, and you've already gone wrong when you do the addition.
>
> > Or, does the simple act of calculating an address off the end of the buffer (beyond the address that immediately follows the buffer) invoke UB?
>
> Yes, exactly. Of course, it works perfectly well with natural C implementations on linear address-space machines.

Unfortunately, we all know that "works on system X" does not mean that it's valid code.

I was hoping to be able to take some existing code and enhance the functionality (basically, building an array of struct which include pointers into a buffer, where the buffer size used to be a known quantity before the calculations, to one where the size wouldn't be known until after building the array) without having to change the basic interface. Looks like I'll have to take a different approach. (And I'm not going to resort to storing (int)offset in a (char *)ptr, which also "works" on these systems.)

Thanks to all those who responded.

-- Kenneth J. Brody

Jan 20 '06 #11
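One possible "different approach", offered purely as an illustration and not as what the poster ended up doing, is to record offsets in a proper integer field while the array of structs is built, and convert them to pointers only once the buffer and its size exist. The `struct field` layout and the `field_ptr` helper below are assumptions invented for this sketch.

```c
#include <stddef.h>

/* Hypothetical record: stores where a field lives as an offset and length,
   so nothing is computed relative to a buffer that does not exist yet. */
struct field {
    size_t offset;   /* start of the field within the buffer */
    size_t length;   /* number of bytes the field occupies */
};

/* Hypothetical helper: resolve a field to a pointer once the buffer is
   known. Returns NULL if the field does not fit, so no out-of-range
   pointer is ever formed. */
char *field_ptr(char *buf, size_t buf_size, const struct field *f)
{
    if (f->offset > buf_size || f->length > buf_size - f->offset)
        return NULL;
    return buf + f->offset;   /* at most one past the end */
}
```

The checks are written as subtractions on `size_t` values that are already known to be in range, which avoids both pointer overflow and unsigned wrap-around in `offset + length`.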

"Vladimir S. Oka" writes:

> Thomas Maier-Komor wrote:
> > Vladimir S. Oka wrote:
> > > Kenneth Brody wrote:
> > > > Given the following: char *ptr1, *ptr2; size_t n; ptr2 = ptr1 + n;
> > >
> > > Undefined behaviour if n > 1. You must not dereference ptr2 if n == 1, but no overflow is generated. OK if n == 0.
> >
> > IMHO, this is not quite correct. It depends where ptr1 is pointing to. If ptr1 is pointing to a single char object, then n must not be anything else than 0 or 1. If ptr1 is pointing somewhere into an array of char, then n is allowed to be something else than 1 or 0. You must only make sure that the resulting pointer points to a valid place in the array or one past the end. Any other location will trigger UB.
>
> What you're saying is entirely correct (I point to same C&V). However, given what Kenneth posted, ptr1 and ptr2 point to a char object, not an array. They may be made to point to an array of char, but there's no telling whether they are in this case.

Given what Kenneth posted, we have no idea what ptr1 and ptr2 point to. If we take the code snippet literally, they're both uninitialized, and any attempt to refer to the value of either invokes undefined behavior, but we can reasonably assume that they're initialized to *something*. In answering his question, we should take all possibilities into account, particularly since char* pointers are most commonly used to point to arrays rather than single char objects.

-- Keith Thompson (The_Other_Keith) ks***@mib.org San Diego Supercomputer Center <*> We must do something. This is something. Therefore, we must do this.

Jan 20 '06 #12

Vladimir S. Oka wrote:

> Thomas Maier-Komor wrote:
> > Vladimir S. Oka wrote:
> > > Kenneth Brody wrote:
> > > > Given the following: char *ptr1, *ptr2; size_t n; ptr2 = ptr1 + n;
> > >
> > > Undefined behaviour if n > 1. You must not dereference ptr2 if n == 1, but no overflow is generated. OK if n == 0.
> >
> > IMHO, this is not quite correct. It depends where ptr1 is pointing to. If ptr1 is pointing to a single char object, then n must not be anything else than 0 or 1. If ptr1 is pointing somewhere into an array of char, then n is allowed to be something else than 1 or 0. You must only make sure that the resulting pointer points to a valid place in the array or one past the end. Any other location will trigger UB.
>
> What you're saying is entirely correct (I point to same C&V). However, given what Kenneth posted, ptr1 and ptr2 point to a char object, not an array. They may be made to point to an array of char, but there's no telling whether they are in this case.
>
> Cheers
>
> Vladimir

OK - I agree. But if you don't know where ptr1 is pointing, you cannot safely assume that it is a char object. Besides an object in a char array, it could also be a null pointer. However, in the case of ptr1 being null, n must be 0. For any other value of n you will get UB.

Cheers,

Tom

Jan 20 '06 #13

Thomas Maier-Komor writes:

> [...] OK - I agree. But if you don't know where ptr1 is pointing, you cannot assume safely that it is a char object. Beside an object in a char array, it could also be a null pointer. However, in the case of ptr1 being null, n must be 0. For any other value of n you will get UB.

Actually, adding 0 to a null pointer invokes undefined behavior:

    If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

It's likely to quietly yield a null pointer on most implementations, but the standard doesn't require it.

-- Keith Thompson

Jan 20 '06 #14

Keith Thompson wrote:

> "Vladimir S. Oka" writes:
> > [...] However, given what Kenneth posted, ptr1 and ptr2 point to a char object, not an array. They may be made to point to an array of char, but there's no telling whether they are in this case.
>
> Given what Kenneth posted, we have no idea what ptr1 and ptr2 point to. If we take the code snippet literally, they're both uninitialized, and any attempt to refer to the value of either invokes undefined behavior, but we can reasonably assume that they're initialized to *something*. In answering his question, we should take all possibilities into account, particularly since char* pointers are most commonly used to point to arrays rather than single char objects.

Yes, the first part was just a snippet to explain the principle, with the assumption that people would understand that ptr1 and n would contain valid values. I also included a specific example:

    char buf[10];
    char *pt = buf + 100;
    size_t n = (pt - buf);

-- Kenneth J. Brody

Jan 20 '06 #15

Kenneth Brody wrote:

> char buf[10]; char *pt = buf + 100; size_t n = (pt - buf);

For that example:

    (buf + 10) is defined
    (buf + 11) is undefined
    buf[9] is defined
    buf[10] is undefined

-- pete

Jan 21 '06 #16
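The four cases listed above are what make the usual "end pointer" loop legal: `buf + len` may be computed and compared against, but the loop never dereferences it. The following is a sketch with a hypothetical function name, `count_nonzero`, assuming `buf` really has `len` elements.

```c
#include <stddef.h>

/* Hypothetical helper: count the non-zero bytes in buf[0..len-1]. */
size_t count_nonzero(const char *buf, size_t len)
{
    const char *end = buf + len;   /* one past the end: defined to compute */
    size_t count = 0;

    /* p is dereferenced only while p != end, so buf[len] is never read. */
    for (const char *p = buf; p != end; ++p) {
        if (*p != 0)
            ++count;
    }
    return count;
}
```

With `char buf[10] = {1, 0, 2};` the remaining elements are zero-initialized, so `count_nonzero(buf, sizeof buf)` returns 2.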

Keith Thompson wrote:

> Thomas Maier-Komor writes:
> > [...] OK - I agree. But if you don't know where ptr1 is pointing, you cannot assume safely that it is a char object. Beside an object in a char array, it could also be a null pointer. However, in the case of ptr1 being null, n must be 0. For any other value of n you will get UB.
>
> Actually, adding 0 to a null pointer invokes undefined behavior:
>
>     If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
>
> It's likely to quietly yield a null pointer on most implementations, but the standard doesn't require it.

I understand your reasoning, but I doubt that the quoted paragraph means what you are saying, because the null pointer does not point to an object. The null pointer is a little bit special in many respects. I am unsure if it is special in this context. Maybe you are right, but then there must be a paragraph somewhere stating what happens when adding an integer to a null pointer. I don't have the standard at hand right now, but maybe on Monday I will search it a little more deeply for an appropriate paragraph.

Cheers,

Tom

Jan 21 '06 #17

Thomas Maier-Komor wrote:

> Keith Thompson wrote:
> > Thomas Maier-Komor writes:
> > > [...] OK - I agree. But if you don't know where ptr1 is pointing, you cannot assume safely that it is a char object. Beside an object in a char array, it could also be a null pointer. However, in the case of ptr1 being null, n must be 0. For any other value of n you will get UB.
> >
> > Actually, adding 0 to a null pointer invokes undefined behavior:
> >
> >     If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
> >
> > It's likely to quietly yield a null pointer on most implementations, but the standard doesn't require it.
>
> I understand your reasoning, but I doubt that the quoted paragraph means what you are saying, because the null pointer does not point to an object. The null pointer is a little bit special in many concerns. I am unsure if it is special in this context. Maybe you are right, but then there must be a paragraph somewhere stating what happens when adding an integer to a null pointer. I don't have the standard at hand right now, but maybe I will search it on a Monday a little bit deeper for an appropriate paragraph.

I believe there is no such paragraph defining pointer arithmetic on the null pointer, although it is of course impossible to prove a negative. After all, of what use is it given that you have the offsetof macro and sizeof operator available to you? Why define something that is not needed?

-- Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.

Jan 21 '06 #18

Thomas Maier-Komor writes:

> Keith Thompson wrote:
> > Thomas Maier-Komor writes:
> > > [...] OK - I agree. But if you don't know where ptr1 is pointing, you cannot assume safely that it is a char object. Beside an object in a char array, it could also be a null pointer. However, in the case of ptr1 being null, n must be 0. For any other value of n you will get UB.
> >
> > Actually, adding 0 to a null pointer invokes undefined behavior:
> >
> >     If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
> >
> > It's likely to quietly yield a null pointer on most implementations, but the standard doesn't require it.
>
> I understand your reasoning, but I doubt that the quoted paragraph means what you are saying, because the null pointer does not point to an object. The null pointer is a little bit special in many concerns. I am unsure if it is special in this context. Maybe you are right, but then there must be a paragraph somewhere stating what happens when adding an integer to a null pointer. I don't have the standard at hand right now, but maybe I will search it on a Monday a little bit deeper for an appropriate paragraph.

You're right, the quoted paragraph doesn't say that adding an integer to a null pointer invokes undefined behavior. It's implied by the fact that the standard doesn't define the behavior (and if it did, that paragraph, or one in the same section, would be the place to do it).

Undefined behavior includes cases where the standard fails to state what the behavior is.

-- Keith Thompson

Jan 21 '06 #19

Flash Gordon writes:

> [...] I believe there is no such paragraph defining pointer arithmetic on the null pointer, although it is of course impossible to prove a negative. After all, of what use is it given that you have the offsetof macro and sizeof operator available to you? Why define something that is not needed?

In this case, it's quite possible to prove a negative, since the standard is finite.

-- Keith Thompson

Jan 21 '06 #20

Keith Thompson wrote:

> [ too much snippage ]
>
> Undefined behavior includes cases where the standard fails to state what the behavior is.

This is too perfect for further comment. Go Keith!

-- Joe Wright
"Everything should be made as simple as possible, but not simpler." --- Albert Einstein ---

Jan 22 '06 #21

Joe Wright writes:

> Keith Thompson wrote:
> > [ too much snippage ]
> >
> > Undefined behavior includes cases where the standard fails to state what the behavior is.
>
> This is too perfect for further comment. Go Keith!

Thanks -- but I'm going to comment further myself. I was summarizing C99 4p2, which says:

    If a "shall" or "shall not" requirement that appears outside of a constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words "undefined behavior" or by the omission of any explicit definition of behavior. There is no difference in emphasis among these three; they all describe "behavior that is undefined".

-- Keith Thompson

Jan 22 '06 #22

On Fri, 20 Jan 2006 20:24:05 GMT, Keith Thompson wrote:

> Thomas Maier-Komor writes:
> > [...] OK - I agree. But if you don't know where ptr1 is pointing, you cannot assume safely that it is a char object. Beside an object in a char array, it could also be a null pointer. However, in the case of ptr1 being null, n must be 0. For any other value of n you will get UB.
>
> Actually, adding 0 to a null pointer invokes undefined behavior:
>
>     If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
>
> It's likely to quietly yield a null pointer on most implementations, but the standard doesn't require it.

I would say that the behaviour is defined for NULL+0. The standard (C99 6.3.2.3 3) says any constant integer expression == 0 is equivalent to a NULL pointer.

Jim

Jan 23 '06 #23

Jim writes:

> On Fri, 20 Jan 2006 20:24:05 GMT, Keith Thompson wrote:
> > Thomas Maier-Komor writes:
> > > [...] OK - I agree. But if you don't know where ptr1 is pointing, you cannot assume safely that it is a char object. Beside an object in a char array, it could also be a null pointer. However, in the case of ptr1 being null, n must be 0. For any other value of n you will get UB.
> >
> > Actually, adding 0 to a null pointer invokes undefined behavior:
> >
> >     If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
> >
> > It's likely to quietly yield a null pointer on most implementations, but the standard doesn't require it.
>
> I would say that the behaviour is defined for NULL+0. The standard (C99 6.3.2.3 3) says any constant integer expression == 0 is equivalent to a NULL pointer.

NULL+0 is a null pointer constant only if NULL is an integer constant expression. But we were discussing null pointers, not null pointer constants. Given:

    int *ptr = NULL; /* or 0, or '-'-'-' */

the expression

    ptr + 0

invokes undefined behavior.

-- Keith Thompson

Jan 23 '06 #24
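Taking the C99 reading argued in this thread at face value, code that may receive a null pointer can simply test for it before doing any arithmetic. The helper below is a hypothetical sketch (the name `offset_or_null` is not from the thread), and it leaves the usual bounds obligation with the caller.

```c
#include <stddef.h>

/* Hypothetical helper: add n to ptr, but never perform arithmetic on a
   null pointer, not even with n == 0. */
int *offset_or_null(int *ptr, size_t n)
{
    if (ptr == NULL)
        return NULL;    /* skip the addition entirely */
    return ptr + n;     /* caller must keep n within the array, or at most
                           one past its end */
}
```

With `int arr[4];`, `offset_or_null(arr, 4)` gives `arr + 4`, and `offset_or_null(NULL, 0)` returns a null pointer without evaluating `ptr + 0`.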

