
There is a thread currently active on this newsgroup entitled:

    "how to calculate the difference between 2 addresses ?"

The thread deals with calculating the distance, in bytes, between two memory addresses. Obviously, this can only be done if the addresses refer to elements or members of the same object (or base objects, etc.).

John Carson and I proposed two separate methods. I disagree with John's solution, and John disagrees with mine. Therefore, I'd like to present them both here and see what the audience thinks.

Firstly, we shall start off with a simple POD type:

    struct MyPOD {
        int a;
        double b;
        void *c;
        short d;
        bool e;
        int f;
    };

Given an object of this type, we shall calculate the distance, in bytes, between the "b" member and the "e" member.

My own method is as follows:

    reinterpret_cast(&obj.e) - reinterpret_cast(&obj.b)

John's method is as follows:

    reinterpret_cast(&obj.e) - reinterpret_cast
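A minimal sketch of the two approaches being compared, for readers who want something compilable. The cast target types are assumptions, not taken from the post: `char const*` for the pointer-subtraction method and `std::uintptr_t` (from C++11 `<cstdint>`, an optional type) for the integer method. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct MyPOD {
    int a;
    double b;
    void *c;
    short d;
    bool e;
    int f;
};

// Method 1 (assumed form): view both addresses as pointers to char and subtract.
inline std::ptrdiff_t distance_via_char(MyPOD const &obj) {
    return reinterpret_cast<char const *>(&obj.e)
         - reinterpret_cast<char const *>(&obj.b);
}

// Method 2 (assumed form): convert both addresses to an integer type and subtract.
// The pointer-to-integer mapping is implementation-defined.
inline std::ptrdiff_t distance_via_integer(MyPOD const &obj) {
    std::uintptr_t const hi = reinterpret_cast<std::uintptr_t>(&obj.e);
    std::uintptr_t const lo = reinterpret_cast<std::uintptr_t>(&obj.b);
    assert(hi >= lo);  // holds on flat-address platforms, where e lies after b
    return static_cast<std::ptrdiff_t>(hi - lo);
}
```

On a typical flat-memory implementation both functions return the same number; whether either is guaranteed by the Standard is exactly what the thread goes on to argue about.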
24 Replies

On Sun, 19 Nov 2006 20:05:11 GMT in comp.lang.c++, Frederick Gotham wrote:

> Given an object of this type, we shall calculate the distance, in bytes,
> between the "b" member and the "e" member.

    #include <cstddef>

    offsetof(MyPOD, e) - offsetof(MyPOD, b)

Nov 19 '06 #2

David Harmon:

> On Sun, 19 Nov 2006 20:05:11 GMT in comp.lang.c++, Frederick Gotham wrote:
>
>> Given an object of this type, we shall calculate the distance, in bytes,
>> between the "b" member and the "e" member.
>
>     #include <cstddef>
>
>     offsetof(MyPOD, e) - offsetof(MyPOD, b)

I'll rephrase the question:

Given two memory addresses in the form of pointers -- pointer types which may be different -- calculate the distance in bytes between them. The pointers refer to parts of the same object.

-- Frederick Gotham

Nov 19 '06 #3

Frederick Gotham wrote:

> David Harmon:
>
>> On Sun, 19 Nov 2006 20:05:11 GMT in comp.lang.c++, Frederick Gotham wrote:
>>
>>> Given an object of this type, we shall calculate the distance, in bytes,
>>> between the "b" member and the "e" member.
>>
>>     #include <cstddef>
>>
>>     offsetof(MyPOD, e) - offsetof(MyPOD, b)
>
> I'll rephrase the question:
>
> Given two memory addresses in the form of pointers -- pointer types which may be different -- calculate the distance in bytes between them. The pointers refer to parts of the same object.
>
> -- Frederick Gotham

Not that I'm trying deliberately to be a pain in the attic, but what do you mean by "between them"? That's not the same as offset.

    struct test { int n; int i; };

The distance in bytes between a test instance.n and instance.i would be zero, assuming no padding is involved.

Remember: To assume == makes an ASS out of U and ME.

Nov 19 '06 #4

Salt_Peter:

> Not that I'm trying deliberately to be a pain in the attic, but what do you mean by "between them"?

Let's say that a certain object is located at memory address 14. Let's say that another object is located at memory address 18. The distance between them is 4.

> That's not the same as offset.
>
>     struct test { int n; int i; };
>
> The distance in bytes between a test instance.n and instance.i would be zero, assuming no padding is involved.

We're just looking for the number of bytes between two addresses.

    Let's say that &obj.n == Memory Byte Address 56
    Let's say that &obj.i == Memory Byte Address 60

Therefore, the distance between them is 4 bytes.

> Remember: To assume == makes an ASS out of U and ME.

Should I understand that somehow?

-- Frederick Gotham

Nov 19 '06 #5

Frederick Gotham wrote:

> Salt_Peter:
>
>> Not that I'm trying deliberately to be a pain in the attic, but what do you mean by "between them"?
>
> Let's say that a certain object is located at memory address 14. Let's say that another object is located at memory address 18. The distance between them is 4.
>
>> That's not the same as offset.
>>
>>     struct test { int n; int i; };
>>
>> The distance in bytes between a test instance.n and instance.i would be zero, assuming no padding is involved.
>
> We're just looking for the number of bytes between two addresses.
>
>     Let's say that &obj.n == Memory Byte Address 56
>     Let's say that &obj.i == Memory Byte Address 60
>
> Therefore, the distance between them is 4 bytes.

There is no guarantee that converting a pointer to an integer value will produce the logical address of the referenced object. So neither of the two approaches is certain to be portable. In fact, the only portable approach available is to use the offsetof macro - either to calculate the distance between the start of a POD object and one of its members, or between any two members of the same object:

    std::abs( offsetof(MyPOD, e) - offsetof(MyPOD, b));

Greg

Nov 19 '06 #6
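One practical wrinkle with the offsetof approach: offsetof yields a std::size_t, so subtracting two offsets directly is unsigned arithmetic, and a "negative" result wraps around before std::abs ever sees it. The sketch below converts to a signed type first. The helper names are hypothetical, and it assumes the offsets fit in std::ptrdiff_t, which holds for any ordinary struct.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct MyPOD {
    int a;
    double b;
    void *c;
    short d;
    bool e;
    int f;
};

// Signed difference of two member offsets; converting before subtracting
// avoids unsigned wrap-around when `to` is the smaller offset.
inline std::ptrdiff_t offset_difference(std::size_t from, std::size_t to) {
    return static_cast<std::ptrdiff_t>(to) - static_cast<std::ptrdiff_t>(from);
}

// Unsigned-looking "gap" between two members, whichever order they are given in.
inline std::ptrdiff_t byte_gap(std::size_t x, std::size_t y) {
    return std::abs(offset_difference(x, y));
}

// Distance from member b to member e of MyPOD.
inline std::ptrdiff_t distance_b_to_e() {
    std::ptrdiff_t const d =
        offset_difference(offsetof(MyPOD, b), offsetof(MyPOD, e));
    assert(d > 0);  // b is declared before e, so e has the larger offset
    return d;
}
```

Calling `byte_gap(offsetof(MyPOD, e), offsetof(MyPOD, b))` gives the same value as `distance_b_to_e()` regardless of argument order.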

Greg:

> There is no guarantee that converting a pointer to an integer value will produce the logical address of the referenced object. So neither of the two approaches is certain to be portable.

My claim is that the char* method is perfect.

    #include <cstddef>

    template<class A, class B>
    std::ptrdiff_t BytesBetween(A const &a, B const &b)
    {
        // View both addresses as pointers to char, then subtract.
        return reinterpret_cast<char const volatile*>(&b)
             - reinterpret_cast<char const volatile*>(&a);
    }

Of course, both "a" and "b" must refer to parts of the same object.

-- Frederick Gotham

Nov 20 '06 #7

 P: n/a "Frederick Gotham" (&obj.e) - reinterpret_cast(&obj.b) John's method is as follows: reinterpret_cast(&obj.e) - reinterpret_cast

Frederick Gotham wrote:

> Greg:
>
>> There is no guarantee that converting a pointer to an integer value will produce the logical address of the referenced object. So neither of the two approaches is certain to be portable.
>
> My claim is that the char* method is perfect.
>
>     #include <cstddef>
>
>     template<class A, class B>
>     std::ptrdiff_t BytesBetween(A const &a, B const &b)
>     {
>         return reinterpret_cast<char const volatile*>(&b)
>              - reinterpret_cast<char const volatile*>(&a);
>     }
>
> Of course, both "a" and "b" must refer to parts of the same object.

In order to subtract pointer a from pointer b, both a and b must point to the same kind of object, and the objects that they point to must both be elements of the same array. Since the BytesBetween() function template observes neither of these requirements, there is no guarantee that its behavior will be defined.

"Unless both pointers point to elements of the same array object, or one past the last element of the array object, the behavior is undefined." [§5.7/6]

C++ would not need the offsetof macro if there were another, portable way to calculate the distance between two members of an object.

Greg

Nov 20 '06 #9

Greg wrote:

> C++ would not need the offsetof macro if there were another, portable way to calculate the distance between two members of an object.

That seems incorrect: the difficulty with the offsetof macro is the need for compile-time evaluation. That makes it impossible to create an instance of the struct type and measure offsets of its members. Thus, even if you had a perfectly fine method of computing distances of members of an object, it would not help in writing an offsetof macro.

Best

Kai-Uwe Bux

Nov 20 '06 #10

On Sun, 19 Nov 2006 20:58:34 GMT in comp.lang.c++, Frederick Gotham wrote:

> I'll rephrase the question:

I'll still dodge it. Eschew undefined behavior. Cast not thy pointers into the void.

Nov 20 '06 #11

Kai-Uwe Bux wrote:

> Greg wrote:
>
>> C++ would not need the offsetof macro if there were another, portable way to calculate the distance between two members of an object.
>
> That seems incorrect: the difficulty with the offsetof macro is the need for compile-time evaluation. That makes it impossible to create an instance of the struct type and measure offsets of its members. Thus, even if you had a perfectly fine method of computing distances of members of an object, it would not help in writing an offsetof macro.

Counting the number of bytes from the start of an object to one of its members is not the only way to express the distance. But since the requirement in this case is to provide a byte measurement of the distance, the offsetof macro is the only portable way to obtain that figure.

Requiring that the offset of a class member be expressed in bytes is of course a completely artificial constraint - no C++ program would ever face such a limitation. After all, no program calls offsetof simply to obtain a number. Instead, the number that offsetof returns is useful only insofar as the program can use that value to gain access to the specified class member given a pointer to a class object.

In C++, member access through an object pointer is already possible by applying a member pointer to the object pointer. A member pointer essentially abstracts the offset of a class member, and hides the implementation details from the C++ program. So although a C++ program cannot recover the byte distance of the offset that is stored within a member pointer, a member pointer is still more useful than the offsetof macro, since a member pointer is not limited to members of POD classes.

Greg

Nov 20 '06 #12
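To make the member-pointer alternative concrete, here is a small sketch. The Widget class and both helper functions are hypothetical names invented for illustration; the class is deliberately non-POD (it has a virtual destructor), which is the case where offsetof is not available but member pointers still work.

```cpp
#include <cassert>

// Hypothetical non-POD class: the virtual destructor rules out offsetof.
struct Widget {
    Widget() : id(0), weight(0.0) {}
    virtual ~Widget() {}
    int id;
    double weight;
};

// Read an int member selected at run time via a pointer to member.
inline int read_int_member(Widget const *w, int Widget::*member) {
    assert(w != 0);
    return w->*member;
}

// Write a double member selected at run time via a pointer to member.
inline void write_double_member(Widget *w, double Widget::*member, double value) {
    assert(w != 0);
    w->*member = value;
}
```

A caller passes `&Widget::id` or `&Widget::weight` as the selector; no byte offset is ever exposed, which is the abstraction being described.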

John Carson:

> I think they both involve implementation-defined behavior according to the Standard. Both will usually work in practice.

My own claim is that _my_ code is perfectly fine. I also claim that your code is not OK, even though I acknowledge it would work on a lot of systems. I could imagine a system which doesn't have 8-Bit bytes, but which has a layer between the machine and the C implementation that makes you think there are 8-Bit bytes. Let's say that the machine actually has 4-Bit bytes. When you cast to integer type and subtract, your result might be double what you thought it would be.

> Any pointer can be cast to char*. However, by Section 5.2.10/3:
>
> "The mapping performed by reinterpret_cast is implementation-defined. [Note: it might, or might not, produce a representation different from the original value. ]"

There are several exceptions to the whole "reinterpret_cast is a wild animal" idea. Casting to char* or void* is one of them. Another would be casting from a POD pointer to a pointer to the first member in the POD.

>> (1) The Standard doesn't necessitate the existence of an integer type large enough to accommodate a memory address.
>
> True, but not an issue on most platforms.

On every platform though, the char* subtraction will work.

> Moreover, Section 5.2.10/4 says that the conversion to an integer value "is intended to be unsurprising to those who know the addressing structure of the underlying machine", which provides an assurance of sorts for my preferred approach.

What if we're working with the 4-Bit system disguised as an 8-Bit system?

> Finally, I point out that the Standard doesn't guarantee an integer type large enough to store the result of the subtraction (See Section 5.7/6). Once again, both approaches rely on an implementation-defined feature (or on the choice of suitable addresses to compare).

Are you sure about that? The purpose of ptrdiff_t is to store the result of subtracting two pointers. Presumably, if the subtraction of the pointers is valid, then the type should be able to hold the value.

-- Frederick Gotham

Nov 20 '06 #13

Frederick Gotham:

> There are several exceptions to the whole "reinterpret_cast is a wild animal" idea. Casting to char* or void* is one of them. Another would be casting from a POD pointer to a pointer to the first member in the POD.

In the past, I've seen people so fearful of reinterpret_cast that they write:

    char *p = static_cast<char*>(static_cast<void*>(&obj));

I myself just write:

    char *p = (char*)&obj;

-- Frederick Gotham

Nov 20 '06 #14
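For comparison, the three spellings side by side, as a rough sketch. The function names are hypothetical, and the templates assume a pointer to a non-const, non-volatile object (the two-step static_cast would otherwise need matching qualifiers on `void` and `char`).

```cpp
#include <cassert>

// Two-step static_cast through void*.
template <class T>
inline char *as_bytes_static(T *p) {
    return static_cast<char *>(static_cast<void *>(p));
}

// Direct reinterpret_cast.
template <class T>
inline char *as_bytes_reinterpret(T *p) {
    return reinterpret_cast<char *>(p);
}

// C-style cast, which for these types performs a reinterpret_cast.
template <class T>
inline char *as_bytes_c_style(T *p) {
    return (char *)p;
}

// Returns the byte pointer after checking that all three spellings agree.
template <class T>
inline char *checked_as_bytes(T *p) {
    char *const bytes = as_bytes_static(p);
    assert(bytes == as_bytes_reinterpret(p));
    assert(bytes == as_bytes_c_style(p));
    return bytes;
}
```

On mainstream implementations all three produce the same pointer value, so the choice is largely one of style.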

"Frederick Gotham" wrote:

>>> (1) The Standard doesn't necessitate the existence of an integer type large enough to accommodate a memory address.
>>
>> True, but not an issue on most platforms.
>
> On every platform though, the char* subtraction will work.

The char* cast will work. The subtraction isn't guaranteed.

>> Moreover, Section 5.2.10/4 says that the conversion to an integer value "is intended to be unsurprising to those who know the addressing structure of the underlying machine", which provides an assurance of sorts for my preferred approach.
>
> What if we're working with the 4-Bit system disguised as an 8-Bit system?

I don't know, but the implementation should say what would happen.

>> Finally, I point out that the Standard doesn't guarantee an integer type large enough to store the result of the subtraction (See Section 5.7/6). Once again, both approaches rely on an implementation-defined feature (or on the choice of suitable addresses to compare).
>
> Are you sure about that? The purpose of ptrdiff_t is to store the result of subtracting two pointers. Presumably, if the subtraction of the pointers is valid, then the type should be able to hold the value.

I can only go by the Standard, which I have already quoted in the previous thread. The result of such a subtraction is a signed type and as such has a maximum absolute value only half the size of the largest value supported by the corresponding unsigned type. If addresses can have any value covered by the unsigned type, this creates the possibility of overflow.

-- John Carson

Nov 20 '06 #15
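The overflow concern can be illustrated with plain integer arithmetic. This is only a sketch with hypothetical helper names; it assumes, as on most current platforms, that std::size_t and std::ptrdiff_t have the same width, so the signed type covers roughly half the unsigned range.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Largest distance std::ptrdiff_t can represent, expressed as an unsigned value.
inline std::size_t max_representable_distance() {
    return static_cast<std::size_t>(std::numeric_limits<std::ptrdiff_t>::max());
}

// True if a span of `bytes` bytes is too large for every internal distance
// to be representable as a std::ptrdiff_t.
inline bool span_may_overflow(std::size_t bytes) {
    return bytes > max_representable_distance();
}

// Signed distance between two byte offsets, checked against the ptrdiff_t range.
inline std::ptrdiff_t checked_distance(std::size_t from, std::size_t to) {
    std::size_t const magnitude = to >= from ? to - from : from - to;
    assert(magnitude <= max_representable_distance());
    std::ptrdiff_t const d = static_cast<std::ptrdiff_t>(magnitude);
    return to >= from ? d : -d;
}
```

For small offsets such as 56 and 60 the check is trivially satisfied; it only matters for objects approaching half the address space.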

John Carson:

(Referring to pointer arithmetic)

> The result of such a subtraction is a signed type and as such has a maximum absolute value only half the size of the largest value supported by the corresponding unsigned type. If addresses can have any value covered by the unsigned type, this creates the possibility of overflow.

I think though that this argument can be countered by a combination of the following excerpts from the Standard.

3.9/2

    For any object (other than a base-class subobject) of POD type T, whether or not the object holds a valid value of type T, the underlying bytes (1.7) making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value.

Therefore, we can do the following:

    double arr[64] = { ... };

    char unsigned buf[sizeof arr];

    memcpy(buf,arr,sizeof buf);

The array object, "buf", is a fully-fledged object.

Now let's read about ptrdiff_t:

5.7/6

    When two pointers to elements of the same array object are subtracted, the result is the difference of the subscripts of the two array elements. The type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as ptrdiff_t in the <cstddef> header (18.1). As with any other arithmetic overflow, if the result does not fit in the space provided, the behavior is undefined. In other words, if the expressions P and Q point to, respectively, the i-th and j-th elements of an array object, the expression (P)-(Q) has the value i-j provided the value fits in an object of type ptrdiff_t.

I'm glad to see we're agreed that the casting to char* is OK. What I find annoying though is the situation with ptrdiff_t... I'm going to take this over to comp.std.c++.

-- Frederick Gotham

Nov 20 '06 #16
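The two excerpts can be exercised together in a short sketch: copy a POD array into an unsigned char buffer and back, then subtract pointers inside the buffer, where both operands are unquestionably elements of the same array. The function names and the four-element array are illustrative choices only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy the bytes of a POD array into an unsigned char buffer and back again,
// and report whether the original values survived the round trip.
inline bool round_trip_preserves_values() {
    double arr[4] = { 1.0, 2.5, -3.0, 4.25 };
    unsigned char buf[sizeof arr];
    std::memcpy(buf, arr, sizeof buf);

    double copy[4];
    std::memcpy(copy, buf, sizeof buf);
    return std::memcmp(arr, copy, sizeof arr) == 0;
}

// Byte distance between the positions of element i and element j of a
// 4-element double array, measured inside an unsigned char buffer of the
// same size, so the subtraction is ordinary same-array pointer arithmetic.
inline std::ptrdiff_t bytes_between_elements(std::size_t i, std::size_t j) {
    unsigned char buf[sizeof(double) * 4] = {};
    assert(i <= 4 && j <= 4);  // index 4 is the one-past-the-end position
    unsigned char const *p = buf + i * sizeof(double);
    unsigned char const *q = buf + j * sizeof(double);
    return q - p;
}
```

The remaining question in the thread is whether the same subtraction is sanctioned when the char pointers are obtained by casting into the original object rather than into a copy.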

 P: n/a Frederick Gotham

Steve Pope:

>> Given two memory addresses in the form of pointers -- pointer types which may be different -- calculate the distance in bytes between them. The pointers refer to parts of the same object.
>
> You can't.
>
> You can only subtract pointers if they are pointing to the same type of object, and then only if the pointed-to objects are elements of the same array of such objects.
>
> And even then, you will not necessarily get the distance in bytes.
>
> Just my opinion.

I don't see why there would be anything wrong with the following:

    #include <cstddef>

    struct SomePOD { int a; char b; int arr[5]; };

    struct Base { double a; SomePOD b; void *c; };

    struct Derived : Base { double d; Base e; };

    template<class A, class B>
    std::ptrdiff_t BytesBtwn(A const *const p, B const *const q)
    {
        // Convert both pointers to pointers to char, then subtract.
        return (char const volatile*)q - (char const volatile*)p;
    }

    int main()
    {
        Derived const volatile obj = Derived();

        std::ptrdiff_t const i = BytesBtwn(obj.b.arr+2, &obj.e.b.a);
    }

-- Frederick Gotham

Nov 20 '06 #18

Frederick Gotham wrote:

> Steve Pope:
>
>> You can only subtract pointers if they are pointing to the same type of object, and then only if the pointed-to objects are elements of the same array of such objects.
>>
>> And even then, you will not necessarily get the distance in bytes.
>>
>> Just my opinion.
>
> I don't see why there would be anything wrong with the following:
>
>     #include <cstddef>
>
>     struct SomePOD { int a; char b; int arr[5]; };
>
>     struct Base { double a; SomePOD b; void *c; };
>
>     struct Derived : Base { double d; Base e; };
>
>     template<class A, class B>
>     std::ptrdiff_t BytesBtwn(A const *const p, B const *const q)
>     {
>         return (char const volatile*)q - (char const volatile*)p;
>     }
>
>     int main()
>     {
>         Derived const volatile obj = Derived();
>
>         std::ptrdiff_t const i = BytesBtwn(obj.b.arr+2, &obj.e.b.a);
>     }

This would not give the difference in bytes on architectures for which the address of an int is a word address.

(Now, I admit not having seen such an architecture for 20 years or so, but they may still be around.)

Steve

Nov 20 '06 #19

 P: n/a Steve Pope: This would not give the difference in bytes on architectures for which the address of an int is a word address. Sorry I don't understand, could you please explain that? -- Frederick Gotham Nov 20 '06 #20

Frederick Gotham wrote:

> Steve Pope:
>
>> This would not give the difference in bytes on architectures for which the address of an int is a word address.
>
> Sorry I don't understand, could you please explain that?

Picture a computer memory that is both byte-addressable and word-addressable, where a word is four bytes. The word address 1000 (decimal) would address a word containing the four bytes at byte addresses 4000, 4001, 4002, and 4003 (decimal). I don't know of any modern machines that do this, but it has been done and it can save code space.

Steve

Nov 20 '06 #21
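The arithmetic of that hypothetical machine can be modelled with plain integers. This is only a sketch: the constant and function names are invented for illustration, and the four-bytes-per-word figure is the one given in the example above.

```cpp
#include <cassert>
#include <cstddef>

// Assumption from the example: one word holds four bytes.
std::size_t const kBytesPerWord = 4;

// Byte address of the first byte stored in the given word.
inline std::size_t first_byte_of_word(std::size_t word_address) {
    return word_address * kBytesPerWord;
}

// Byte address of the last byte stored in the given word.
inline std::size_t last_byte_of_word(std::size_t word_address) {
    return first_byte_of_word(word_address) + kBytesPerWord - 1;
}

// Subtracting raw word addresses counts words; scale to get bytes.
inline std::size_t byte_distance_between_words(std::size_t from_word,
                                               std::size_t to_word) {
    assert(to_word >= from_word);
    return (to_word - from_word) * kBytesPerWord;
}
```

Word addresses 1000 and 1002 differ by 2 as raw numbers but are 8 bytes apart, which is why a pointer conversion on such a machine has to do real work rather than reuse the bit pattern.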

 P: n/a Steve Pope: Picture a computer memory that is both byte-addressable and word-addressable, where a word is four bytes. The word address 1000 (decimal) would address a word containing the four bytes at byte addresses 4000, 4001, 4002, and 4003 (decimal). I don't know of any modern machines that do this, but it has been done and it can save code space. But I'm converting everything to char* beforehand, shouldn't that sort everything out? -- Frederick Gotham Nov 20 '06 #22

Steve Pope wrote:

> Frederick Gotham wrote:
>
>> Steve Pope:
>>
>>> You can only subtract pointers if they are pointing to the same type of object, and then only if the pointed-to objects are elements of the same array of such objects.
>>>
>>> And even then, you will not necessarily get the distance in bytes.
>>>
>>> Just my opinion.
>>
>> I don't see why there would be anything wrong with the following:
>>
>>     #include <cstddef>
>>
>>     struct SomePOD { int a; char b; int arr[5]; };
>>
>>     struct Base { double a; SomePOD b; void *c; };
>>
>>     struct Derived : Base { double d; Base e; };
>>
>>     template<class A, class B>
>>     std::ptrdiff_t BytesBtwn(A const *const p, B const *const q)
>>     {
>>         return (char const volatile*)q - (char const volatile*)p;
>>     }
>>
>>     int main()
>>     {
>>         Derived const volatile obj = Derived();
>>
>>         std::ptrdiff_t const i = BytesBtwn(obj.b.arr+2, &obj.e.b.a);
>>     }
>
> This would not give the difference in bytes on architectures for which the address of an int is a word address.

I suspect that it just might work, either because a C++ byte will be the same size as a word, or because casting to a char pointer will not be a reinterpret_cast but will involve an actual conversion. In either case you will get a byte distance, for an implementation-specific definition of a byte.

> (Now, I admit not having seen such an architecture for 20 years or so, but they may still be around.)

We have others that are word-addressable and use special part-word operations to access individual characters. That makes the above casts even more interesting. :-)

Bo Persson

Nov 20 '06 #23

Frederick Gotham wrote:

> Steve Pope:
>
>> Picture a computer memory that is both byte-addressable and word-addressable, where a word is four bytes. The word address 1000 (decimal) would address a word containing the four bytes at byte addresses 4000, 4001, 4002, and 4003 (decimal). I don't know of any modern machines that do this, but it has been done and it can save code space.
>
> But I'm converting everything to char* beforehand, shouldn't that sort everything out?

I'm not sure the language requires that an expression like

    (char *) pint

where pint is a pointer to int, does the required conversion.

It seems though that things like malloc() would not generally work properly if conversions like this were not done as one would naturally expect, so maybe you can rely on it.

Steve

Nov 20 '06 #24

Steve Pope:

> I'm not sure the language requires that an expression like
>
>     (char *) pint
>
> where pint is a pointer to int, does the required conversion.

I think it does. If the Standard doesn't explicitly state this, then it should. I believe the Standard says somewhere that "void*" and "char*" must have identical representation. (Let's forget for the moment that we're not allowed to access multiple members of a union.)

    union ByteAddress { void *pv; char *pc; };

    int i;

    ByteAddress n;

    n.pv = &i; /* This is definitely OK */

"pc" and "pv" should be identical right now. Therefore, we could do:

    int arr[2];

    ByteAddress a,b;

    a.pv = arr;
    b.pv = arr+1;

    ptrdiff_t i = b.pc - a.pc;

(Again, I acknowledge that the Standard forbids use of unions in this fashion.)

Anyway, I digress. If you don't like the following:

    (char*)pint

, then I suppose you can write the following instead:

    static_cast<char*>( static_cast<void*>(pint) );

(Actually this sounds utterly ridiculous as I write it -- programmers have been casting to char* in C for decades...)

-- Frederick Gotham

Nov 21 '06 #25
