
Byte Address Arithmetic Debate


There is a thread currently active on this newsgroup entitled:

"how to calculate the difference between 2 addresses ?"

The thread deals with calculating the distance, in bytes, between two
memory addresses. Obviously, this can only be done if the addresses refer
to elements or members of the same object (or base objects, etc.).

John Carson and I proposed two separate methods.

I disagree with John's solution, and John disagrees with mine. Therefore,
I'd like to present them both here and see what the audience thinks.

Firstly, we shall start off with a simple POD type:

struct MyPOD {
int a;
double b;
void *c;
short d;
bool e;
int f;
};

Given an object of this type, we shall calculate the distance, in bytes,
between the "b" member and the "e" member.

My own method is as follows:

reinterpret_cast<char const volatile*>(&obj.e)
- reinterpret_cast<char const volatile*>(&obj.b)

John's method is as follows:

reinterpret_cast<long unsigned>(&obj.e)
- reinterpret_cast<long unsigned>(&obj.b);

In defence of my own method:

(1) Any byte address can be accurately stored in a char*.

In attack of John's method:

(1) The Standard doesn't necessitate the existence of an integer type
large enough to accommodate a memory address.
(2) Even if such a type exists, the subtraction need not yield the
correct answer (e.g. if each integer 1 represents half a byte, or a quarter
of a byte).
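
For readers who want to try the two expressions side by side, here is a minimal sketch. It is not taken from either poster: the helper names are made up for illustration, and it assumes a C++11 compiler that provides the optional type std::uintptr_t, which stands in for a "pointer-sized integer". On a flat, byte-addressed platform both helpers are expected to agree with the offsetof difference, but whether the Standard promises that is exactly what this thread debates.

```cpp
#include <cstddef>   // std::ptrdiff_t, offsetof
#include <cstdint>   // std::uintptr_t (optional type; assumed to exist here)

struct MyPOD {
    int a;
    double b;
    void *c;
    short d;
    bool e;
    int f;
};

// Hypothetical helper for the char* approach:
// view both member addresses as pointers to char and subtract them.
inline std::ptrdiff_t distance_via_char(MyPOD const &obj)
{
    return reinterpret_cast<char const volatile*>(&obj.e)
         - reinterpret_cast<char const volatile*>(&obj.b);
}

// Hypothetical helper for the integer approach:
// convert both addresses to integers and subtract the integers.
// The pointer-to-integer mapping is implementation-defined.
inline std::ptrdiff_t distance_via_integer(MyPOD const &obj)
{
    std::uintptr_t const hi = reinterpret_cast<std::uintptr_t>(&obj.e);
    std::uintptr_t const lo = reinterpret_cast<std::uintptr_t>(&obj.b);
    return static_cast<std::ptrdiff_t>(hi - lo);
}
```

Calling both helpers on the same MyPOD object and comparing the results with offsetof(MyPOD, e) - offsetof(MyPOD, b) is a quick way to see what a particular implementation does.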

Of course, seeing as how _I_ started this thread, it may be a little biased
toward my own ends, but I hope we get to the bottom of this objectively.

--

Frederick Gotham
Nov 19 '06 #1
24 Replies


On Sun, 19 Nov 2006 20:05:11 GMT in comp.lang.c++, Frederick Gotham
<fg*******@SPAM.com> wrote,
>Given an object of this type, we shall calculate the distance, in bytes,
>between the "b" member and the "e" member.

#include <cstddef>
offsetof(MyPOD, e) - offsetof(MyPOD, b)

Nov 19 '06 #2

David Harmon:
On Sun, 19 Nov 2006 20:05:11 GMT in comp.lang.c++, Frederick Gotham
<fg*******@SPAM.com> wrote,
>>Given an object of this type, we shall calculate the distance, in bytes,
>>between the "b" member and the "e" member.

#include <cstddef>
offsetof(MyPOD, e) - offsetof(MyPOD, b)

I'll rephrase the question:

Given two memory addresses in the form of pointers -- pointer types which
may be different -- calculate the distance in bytes between them. The
pointers refer to parts of the same object.

--

Frederick Gotham
Nov 19 '06 #3


Frederick Gotham wrote:
David Harmon:
On Sun, 19 Nov 2006 20:05:11 GMT in comp.lang.c++, Frederick Gotham
<fg*******@SPAM.com> wrote,
>Given an object of this type, we shall calculate the distance, in bytes,
>between the "b" member and the "e" member.
#include <cstddef>
offsetof(MyPOD, e) - offsetof(MyPOD, b)


I'll rephrase the question:

Given two memory addresses in the form of pointers -- pointer types which
may be different -- calculate the distance in bytes between them. The
pointers refer to parts of the same object.

--

Frederick Gotham

Not that I'm trying deliberately to be a pain in the attic, but what do
you mean by between them?
That's not the same as offset.

struct test
{
int n;
int i;
};

The distance in bytes between a test instance.n and instance.i would be
zero assuming no padding is involved. Remember: To assume == makes an
ASS out of U and ME.

Nov 19 '06 #4


Salt_Peter:
Not that i'm trying deliberately to be a pain in the attic, but what do
you mean by between them?

Let's say that a certain object is located at memory address 14.

Let's say that another object is located at memory address 18.

The distance between them is 4.

Thats not the same as offset.

struct test
{
int n;
int i;
};

The distance in bytes between a test instance.n and instance.i would be
zero assuming no padding is involved.

We're just looking for the number of bytes between two addresses.

Let's say that &obj.n == Memory Byte Address 56
Let's say that &obj.i == Memory Byte Address 60

Therefore, the distance between them is 4 bytes.

Remember: To assume == makes an
ASS out of U and ME.

Should I understand that somehow?

--

Frederick Gotham
Nov 19 '06 #5


Frederick Gotham wrote:
Salt_Peter:
Not that i'm trying deliberately to be a pain in the attic, but what do
you mean by between them?


Let's say that a certain object is located at memory address 14.

Let's say that another object is located at memory address 18.

This distance between them is 4.

Thats not the same as offset.

struct test
{
int n;
int i;
};

The distance in bytes between a test instance.n and instance.i would be
zero assuming no padding is involved.


We're just looking for the amount of bytes between two addresses.

Let's say that &obj.n == Memory Byte Address 56
Let's say that &obj.i == Memory Byte Address 60

Therefore, the distance between them is 4 bytes.

There is no guarantee that converting a pointer to an integer value
will produce the logical address of the referenced object. So neither
of the two approaches is certain to be portable. In fact, the only
portable approach available is to use the offsetof macro - either to
calculate the distance between the start of a POD object and one of its
members, or between any two members of the same object:

std::abs( offsetof(MyPOD, e) - offsetof(MyPOD, b));
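
One practical wrinkle with the expression above: offsetof yields a std::size_t, so the difference of two offsets is unsigned. It wraps around if the first member comes after the second, and passing an unsigned value to std::abs may be rejected as ambiguous by some compilers. A minimal sketch that sidesteps both issues by converting to a signed type first (the helper name is hypothetical, not something posted in this thread):

```cpp
#include <cstddef>   // offsetof, std::size_t, std::ptrdiff_t

struct MyPOD {
    int a;
    double b;
    void *c;
    short d;
    bool e;
    int f;
};

// Hypothetical helper: absolute distance in bytes between members b and e,
// computed from compile-time offsets only (no object, no pointer casts).
inline std::ptrdiff_t member_distance_b_e()
{
    std::ptrdiff_t const off_e = static_cast<std::ptrdiff_t>(offsetof(MyPOD, e));
    std::ptrdiff_t const off_b = static_cast<std::ptrdiff_t>(offsetof(MyPOD, b));
    // Order the operands by hand so the result is never negative.
    return off_e > off_b ? off_e - off_b : off_b - off_e;
}
```

Because only offsets are involved, this works without creating a MyPOD object at all.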

Greg

Nov 19 '06 #6

Greg:
There is no guarantee that converting a pointer to an integer value
will produce the logical address of the referenced object. So neither
of the two approaches is certain to be portable.

My claim is that the char* method is perfect.

#include <cstddef>

template<class A,class B>
std::ptrdiff_t BytesBetween(A const &a,B const &b)
{
return reinterpret_cast<char const volatile*>(&b)
- reinterpret_cast<char const volatile*>(&a);
}

Of course, both "a" and "b" must refer to parts of the same object.

--

Frederick Gotham
Nov 20 '06 #7

"Frederick Gotham" <fg*******@SPAM.com> wrote in message
news:XN*******************@news.indigo.ie
There is a thread currently active on this newsgroup entitled:

"how to calculate the difference between 2 addresses ?"

The thread deals with calculating the distance, in bytes, between two
memory addresses. Obviously, this can only be done if the addresses
refer to elements or members of the same object (or base objects,
etc.).

John Carson and I proposed two separate methods.

I disagree with John's solution, and John disagrees with mine.
Therefore, I'd like to present them both here and see what the
audience thinks.

Just to be clear: I don't claim my approach is more correct than yours. I
think they both involve implementation-defined behavior according to the
Standard. Both will usually work in practice. My preference for converting
to an integer is more of an aesthetic one. The aesthetics may differ
depending on the exact nature of the problem.

Firstly, we shall start off with a simple POD type:

struct MyPOD {
int a;
double b;
void *c;
short d;
bool e;
int f;
};

Given an object of this type, we shall calculate the distance, in
bytes, between the "b" member and the "e" member.

My own method is as follows:

reinterpret_cast<char const volatile*>(&obj.e)
- reinterpret_cast<char const volatile*>(&obj.b)

John's method is as follows:

reinterpret_cast<long unsigned>(&obj.e)
- reinterpret_cast<long unsigned>(&obj.b);

I wish to cast it to a pointer-sized integer. This is not synonymous with
long unsigned. Indeed on Win64, long unsigned is smaller than pointer-sized
(crazy, I know), but a pointer-sized integer nevertheless exists.
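
As an illustration of the "pointer-sized integer" idea: since C++11 the header <cstdint> names such a type std::uintptr_t, although the Standard makes it optional. The sketch below assumes it exists; the helper names are hypothetical.

```cpp
#include <cstdint>   // std::uintptr_t (optional; assumed to be provided)

// Hypothetical helper: the integer image of an address.
// The exact value is implementation-defined.
inline std::uintptr_t address_as_integer(void const volatile *p)
{
    return reinterpret_cast<std::uintptr_t>(p);
}

// Hypothetical helper: reports whether unsigned long is wide enough to hold
// a pointer on this implementation. This is false on Win64, where long is
// 32 bits and pointers are 64 bits.
inline bool unsigned_long_holds_pointer()
{
    return sizeof(unsigned long) >= sizeof(void*);
}
```

Where std::uintptr_t is provided, converting a void* to it and back yields a pointer that compares equal to the original.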

In defence of my own method:

(1) Any byte address can be accurately stored in a char*.

Any pointer can be cast to char*. However, by Section 5.2.10/3:

"The mapping performed by reinterpret_cast is implementation-defined. [Note:
it might, or might not, produce a representation different from the original
value. ]"

This applies equally to my method.

In attack of John's method:

(1) The Standard doesn't necessitate the existence of an integer
type large enough to accommodate a memory address.

True, but not an issue on most platforms.

(2) Even if such a type exists, the subtraction need not yield the
correct answer (e.g. if each integer 1 represents half a byte, or a
quarter of a byte).

If your cast can produce "a representation different from the original
value", I don't see that it offers an advantage. Moreover, Section 5.2.10/4
says that the conversion to an integer value "is intended to be unsurprising
to those who know the addressing structure of the underlying machine", which
provides an assurance of sorts for my preferred approach.

Finally, I point out that the Standard doesn't guarantee an integer type
large enough to store the result of the subtraction (See Section 5.7/6).
Once again, both approaches rely on an implementation-defined feature (or on
the choice of suitable addresses to compare).
--
John Carson


Nov 20 '06 #8

Frederick Gotham wrote:
Greg:
There is no guarantee that converting a pointer to an integer value
will produce the logical address of the referenced object. So neither
of the two approaches is certain to be portable.


My claim is that the char* method is perfect.

#include <cstddef>

template<class A,class B>
std::ptrdiff_t BytesBetween(A const &a,B const &b)
{
return reinterpret_cast<char const volatile*>(&b)
- reinterpret_cast<char const volatile*>(&a);
}

Of course, both "a" and "b" must refer to parts of the same object.

In order to subtract pointer a from pointer b, both a and b must point
to the same kind of object and the objects that they point to, must
both be members of the same array. Since the BytesBetween() function
template observes neither of these requirements, there is no guarantee
that its behavior will be defined.

"Unless both pointers point to elements of the same array object, or
one past the last element of the array object, the behavior is
undefined." [5.7/7]

C++ would not need the offsetof macro if there were another, portable
way to calculate the distance between two members of an object.

Greg

Nov 20 '06 #9

Greg wrote:
C++ would not need the offsetof macro if there were another, portable
way to calculate the distance between two members of an object.

That seems incorrect: the difficulty with the offsetof macro is the need for
compile-time evaluation. That makes it impossible to create an instance of
the struct type and measure offsets of its members. Thus, even if you had a
perfectly fine method of computing distances of members of an object, it
would not help in writing an offsetof macro.

Best

Kai-Uwe Bux

Nov 20 '06 #10

On Sun, 19 Nov 2006 20:58:34 GMT in comp.lang.c++, Frederick Gotham
<fg*******@SPAM.com> wrote,
>I'll rephrase the question:

I'll still dodge it.
Eschew undefined behavior.
Cast not thy pointers into the void.

Nov 20 '06 #11

Kai-Uwe Bux wrote:
Greg wrote:
C++ would not need the offsetof macro if there were another, portable
way to calculate the distance between two members of an object.

That seems incorrect: the difficulty with the offsetof macro is the need for
compile-time evaluation. That makes it impossible to create an instance of
the struct type and measure offsets of its members. Thus, even if you had a
perfectly fine method of computing distances of members of an object, it
would not help in writing an offsetof macro.

Counting the number of bytes from the start of an object to one of its
members is not the only way to express the distance. But since the
requirement in this case is to provide a byte measurement of the
distance - the offsetof macro is the only portable way to obtain that
figure.

Requiring that the offset of a class member be expressed in bytes is of
course a completely artificial constraint - no C++ program would ever
face such a limitation. After all, no program calls offsetof simply to
obtain a number. Instead the number that offsetof returns is useful
only insofar as the program can use that value to gain access to the
specified class member given a pointer to a class object.

In C++, member access through an object pointer is already possible by
applying a member pointer to the object pointer. A member pointer
essentially abstracts the offset of a class member, and hides the
implementation details from the C++ program. So although a C++ program
cannot recover the byte distance of the offset that is stored within a
member pointer - a member pointer is still more useful than the
offsetof macro since a member pointer is not limited to members of POD
classes only.
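
A minimal sketch of the member-pointer technique described above, using a class that is deliberately not a POD. The type and helper names are hypothetical and chosen only for illustration.

```cpp
#include <string>

// Not a POD: it has a member with a non-trivial constructor,
// so offsetof is not portably usable on it.
struct Account {
    std::string owner;
    int balance;
};

// Hypothetical helper: read whichever int member `member` designates.
// No byte offset is ever exposed to the program.
inline int read_int_member(Account const *obj, int Account::*member)
{
    return obj->*member;
}
```

A call such as read_int_member(&acct, &Account::balance) reaches the member through the object pointer, which is the job an offset would otherwise be used for.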

Greg

Nov 20 '06 #12

John Carson:
I think they both involve implementation-defined behavior according to
the Standard. Both will usually work in practice.

My own claim is that _my_ code is perfectly fine. I also claim that your
code is not OK, even though I acknowledge it would work on a lot of
systems.

I could imagine a system which doesn't have 8-Bit bytes, but which has a
layer between the machine and the C implementation that makes you think
there are 8-Bit bytes. Let's say that the machine actually has 4-Bit bytes.
When you cast to integer type and subtract, your result might be double
what you thought it would be.

Any pointer can be cast to char*. However, by Section 5.2.10/3:

"The mapping performed by reinterpret_cast is implementation-defined.
[Note: it might, or might not, produce a representation different from
the original value. ]"

There are several exceptions to the whole "reinterpret_cast is a wild
animal" idea. Casting to char* or void* is one of them. Another would be
casting from a POD pointer to a pointer to the first member in the POD.

> (1) The Standard doesn't necessitate the existence of an integer
> type large enough to accommodate a memory address.

True, but not an issue on most platforms.

On every platform though, the char* subtraction will work.

Moreover, Section
5.2.10/4 says that the conversion to an integer value "is intended to be
unsurprising to those who know the addressing structure of the
underlying machine", which provides an assurance of sorts for my
preferred approach.

What if we're working with the 4-Bit system disguised as an 8-Bit system?

Finally, I point out that the Standard doesn't guarantee an integer type
large enough to store the result of the subtraction (See Section 5.7/6).
Once again, both approaches rely on an implementation-defined feature
(or on the choice of suitable addresses to compare).

Are you sure about that? The purpose of ptrdiff_t is to store the result of
subtracting two pointers. Presumably, if the subtraction of the pointers is
valid, then the type should be able to hold the value.

--

Frederick Gotham
Nov 20 '06 #13

Frederick Gotham:
There are several exceptions to the whole "reinterpret_cast is a wild
animal" idea. Casting to char* or void* is one of them. Another would be
casting from a POD pointer to a pointer to the first member in the POD.

In the past, I've seen people so fearful of reinterpret_cast that they write:

char *p = static_cast<char*>(static_cast<void*>(&obj));

I myself just write:

char *p = (char*)&obj;
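
The spellings above can be compared directly. The sketch below wraps each one in a hypothetical helper so the resulting addresses can be checked against each other; as far as I know, revisions of the Standard from C++11 onward define this reinterpret_cast between object pointer types in terms of the two static_casts, so the results are expected to be identical.

```cpp
// Hypothetical helpers, one per spelling of "give me a char* to this object".

template<class T>
char *bytes_via_reinterpret(T *p)
{
    return reinterpret_cast<char*>(p);
}

template<class T>
char *bytes_via_static(T *p)
{
    // The cautious spelling: go through void* with two static_casts.
    return static_cast<char*>(static_cast<void*>(p));
}

template<class T>
char *bytes_via_c_cast(T *p)
{
    // The C-style spelling; for these types it acts as a reinterpret_cast.
    return (char*)p;
}
```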

--

Frederick Gotham
Nov 20 '06 #14

"Frederick Gotham" <fg*******@SPAM.com> wrote in message
news:ni*******************@news.indigo.ie
John Carson:

I could imagine a system which doesn't have 8-Bit bytes, but which
has a layer between the machine and the C implementation that makes
you think there are 8-Bit bytes. Let's say that the machine actually
has 4-Bit bytes. When you cast to integer type and subtract, your
result might be double what you thought it would be.

That would depend on the implementation.

There are several exceptions to the whole "reinterpret_cast is a wild
animal" idea. Casting to char* or void* is one of them. Another would
be casting from a POD pointer to a pointer to the first member in the
POD.

The effect of reinterpret_cast on a POD pointer is specified in the Standard
(section 9.2/17). The others are not, as far as I am aware.

>> (1) The Standard doesn't necessitate the existence of an integer
>> type large enough to accommodate a memory address.

True, but not an issue on most platforms.

On every platform though, the char* subtraction will work.

The char* cast will work. The subtraction isn't guaranteed.

>Moreover, Section
5.2.10/4 says that the conversion to an integer value "is intended
to be unsurprising to those who know the addressing structure of the
underlying machine", which provides an assurance of sorts for my
preferred approach.

What if we're working with the 4-Bit system disguised as an 8-Bit
system?

I don't know, but the implementation should say what would happen.

>Finally, I point out that the Standard doesn't guarantee an integer
type large enough to store the result of the subtraction (See
Section 5.7/6). Once again, both approaches rely on an
implementation-defined feature (or on the choice of suitable
addresses to compare).

Are you sure about that? The purpose of ptrdiff_t is to store the
result of subtracting two pointers. Presumably, if the subtraction of
the pointers is valid, then the type should be able to hold the value.

I can only go by the Standard, which I have already quoted in the previous
thread. The result of such a subtraction is a signed type and as such has a
maximum absolute value only half the size of the largest value supported by
the corresponding unsigned type. If addresses can have any value covered by
the unsigned type, this creates the possibility of overflow.
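
The range argument can be checked on a given implementation with std::numeric_limits. This is only a sketch with hypothetical helper names; it assumes, as is typical but not required, that std::ptrdiff_t and std::size_t have the same width.

```cpp
#include <cstddef>   // std::size_t, std::ptrdiff_t
#include <limits>    // std::numeric_limits

// Largest distance a pointer subtraction can report without overflow.
inline std::size_t max_reportable_distance()
{
    return static_cast<std::size_t>(std::numeric_limits<std::ptrdiff_t>::max());
}

// Largest value the corresponding unsigned size type can express.
inline std::size_t max_unsigned_size()
{
    return std::numeric_limits<std::size_t>::max();
}
```

When the two types have the same width, the first value is roughly half of the second, which is the gap the overflow concern refers to.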

--
John Carson

Nov 20 '06 #15

John Carson:

(Referring to pointer arithmetic)
The result of such a subtraction is a signed type and
as such has a maximum absolute value only half the size of the largest
value supported by the corresponding unsigned type. If addresses can
have any value covered by the unsigned type, this creates the
possibility of overflow.

I think though that this argument can be countered by a combination of the
following excerpts from the Standard.

3.9/2
For any object (other than a base-class subobject) of POD type T, whether
or not the object holds a valid value of type T, the underlying bytes (1.7)
making up the object can be copied into an array of char or unsigned
char. If the content of the array of char or unsigned char is copied
back into the object, the object shall subsequently hold its original
value.

Therefore, we can do the following:

double arr[64] = { ... };

char unsigned buf[sizeof arr];

memcpy(buf,arr,sizeof buf);

The array object, "buf", is a fully-fledged object type.
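
A minimal, runnable sketch of the round trip that the quoted paragraph guarantees, using a single double rather than the array above. The helper name is hypothetical.

```cpp
#include <cstring>   // std::memcpy

// Hypothetical helper: copy a double out to raw bytes and back again.
inline double round_trip(double value)
{
    unsigned char buf[sizeof value];

    std::memcpy(buf, &value, sizeof buf);      // object -> bytes

    double restored;
    std::memcpy(&restored, buf, sizeof buf);   // bytes -> object

    return restored;                           // holds the original value
}
```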

Now let's read about ptrdiff_t:

5.7/6
When two pointers to elements of the same array object are subtracted, the
result is the difference of the subscripts of the two array elements. The
type of the result is an implementation-defined signed integral type; this
type shall be the same type that is defined as ptrdiff_t in the <cstddef>
header (18.1). As with any other arithmetic overflow, if the result does
not fit in the space provided, the behavior is undefined. In other words,
if the expressions P and Q point to, respectively, the i-th and j-th
elements of an array object, the expression (P)-(Q) has the value i-j
provided the value fits in an object of type ptrdiff_t.
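
To make the quoted rule concrete, the sketch below subtracts two pointers into the same int array twice: once as int pointers, which gives the difference of the subscripts, and once after converting to char pointers, which gives the distance in bytes. The helper names are hypothetical.

```cpp
#include <cstddef>   // std::ptrdiff_t

// Difference of subscripts: both pointers must point into the same array.
inline std::ptrdiff_t element_distance(int const *p, int const *q)
{
    return q - p;
}

// The same two positions measured in bytes, by viewing the array's storage
// through pointers to char.
inline std::ptrdiff_t byte_distance(int const *p, int const *q)
{
    return reinterpret_cast<char const*>(q)
         - reinterpret_cast<char const*>(p);
}
```

For arr + 2 and arr + 7 the first helper yields 5, and the second yields 5 * sizeof(int).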

I'm glad to see we're agreed that the casting to char* is OK. What I find
annoying though is the situation with ptrdiff_t... I'm going to take this
over to comp.std.c++.

--

Frederick Gotham
Nov 20 '06 #16

Frederick Gotham <fg*******@SPAM.com> wrote:
I'll rephrase the question:
Given two memory addresses in the form of pointers -- pointer types which
may be different -- calculate the distance in bytes between them. The
pointers refer to parts of the same object.

You can't. You can only subtract pointers if they are pointing
to the same type of object, and then only if the pointed-to
objects are elements of the same array of such objects.

And even then, you will not necessarily get the distance in bytes.

Just my opinion.

Steve
Nov 20 '06 #17

Steve Pope:
>Given two memory addresses in the form of pointers -- pointer types
which
>may be different -- calculate the distance in bytes between them. The
pointers refer to parts of the same object.

You can't. You can only subtract pointers if they are pointing
to the same type of object, and then only if the pointed-to
objects are elements of the same array of such objects.

And even then, you will not necessarily get the distance in bytes.

Just my opinion.

I don't see why there would be anything wrong with the following:

struct SomePOD {
int a;
char b;
int arr[5];
};

struct Base {
double a;
SomePOD b;
void *c;
};

struct Derived : Base {
double d;
Base e;
};

#include <cstddef>

template<class A,class B>
std::ptrdiff_t BytesBtwn(A const *const p,B const *const q)
{
return (char const volatile*)q - (char const volatile*)p;
}

int main()
{
Derived const volatile obj = Derived();

ptrdiff_t const i = BytesBtwn(obj.b.arr+2,&obj.e.b.a);
}

--

Frederick Gotham
Nov 20 '06 #18

Frederick Gotham <fg*******@SPAM.com> wrote:
>Steve Pope:
>You can only subtract pointers if they are pointing
to the same type of object, and then only if the pointed-to
objects are elements of the same array of such objects.
>And even then, you will not necessarily get the distance in bytes.
>Just my opinion.
>I don't see why there would be anything wrong with the following:

struct SomePOD {
int a;
char b;
int arr[5];
};

struct Base {
double a;
SomePOD b;
void *c;
};

struct Derived : Base {
double d;
Base e;
};

#include <cstddef>

template<class A,class B>
std::ptrdiff_t BytesBtwn(A const *const p,B const *const q)
{
return (char const volatile*)q - (char const volatile*)p;
}

int main()
{
Derived const volatile obj = Derived();

ptrdiff_t const i = BytesBtwn(obj.b.arr+2,&obj.e.b.a);
}

This would not give the difference in bytes on architectures
for which the address of an int is a word address.

(Now, I admit not having seen such an architecture for 20
years or so, but they may still be around.)

Steve
Nov 20 '06 #19

Steve Pope:
This would not give the difference in bytes on architectures
for which the address of an int is a word address.

Sorry I don't understand, could you please explain that?

--

Frederick Gotham
Nov 20 '06 #20

Frederick Gotham <fg*******@SPAM.com> wrote:
>Steve Pope:
>This would not give the difference in bytes on architectures
for which the address of an int is a word address.
>Sorry I don't understand, could you please explain that?

Picture a computer memory that is both byte-addressable and
word-addressable, where a word is four bytes. The word
address 1000 (decimal) would address a word containing the
four bytes at byte addresses 4000, 4001, 4002, and 4003 (decimal).

I don't know of any modern machines that do this, but it has
been done and it can save code space.
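
The arithmetic in that description can be written out as a small model. This only simulates the numbering scheme on an ordinary machine; the constant and helper names are hypothetical, and the four-bytes-per-word figure is taken from the example above.

```cpp
#include <cstddef>   // std::size_t

// Assumption from the example: one word holds four bytes.
std::size_t const bytes_per_word = 4;

// Byte address of the first byte inside a given word.
inline std::size_t first_byte_of_word(std::size_t word_address)
{
    return word_address * bytes_per_word;
}

// Word address of the word that contains a given byte.
inline std::size_t word_of_byte(std::size_t byte_address)
{
    return byte_address / bytes_per_word;
}
```

Word address 1000 maps to byte address 4000, and byte addresses 4000 through 4003 all map back to word 1000. Two adjacent words differ by 1 as word addresses but by 4 as byte addresses, which is why subtracting raw word addresses would not give a distance in bytes.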

Steve
Nov 20 '06 #21

Steve Pope:
Picture a computer memory that is both byte-addressable and
word-addressable, where a word is four bytes. The word
address 1000 (decimal) would address a word containing the
four bytes at byte addresses 4000, 4001, 4002, and 4003 (decimal).

I don't know of any modern machines that do this, but it has
been done and it can save code space.

But I'm converting everything to char* beforehand, shouldn't that sort
everything out?

--

Frederick Gotham
Nov 20 '06 #22

Steve Pope wrote:
Frederick Gotham <fg*******@SPAM.com> wrote:
>Steve Pope:
>>You can only subtract pointers if they are pointing
to the same type of object, and then only if the pointed-to
objects are elements of the same array of such objects.
>>And even then, you will not necessarily get the distance in bytes.
>>Just my opinion.
>I don't see why there would be anything wrong with the following:

struct SomePOD {
int a;
char b;
int arr[5];
};

struct Base {
double a;
SomePOD b;
void *c;
};

struct Derived : Base {
double d;
Base e;
};

#include <cstddef>

template<class A,class B>
std::ptrdiff_t BytesBtwn(A const *const p,B const *const q)
{
return (char const volatile*)q - (char const volatile*)p;
}

int main()
{
Derived const volatile obj = Derived();

ptrdiff_t const i = BytesBtwn(obj.b.arr+2,&obj.e.b.a);
}

This would not give the difference in bytes on architectures
for which the address of an int is a word address.

I suspect that it just might work, either because a C++ byte will be the
same size as a word, or because casting to a char pointer will not be a
reinterpret_cast, but will involve an actual conversion.

In either case you will get a byte distance, for an implementation specific
definition of a byte.

(Now, I admit not having seen such an architecture for 20
years or so, but they may still be around.)

We have others, that are word addressable and use special part-word
operations to access individual characters. That makes the above casts even
more interesting. :-)

Bo Persson
Nov 20 '06 #23

Frederick Gotham <fg*******@SPAM.com> wrote:
>Steve Pope:
>Picture a computer memory that is both byte-addressable and
word-addressable, where a word is four bytes. The word
address 1000 (decimal) would address a word containing the
four bytes at byte addresses 4000, 4001, 4002, and 4003 (decimal).

I don't know of any modern machines that do this, but it has
been done and it can save code space.
>But I'm converting everything to char* beforehand, shouldn't that sort
everything out?

I'm not sure the language requires that an expression like
(char *) pint, where pint is a pointer to int, does the required
conversion.

It seems though things like malloc() would not generally work properly
if conversions like this were not done as one would naturally
expect, so maybe you can rely on it.

Steve
Nov 20 '06 #24

Steve Pope:
I'm not sure the language requires that an expression like
(char *) pint, where pint is a pointer to int, does the required
conversion.

I think it does. If the Standard doesn't explicitly state this, then it
should.

I believe the Standard says somewhere that "void*" and "char*" must have
identical representation. (Let's forget for the moment that we're not
allowed to access multiple members of a union).

union ByteAddress {
void *pv;
char *pc;
};

int i;

ByteAddress n;

n.pv = &i; /* This is definitely OK */

"pc" and "pv" should be identical right now. Therefore, we could do:

int arr[2];

ByteAddress a,b;

a.pv = arr;
b.pv = arr+1;

ptrdiff_t i = b.pc - a.pc;

(Again, I acknowledge that the Standard forbids use of unions in this
fashion.)
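
The same comparison can be made without reading an inactive union member, by copying nothing and simply comparing the bytes of the two pointer objects with memcmp. This is a sketch with a hypothetical helper name; it relies on the Standard's statement that void* and char* have the same representation and alignment requirements.

```cpp
#include <cstring>   // std::memcmp

// Hypothetical helper: do the void* and char* images of one address
// match byte for byte on this implementation?
inline bool same_representation(void *pv)
{
    char *pc = static_cast<char*>(pv);   // well-defined conversion

    return sizeof pv == sizeof pc
        && std::memcmp(&pv, &pc, sizeof pv) == 0;
}
```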

Anyway, I digress. If you don't like the following:

(char*)pint

, then I suppose you can write the following instead:

static_cast<char*>( static_cast<void*>(pint) );

(Actually this sounds utterly ridiculous as I write it -- programmers have
been casting to char* in C for decades...)

--

Frederick Gotham
Nov 21 '06 #25
