addresses and integers

j0mbolar

I've read in the standard that addresses
basically can't be interpreted as integers.
If they can, it is implementation defined
behavior. However, if they can't be viewed
as integers in any sense as far as portability
goes, what then, should one think of addresses
being composed of?

Nov 14 '05 #1

pete

j0mbolar wrote:

I've read in the standard that addresses
basically can't be interpreted as integers.
If they can, it is implementation defined
behavior. However, if they can't be viewed
as integers in any sense as far as portability
goes, what then, should one think of addresses
being composed of?

Bases and offsets.

--
pete

Nov 14 '05 #2

Jack Klein

On Mon, 30 Aug 2004 00:21:31 GMT, pete <pf*****@mindspring.com> wrote
in comp.lang.c:

j0mbolar wrote:

I've read in the standard that addresses
basically can't be interpreted as integers.
If they can, it is implementation defined
behavior. However, if they can't be viewed
as integers in any sense as far as portability
goes, what then, should one think of addresses
being composed of?

Bases and offsets.

Can you provide any justification at all for your apparently
ridiculous assertion?

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html

Nov 14 '05 #3

Jack Klein

On 29 Aug 2004 16:57:46 -0700, j0******@engineer.com (j0mbolar) wrote
in comp.lang.c:

I've read in the standard that addresses
basically can't be interpreted as integers.

That's right, addresses are constants. In fact, they are rvalues, not
lvalues. Functions can't be interpreted as integers either, nor can
structures. What of it?

If they can, it is implementation defined
behavior. However, if they can't be viewed
as integers in any sense as far as portability
goes, what then, should one think of addresses
being composed of?

Addresses in C are not "composed" of anything. They have no defined
inner structure, just as the floating point types do not.

There is a requirement that an address can be represented in a string
of binary digits, because an address can be stored in a pointer object
of appropriate type, that pointer object can be inspected as an array
of unsigned chars, and upon such inspection the pointer object must
contain bits and nothing but bits.

This same possibility of inspection as the bits contained in an array
of unsigned characters also applies to the floating point types, but
the interpretation or meaning of those bits is totally unspecified by
the standard.
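The inspection described above can be sketched as follows. This is a minimal illustration, not part of the original post; the function name pointer_bytes is made up for the example. It copies the object representation of a pointer into an array of unsigned char, which is always permitted, while the meaning of the individual bytes stays unspecified.

```c
#include <stddef.h>
#include <string.h>

/* Copy the object representation of the pointer p into out[] and return
   the number of bytes copied. out must have room for sizeof(void *) bytes.
   The bytes can be inspected, but the standard does not say what they mean. */
size_t pointer_bytes(const void *p, unsigned char *out)
{
    memcpy(out, &p, sizeof p);
    return sizeof p;
}
```

Copying the same bytes back into a pointer object of the same type restores the original pointer value; nothing else about the bytes can be relied on portably.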

The standard does require that if an implementation provides an
integer type wide enough to contain a pointer, assignment with a cast
of a pointer value to that integer type and back again with a cast to
the original pointer type will yield an identical pointer. C99 even
defines typedefs to be used for such a type, intptr_t and uintptr_t,
although they are optional. I think it would have been preferable for
the standard to require the typedefs if such an integer type existed, the
way it requires the exact width definitions.

The standard does not require or guarantee that you can do anything
useful with a converted pointer in such a type, other than converting
it back. In particular, there is no guarantee that:

char name [] = "name";
char *n = name;
uintptr_t up = (uintptr_t)n;
++up;
n = (char *)up;

...n now points to the 'a' in name, or has any valid value at all.
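For contrast, here is a hedged sketch of the two routes side by side. The function names are invented for illustration. Only the first function is guaranteed by the standard to reach the next character; the second compiles wherever uintptr_t exists, but its result depends on the implementation's pointer-to-integer mapping.

```c
#include <stdint.h>

/* Portable: pointer arithmetic counts elements of the pointed-to type. */
char *advance_via_pointer(char *p)
{
    return p + 1;
}

/* Not portable: the effect of ++ on the integer depends on how the
   implementation maps pointers to integers. */
char *advance_via_integer(char *p)
{
    uintptr_t up = (uintptr_t)p;
    ++up;
    return (char *)up;
}
```

On a flat, byte-addressed machine both functions usually return the same pointer, which is why the integer route appears to work on many systems.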

Addresses have absolutely no portability at all, even between
executions of the same program.

What portability do you think they should have, and why? And why do
you think you need to think of them or treat them as integers? What
is it that you think you need to do with addresses that cannot
legitimately be done with pointers?

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html

Nov 14 '05 #4

Barry Margolin

In article <nr********************************@4ax.com>,
Jack Klein <ja*******@spamcop.net> wrote:

On Mon, 30 Aug 2004 00:21:31 GMT, pete <pf*****@mindspring.com> wrote
in comp.lang.c:
j0mbolar wrote:

I've read in the standard that addresses
basically can't be interpreted as integers.
If they can, it is implementation defined
behavior. However, if they can't be viewed
as integers in any sense as far as portability
goes, what then, should one think of addresses
being composed of?

Bases and offsets.

Can you provide any justification at all for your apparently
ridiculous assertion?

That's how C pointers were implemented on Symbolics Lisp Machines.
That's also a non-ridiculous way to implement them on a CPU that has a
reasonable segmented architecture (as opposed to the hoops you have to
jump through to use x86's segmentation).

--
Barry Margolin, ba****@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

Nov 14 '05 #5

Ben Pfaff

Jack Klein <ja*******@spamcop.net> writes:

On Mon, 30 Aug 2004 00:21:31 GMT, pete <pf*****@mindspring.com> wrote
in comp.lang.c:
j0mbolar wrote:
>
> I've read in the standard that addresses
> basically can't be interpreted as integers.
> If they can, it is implementation defined
> behavior. However, if they can't be viewed
> as integers in any sense as far as portability
> goes, what then, should one think of addresses
> being composed of?

Bases and offsets.

Can you provide any justification at all for your apparently
ridiculous assertion?

It's not too unreasonable if the "base" is the beginning of an
array and the "offset" is the number of elements from the base.
That's my mental model for abstract C arrays, anyway. It also
works for individual objects not within an array, which can be
treated as arrays of 1 element. It does break down when you're dealing
with e.g. structure members though.
--
int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.\
\n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
);while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
);}return 0;}

Nov 14 '05 #6

Brian Inglis

On Sun, 29 Aug 2004 19:07:12 -0700 in comp.std.c, Ben Pfaff
<bl*@cs.stanford.edu> wrote:

Jack Klein <ja*******@spamcop.net> writes:
On Mon, 30 Aug 2004 00:21:31 GMT, pete <pf*****@mindspring.com> wrote
in comp.lang.c:
j0mbolar wrote:
>
> I've read in the standard that addresses
> basically can't be interpreted as integers.
> If they can, it is implementation defined
> behavior. However, if they can't be viewed
> as integers in any sense as far as portability
> goes, what then, should one think of addresses
> being composed of?

Bases and offsets.

Can you provide any justification at all for your apparently
ridiculous assertion?

It's not too unreasonable if the "base" is the beginning of an
array and the "offset" is the number of elements from the base.
That's my mental model for abstract C arrays, anyway. It also
works for individual objects not within an array, which can be
treated with 1 element. It does break down when you're dealing
with e.g. structure members though.

Not really, the assertion still holds; structure member offsets are
then in addressing units (instead of number of elements) from the
structure base.
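A small sketch of the structure case, assuming a made-up struct sample: the standard offsetof macro gives the member's offset in bytes from the structure base, and adding that offset to the base (viewed as a pointer to unsigned char) reaches the member.

```c
#include <stddef.h>

/* Hypothetical structure used only for this illustration. */
struct sample {
    char tag;
    double value;
    int count;
};

/* Reach the member `count` as structure base plus byte offset. */
int *count_via_offset(struct sample *s)
{
    unsigned char *base = (unsigned char *)s;
    return (int *)(base + offsetof(struct sample, count));
}
```

The returned pointer compares equal to &s->count, so the base-and-offset model carries over to members, with the offset measured in bytes rather than elements.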

--
Thanks. Take care, Brian Inglis Calgary, Alberta, Canada

Br**********@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
fake address use address above to reply

Nov 14 '05 #7

pete

Jack Klein wrote:

On Mon, 30 Aug 2004 00:21:31 GMT, pete <pf*****@mindspring.com> wrote
in comp.lang.c:
j0mbolar wrote:

I've read in the standard that addresses
basically can't be interpreted as integers.
If they can, it is implementation defined
behavior. However, if they can't be viewed
as integers in any sense as far as portability
goes, what then, should one think of addresses
being composed of?

Bases and offsets.

Can you provide any justification at all for your apparently
ridiculous assertion?

It's the way that pointers (addresses) relate to
each other with arithmetic and relational operators.
You can't add two pointers together,
because pointer types are not arithmetic types.
Relational operations for pointers, are only defined
for pointers which are offset from a common base.

The address of the lowest addressable byte of an object is
(char *)&object
and the address of the highest is
(char *)&object + sizeof object - 1

That's how I think of pointers.
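The two expressions above can be wrapped in helper functions, as in this sketch (the helper names are illustrative, not from the post). Because both results are offsets from the same base object, subtracting and comparing them is well defined.

```c
#include <stddef.h>

/* Address of the lowest addressable byte of the object at obj. */
const char *lowest_byte(const void *obj)
{
    return (const char *)obj;
}

/* Address of the highest addressable byte of an object of `size` bytes. */
const char *highest_byte(const void *obj, size_t size)
{
    return (const char *)obj + size - 1;
}
```

For an object d, highest_byte(&d, sizeof d) - lowest_byte(&d) is sizeof d - 1. Applying < or - to pointers into two unrelated objects has no such guarantee.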

--
pete

Nov 14 '05 #8

E. Robert Tisdale

j0mbolar wrote:

I've read in the standard that addresses

You probably mean pointers.

basically can't be interpreted as integers.
If they can, it is implementation defined behavior.

All that means is that the ANSI/ISO C standards
do not define any relationship between integers and pointers.

However, if they can't be viewed as integers
in any sense as far as portability goes,

As far as portability goes,
you can almost always count on the fact that
pointers have the same representation as an unsigned int --
a machine word. There are practically *no* exceptions
to this rule for most C programmers.

what, then, should one think of addresses being composed of?

A pointer is an object which may contain values
which are the addresses of valid objects.

Nov 14 '05 #9

Douglas A. Gwyn

j0mbolar wrote:

I've read in the standard that addresses
basically can't be interpreted as integers.
If they can, it is implementation defined
behavior. However, if they can't be viewed
as integers in any sense as far as portability
goes, what then, should one think of addresses
being composed of?

Think of them as consisting of (segment,offset)
pairs. Why do you care, so long as they work?

Nov 14 '05 #10

Douglas A. Gwyn

E. Robert Tisdale wrote:

As far as portability goes,
you can almost always count on the fact that
pointers have the same representation as an unsigned int --
a machine word. There are practically *no* exceptions
to this rule for most C programmers.

Quite horribly wrong. There are *some* platforms
where that is true, but also others where it is not
true, and quite a few where the representation of
char* (or void*), even the size, differs from the
representation of e.g. long* on the same platform.

Nov 14 '05 #11

Keith Thompson

"E. Robert Tisdale" <E.**************@jpl.nasa.gov> writes:

j0mbolar wrote:
I've read in the standard that addresses

You probably mean pointers.

The standard uses both terms. Unary "&" is the address operator.

[...]

However, if they can't be viewed as integers
in any sense as far as portability goes,

As far as portability goes,
you can almost always count on the fact that
pointers have the same representation as an unsigned int --
a machine word. There are practically *no* exceptions
to this rule for most C programmers.

No.

I've used several systems on which unsigned int is 32 bits and
pointers are 64 bits. Such systems are likely to become more common
in the future; 64-bit systems need 64-bit pointers, but making int 64
bits makes it difficult to have predefined types covering all the
sizes 8, 16, 32, and 64 bits.

I've also used systems (though not as many) where pointers and
unsigned ints are the same size, but the representation is different
(a byte offset is stored in the high-order 3 bits of the word). And
in a recent thread here, several systems were mentioned on which an
address corresponds more closely to a signed integer than to an
unsigned integer.

We've spent years getting rid of the "All the world's a VAX" fallacy.
Please don't re-introduce it.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #12

Charles Sanders

Keith Thompson wrote:

I've also used systems (though not as many) where pointers and
unsigned ints are the same size, but the representation is different
(a byte offset is stored in the high-order 3 bits of the word). And
in a recent thread here, several systems were mentioned on which an
address corresponds more closely to a signed integer than to an
unsigned integer.

And I have used a system with address, bit offset and
length encoded in one word. I am not trying to go one up on
anyone, just pointing out the variety that has existed and may
exist again. With increasing transistor counts, it may begin
to make sense to make single chip vector processors. Vector
processors often tend to use special addressing schemes.

We've spent years getting rid of the "All the world's a VAX" fallacy.
Please don't re-introduce it.

I strongly agree. I cannot recall the exact words,
but I believe the standard says something about the mapping
from pointer to (large enough) int being "Unsurprising" to
people familiar with the machine addressing architecture.
Charles

Nov 14 '05 #13

junky_fellow

Jack Klein <ja*******@spamcop.net> wrote in message news:<j4********************************@4ax.com>...

On 29 Aug 2004 16:57:46 -0700, j0******@engineer.com (j0mbolar) wrote
in comp.lang.c:
I've read in the standard that addresses
basically can't be interpreted as integers.

That's right, addresses are constants. In fact, they are rvalues, not
lvalues. Functions can't be interpreted as integers either, nor can
structures. What of it?

If they can, it is implementation defined
behavior. However, if they can't be viewed
as integers in any sense as far as portability
goes, what then, should one think of addresses
being composed of?

Addresses in C are not "composed" of anything. They have no defined
inner structure, just as the floating point types do not.

There is a requirement that an address can be represented in a string
of binary digits, because an address can be stored in a pointer object
of appropriate type, that pointer object can be inspected as an array
of unsigned chars, and upon such inspection the pointer object must
contain bits and nothing but bits.

This same possibility of inspection as the bits contained in an array
of unsigned characters also applies to the floating point types, but
the interpretation or meaning of those bits is totally unspecified by
the standard.

The standard does require that if an implementation provides an
integer type wide enough to contain a pointer, assignment with a cast
of a pointer value to that integer type and back again with a cast to
the original pointer type will yield an identical pointer. C99 even
defines typedefs to be used for such a type, intptr_t and uintptr_t,
although they are optional. I think it would have been preferable for
the standard to require the typedefs if such an integer type existed, the
way it requires the exact width definitions.

The standard does not require or guarantee that you can do anything
useful with a converted pointer in such a type, other than converting
it back. In particular, there is no guarantee that:

char name [] = "name";
char *n = name;
uintptr_t up = (uintptr_t)n;
++up;
n = (char *)up;

...n now points to the 'a' in name, or has any valid value at all.

Addresses have absolutely no portability at all, even between
executions of the same program.

Can you please give an example that explains where the above scenario
won't work? On my machine (a Unix system), when I run the above example,
the char pointer "n" points to the 'a' in name.

On which platforms will "n" point to some invalid value?

Thanks in advance for any help.

Nov 14 '05 #14

Charles Sanders

junky_fellow wrote:

Jack Klein <ja*******@spamcop.net> wrote in message news:<j4********************************@4ax.com>...

[snip]

The standard does not require or guarantee that you can do anything
useful with a converted pointer in such a type, other than converting
it back. In particular, there is no guarantee that:

char name [] = "name";
char *n = name;
uintptr_t up = (uintptr_t)n;
++up;
n = (char *)up;

...n now points to the 'a' in name, or has any valid value at all.

Addresses have absolutely no portability at all, even between
executions of the same program.

Can you please give an example that explains where the above
scenario won't work? On my machine (a Unix system), when I run the
above example, the char pointer "n" points to the 'a' in name.

On which platforms will "n" point to some invalid value?

Thanks in advance for any help.

One case is CRAY Y-MP or similar. Character pointers had (have?)
the address of the 64-bit word containing the first byte in the
low order bits, and the offset within the word in the 3 high order
bits. The above code would have the same effect as adding 8 to
n, and the result would point 8 bytes past the "n", or 4 bytes
past the end of string "name". If the string happened to be at
the high end of the data segment, the result would most likely
point to an illegal address and accessing it would cause a signal.

If you are wondering why, the CRAY was a word addressed machine
and could only access whole words. This was the most efficient way
to have pointers to individual bytes. Char pointers could be
incremented with two instructions, an add of the 64 bit value
0x2000000000000000 (I hope I got the number of zeros right,
there should be 15 of them) followed by an add with carry of
zero. Other character pointer operations were similarly
efficient (although much slower than arithmetic on pointers to
int or float or double or a struct). Accessing a char value
involved shifting and masking, but the above representation was
better than most (all?) of the possible alternatives given the
hardware. By the way, these machines had sizeof(short) ==
sizeof(int) == sizeof(long) == 8, although not all the bits
of shorts (and, depending on compiler flags, ints) were significant.
Charles
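The layout described in this post can be modelled in ordinary C. The sketch below is a simulation under stated assumptions, not actual Cray compiler output: the type model_char_ptr, the field widths (61-bit word address, 3-bit byte offset), and all function names are made up to mirror the description. It shows why adding 1 to the integer form moves 8 bytes, while a real character increment adds 0x2000000000000000 and then folds the carry back in.

```c
#include <stdint.h>

/* Hypothetical model of the layout described above: the low 61 bits hold a
   word address, the high 3 bits hold the byte offset within that word. */
typedef uint64_t model_char_ptr;

#define OFFSET_SHIFT 61
#define WORD_MASK ((UINT64_C(1) << OFFSET_SHIFT) - 1)

/* Build a model pointer from a flat byte address. */
model_char_ptr model_from_byte_address(uint64_t addr)
{
    return ((addr % 8) << OFFSET_SHIFT) | (addr / 8);
}

/* Recover the flat byte address a model pointer refers to. */
uint64_t model_to_byte_address(model_char_ptr p)
{
    return (p & WORD_MASK) * 8 + (p >> OFFSET_SHIFT);
}

/* Character increment: add 1 to the offset field, then add the carry out
   of bit 63 into the word address (the "add with carry of zero" step). */
model_char_ptr model_increment(model_char_ptr p)
{
    uint64_t sum = p + (UINT64_C(1) << OFFSET_SHIFT);
    uint64_t carry = sum < p; /* 1 if the addition wrapped past 2^64 */
    return sum + carry;
}
```

In this model, a pointer to byte address 16 becomes a pointer to byte 24 after an integer `+ 1`, but a pointer to byte 17 after model_increment.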

Nov 14 '05 #15

James Kuyper

j0******@engineer.com (j0mbolar) wrote in message news:<2d**************************@posting.google.com>...

I've read in the standard that addresses
basically can't be interpreted as integers.
If they can, it is implementation defined
behavior. However, if they can't be viewed
as integers in any sense as far as portability
goes, what then, should one think of addresses
being composed of?

A pointer is represented by a string of bytes that could also be
interpreted as a number, but the standard contains no guarantees about
the relationship between that number and the memory location pointed
at. For instance, adding 1 to the number doesn't necessarily produce a
pointer to the position immediately after the position that the
original pointer pointed at. It might produce an invalid pointer, or a
pointer pointing at a completely different object. Also, two pointers
that contain different bit patterns might point at the same location.

The only portably useful thing to think about a pointer is that it
identifies the location of an object. In order to say something more
detailed, you have to restrict comments to particular implementations
of C.
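One well-known case where different bit patterns identify the same location is 8086 real-mode addressing, used here purely as an illustration; the post itself names no platform. In the sketch below, struct far_ptr and linear_address are invented names, and the translation rule (segment times 16 plus offset) is the real-mode one.

```c
#include <stdint.h>

/* Hypothetical segment:offset pointer, as used in 8086 real mode. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Real-mode translation: physical address = segment * 16 + offset. */
uint32_t linear_address(struct far_ptr p)
{
    return (uint32_t)p.segment * 16u + p.offset;
}
```

Under this rule 0x1234:0x0010 and 0x1235:0x0000 both translate to 0x12350, so comparing the raw bits of two pointers says nothing reliable about whether they point at the same place.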

Nov 14 '05 #16

junky_fellow

pete <pf*****@mindspring.com> wrote in message news:<41***********@mindspring.com>...

Jack Klein wrote:

On Mon, 30 Aug 2004 00:21:31 GMT, pete <pf*****@mindspring.com> wrote
in comp.lang.c:
j0mbolar wrote:
>
> I've read in the standard that addresses
> basically can't be interpreted as integers.
> If they can, it is implementation defined
> behavior. However, if they can't be viewed
> as integers in any sense as far as portability
> goes, what then, should one think of addresses
> being composed of?

Bases and offsets.

Can you provide any justification at all for your apparently
ridiculous assertion?

It's the way that pointers (addresses) relate to
each other with arithmetic and relational operators.
You can't add two pointers together,
because pointer types are not arithmetic types.
Relational operations for pointers, are only defined
for pointers which are offset from a common base.

The address of the lowest addressable byte of an object is
(char *)&object
and the address of the highest is
(char *)&object + sizeof object - 1

That's how I think of pointers.

Why is the conversion of a pointer type variable to an integer invalid?
What's the reason behind that?
I always had in my mind that a pointer variable contains some address,
which is some integer value, and that I can add/subtract after typecasting
the pointer variable to int.
Thanks in advance for any help/hints.

Nov 14 '05 #17

Thomas Matthews

junky_fellow wrote:

pete <pf*****@mindspring.com> wrote in message news:<41***********@mindspring.com>...
Jack Klein wrote:
On Mon, 30 Aug 2004 00:21:31 GMT, pete <pf*****@mindspring.com> wrote
in comp.lang.c:
j0mbolar wrote:

>I've read in the standard that addresses
>basically can't be interpreted as integers.
>If they can, it is implementation defined
>behavior. However, if they can't be viewed
>as integers in any sense as far as portability
>goes, what then, should one think of addresses
>being composed of?

Bases and offsets.

Can you provide any justification at all for your apparently
ridiculous assertion?

It's the way that pointers (addresses) relate to
each other with arithmetic and relational operators.
You can't add two pointers together,
because pointer types are not arithmetic types.
Relational operations for pointers, are only defined
for pointers which are offset from a common base.

The address of the lowest addressable byte of an object is
(char *)&object
and the address of the highest is
(char *)&object + sizeof object - 1

That's how I think of pointers.

Why is the conversion of a pointer type variable to an integer invalid?
What's the reason behind that?
I always had in my mind that a pointer variable contains some address,
which is some integer value, and that I can add/subtract after typecasting
the pointer variable to int.
Thanks in advance for any help/hints.

In the realm of embedded systems, there are many operations
that may need to be applied to an address.

One of those is testing an address for alignment to a certain
boundary. In order to perform this operation, the address must
be converted to an integral quantity and then tested with the bit
manipulation operators. For example, testing to see if a pointer is
pointing to a location on an 8-byte (octet) boundary.
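The alignment test described above is commonly written as in the sketch below. The function name is illustrative, and the code assumes an implementation that provides uintptr_t and maps pointers to integers so that the low bits reflect the byte address. That holds on flat, byte-addressed machines but is not guaranteed by the standard.

```c
#include <stdint.h>

/* Nonzero if the integer form of p is a multiple of 8.
   Relies on an implementation-defined pointer-to-integer mapping. */
int is_aligned_8(const void *p)
{
    return ((uintptr_t)p & (uintptr_t)7) == 0;
}
```

On a machine that keeps a byte offset in the high-order bits of a pointer, the same expression would test something else entirely.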
--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.comeaucomputing.com/learn/faq/
Other sites:
http://www.josuttis.com -- C++ STL Library book

Nov 14 '05 #18

David Adrien Tanguay

junky_fellow wrote:

char name [] = "name";
char *n = name;
uintptr_t up = (uintptr_t)n;
++up;
n = (char *)up;

...n now points to the 'a' in name, or has any valid value at all.

Addresses have absolutely no portability at all, even between
executions of the same program.

Can you please give an example that explains where the above scenario
won't work? On my machine (a Unix system), when I run the above example,
the char pointer "n" points to the 'a' in name.

On which platforms will "n" point to some invalid value?

Bull mainframes. The segment is in the low order bits, so incrementing 'up'
will change the segment, not the offset. The result could be a pointer to
some non-obvious place in your program space (data or instruction), or an
invalid pointer fault if you try to use 'n'.
--
David Tanguay http://www.sentex.ca/~datanguayh/
Kitchener, Ontario, Canada [43.24N 80.29W]

Nov 14 '05 #19

Keith Thompson

Thomas Matthews <Th****************************@sbcglobal.net> writes:
[...]

In the realm of embedded systems, there are many operations
that may need to be applied to an address.

One of those is testing an address for alignment to a certain
boundary. In order to perform this operation, the address must
be converted to an integral quantity and then tested with the bit
manipulation operators. For example, testing to see if a pointer is
pointing to a location on an 8-byte (octet) boundary.

That operation cannot be done portably -- but you're not likely to find a
Cray vector processor in an embedded system. If you're programming
for an embedded system, you probably need to write some non-portable
code anyway.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #20

Andrey Tarasevich

j0mbolar wrote:

I've read in the standard that addresses
basically can't be interpreted as integers.
If they can, it is implementation defined
behavior. However, if they can't be viewed
as integers in any sense as far as portability
goes, what then, should one think of addresses
being composed of?

Nothing. You don't need to think about it at all. Especially if you are
talking about portability. There's no portable context in C language
that relies in any way on the internal representation of a pointer.

--
Best regards,
Andrey Tarasevich

Nov 14 '05 #21

David R Tribble

junky_fellow wrote:

Why is the conversion of a pointer type variable to an integer invalid?
What's the reason behind that?
I always had in my mind that a pointer variable contains some address,
which is some integer value, and that I can add/subtract after typecasting
the pointer variable to int.

Many old C compilers (some of which still exist) for 16-bit MS-DOS typically
had the habit of converting pointers, which were composed of a 16-bit base
segment address and a 16-bit byte offset within the segment, into 16-bit
ints by simply truncating the high-order segment; the result was the 16-bit
offset within the segment. Converting such an int value back into a pointer
did not always work, since the compiler had to assume a base segment (usually
the current data segment DS), which was not necessarily correct (e.g.,
because the original pointer came from the stack segment SS or from the
FAR heap).

I seem to recall some old MS-DOS compilers converting pointers (composed of
16-bit base plus 16-bit offset) into 32-bit long ints by simply copying
the 16+16 bit address into the 32-bit int. Doing any kind of arithmetic
on the resulting integer value would then yield surprising results, since
incrementing the 16th bit shifted the address by 16 bytes, not 64K. It required
special macros (usually found in some system header file) to extract the
segment and offset portions and then other macros to put them back into
pointer form.
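The two conversions described in this post can be sketched as plain integer arithmetic. This is a simulation with invented function names, not the code of any particular MS-DOS compiler. It assumes the segment sits in the high 16 bits of the packed value and that real-mode translation applies (segment times 16 plus offset).

```c
#include <stdint.h>

/* Pack a segment (high 16 bits) and an offset (low 16 bits) into 32 bits. */
uint32_t pack_far(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 16) | offset;
}

/* Real-mode address a packed value refers to: segment * 16 + offset. */
uint32_t packed_to_linear(uint32_t packed)
{
    return (packed >> 16) * 16u + (packed & 0xFFFFu);
}

/* Conversion to a 16-bit int: the segment is simply discarded. */
uint16_t truncate_to_near(uint32_t packed)
{
    return (uint16_t)(packed & 0xFFFFu);
}
```

Adding 0x10000 to a packed value bumps the segment by one, which moves the real address by only 16 bytes. After truncate_to_near the segment is gone and cannot be recovered from the integer alone.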

Ah, the joys of the Intel segmented architecture!

-drt

Nov 14 '05 #22

Mabden

"David R Tribble" <da***@tribble.com> wrote in message
news:f4**************************@posting.google.com...

Ah, the joys of the Intel segmented architecture!

Dunno for certain, but can't we thank Microsoft for that? I don't think the
segment / offset was part of the Intel hardware, but an OS thing. I don't
recall having to go to new hardware when MS decided the flat memory model
was better...

--
Mabden

Nov 14 '05 #23

James Kuyper

ju**********@yahoo.co.in (junky_fellow) wrote in message news:<8c**************************@posting.google.com>...
....

Why is the conversion of a pointer type variable to an integer invalid?

It's not necessarily invalid. If, after #include <stdint.h>, you find
that INTPTR_MAX has been #defined, then you can safely convert a
pointer value to an intptr_t. The result of that conversion can itself
be converted back to the same pointer type, in which case it will
compare equal to the original pointer value.

The problem is that the only useful thing the standard guarantees
about that integer value is the reverse conversion. Each
implementation can do its own thing, and there's absolutely nothing
else that portable code can count on.

What's the reason behind that?

It's invalid, when INTPTR_MAX hasn't been #defined, because that means
that pointers on this platform are too big to be stored in any integer
type.

The reason the standard doesn't provide any more useful information
about the converted pointer's value, is that many different machines
provide many different and mutually incompatible ways of defining such
a conversion. The standard, rather than trying to list all the
possible ways, simply gives up and says "don't ask me!".

I always had in my mind that a pointer variable contains some address,
which is some integer value, and that I can add/subtract after typecasting
the pointer variable to int.

That's a nice thing to believe, and it's true on many platforms. It's
not true on others. If you want your code to be portable, make sure
that it doesn't rely upon that assumption being true.
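The INTPTR_MAX check described above might look like the following sketch. The function name and its return convention are made up for the example. It uses a pointer to void, which is the case the C99 text covers for intptr_t.

```c
#include <stdint.h>

/* Returns 1 if converting p to intptr_t and back yields an equal pointer,
   0 if it does not, and -1 if this implementation has no intptr_t. */
int round_trip_ok(void *p)
{
#ifdef INTPTR_MAX
    intptr_t ip = (intptr_t)p;   /* value is implementation-defined */
    return (void *)ip == p;      /* the only guaranteed property */
#else
    (void)p;
    return -1;
#endif
}
```

Portable code can rely on the result being 1 wherever intptr_t exists, and on nothing about the numeric value of ip in between.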

Nov 14 '05 #24

Brian Inglis

On Tue, 31 Aug 2004 00:48:10 GMT in comp.std.c, "Mabden"
<mabden@sbc_global.net> wrote:

"David R Tribble" <da***@tribble.com> wrote in message
news:f4**************************@posting.google.com...
Ah, the joys of the Intel segmented architecture!

Dunno for certain, but can't we thank Microsoft for that? I don't think the
segment / offset was part of the Intel hardware, but an OS thing. I don't
recall having to go to new hardware when MS decided the flat memory model
was better...

Segment addresses didn't become segment selectors until 286 protected
mode came out; and the flat memory model wasn't available until the
386, when the OS got the choice of running in real, 286 or 386
protected mode (remember the different Windows versions for each),
with multi-megabyte selector sizes which allowed the OS to set the
same base address and length for all the selectors to allow a flat
address space (and virtual 86 mode for a process in protected mode).

--
Thanks. Take care, Brian Inglis Calgary, Alberta, Canada

Br**********@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
fake address use address above to reply

Nov 14 '05 #25

junky_fellow

ku****@wizard.net (James Kuyper) wrote in message news:<8b**************************@posting.google.com>...

j0******@engineer.com (j0mbolar) wrote in message news:<2d**************************@posting.google.com>...
I've read in the standard that addresses
basically can't be interpreted as integers.
If they can, it is implementation defined
behavior. However, if they can't be viewed
as integers in any sense as far as portability
goes, what then, should one think of addresses
being composed of?

A pointer is represented by a string of bytes that could also be
interpreted as a number, but the standard contains no guarantees about
the relationship between that number and the memory location pointed
at. For instance, adding 1 to the number doesn't necessarily produce a
pointer to the position immediately after the position that the
original pointer pointed at. It might produce an invalid pointer, or a
pointer pointing at a completely different object. Also, two pointers
that contain different bit patterns might point at the same location.

The only portably useful thing to think about a pointer is that it
identifies the location of an object. In order to say something more
detailed, you have to restrict comments to particular implementations
of C.

Do we get any advantage by having "no relation between the pointer
value (interpreted as a number) and the memory location pointed at"?
If not, then why make things more complex?
Why not represent the pointer as the integer address of the memory
location it is pointing to?

Nov 14 '05 #26

Douglas A. Gwyn

junky_fellow wrote:

Why is the conversion of a pointer type variable to an integer invalid?

Why should it be valid? They're entirely different
kinds of thing, with different properties and uses.

I always had in my mind that a pointer variable contains some address,
which is some integer value, and that I can add/subtract after typecasting
the pointer variable to int.

Not always, as has been explained by several recent
postings.

Nov 14 '05 #27

Douglas A. Gwyn

junky_fellow wrote:

do we get any advantage by having "no relation between the pointer
value(interpreted as a number) and the memory location pointed at" ?
Yes. It allows the C implementation to present the
"address" encoding in the most natural manner for the
particular system. If the system does not have a
flat, byte-addressable data memory organization,
then pretending that it has one would involve
unnecessary complexity and serve no useful purpose.
why not represent the pointer as the integer address of the memory
location it is pointing to ?

There may be no such thing!

Nov 14 '05 #28

Brian Inglis

On 30 Aug 2004 22:25:14 -0700 in comp.std.c, ju**********@yahoo.co.in
(junky_fellow) wrote:

ku****@wizard.net (James Kuyper) wrote in message news:<8b**************************@posting.google. com>...
j0******@engineer.com (j0mbolar) wrote in message news:<2d**************************@posting.google. com>...
> I've read in the standard that addresses
> basically can't be interpreted as integers.
> If they can, it is implementation defined
> behavior. However, if they can't be viewed
> as integers in any sense as far as portability
> goes, what then, should one think of addresses
> being composed of?
A pointer is represented by a string of bytes that could also be
interpreted as a number, but the standard contains no guarantees about
the relationship between that number and the memory location pointed
at. For instance, adding 1 to the number doesn't necessarily produce a
pointer to the position immediately after the position that the
original pointer pointed at. It might produce an invalid pointer, or a
pointer pointing at a completely different object. Also, two pointers
that contain different bit patterns might point at the same location.

The only portably useful thing to think about a pointer is that it
identifies the location of an object. In order to say something more
detailed, you have to restrict comments to particular implementations
of C.

do we get any advantage by having "no relation between the pointer
value(interpreted as a number) and the memory location pointed at" ?

There is a relation, but it's not always the obvious, expected one;
sometimes it's just the hardware, and sometimes the compiler has to
help out inadequate hardware.
if not then, why making things more complex ?
Compilers don't make things any more complex than the hardware and
language require, and the language often doesn't require anything more
than documenting strange behaviour.
why not represent the pointer as the integer address of the memory
location it is pointing to ?

That's not always how the hardware works, and even if it is, it may
not directly support all the language requirements (read as:
programmer expectations), and may need some compiler help.

--
Thanks. Take care, Brian Inglis Calgary, Alberta, Canada

Br**********@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
fake address use address above to reply

Nov 14 '05 #29

Dan Pop

In <8b**************************@posting.google.com > ku****@wizard.net (James Kuyper) writes:

ju**********@yahoo.co.in (junky_fellow) wrote in message news:<8c**************************@posting.google. com>...
...
why the conversion of a pointer type variable to integer invalid ?

It's not necessarily invalid. If, after #include <stdint.h>, you find
that INTPTR_MAX has been #defined, then you can safely convert a
pointer value to an intptr_t. The result of that conversion can itself
be converted back to the same pointer type, in which case it will
compare equal to the original pointer value.

The problem is that the only useful thing the standard guarantees
about that integer value is the reverse conversion. Each
implementation can do its own thing, and there's absolutely nothing
else that portable code can count on.
what's the reason behind that ?

It's invalid, when INTPTR_MAX hasn't been #defined, because that means
that pointers on this platform are too big to be stored in any integer
type.

Can I have a chapter and verse for that?

The implementor is free not to provide [u]intptr_t and the associated
macros, regardless of how the conversion between pointers and integers
works. It's a quality of implementation issue and the lack of INTPTR_MAX
doesn't mean that (uintmax_t)ptr necessarily invokes undefined behaviour
or yields a meaningless result.
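
A minimal sketch of the check under discussion, assuming a C99 <stdint.h>. The helper name `round_trip_ok` is made up for illustration. It exercises only the one guarantee the standard attaches to uintptr_t (a valid void * converted there and back compares equal), and it reports -1 where the optional type is simply not provided, which by itself says nothing about how wide pointers are.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if the pointer survives a round trip
 * through uintptr_t, 0 if it does not, and -1 if the implementation
 * does not provide uintptr_t at all (the type is optional in C99). */
int round_trip_ok(void *p)
{
#ifdef UINTPTR_MAX
    uintptr_t n = (uintptr_t)p;   /* the value itself is implementation-defined */
    void *q = (void *)n;          /* guaranteed to compare equal to p */
    return q == p;
#else
    (void)p;
    return -1;
#endif
}
```

Nothing else about the integer `n` is portable; only the comparison after converting back is.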

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 14 '05 #30

Dan Pop

In <8c**************************@posting.google.com > ju**********@yahoo.co.in (junky_fellow) writes:

why the conversion of a pointer type variable to integer invalid ?
The standard says that it's valid (with one exception) and that the
result is implementation-defined.
what's the reason behind that ?
It may be possible (AS/400 springs to mind) that no integer type is
wide enough to hold the result of the conversion. This is the exception
mentioned above.

As for the implementation-defined result, there are architectures, like
the 8086, where the pointer value is not the same as the address pointed
to and most addresses can have 4096 different representations as pointer
values.
i always had in my mind that pointer variable contains some address,
which is some integer value ?
In some cases, see above, it may be more than one integer value. The
actual address is computed by the CPU itself, from these numbers.
and i can add/subtract after typecasting the pointer variable to int.

You can do that, but the result need not have the desired/expected
meaning. You have to understand how the conversion works on a given
platform (the implementation must document it) in order to do this kind
of thing in a meaningful way. Which means that the code cannot be
expected to work as intended on another implementation.
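
To make the "4096 different representations" figure concrete, here is a small sketch that counts them. It assumes the usual real-mode rule (linear address = segment * 16 + offset) and ignores wrap-around past 1 MB; the function name `count_aliases` is invented for this example.

```c
#include <stdint.h>

/* Count the segment:offset pairs that name one 20-bit linear address,
 * using linear = segment * 16 + offset and ignoring wrap-around. */
unsigned count_aliases(uint32_t linear)
{
    unsigned count = 0;
    for (uint32_t seg = 0; seg <= 0xFFFFu; seg++) {
        uint32_t base = seg << 4;               /* start of this segment */
        if (base <= linear && linear - base <= 0xFFFFu)
            count++;                            /* offset fits in 16 bits */
    }
    return count;
}
```

Under these assumptions `count_aliases(0x12345)` gives 4096, while very low addresses have fewer aliases, which is why the claim is about "most" addresses.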

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 14 '05 #31

james8049

junky_fellow wrote:

*k*****@wizard.net (James Kuyper) wrote in message
why not represent the pointer as the integer address of the memory
location it is pointing to ? *

I think the point here is that we have had it easy for the past ten
years or so, in that with most compilers on most machines a pointer and an
unsigned integer were the same thing. This of course assumes that you
were programming for INTEL x86, SPARC, Power or HP-RISS.

However the world has changed! Depending on which compiler, the
compiler options and precisely what your latest hardware upgrade was,
"unsigned int" could be 32 or 64 bits and " * ptr" could be 32 or 64
bits.

The only portable and safe way to do pointer arithmetic is with
subscripts. e.g.
int * ptr;
ptr = &ptr[1]; /* next integer */

Will work whatever the size of your integer, and, whatever the size of
your address.
-
james8049
-----------------------------------------------------------------------
Posted via http://www.codecomments.co
-----------------------------------------------------------------------

Nov 14 '05 #32

Chris Dollin

james8049 wrote:

The only portable and safe way to do pointer arithmetic is with
subscripts. e.g.
int * ptr;
ptr = &ptr[1]; /* next integer */

Will work whatever the size of your integer, and, whatever the size of
your address.

As does `ptr += 1`.
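
A short sketch of the two spellings side by side, assuming the pointer refers to elements of a real array (the quoted fragment leaves `ptr` uninitialized). The function names are made up for illustration. Both loops step by one element whatever sizeof(int) or the pointer width happens to be.

```c
#include <stddef.h>

/* Sum n ints, advancing the pointer with p += 1. */
long sum_by_increment(const int *p, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += *p;
        p += 1;            /* same step as p = &p[1] */
    }
    return total;
}

/* Same walk, advancing with the subscript form quoted above. */
long sum_by_subscript(const int *p, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += *p;
        p = &p[1];         /* &p[1] is defined as p + 1 */
    }
    return total;
}
```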

--
Chris "electric hedgehog" Dollin
C FAQs at: http://www.faqs.org/faqs/by-newsgrou...mp.lang.c.html
C welcome: http://www.angelfire.com/ms3/bchambl...me_to_clc.html

Nov 14 '05 #33

Dan Pop

In <10*************@news.supernews.com> Andrey Tarasevich <an**************@hotmail.com> writes:

j0mbolar wrote:
I've read in the standard that addresses
basically can't be interpreted as integers.
If they can, it is implementation defined
behavior. However, if they can't be viewed
as integers in any sense as far as portability
goes, what then, should one think of addresses
being composed of?

Nothing. You don't need to think about it at all. Especially if you are
talking about portability. There's no portable context in C language
that relies in any way on the internal representation of a pointer.

It's not the internal representation of pointers that really matters,
it's the result of *converting* a pointer to an integer.

If this conversion had well defined semantics, one could use it to
perform operations that are otherwise impossible in C, e.g. checking
if a pointer value is within a certain object without comparing it
against the address of each byte in that array or figuring out the
alignment of a certain pointer value or even displaying a pointer
value in a well defined format (%p accepts no flags, field width or
precision specifications).
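
For reference, the byte-by-byte check alluded to above can be sketched as follows. The name `points_into` is hypothetical. It uses only ==, since relational comparison of pointers into different objects is undefined, and that is exactly why it costs one comparison per byte.

```c
#include <stddef.h>

/* Returns 1 if p points at a byte of the object that starts at obj and
 * is size bytes long, 0 otherwise. Only equality comparison is used. */
int points_into(const void *p, const void *obj, size_t size)
{
    const unsigned char *base = obj;
    for (size_t i = 0; i < size; i++) {
        if ((const unsigned char *)p == base + i)
            return 1;
    }
    return 0;
}
```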

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 14 '05 #34

James Kuyper

Dan Pop wrote:

In <8b**************************@posting.google.com > ku****@wizard.net (James Kuyper) writes:

....

It's invalid, when INTPTR_MAX hasn't been #defined, because that means
that pointers on this platform are too big to be stored in any integer
type.

Can I have a chapter and verse for that?

The implementor is free not to provide [u]intptr_t and the associated
macros, regardless of how the conversion between pointers and integers
works. It's a quality of implementation issue and the lack of INTPTR_MAX
doesn't mean that (uintmax_t)ptr necessarily invokes undefined behaviour
or yields a meaningless result.

OK, if you prefer, replace "invalid" with "unpredictable and probably
won't work", and replace "too big" with "probably too big". I personally
would consider that to be essentially the same thing. Code that doesn't
reliably achieve the goal I've set for it, doesn't achieve that goal,
because reliability is part of the goal.

Nov 14 '05 #35

David R Tribble

David R Tribble wrote:

Ah, the joys of the Intel segmented architecture!
Mabden wrote:

Dunno for certain, but can't we thank Microsoft for that? I don't think the
segment / offset was part of the Intel hardware, but an OS thing. I don't
recall having to go to new hardware when MS decided the flat memory model
was better...

Your first PC must have been a 386. As Brian explains some of the
history...

Brian Inglis wrote:

Segment addresses didn't become segment selectors until 286 protected
mode came out; and the flat memory model wasn't available until the
386, when the OS got the choice of running in real, 286 or 386
protected mode (remember the different Windows versions for each),
with multi-megabyte selector sizes which allowed the OS to set the
same base address and length for all the selectors to allow a flat
address space (and virtual 86 mode for a process in protected mode).

The Intel 8086 (8088) had a 20-bit (1 MB) address space. Addresses were
composed of a 16-bit segment and a 16-bit offset; an address was formed by
shifting the segment by 4 bits and adding the offset, resulting in a 20-bit
byte address.
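
As a rough sketch of that rule in C (the function name is invented here, and the mask models the 8086's 20 address lines, so sums past 1 MB wrap):

```c
#include <stdint.h>

/* Real-mode 8086 address formation: shift the 16-bit segment left by
 * four bits and add the 16-bit offset. The mask keeps 20 bits. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

For example, segment 0x1234 with offset 0x0005 yields 0x12345.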

The 286 enhanced the model by adding a mode that treated the 16-bit
segment of an address as a "segment selector", which chose a given 64 KB
segment from within a 24-bit (16 MB) total address space. Pointer
arithmetic consequently was even more complicated in this mode.

The 386 further enhanced the addressing scheme by adding a 32-bit mode
supporting 32-bit offsets within 32-bit (4 GB) segments, yielding a total
memory space of 4 GB (or more in later models). This is the so-called
"linear" addressing model. But it's not completely linear because each
pointer still uses an implied segment selector (depending on the
instruction it's used with). Most programmers don't notice this because
most OS's built on this architecture initialize all of the segments
(there are six) to overlap and begin at the same base address, so that
it acts like a flat 32-bit address space.
So it's incorrect to assume that, on a 386 architecture, a given
byte address can be truly represented by a 32-bit integer value.
It's a function of the way the operating system has chosen to use
the underlying segment registers (see above). Assuming that the 4 GB
segments are all aligned and overlapping, you can convert a byte
address into a unique 32-bit integer. But on systems where you can't
make this assumption, a byte address translates into a 32-bit offset
within a particular 4 GB segment within physical memory.
It's also incorrect to accuse Microsoft of deciding that a linear memory
model was better before there even existed PC hardware that supported
such a thing. Sure, Microsoft probably made some bad design choices
along the way (e.g., the way their compilers performed pointer/integer
conversions), but they didn't have much choice because of the funky
segmented addressing model of the Intel PC hardware. Microsoft didn't
make the machines, after all, they just wrote software for them.

-drt

Nov 14 '05 #36

Michael Wojcik

In article <ch**********@sunnews.cern.ch>, Da*****@cern.ch (Dan Pop) writes:

If this conversion had well defined semantics, one could use it to
perform operations that are otherwise impossible in C, e.g. ...
or even displaying a pointer
value in a well defined format (%p accepts no flags, field width or
precision specifications).

Of course, for some implementations it's hard to see how %p plus
hypothetical flags or precision modifiers would produce a well-
defined format. On the AS/400, for example, %p produces a relatively
verbose description of the address, including object space name and
offset.

But there's always the array-of-unsigned-char representation, which
*is* well-defined anywhere; the only variable is its length.
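
A minimal sketch of that representation, assuming 8-bit bytes so that each byte prints as two hex digits. The helper name `pointer_bytes_hex` is made up. The digits come out in whatever byte order the platform stores them, so the output is well-defined but not necessarily easy to read.

```c
#include <stdio.h>
#include <string.h>

/* Writes the object representation of p as hex digits into out, which
 * must have room for 2 * sizeof p + 1 characters (assumes CHAR_BIT == 8).
 * Returns the number of bytes in the representation. */
size_t pointer_bytes_hex(const void *p, char *out)
{
    unsigned char bytes[sizeof p];
    memcpy(bytes, &p, sizeof p);          /* copy the pointer's own bytes */
    for (size_t i = 0; i < sizeof p; i++)
        sprintf(out + 2 * i, "%02x", (unsigned)bytes[i]);
    return sizeof p;
}
```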

--
Michael Wojcik mi************@microfocus.com

Reversible CA's are -automorphisms- on shift spaces. It is a notorious
fact in symbolic dynamics that describing such things on a shift of finite
type are -fiendishly- difficult. -- Chris Hillman

Nov 14 '05 #37

Michael Wojcik

In article <dm********************************@4ax.com>, Brian Inglis <Br**********@SystematicSW.Invalid> writes:

There is a relation, but it's not always the obvious, expected one;
sometimes it's just the hardware, and sometimes the compiler has to
help out inadequate hardware.

And sometimes there is another layer between the C implementation
and the hardware. C addresses in the AS/400 implementations I've
used have no correspondence to hardware addresses; that mapping is
handled by the LIC layer. Which is one of the reasons why the same
compiled C program can run on the two different AS/400 hardware
platforms (the early CISC and the later POWER).

C is not required to run close to the metal, regardless of the
"adequacy" of the hardware.

--
Michael Wojcik mi************@microfocus.com

It wasn't fair; my life was now like everyone else's. -- Eric Severance

Nov 14 '05 #38

Kenneth Brody

David R Tribble wrote:
[... Intel x86 memory architecture ...]

The 386 further enhanced the addressing scheme by adding a 32-bit mode
supporting 32-bit offsets within 32-bit (4 GB) segments, yielding a total
memory space of 4 GB (or more in later models). This is the so-called
"linear" addressing model. But it's not completely linear because each
pointer still uses an implied segment selector (depending on the
instruction it's used with). Most programmers don't notice this because
most OS's built on this architecture initialize all of the segments
(there are six) to overlap and begin at the same base address, so that
it acts like a flat 32-bit address space.
Actually, the code segment (CS) typically points to different memory, so
you can't accidentally try to execute data. (At least this is how "real"
operating systems do it.) The rest (DS, ES, FS, GS, and SS) typically
point to the same memory, so that all data references can use a "near"
32-bit pointer.

[...] It's also incorrect to accuse Microsoft of deciding that a linear memory
model was better before there even existed PC hardware that supported
such a thing. Sure, Microsoft probably made some bad design choices
along the way (e.g., the way their compilers performed pointer/integer
conversions), but they didn't have much choice because of the funky
segmented addressing model of the Intel PC hardware. Microsoft didn't
make the machines, after all, they just wrote software for them.

And they did have Xenix on the 68000 platform before they had DOS on 8088.

--
+-------------------------+--------------------+-----------------------------+
| Kenneth J. Brody | www.hvcomputer.com | |
| kenbrody/at\spamcop.net | www.fptech.com | #include <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------------+

Nov 14 '05 #39

Dan Pop

In <41**************@saicmodis.com> James Kuyper <ku****@saicmodis.com> writes:

Dan Pop wrote:
In <8b**************************@posting.google.com > ku****@wizard.net (James Kuyper) writes:

...
It's invalid, when INTPTR_MAX hasn't been #defined, because that means
that pointers on this platform are too big to be stored in any integer
type.

Can I have a chapter and verse for that?

The implementor is free not to provide [u]intptr_t and the associated
macros, regardless of how the conversion between pointers and integers
works. It's a quality of implementation issue and the lack of INTPTR_MAX
doesn't mean that (uintmax_t)ptr necessarily invokes undefined behaviour
or yields a meaningless result.

OK, if you prefer, replace "invalid" with "unpredictable and probably
won't work", and replace "too big" with "probably too big". I personally
would consider that to be essentially the same thing. Code that doesn't
reliably achieve the goal I've set for it, doesn't achieve that goal,
because reliability is part of the goal.

Code that depends on an optional feature of the language cannot reliably
achieve its goal, period. And the existence of uintptr_t doesn't prove
that the result of the conversion is suitable for *any* purpose other
than replacing pointer comparison for equality by integer comparison
for equality; the standard doesn't guarantee any other property for the
result of the conversion to uintptr_t. If p < q, the standard allows
(uintptr_t)p to be greater than (uintptr_t)q.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 14 '05 #40

Dan Pop

In <ch*********@news3.newsguy.com> mw*****@newsguy.com (Michael Wojcik) writes:

In article <ch**********@sunnews.cern.ch>, Da*****@cern.ch (Dan Pop) writes:

If this conversion had well defined semantics, one could use it to
perform operations that are otherwise impossible in C, e.g. ...
or even displaying a pointer
value in a well defined format (%p accepts no flags, field width or
precision specifications).
Of course, for some implementations it's hard to see how %p plus
hypothetical flags or precision modifiers would produce a well-
defined format. On the AS/400, for example, %p produces a relatively
verbose description of the address, including object space name and
offset.

Which doesn't mean that it wouldn't be useful for the vast majority of
implementations. Not very useful, because few applications need %p at
all, but quite useful for the few that do (most likely, for debugging
purposes).
But there's always the array-of-unsigned-char representation, which
*is* well-defined anywhere; the only variable is its length.

The array-of-unsigned-char representation is not particularly meaningful,
as its interpretation is affected by byte order issues (and,
theoretically, by padding bits issues).

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 14 '05 #41

Wojtek Lerch

Dan Pop wrote:

Code that depends on an optional feature of the language cannot reliably
achieve its goal, period. And the existence of uintptr_t doesn't prove
that the result of the conversion is suitable for *any* purpose other
than replacing pointer comparison for equality by integer comparison
for equality; [...]

Where does the standard promise that if two pointers compare equal, then
converting them to uintptr_t will produce the same value?

Nov 14 '05 #42

Dan Pop

In <eV******************@newssvr27.news.prodigy.com > "Mabden" <mabden@sbc_global.net> writes:

"David R Tribble" <da***@tribble.com> wrote in message
news:f4**************************@posting.google. com...
Ah, the joys of the Intel segmented architecture!
Dunno for certain, but can't we thank Microsoft for that? I don't think the
segment / offset was part of the Intel hardware, but an OS thing.

That's either patent ignorance or patent stupidity. The segment/offset
model for memory addressing has *always* been part of the Intel 80x86
architecture.
I don't
recall having to go to new hardware when MS decided the flat memory model
was better...

I do: the flat memory model made sense only when the Intel processors
started to support 4 GB segments, so that the whole system could use a
single segment. Before the 386, a flat memory model would have meant
a system restricted to a 64 kB address space. And MSDOS compilers
did offer such a model, the "tiny" memory model. The fans of von Neumann
architectures could also use the "small" memory model, with a 64 kB
data address space and 64 kB code address space. Both models were
purely linear, with 16-bit data and function pointers.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 14 '05 #43

Dan Pop

In <2p************@uni-berlin.de> Wojtek Lerch <Wo******@yahoo.ca> writes:

Dan Pop wrote:
Code that depends on an optional feature of the language cannot reliably
achieve its goal, period. And the existence of uintptr_t doesn't prove
that the result of the conversion is suitable for *any* purpose other
than replacing pointer comparison for equality by integer comparison
for equality; [...]

Where does the standard promise that if two pointers compare equal, then
converting them to uintptr_t will produce the same value?

The result of the conversion from pointer to integer being implementation
defined, it's not clear if the implementation is free to throw in some
random bits in the result of the conversion, that are ignored when the
integer is converted back to pointer.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 14 '05 #44

David Hopwood

Dan Pop wrote:

Wojtek Lerch <Wo******@yahoo.ca> writes:
Dan Pop wrote:
Code that depends on an optional feature of the language cannot reliably
achieve its goal, period. And the existence of uintptr_t doesn't prove
that the result of the conversion is suitable for *any* purpose other
than replacing pointer comparison for equality by integer comparison
for equality; [...]

Where does the standard promise that if two pointers compare equal, then
converting them to uintptr_t will produce the same value?

The result of the conversion from pointer to integer being implementation
defined, it's not clear if the implementation is free to throw in some
random bits in the result of the conversion, that are ignored when the
integer is converted back to pointer.

I would have thought it was clear that it *is* free to do this (or to
convert pointers to uintptr_t values that are not unique for other reasons,
for instance because they include segment selectors). uintptr_t is in
general not useful except in non-portable code that "knows" what the
implementation-defined mapping is.

David Hopwood <da******************@blueyonder.co.uk>

Nov 14 '05 #45

Niklas Matthies

On 2004-08-31 18:06, Dan Pop wrote:

In <2p************@uni-berlin.de> Wojtek Lerch <Wo******@yahoo.ca> writes:
Dan Pop wrote:
Code that depends on an optional feature of the language cannot reliably
achieve its goal, period. And the existence of uintptr_t doesn't prove
that the result of the conversion is suitable for *any* purpose other
than replacing pointer comparison for equality by integer comparison
for equality; [...]

Where does the standard promise that if two pointers compare equal, then
converting them to uintptr_t will produce the same value?

The result of the conversion from pointer to integer being implementation
defined, it's not clear if the implementation is free to throw in some
random bits in the result of the conversion, that are ignored when the
integer is converted back to pointer.

The implementation may not need to throw in "random bits" in the
result or ignore them on converting back for this to happen. Pointers
that compare equal are not required to have the same representation.
If the conversion is based on pointer representation rather than on
abstract pointer value (as defined by equality comparison), then what
Wojtek Lerch describes happens naturally.

With x86 16-bit segment:offset pointers, for example, 000A:7B60 and
000B:7B50 will compare equal, and may well convert to 0x000A7B60 and
0x000B7B50, respectively.
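
The example can be modelled in a few lines. This is a toy model only: `struct far_ptr` and both functions are invented for illustration and do not describe any particular compiler's far-pointer layout or comparison rules.

```c
#include <stdint.h>

/* A toy model of a 16-bit segment:offset pointer. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Conversion based on representation: segment in the high 16 bits. */
uint32_t to_integer(struct far_ptr p)
{
    return ((uint32_t)p.segment << 16) | p.offset;
}

/* Equality based on the location named: segment * 16 + offset. */
int same_location(struct far_ptr a, struct far_ptr b)
{
    uint32_t la = ((uint32_t)a.segment << 4) + a.offset;
    uint32_t lb = ((uint32_t)b.segment << 4) + b.offset;
    return la == lb;
}
```

With 000A:7B60 and 000B:7B50, `same_location` reports a match (both name 0x07C00) while `to_integer` gives 0x000A7B60 and 0x000B7B50.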

-- Niklas Matthies

Nov 14 '05 #46

Eric Sosman

Dan Pop wrote:

In <eV******************@newssvr27.news.prodigy.com > "Mabden" <mabden@sbc_global.net> writes:

"David R Tribble" <da***@tribble.com> wrote in message
news:f4**************************@posting.google .com...
Ah, the joys of the Intel segmented architecture!

Dunno for certain, but can't we thank Microsoft for that? I don't think the
segment / offset was part of the Intel hardware, but an OS thing.

That's either patent ignorance or patent stupidity. The segment/offset
model for memory addressing has *always* been part of the Intel 80x86
architecture.

I don't
recall having to go to new hardware when MS decided the flat memory model
was better...

I do: the flat memory model made sense only when the Intel processors
started to support 4 GB segments, so that the whole system could use a
single segment. Before the 386, a flat memory model would have meant
a system restricted to a 64 kB address space. And MSDOS compilers
did offer such a model, the "tiny" memory model. The fans of von Neumann
architectures could also use the "small" memory model, with a 64 kB
data address space and 64 kB code address space. Both models were
purely linear, with 16-bit data and function pointers.

Dan's explanation of the "small" model also illustrates
a reason why data pointers and function pointers aren't
interconvertible or comparable.

uintptr_t dp = (uintptr_t) stdout; // data pointer
uintptr_t fp = (uintptr_t) printf; // function pointer

It is conceivable that `dp' and `fp' could be equal, but
quite obviously the original pointers `stdout' and `printf'
point to completely different things. "The" address 0x1234
might actually be two different locations with completely
different contents, depending on whether it's interpreted
as a data address or as a code address.
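
A toy simulation of the point, assuming a made-up machine with two independent 64 kB address spaces (all names here are hypothetical). The same 16-bit number selects a different byte depending on which space it is used with.

```c
#include <stdint.h>

/* Toy "small model" machine: separate code and data address spaces. */
static uint8_t code_space[0x10000];
static uint8_t data_space[0x10000];

void store_code(uint16_t addr, uint8_t v) { code_space[addr] = v; }
void store_data(uint16_t addr, uint8_t v) { data_space[addr] = v; }

uint8_t fetch_code(uint16_t addr) { return code_space[addr]; }
uint8_t fetch_data(uint16_t addr) { return data_space[addr]; }
```

Storing one value at code address 0x1234 and another at data address 0x1234 leaves both intact, so an integer alone cannot say which location is meant.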

--
Er*********@sun.com

Nov 14 '05 #47

James Kuyper

Dan Pop wrote:
....

... And the existence of uintptr_t doesn't prove
that the result of the conversion is suitable for *any* purpose other
than replacing pointer comparison for equality by integer comparison
for equality;
Citation, please - where does the standard say anything that connects
those two equality comparisons?

An entirely plausible situation, and one I've actually seen, is that two
pointers with different bit patterns can represent the same physical
location, and therefore compare equal, but they get converted to
integers that correspond to the bit patterns, and therefore compare unequal.

On the other hand, if the integers do compare equal, then it seems to me
that they must have been converted from pointers that compared equal.
... the standard doesn't guarantee any other property for the
result of the conversion to uintptr_t. ...

The only property the standard does guarantee is that the result, if
converted back to the original pointer type, compares equal to the
original pointer.

Nov 14 '05 #48

Kenneth Brody

Niklas Matthies wrote:
[...]

With x86 16-bit segment:offset pointers, for example, 000A:7B60 and
000B:7B50 will compare equal, and may well convert to 0x000A7B60 and
0x000B7B50, respectively.

Are you sure that those two pointers will compare equal? Yes, in real
mode, they point to the same physical address. But, who says that the
pointers need to be "normalized" before being compared?

It's been a while since I've done 16-bit real mode x86 programming, but
the last time I did, I'm pretty sure that the compiler (which probably
wouldn't qualify as "ANSI compliant" today) would not have compared the
above pointers as "equal" unless you were in huge model.

Does the standard say that two pointers must compare as "equal" if and
only if they point to the same memory location, regardless of the bit-
level representation of the pointers?

--
+-------------------------+--------------------+-----------------------------+
| Kenneth J. Brody | www.hvcomputer.com | |
| kenbrody/at\spamcop.net | www.fptech.com | #include <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------------+

Nov 14 '05 #49

Keith Thompson

Eric Sosman <Er*********@sun.com> writes:
[...]

Dan's explanation of the "small" model also illustrates
a reason why data pointers and function pointers aren't
interconvertible or comparable.

uintptr_t dp = (uintptr_t) stdout; // data pointer
uintptr_t fp = (uintptr_t) printf; // function pointer

It is conceivable that `dp' and `fp' could be equal, but
quite obviously the original pointers `stdout' and `printf'
point to completely different things. "The" address 0x1234
might actually be two different locations with completely
different contents, depending on whether it's interpreted
as a data address or as a code address.

Right, and that can happen (at least theoretically) even if the
hardware doesn't have separate address spaces for instructions and
data. A function pointer could plausibly be implemented as an index
into a table rather than as a machine address. (The AS/400 probably
does something at least that weird.)
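
The table idea can be imitated in ordinary C, purely as an illustration. Here `func_index`, `table` and `call_by_index` are invented names, and nothing below claims to describe how any real implementation encodes function pointers.

```c
/* Hypothetical scheme: a "function pointer" is just a small index. */
typedef unsigned func_index;
typedef int (*handler)(int);

static int twice(int x)  { return 2 * x; }
static int negate(int x) { return -x; }

/* The table the index refers to; the real addresses stay hidden here. */
static const handler table[] = { twice, negate };

int call_by_index(func_index i, int arg)
{
    return table[i](arg);   /* caller must pass a valid index */
}
```

In such a scheme the value 0 or 1 bears no relation to where the function's code lives in memory.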

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #50
