Printing a NULL pointer

On Tue, 14 Jun 2005 22:50:26 -0700, junky_fellow wrote:

Consider an implementation that doesn't use all bits 0 to represent
a NULL pointer. Let the NULL pointer be represented by 0x12345678.
On such an implementation, if the value of the NULL pointer is printed,
will it be all 0's or 0x12345678?

Either or neither. The implementation could choose to output it as, for
example, <null>
#include <stdio.h>

int main(void)
{
char *ptr;
ptr = 0;

/* %p expects a pointer to void, so cast explicitly */
printf("\nptr=%p\n", (void *)ptr);
return 0;
}
What would be the output, 0 or 0x12345678 ?
The standard does not specify the form of output of %p. There is no
requirement that it takes the form of a hex number, although it can and
some implementations do that.
I think the internal representation of the NULL pointer must be kept
transparent to the user.
That is certainly not a requirement. All that is required is that scanf()
%p can recreate the pointer from the output of printf() %p.
Even if
the implementation is using 0x12345678 for the NULL pointer, the value printed
should be all bits zero.

If you want full transparency then direct correspondence to any particular
bit pattern should be avoided.

Lawrence

Nov 14 '05 #3

On 15/06/2005 07:50, in
11*********************@g47g2000cwa.googlegroups.com,
ju**********@yahoo.co.in <ju**********@yahoo.co.in> wrote:

Consider an implementation that doesn't use all bits 0 to represent
a NULL pointer. Let the NULL pointer be represented by 0x12345678.
On such an implementation, if the value of NULL pointer is printed
will it be all 0's or 0x12345678

int main(void)
{

char *ptr;
ptr = 0;

printf("\nptr=%p\n",ptr);
}

What would be the output, 0 or 0x12345678 ?
I think user must be kept transparent from the internal representation
of NULL pointer. Even if the implementation is
using 0x12345678 for NULL pointer, value printed should be all
bits zero.

It seems dangerous if 0 is a valid address, and it probably will be
if 0x12345678 is NULL. In that case, I would prefer 0x12345678,
or even "(null)" or anything clearly announcing a NULL pointer.

Nov 14 '05 #4

junky_fellow

Lawrence Kirby wrote:

On Tue, 14 Jun 2005 22:50:26 -0700, junky_fellow wrote:
Consider an implementation that doesn't use all bits 0 to represent
a NULL pointer. Let the NULL pointer be represented by 0x12345678.
On such an implementation, if the value of NULL pointer is printed
will it be all 0's or 0x12345678

Either or neither. The implementation could choose to output it as, for
example, <null>
int main(void)
{

char *ptr;
ptr = 0;

printf("\nptr=%p\n",ptr);
}
What would be the output, 0 or 0x12345678 ?

The standard does not specify the form of output of %p. There is no
requirement that it takes the form of a hex number, although it can and
some implementations do that.
I think user must be kept
transparent from the internal representation of NULL pointer.

That is certainly not a requirement. All that is required is that scanf()
%p can recreate the pointer from the output of printf() %p
Even if
the implementation is using 0x12345678 for NULL pointer, value printed
should be all bits zero.

If you want full transparency then direct correspondence to any particular
bit pattern should be avoided.

Lawrence

Is there any way for the user to determine the internal
representation of a NULL pointer? I am asking because a memory
dump is sometimes analysed during debugging. In that case it would
be difficult to tell whether a value is a NULL pointer or not.

Nov 14 '05 #5

pete

ju**********@yahoo.co.in wrote:

Is there any way by which user can determine what is the internal
representation for a NULL pointer ? I am asking this because,
sometimes during debugging the memory dump is analysed. In that
case it would be difficult to find it is a NULL pointer or not.

/* BEGIN new.c */

#include <stdio.h>

int main(void)
{
void *pointer = NULL;
size_t byte;

for (byte = 0; byte != sizeof pointer; ++byte) {
printf(
"byte %lu is 0x%02x\n",
(long unsigned)byte,
(unsigned)((unsigned char *)&pointer)[byte]
);
}
puts(
"There may be more than one "
"representation for a null pointer."
);
return 0;
}

/* END new.c */

--
pete

Nov 14 '05 #6

Is there any way by which user can determine what is the internal
representation for a NULL pointer ? I am asking this because,
sometimes during debugging the memory dump is analysed. In that
case it would be difficult to find it is a NULL pointer or not.

You can cast to an unsigned integer (if pointers and integers are
32 bits, that should work). If the compiler is very clever and
converts the NULL pointer to a 0 value, then you can try this:
Write a function "unsigned fun(unsigned x) { return x; }", and
compile, then in another file, declare this function as
"unsigned fun(char *p)" and pass it a NULL pointer. Hopefully the
result will be the internal representation of a NULL. I'm not
sure, since I've never tried this on a machine on which NULL != 0.
I made many assumptions here on integer and pointer sizes, so
it may not work at all. And on a machine where pointers and integers
are passed in a different way, it won't work either. Be careful,
that's just an idea.

Nov 14 '05 #7

pete

ju**********@yahoo.co.in wrote:

Consider an implementation that doesn't use all bits 0 to represent
a NULL pointer. Let the NULL pointer be represented by 0x12345678.
On such an implementation, if the value of NULL pointer is printed
will it be all 0's or 0x12345678

It could be printed out as 0xdeadbeef.

--
pete

Nov 14 '05 #8

CBFalconer

ju**********@yahoo.co.in wrote:

Consider an implementation that doesn't use all bits 0 to represent
a NULL pointer. Let the NULL pointer be represented by 0x12345678.
On such an implementation, if the value of NULL pointer is printed
will it be all 0's or 0x12345678

int main(void)
{
char *ptr;
ptr = 0;
printf("\nptr=%p\n",ptr);
}

What would be the output, 0 or 0x12345678 ?

It's implementation dependent. From N869:

p The argument shall be a pointer to void. The value
of the pointer is converted to a sequence of
printing characters, in an implementation-defined
manner.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Nov 14 '05 #9

Richard Tobin

In article <11**********************@g47g2000cwa.googlegroups .com>,
<ju**********@yahoo.co.in> wrote:

Is there any way by which user can determine what is the internal
representation for a NULL pointer ? I am asking this because,
sometimes during debugging the memory dump is analysed. In that
case it would be difficult to find it is a NULL pointer or not.

In all real-world implementations the NULL pointer is all-bits zero.
(Someone will post a counter-example if I'm wrong.) So if you are
really debugging with a memory dump, rather than asking a theoretical
question, there is no problem.

-- Richard

Nov 14 '05 #10

On Wed, 15 Jun 2005 14:30:23 +0200, Jean-Claude Arbaut wrote:

Is there any way by which user can determine what is the internal
representation for a NULL pointer ? I am asking this because,
sometimes during debugging the memory dump is analysed. In that
case it would be difficult to find it is a NULL pointer or not.

You can cast to an unsigned integer (if pointers and integers are
32 bits, that should work).

This may work for debugging purposes if the compiler happens to do the
right thing. But there are no circumstances under which the standard
guarantees that converting from a pointer to an integer will produce a
useful result.
If the compiler is very clever and
converts the NULL pointer to a 0 value, then you can try this:
Write a function "unsigned fun(unsigned x) { return x; }", and
compile, then in another file, declare this function as
"unsigned fun(char *p)" and pass it a NULL pointer.
This is an extremely nasty and broken kludge. You get undefined behaviour
if the type used to call a function is not compatible with the type of
its definition.
Hopefully the
result will be the internal representation of a NULL. I'm not
sure, since I've never tried this on a machine on which NULL != 0.
I made many assumptions here on integer and pointer sizes, so
it may not work at all. And on a machine where pointers and integers
are passed in a different way, it won't work either. Be careful,
that's just an idea.

E.g. 68K systems where integers and pointers are typically passed in
different registers.

There's no need to resort to non-portable code; this can be done portably.
In C the representation of an addressable object can be inspected by
treating it as an array of unsigned char. For example:
type *ptr = NULL;
const unsigned char *p = (const unsigned char *)&ptr;
size_t i;

for (i = 0; i < sizeof ptr; i++)
printf(" %.2x", (unsigned)p[i]);
Lawrence

Nov 14 '05 #11

On 15/06/2005 16:52, in pa***************************@netactive.co.uk,
Lawrence Kirby <lk****@netactive.co.uk> wrote:

On Wed, 15 Jun 2005 14:30:23 +0200, Jean-Claude Arbaut wrote:

You can cast to an unsigned integer (if pointers and integers are
32 bits, that should work).
This may work for debugging purposes if the compiler happens to do the
right thing. But there are no circumstances under which the standard
guarantees that converting from a pointer to an integer will produce a
useful result.

I thought it was obvious here that my suggestions were non Standard.
Is there also a Standard way to explain obvious things ? :-)

If the compiler is very clever and
converts the NULL pointer to a 0 value, then you can try this:
Write a function "unsigned fun(unsigned x) { return x; }", and
compile, then in another file, declare this function as
"unsigned fun(char *p)" and pass it a NULL pointer.
This is an extremely nasty and broken kludge. You get undefined behaviour
if the type used to call a function is not compatible with the type of
its definition.

I knew you wouldn't like it ;-) It's not very dangerous here, I just
return an integer and it's a valuable trick in some situations.
Oh, and when you link asm code to C, what do you think you do ?
And if you tell me it's not Standard, then I'll answer you can't
have a libc without asm... Perhaps asm is too nasty :-D

Hopefully the
result will be the internal representation of a NULL. I'm not
sure, since I've never tried this on a machine on which NULL != 0.
I made many assumptions here on integer and pointer sizes, so
it may not work at all. And on a machine where pointers and integers
are passed in a different way, it won't work either. Be careful,
that's just an idea.

E.g. 68K systems where integers and pointers are typically passed in
different registers.
Thanks for the example. I had another processor in mind, but I've never
seen a C compiler for it. I think it's called Saturn, but I'm not sure.

There's no need to resort to non-portable code; this can be done portably.
In C the representation of an addressable object can be inspected by
treating it as an array of unsigned char. For example:
type *ptr = NULL;
const unsigned char *p = (const unsigned char *)&ptr;
size_t i;

for (i = 0; i < sizeof ptr; i++)
printf(" %.2x", (unsigned)p[i]);

Much better, yes.

Nov 14 '05 #12

On Wed, 15 Jun 2005 13:52:37 +0000, Richard Tobin wrote:

In article <11**********************@g47g2000cwa.googlegroups .com>,
<ju**********@yahoo.co.in> wrote:
Is there any way by which user can determine what is the internal
representation for a NULL pointer ? I am asking this because,
sometimes during debugging the memory dump is analysed. In that
case it would be difficult to find it is a NULL pointer or not.

In all real-world implementations the NULL pointer is all-bits zero.
(Someone will post a counter-example if I'm wrong.)

See FAQ 5.17

Lawrence

Nov 14 '05 #13

In article <pa****************************@netactive.co.uk> ,
Lawrence Kirby <lk****@netactive.co.uk> wrote:

I think user must be kept
transparent from the internal representation of NULL pointer.

That is certainly not a requirement. All that is required is that scanf()
%p can recreate the pointer from the output of printf() %p

C is broken then. Okay, not broken, but there is a hidden
assumption that needs to be called out.

If we cannot make any assumptions about the format of the pointer,
then we cannot reliably print it out and read it back in. Preceding
and succeeding data might "accidentally" consist of colons,
hexadecimal characters, the word "ullnay" etc and we have no
sanctioned set of delimiters to protect the %p data from surrounding
garbage.

The only alternatives I can see are that
- all %p data has a deterministic length (not necessarily fixed length) but
e.g. if it starts with 0x then there must be (e.g.) 8 more chars,
else it must be 6 chars of "ullnay" else it isn't a %p
- Or, the %p data can only be written to a file or char array alone
IOW, the EOF or \0 are the only delimiters.

Perhaps it is already codified and I'm not aware of it, but
modifiers like %-20.10p would disturb the length of the field
and make it unreadable.

At this point I'm suspecting that a lot of implementations
would screw up "%p%p" in a *scanf call, confusing the leading
zero of 0x as part of the preceding pointer data and ignoring
the result as if it were unsigned hex overflow.
--
7842++

Nov 14 '05 #14

On Wed, 15 Jun 2005 18:31:23 +0200, Jean-Claude Arbaut wrote:

On 15/06/2005 16:52, in pa***************************@netactive.co.uk,
Lawrence Kirby <lk****@netactive.co.uk> wrote:
....

This may work for debugging purposes if the compiler happens to do the
right thing. But there are no circumstances under which the standard
guarantees that converting from a pointer to an integer will produce a
useful result.

I thought it was obvious here that my suggestions were non Standard.
Is there also a Standard way to explain obvious things ? :-)

It is obvious if you know it is non-standard, probably not if you don't.
The best thing is simply not to do it, you rarely if ever need it unless
you are writing your own memory allocator.

If the compiler is very clever and
converts the NULL pointer to a 0 value, then you can try this:
Write a function "unsigned fun(unsigned x) { return x; }", and
compile, then in another file, declare this function as
"unsigned fun(char *p)" and pass it a NULL pointer.

This is an extremely nasty and broken kludge. You get undefined behaviour
if the type used to call a function is not compatible with the type of
its definition.

I knew you wouldn't like ;-) It's not very dangerous here,

Like everything else it is not dangerous on implementations where it
works, but could be disastrous on implementations where it doesn't. The
approach should always be to look for approaches that avoid doing things
like this. After all what if it breaks on the next version of the compiler
you use? There's nothing wrong with the compiler, it is the code that is
faulty.
I just
return an integer and it's a valuable trick in some situations.
I find that hard to believe. It suggests that you haven't put enough
thought into finding a better solution.
Oh, and when you link asm code to C, what do you think you do ?
And if you tell me it's not Standard, then I'll answer you can't
have a libc without asm... Perhaps asm is too nasty :-D

If you link asm to C code you are sacrificing portability, which is fine
in some circumstances. However you are (or should be) still basing the
code on specifications that define its behaviour. By doing things like
calling functions incorrectly you have NO specification of behaviour, you
are trusting to blind luck and hope that things will continue to work as
you have observed in the past. This is no way to program. Today's compiler
optimisers are too complex to predict with any certainty. You may use a
trick that worked many times, then one day with the same compiler you use
it in a situation which the compiler decides it can optimise and suddenly
the trick no longer works. There was an example of this in another
thread to do with overlaying structures. One of the reasons C leaves some
areas of behaviour undefined is to allow more aggressive optimisations.

Lawrence

Nov 14 '05 #15

ju**********@yahoo.co.in writes:
[...]

Is there any way by which user can determine what is the internal
representation for a NULL pointer ? I am asking this because,
sometimes during debugging the memory dump is analysed. In that
case it would be difficult to find it is a NULL pointer or not.

printf("%p\n", (void*)NULL);

is very likely to print a legible form of the representation of a null
pointer (which is very likely to be all-bits-zero). If "very likely"
isn't good enough, several followups have shown how to break the
representation down into a sequence of bytes.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #16

Clark S. Cox III

On 2005-06-15 14:07:30 -0400, an******@example.com (Anonymous 7843) said:

In article <pa****************************@netactive.co.uk> ,
Lawrence Kirby <lk****@netactive.co.uk> wrote:
I think user must be kept
transparent from the internal representation of NULL pointer.

That is certainly not a requirement. All that is required is that scanf()
%p can recreate the pointer from the output of printf() %p

C is broken then. Okay, not broken, but there is a hidden
assumption that needs to be called out.

If we cannot make any assumptions about the format of the pointer,
then we cannot reliably print it out and read it back in. Preceding
and succeeding data might "accidentally" consist of colons, hexadecimal
characters, the word "ullnay" etc and we have no
sanctioned set of delimiters to protect the %p data from surrounding
garbage.

The only alternatives I can see are that
- all %p data has a deterministic length (not necessarily fixed length)
but e.g. if it starts with 0x then there must be (e.g.) 8 more chars,
else it must be 6 chars of "ullnay" else it isn't a %p
- Or, the %p data can only be written to a file or char array alone
IOW, the EOF or \0 are the only delimiters.

Perhaps it is already codified and I'm not aware of it, but
modifiers like %-20.10p would disturb the length of the field
and make it unreadable.

At this point I'm suspecting that a lot of implementations
would screw up "%p%p" in a *scanf call, confusing the leading
zero of 0x as part of the preceding pointer data and ignoring
the result as if it were unsigned hex overflow.

But that's no different than any of the other printf/scanf specifiers.
That is, print two integers with "%d%d", and scan them back in.
--
Clark S. Cox, III
cl*******@gmail.com

Nov 14 '05 #17

> It is obvious if you know it is non-standard, probably not if you don't.

The best thing is simply not to do it, you rarely if ever need it unless
you are writing your own memory allocator.
I agree, but it doesn't hurt knowing it's possible (though not portable),
and knowing how things work.

I knew you wouldn't like ;-) It's not very dangerous here,

Like everything else it is not dangerous on implementations where it
works, but could be disastrous on implementations where it doesn't. The
approach should always be to look for approaches that avoid doing things
like this.

Yes, sir ! No irony, I completely agree it's not the style to use
too often, but if I remember well, the original post asked for
a manner to see what NULL is. Just a toy program after all.

Sometimes, it may be necessary: you mention a memory allocator,
but there may be other uses.
After all what if it breaks on the next version of the compiler
you use? There's nothing wrong with the compiler, it is the code that is
faulty.
Some parts of a program are by nature very dependent on OS/compiler/proc.
It's good practice to try to avoid them; it's not good practice to
deny their existence. This newsgroup denies everything that is not perfectly
described by the Standard. It's only this attitude I reject, not
the Standard itself.

I just
return an integer and it's a valuable trick in some situations.
I find that hard to believe. It suggests that you haven't put enough
thought into finding a better solution.

You are right, I must confess :-)

Oh, and when you link asm code to C, what do you think you do ?
And if you tell me it's not Standard, then I'll answer you can't
have a libc without asm... Perhaps asm is too nasty :-D

If you link asm to C code you are sacrificing portability, which is fine
in some circumstances.

I agree (but do I need to say that ? :-)).
However you are (or should be) still basing the
code on specifications that define its behaviour. By doing things like
calling functions incorrectly you have NO specification of behaviour,
Well, the standard doesn't specify anything, but your knowledge of
the system you're programming on (including the compiler) gives you
all the information you need about the behaviour. When in doubt, it's always (?)
possible to have a look at the assembly output.
you
are trusting to blind luck and hope that things will continue to work as
you have observed in the past.
Nope. I never trust a compiler :-) Even with beautiful C code, I'm not
confident in its optimizations, especially concerning floating point.
And this opinion is not going to change in the near future: "paranoia" is
old, but still an interesting test (just as an example).
This is no way to program. Today's compiler
optimisers are too complex to predict with any certainty.
Here I disagree. There are many predictable parts in gcc output, with enough
habit, you know when writing some code if it will be well optimized or not,
and which kind of optimization occurs. If you think your compiler is
perfect, well I hope it is ! Gcc, to stay with something I am acquainted
with, won't use prefetch or Altivec instructions (I heard gcc 4 will, but
I'm still not convinced). Hence it's not so difficult to beat its optimizer.
On the other hand, it's much more difficult to beat xlc (won't use Altivec,
but apart from that, it is very, very clever). I think a good practice is
always having a look at assembly output, for function that need good
optimizations, or that use non standard tricks. Obviously, that demands
some understanding of OS/proc.

Oh, and I said "acquainted with gcc", I wouldn't even try to make anyone
believe I know perfectly how it works.
You may use a
trick that worked many times, then one day with the same compiler you use
it in a situation which the compiler decides it can optimise and suddenly
the trick no longer works.
Not seen for the moment, but I am vigilant :-)
There was an example of this in another
thread to do with overlaying structures.
I'll look for it !
One of the reasons C leaves some
areas of behaviour undefined is to allow more aggressive optimisations.

And why do many if not all compilers allow so many extensions ? Some are
completely understandable, but many (including gcc's extensions) are too
tempting, and when used, code is irremediably stuck with one specific
compiler. It may seem strange for me to say that :-) In fact I agree with
the principle of a standard (I've seen this necessity with these many
implementations of f77 hanging around), but I don't agree to deny specific
use of it, when it's needed. Only when it's needed. And I'm glad you
pointed out a good way to do what was asked in this thread. I hope I've
clarified some points.

By the way, is it completely portable ? Using an array of chars to read
what is very often an integer may seem strange. It's fun: it's portable
only because the standard doesn't know whether a pointer is an int or
something else :-) Actually, we only want to know if NULL is 0, so no
problem.

Nov 14 '05 #18

Clark S. Cox III <cl*******@gmail.com> writes:

On 2005-06-15 14:07:30 -0400, an******@example.com (Anonymous 7843) said:

[...]

Perhaps it is already codified and I'm not aware of it, but
modifiers like %-20.10p would disturb the length of the field
and make it unreadable.
At this point I'm suspecting that a lot of implementations
would screw up "%p%p" in a *scanf call, confusing the leading
zero of 0x as part of the preceding pointer data and ignoring
the result as if it were unsigned hex overflow.

But that's no different than any of the other printf/scanf
specifiers. That is, print two integers with "%d%d", and scan them
back in.

The difference, though, is that we know the format of the output
produced by "%d", and can allow for it when using *scanf. We can't
necessarily know how to avoid similar problems with "%p".

On the other hand, I've never heard of this being a problem in
practice. If the output isn't intended to be re-scanned (which is
probably the case most of the time), you merely have to count on the
implementer to produce something legible. If you need to be able to
re-scan it, it probably suffices to surround it with white space. (If
it turned out to be a problem, the standard could always be amended to
forbid blanks in the result of a "%p" format.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #19

In article <2005061516263116807%clarkcox3@gmailcom>,
Clark S. Cox III <cl*******@gmail.com> wrote:

On 2005-06-15 14:07:30 -0400, an******@example.com (Anonymous 7843) said:
At this point I'm suspecting that a lot of implementations
would screw up "%p%p" in a *scanf call, confusing the leading
zero of 0x as part of the preceding pointer data and ignoring
the result as if it were unsigned hex overflow.

But that's no different than any of the other printf/scanf specifiers.
That is, print two integers with "%d%d", and scan them back in.

Sorry, poor example.

But you know how %d will printf and how it will scanf back
in. You know that you can use spaces, letters, or
punctuation to separate a %d-generated number from other
data. You do *not* know (from the standard) what %p will or
won't print so you don't know what delimiters, if any, are
safe to use. So, you can't use delimiters.

And how the heck do you print a function pointer? It's not
guaranteed to fit in a void*, right?
--
7842++

Nov 14 '05 #20

Richard Tobin

In article <pa****************************@netactive.co.uk> ,
Lawrence Kirby <lk****@netactive.co.uk> wrote:

In all real-world implementations the NULL pointer is all-bits zero.
(Someone will post a counter-example if I'm wrong.)
See FAQ 5.17

Thanks, an interesting list. But I don't think any of the machines
listed as having non-zero null pointers are in the real world any
more. In the case of the Primes, I certainly hope not.

-- Richard

Nov 14 '05 #21

pete

Anonymous 7843 wrote:

And how the heck do you print a function pointer? It's not
guaranteed to fit in a void*, right?

Right.

--
pete

Nov 14 '05 #22

Dik T. Winter

In article <133se.63$SF5.26@fed1read07> an******@example.com writes:

In article <2005061516263116807%clarkcox3@gmailcom>,
Clark S. Cox III <cl*******@gmail.com> wrote:
On 2005-06-15 14:07:30 -0400, an******@example.com (Anonymous 7843) said:
.... But you know how %d will printf and how it will scanf back
in. You know that you can use spaces, letters, or
punctuation to separate a %d-generated number from other
data. You do *not* know (from the standard) what %p will or
won't print so you don't know what delimiters, if any, are
safe to use. So, you can't use delimiters.
You can. You are sure (from the standard) that the output will not include
\t, \r or \n. %p is guaranteed to give only printable characters, and those
three are not; they are control characters. However I always doubted the
usefulness of being able to read pointers with %p, except, perhaps, in
debuggers.
And how the heck do you print a function pointer? It's not
guaranteed to fit in a void*, right?

There is a good reason for that. There are systems where function pointers
simply are not compatible with data pointers. For example, when they are
much wider. But what is a good reason to print a function pointer?
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/

Nov 14 '05 #23

Neil Kurzman

ju**********@yahoo.co.in wrote:

Lawrence Kirby wrote:
On Tue, 14 Jun 2005 22:50:26 -0700, junky_fellow wrote:
Consider an implementation that doesn't use all bits 0 to represent
a NULL pointer. Let the NULL pointer be represented by 0x12345678.
On such an implementation, if the value of NULL pointer is printed
will it be all 0's or 0x12345678

Either or neither. The implementation could choose to output it as, for
example, <null>
int main(void)
{

char *ptr;
ptr = 0;

printf("\nptr=%p\n",ptr);
}
What would be the output, 0 or 0x12345678 ?

The standard does not specify the form of output of %p. There is no
requirement that it takes the form of a hex number, although it can and
some implementations do that.
I think user must be kept
transparent from the internal representation of NULL pointer.

That is certainly not a requirement. All that is required is that scanf()
%p can recreate the pointer from the output of printf() %p
Even if
the implementation is using 0x12345678 for NULL pointer, value printed
should be all bits zero.

If you want full transparency then direct correspondence to any particular
bit pattern should be avoided.

Lawrence

Is there any way by which user can determine what is the internal
representation for a NULL pointer ? I am asking this because,
sometimes during debugging the memory dump is analysed. In that
case it would be difficult to find it is a NULL pointer or not.

Set a Pointer to NULL then look at the memory dump.
Theoretical vs Practical.

Nov 14 '05 #24

Michael Wojcik

In article <ln************@nuthaus.mib.org>, Keith Thompson <ks***@mib.org> writes:

ju**********@yahoo.co.in writes:
[...]
Is there any way by which user can determine what is the internal
representation for a NULL pointer ? I am asking this because,
sometimes during debugging the memory dump is analysed. In that
case it would be difficult to find it is a NULL pointer or not.
printf("%p\n", (void*)NULL);

is very likely to print a legible form of the representation of a null
pointer (which is very likely to be all-bits-zero).

Though not on the ever-lovin' AS/400.[1] This example also shows
that the %p format on the AS/400 contains both colons and spaces,
and so meets the fears of some other posters in this thread.

(On the other hand, I ran across a post from Dan Pop noting that
length and precision specifiers are not permitted with %p; if true,
that satisfies one worry that's been posted.)
If "very likely"
isn't good enough, several followups have shown how to break the
representation down into a sequence of bytes.

Yep. 16 bytes, in the case of the '400. The '400 also offers a
cozy 2**127-1 pointer trap representations, though - mirabile dictu -
its null pointer representation is all-bits-zero.
1. http://groups-beta.google.com/group/...4a3c0c955d17f3

See also:
http://groups-beta.google.com/group/...438cf53aa68a6b
http://groups-beta.google.com/group/...9b7dc93bdbf187

--
Michael Wojcik mi************@microfocus.com

The surface of the word "profession" is hard and rough, the inside mixed with
poison. It's this that prevents me crossing over. And what is there on the
other side? Only what people longingly refer to as "the other side".
-- Tawada Yoko (trans. Margaret Mitsutani)

Nov 14 '05 #25

On Thu, 16 Jun 2005 15:33:24 +0000, Michael Wojcik wrote:

....

(On the other hand, I ran across a post from Dan Pop noting that
length and precision specifiers are not permitted with %p; if true,
that satisfies one worry that's been posted.)

A precision specifier is certainly not permitted with %p (it gives
undefined behaviour) but as far as I can see minimum field width is.
The only conversion specifier you can't use a minimum field width
for is %%.
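
A small sketch of the distinction drawn above, assuming a C99 library that provides snprintf(). The helper name format_pointer is made up for this illustration. It passes a minimum field width to %p, which is permitted, and deliberately uses no precision.

```c
#include <assert.h>
#include <stdio.h>

/* Format ptr with %p, right-justified in a field of at least `width`
   characters. Returns what snprintf() returns: the length the full
   output needs, or a negative value on error. */
int format_pointer(char *buf, size_t bufsize, void *ptr, int width)
{
    assert(buf != NULL && width >= 0);
    /* A minimum field width supplied through '*' is allowed with %p.
       A precision such as "%.8p" would be undefined behaviour. */
    return snprintf(buf, bufsize, "%*p", width, ptr);
}
```

The text produced for the pointer itself is still implementation-defined; only the padding to the requested width can be relied on.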

Lawrence

Nov 14 '05 #26

In article <II********@cwi.nl>, Dik T. Winter <Di********@cwi.nl> wrote:

But what is a good reason to print a function pointer?

Depends on your definition of good, but how about
debugging a non-portable feature like dynamic libraries?

Technically there is no need to be able to print object pointers
either, yet we can do so.
--
7842++

Nov 14 '05 #27

On 17/06/2005 02:12, in aAose.225$SF5.214@fed1read07, "Anonymous 7843"
<an******@example.com> wrote:

In article <II********@cwi.nl>, Dik T. Winter <Di********@cwi.nl> wrote:
But what is a good reason to print a function pointer?

Depends on your definition of good, but how about
debugging a non-portable feature like dynamic libraries?

Technically there is no need to be able to print object pointers
either, yet we can do so.

Is there a good reason apart from debugging? I can't see one, but
I may be too tired at 3:15 local :-)

Nov 14 '05 #28

Dik T. Winter

In article <aAose.225$SF5.214@fed1read07> an******@example.com writes:

In article <II********@cwi.nl>, Dik T. Winter <Di********@cwi.nl> wrote:
But what is a good reason to print a function pointer?

Depends on your definition of good, but how about
debugging a non-portable feature like dynamic libraries?

But what do you think is a good way to print a function pointer when it
is, say, twice as wide as an object pointer? While, in general, object
pointers are as wide, or smaller than a void*, a function pointer can
be much wider.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/

Nov 14 '05 #29

"Dik T. Winter" <Di********@cwi.nl> writes:

In article <aAose.225$SF5.214@fed1read07> an******@example.com writes:
> In article <II********@cwi.nl>, Dik T. Winter <Di********@cwi.nl> wrote:
> > But what is a good reason to print a function pointer?

>
> Depends on your definition of good, but how about
> debugging a non-portable feature like dynamic libraries?

But what do you think is a good way to print a function pointer when it
is, say, twice as wide as an object pointer? While, in general, object
pointers are as wide, or smaller than a void*, a function pointer can
be much wider.

Here's a function that returns a hexadecimal image of any object,
given its address and its size in bytes. The caller needs to free()
the result after using it.

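#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

#define NIBBLES_PER_BYTE ((CHAR_BIT+3)/4)

/* Returns a malloc()ed hex string for the len bytes at addr, or NULL. */
char *hex_image(void *addr, size_t len)
{
    char *result = malloc(NIBBLES_PER_BYTE * len + 1);
    unsigned char *in = addr;
    char *out = result;
    size_t i;

    if (result == NULL) return NULL;

    *out = '\0';    /* keep the result a valid string when len == 0 */
    for (i = 0; i < len; i ++) {
        /* each call writes the digits plus a '\0' that the next call overwrites */
        sprintf(out, "%0*x", NIBBLES_PER_BYTE, (unsigned)*in);
        in ++;
        out += NIBBLES_PER_BYTE;
    }
    return result;
}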

It can be simplified a little if you don't mind assuming CHAR_BIT==8.
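
For illustration, one possible shape of that simplification. This is a sketch under the stated assumption of 8-bit bytes, not the poster's own code; the name hex_image8 and the #error guard are additions made here.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

#if CHAR_BIT != 8
#error "this sketch assumes 8-bit bytes"
#endif

/* Return a malloc()ed string holding two hex digits per byte of the
   object at addr, or NULL if allocation fails. The caller frees it. */
char *hex_image8(const void *addr, size_t len)
{
    const unsigned char *in = addr;
    char *result;
    size_t i;

    assert(addr != NULL || len == 0);
    result = malloc(2 * len + 1);
    if (result == NULL)
        return NULL;

    result[0] = '\0';                 /* valid string even when len == 0 */
    for (i = 0; i < len; i++)
        sprintf(result + 2 * i, "%02x", (unsigned)in[i]);
    return result;
}
```

To image a function pointer, pass the address of a pointer object rather than the pointer itself, for example hex_image8(&fp, sizeof fp) where fp has some function pointer type. That avoids converting a function pointer to void *, which the standard does not define.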

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #30

Richard Bos

an******@example.com (Anonymous 7843) wrote:

In article <II********@cwi.nl>, Dik T. Winter <Di********@cwi.nl> wrote:
But what is a good reason to print a function pointer?

Depends on your definition of good, but how about
debugging a non-portable feature like dynamic libraries?

If you're debugging non-portable features, using other non-portable
features to do so is perfectly sensible.

Anyway, you can always just print the representation in memory, using an
unsigned char *. It's not guaranteed to be unique, but at least you'll
have _a_ representation.

Richard

Nov 14 '05 #31

On Fri, 17 Jun 2005 02:18:30 +0000, Dik T. Winter wrote:

In article <aAose.225$SF5.214@fed1read07> an******@example.com writes:
> In article <II********@cwi.nl>, Dik T. Winter <Di********@cwi.nl> wrote:
> > But what is a good reason to print a function pointer?

>
> Depends on your definition of good, but how about
> debugging a non-portable feature like dynamic libraries?

But what do you think is a good way to print a function pointer when it
is, say, twice as wide as an object pointer? While, in general, object
pointers are as wide, or smaller than a void*, a function pointer can
be much wider.

C allows a function pointer to be cast to any other type of function
pointer (and back again). So you could have a printf() conversion
specifier, say %P that takes a function pointer of a particular type, say
void (*)(void). So

printf("%P\n", (void (*)(void))main);
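
Since %P does not exist in standard C, the part that can be shown today is the round trip through a common function pointer type. The following is a sketch only; generic_fn, twice, to_generic and call_int_fn are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*generic_fn)(void);    /* hypothetical common type */

int twice(int x) { return 2 * x; }

/* Convert to the common type, as an argument to a %P-style
   conversion would have to be. */
generic_fn to_generic(int (*f)(int))
{
    return (generic_fn)f;
}

/* Convert back to the original type before calling. Calling through
   a pointer of the wrong type would be undefined behaviour. */
int call_int_fn(generic_fn g, int arg)
{
    int (*f)(int) = (int (*)(int))g;
    assert(f != NULL);
    return f(arg);
}
```

The standard guarantees that the pointer converted back compares equal to the original, which is what makes a single agreed-upon type workable for such a specifier.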

Lawrence

Nov 14 '05 #32

In article <II********@cwi.nl>, Dik T. Winter <Di********@cwi.nl> wrote:

In article <aAose.225$SF5.214@fed1read07> an******@example.com writes:
In article <II********@cwi.nl>, Dik T. Winter <Di********@cwi.nl> wrote:
But what is a good reason to print a function pointer?

Depends on your definition of good, but how about
debugging a non-portable feature like dynamic libraries?

But what do you think is a good way to print a function pointer when it
is, say, twice as wide as an object pointer? While, in general, object
pointers are as wide, or smaller than a void*, a function pointer can
be much wider.

I like Unleashed Guy's %P idea. (Which is what I was thinking
all along, but I grew tired of dtw's platonic methodry more slowly
than usual, which is a credit to him.)

%P is a logical extension to existing practice, doesn't break
existing conforming programs, and is very easy to implement,
outright trivial on flat address space machines.

To be truly radical, I advocate amending the bit about varargs
promotion. Function pointers used in a varargs (or non-prototype)
context should be implicitly converted to void(*)(void) so you
can pass function pointers to printf w/o a cast.
--
7842++

Nov 14 '05 #33

Keith Thompson <ks***@mib.org> writes:

"Dik T. Winter" <Di********@cwi.nl> writes:
[snip]

But what do you think is a good way to print a function pointer when it
is, say, twice as wide as an object pointer? While, in general, object
pointers are as wide, or smaller than a void*, a function pointer can
be much wider.

Here's a function that returns a hexadecimal image of any object,
given its address and its size in bytes. The caller needs to free()
the result after using it.

#define NIBBLES_PER_BYTE ((CHAR_BIT+3)/4)

char *hex_image(void *addr, size_t len)
{
    char *result = malloc(NIBBLES_PER_BYTE * len + 1);
    unsigned char *in = addr;
    char *out = result;
    size_t i;

    if (result == NULL) return NULL;

    for (i = 0; i < len; i ++) {
        sprintf(out, "%0*x", NIBBLES_PER_BYTE, *in);
        in ++;
        out += NIBBLES_PER_BYTE;
    }
    return result;
}

As compiler technology has gotten better, one of the rules I've
had to unlearn is the rule that using pointers is faster than
using array indexing. Very often it's the other way around
now.

Here's a revised hex_image that uses indexing rather than direct
pointer manipulation.
#define NIBBLES_PER_CHAR ((CHAR_BIT+3)/4)

char *
hex_image( void *addr, size_t len ){
    char (*out)[NIBBLES_PER_CHAR] = malloc( NIBBLES_PER_CHAR*len + 1 );
    unsigned char *in = addr;
    size_t i;

    if (out == NULL) return NULL;

    for (i = 0; i < len; i ++) {
        sprintf( out[i], "%0*x", NIBBLES_PER_CHAR, in[i] );
    }
    return out[0];
}
(The name of the macro was changed because I expect ..._CHAR is
less likely to cause confusion than ..._BYTE.)

Nov 14 '05 #34

Tim Rentsch <tx*@alumnus.caltech.edu> writes:

Keith Thompson <ks***@mib.org> writes: [...]
Here's a function that returns a hexadecimal image of any object,
given its address and its size in bytes. The caller needs to free()
the result after using it.

[snip]
As compiler technology has gotten better, one of the rules I've
had to unlearn is the rule that using pointers is faster than
using array indexing. Very often it's the other way around
now.

Here's a revised hex_image that uses indexing rather than direct
pointer manipulation.
#define NIBBLES_PER_CHAR ((CHAR_BIT+3)/4)

char *
hex_image( void *addr, size_t len ){
    char (*out)[NIBBLES_PER_CHAR] = malloc( NIBBLES_PER_CHAR*len + 1 );
    unsigned char *in = addr;
    size_t i;

    if (out == NULL) return NULL;

    for (i = 0; i < len; i ++) {
        sprintf( out[i], "%0*x", NIBBLES_PER_CHAR, in[i] );
    }
    return out[0];
}
(The name of the macro was changed because I expect ..._CHAR is
less likely to cause confusion than ..._BYTE.)

Hmm. Let's assume for simplicity that CHAR_BIT==8. You've declared
out as a pointer to an array of 2 chars. That's perfectly legal, of
course, but there's enough confusion between arrays and pointers that
I'm still more comfortable with pointers to chars than with pointers
to arrays.

The space you allocate with malloc() has a size that is not a multiple
of the size of the pointed-to type.

In the sprintf() call, out[i] points to an array of 2 chars, but you
write 3 bytes to it. It's hard to see how this could fail, but it
worries me.

I think the real lesson to be learned is that it seldom matters
whether using pointers or indices is faster; the compiler will often
be able to generate optimal code for either. I suspect the real
bottleneck in my original version (and in yours) is the call to
sprintf().
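
If sprintf() really is the bottleneck, one way to avoid it is a digit lookup table. This is a sketch and not a measured improvement; the name hex_image_table is invented here, and whether it is actually faster would need profiling on the implementation in question.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define NIBBLES_PER_BYTE ((CHAR_BIT + 3) / 4)

/* Same contract as hex_image(): returns a malloc()ed hex string for
   the len bytes at addr, or NULL on allocation failure. */
char *hex_image_table(const void *addr, size_t len)
{
    static const char digits[] = "0123456789abcdef";
    const unsigned char *in = addr;
    char *result, *out;
    size_t i;
    int n;

    assert(addr != NULL || len == 0);
    result = malloc(NIBBLES_PER_BYTE * len + 1);
    if (result == NULL)
        return NULL;

    out = result;
    for (i = 0; i < len; i++) {
        /* Emit the most significant nibble of each byte first. */
        for (n = NIBBLES_PER_BYTE - 1; n >= 0; n--)
            *out++ = digits[(in[i] >> (4 * n)) & 0xF];
    }
    *out = '\0';
    return result;
}
```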

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #35

Keith Thompson <ks***@mib.org> writes:

Tim Rentsch <tx*@alumnus.caltech.edu> writes:
Keith Thompson <ks***@mib.org> writes: [...]
Here's a function that returns a hexadecimal image of any object,
given its address and its size in bytes. The caller needs to free()
the result after using it.

[snip]

As compiler technology has gotten better, one of the rules I've
had to unlearn is the rule that using pointers is faster than
using array indexing. Very often it's the other way around
now.

Here's a revised hex_image that uses indexing rather than direct
pointer manipulation.
#define NIBBLES_PER_CHAR ((CHAR_BIT+3)/4)

char *
hex_image( void *addr, size_t len ){
    char (*out)[NIBBLES_PER_CHAR] = malloc( NIBBLES_PER_CHAR*len + 1 );
    unsigned char *in = addr;
    size_t i;

    if (out == NULL) return NULL;

    for (i = 0; i < len; i ++) {
        sprintf( out[i], "%0*x", NIBBLES_PER_CHAR, in[i] );
    }
    return out[0];
}
(The name of the macro was changed because I expect ..._CHAR is
less likely to cause confusion than ..._BYTE.)

Hmm. Let's assume for simplicity that CHAR_BIT==8. You've declared
out as a pointer to an array of 2 chars. That's perfectly legal, of
course, but there's enough confusion between arrays and pointers that
I'm still more comfortable with pointers to chars than with pointers
to arrays.

I'll grant you that most developers use pointers to arrays only
rarely, if at all. To some extent though that's self-reinforcing
behavior - people don't use them because they aren't used to
them, and they aren't used to them because they don't use them.
The code above wasn't natural for me when I wrote it; but I
think it's good to stretch a bit at times to get out of the
local ruts that seem only too easy to fall into.

The space you allocate with malloc() has a size that is not a multiple
of the size of the pointed-to type.
That's true. It's for the final string termination character, of
course - there's a sense in which the array is being filled with
elements of NIBBLES_PER_CHAR characters each, and then a final
zero character added at the end to make it a string. Now that you
mention it, it might have been better to write that malloc() call
as

char (*out)[NIBBLES_PER_CHAR] = malloc( len * sizeof *out + 1 );

which may show a little more clearly why the size is chosen as it
is.

In the sprintf() call, out[i] points to an array of 2 chars, but you
write 3 bytes to it. It's hard to see how this could fail, but it
worries me.
I understand the worry. I've looked at language in both the
standard and the Rationale document; I think I can make a pretty
strong case that it's required to succeed, but I'm sure there are
those who would argue against that.

I think the real lesson to be learned is that it seldom matters
whether using pointers or indices is faster; the compiler will often
be able to generate optimal code for either. I suspect the real
bottleneck in my original version (and in yours) is the call to
sprintf().

My experience (of the last few years, in case that matters) has
been that the compiler usually does a little better generating
code when indexing is used. That's contrary to "conventional
wisdom" so I thought it might be good to report that. The usage
of the variable 'out' is a little bit funny, I might even say
unclear; but I think it does show a little more directly how the
output is being produced in blocks of characters corresponding to
each input character. Don't get me wrong, I don't think either
version is better than the other -- each has its plusses and
minuses.[*] It just occurred to me that this function might be a
good example to illustrate the array-ful nature of the processing
being done, and how that might be expressed.

[*] In comp.lang.c, we might say each has its pluspluses and
minusminuses. :)

Nov 14 '05 #36

On Sat, 18 Jun 2005 16:38:38 -0700, Tim Rentsch wrote:

....

I'll grant you that most developers use pointers to arrays only
rarely, if at all. To some extent though that's self-reinforcing
behavior - people don't use them because they aren't used to
them, and they aren't used to them because they don't use them.
Pointers to arrays are a natural adjunct to arrays of arrays. IMO the real
issue is that people don't use arrays of arrays much. And even when you do
you don't always need to use (as an explicit pointer object) a pointer to
an array. They seem to come up most when trying to pass an array of arrays
to another function, but your example of allocating an array of arrays
dynamically is good too.
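
Both uses mentioned here, passing an array of arrays to a function and allocating one dynamically, can be sketched as follows. COLS, sum_table and make_table are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define COLS 4   /* hypothetical row length */

/* Sum every element of a rows-by-COLS table. The parameter is a
   pointer to an array of COLS ints, i.e. a pointer to one row. */
long sum_table(int (*table)[COLS], size_t rows)
{
    long total = 0;
    size_t r, c;

    assert(table != NULL || rows == 0);
    for (r = 0; r < rows; r++)
        for (c = 0; c < COLS; c++)
            total += table[r][c];
    return total;
}

/* Allocate a rows-by-COLS table in one block and fill it so that
   table[r][c] == r * COLS + c. Returns NULL on allocation failure.
   The declarator reads: function returning pointer to array of COLS int. */
int (*make_table(size_t rows))[COLS]
{
    int (*table)[COLS] = malloc(rows * sizeof *table);
    size_t r, c;

    if (table == NULL)
        return NULL;
    for (r = 0; r < rows; r++)
        for (c = 0; c < COLS; c++)
            table[r][c] = (int)(r * COLS + c);
    return table;
}
```

An ordinary int a[2][COLS] can be passed to sum_table() directly, because the array name converts to a pointer to its first row.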
The code above wasn't natural for me when I wrote it; but I think it's
good to stretch a bit at times to get out of the local ruts that seem
only too easy to fall into.

The space you allocate with malloc() has a size that is not a multiple
of the size of the pointed-to type.

That's true. It's for the final string termination character, of course
- there's a sense in which the array is being filled with elements of
NIBBLES_PER_CHAR characters each, and then a final zero character added
at the end to make it a string. Now that you mention it, it might have
been better to write that malloc() call as

char (*out)[NIBBLES_PER_CHAR] = malloc( len * sizeof *out + 1 );

which may show a little more clearly why the size is chosen as it is.

In the sprintf() call, out[i] points to an array of 2 chars, but you
write 3 bytes to it. It's hard to see how this could fail, but it
worries me.

I understand the worry. I've looked at language in both the standard
and the Rationale document; I think I can make a pretty strong case
that it's required to succeed, but I'm sure there are those who would
argue against that.

Yes, you're writing outside the bounds of the array. The array in
question is out[i] which is (in the specified conditions) a 2 element
array. What you are passing as the first argument to sprintf() is a
pointer to the first element of this 2 element array. When the call to
sprintf() writes to out[i][2] it invokes undefined behaviour.

Lawrence

Nov 14 '05 #37

Lawrence Kirby <lk****@netactive.co.uk> writes:

On Sat, 18 Jun 2005 16:38:38 -0700, Tim Rentsch wrote:
[restored context]

char (*out)[NIBBLES_PER_CHAR] = malloc( NIBBLES_PER_CHAR*len + 1 );
...
for (i = 0; i < len; i ++) {
sprintf( out[i], "%0*x", NIBBLES_PER_CHAR, in[i] );
}

[end restored context]
In the sprintf() call, out[i] points to an array of 2 chars, but you
write 3 bytes to it. It's hard to see how this could fail, but it
worries me.

I understand the worry. I've looked at language in both the standard
and the Rationale document; I think I can make a pretty strong case
that it's required to succeed, but I'm sure there are those who would
argue against that.

Yes, you're writing outside the bounds of the array. The array in
question is out[i] which is (in the specified conditions) a 2 element
array. What you are passing as the first argument to sprintf() is a
pointer to the first element of this 2 element array. When the call to
sprintf() writes to out[i][2] it invokes undefined behaviour.

If you look at the actual language I think you'll find
that it's not as simple as that. So I need to ask you
to cite "chapter and verse", as they say. I've looked
at the actual language quite carefully.

Nov 14 '05 #38

Tim Rentsch <tx*@alumnus.caltech.edu> writes:
[...]

I'll grant you that most developers use pointers to arrays only
rarely, if at all. To some extent though that's self-reinforcing
behavior - people don't use them because they aren't used to
them, and they aren't used to them because they don't use them.
The code above wasn't natural for me when I wrote it; but I
think it's good to stretch a bit at times to get out of the
local ruts that seem only too easy to fall into.
Agreed.

I'd also speculate that pointers to arrays are used accidentally
nearly as often as they're used deliberately (by programmers who don't
quite understand the interactions between arrays and pointers). Such
code often works by accident if a pointer-to-char and a
pointer-to-array-of-char happen to have the same representation.
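
A short sketch of the difference being described, with BUF_LEN and the function names invented for the example. The two kinds of pointer hold the same address but differ in type, and therefore in what stepping them by one means.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_LEN 16   /* hypothetical size */

/* Distance in bytes covered by stepping a pointer-to-char by one. */
size_t step_of_char_pointer(void)
{
    char buf[2][BUF_LEN];
    char *p = buf[0];                /* points to the first char */
    return (size_t)((p + 1) - p);
}

/* Distance in bytes covered by stepping a pointer-to-array by one. */
size_t step_of_array_pointer(void)
{
    char buf[2][BUF_LEN];
    char (*pa)[BUF_LEN] = &buf[0];   /* points to the whole row */
    assert(sizeof *pa == BUF_LEN);
    return (size_t)((char *)(pa + 1) - (char *)pa);
}

/* Both pointers compare equal once converted to void *, which is
   why passing the wrong one often appears to work. */
int same_address(void)
{
    char buf[BUF_LEN];
    char *p = buf;
    char (*pa)[BUF_LEN] = &buf;
    return (void *)p == (void *)pa;
}
```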

If I see code that uses pointers to arrays, I need to carefully trace
the logic before I can be sure that it's actually correct.

This isn't a criticism of your code, but unusual constructs usually
call for a comment explaining what's really going on.

[...]
I understand the worry. I've looked at language in both the
standard and the Rationale document; I think I can make a pretty
strong case that it's required to succeed, but I'm sure there are
those who would argue against that.

Which I would say is a good argument to avoid using it.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #39

Keith Thompson <ks***@mib.org> writes:

Tim Rentsch <tx*@alumnus.caltech.edu> writes:
[...]
I'll grant you that most developers use pointers to arrays only
rarely, if at all. To some extent though that's self-reinforcing
behavior - people don't use them because they aren't used to
them, and they aren't used to them because they don't use them.
The code above wasn't natural for me when I wrote it; but I
think it's good to stretch a bit at times to get out of the
local ruts that seem only too easy to fall into.
Agreed.

I'd also speculate that pointers to arrays are used accidentally
nearly as often as they're used deliberately (by programmers who don't
quite understand the interactions between arrays and pointers). Such
code often works by accident if a pointer-to-char and a
pointer-to-array-of-char happen to have the same representation.

Interesting idea. Have you ever actually seen it? I'd expect
that normally the distinction would be caught during type
checking.

If I see code that uses pointers to arrays, I need to carefully trace
the logic before I can be sure that it's actually correct.

This isn't a criticism of your code, but unusual constructs usually
call for a comment explaining what's really going on.
Yes. Pointer-to-arrays still usually fall in that category. It
would be nice if more people got comfortable with some basic
patterns for using pointer-to-arrays so that were less true.

[...]
I understand the worry. I've looked at language in both the
standard and the Rationale document; I think I can make a pretty
strong case that it's required to succeed, but I'm sure there are
those who would argue against that.

Which I would say is a good argument to avoid using it.

Perhaps. It depends on who are offering the arguments, and what
the arguments are.

Nov 14 '05 #40

Michael Mair

Tim Rentsch wrote:

Keith Thompson <ks***@mib.org> writes:

Tim Rentsch <tx*@alumnus.caltech.edu> writes:
[...]
I'll grant you that most developers use pointers to arrays only
rarely, if at all. To some extent though that's self-reinforcing
behavior - people don't use them because they aren't used to
them, and they aren't used to them because they don't use them.
The code above wasn't natural for me when I wrote it; but I
think it's good to stretch a bit at times to get out of the
local ruts that seem only too easy to fall into.

Agreed.

I'd also speculate that pointers to arrays are used accidentally
nearly as often as they're used deliberately (by programmers who don't
quite understand the interactions between arrays and pointers). Such
code often works by accident if a pointer-to-char and a
pointer-to-array-of-char happen to have the same representation.

Interesting idea. Have you ever actually seen it? I'd expect
that normally the distinction would be caught during type
checking.

People who get this wrong probably have no scruples inserting the
"necessary" type casts to shut up the oh-so-noisy compiler... :-/

[snip]
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.

Nov 14 '05 #41

On Sun, 19 Jun 2005 09:43:34 -0700, Tim Rentsch wrote:

....

Yes, you're writing outside the bounds of the array. The array in
question is out[i] which is (in the specified conditions) a 2 element
array. What you are passing as the first argument to sprintf() is a
pointer to the first element of this 2 element array. When the call to
sprintf() writes to out[i][2] it invokes undefined behaviour.

If you look at the actual language I think you'll find
that it's not as simple as that. So I need to ask you
to cite "chapter and verse", as they say. I've looked
at the actual language quite carefully.

I know what you mean but I believe what I wrote is the intent.

6.5.6p8 says

"... If both the pointer operand and the result point to elements of the
same array object, or one past the last element of the array object, the
evaluation shall not produce an overflow; otherwise, the behavior is
undefined. If the result points one past the last element of the array
object, it shall not be used as the operand of a unary * operator that is
evaluated."

So given

char (*out)[NIBBLES_PER_CHAR] = malloc( NIBBLES_PER_CHAR*len + 1 );

out[N] is an array of NIBBLES_PER_CHAR characters and if I write

char *p = out[N];

then p is a pointer to the first element of an array of NIBBLES_PER_CHAR
chars. p+NIBBLES_PER_CHAR is a pointer to one past the last element of
this array so *(p+NIBBLES_PER_CHAR) violates the "shall" requirement above
resulting in undefined behaviour.

Granted the original example uses sprintf() rather than + and *, and it is
less clear how the one past the end rules apply to standard library
functions.
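
One way to sidestep the question, whichever reading of the standard is right, is to format into a scratch buffer that has room for the terminator and copy only the digits into each row. This is a sketch, not anyone's posted code; hex_image_indexed and tmp are names invented here.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NIBBLES_PER_CHAR ((CHAR_BIT + 3) / 4)

/* Indexed variant that never writes past the end of out[i]. */
char *hex_image_indexed(const void *addr, size_t len)
{
    char (*out)[NIBBLES_PER_CHAR] = malloc(len * sizeof *out + 1);
    const unsigned char *in = addr;
    char tmp[NIBBLES_PER_CHAR + 1];   /* room for the digits plus '\0' */
    size_t i;

    assert(addr != NULL || len == 0);
    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        sprintf(tmp, "%0*x", NIBBLES_PER_CHAR, (unsigned)in[i]);
        memcpy(out[i], tmp, sizeof out[i]);   /* copy the digits only */
    }
    /* Write the terminator through a char * to the whole allocation,
       not through a pointer into any one row. */
    ((char *)out)[len * sizeof *out] = '\0';
    return (char *)out;
}
```

The extra copy costs a little, but no pointer derived from a row is ever used beyond that row's NIBBLES_PER_CHAR elements.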

Lawrence

Nov 14 '05 #42