
Pointer validity

Valid pointers have two states. Either empty (NULL), or filled with an
address that must be at a valid address.

Valid addresses are:

1) The current global context: from the first byte of the program's data
to the last. Here we find static tables, global context pointers, etc.
These are the global variables of the program.

2) The current scope and all nested scopes. The current scope is given
by the addresses of the local variables and the arguments. A conservative
estimate of this area starts at the address of argc in main() or at the
address of the first local variable in main(). Normally, a procedure
should never access memory outside its scope, but it can receive pointers
to areas in higher scopes, so the comparison is not so easy if done
thoroughly.

3) The heap. To this area belong all addresses allocated with malloc()
and not passed to free().

A fast procedure to determine the validity of a pointer could be:

1) Check if the address is in the data area. It would be nice if the
standard specified a name for those addresses, but this is tricky in
environments where those addresses aren't contiguous. Here we suppose
that the compiler supplies __first_data__ and __last_data__.

2) To check if the address is within the valid stack we need two memory
comparisons again: one against the current stack position and one against
the stored value of the top of the stack. We suppose the compiler
provides __top_of_stack__.

3) The heap. We suppose there is a procedure to verify a memory block.

All this would cost a couple of memory reads in most cases, or a call to
a procedure in the case of a malloced block.
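
The three checks described above can be sketched as follows. This is only an illustration under stated assumptions: struct region, in_region(), heap_check_fn and pointer_looks_valid() are hypothetical names, and the region bounds stand in for the __first_data__, __last_data__ and __top_of_stack__ symbols that the post supposes the compiler would supply. The addresses are compared as uintptr_t values, whose numeric meaning is implementation-defined.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor. The post assumes the compiler supplies
   these bounds; in this sketch the caller fills them in by hand. */
struct region {
    uintptr_t lo;   /* lowest address inside the region   */
    uintptr_t hi;   /* one past the highest address in it */
};

/* Build a region from an object's start address and size. */
struct region region_of(const void *base, size_t size)
{
    struct region r = { (uintptr_t)base, (uintptr_t)base + size };
    return r;
}

/* Two integer comparisons per region. */
bool in_region(const void *p, struct region r)
{
    uintptr_t a = (uintptr_t)p;
    return a >= r.lo && a < r.hi;
}

/* Hypothetical heap check; a real one would have to ask the allocator. */
typedef bool (*heap_check_fn)(const void *p);

bool pointer_looks_valid(const void *p, struct region data,
                         struct region stack, heap_check_fn heap_check)
{
    if (p == NULL)
        return true;                    /* the "empty" state */
    if (in_region(p, data) || in_region(p, stack))
        return true;                    /* data area or stack */
    return heap_check != NULL && heap_check(p);
}
```

A passing check only says that the address falls inside one of the regions, not that it designates the object the programmer intended.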

What about making those tests automatic, applied to all pointers passed
to all functions?

That would lead to pointer bugs surfacing immediately. This could be
switched off later. But in the first phases of development, speed is not
as important as correctly implementing the algorithm.

Pointer bugs are likely to surface in the first phases of development,
and we now have the means to make the machine check those pointers.

A run-of-the-mill processor now runs at several GHz. A few memory
comparisons would slow the program down so little as to be completely
transparent on PC architectures.

Of course, in embedded systems the situation is different, but for C
developers on a PC this would be a good improvement.

Just some thoughts

jacob

Nov 13 '05 #1
16 Replies


"jacob navia" <ja***@jacob.remcomp.fr> wrote in message
news:bq**********@news-reader5.wanadoo.fr...
Valid pointers have two states. Either empty (NULL), or filled with an
address that must be at a valid address.

Valid addresses are:

1) The current global context. The first byte of the data of the
program
till the last byte. Here we find static tables, global context
pointers, etc.
This are the global variables of the program.

2) The current scope and all nested scopes. The current scope is given
by the address of the local variables and the arguments. A
conservative
estimate of this area is the address of argc in main() or the
address of the
first local variable in main. Normally, a procedure should never
access
memory outside its scope, but it can receive pointers to areas in
higher
scopes, so the comparison is not easier if done throughly.

3) The heap. To this area belong all addresses allocated with malloc()
and not passed to free().
All of the above is implementation specific and therefore OFF
TOPIC. There is no requirement for a heap or stack frame as we
know and love them. Implementations are allowed to do whatever
they want as long as the behavior appears to conform to the standard.

Pointers to valid memory locations can come from an external
source (e.g., generated by an agent other than the currently
running C program), and be usable by the currently running
C program.
A fast procedure tyo determine the validity of a pointer could be:

1) Check if the address is in the data area. It would be nice if the
standard
specified a name for those addresses, but this is tricky in
environments where those addresses aren't contiguous. Here we
suppose
that the compiler supplies __first_data__ and __last_data__.

2) To check if the address is within the valid stack we need two
memory
comparisons again. The current stack and the stored value of the top
of it.
We suppose the compiler provides __top_of_stack__
A stack frame, if implemented, can contain more data locations
than what would appear to be needed by just looking at the
automatic variable declarations in the source code. A corrupted
pointer that just happens to fall within the stackframe boundaries
would appear to be valid according to your description.
3) The heap. We suppose there is a procedure to verify a memory block.
A single heap is not required by the standard, and certainly
its implementation would be opaque and subject to change.
All this would cost a couple of memory reads in most cases, or a call
to a
procedure, in case of malloced block.

What about making those tests automatically to do that with all
pointers
passed to all functions?

That would lead to pointer bugs surfacing immediately. This could be
disconnected later. But in the first phases of development, speed is
not so
important as correctly implementing the algorithm.

Pointer bugs are likely to surface in the first phases of development,
and we have the means now to put the machine to check those pointers.

A run of the mill processor now runs at several GHZ. Some memory
comparisons would slow down the program so little as to be completely
transparent in PC architectures.

Of course, in embedded systems the situation is different, but for C
developers in a PC this would be a good improvement.

Just some thoughts

jacob


Your premise is flawed, therefore your conclusions are meaningless.
There are plenty of memory management tools out there that are
replacements for the common implementations of malloc() and
friends, for locating heap corruption, dangling references, etc.
It is all dependent upon implementation details and is something
that would require a compiler to have intimate knowledge of the
heap implementation.
Nov 13 '05 #2


"xarax" <xa***@email.com> wrote in message
news:78*******************@newsread2.news.pas.earthlink.net...
All of the above is implementation specific and therefore OFF
TOPIC. There is no requirement for a heap or stackframe as we
know and love them. Implementations are allowed to do whatever
they want as if the behavior appears to conform to the standard.

Sorry but I gather from the standard that the storage allocated by
local variables is valid only during the execution of a function.

Since functions return and are called, this implies a stack structure
one way or the other. The thing gets started with main() that
can call other functions.

The scope of a global is indefinite, as long as the program runs.
This means that C surely assumes that this storage is distinct
conceptually from the local storage.

malloc/free are part of the standard.
Pointers to valid memory locations can come from an external
source (e.g., generated by an agent other than the currently
running C program), and be usable by the currently running
C program.

Yes, we could hypothetically assume that the operating system
returns valid pointers to applications but this is very uncommon,
outside the obvious call to malloc/free.

This is very rare and can be safely forgotten.
Your premise is flawed, therefore your conclusions are meaningless.

There is nothing flawed here.
There are plenty of memory management tools out there that are
replacements for the common implementations of malloc() and
friends, for locating heap corruption, dangling references, etc.
And they do probably a very similar thing to what I described.
It is all dependent upon implementation details and is something
that would require a compiler to have intimate knowledge of the
heap implementation.


Yes. And so what?

My question is: would it be interesting to add to the language itself?

C has been widely criticized, and with reason, for the ample
opportunities for pointer errors. Giving thought to this is not off topic
here. It is one of the most common errors in any program when it is being
developed.

Two conceptions of the C language underlie our differences. For you, any
reflection about some basic tenets of the language is "off topic". I
think too little discussion is going on about how we could improve things.


Nov 13 '05 #3


"jacob navia" <ja***@jacob.remcomp.fr> wrote in message
news:bq**********@news-reader4.wanadoo.fr...

"xarax" <xa***@email.com> wrote in message
news:78*******************@newsread2.news.pas.earthlink.net...
All of the above is implementation specific and therefore OFF
TOPIC. There is no requirement for a heap or stackframe as we
know and love them. Implementations are allowed to do whatever
they want as if the behavior appears to conform to the standard.

Sorry but I gather from the standard that the storage allocated by
local variables is valid only during the execution of a function.

Since functions return and are called, this implies a stack structure


.... can be used, but not that one is required.
one way or the other.
Or some other non-stack method may be used.
The thing gets started with main() that
can call other functions.
The ability for functions to call functions is not required
to be implemented with a stack.


The scope of a global is indefinite, as long as the program runs.
It is definite. The duration of the program's execution.
This means that C surely assumes that this storage is distinct
conceptually from the local storage.
"Local" vs "nonlocal" is not a lifetime issue, but one of scope.
But yes, 'local' vs. 'global' scopes are considered distinct.
That's what 'scope' means, after all. :-)

But there's no requirement that a compiler internally store e.g
'all globals here and all locals there'. This is often done,
but this is an implementation detail.

malloc/free are part of the standard.
Yes they are. What's your point?
Pointers to valid memory locations can come from an external
source (e.g., generated by an agent other than the currently
running C program), and be usable by the currently running
C program.

Yes, we could hypothetically assume that the operating system
returns valid pointers to applications but this is very uncommon,


I find it very common. E.g. Microsoft Windows is very
widespread. So are embedded devices with interfaces that
use pointers.
outside the obvious call to malloc/free.

This is very rare and can be safely forgotten.
IMO not rare at all.

Also, in my experience, the 'rare' problems are the most difficult
to rectify (or even locate).
Your premise is flawed, therefore your conclusions are meaningless.

There is nothing flawed here.
There are plenty of memory management tools out there that are
replacements for the common implementations of malloc() and
friends, for locating heap corruption, dangling references, etc.


And they do probably a very similar thing to what I described.


In necessarily platform specific ways.
It is all dependent upon implementation details and is something
that would require a compiler to have intimate knowledge of the
heap implementation.
Yes. And so what?


So here, we discuss standard C, not implementation details.

My question is: would it be interesting to add to the language itself?
Perhaps it might be to some, but not to others.
C has been widely critized, and with reason, for the ample
opportunities of
pointer errors.

The C language has no need to justify its existence to any
critics. I think it stands upon its own success. AFAIK,
besides COBOL, it's the oldest high-level language still in
widespread use (I welcome any corrections from the historians).
Giving thought to this is not off topic here.
Actually it is. Here we discuss the C language as it is.

It is
one
of the most common errors in any program when it is being developed.
Yes it is. Which is why many/most of the 'critics' you cite
often try to blame their tools for their mistakes.

Two conceptions of the C language underlie our differences. For you,
any
reflection about some basic tenets of the language is "off topic".
Explanations of 'basic tenets' of C as part of helping someone
with C are indeed topical. Speculations/suggestions about the
'how and why', 'can/should something be different', etc. are not.
I
think
too little discussion is going on about how we could improve things.


Much such discussion is occurring, you just apparently aren't
aware of it. The result of many such discussions was C99.
Ten-plus years of discussion. Too little? :-)

I suggest you visit comp.std.c if you want to share ideas
about changes to the language.

-Mike
Nov 13 '05 #4

jacob navia wrote:

"xarax" <xa***@email.com> wrote in message
news:78*******************@newsread2.news.pas.earthlink.net...
All of the above is implementation specific and therefore OFF
TOPIC. There is no requirement for a heap or stackframe as we
know and love them. Implementations are allowed to do whatever
they want as if the behavior appears to conform to the standard.

Sorry but I gather from the standard that the storage allocated by
local variables is valid only during the execution of a function.

Since functions return and are called, this implies a stack structure
one way or the other. The thing gets started with main() that
can call other functions.


Yes, a stack of some kind is implied. But it would be too
much of a leap to assume the stack is represented as a simple
contiguous array of memory! For example, a linked list of
"frames" would serve the needs of C just fine but would make it
impossible to classify a pointer value as stack or non-stack
with just two comparisons, as you suggest.
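
To make the linked-frames case concrete, here is a small sketch. struct frame and in_call_chain() are hypothetical names used only for illustration, and the frames are modelled as ordinary blocks of storage. With such a layout, deciding whether an address belongs to some live frame means walking the whole chain, so the cost grows with the call depth instead of being two comparisons.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical activation record: each call gets its own block of
   storage, linked to its caller's, with no ordering between blocks. */
struct frame {
    const struct frame *caller;   /* NULL for the outermost frame    */
    const void         *locals;   /* start of this frame's storage   */
    size_t              size;     /* bytes of storage in this frame  */
};

/* Walk every live frame, starting at the innermost one. */
bool in_call_chain(const struct frame *top, const void *p)
{
    uintptr_t a = (uintptr_t)p;
    for (const struct frame *f = top; f != NULL; f = f->caller) {
        uintptr_t lo = (uintptr_t)f->locals;
        if (a >= lo && a - lo < f->size)
            return true;          /* address lies inside this frame */
    }
    return false;
}
```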
The scope of a global is indefinite, as long as the program runs.
This means that C surely assumes that this storage is distinct
conceptually from the local storage.

malloc/free are part of the standard.
Pointers to valid memory locations can come from an external
source (e.g., generated by an agent other than the currently
running C program), and be usable by the currently running
C program.


Yes, we could hypothetically assume that the operating system
returns valid pointers to applications but this is very uncommon,
outside the obvious call to malloc/free.

This is very rare and can be safely forgotten.


Actually, there are two *very* common examples of "out of the
blue" memory, provided to and used by a large fraction of all C
programs. The second argument to main() comes from -- well, from
who knows where, and so do the strings to which its elements
point. A possibly less common example is getenv(), and perhaps
some thought might suggest others. In any event, memory supplied
from extra-program sources can't be "safely forgotten."
There are plenty of memory management tools out there that are
replacements for the common implementations of malloc() and
friends, for locating heap corruption, dangling references, etc.


And they do probably a very similar thing to what I described.


There's a fairly extensive literature on checking pointer
validity, but most of what I've seen addresses a more important
problem than you're tackling. For example, simply knowing that
a pointer addresses a valid object isn't enough:

int a[10][10];
int *p = &a[0][9];
*++p = 0; // valid pointer, invalid access
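
One way to catch this kind of error is to carry the bounds of the intended array along with the pointer. The following is a rough sketch only: struct checked_ptr and its helper functions are hypothetical names, not taken from any existing tool.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": a position plus the bounds of the
   array it was derived from. */
struct checked_ptr {
    int   *base;    /* first element of the intended array */
    size_t len;     /* number of elements in that array    */
    size_t idx;     /* current position, 0..len            */
};

struct checked_ptr checked_from(int *base, size_t len, size_t idx)
{
    struct checked_ptr p = { base, len, idx };
    return p;
}

/* Advance by one element; refuse to move beyond one-past-the-end. */
bool checked_inc(struct checked_ptr *p)
{
    if (p->idx >= p->len)
        return false;
    p->idx++;
    return true;
}

/* Store through the pointer only if it designates a real element. */
bool checked_store(struct checked_ptr p, int value)
{
    if (p.idx >= p.len)
        return false;       /* one-past-the-end: reject the access */
    p.base[p.idx] = value;
    return true;
}
```

With a checked pointer built from a[0] (length 10) at index 9, the increment succeeds but the following store is rejected, even though the address it would write to lies inside the larger object a.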

--
Er*********@sun.com
Nov 13 '05 #5

Mike Wahler wrote:
"jacob navia" <ja***@jacob.remcomp.fr> wrote in message
news:bq**********@news-reader4.wanadoo.fr...

"xarax" <xa***@email.com> wrote in message
news:78*******************@newsread2.news.pas.earthlink.net...
> All of the above is implementation specific and therefore OFF
> TOPIC. There is no requirement for a heap or stackframe as we
> know and love them. Implementations are allowed to do whatever
> they want as if the behavior appears to conform to the standard.
>
Sorry but I gather from the standard that the storage allocated by
local variables is valid only during the execution of a function.

Since functions return and are called, this implies a stack structure


... can be used, but not that one is required.


That's not what Jacob meant. At any point during the execution of a C
program, the currently active functions together with their local
storage form a stack structure. Calling a function is equivalent to
pushing an item onto the stack; returning from a function pops it off
the (top of the) stack. However this is implemented, the basic
operations (call/return) correspond to those which can be performed on
a stack (push/pop), so it's entirely accurate, and natural, to talk
about "the call stack".
The C language has no need to justify its existence to any
critics. I think it stands upon its own success. AFAIK,
besides COBOL, it's the oldest high-level language still in
widespread use (I welcome any corrections from the historians).


That depends on what you mean by "widespread". There are several
languages still in use that are considerably older than C, e.g. (in
descending order of popularity) Fortran, Lisp, BCPL, etc.

Jeremy.
Nov 13 '05 #6

Mike Wahler wrote:
Here we discuss the C language as it is.


Are we discussing C89, C99 or some combination of the two?

Nov 13 '05 #7

"jacob navia" <ja***@jacob.remcomp.fr> wrote in message
news:bq**********@news-reader4.wanadoo.fr...

"xarax" <xa***@email.com> wrote in message
news:78*******************@newsread2.news.pas.earthlink.net...
All of the above is implementation specific and therefore OFF
TOPIC. There is no requirement for a heap or stackframe as we
know and love them. Implementations are allowed to do whatever
they want as if the behavior appears to conform to the standard.

Sorry but I gather from the standard that the storage allocated by
local variables is valid only during the execution of a function.

Actually, it's "Storage for the object is no longer guaranteed
to be reserved when execution of the block ends in any way."
Standard doesn't mention "validity of a storage" in this context.
Since functions return and are called, this implies a stack structure
one way or the other.
It does not. That it is common on some platforms doesn't mean
that standard implies such thing.
The thing gets started with main() that
Not necessarily in freestanding environments.
can call other functions.

The scope of a global is indefinite, as long as the program runs.
You seem to be confusing scope (of identifiers) and (storage)
duration. For neither of them standard enumerates "indefinite".
This means that C surely assumes that this storage is distinct
conceptually from the local storage.
How did you arrive at this conclusion ("C surely assumes")?
malloc/free are part of the standard.
Pointers to valid memory locations can come from an external
source (e.g., generated by an agent other than the currently
running C program), and be usable by the currently running
C program.

Yes, we could hypothetically assume that the operating system
returns valid pointers to applications but this is very uncommon,
outside the obvious call to malloc/free.

Maybe uncommon for programs you are writing? BTW, the standard doesn't
say that malloc() returns a pointer obtained from the OS.
This is very rare and can be safely forgotten.
Sure. Given your experience and the problems you are facing (which
undoubtedly spawned this thread) ...
Your premise is flawed, therefore your conclusions are meaningless.


There is nothing flawed here.

Well, so far you've got "malloc/free are part of the standard"
right.
There are plenty of memory management tools out there that are
replacements for the common implementations of malloc() and
friends, for locating heap corruption, dangling references, etc.


And they do probably a very similar thing to what I described.

So sensible thing would be to get them and use them. No need
for fundamental language change.
It is all dependent upon implementation details and is something
that would require a compiler to have intimate knowledge of the
heap implementation.


Yes. And so what?

My question is: would it be interesting to add to the language itself?

No.
C has been widely critized, and with reason, for the ample
opportunities of pointer errors.
So were chainsaws by idiots sawing their fingers off. Blame
the tools, eh?
Giving thought to this is not off topic here. It is
one
of the most common errors in any program when it is being developed.
You mean in any program developed by you? Or by newbie? C is not
a tool that can be mastered in "21 days" or so (for majority
of people, anyway, IMHO).
Two conceptions of the C language ------^^^^^^^^^^^
There was only one - by Dennis M. Ritchie, AFAIK. If you want
to talk concepts, in clc there is still only one, as I gather -
that of standard. Maybe comp.std.c would be better place?
underlie our differences. For you, any reflection
about some basic tenets of the language is "off topic".
Again, unwarranted conclusion.
I think
too little discussion is going on about how we could improve things.


First step would be to fully understand how things *are*.
Then, *why* are they as they are. (Not that I know it all,
but what you are asking is against "spirit" of C, as *I* see
it. But who am I, anyway?:-)
Nov 13 '05 #8

On Mon, 1 Dec 2003 22:32:26 +0100, "jacob navia"
<ja***@jacob.remcomp.fr> wrote in comp.lang.c:
Valid pointers have two states. Either empty (NULL), or filled with an
address that must be at a valid address.

Valid addresses are:

1) The current global context. The first byte of the data of the
program
till the last byte. Here we find static tables, global context
pointers, etc.
This are the global variables of the program.

2) The current scope and all nested scopes. The current scope is given
by the address of the local variables and the arguments. A
conservative
estimate of this area is the address of argc in main() or the
address of the
first local variable in main. Normally, a procedure should never
access
memory outside its scope, but it can receive pointers to areas in
higher
scopes, so the comparison is not easier if done throughly.

3) The heap. To this area belong all addresses allocated with malloc()
and not passed to free().

A fast procedure tyo determine the validity of a pointer could be:

1) Check if the address is in the data area. It would be nice if the
standard
specified a name for those addresses, but this is tricky in
environments where those addresses aren't contiguous. Here we
suppose
that the compiler supplies __first_data__ and __last_data__.

2) To check if the address is within the valid stack we need two
memory
comparisons again. The current stack and the stored value of the top
of it.
We suppose the compiler provides __top_of_stack__

3) The heap. We suppose there is a procedure to verify a memory block.

All this would cost a couple of memory reads in most cases, or a call
to a
procedure, in case of malloced block.

What about making those tests automatically to do that with all
pointers
passed to all functions?

That would lead to pointer bugs surfacing immediately. This could be
disconnected later. But in the first phases of development, speed is
not so
important as correctly implementing the algorithm.

Pointer bugs are likely to surface in the first phases of development,
and we have the means now to put the machine to check those pointers.

A run of the mill processor now runs at several GHZ. Some memory
comparisons would slow down the program so little as to be completely
transparent in PC architectures.

Of course, in embedded systems the situation is different, but for C
developers in a PC this would be a good improvement.

Just some thoughts

jacob


Speaking as an lcc-win32 user, I like the idea but would suggest a
somewhat different implementation, at least from what I think you are
suggesting.

It sounds to me like you are thinking of adding a compiler option that
would silently generate runtime code to test the validity of a pointer
every time it was used in the code in certain situations.
Dereferencing, certainly. Also assigning to pointers, passing as
function arguments, returning from functions?

That sounds like too much overhead even in early testing.

I would suggest something like the assert macro. A macro like:

POINTER_TEST(ptr_name);

...that could be put in explicitly where wanted, returning 0 if the
pointer is invalid, non-zero if it passes the test, so that in fact
the POINTER_TEST macro could be used inside an assert macro.

For example consider a function that receives a pointer to a
structure. Ideally, that pointer should be validated only once, like
it might be checked for NULL once at the beginning of the function,
rather than for each of the many times the code uses the pointer to
access a structure member.

And, of course, like the assert macro, the pointer test macro should
expand to nothing (such as "((void)0)"), depending on the definition or
lack of some other macro definition.

Example:

#ifdef TEST_POINTERS
#define POINTER_TEST(p) pointer_test(p)
#else
#define POINTER_TEST(p) ((void)0)
#endif
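
A slightly fuller sketch of how such a macro might be used inside assert follows. The body of pointer_test() here is only a placeholder that rejects null pointers, since a real check would be implementation specific; struct point and point_sum() are made-up names for the example. One assumption differs from the macro above: so that the macro remains usable as an assert argument when the checks are disabled, the disabled form in this sketch expands to 1 instead of to a void expression.

```c
#include <assert.h>
#include <stddef.h>

struct point { int x, y; };

/* Placeholder check: returns non-zero if the pointer passes.
   A real implementation would consult compiler or allocator data. */
int pointer_test(const void *p)
{
    return p != NULL;
}

/* Normally supplied on the command line, e.g. -DTEST_POINTERS. */
#define TEST_POINTERS

#ifdef TEST_POINTERS
#define POINTER_TEST(p) pointer_test(p)
#else
#define POINTER_TEST(p) 1   /* always passes when checks are disabled */
#endif

/* The pointer is validated once on entry, not at every member access. */
int point_sum(const struct point *pt)
{
    assert(POINTER_TEST(pt));
    return pt->x + pt->y;
}
```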

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
Nov 13 '05 #9

>Sorry but I gather from the standard that the storage allocated by
local variables is valid only during the execution of a function.

Since functions return and are called, this implies a stack structure
one way or the other. The thing gets started with main() that
can call other functions.
This does not disallow a "stack structure" that involves OS/360-style
save areas ("stack frames", if you insist) that are GETMAIN'd on
function entry and FREEMAIN'd on function exit. The so-called "stack"
would then be intermixed with malloc() memory as malloc() would almost
certainly call GETMAIN also.

The possibility of having multiple threads also tends to blow away
the idea that active local variables can be found in a contiguous
area between something like the address of a local variable in main()
and the address of a local variable in the current function.

The scope of a global is indefinite, as long as the program runs.
This means that C surely assumes that this storage is distinct
conceptually from the local storage.
But it does not mean that any global is contiguous with any other
global in another translation unit, without having non-globals
(e.g. read-only code) in between them.
malloc/free are part of the standard.
Pointers to valid memory locations can come from an external
source (e.g., generated by an agent other than the currently
running C program), and be usable by the currently running
C program.
Nobody ever uses functions like mmap() or dlopen(). Well, in ANSI
C, they really don't.
There are plenty of memory management tools out there that are
replacements for the common implementations of malloc() and
friends, for locating heap corruption, dangling references, etc.
And they do probably a very similar thing to what I described.


For memory management debugging purposes, I'd like to see
check_malloc_arena(void) which checks for memory overruns
in an unspecified way but *if* it's a linked list of some sort,
makes sure it's in the correct order and not broken. I could
also use is_from_malloc(void *pointer) which checks whether it
is a valid pointer *FROM malloc()* .
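
An is_from_malloc() of this kind can be approximated in portable C by wrapping the allocator and recording what it hands out. The sketch below assumes hypothetical wrappers tracked_malloc() and tracked_free() and a small fixed-size table; it is not how any particular debugging allocator works. It relies only on equality comparison, which is defined for any two valid object pointers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_TRACKED 1024

static void  *tracked[MAX_TRACKED];   /* blocks currently allocated */
static size_t tracked_count;

/* Allocate and remember the block; fails if the table is full. */
void *tracked_malloc(size_t size)
{
    if (tracked_count == MAX_TRACKED)
        return NULL;
    void *p = malloc(size);
    if (p != NULL)
        tracked[tracked_count++] = p;
    return p;
}

/* True only for the exact start address of a live tracked block. */
bool is_from_malloc(const void *p)
{
    if (p == NULL)
        return false;
    for (size_t i = 0; i < tracked_count; i++)
        if (tracked[i] == p)
            return true;
    return false;
}

/* Forget and release the block; unknown pointers are ignored. */
void tracked_free(void *p)
{
    for (size_t i = 0; i < tracked_count; i++) {
        if (tracked[i] == p) {
            tracked[i] = tracked[--tracked_count];
            free(p);
            return;
        }
    }
}
```

The linear search keeps the sketch short; a pointer into the middle of a block is reported as not coming from malloc().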
My question is: would it be interesting to add to the language itself?


It tends to encourage programmers to do horrible hacks of
making more variants of NULL that test as invalid pointers for
use with more special cases. For example:

struct symtab *lookupsymbol(char *name)

Returns: pointer to symbol table entry if name is found or created.
NULL if name is an invalid symbol
(void *)3 if memory could not be allocated to create a new
symbol table entry.
(void *)7 if an attempt is made to create an entry for two symbols
identical in the first 64 characters.
(void *)11 if name is an officially registered obscene word.

Oh, yes, on the author's machine, 3, 7, and 11 are considered invalid
pointers, and the author is depending on that being true elsewhere.
Gordon L. Burditt
Nov 13 '05 #10

On Mon, 1 Dec 2003 22:32:26 +0100, "jacob navia"
<ja***@jacob.remcomp.fr> wrote:
Valid pointers have two states. Either empty (NULL), or filled with an
address that must be at a valid address.

Valid addresses are:

1) The current global context. The first byte of the data of the
program
till the last byte. Here we find static tables, global context
pointers, etc.
This are the global variables of the program.

2) The current scope and all nested scopes. The current scope is given
by the address of the local variables and the arguments. A
conservative
estimate of this area is the address of argc in main() or the
address of the
first local variable in main. Normally, a procedure should never
access
memory outside its scope, but it can receive pointers to areas in
higher
scopes, so the comparison is not easier if done throughly.

3) The heap. To this area belong all addresses allocated with malloc()
and not passed to free().


What about those of us whose hardware does not have stack or a heap?

Why do you think argc and argv are in that order? Why do you think
they are in any way related to the location of local variables?

What about function pointers?

There is at least one more state, uninitialized.
<<Remove the del for email>>
Nov 13 '05 #11

"jacob navia" <ja***@jacob.remcomp.fr> wrote:
Valid pointers have two states. Either empty (NULL), or filled with an
address that must be at a valid address.

Valid addresses are:

1) The current global context.
There is no "current" global context. The point of global is that it's
global, not current.
The first byte of the data of the program
till the last byte. Here we find static tables, global context
pointers, etc.
That is assuming that these are to be found in one place only. Not
necessarily true; these two:

char *str1 ="Of Man's first disobedience, and the fruit";
char str2[]="Through Eden took their Solitary Way";

may well reside in widely separate areas of memory.
2) The current scope and all nested scopes.
Again, not necessarily together.
The current scope is given by the address of the local variables and
the arguments. A conservative estimate of this area is the address
of argc in main() or the address of the first local variable in main.
Up, or down?
3) The heap. To this area belong all addresses allocated with malloc()
Or realloc(), or calloc().
and not passed to free().
Or realloc().
A fast procedure tyo determine the validity of a pointer could be:
Fast, but completely unportable.
1) Check if the address is in the data area.
How? Pointer comparison is not defined for pointers in different
objects, let alone between a valid and an invalid pointer.
2) To check if the address is within the valid stack we need two
memory comparisons again.
Ditto.
3) The heap. We suppose there is a procedure to verify a memory block.


You might as well suppose an implementation-supplied function called
validate_pointer(void *ptr), which would do all the work for you. It
would, of course, not work for function pointers (which you never even
mention), and wouldn't be able to tell if the pointer were properly
aligned.

Richard
Nov 13 '05 #12

In <bq**********@news-reader5.wanadoo.fr> "jacob navia" <ja***@jacob.remcomp.fr> writes:
Valid pointers have two states. Either empty (NULL), or filled with an
address that must be at a valid address.
Null pointers are valid only in certain contexts: they can be assigned to
other pointers, type converted and used as operands for the equality
operators. In any other context involving the pointer value, they're
invalid.
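
A small sketch of those contexts, with function names invented for the example: the first function only assigns, converts and compares a possibly null pointer, all of which are fine; the second dereferences, so it has to rule out null first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the length of s, or 0 when s is a null pointer. */
size_t length_or_zero(const char *s)
{
    const void *v = s;   /* assigning/converting a null pointer is fine */

    if (v == NULL)       /* so is using it with an equality operator */
        return 0;
    return strlen(s);    /* only reached with a non-null s */
}

/* Dereferencing needs a non-null pointer; the assert documents that
 * precondition instead of silently accepting null. */
char first_char(const char *s)
{
    assert(s != NULL);
    return *s;
}
```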
Valid addresses are:

1) The current global context. The first byte of the data of the
program
till the last byte. Here we find static tables, global context
pointers, etc.
These are the global variables of the program.
There is no requirement/guarantee that this is a compact address space.
It may have "holes", whose addresses are invalid or each item may be put
into its own memory segment on a segmented memory architecture.
2) The current scope and all nested scopes. The current scope is given
by the address of the local variables and the arguments. A
conservative
estimate of this area is the address of argc in main() or the
address of the
first local variable in main. Normally, a procedure should never
access
memory outside its scope, but it can receive pointers to areas in
higher
scopes, so the comparison is not easier if done thoroughly.
See my comment above.
3) The heap. To this area belong all addresses allocated with malloc()
and not passed to free().


Ditto. Also, there are malloc implementations that deliberately don't
use a heap, to assist in the immediate detection of buffer overruns.
See Electric Fence for an example.

As far as the C standard is concerned, *each* top level (outermost)
object exists in an address space of its own. This is clearly indicated
by the interdiction to even compare pointers that don't point to the same
object (or one byte after).

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 13 '05 #13

P: n/a
"jacob navia" <ja***@jacob.remcomp.fr> wrote:
Valid pointers have two states. Either empty (NULL), or filled with an
address that must be at a valid address.

Valid addresses are: [... pointer classification description deleted ...]


Indeed this is very similar to an idea I've had about extending the C
language. If you think about it, tools like Purify, etc. must already
do things like this.

There are, of course, some problems with the approach you propose:

1. C is commonly extended to be used for hardware device drivers. In
such situations memory mapped pointers which are outside of any of
your classifications will exist.

2. C/C++ is commonly extended to include multithreading. In such
environments there are *many* stacks. This would require that you
retain a list of all such stacks at any time.

3. Some operating systems may expose a *shared memory* region from
which different applications may share access to pointers coming from
the OS. But as has been pointed out by another poster, the OS can, in
general, give you pointers that come from who knows where.

So rather than making the hardline assertion about whether or not a
pointer is valid, why don't you instead try to determine the nature of
the pointer as best you can determine it? For example:

enum PTR_CLASSIFICATION {
PTRCL_NULL = 0, /* NULL */
PTRCL_UNKNOWN = 1, /* Unknown classification */
PTRCL_ERR = 2, /* Pointers we *know* are wrong */
PTRCL_STATIC_DATA = 3, /* In your data or program areas */
PTRCL_AUTO = 4, /* A local variable (live stack) */
PTRCL_HEAP = 5, /* In the heap */
PTRCL_MAXIMUM = 5
};

enum PTR_CLASSIFICATION getPtrClassification (void *p);

The point being that any compiler could extend this by adding in more
classifications after PTRCL_HEAP, but not make unfounded assertions
(there may be pointer types it doesn't know about, but others that are
known for sure to be wrong.) The function is only required to make a
best effort range check -- the pointer may be invalid for other
reasons such as alignment which cannot be determined since the type is
not provided.

This then leaves it to the application to try to use this to test the
validity of a pointer. In this way, even if the application has
access to pointers outside of classifications that the compiler is
aware of, one can still somehow try to account for these by hook or by
crook in the application itself without being misled about the true
validity of the pointer.
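
As a rough illustration of the "best effort" idea, here is a user-level sketch that can only recognise null pointers and blocks handed out by its own allocator wrapper; everything else is reported as unknown rather than invalid. The wrappers tracked_malloc and tracked_free, the fixed-size registry and MAX_BLOCKS are all hypothetical, only three of the classifications are implemented, and the range check assumes uintptr_t exists and that converted addresses compare like a flat address space, which the standard does not promise.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum PTR_CLASSIFICATION {
    PTRCL_NULL = 0,     /* NULL */
    PTRCL_UNKNOWN = 1,  /* Unknown classification */
    PTRCL_HEAP = 5      /* In a block we allocated ourselves */
};

#define MAX_BLOCKS 64   /* hypothetical registry size */

struct block { uintptr_t start; size_t size; };
static struct block blocks[MAX_BLOCKS];
static size_t block_count;

/* Hypothetical wrapper: allocate a block and remember its range. */
void *tracked_malloc(size_t size)
{
    void *p;

    if (size == 0 || block_count == MAX_BLOCKS)
        return NULL;
    p = malloc(size);
    if (p != NULL) {
        blocks[block_count].start = (uintptr_t)p;
        blocks[block_count].size = size;
        block_count++;
    }
    return p;
}

/* Hypothetical wrapper: forget the block, then release it. */
void tracked_free(void *p)
{
    uintptr_t a = (uintptr_t)p;

    for (size_t i = 0; i < block_count; i++) {
        if (blocks[i].start == a) {
            blocks[i] = blocks[--block_count];
            break;
        }
    }
    free(p);
}

/* Best-effort classification: never claims a pointer is wrong. */
enum PTR_CLASSIFICATION getPtrClassification(void *p)
{
    uintptr_t a;

    assert(block_count <= MAX_BLOCKS);
    if (p == NULL)
        return PTRCL_NULL;
    a = (uintptr_t)p;
    for (size_t i = 0; i < block_count; i++) {
        if (a >= blocks[i].start && a - blocks[i].start < blocks[i].size)
            return PTRCL_HEAP;
    }
    return PTRCL_UNKNOWN;
}
```

A compiler-supplied version could obviously know far more (static data, live stack frames); the point of the sketch is only that "unknown" is an honest answer where "invalid" would not be.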

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/
Nov 13 '05 #14

P: n/a
qe*@pobox.com (Paul Hsieh) wrote:
So rather than making the hardline assertion about whether or not a
pointer is valid, why don't you instead try to determine the nature of
the pointer as best you can determine it? For example:

enum PTR_CLASSIFICATION {
PTRCL_NULL = 0, /* NULL */
PTRCL_UNKNOWN = 1, /* Unknown classification */
PTRCL_ERR = 2, /* Pointers we *know* are wrong */
PTRCL_STATIC_DATA = 3, /* In your data or program areas */
PTRCL_AUTO = 4, /* A local variable (live stack) */
PTRCL_HEAP = 5, /* In the heap */
PTRCL_MAXIMUM = 5
};

enum PTR_CLASSIFICATION getPtrClassification (void *p);


Question: why do you give a damn about anything but "pointer is null",
"pointer points to an object", "pointer points to a function" and
"pointer is invalid"? To a well-written program, it should not matter if
a pointer points to automatic, static or allocated memory.

Richard
Nov 13 '05 #15

P: n/a
rl*@hoekstra-uitgeverij.nl says:
qe*@pobox.com (Paul Hsieh) wrote:
So rather than making the hardline assertion about whether or not a
pointer is valid, why don't you instead try to determine the nature of
the pointer as best you can determine it? For example:

enum PTR_CLASSIFICATION {
PTRCL_NULL = 0, /* NULL */
PTRCL_UNKNOWN = 1, /* Unknown classification */
PTRCL_ERR = 2, /* Pointers we *know* are wrong */
PTRCL_STATIC_DATA = 3, /* In your data or program areas */
PTRCL_AUTO = 4, /* A local variable (live stack) */
PTRCL_HEAP = 5, /* In the heap */
PTRCL_MAXIMUM = 5
};

enum PTR_CLASSIFICATION getPtrClassification (void *p);


Question: why do you give a damn about anything but "pointer is null",
"pointer points to an object", "pointer points to a function" and
"pointer is invalid"? To a well-written program, it should not matter if
a pointer points to automatic, static or allocated memory.


The most obvious use for this is *DEBUGGING*. In the realm of debugging, the
more information you recover at the time of error, the better.

The classic example is being passed a string which you then store into a
structure. You check back at some other point and the data is corrupted --
why? Because the string was really just a local char array that is long gone.
Knowing that a pointer points to an object is useless if the reason it's
wrong is that the object was a local that's no longer there. This is one of
many notoriously difficult debugging cases for which you would wish to have
more information about what your data is, and where it came from.
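
For concreteness, a minimal sketch of that scenario and the usual way around it, with all names (struct record, record_set_name, fill_from_local) invented for the example. The buggy variant would store the caller's pointer directly; the version below stores a private copy, so the record survives the local array going away.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct record {
    char *name;   /* owned copy, or NULL */
};

/* Stores a private copy of s so the record does not depend on the
 * caller's storage.  Returns 0 on success, -1 if allocation fails.
 * Sketch only: a previous name, if any, is overwritten, not freed. */
int record_set_name(struct record *r, const char *s)
{
    char *copy;

    assert(r != NULL && s != NULL);
    copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, s);
    r->name = copy;
    return 0;
}

/* Releases the owned copy. */
void record_clear(struct record *r)
{
    free(r->name);
    r->name = NULL;
}

/* The local array dies when this function returns.  Writing
 * "r->name = local;" here would leave a dangling pointer behind;
 * copying through record_set_name avoids that. */
int fill_from_local(struct record *r)
{
    char local[] = "temporary";

    return record_set_name(r, local);
}
```

With the direct assignment, nothing in the language flags the error at the point where it is made, which is why knowing that a stored pointer refers to automatic storage would help when tracking it down.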

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/
Nov 13 '05 #16

P: n/a
qe*@pobox.com (Paul Hsieh) wrote:
rl*@hoekstra-uitgeverij.nl says:
qe*@pobox.com (Paul Hsieh) wrote:
enum PTR_CLASSIFICATION {
PTRCL_NULL = 0, /* NULL */
PTRCL_UNKNOWN = 1, /* Unknown classification */
PTRCL_ERR = 2, /* Pointers we *know* are wrong */
PTRCL_STATIC_DATA = 3, /* In your data or program areas */
PTRCL_AUTO = 4, /* A local variable (live stack) */
PTRCL_HEAP = 5, /* In the heap */
PTRCL_MAXIMUM = 5
};

enum PTR_CLASSIFICATION getPtrClassification (void *p);


Question: why do you give a damn about anything but "pointer is null",
"pointer points to an object", "pointer points to a function" and
"pointer is invalid"? To a well-written program, it should not matter if
a pointer points to automatic, static or allocated memory.


The most obvious use for this is *DEBUGGING*. In the realm of debugging, the
more information you recover at the time of error, the better.


Of course. But that's what debuggers are for; you shouldn't be doing
this yourself.

Richard
Nov 13 '05 #17
