Incrementing a void pointer. Legal C99?

Erik de Castro Lopo

Hi all,

The GNU C compiler allows a void pointer to be incremented and
the behaviour is equivalent to incrementing a char pointer.

Is this legal C99 or is this a GNU C extension?

Thanks in advance.

Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"C++ is a language strongly optimized for liars and people who
go by guesswork and ignorance." -- Erik Naggum

Apr 13 '06 #1

Burton Samograd

Erik de Castro Lopo <no****@mega-nerd.com> writes:

The GNU C compiler allows a void pointer to be incremented and
the behaviour is equivalent to incrementing a char pointer.

Is this legal C99 or is this a GNU C extension?

gcc extension

check info gcc -> C Extensions

--
burton samograd kruhft .at. gmail
kruhft.blogspot.com www.myspace.com/kruhft metashell.blogspot.com

Apr 13 '06 #2

Spoon

Erik de Castro Lopo wrote:

The GNU C compiler allows a void pointer to be incremented and
the behaviour is equivalent to incrementing a char pointer.

Is this legal C99 or is this a GNU C extension?

Even if you specify -Wall -Wextra -Werror -std=c99 -pedantic ?

Apr 13 '06 #3

Diomidis Spinellis

Erik de Castro Lopo wrote:

The GNU C compiler allows a void pointer to be incremented and
the behaviour is equivalent to incrementing a char pointer.

Is this legal C99 or is this a GNU C extension?

C99 (6.5.6) defines the addition of an integer to a pointer in terms of
a pointer pointing to an element of an array: after the addition of N
the pointer will point N elements forward (subject to array size
restrictions). C99 (6.2.5) also says that void is an incomplete type,
which means that arrays of that type can not be constructed. My
understanding from these two facts is that C99 doesn't define pointer
arithmetic with a void pointer.

Note that gcc -ansi -pedantic will issue a warning:

warning: wrong type argument to increment
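
For code that has to compile as strict C99, a common portable substitute is to do the arithmetic on a character pointer instead. The sketch below uses a made-up helper name, advance_bytes, which is an assumption for illustration and not part of any library:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: step a generic pointer forward by n bytes.
 * The arithmetic is done on unsigned char *, which standard C defines,
 * and the result converts back to void * implicitly. */
static void *advance_bytes(void *p, size_t n)
{
    assert(p != NULL);
    return (unsigned char *)p + n;
}
```

The usual array rules still apply: the result must stay inside, or one past the end of, the object that p points into.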

--
Diomidis Spinellis
Code Quality: The Open Source Perspective (Addison-Wesley 2006)
http://www.spinellis.gr/codequality

Apr 13 '06 #4

Erik de Castro Lopo

Spoon wrote:

Erik de Castro Lopo wrote:
The GNU C compiler allows a void pointer to be incremented and
the behaviour is equivalent to incrementing a char pointer.

Is this legal C99 or is this a GNU C extension?

Even if you specify -Wall -Wextra -Werror -std=c99 -pedantic ?

Unfortunately I have to use -std=gnu99 to get access to
64 bit file offsets for the POSIX read/write/lseek
family of functions (I'm targeting POSIX rather than
real standard C).

Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"life is too long to know C++ well" -- Erik Naggum

Apr 13 '06 #5

danielhe99

> The GNU C compiler allows a void pointer to be incremented and
> the behaviour is equivalent to incrementing a char pointer.
> Is this legal C99 or is this a GNU C extension?

The pointer can be regarded as "memory address". The value of the
pointer can be incremented whatever the pointer type is. But you
cannot know the memory boundary of the structure to which the void
pointer refers.

In your case, the behavior is legal for both C99 and GNU C. But you
need to cast the void pointer into a known data type when you want to
access the content at the pointer.

Apr 14 '06 #6

Keith Thompson

da********@gmail.com writes:

> The GNU C compiler allows a void pointer to be incremented and
> the behaviour is equivalent to incrementing a char pointer.
> Is this legal C99 or is this a GNU C extension?

Please don't snip attribution lines. The above was written by Erik de
Castro Lopo.

> The pointer can be regarded as "memory address". The value of the
> pointer can be incremented whatever the pointer type is. But you
> cannot know the memory boundary of the structure to which the void
> pointer refers.

Right.

> In your case, the behavior is legal for both C99 and GNU C. But you
> need to cast the void pointer into a known data type when you want to
> access the content at the pointer.

No, standard C does not allow arithmetic on void*; it's a gcc
extension.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Apr 14 '06 #7

Bill Pursell

Burton Samograd wrote:

Erik de Castro Lopo <no****@mega-nerd.com> writes:
The GNU C compiler allows a void pointer to be incremented and
the behaviour is equivalent to incrementing a char pointer.

Is this legal C99 or is this a GNU C extension?

gcc extension

check info gcc -> C Extensions

Note that:
void *a; ..... a++; leads to a warning from gcc, but
void **a; .....a++; does not.
You are probably trying to do the latter anyway.
Are the following identical semantically?
1)
void *a;
a = &foo;
a = (void **)a+1;

2)
void **a;
a = &foo;
a++;

I feel a little uneasy declaring things as void **, (or void ***), but
it seems to work just fine. I'm not sure why I feel uncomfortable with
it. Is it safe and valid?, or should the cast be used?

Apr 14 '06 #8

Chris Torek

>> Erik de Castro Lopo <no****@mega-nerd.com> writes:
>> The GNU C compiler allows a void pointer to be incremented and
>> the behaviour is equivalent to incrementing a char pointer.
>>
>> Is this legal C99 or is this a GNU C extension?

> Burton Samograd wrote:
> gcc extension
>
> check info gcc -> C Extensions

(Burton Samograd is correct. Note that when GCC is requested
to compile either Standard C89 or Standard C99, it does complain
about attempts to do pointer arithmetic of any sort on "void *".)

In article <11**********************@u72g2000cwu.googlegroups.com>
Bill Pursell <bi**********@gmail.com> wrote:

> Note that:
> void *a; ..... a++; leads to a warning from gcc, but
> void **a; .....a++; does not.

That is because the C Standards forbid the former (arithmetic on
"void *"), but allow the latter (arithmetic on a pointer that points
to "void *", i.e., values of type "void **").

> Are the following identical semantically?
> 1)
> void *a;
> a = &foo;
> a = (void **)a+1;
>
> 2)
> void **a;
> a = &foo;
> a++;

Let me rewrite these as differently-named variables, so that
we can talk about "a" and "a" and not be confused as to whether
"a" means "a", or means instead "a". :-) I will rename the
first one "v1" and the second one "v2".

Consider a hypothetical machine that has 128-byte "void *"s, but
4 or 8 byte "int *", "double *", "struct S *", and so on. That
is, almost every pointer is just 4 or 8 bytes, as they are on
most machines with which most programmers are familiar -- but
for some reason (perhaps random stubborn-ness), the C compiler
writer chose to make "void *" *much* bigger.

In this case, "void *v1" declares a 128-byte pointer, and
sizeof(v1) is 128. Those 128 bytes hold some value(s); as
required by Standard C, after:

void *v1 = &foo;

the 128 bytes hold enough information to locate the variable
"foo", no matter what (data) type "foo" has (we can assume
"foo" is not a function name here).

On the other hand, "void **v2" declares an ordinary 4 or 8 byte
pointer, so that sizeof(v2) is 4 or 8. If we are lucky -- it is
not clear whether this is "good luck" or "bad luck" -- and we
write:

void **v2 = &foo;

and it compiles at all and runs, "v2" will also hold enough
information to locate the variable "foo". The C standards do
not *guarantee* this unless "foo" has type "void *" in the
first place, though, because whatever type "foo" has, &foo
produces a value of type "pointer to ____" (fill in the blank
with foo's type). The variable v2 has type "void **" and thus
can only point to "void *"s.

Let us get a little more concrete about our target machine, and
declare that, on our target machine, "void *" is 128 bytes, and
"int *" is 8 bytes, but "double *" is just 4 bytes. The machine
does this because it has a maximum of 32 gigabytes of RAM. Its
"int"s are 4 bytes long and always aligned on a 4-byte boundary,
but this means that they could be at any of 8,589,934,592 possible
locations, so we need 33 bits to address them. Its "double"s are
8 bytes long and always aligned on an 8-byte boundary, so they
can only be at any of 4,294,967,296 possible locations -- 32
bits suffices to address those.

"void *" remains 128 bytes because of the compiler-writer's whim,
apparently. But, because these are always aligned on a 128-byte
boundary, "void **" only needs to represent 268,435,456 distinct
values (28 bits), so "void **" is just 4 bytes, like "double *".
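
To see what a real implementation does, the sizes can simply be printed. This is a small sketch with a made-up function name, print_pointer_sizes. Only the equality asserted at the top is required by the standard; the other sizes merely happen to match on most mainstream platforms:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: report the sizes of a few object pointer types. */
static void print_pointer_sizes(void)
{
    /* C99 6.2.5 requires void * and pointers to character types to
     * share the same representation and alignment requirements. */
    assert(sizeof(void *) == sizeof(char *));

    printf("void *   : %zu bytes\n", sizeof(void *));
    printf("void **  : %zu bytes\n", sizeof(void **));
    printf("int *    : %zu bytes\n", sizeof(int *));
    printf("double * : %zu bytes\n", sizeof(double *));
}
```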

Now let us suppose that "foo" has type "int":

int foo;
void *v1 = &foo;
void **v2 = (void **)&foo;

The cast is needed here to force the compiler to accept the
conversion: &foo has type "int *" -- which is 8 bytes long -- and
we use the cast to squeeze it through a knothole, whacking off 4
of the 8 bytes, to put it into a "void **".

Having done all of this, "v1" definitely points to "foo" --
128 bytes of "void *" have plenty of room for 8 bytes' worth
of pointer value -- but "v2" might well not point to "foo" at
all, having lost some important bit(s).

Now, if we do:

v1 = ((void **)v1) + 1;

this takes the 128-byte long value in v1 -- which contains the
exact 8-byte address of the variable "foo", along with a bunch more
bytes that are not very interesting -- and scrapes it through the
knothole, producing a 4-byte value of type "void **" and discarding
the remaining 124 bytes. 120 of those 124 were probably useless,
but 4 of them might have held important data. Those data are gone!

Next, this adds 1 "void *"'s worth to the value computed so far.
Since sizeof(void *) is 128, this effectively adds 128 to the
pointer produced by scraping 124 bytes off the value in v1. Then
this result is converted back to "void *" -- adding back 124
bytes, but probably not restoring the lost data -- and that
result is stored back into "v1". Of course, this being all one
big expression, and the effect of "scraping off" value bits not
being defined by the C Standards, it is possible this *does*
restore the lost data, so that v1 points to "128 bytes past
the variable foo".

If we do:

v2++;

then we take the (not well defined) value in v2 -- the result of
squeezing &foo through that same knothole -- and increment it,
pointing to the next "void *" to which v2 points. Since v2 does
not point to "void *"s, the effect continues to be undefined,
but this probably effectively adds 128 to v2.

In other words, the two have similar effects -- but "v1" was
at least guaranteed to start out as valid, while v2 had no
guarantees at all.
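
The guarantee that "v1" relies on is the round trip through "void *". A minimal sketch, with a function name (round_trip) invented for illustration:

```c
#include <assert.h>

/* Convert an int pointer to void * and back again.  Standard C
 * guarantees that the result compares equal to the original. */
static int *round_trip(int *ip)
{
    void *vp = ip;     /* implicit conversion to the generic pointer */
    int *back = vp;    /* implicit conversion back to the real type */

    assert(back == ip);
    return back;
}
```

No comparable guarantee exists for a detour through "void **", which is why the cast in the "v2" case is a warning sign.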

Note that if we made "foo" have type "void *", the picture
changes. In this case, &foo has type "void **" -- pointer to
pointer to void -- which fits in v2 just fine, because v2
has type "pointer to pointer to void" and can thus point to
any "pointer to void". In this case, "v2++" is well-defined
-- it makes v2 point "one past the end of the array", where
"the array" is the "implied" array of size 1 that the Standards
guarantee for ordinary variables. Since v2++ is well-defined,
(v1 = (void **)v1 + 1) is also well-defined and then *does*
do the same thing.
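
The well-defined case can be written out as a small check. This is only a sketch; the function name steps_agree is made up, and "foo" is given type "void *" as in the paragraph above:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if both ways of stepping past a single void * object agree. */
static int steps_agree(void)
{
    void *foo = NULL;       /* foo itself has type void * */
    void *v1 = &foo;        /* void ** converts to void * implicitly */
    void **v2 = &foo;       /* exact type match, no cast needed */

    v1 = (void **)v1 + 1;   /* convert back to void **, then step */
    v2++;                   /* one past the implied array of size 1 */

    /* Neither pointer may be dereferenced now, but both may be compared. */
    assert(v2 == &foo + 1);
    return v1 == (void *)v2;
}
```
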
> I feel a little uneasy declaring things as void **, (or void ***), but
> it seems to work just fine. I'm not sure why I feel uncomfortable with
> it. Is it safe and valid?, or should the cast be used?

It may, I think, help to think of "void *" as a special type
(because it *is* in fact a *very* special type). It may even
help a bit further to use a typedef for it:

typedef void *Generic_Ptr;

Now we have an alias named Generic_Ptr that can be used to declare
variables (and structure members and so on):

Generic_Ptr p1;
Generic_Ptr p2;
Generic_Ptr arr[100];

Of course, given any object (loosely, "variable") of some type T,
we can always, in C, construct a value of type "pointer to T", and
declare variables suitable to hold such values. So if p1, p2,
and arr[i] are "Generic_Ptr"s, we can point to any of them:

Generic_Ptr *q;
...
q = &p1;
...
q = &p2;
...
q = &arr[i]; /* for some valid index "i" */

As with any pointer in C, "q" can be NULL, or it can point to a
single instance of a Generic_Ptr (like p1 and p2), or it can point
into an array (like &arr[i]). If it points to the first element
of an array:

q = arr;

then q[i] "means" the same thing as arr[i]. This is just like
any other pointer in C:

int iarr[100];
int *ip = iarr;
/* now ip[i] is just like iarr[i] */

char carr[100];
char *cp = carr;
/* now cp[i] is just like carr[i] */

And just as we can do things with "*ip++" and "*cp++" to step
through iarr[] and carr[], we can do things with "*q++" to step
through arr[], our array of "Generic_Ptr"s. All we have to
do is set elements of that array as appropriate:

arr[0] = &this_var;
arr[1] = &that_var;
arr[2] = &the_other_var;
arr[3] = NULL; /* marks the end */

for (q = arr; *q != NULL; q++) {
... do something with *q ...
}

Of course, typedefs do not actually define new types, so "q" has
type "void **" -- q has type "pointer to Generic_Ptr", but "Generic_Ptr"
is just another name for "void *".

Note that "void **" is *not* generic; it is an ordinary, non-special
pointer. It just happens to point *to* a special kind of pointer.
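
The loop above can be turned into a complete function. In this sketch the name count_entries is an assumption; it walks a NULL-terminated array of Generic_Ptr using an ordinary "void **":

```c
#include <assert.h>
#include <stddef.h>

typedef void *Generic_Ptr;

/* Count the entries in a NULL-terminated array of generic pointers. */
static size_t count_entries(Generic_Ptr *arr)
{
    Generic_Ptr *q;    /* really a void **, an ordinary pointer */
    size_t n = 0;

    assert(arr != NULL);
    for (q = arr; *q != NULL; q++)
        n++;
    return n;
}
```

Each increment of q moves by sizeof(void *), exactly as stepping an "int *" moves by sizeof(int).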

[pause here, if you like :-) ]

The "void ***" type in C is just like any other triple-pointer: it
can be NULL, or can point to a single "void **", or can point into
an array of "void **"s. If "q" has type "void **", then we can
define a "pq" thus:

void ***pq = &q;

Now *pq is just another way to name "q", and if we do:

q = arr;

then q[i] names arr[i] as before.

Most often, triple pointers in C like "pq" come about when we need
to write a function that sets a variable like "q".

For instance, suppose we start with a function that creates an
array of "char *"s:

char **make_arr(size_t n) {
    char **space;
    size_t i;

    space = malloc(n * sizeof *space);
    if (space == NULL) ... do something here ...
    for (i = 0; i < n; i++)
        space[i] = NULL;
    return space;
}

This function is reasonably straightforward and might be used to
build an "argv" array. Of course, each argv[i] also has to be
set to some useful value, not just NULL -- only the last one is
NULL -- so we might augment it a bit further:

for (i = 0; i < n - 1; i++)
space[i] = malloc( ?? );
space[i] = NULL;

We still need to figure out how much space to allocate, i.e.,
to fill in the "??" part. We also need to handle the case where
we run out of memory (at least one of the malloc()s fails).

(Those paying particularly close attention should have noticed
by now that, with argc and argv, we have argv[argc]==NULL, not
argv[argc-1]==NULL. So make_arr() is not *quite* the same as
working with argv -- we would have to pass in argc+1.)

As the code progresses, we might eventually find ourselves
wanting to write a sub-function that takes "&space" and does
the malloc()-ing. Well, "space" has type "char **" here, so
&space has type "char ***".

If instead of an argv[]-like array, we want an arr[]-like array of
"void *"s, then we would have a "void **space", and if we decided
to put the allocation in a sub-function that takes &space, the
sub-function would have an argument of type "void ***".
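
Such a sub-function might look like the sketch below. The name alloc_ptrs and its return convention (0 on success, -1 on failure) are assumptions made for this example:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sub-function: allocate n generic pointers, all NULL,
 * and store the new array through *spacep.  Returns 0 on success;
 * on failure returns -1 and sets *spacep to NULL. */
static int alloc_ptrs(void ***spacep, size_t n)
{
    void **space;
    size_t i;

    assert(spacep != NULL && n > 0);
    space = malloc(n * sizeof *space);
    if (space == NULL) {
        *spacep = NULL;
        return -1;
    }
    for (i = 0; i < n; i++)
        space[i] = NULL;
    *spacep = space;
    return 0;
}
```

A caller declares "void **space;" and passes "&space", which has type "void ***" with no cast involved.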

Note, again, that it is ONLY the type "void *" that is special:
"void **", "void ***", and even "void ****" or "void *****" are just
ordinary pointer types, and obey all the normal rules for pointers
in C. The "void *" type is the ONLY "special" one, and its
"special-ness" is limited to ordinary assignments and those things
that emulate them (argument passing with prototypes, and -- because
casts are just very forceful assignments -- casts).

The moral, as it were, of all this is that "void *" can point
to any data type, but "void **" can only point to "void *"; and
"void ***" can only point to "void **"; and so on. You can let
yourself think of "void *" as a special case, because it is; but
do not allow this to lead you into thinking that other types are
special too.
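
One practical consequence, sketched with two made-up helpers (store_ptr and point_int_at): a function that takes "void **" should be handed the address of a real "void *" object, with the result converted afterwards, instead of the address of an "int *" forced through a cast.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store a generic pointer through a void **. */
static void store_ptr(void **slot, void *value)
{
    assert(slot != NULL);
    *slot = value;
}

/* Point *ipp at target without treating an int * object as a void *. */
static void point_int_at(int **ipp, int *target)
{
    void *tmp;    /* a real void * object for slot to point to */

    /* store_ptr((void **)ipp, target) would compile, but it writes a
     * void * into an object whose type is int *, which the standard
     * does not define. */
    store_ptr(&tmp, target);
    *ipp = tmp;   /* void * converts back to int * implicitly */
}
```
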
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Apr 23 '06 #9

Bill Pursell

Chris Torek wrote:

Bill Pursell <bi**********@gmail.com> wrote:
Are the following identical semantically?
1)
void *a;
a = &foo;
a = (void **)a+1;

2)
void **a;
a = &foo;
a++;

<long and excellent description which shortens to "no" snipped>
In other words, the two have similar effects -- but "v1" was
at least guaranteed to start out as valid, while v2 had no
guarantees at all.

Thank you, that was very informative. I've been using a function to
create arrays, and I believe my function is valid, but my method for
dereferencing the values is slightly incorrect. Based on your
description, I believe the following is correct. The first method,
commented as invalid, is how I've been using this. I believe I should
simply change to the second method. Comments welcome.

#include <stdlib.h>
#include <stdio.h>

void *
xmalloc(size_t size)
{
void *ret;
/*
* Using the comma is a stylistically horrible way of avoiding
* excessive {}'s. Is there a non-stylistic reason to avoid it?
*/
if ( ( ret = malloc(size)) == NULL)
perror("malloc"), exit(-1);

return ret;
}

/*
* Recursively allocate a
* dimensions[0] x max_dim array of void *.
* If size > 0, malloc space for objects of that size.
*/
void *
make_array(unsigned int *dimensions, unsigned int max_dim, unsigned int
size)
{
void **ret; /* declared as void ** to allow "ret[idx]". Is this
safe?*/
unsigned int idx;

if (max_dim == 0) {
if (size > 0)
return xmalloc(size);
else
return NULL;
}

ret = xmalloc(dimensions[0] * sizeof ret);

for ( idx = 0; idx < dimensions[0]; idx++) {
ret[idx] = make_array(dimensions + 1, max_dim-1, size);
}

return ret;
}

int
main(void)
{
void *foo;
int i,j,k,l;
unsigned int dimensions[] = {5,2,4,8};

foo = make_array( dimensions, 4, sizeof foo);
/* Assign to the array in an invalid way. */
for (i=0; i<5; i++) {
for (j=0; j<2; j++) {
for (k=0; k<4; k++) {
for (l=0; l<8; l++) {
((int ****)foo)[i][j][k][l] = 1000*i + 100*j + 10*k
+ l;
printf("foo [%d][%d][%d][%d] = %d\n", i,j,k,l,
((int ****)foo)[i][j][k][l]);
}
}
}
}
printf("\n\n****************************\n\n");

/* Assign to the array in a valid way. */
for (i=0; i<5; i++) {
for (j=0; j<2; j++) {
for (k=0; k<4; k++) {
for (l=0; l<8; l++) {
((int *)( ((void ****)foo)[i][j][k]))[l] =
1000*i + 100*j + 10*k + l;
printf("foo [%d][%d][%d][%d] = %d\n", i,j,k,l,
((int *)( ((void ****)foo)[i][j][k]))[l] );
}
}
}
}

return 0;
}

Apr 23 '06 #10

Dave Thompson

On 23 Apr 2006 05:29:33 -0700, "Bill Pursell" <bi**********@gmail.com>
wrote:

Chris Torek wrote:
<snip>

Thank you, that was very informative. I've been using a function to
create arrays, and I believe my function is valid, but my method for
dereferencing the values is slightly incorrect. Based on your
description, I believe the following is correct. The first method,
commented as invalid, is how I've been using this. I believe I should
simply change to the second method. Comments welcome.

void *
xmalloc(size_t size)
{
void *ret;
/*
* Using the comma is a stylistically horrible way of avoiding
* excessive {}'s. Is there a non-stylistic reason to avoid it?
*/
if ( ( ret = malloc(size)) == NULL)
perror("malloc"), exit(-1);

return ret;
}
For this case the reasons are only stylistic. (Or possibly related to
features or limitations of the tools you use, like complexity metrics,
although those are borderline stylistic also.) If you (may) want to
use return or <controversial> goto </> those are statements and cannot
be comma-tized. <OT> OTOH C++ throw is an expression. </>

Substantively: malloc() is not standardly required to set errno and
thus produce useful info from perror, and IME rarely does. exit(-1) is
not provided by the standard and in practice not very portable;
exit(EXIT_FAILURE) is the only guaranteed standard way -- or abort()
-- but in practice exit(small_POSITIVE) commonly works.
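
Putting those points together, a revised wrapper could look like this sketch. The message text and the choice to reject size 0 with an assertion are assumptions of the example, not something the thread prescribes:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Allocate size bytes or terminate the program.
 * Assumes callers never pass size == 0, because malloc(0) may
 * legitimately return a null pointer. */
static void *xmalloc(size_t size)
{
    void *ret;

    assert(size > 0);
    ret = malloc(size);
    if (ret == NULL) {
        /* malloc need not set errno, so print a fixed message. */
        fprintf(stderr, "xmalloc: out of memory (%zu bytes)\n", size);
        exit(EXIT_FAILURE);
    }
    return ret;
}
```
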
/*
* Recursively allocate a
* dimensions[0] x max_dim array of void *.
* If size > 0, malloc space for objects of that size.
*/
void *
make_array(unsigned int *dimensions, unsigned int max_dim, unsigned int
size)
In xmalloc you use size_t for size; why not here?
You might want to make dimensions[] size_t also.
To me max_dim implies the callee(s) can choose to do less; I would
name it (something like) num_dim. Or more artfully rank.
{
void **ret; /* declared as void ** to allow "ret[idx]". Is this
safe?*/
Not really. For your current structure you want to treat the bottom
level pointers as void* = pointer to any data, but the second level as
void** = pointer to pointer (to void), and the third as void*** etc.
There are platforms where pointer-to-void (or character) is different
from and cannot safely be treated as pointer-to-pointer. And it is
allowed for pointers to any incompatible types to be different,
including pointer to pointer (to void) versus pointer to pointer (to
pointer to void) etc., although I don't know of any actual case.
unsigned int idx;

if (max_dim == 0) {
if (size > 0)
return xmalloc(size);
else
return NULL;
}

ret = xmalloc(dimensions[0] * sizeof ret);

If you want to deal only with the first problem above you could have
if( rank == 1 ){
void **leaves = xmalloc ...; for( idx ...) leaves[i] = ...
}else /* rank > 1 */ {
void ***branches = xmalloc ...; for( idx ... ) branches[i] =
}

Alternatively you could leave all the pointers as void* but cast each
level as you use it, see below, but that's really ugly.
for ( idx = 0; idx < dimensions[0]; idx++) {
ret[idx] = make_array(dimensions + 1, max_dim-1, size);
}

return ret;
}
If you want to be absolutely 100% safe, you need a different type for
each level, which makes a single routine really ugly; it's much
clearer to have a separate routine for each rank, each of which can
call the next lower one, and unless you actually use ranks higher than
say 10 or so, in which case you'll probably run out of memory before
doing anything useful, it's very unlikely on reasonable
implementations to cost more than maybe a hundred bytes of code.
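
As a rough illustration of one routine per rank, here is a sketch for ranks 1 and 2 with correctly typed pointers at each level. The names make_int1, make_int2 and free_int2, the fixed element type int, and the use of leaf rows instead of one pointer per element are all assumptions of this example:

```c
#include <assert.h>
#include <stdlib.h>

/* Rank 1: a row of n0 ints, zero-initialised.  Returns NULL on failure. */
static int *make_int1(size_t n0)
{
    assert(n0 > 0);
    return calloc(n0, sizeof(int));
}

/* Release a rank-2 array built by make_int2. */
static void free_int2(int **rows, size_t n0)
{
    size_t i;

    if (rows == NULL)
        return;
    for (i = 0; i < n0; i++)
        free(rows[i]);
    free(rows);
}

/* Rank 2: n0 pointers, each to a rank-1 row of n1 ints. */
static int **make_int2(size_t n0, size_t n1)
{
    int **rows;
    size_t i;

    assert(n0 > 0);
    rows = malloc(n0 * sizeof *rows);
    if (rows == NULL)
        return NULL;
    for (i = 0; i < n0; i++) {
        rows[i] = make_int1(n1);
        if (rows[i] == NULL) {
            free_int2(rows, i);    /* undo the partial allocation */
            return NULL;
        }
    }
    return rows;
}
```

A rank-3 routine would follow the same pattern, returning "int ***" and calling make_int2 for each entry, so no cast is ever needed when subscripting.
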
int
main(void)
{
void *foo;
int i,j,k,l;
unsigned int dimensions[] = {5,2,4,8};

foo = make_array( dimensions, 4, sizeof foo);
sizeof foo is the size of one void*; it is not necessarily the size of
an int, which is what you apparently want below.

/* Assign to the array in an invalid way. */
<snip>
/* Assign to the array in a valid way. */
for (i=0; i<5; i++) {
for (j=0; j<2; j++) {
for (k=0; k<4; k++) {
for (l=0; l<8; l++) {
((int *)( ((void ****)foo)[i][j][k]))[l] =
1000*i + 100*j + 10*k + l;
Off by one. Your design creates pointers 'all the way down', i.e. you
have 4 levels of pointers, with the lowest ones each pointing to a
single element, which you apparently want to be int in this case.
Assuming you solve only the leaf/higher problem above:
* (int*) ((void****)foo) [i] [j] [k] [l]
(Or declare foo as void**** in the first place to avoid the cast.)

If not, if you left all the pointers as void*, what you actually have
to do is (if I've counted right, not tested):
*(int*)( (void**)( (void**)( (void**)( (void**)foo )[i] )[j] )[k] )[l]
which I would certainly hide under a macro or several.

Alternatively you could change the recursion to terminate at rank 1
and allocate a leaf 'row' of origdims[num-1] * size and:
( (int*) ((void***)foo) [i] [j] [k] ) [l]
(or other adjustments as above).

Or you might consider an entirely different design where you just
malloc prod(alldims)*size and do your own subscripting as
((int*)foo) [ i1*n2*n3*n4 + i2*n3*n4 + i3*n4 + i4]
again almost certainly hidden in a macro. If you want to pass around
and access these arrays from more than one function you could package
the 'raw' pointer with the dims in a structure something like
struct myarray3 { size_t dims[3]; void * data; };
or more generically
struct myarray { unsigned int ndim; size_t *dims; void * data; };
/* or perhaps size_t dims [SOMELIMIT]; */
in both cases perhaps adding some code/descriptor of the element type,
which is pretty much like the dope vectors used internally by other
languages that have support for variable multi-dim arrays built in:
Fortran, PL/I, and Ada, and to a limited extent Pascal.
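
A minimal sketch of that flat design for rank 3 follows. The type and function names (int_array3, int_array3_init, int_array3_at) are invented for the example, the element type is fixed to int, and overflow of the size product is not checked:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical descriptor: a rank-3 array of int stored contiguously. */
struct int_array3 {
    size_t dims[3];
    int *data;
};

/* Allocate a zero-filled n0 x n1 x n2 array.
 * Returns 0 on success, -1 on failure. */
static int int_array3_init(struct int_array3 *a,
                           size_t n0, size_t n1, size_t n2)
{
    assert(a != NULL && n0 > 0 && n1 > 0 && n2 > 0);
    a->dims[0] = n0;
    a->dims[1] = n1;
    a->dims[2] = n2;
    a->data = calloc(n0 * n1 * n2, sizeof *a->data);
    return a->data != NULL ? 0 : -1;
}

/* Address of element [i][j][k] in row-major order; the expression
 * equals i*n1*n2 + j*n2 + k. */
static int *int_array3_at(const struct int_array3 *a,
                          size_t i, size_t j, size_t k)
{
    assert(i < a->dims[0] && j < a->dims[1] && k < a->dims[2]);
    return &a->data[(i * a->dims[1] + j) * a->dims[2] + k];
}
```

There is one allocation and one free(a.data) for the whole array, and every access goes through a correctly typed "int *".
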
printf("foo [%d][%d][%d][%d] = %d\n", i,j,k,l,
((int *)( ((void ****)foo)[i][j][k]))[l] );
}
}
}
}

return 0;
}

- David.Thompson1 at worldnet.att.net

May 4 '06 #11

Ben Pfaff

"Bill Pursell" <bi**********@gmail.com> writes:

if ( ( ret = malloc(size)) == NULL)
perror("malloc"), exit(-1);

If you ever pass size == 0, some implementations will return NULL
from malloc() and your xmalloc() implementation will abort().
--
"It wouldn't be a new C standard if it didn't give a
new meaning to the word `static'."
--Peter Seebach on C99

May 4 '06 #12

Keith Thompson

Ben Pfaff <bl*@cs.stanford.edu> writes:

"Bill Pursell" <bi**********@gmail.com> writes:
if ( ( ret = malloc(size)) == NULL)
perror("malloc"), exit(-1);

If you ever pass size == 0, some implementations will return NULL
from malloc() and your xmalloc() implementation will abort().

Which raises an interesting but trivial point.

malloc() returns NULL if it's unable to allocate the requested memory.
malloc(0), even if it succeeds, can return either NULL or a valid
pointer.

If an implementation returns a non-null pointer for malloc(0), it will
return NULL for malloc(0) if it's unable to allocate any memory. This
is indistinguishable from malloc(0) succeeding and returning NULL --
unless you happen to know how the implementation behaves. It's an
obscure error condition that can't be detected by portable code. But
portable code shouldn't care whether malloc(0) returns NULL or not
anyway.

(I'm using NULL as a verbal shorthand for a null pointer value;
obviously a function can't return a macro.)
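
Code that wants a null result to mean failure and nothing else can avoid zero-byte requests altogether. The sketch below uses a made-up name, alloc_array, and rounds an empty request up to one byte:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: allocate count elements of elem_size bytes.
 * A null return always means failure, because a zero-byte request
 * is rounded up to one byte before calling malloc. */
static void *alloc_array(size_t count, size_t elem_size)
{
    size_t total;

    assert(elem_size > 0);
    if (count > SIZE_MAX / elem_size)
        return NULL;    /* count * elem_size would overflow */
    total = count * elem_size;
    return malloc(total != 0 ? total : 1);
}
```

The pointer returned for count == 0 must not be dereferenced, but it can be passed to free() like any other.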

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

May 4 '06 #13

Ben Pfaff

Ben Pfaff <bl*@cs.stanford.edu> writes:

"Bill Pursell" <bi**********@gmail.com> writes:
if ( ( ret = malloc(size)) == NULL)
perror("malloc"), exit(-1);

If you ever pass size == 0, some implementations will return NULL
from malloc() and your xmalloc() implementation will abort().

I don't know why I wrote abort() (the function) when I meant
abort (the ordinary English word), but I think my intention was
clear.
--
"It would be a much better example of undefined behavior
if the behavior were undefined."
--Michael Rubenstein

May 4 '06 #14

Bill Pursell

Dave Thompson wrote:

On 23 Apr 2006 05:29:33 -0700, "Bill Pursell" <bi**********@gmail.com>
wrote:
I've been using a function to
create arrays, and I believe my function is valid, but my method for
dereferencing the values is slightly incorrect. Based on your
description, I believe the following is correct. The first method,
commented as invalid, is how I've been using this. I believe I should
simply change to the second method. Comments welcome.
void *
xmalloc(size_t size)
{
void *ret;
if ( ( ret = malloc(size)) == NULL)
perror("malloc"), exit(-1);

return ret;
}

/*
* Recursively allocate a
* dimensions[0] x max_dim array of void *.
* If size > 0, malloc space for objects of that size.
*/
void *
make_array(unsigned int *dimensions, unsigned int max_dim, unsigned int
size)

In xmalloc you use size_t for size; why not here?
You might want to make dimensions[] size_t also.
To me max_dim implies the callee(s) can choose to do less; I would
name it (something like) num_dim. Or more artfully rank.

I like that, thanks.

{
void **ret; /* declared as void ** to allow "ret[idx]". Is this
safe?*/

Not really. For your current structure you want to treat the bottom
level pointers as void* = pointer to any data, but the second level as
void** = pointer to pointer (to void), and the third as void*** etc.
There are platforms where pointer-to-void (or character) is different
from and cannot safely be treated as pointer-to-pointer. And it is
allowed for pointers to any incompatible types to be different,
including pointer to pointer (to void) versus pointer to pointer (to
pointer to void) etc., although I don't know of any actual case.

That strikes me as very counter-intuitive. If a void * can refer to
any kind of pointer, then it should be able to refer to an int *, or a
void **, or a void ***.

unsigned int idx;

if (max_dim == 0) {
if (size > 0)
return xmalloc(size);
else
return NULL;
}

ret = xmalloc(dimensions[0] * sizeof ret);

If you want to deal only with the first problem above you could have
if( rank == 1 ){
void **leaves = xmalloc ...; for( idx ...) leaves[i] = ...
}else /* rank > 1 */ {
void ***branches = xmalloc ...; for( idx ... ) branches[i] =
}

Alternatively you could leave all the pointers as void* but cast each
level as you use it, see below, but that's really ugly.
for ( idx = 0; idx < dimensions[0]; idx++) {
ret[idx] = make_array(dimensions + 1, max_dim-1, size);
}

return ret;
}

If you want to be absolutely 100% safe, you need a different type for
each level, which makes a single routine really ugly; it's much
clearer to have a separate routine for each rank, each of which can
call the next lower one, and unless you actually use ranks higher than
say 10 or so, in which case you'll probably run out of memory before
doing anything useful, it's very unlikely on reasonable
implementations to cost more than maybe a hundred bytes of code.
int
main(void)
{
void *foo;
int i,j,k,l;
unsigned int dimensions[] = {5,2,4,8};

foo = make_array( dimensions, 4, sizeof foo);

sizeof foo is the size of one void*; it is not necessarily the size of
an int, which is what you apparently want below.

Good catch. This example was a toy that I'd put together
quickly, but I'm concerned that I may have that same error
hidden in my code elsewhere...someday it will burn me.

/* Assign to the array in an invalid way. */

<snip>
/* Assign to the array in a valid way. */
for (i=0; i<5; i++) {
for (j=0; j<2; j++) {
for (k=0; k<4; k++) {
for (l=0; l<8; l++) {
((int *)( ((void ****)foo)[i][j][k]))[l] =
1000*i + 100*j + 10*k + l;

Off by one. Your design creates pointers 'all the way down', i.e. you
have 4 levels of pointers, with the lowest ones each pointing to a
single element, which you apparently want to be int in this case.
Assuming you solve only the leaf/higher problem above:
* (int*) ((void****)foo) [i] [j] [k] [l]
(Or declare foo as void**** in the first place to avoid the cast.)

If not, if you left all the pointers as void*, what you actually have
to do is (if I've counted right, not tested):
*(int*)( (void**)( (void**)( (void**)( (void**)foo )[i] )[j] )[k] )[l]
which I would certainly hide under a macro or several.

I think this demonstrates a point that is throwing me a little.
It strikes me that the compiler should treat
((void ****)foo)[i][j] and (void **)((void **)foo[i])[j]
exactly the same way. The first says
"foo is a pointer to a pointer to a pointer to a pointer",
therefore foo[i] is a pointer to a pointer to a pointer,
and foo[i][j] is a pointer to a pointer, which
is just made explicit by the second. I don't see how
the explicit double cast gives any more information to
the compiler.
Alternatively you could change the recursion to terminate at rank 1
and allocate a leaf 'row' of origdims[num-1] * size and:
( (int*) ((void***)foo) [i] [j] [k] ) [l]
(or other adjustments as above).

Or you might consider an entirely different design where you just
malloc prod(alldims)*size and do your own subscripting as
((int*)foo) [ i1*n2*n3*n4 + i2*n3*n4 + i3*n4 + i4]

The reason I'm avoiding this is performance. I seem to be
able to dereference the array much faster than doing the
arithmetic. Of course, I could probably build the array using
the arithmetic, but all the points you made on dereferencing the
pointers will remain. In practice, my upper bound on the
rank is a compile-time limit of 8, with realistic values never
exceeding 4, so I could easily incorporate explicit casts
on each level.

Thanks for your comments.

May 5 '06 #15

Richard Bos

Keith Thompson <ks***@mib.org> wrote:

Ben Pfaff <bl*@cs.stanford.edu> writes:
"Bill Pursell" <bi**********@gmail.com> writes:
if ( ( ret = malloc(size)) == NULL)
perror("malloc"), exit(-1);
If you ever pass size == 0, some implementations will return NULL
from malloc() and your xmalloc() implementation will abort().

Which raises an interesting but trivial point.

malloc() returns NULL if it's unable to allocate the requested memory.
malloc(0), even if it succeeds, can return either NULL or a valid
pointer.

Ah. This is a philosophical point. _Is_ trying to allocate 0 bytes and
getting nothing as a result truly a success? Contrariwise, is it truly a
failure? Since, IMO, it is only marginally either, allowing malloc(0) to
return either a null pointer or a pointer to non-usable memory is
philosophically the right answer.
It may be technically awkward in some cases, but then, any program
should be prepared to get a null pointer from malloc() for any size, at
any time; and also prepared for a malloc() succeeding after a previous
one failed - for example, when another program has just terminated and
freed a lot of memory.
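
A checking wrapper can be written so that either permitted result of malloc(0) is accepted. What follows is only a sketch, assuming the caller does not regard a zero-size request as an error. The name xmalloc comes from the quoted code; the size != 0 test and the xmalloc_demo function are illustrative additions, not code anyone in the thread posted.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch: treat a null return as failure only for nonzero requests. */
void *xmalloc(size_t size)
{
    void *ret = malloc(size);
    if (ret == NULL && size != 0) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    return ret;   /* for size == 0: NULL or a pointer that must not be dereferenced */
}

/* Hypothetical usage: returns the value stored through the allocation. */
int xmalloc_demo(void)
{
    void *empty = xmalloc(0);
    int *v = xmalloc(4 * sizeof *v);
    int got;

    assert(v != NULL);        /* guaranteed by xmalloc for nonzero sizes */
    v[3] = 7;
    got = v[3];
    free(v);
    free(empty);              /* free(NULL) is a no-op */
    return got;
}
```

Whatever malloc(0) returned can be passed to free(), so the caller needs no special case as long as it never dereferences that result.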
(I'm using NULL as a verbal shorthand for a null pointer value;
obviously a function can't return a macro.)

Tsk...

Richard

May 5 '06 #16

Dave Thompson

On 4 May 2006 20:40:39 -0700, "Bill Pursell" <bi**********@gmail.com>
wrote:

Dave Thompson wrote:
On 23 Apr 2006 05:29:33 -0700, "Bill Pursell" <bi**********@gmail.com>
wrote: <snip>
void **ret; /* declared as void ** to allow "ret[idx]". Is this
safe?*/
Not really. For your current structure you want to treat the bottom
level pointers as void* = pointer to any data, but the second level as
void** = pointer to pointer (to void), and the third as void*** etc.
There are platforms where pointer-to-void (or character) is different
from and cannot safely be treated as pointer-to-pointer. And it is
allowed for pointers to any incompatible types to be different,
including pointer to pointer (to void) versus pointer to pointer (to
pointer to void) etc., although I don't know of any actual case.

That strikes me as very counter-intuitive. If a void * can refer to
any kind of pointer, then it should be able to refer to an int *, or a
void **, or a void ***.

A void* can address any memory, but it can't be used to access it at
all; a 'typed' pointer must be used for that. void** accesses the
memory assuming it contains (or will do) a void*. See below.

<big snip>
Off by one. Your design creates pointers 'all the way down', i.e. you
have 4 levels of pointers, with the lowest ones each pointing to a
single element, which you apparently want to be int in this case.
Assuming you solve only the leaf/higher problem above:
* (int*) ((void****)foo) [i] [j] [k] [l]
(Or declare foo as void**** in the first place to avoid the cast.)

If not, if you left all the pointers as void*, what you actually have
to do is (if I've counted right, not tested):
*(int*) ( (void**) ( (void**) ( (void**) ( (void**)foo )[i] )[j] )[k] )[l]
which I would certainly hide under a macro or several.

I think this demonstrates a point that is throwing me a little.
It strikes me that the compiler should treat
((void ****)foo)[i][j] and (void **)((void **)foo[i])[j]
exactly the same way. The first says
"foo is a pointer to a pointer to a pointer to a pointer",
therefore foo[i] is a pointer to a pointer to a pointer,
and foo[i][j] is a pointer to a pointer, which
is just made explicit by the second. I don't see how
the explicit double cast gives any more information to
the compiler.

But they are different types of pointers. void*** is a pointer to
(something which is) a pointer to a pointer to void. The nested casts
describe the data structure you actually created, where the root
points to memory which actually contains a void* which points to
memory which contains a void* which points to data, but the root does
not point to something that actually _is_ a void**.

A void *, or [signed/unsigned] char *, must be able to point to any
valid memory object in C, that is, any byte. But other (data) pointer
types need only to be able to point at valid objects of their type.
Thus an int* only needs to be able to point to an int, a struct foo*
only needs to be able to point to a struct foo, and a void** only
needs to be able to point to a void*. On some platforms -- including
one I use -- these other data types are addressed as words, not as
bytes, and thus pointers of these types have different representations
than void*. In fact the Standard even allows them to have different
_sizes_, and some systems have done so, although not mine.

Thus (int*) ( (void**)foo[i] ) [j] says fetch the i'th rank2 pointer,
which is a pointer to void (= to byte), _convert_ it to a pointer to
void* (= a certain kind of word), use it to fetch the j'th rank1 pointer,
which is a byte pointer, convert it to pointer to int (another type of
word), and use it to fetch the targeted int.

On most platforms today all pointers -- at least all data pointers --
are just byte addresses, so you can get away with cheating.
As I think I noted some ways back. But you asked about doing it right.
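
Spelled out one conversion at a time, the portable form might look like the sketch below. It assumes rank 2, nonzero dimensions, every stored pointer held as a void*, and leaves that are rows of int. The names make2d, cell and free2d are made up for illustration and are not from the posted code.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate `rows` pointers (each stored as void*), each to `cols` ints. */
void *make2d(size_t rows, size_t cols)
{
    void **root = malloc(rows * sizeof *root);
    if (root == NULL)
        return NULL;
    for (size_t i = 0; i < rows; i++) {
        root[i] = calloc(cols, sizeof(int));
        if (root[i] == NULL) {
            while (i > 0)             /* undo the rows already allocated */
                free(root[--i]);
            free(root);
            return NULL;
        }
    }
    return root;
}

/* Each step is a conversion, not a reinterpretation of the pointer's bits. */
int *cell(void *foo, size_t i, size_t j)
{
    assert(foo != NULL);
    void **level1 = foo;      /* void* -> void**: points at stored void* objects */
    int *row = level1[i];     /* fetch a void*, convert it to int* */
    return &row[j];
}

void free2d(void *foo, size_t rows)
{
    void **level1 = foo;
    for (size_t i = 0; i < rows; i++)
        free(level1[i]);
    free(level1);
}
```

Because each assignment converts the value, the compiler is free to change the pointer representation at every step on platforms where byte pointers and word pointers differ.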

Alternatively you could change the recursion to terminate at rank 1
and allocate a leaf 'row' of origdims[num-1] * size and:
( (int*) ((void***)foo) [i] [j] [k] ) [l]
(or other adjustments as above).

Or you might consider an entirely different design where you just
malloc prod(alldims)*size and do your own subscripting as
((int*)foo) [ i1*n2*n3*n4 + i2*n3*n4 + i3*n4 + i4]
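
A minimal sketch of that single-block layout, assuming exactly four dimensions and int elements. struct flat4 and its helpers are invented names. The index is written in Horner form, which evaluates to the same value as i1*n2*n3*n4 + i2*n3*n4 + i3*n4 + i4 with fewer multiplications.

```c
#include <assert.h>
#include <stdlib.h>

struct flat4 {
    size_t n1, n2, n3, n4;   /* extents of the four dimensions */
    int *data;               /* one block of n1*n2*n3*n4 ints */
};

/* Returns 0 on success, -1 on allocation failure.
   Assumption: the product of the extents does not overflow size_t. */
int flat4_init(struct flat4 *a, size_t n1, size_t n2, size_t n3, size_t n4)
{
    a->n1 = n1; a->n2 = n2; a->n3 = n3; a->n4 = n4;
    a->data = calloc(n1 * n2 * n3 * n4, sizeof *a->data);
    return a->data != NULL ? 0 : -1;
}

/* Address of element (i1, i2, i3, i4) in row-major order. */
int *flat4_at(const struct flat4 *a, size_t i1, size_t i2, size_t i3, size_t i4)
{
    assert(i1 < a->n1 && i2 < a->n2 && i3 < a->n3 && i4 < a->n4);
    return &a->data[((i1 * a->n2 + i2) * a->n3 + i3) * a->n4 + i4];
}
```

In a loop nest the partial result for the outer indices can be computed once per outer iteration, so the inner loop only adds i4 to a precomputed base.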

The reason I'm avoiding this is performance. I seem to be
able to dereference the array much faster than doing the
arithmetic. Of course, I could probably build the array using

That's interesting. Historically (like in the 70s and 80s) this was
widely the case, but on today's mainstream CPUs ALU operations are
almost always _much_ faster than any real memory access, and generally
at least as fast as even cached ones. Are you using small embedded
systems, or some specialized environment like avionics or space?

Do you need completely random access, or can you do (or encourage your
compiler to) partial precomputation aka strength reduction?

the arithmetic, but all the points you made on dereferencing the
pointers will remain. In practice, my upper bound on the
rank is a compile-time limit of 8, with realistic values never
exceeding 4, so I could easily incorporate explicit casts
on each level.

Or, for 4 and probably even rather more than 8, you could do as I also
suggested and just write separate routines for each rank, where each
higher one calls the next lower one. That way you can use the
'natural' types for C, have less cluttered code, and be sure it works.

And it even gives you a little bit of error checking: you can't
accidentally allocate a 4D and then subscript it as 3D. Even
(especially) after years of changes/maintenance.
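
A sketch of the per-rank approach for ranks 1 to 3, assuming int elements and nonzero dimensions. The function names alloc1..alloc3 and free2/free3 are assumptions made for illustration. Each level uses its natural type (int *, int **, int ***), and each higher routine calls the next lower one.

```c
#include <assert.h>
#include <stdlib.h>

int *alloc1(size_t n1)
{
    assert(n1 > 0);
    return calloc(n1, sizeof(int));
}

/* Frees the first n1 rows of a, then a itself. */
void free2(int **a, size_t n1)
{
    if (a == NULL)
        return;
    for (size_t i = 0; i < n1; i++)
        free(a[i]);
    free(a);
}

int **alloc2(size_t n1, size_t n2)
{
    assert(n1 > 0 && n2 > 0);
    int **a = malloc(n1 * sizeof *a);
    if (a == NULL)
        return NULL;
    for (size_t i = 0; i < n1; i++) {
        a[i] = alloc1(n2);
        if (a[i] == NULL) {
            free2(a, i);          /* release only what was allocated */
            return NULL;
        }
    }
    return a;
}

/* Frees the first n1 planes of a (each with n2 rows), then a itself. */
void free3(int ***a, size_t n1, size_t n2)
{
    if (a == NULL)
        return;
    for (size_t i = 0; i < n1; i++)
        free2(a[i], n2);
    free(a);
}

int ***alloc3(size_t n1, size_t n2, size_t n3)
{
    assert(n1 > 0 && n2 > 0 && n3 > 0);
    int ***a = malloc(n1 * sizeof *a);
    if (a == NULL)
        return NULL;
    for (size_t i = 0; i < n1; i++) {
        a[i] = alloc2(n2, n3);
        if (a[i] == NULL) {
            free3(a, i, n2);
            return NULL;
        }
    }
    return a;
}
```

With these types, writing a[i][j] where a[i][j][k] was meant yields an int * rather than an int, so assigning it to an int draws a diagnostic instead of silently compiling.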

- David.Thompson1 at worldnet.att.net

May 14 '06 #17

Barry Schwarz

On Sun, 14 May 2006 20:32:29 GMT, Dave Thompson
<da*************@worldnet.att.net> wrote:

A void* can address any memory, but it can't be used to access it at
all;

A void* cannot be dereferenced to access memory; actually cannot be
dereferenced at all. But several functions accept a void* to access
memory, such as memcmp, qsort, etc.
Remove del for email
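
Both halves of that point fit in a few lines. In this sketch byte_sum and same_bytes are made-up helpers. The void * parameter is never dereferenced itself; it is converted to unsigned char * before any access. The standard describes memcmp as comparing the objects as unsigned char values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sum the bytes of any object; p is converted before any access. */
unsigned long byte_sum(const void *p, size_t n)
{
    const unsigned char *bytes = p;   /* implicit void* -> unsigned char* */
    unsigned long sum = 0;

    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += bytes[i];
    return sum;
}

/* memcmp accepts const void * for both operands. */
int same_bytes(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n) == 0;
}
```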

May 21 '06 #18

Harald van Dĳk

Barry Schwarz wrote:

On Sun, 14 May 2006 20:32:29 GMT, Dave Thompson
<da*************@worldnet.att.net> wrote:

A void* can address any memory, but it can't be used to access it at
all;

A void* cannot be dereferenced to access memory; actually cannot be
dereferenced at all.

My compilers accept

int main(void) {
void *p = &p;
*p;
return 0;
}

gcc with -ansi -pedantic-errors gives only a warning, tcc with -Xs
("strict ANSI with many extra checks") gives no complaints at all. I
don't see anything in n1124 forbidding this either. Did I miss
something?

May 21 '06 #19

pete

Harald van Dĳk wrote:

Barry Schwarz wrote:
On Sun, 14 May 2006 20:32:29 GMT, Dave Thompson
<da*************@worldnet.att.net> wrote:

A void* can address any memory, but it can't be used to access it at
all;

A void* cannot be dereferenced to access memory; actually cannot be
dereferenced at all.

My compilers accept

int main(void) {
void *p = &p;
*p;
return 0;
}

gcc with -ansi -pedantic-errors gives only a warning, tcc with -Xs
("strict ANSI with many extra checks") gives no complaints at all. I
don't see anything in n1124 forbidding this either. Did I miss
something?

Is (*p) equal to (p[0]) ?

--
pete

May 21 '06 #20

pete

pete wrote:

Harald van Dĳk wrote:

Barry Schwarz wrote:
On Sun, 14 May 2006 20:32:29 GMT, Dave Thompson
<da*************@worldnet.att.net> wrote:
>A void* can address any memory, but it can't be used to access it at
>all;

A void* cannot be dereferenced to access memory; actually cannot be
dereferenced at all.

My compilers accept

int main(void) {
void *p = &p;
*p;
return 0;
}

gcc with -ansi -pedantic-errors gives only a warning,

What's the warning?

tcc with -Xs
("strict ANSI with many extra checks") gives no complaints at all. I
don't see anything in n1124 forbidding this either. Did I miss
something?

Is (*p) equal to (p[0]) ?

(*puts) isn't equal to (puts[0]),
so the point I was trying to get at, may be irrelevant.

--
pete

May 21 '06 #21

Harald van Dĳk

pete wrote:

pete wrote:

Harald van Dĳk wrote:

Barry Schwarz wrote:
> On Sun, 14 May 2006 20:32:29 GMT, Dave Thompson
> <da*************@worldnet.att.net> wrote:
>
>
> >A void* can address any memory, but it can't be used to access it at
> >all;
>
> A void* cannot be dereferenced to access memory; actually cannot be
> dereferenced at all.

My compilers accept

int main(void) {
void *p = &p;
*p;
return 0;
}

gcc with -ansi -pedantic-errors gives only a warning,

What's the warning?

Simply "warning: dereferencing ‘void *’ pointer".

tcc with -Xs
("strict ANSI with many extra checks") gives no complaints at all. I
don't see anything in n1124 forbidding this either. Did I miss
something?

Is (*p) equal to (p[0]) ?

(*puts) isn't equal to (puts[0]),
so the point I was trying to get at, may be irrelevant.

Not even just with function pointers, but

extern char p[];
int main(void) {
*&p; /* no error */
(&p)[0]; /* gcc: error: invalid use of array with unspecified
bounds */
return 0;
}

And tcc's error message is more verbose, but essentially the same.

May 21 '06 #22

S.Tobias

pete <pf*****@mindspring.com> wrote:

Harald van Dĳk wrote:

Barry Schwarz wrote:
> On Sun, 14 May 2006 20:32:29 GMT, Dave Thompson
> <da*************@worldnet.att.net> wrote:
>
>
> >A void* can address any memory, but it can't be used to access it at
> >all;
>
> A void* cannot be dereferenced to access memory; actually cannot be
> dereferenced at all.

6.5.3.2 doesn't constrain us from dereferencing void pointers.
In fact, nothing does.

My compilers accept

int main(void) {
void *p = &p;
*p;
return 0;
}

gcc with -ansi -pedantic-errors gives only a warning, tcc with -Xs
("strict ANSI with many extra checks") gives no complaints at all. I
don't see anything in n1124 forbidding this either. Did I miss
something?

Everything is all right.

Is (*p) equal to (p[0]) ?

No.

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`

May 21 '06 #23

Richard Heathfield

Harald van Dĳk said:

Barry Schwarz wrote:
On Sun, 14 May 2006 20:32:29 GMT, Dave Thompson
<da*************@worldnet.att.net> wrote:

>A void* can address any memory, but it can't be used to access it at
>all;
A void* cannot be dereferenced to access memory; actually cannot be
dereferenced at all.

My compilers accept

int main(void) {
void *p = &p;
*p;
return 0;
}

gcc with -ansi -pedantic-errors gives only a warning,

Only a warning.

"I'm going into the cave."
"I think I'd better warn you there's a mad bear in there."
"Oh. Well, it's only a warning. Here I gAAAAAARRRRGGGHHH!"

I prefer to call them "diagnostic messages". It conveys a less casual
approach to the often ghastly problems to which your compiler is trying to
draw your attention.

tcc with -Xs
("strict ANSI with many extra checks") gives no complaints at all. I
don't see anything in n1124 forbidding this either. Did I miss
something?

Well, remember that void expressions are evaluated for their side effects.
Now try /using/ the value of the expression, and see how far you get.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

May 21 '06 #24

Harald van Dĳk

Richard Heathfield wrote:

Harald van Dĳk said:
Barry Schwarz wrote:
On Sun, 14 May 2006 20:32:29 GMT, Dave Thompson
<da*************@worldnet.att.net> wrote:
>A void* can address any memory, but it can't be used to access it at
>all;

A void* cannot be dereferenced to access memory; actually cannot be
dereferenced at all.

My compilers accept

int main(void) {
void *p = &p;
*p;
return 0;
}

gcc with -ansi -pedantic-errors gives only a warning,

Only a warning.

"I'm going into the cave."
"I think I'd better warn you there's a mad bear in there."
"Oh. Well, it's only a warning. Here I gAAAAAARRRRGGGHHH!"

I prefer to call them "diagnostic messages". It conveys a less casual
approach to the often ghastly problems to which your compiler is trying to
draw your attention.

The warning here means gcc does not believe it violates any
constraints, but does believe it may be an accident. It is not an
accident. I do not call it a "diagnostic message" because error
messages are that too, and I wanted to make it clear that gcc
successfully compiled the program.

tcc with -Xs
("strict ANSI with many extra checks") gives no complaints at all. I
don't see anything in n1124 forbidding this either. Did I miss
something?

Well, remember that void expressions are evaluated for their side effects.
Now try /using/ the value of the expression, and see how far you get.

Does an expression of type void even have a value? Anyway, I know there
is not a whole lot you can do with it, and there is no reason to do
this in C other than "because I can". But when it is claimed I can't,
that's reason enough for me.

May 21 '06 #25

pete

Harald van Dĳk wrote:

Does an expression of type void even have a value?

No.

Anyway, I know there
is not a whole lot you can do with it, and there is no reason to do
this in C other than "because I can". But when it is claimed I can't,
that's reason enough for me.

I understand.

--
pete

May 21 '06 #26

Keith Thompson

"Harald van Dĳk" <tr*****@gmail.com> writes:

Richard Heathfield wrote:
Harald van Dĳk said:
> Barry Schwarz wrote:
>> On Sun, 14 May 2006 20:32:29 GMT, Dave Thompson
>> <da*************@worldnet.att.net> wrote:
>> >A void* can address any memory, but it can't be used to access it at
>> >all;
>>
>> A void* cannot be dereferenced to access memory; actually cannot be
>> dereferenced at all.
>
> My compilers accept
>
> int main(void) {
> void *p = &p;
> *p;
> return 0;
> }
>
> gcc with -ansi -pedantic-errors gives only a warning,

Only a warning.

"I'm going into the cave."
"I think I'd better warn you there's a mad bear in there."
"Oh. Well, it's only a warning. Here I gAAAAAARRRRGGGHHH!"

I prefer to call them "diagnostic messages". It conveys a less casual
approach to the often ghastly problems to which your compiler is trying to
draw your attention.

The warning here means gcc does not believe it violates any
constraints, but does believe it may be an accident. It is not an
accident. I do not call it a "diagnostic message" because error
messages are that too, and I wanted to make it clear that gcc
successfully compiled the program.

[...]

The standard only requires a compiler to produce a diagnostic message
when it encounters a syntax error or constraint violation. It doesn't
require a distinction between warnings and error messages, and it
doesn't require *any* program to be rejected (unless it contains a
"#error" directive).

So any compiler is allowed to respond to a constraint violation by
printing a warning message and successfully translating the file, <OT>
and gcc does so in some cases</OT>.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

May 21 '06 #27

Harald van Dĳk

Keith Thompson wrote:

"Harald van Dĳk" <tr*****@gmail.com> writes:
Richard Heathfield wrote:
Harald van Dĳk said:
> Barry Schwarz wrote:
>> On Sun, 14 May 2006 20:32:29 GMT, Dave Thompson
>> <da*************@worldnet.att.net> wrote:
>> >A void* can address any memory, but it can't be used to access it at
>> >all;
>>
>> A void* cannot be dereferenced to access memory; actually cannot be
>> dereferenced at all.
>
> My compilers accept
>
> int main(void) {
> void *p = &p;
> *p;
> return 0;
> }
>
> gcc with -ansi -pedantic-errors gives only a warning,

Only a warning.

"I'm going into the cave."
"I think I'd better warn you there's a mad bear in there."
"Oh. Well, it's only a warning. Here I gAAAAAARRRRGGGHHH!"

I prefer to call them "diagnostic messages". It conveys a less casual
approach to the often ghastly problems to which your compiler is trying to
draw your attention.

The warning here means gcc does not believe it violates any
constraints, but does believe it may be an accident. It is not an
accident. I do not call it a "diagnostic message" because error
messages are that too, and I wanted to make it clear that gcc
successfully compiled the program.

[...]

The standard only requires a compiler to produce a diagnostic message
when it encounters a syntax error or constraint violation. It doesn't
require a distinction between warnings and error messages, and it
doesn't require *any* program to be rejected (unless it contains a
"#error" directive).

So any compiler is allowed to respond to a constraint violation by
printing a warning message and successfully translating the file, <OT>
and gcc does so in some cases</OT>.

I am aware that in the general case, there is no such guarantee, but I
mentioned I used the -pedantic-errors option, which with GCC makes
*all* required diagnostics hard errors. Sorry for not having posted
what it meant the first time.

May 21 '06 #28
