Bytes | Developer Community

multi dimensional arrays as one dimension array

The subject might be misleading.
Regardless, is this code valid:

#include <stdio.h>

void f(double *p, size_t size) { while(size--) printf("%f\n", *p++); }
int main(void) {
double array[2][1] = { { 3.14 }, { 42.6 } };
f((double *)array, sizeof array / sizeof **array);
return 0;
}

Assuming casting double [2][1] to double * is implementation defined
or undefined behavior, replace the cast with (void *).

Since arrays are not allowed to have padding bytes between
elements, it seems valid to me.
Aug 29 '08
Keith Thompson wrote:
pete <pf*****@mindspring.com> writes:
>K&R2, Appendix B3:
There are two groups of string functions
defined in the header <string.h>.
The first have names beginning with str;
the second have names beginning with mem.
memcpy, for example, is
a function that doesn't have anything directly to do with strings.
That's probably why it has a name that doesn't begin with str.

--
pete
Sep 1 '08 #51
Keith Thompson wrote:
pete <pf*****@mindspring.com> writes:
>K&R2, Appendix B3:
There are two groups of string functions
defined in the header <string.h>.
The first have names beginning with str;
the second have names beginning with mem.
memcpy, for example, is
a function that doesn't have anything directly to do with strings.
Has it occurred to you that "library functions" are called that,
because they are part of the library,
and not because of what they do?

--
pete
Sep 1 '08 #52
James Kuyper <ja*********@verizon.net> writes:
vi******@gmail.com wrote:
On Aug 29, 6:39 am, pete <pfil...@mindspring.com> wrote:
vipps...@gmail.com wrote:
The subject might be misleading.
Regardless, is this code valid:
#include <stdio.h>
void f(double *p, size_t size) { while(size--) printf("%f\n", *p++); }
int main(void) {
double array[2][1] = { { 3.14 }, { 42.6 } };
f((double *)array, sizeof array / sizeof **array);
return 0;
}
Assuming casting double [2][1] to double * is implementation defined
or undefined behavior, replace the cast with (void *).
Since arrays are not allowed to have padding bytes between
elements, it seems valid to me.
Stepping through a one dimensional array
and on through a consecutive one dimensional array
is only defined for elements of character type
and only because
any object can be treated as an array of character type.
So as I understand it you are saying my code invokes undefined
behavior.
In which hypothetical (or real, if such implementation does exist)
implementation my code won't work, and why?

The key point is the pointer conversion. At the point where that
conversion occurs, the compiler knows that (double*)array == array[0].
It's undefined if any number greater than 1 is added to that pointer
value, and also if that pointer is dereferenced after adding 1 to it.
This conclusion doesn't fit a consistent reading of the standard.

Certainly, if we have

double a[5][3];
double (*xa)[3];
xa = a;

then the valid index values for xa are 0, 1, 2, 3, 4. The variable
xa points into the multidimensional array object a; the "extent"
of xa is all of a.

The extent of xa is not changed by converting it to void *. It's
legal to sort xa using qsort, by

qsort( xa, 5, sizeof *xa, suitable_function );

The conversion of xa to void * must preserve access to the entire
array a. And void * isn't special in this regard; converting to
unsigned char * must allow access to the entire original array a,
so that elements can be swapped (either in qsort or in another
sorting function with a similar interface).

The very same argument applies if we don't use xa but just use
a directly; the value '(void*)a' has the same extent as the
array a. And so must '(double*)a', or any other pointer
conversion of a, provided of course that alignment requirements
are satisfied.
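
The qsort call described above can be sketched as follows. This is only an illustration: compare_rows and sort_rows are hypothetical names, and ordering rows by their first element is an arbitrary choice made for the example. The point it shows is that qsort receives a pointer to the first row and moves each row as one unit of sizeof *a bytes.

```c
#include <stdlib.h>

/* Hypothetical comparison function: orders rows of three doubles by
   their first element. qsort hands us pointers to the start of two rows;
   only element 0 of each row is read here. */
static int compare_rows(const void *lhs, const void *rhs)
{
    const double *a = lhs;
    const double *b = rhs;
    return (a[0] > b[0]) - (a[0] < b[0]);
}

/* Sorts the five rows of a; each row is swapped as a whole
   (sizeof *a is the size of one double[3]). */
static void sort_rows(double a[5][3])
{
    qsort(a, 5, sizeof *a, compare_rows);
}
```

A caller would declare double a[5][3], fill it, and call sort_rows(a).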

Sep 1 '08 #53
Richard Heathfield <rj*@see.sig.invalid> writes:
James Tursa said:
>On Mon, 01 Sep 2008 08:02:29 +0000, Richard Heathfield
<rj*@see.sig.invalid> wrote:
>>>
(I must confess that I don't see the fascination with deliberately
confusing type issues, though. If I want a one-dimensional array, I'll
build one. If I want a two-dimensional array, I'll build one of those
instead. Why mix them up?)

Because I want to copy a C 2-dimensional array (or multi-dimensional
array) to a MATLAB mxArray data pointer in a conforming way. Sounds
simple enough. Just copy the contents of a 2-dimensional array to a
data area accessed through a double *. The MATLAB functions will give
me a double * for the target data area and I have assumed (apparently
incorrectly) that a simple memcpy using the C variable name is
conforming. Apparently (according to some) I need to use &name instead
of just name to be conforming.

If all you want is a solution that is guaranteed not to break any rules,
it's pretty easy. If MATLAB provides a space into which you need only copy
the data, you can do so in a simple loop:

/* We assume that arr is defined as double arr[ROWS][COLS]. We
further assume that p is of type double *, and points to
space at least ROWS * COLS * sizeof(double) bytes in size.
*/
t = p;
thisrow = 0;
while(thisrow < ROWS)
{
memcpy(t, arr[thisrow], sizeof arr[thisrow]);
t += COLS; /* move t on by COLS doubles */
++thisrow;
}
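
For reference, the fragment above can be wrapped into a complete function. This is a minimal sketch: the name flatten_rows and the values chosen for ROWS and COLS are assumptions for illustration, not part of the original post. Each memcpy reads from exactly one row, so no pointer ever steps from one row into the next.

```c
#include <stddef.h>
#include <string.h>

#define ROWS 2   /* assumed dimensions, for illustration only */
#define COLS 3

/* Copies arr row by row into the flat buffer at p.
   p must point to space for at least ROWS * COLS doubles. */
static void flatten_rows(double *p, double arr[ROWS][COLS])
{
    double *t = p;
    size_t thisrow;

    for (thisrow = 0; thisrow < ROWS; ++thisrow) {
        memcpy(t, arr[thisrow], sizeof arr[thisrow]);
        t += COLS;   /* move t on by COLS doubles */
    }
}
```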
Are you of the opinion that one or both of memcpy(p, arr, sizeof arr)
and memcpy(p, &arr, sizeof arr) are undefined?

What puzzles me is that arr (in a context where it converts to
&arr[0]) is supposed to be a pointer legally constrained to range over
only the first (array) element of arr, yet if I have

double x[ROWS];

x (in places where it converts to &x[0]) is permitted to range beyond
that first (non-array) element of x. Now, in the first case you have
an array pointer that gets further converted and in the second you
don't so this may be where the difference comes from, but I am having
trouble seeing the wording in the standard.

What is it about the conversions in memcpy(p, arr, sizeof arr) that
causes trouble when those in memcpy(p, x, sizeof x) do not?

--
Ben.
Sep 1 '08 #54
vi******@gmail.com writes:
The subject might be misleading.
Regardless, is this code valid:

#include <stdio.h>

void f(double *p, size_t size) { while(size--) printf("%f\n", *p++); }
int main(void) {
double array[2][1] = { { 3.14 }, { 42.6 } };
f((double *)array, sizeof array / sizeof **array);
return 0;
}

Assuming casting double [2][1] to double * is implementation defined
or undefined behavior, replace the cast with (void *).

Since arrays are not allowed to have padding bytes between
elements, it seems valid to me.
Despite what some other people have been saying,
this is valid. If foo is an array, doing (some_type *)foo
gives access to all of foo. Since 'array' is made up
(ultimately) of double's, using '(double*)array' will
work just fine.
Sep 1 '08 #55
Barry Schwarz <sc*******@yahoo.com> writes:
On Aug 28, 8:15 pm, vipps...@gmail.com wrote:
The subject might be misleading.
Regardless, is this code valid:

#include <stdio.h>

void f(double *p, size_t size) { while(size--) printf("%f\n", *p++); }
int main(void) {
double array[2][1] = { { 3.14 }, { 42.6 } };
f((double *)array, sizeof array / sizeof **array);
return 0;

}

Assuming casting double [2][1] to double * is implementation defined
or undefined behavior, replace the cast with (void *).
[snip]
Since arrays are not allowed to have padding bytes between
elements, it seems valid to me.

If your system has built in hardware assist for bounds checking, it
would be reasonable for the "bounds registers" to contain the start
and end addresses of array[0]. Eventually your p++ would be outside
this range (even though it is still within array as a whole). While
this is a perfectly valid value, attempts to dereference it should be
trapped by the bounds checking logic in the hardware.
As explained elsethread, what's being converted is 'array';
the converted pointer value must allow access to all of array.
Sep 1 '08 #56
Ben Bacarisse said:
Richard Heathfield writes:
<snip>
>>
If all you want is a solution that is guaranteed not to break any rules,
it's pretty easy. If MATLAB provides a space into which you need only
copy the data, you can do so in a simple loop:

/* We assume that arr is defined as double arr[ROWS][COLS]. We
further assume that p is of type double *, and points to
space at least ROWS * COLS * sizeof(double) bytes in size.
*/
t = p;
thisrow = 0;
while(thisrow < ROWS)
{
memcpy(t, arr[thisrow], sizeof arr[thisrow]);
t += COLS; /* move t on by COLS doubles */
++thisrow;
}

Are you of the opinion that one or both of memcpy(p, arr, sizeof arr)
and memcpy(p, &arr, sizeof arr) are undefined?
<weasel>
I'm of the opinion that the above code represents a squeaky-clean
way of converting a two-dimensional array into a one-dimensional
array.
</weasel>

A slightly less weaselly answer to your question would be that I'm not
entirely sure that a pedant (i.e. plenty of people in this newsgroup,
including myself) could not construct an argument that the memcpy route
could exhibit undefined behaviour.
What puzzles me is that arr (in a context where it converts to
&arr[0]) is supposed to be a pointer legally constrained to range over
only the first (array) element of arr, yet if I have

double x[ROWS];

x (in places where it converts to &x[0]) is permitted to range beyond
that first (non-array) element of x. Now, in the first case you have
an array pointer that gets further converted and in the second you
don't so this may be where the difference comes from, but I am having
trouble seeing the wording in the standard.

What is it about the conversions in memcpy(p, arr, sizeof arr) that
causes trouble when those in memcpy(p, x, sizeof x) do not?
I think I'll leave this one to the pedants that are actually of that
opinion. Whilst I understand the strict point that you can't wander (too
far) past the end of an array, *even if* you know that just past it is
another array that's part of the same array-of-arrays, I find it
impossible to conceive of a way in which it could break anything if all
the other rules of C are observed (by both the implementation and the
program). That doesn't mean there isn't such a way. Perhaps someone will
be able to enlighten us both.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Sep 1 '08 #57
On Sep 1, 2:49 pm, Tim Rentsch <t...@alumnus.caltech.edu> wrote:
vipps...@gmail.com writes:
The subject might be misleading.
Regardless, is this code valid:
#include <stdio.h>
void f(double *p, size_t size) { while(size--) printf("%f\n", *p++); }
int main(void) {
double array[2][1] = { { 3.14 }, { 42.6 } };
f((double *)array, sizeof array / sizeof **array);
return 0;
}
Assuming casting double [2][1] to double * is implementation defined
or undefined behavior, replace the cast with (void *).
Since arrays are not allowed to have padding bytes between
elements, it seems valid to me.

Despite what some other people have been saying,
this is valid. If foo is an array, doing (some_type *)foo
gives access to all of foo. Since 'array' is made up
(ultimately) of double's, using '(double*)array' will
work just fine.
Well, now I'm at a loss again. I think the only way to settle this is to
provide quotes from the standard that agree (or disagree) with you.
Sep 1 '08 #58
James Kuyper <ja*********@verizon.net> writes:
James Tursa wrote:
On Fri, 29 Aug 2008 11:08:00 GMT, James Kuyper
<ja*********@verizon.net> wrote:
The key point is the pointer conversion. At the point where that
conversion occurs, the compiler knows that (double*)array == array[0].
It's undefined if any number greater than 1 is added to that pointer
value, and also if that pointer is dereferenced after adding 1 to it.
Trying to understand your answer as it relates to the original post. I
don't see how the original function gets an address 2 beyond the end,
or 1 beyond the end and attempts to dereference it, as you seem to be
saying. Can you point this out? Did I misunderstand you?

Quite possibly. The key point you need to understand is what array the
pointer points at. It's important to understand that, given the
following declaration:

double array[2][1];

"array" is not an array of "double". The element type for "array" is
"double[1]". On the other hand, array[0] is itself an array; the element
type for that array is "double".
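
The distinction between the two element types can be checked with sizeof. The following is a small sketch using the declaration from the original post; the function names outer_count, inner_count and total_doubles are hypothetical helpers added for illustration.

```c
#include <stddef.h>

static double array[2][1] = { { 3.14 }, { 42.6 } };

/* Number of elements of "array" itself; each element is a double[1]. */
static size_t outer_count(void)
{
    return sizeof array / sizeof array[0];
}

/* Number of elements of the sub-array array[0]; each element is a double. */
static size_t inner_count(void)
{
    return sizeof array[0] / sizeof array[0][0];
}

/* Total number of doubles, computed the way the original post does it. */
static size_t total_doubles(void)
{
    return sizeof array / sizeof **array;
}
```

Here outer_count() and total_doubles() both yield 2 while inner_count() yields 1, which is why the argument below treats array[0] as an array of length 1.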

The rules governing the behavior of pointer arithmetic are described by
6.5.6p8 in terms of an array whose element type is the type that the
pointer points at; they make no sense when interpreted in terms of an
array with any other element type.

The standard does NOT clearly state where it is that (double*)array
points. I will assume what everyone "knows", which is that it points at
the same location in memory as the original pointer.
All good so far....

There is only one array with an element type of "double" that starts at
that location. It isn't "array", it's "array[0]". Therefore, the rules
concerning pointer arithmetic are described relative to array[0]. Since
array[0] has a length of 1, the behavior is undefined if any integer
other than 0 or 1 is added to it, and it is not legal to dereference it
after 1 has been added to it; the same must also be true of (double*)array.
The flaw in the reasoning here is that it must be an array of double
that is being converted into a double *. It need not; what's being
converted is array, and it's being converted to double *. Whatever
the type of elements of an array X, doing (some_type*) X treats all
of X as though it has elements of some_type (with the usual caveats
about alignment).
Sep 1 '08 #59
Tim Rentsch wrote:
....
The very same argument applies if we don't use xa but just use
a directly; the value '(void*)a' has the same extent as the
array a. And so must '(double*)a', or any other pointer
conversion of a, provided of course that alignment requirements
are satisfied.
That's where your argument breaks down. A double* is governed by rules
about the limits of pointer addition that char* is specifically exempted
from, and which are meaningless for void*. I've described the problem in
more detail in another branch of this discussion, so I won't repeat the
description here.
Sep 1 '08 #60
Richard Heathfield wrote:
Ben Bacarisse said:
>Richard Heathfield writes:
<snip>
>>If all you want is a solution that is guaranteed not to break any rules,
it's pretty easy. If MATLAB provides a space into which you need only
copy the data, you can do so in a simple loop:

/* We assume that arr is defined as double arr[ROWS][COLS]. We
further assume that p is of type double *, and points to
space at least ROWS * COLS * sizeof(double) bytes in size.
*/
t = p;
thisrow = 0;
while(thisrow < ROWS)
{
memcpy(t, arr[thisrow], sizeof arr[thisrow]);
t += COLS; /* move t on by COLS doubles */
++thisrow;
}
Are you of the opinion that one or both of memcpy(p, arr, sizeof arr)
and memcpy(p, &arr, sizeof arr) are undefined?

<weasel>
I'm of the opinion that the above code represents a squeaky-clean
way of converting a two-dimensional array into a one-dimensional
array.
</weasel>

A slightly less weaselly answer to your question would be that I'm not
entirely sure that a pedant (i.e. plenty of people in this newsgroup,
including myself) could not construct an argument that the memcpy route
could exhibit undefined behaviour.
>What puzzles me is that arr (in a context where it converts to
&arr[0]) is supposed to be a pointer legally constrained to range over
only the first (array) element of arr, yet if I have

double x[ROWS];

x (in places where it converts to &x[0]) is permitted to range beyond
that first (non-array) element of x. Now, in the first case you have
an array pointer that gets further converted and in the second you
don't so this may be where the difference comes from, but I am having
trouble seeing the wording in the standard.

What is it about the conversions in memcpy(p, arr, sizeof arr) that
causes trouble when those in memcpy(p, x, sizeof x) do not?
There can't be any difference.
In both cases the third argument is the number of bytes
of the object referred to by the second argument.
And in both cases the second parameter is initialised
to the address of the lowest addressable byte
of the object referred to by the second argument.
I think I'll leave this one to the pedants that are actually of that
opinion. Whilst I understand the strict point that you can't wander (too
far) past the end of an array, *even if* you know that just past it is
another array that's part of the same array-of-arrays, I find it
impossible to conceive of a way in which it could break anything if all
the other rules of C are observed (by both the implementation and the
program). That doesn't mean there isn't such a way. Perhaps someone will
be able to enlighten us both.
I don't understand the aversion to using a nested loop,
using the assignment operator for the type double values.

I never use memcpy when I can use an assignment operator instead.

/* BEGIN new.c */

#include <stdio.h>
#include <stdlib.h>

#define ROWS 2
#define COLS 3

int main(void)
{
double arr[ROWS][COLS];
double (*p)[COLS];
size_t row;
size_t col;

p = malloc(ROWS * sizeof *p);
if (p == NULL) {
puts("p == NULL");
exit(EXIT_FAILURE);
}
for (row = 0; row != ROWS; ++row) {
for (col = 0; col != COLS; ++col) {
arr[row][col] = (1 + row) * (1 + col);
}
}
for (row = 0; row != ROWS; ++row) {
for (col = 0; col != COLS; ++col) {
p[row][col] = arr[row][col];
}
}
for (row = 0; row != ROWS; ++row) {
for (col = 0; col != COLS; ++col) {
printf("%f ", arr[row][col]);
}
putchar('\n');
}
putchar('\n');
for (row = 0; row != ROWS; ++row) {
for (col = 0; col != COLS; ++col) {
printf("%f ", p[row][col]);
}
putchar('\n');
}
free(p);
return 0;
}

/* END new.c */
--
pete
Sep 1 '08 #61
Tim Rentsch wrote:
....
converted is array, and it's being converted to double *. Whatever
the type of elements of an array X, doing (some_type*) X treats all
of X as though it has elements of some_type (with the usual caveats
about alignment).
The standard says nothing about that. It says remarkably little about
what (sometype*)X does in general. Most of what it does say is that,
under some circumstances (none of which are relevant to this case),
conversion back to the original type returns a pointer that compares
equal to the original. Of the few cases where it says anything more than
that about what the result of (sometype*)X is, none apply here.
Sep 1 '08 #62
James Kuyper <ja*********@verizon.net> writes:
James Tursa wrote:
...
OK, that's fine for objects, but that doesn't answer my question. What
is it about 2-dimensional (or multi-dimensional) arrays of double that
does not allow them to be stepped through with a double* ?

Ultimately, nothing more or less than the fact that the standard says
that the behavior is undefined. Because the behavior is undefined,
compilers are allowed to generate code that might fail if such stepping
is attempted (though this is rather unlikely). More importantly,
compilers are allowed to generate code that assumes that such stepping
will not be attempted, and therefore fails catastrophically if it
actually is attempted - the most plausible mode of failure is a failure
to check for aliasing.

Specific details:

Given

double array[2][1];
double *p = (double*)array;

If there is code which sets array[1][i] to one value, and p[j] to
another value, the compiler is not required to consider the possibility
that p[j] and array[1][i] might point at the same location in memory.
It's allowed to keep either value in a register, or to keep the two
values in different registers. It's not required to make the next
reference to array[1][i] give the same value as the next reference to p[j].

This is because the behavior would be undefined if 'i' and 'j' had
values that might ordinarily cause you to expect array[1][i] and p[j]
to refer to the same location. Note: this convoluted wording is
necessary, because if 'i' and 'j' have such values, then at least one of
the two expressions has undefined behavior, rendering it meaningless to
talk about which location that expression actually refers to.
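
The aliasing scenario above might look like the following sketch. The function name store_both is hypothetical. Under the reading given in this post, a compiler that can see p was derived from (double *)array would be free to return the value it just stored in array[1][0] without reloading it. The sketch is only meant to be called with j == 0, the index that nobody in the thread disputes.

```c
#include <stddef.h>

static double array[2][1];

/* Writes array[1][0], then writes through p, then reads array[1][0] back.
   With p == (double *)array and j == 0 the two writes hit different
   doubles, so the value read back is the one stored first. The contested
   case is j == 1, which this sketch deliberately does not exercise. */
static double store_both(double *p, size_t j)
{
    array[1][0] = 1.0;
    p[j] = 2.0;
    return array[1][0];
}
```

A call such as store_both((double *)array, 0) stays within array[0] and is therefore uncontroversial.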
You're starting with the conclusion, and then "proving" the
conclusion. This conclusion isn't consistent with other
behavior and language in the standard.
... And
ultimately, I would also ask if it is safe/conforming to use memcpy or
the like to copy values from/to such an array wholesale. e.g., is it

Yes, it is, and the reason is that the standard explicitly allows access
to the entirety of an object through lvalues of "unsigned char", and the
behavior of memcpy() is defined in terms of operations on "unsigned
char" lvalues. There is no similar exemption for "double*".
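
As an illustration of access through "unsigned char" lvalues, here is a sketch of a byte-wise copy. The name copy_bytes is hypothetical and this is not how any particular library implements memcpy; it only shows the kind of access the post says is permitted over a whole object.

```c
#include <stddef.h>

/* Copies n bytes from src to dst, one unsigned char at a time.
   The two regions must not overlap. */
static void copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--) {
        *d++ = *s++;
    }
}
```

Given double arr[2][3] and double flat[6], the call copy_bytes(flat, &arr, sizeof arr) copies every byte of arr into flat.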
Irrelevant, because that's talking about whether a memory access
can have undefined behavior because of an invalid representation.
It's just as illegal to access outside of an array using unsigned
char as it is using double. The only question is, what memory
may be accessed. Since 'array' is what was converted, any memory in
array may be accessed.
Sep 1 '08 #63
pete <pf*****@mindspring.com> writes:
Richard Heathfield wrote:
>Ben Bacarisse said:
>>Richard Heathfield writes:
<snip>
>>>If all you want is a solution that is guaranteed not to break any rules,
it's pretty easy. If MATLAB provides a space into which you need only
copy the data, you can do so in a simple loop:

/* We assume that arr is defined as double arr[ROWS][COLS]. We
further assume that p is of type double *, and points to
space at least ROWS * COLS * sizeof(double) bytes in size.
*/
t = p;
thisrow = 0;
while(thisrow < ROWS)
{
memcpy(t, arr[thisrow], sizeof arr[thisrow]);
t += COLS; /* move t on by COLS doubles */
++thisrow;
}
Are you of the opinion that one or both of memcpy(p, arr, sizeof arr)
and memcpy(p, &arr, sizeof arr) are undefined?

<weasel>
I'm of the opinion that the above code represents a squeaky-clean
way of converting a two-dimensional array into a one-dimensional
array.
</weasel>

A slightly less weaselly answer to your question would be that I'm
not entirely sure that a pedant (i.e. plenty of people in this
newsgroup, including myself) could not construct an argument that
the memcpy route could exhibit undefined behaviour.
>>What puzzles me is that arr (in a context where it converts to
&arr[0]) is supposed to be a pointer legally constrained to range over
only the first (array) element of arr, yet if I have

double x[ROWS];

x (in places where it converts to &x[0]) is permitted to range beyond
that first (non-array) element of x. Now, in the first case you have
an array pointer that gets further converted and in the second you
don't so this may be where the difference comes from, but I am having
trouble seeing the wording in the standard.

What is it about the conversions in memcpy(p, arr, sizeof arr) that
causes trouble when those in memcpy(p, x, sizeof x) do not?

There can't be any difference.
In both cases the third argument is the number of bytes
of the object referred to by the second argument.
And in both cases the second parameter is initialised
to the address of the lowest addressable byte
of the object referred to by the second argument.
Firstly, that wording (about lowest addressable byte) applies only to
conversion to "pointer to character" types. All sane people know it
applies to void * too (and hence memcpy) but it is very hard to prove
it from the wording in the standard.

Secondly, there is a hair's breadth of difference between the cases.
In one (passing arr) a pointer of type double (*)[COLS] is converted
to void *; in another (passing &arr) a pointer of type double
(*)[ROWS][COLS] is converted to void *; and in my third case (passing
x) a double * is converted to void *. It is possible that some
wording somewhere allows an implementation to restrict the range of
some of these converted pointers and not others. I don't believe so,
but that is what I think is being claimed by some.
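
The three pointer types in question can be made visible with sizeof. This is a sketch with assumed values for ROWS and COLS and hypothetical helper names; it takes no side on whether the converted pointers differ in permitted range, it only shows what each expression points at before conversion to void *.

```c
#include <stddef.h>

#define ROWS 2   /* assumed dimensions, for illustration only */
#define COLS 3

static double arr[ROWS][COLS];
static double x[ROWS];

/* arr converts to double (*)[COLS]; it points at one row. */
static size_t pointee_of_arr(void)      { return sizeof *arr; }

/* &arr has type double (*)[ROWS][COLS]; it points at the whole array. */
static size_t pointee_of_addr_arr(void) { return sizeof *&arr; }

/* x converts to double *; it points at a single double. */
static size_t pointee_of_x(void)        { return sizeof *x; }

/* Nonzero if arr and &arr compare equal once both are converted to void *. */
static int same_start(void)
{
    return (void *)arr == (void *)&arr;
}
```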

<snip>
I don't understand the aversion to using a nested loop,
using the assignment operator for the type double values.

I never use memcpy when I can use an assignment operator instead.
I think people are just trying to work out what is allowed and what is
not but I can see some value for some numerical applications where
utility functions could be written to be "size neutral" without
needing VLAs.
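
One way to get such a "size neutral" utility without touching the disputed conversion is to store the matrix in a genuinely one-dimensional array and compute the index by hand. A minimal sketch, with sum_matrix as a hypothetical example function:

```c
#include <stddef.h>

/* Sums a rows-by-cols matrix stored row-major in one flat array of
   rows * cols doubles. No two-dimensional array type is involved. */
static double sum_matrix(const double *m, size_t rows, size_t cols)
{
    double total = 0.0;
    size_t r, c;

    for (r = 0; r < rows; ++r) {
        for (c = 0; c < cols; ++c) {
            total += m[r * cols + c];
        }
    }
    return total;
}
```

The same function then serves any dimensions, for example sum_matrix(flat, 2, 3) for a double flat[6].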

--
Ben.
Sep 1 '08 #64
James Kuyper <ja*********@verizon.net> writes:
Tim Rentsch wrote:
...
>The very same argument applies if we don't use xa but just use
a directly; the value '(void*)a' has the same extent as the
array a. And so must '(double*)a', or any other pointer
conversion of a, provided of course that alignment requirements
are satisfied.

That's where your argument breaks down. A double* is governed by rules
about the limits of pointer addition that char* is specifically
exempted from, and which are meaningless for void*. I've described the
problem in more detail in another branch of this discussion, so I
won't repeat the description here.
I can't find the other message so I'll have to ask here. Are you
saying that converting a double * to, say, unsigned char * permits one
to access parts of an array that are not accessible via the double *?
What are the limits on addition that one is exempted from and where is
this permission granted?

You go on to say that these limits are meaningless for void *, but at
some point, useful void *s are converted back. Do the limits that are
then imposed derive from the original pointer or can they get lost?
I.e. is (double *)(void *)dp different in what it can access from
(double *)(void *)(unsigned char *)dp?

--
Ben.
Sep 1 '08 #65
On Mon, 01 Sep 2008 05:24:14 -0700, Tim Rentsch wrote:
Harald van Dĳk <tr*****@gmail.com> writes:
>On Sun, 31 Aug 2008 19:16:40 +0000, James Tursa wrote:

[restoring snipped portion]
>>int main(void) {
char array[2][1] = { { 'a' }, { 'b' } };
f((char *)array, sizeof array / sizeof **array);
return 0;

}
OK, that's fine for objects, but that doesn't answer my question.
What is it about 2-dimensional (or multi-dimensional) arrays of
double that does not allow them to be stepped through with a double*
?

The fact that double[2][3] doesn't have elements such as x[0][5]. There
must be a valid double, 5*sizeof(double) bytes into x. However, x[0][5]
doesn't mean just that. x[0][5] (or ((double*)x)[5]) means you're
looking 5*sizeof(double) bytes into x[0]. x[0] doesn't have that many
elements.

That doesn't matter since array isn't being accessed as a
two-dimensional array. Converting array (not array[0], but array) gives
a pointer that has access to all the same memory as array.
With the exception of character types, does the standard describe the
conversion of an array to anything other than its initial element?
Strictly speaking, I can't even find where the standard describes the
result of converting double(*)[3] to double* at all, but the only way to
perform that conversion indirectly is by taking the address of the first
element of the first sub-array, and I accept that a direct conversion
should mean the same thing. If you can point out where more permissions
are given, please do so.
Sep 1 '08 #66
On Mon, 01 Sep 2008 09:38:20 +0000, Richard Heathfield
<rj*@see.sig.invalid> wrote:
>
If all you want is a solution that is guaranteed not to break any rules,
it's pretty easy. If MATLAB provides a space into which you need only copy
the data, you can do so in a simple loop:

(If MATLAB wants you to provide the space and tell it where to look, then
you allocate a big enough space: p = malloc(ROWS * COLS * sizeof *p), and,
provided that the allocation was successful, copy the array into it as
shown above. Then tell MATLAB about p.)
Side note FYI: It can be done either way. You can call a MATLAB
function to build the complete mxArray structure (serves as a MATLAB
style "variable" that can be used in the MATLAB workspace) and then
get the pointer to the data area which you then fill in. Or you can
allocate some raw memory, fill it in, then attach it to a bare bones
mxArray structure. What you can't do is attach a C variable directly
to the mxArray structure ... it would mess up the MATLAB memory
manager. Hence the need to do a copy.
>
I'm not saying this is the fastest way to do it, but I think it's the
fastest way that is guaranteed not to break any rules!
Well, of course, I know how to do assignments in loops ... but who
wants to give up speed if they don't have to? Particularly if there is
a library function available that does exactly what you want.

James Tursa
Sep 1 '08 #67
pete <pf*****@mindspring.com> writes:
Keith Thompson wrote:
>pete <pf*****@mindspring.com> writes:
>>K&R2, Appendix B3:
There are two groups of string functions
defined in the header <string.h>.
The first have names beginning with str;
the second have names beginning with mem.
>memcpy, for example, is
a function that doesn't have anything directly to do with strings.

Has it occurred to you that "library functions" are called that,
because they are part of the library,
and not because of what they do?
Of course.

So you're saying that memcpy is a string function because it's
declared in <string.h>? The same reasoning implies that size_t is a
string type, and NULL is a string macro.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Sep 1 '08 #68
Keith Thompson wrote:
pete <pf*****@mindspring.com> writes:
>Keith Thompson wrote:
>>pete <pf*****@mindspring.com> writes:
K&R2, Appendix B3:
There are two groups of string functions
defined in the header <string.h>.
The first have names beginning with str;
the second have names beginning with mem.
memcpy, for example, is
a function that doesn't have anything directly to do with strings.
Has it occurred to you that "library functions" are called that,
because they are part of the library,
and not because of what they do?

Of course.

So you're saying that memcpy is a string function because it's
declared in <string.h>?
That's what I was thinking at the time,
but upon further consideration,
I think memcpy is called a string function
probably because in some dialects of computer science,
"string" is a synonym for "array".

Intel 80286 and 80287 Programmer's Reference Manual
2.2 data types
String: A contiguous sequence of bytes or words.
A string may contain from 1 byte to 64K bytes.

--
pete
Sep 2 '08 #69
Ben Bacarisse wrote:
pete <pf*****@mindspring.com> writes:
>Richard Heathfield wrote:
>>Ben Bacarisse said:
Richard Heathfield writes:

<snip>
If all you want is a solution that is guaranteed not to break any rules,
it's pretty easy. If MATLAB provides a space into which you need only
copy the data, you can do so in a simple loop:
>
/* We assume that arr is defined as double arr[ROWS][COLS]. We
further assume that p is of type double *, and points to
space at least ROWS * COLS * sizeof(double) bytes in size.
*/
t = p;
thisrow = 0;
while(thisrow < ROWS)
{
memcpy(t, arr[thisrow], sizeof arr[thisrow]);
t += COLS; /* move t on by COLS doubles */
++thisrow;
}
Are you of the opinion that one or both of memcpy(p, arr, sizeof arr)
and memcpy(p, &arr, sizeof arr) are undefined?
<weasel>
I'm of the opinion that the above code represents a squeaky-clean
way of converting a two-dimensional array into a one-dimensional
array.
</weasel>

A slightly less weaselly answer to your question would be that I'm
not entirely sure that a pedant (i.e. plenty of people in this
newsgroup, including myself) could not construct an argument that
the memcpy route could exhibit undefined behaviour.

What puzzles me is that arr (in a context where it converts to
&arr[0]) is supposed to be a pointer legally constrained to range over
only the first (array) element of arr, yet if I have

double x[ROWS];

x (in places where it converts to &x[0]) is permitted to range beyond
that first (non-array) element of x. Now, in the first case you have
an array pointer that gets further converted and in the second you
don't so this may be where the difference comes from, but I am having
trouble seeing the wording in the standard.

What is it about the conversions in memcpy(p, arr, sizeof arr) that
causes trouble when those in memcpy(p, x, sizeof x) do not?
There can't be any difference.
In both cases the third argument is the number of bytes
of the object referred to by the second argument.
And in both cases the second parameter is initialised
to the address of the lowest addressable byte
of the object referred to by the second argument.

Firstly, that wording (about lowest addressable byte) applies only to
conversion to "pointer to character" types. All sane people know it
applies to void * too (and hence memcpy) but it is very hard to prove
it from the wording in the standard.
I made a mistake when I said "parameter".
The internal workings of the string functions are described as:

N869
7.21 String handling <string.h>
7.21.1 String function conventions
[#1] The header <string.h> declares one type and several
functions, and defines one macro useful for manipulating
arrays of character type and other objects treated as arrays
of character type.

To treat an object as an array of character type,
the object must be accessed as though by a pointer to character type.

--
pete
Sep 2 '08 #70
pete <pf*****@mindspring.com> writes:
Keith Thompson wrote:
>pete <pf*****@mindspring.com> writes:
>>Keith Thompson wrote:
pete <pf*****@mindspring.com> writes:
K&R2, Appendix B3:
There are two groups of string functions
defined in the header <string.h>.
The first have names beginning with str;
the second have names beginning with mem.
memcpy, for example, is
a function that doesn't have anything directly to do with strings.
Has it occurred to you that "library functions" are called that,
because they are part of the library,
and not because of what they do?
Of course.
So you're saying that memcpy is a string function because it's
declared in <string.h>?

That's what I was thinking at the time,
but upon further consideration,
I think memcpy is called a string function
probably because in some dialects of computer science,
"string" is a synonym for "array".

Intel 80286 and 80287 Programmer's Reference Manual
2.2 data types
String: A contiguous sequence of bytes or words.
A string may contain from 1 byte to 64K bytes.
Perhaps, but in C string has a very specific meaning.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Sep 2 '08 #71
On Sep 2, 8:23 am, Keith Thompson <ks...@mib.org> wrote:
pete <pfil...@mindspring.com> writes:
Keith Thompson wrote:
pete <pfil...@mindspring.com> writes:
Keith Thompson wrote:
pete <pfil...@mindspring.com> writes:
K&R2, Appendix B3:
There are two groups of string functions
defined in the header <string.h>.
The first have names beginning with str;
the second have names beginning with mem.
memcpy, for example, is
a function that doesn't have anything directly to do with strings.
Has it occurred to you that "library functions" are called that,
because they are part of the library,
and not because of what they do?
Of course.
So you're saying that memcpy is a string function because it's
declared in <string.h>?
That's what I was thinking at the time,
but upon further consideration,
I think memcpy is called a string function
probably because in some dialects of computer science,
"string" is a synonym for "array".
Intel 80286 and 80287 Programmer's Reference Manual
2.2 data types
String: A contiguous sequence of bytes or words.
A string may contain from 1 byte to 64K bytes.

Perhaps, but in C string has a very specific meaning.
And "string literal" has a completely different one. Thus, "string
function" can have a meaning that doesn't have anything to do with
"string".
Sep 2 '08 #72
vi******@gmail.com said:
On Sep 2, 8:23 am, Keith Thompson <ks...@mib.org> wrote:
>pete <pfil...@mindspring.com> writes:
<snip>
I think memcpy is called a string function
probably because in some dialects of computer science,
"string" is a synonym for "array".
Intel 80286 and 80287 Programmer's Reference Manual
2.2 data types
String: A contiguous sequence of bytes or words.
A string may contain from 1 byte to 64K bytes.

Perhaps, but in C string has a very specific meaning.

And "string literal" has a completely different one.
No, it doesn't. It is a specification, that's all. A string literal /is/ a
string, /and/ it's a literal. Hence, string literal.

A function such as memcpy that manipulates arbitrary blocks of memory is no
more a string function than it is an integer function, a floating-point
function or a structure function. It is all of those things - why pick out
"string"?
Thus, "string function" can have a meaning that doesn't have anything
to do with "string".
Any conclusion may be drawn from a false premise.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Sep 2 '08 #73
On Sep 2, 8:57 am, Richard Heathfield <r...@see.sig.invalid> wrote:
vipps...@gmail.com said:
On Sep 2, 8:23 am, Keith Thompson <ks...@mib.org> wrote:
pete <pfil...@mindspring.com> writes:
<snip>
I think memcpy is called a string function
probably because in some dialects of computer science,
"string" is a synonym for "array".
Intel 80286 and 80287 Programmer's Reference Manual
2.2 data types
String: A contiguous sequence of bytes or words.
A string may contain from 1 byte to 64K bytes.
Perhaps, but in C string has a very specific meaning.
And "string literal" has a completely different one.

No, it doesn't. It is a specification, that's all. A string literal /is/ a
string, /and/ it's a literal. Hence, string literal.
A string literal need not be a string; for example, "hello\0world"
is not a string.
There is a footnote somewhere in the standard explicitly mentioning
that.
Sep 2 '08 #74
vi******@gmail.com said:
On Sep 2, 8:57 am, Richard Heathfield <r...@see.sig.invalid> wrote:
>vipps...@gmail.com said:
On Sep 2, 8:23 am, Keith Thompson <ks...@mib.org> wrote:
<snip>
>Perhaps, but in C string has a very specific meaning.
And "string literal" has a completely different one.

No, it doesn't. It is a specification, that's all. A string literal /is/
a string, /and/ it's a literal. Hence, string literal.

A string literal need not be a string; for example, "hello\0world"
is not a string.
Right. It's several strings. (I count twelve.) I should have said "a string
literal contains at least one string".

Now show me a string literal that doesn't contain *any* strings.
There is a footnote somewhere in the standard explicitly mentioning
that.
Where, exactly?

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Sep 2 '08 #75
On Sep 2, 9:26 am, Richard Heathfield <r...@see.sig.invalid> wrote:
vipps...@gmail.com said:
On Sep 2, 8:57 am, Richard Heathfield <r...@see.sig.invalid> wrote:
vipps...@gmail.com said:
On Sep 2, 8:23 am, Keith Thompson <ks...@mib.org> wrote:

<snip>
Perhaps, but in C string has a very specific meaning.
And "string literal" has a completely different one.
No, it doesn't. It is a specification, that's all. A string literal /is/
a string, /and/ it's a literal. Hence, string literal.
A string literal need not be a string; for example, "hello\0world"
is not a string.

Right. It's several strings. (I count twelve.) I should have said "a string
literal contains at least one string".
Wrong, it's not several strings. It's a string literal. It contains
two strings.
Now show me a string literal that doesn't contain *any* strings.
Show me a flying elephant. :-)
There is a footnote somewhere in the standard explicitly mentioning
that.

Where, exactly?
n1256.pdf,
6.4.5 footnote 66
A character string literal need not be a string (see 7.1.1), because a null character may be embedded in
it by a \0 escape sequence.
Sep 2 '08 #76
On Sep 2, 9:53 am, vipps...@gmail.com wrote:
>
On Sep 2, 9:26 am, Richard Heathfield <r...@see.sig.invalid> wrote:

vipps...@gmail.com said:
>
A string literal need not be a string; for example, "hello\0world"
is not a string.
Right. It's several strings. (I count twelve.) I should have said "a string
literal contains at least one string".

Wrong, it's not several strings. It's a string literal. It contains
two strings.
Whoops, you are right: twelve strings. However, it's not several
strings, it's a string literal that contains twelve strings.
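
The count of twelve can be checked mechanically. In C a string is a contiguous sequence of characters terminated by and including the first null character, so every position in the array from which a null character is reachable begins a string (an empty one when the position holds the null itself). The sketch below counts such positions; the function name count_strings is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the positions in buf[0..n-1] that begin a string, i.e. from
   which a null character can be reached without leaving the buffer. */
size_t count_strings(const char *buf, size_t n)
{
    size_t i, count = 0;

    assert(buf != NULL);
    for (i = 0; i < n; ++i) {
        if (memchr(buf + i, '\0', n - i) != NULL)
            ++count;
    }
    return count;
}
```

The array for "hello\0world" has 12 elements (5 + 1 + 5 + 1), and count_strings("hello\0world", sizeof "hello\0world") finds a string starting at each of them.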
Sep 2 '08 #77
vi******@gmail.com said:
On Sep 2, 9:26 am, Richard Heathfield <r...@see.sig.invalid> wrote:
>vipps...@gmail.com said:
On Sep 2, 8:57 am, Richard Heathfield <r...@see.sig.invalid> wrote:
vipps...@gmail.com said:
On Sep 2, 8:23 am, Keith Thompson <ks...@mib.org> wrote:

<snip>
>Perhaps, but in C string has a very specific meaning.
And "string literal" has a completely different one.
>No, it doesn't. It is a specification, that's all. A string literal
/is/ a string, /and/ it's a literal. Hence, string literal.
A string literal need not be a string; for example, "hello\0world"
is not a string.

Right. It's several strings. (I count twelve.) I should have said "a
string literal contains at least one string".

Wrong, it's not several strings. It's a string literal. It contains
two strings.
Wrong, it contains twelve strings. (I know you know this, because I read
your whole article before replying. Kindly extend to me the same
courtesy.) In any case, you seem to be missing the point of the
discussion. The fact remains that string literals are intimately
associated with strings. You can't have a string literal that doesn't
contain at least one string. Thus, string literals *are to do with
strings*.

Back to the point: the term "string function" associates "string" and
"function", so it makes perfect sense for a "string function" to be a
"function that has to do with strings". The memcpy function doesn't
qualify, because it has no more to do with strings than qsort or bsearch
or fwrite do. If memcpy is a string function, why aren't qsort and bsearch
and fwrite string functions?
>Now show me a string literal that doesn't contain *any* strings.

Show me a flying elephant. :-)
http://www.imdb.com/title/tt0033563/
There is a footnote somewhere in the standard explicitly mentioning
that.

Where, exactly?

n1256.pdf,
6.4.5 footnote 66
>A character string literal need not be a string (see 7.1.1), because a
null character may be embedded in it by a \0 escape sequence.
Right - but I have yet to see a string literal that doesn't contain at
least one string.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Sep 2 '08 #78
vi******@gmail.com wrote:
On Sep 2, 8:23 am, Keith Thompson <ks...@mib.org> wrote:
>pete <pfil...@mindspring.com> writes:
>>Keith Thompson wrote:
pete <pfil...@mindspring.com> writes:
Keith Thompson wrote:
>pete <pfil...@mindspring.com> writes:
>>K&R2, Appendix B3:
>> There are two groups of string functions
>> defined in the header <string.h>.
>> The first have names beginning with str;
>> the second have names beginning with mem.
>Perhaps, but in C string has a very specific meaning.

And "string literal" has a completely different one. Thus, "string
function" can have a meaning that doesn't have anything to do with
"string".
I don't think it really matters much why memcpy is a string function.
What matters here, is that it is.

--
pete
Sep 2 '08 #79
Richard Heathfield <rj*@see.sig.invalid> writes:
vi******@gmail.com said:
>On Sep 2, 8:23 am, Keith Thompson <ks...@mib.org> wrote:
[...]
>>Perhaps, but in C string has a very specific meaning.

And "string literal" has a completely different one.

No, it doesn't. It is a specification, that's all. A string literal
/is/ a string, /and/ it's a literal. Hence, string literal.
No, a string literal is a token in a C source file, and a string is
something that exists during program execution. For example, "\n" is
a string literal; it consists of 4 characters, none of which exist in
the corresponding string that exists at run time -- just as the digits
1, 2, and 3 exist in the integer constant token 123 but not in the
corresponding run-time value of type int.

However, string literals certainly are very closely tied to strings; a
string literal (in C source) is almost always intended to specify a
string value (during program execution), with minor exceptions such as
"foo\0bar" and ``char s[3] = "foo"'' (as I recall, the latter still
theoretically specifies the terminating '\0', but I'd expect a typical
compiler to optimize it away).
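
The token-versus-object distinction shows up in the sizes. The literal "\n" is spelled with four source characters (quote, backslash, n, quote), yet the array it produces at run time holds two: a newline and the terminating null. A small sketch, with made-up names:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The initializer is a four-character token in the source file. */
static const char newline[] = "\n";

/* Number of elements in the run-time array: newline plus the null. */
size_t newline_array_size(void)
{
    return sizeof newline;
}

/* Length of the run-time string, not counting the terminating null. */
size_t newline_length(void)
{
    assert(newline[sizeof newline - 1] == '\0');
    return strlen(newline);
}
```

Neither the backslash nor the letter n nor either quote mark is present in the resulting array.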

[...]

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Sep 2 '08 #80
pete said:

<snip>
I don't think it really matters much why memcpy is a string function.
What matters here, is that it is.
Why do you think so? The Standard doesn't say so, so what is your
justification for thinking memcpy is a string function?

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Sep 2 '08 #81
pete <pf*****@mindspring.com> writes:
Ben Bacarisse wrote:
>pete <pf*****@mindspring.com> writes:
>>Richard Heathfield wrote:
Ben Bacarisse said:
<snip>
>>>>Are you of the opinion that one or both of memcpy(p, arr, sizeof arr)
memcpy(p, &arr, sizeof arr) are undefined?
<weasel>
I'm of the opinion that the above code represents a squeaky-clean
way of converting a two-dimensional array into a one-dimensional
array.
</weasel>
<snip>
>>>>What is it about the conversions in memcpy(p, arr, sizeof arr) that
causes trouble when those in memcpy(p, x, sizeof x) do not?
There can't be any difference.
In both cases the third argument is the number of bytes
of the object referred to by the second argument.
And in both cases the second parameter is initialised
to the address of the lowest addressable byte
of the object referred to by the second argument.

Firstly, that wording (about lowest addressable byte) applies only to
conversion to "pointer to character" types. All sane people know it
applies to void * too (and hence memcpy) but it is very hard to prove
it from the wording in the standard.

I made a mistake when I said "parameter".
The internal workings of the string functions are described as:
Are the internal workings in dispute? I thought we were both
commenting on what gets passed.
N869
7.21 String handling <string.h>
<snip>

--
Ben.
Sep 2 '08 #82
Richard Heathfield wrote:
pete said:

<snip>
>I don't think it really matters much why memcpy is a string function.
What matters here, is that it is.

Why do you think so? The Standard doesn't say so, so what is your
justification for thinking memcpy is a string function?
Two reasons.

1 Because the standard's description of what the functions
which have names starting with mem, do,
is covered under this section of the standard:

7.21.1 String function conventions

2 and because K&R2 says so:

K&R2, Appendix B3:
There are two groups of string functions
defined in the header <string.h>.
The first have names beginning with str;
the second have names beginning with mem.

--
pete
Sep 2 '08 #83
Ben Bacarisse wrote:
pete <pf*****@mindspring.com> writes:
>Ben Bacarisse wrote:
>>pete <pf*****@mindspring.com> writes:

Richard Heathfield wrote:
Ben Bacarisse said:
<snip>
>>>>>Are you of the opinion that one or both of memcpy(p, arr, sizeof arr)
>memcpy(p, &arr, sizeof arr) are undefined?
<weasel>
I'm of the opinion that the above code represents a squeaky-clean
way of converting a two-dimensional array into a one-dimensional
array.
</weasel>
>
<snip>
>>>>>What is it about the conversions in memcpy(p, arr, sizeof arr) that
>causes trouble when those in memcpy(p, x, sizeof x) do not?
There can't be any difference.
In both cases the third argument is the number of bytes
of the object referred to by the second argument.
And in both cases the second parameter is initialised
to the address of the lowest addressable byte
of the object referred to by the second argument.
Firstly, that wording (about lowest addressable byte) applies only to
conversion to "pointer to character" types. All sane people know it
applies to void * too (and hence memcpy) but it is very hard to prove
it from the wording in the standard.
I made a mistake when I said "parameter".
The internal workings of the string functions are described as:

Are the internal workings in dispute? I thought we were both
commenting on what gets passed.
I thought we were talking about the difference between what
memcpy(p, arr, sizeof arr) does, and what
memcpy(p, &arr, sizeof arr) does.

In both cases, the second argument will be treated
as the address of an object which will be treated
as an array of character type,
which means that the elements of the object
will be accessed as though by a pointer to character type.
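
The two calls can be compared side by side. The sketch below assumes, as this post argues, that both are well defined; whether the standard's wording actually proves that is exactly what is in dispute in this thread. The sizes and the function name are made up for illustration. Inside the function, *parr plays the role of arr and parr the role of &arr.

```c
#include <assert.h>
#include <string.h>

enum { ROWS = 2, COLS = 3 };   /* hypothetical sizes, for illustration only */

/* Returns 1 if copying from the array and from its address
   produces byte-identical results, 0 otherwise. */
int whole_array_copies_match(double (*parr)[ROWS][COLS])
{
    double via_arr[ROWS * COLS];
    double via_addr[ROWS * COLS];

    assert(parr != NULL);
    memcpy(via_arr, *parr, sizeof *parr);   /* like memcpy(p, arr, sizeof arr)  */
    memcpy(via_addr, parr, sizeof *parr);   /* like memcpy(p, &arr, sizeof arr) */
    return memcmp(via_arr, via_addr, sizeof *parr) == 0;
}
```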

--
pete
Sep 2 '08 #84
Ben Bacarisse wrote:
James Kuyper <ja*********@verizon.net> writes:
>Tim Rentsch wrote:
...
>>The very same argument applies if we don't use xa but just use
a directly; the value '(void*)a' has the same extent as the
array a. And so must '(double*)a', or any other pointer
conversion of a, provided of course that alignment requirements
are satisfied.
That's where your argument breaks down. A double* is governed by rules
about the limits of pointer addition that char* is specifically
exempted from, and which are meaningless for void*. I've described the
problem in more detail in another branch of this discussion, so I
won't repeat the description here.

I can't find the other message so I'll have to ask here. Are you
saying that converting a double * to, say, unsigned char * permits one
to access parts of an array that are not accessible via the double *?
What are the limits on addition that one is exempted from and where is
this permission granted?
The description of how addition of an integer to a pointer works in
6.5.6p8 is entirely in terms of positions within (and one beyond the end
of) an array of the pointed-at type. The only array of 'double' declared
anywhere in the program that contains the position pointed at by
(double*)a is a[0]. Therefore, the limits based upon array length
imposed by 6.5.6p8 refer to the length of the array a[0] (which is 1),
not the length of 'a' itself (which is 2), and not the length of a
1-dimensional array which could have been allocated in the same memory
as 'a', which would have had a length of 2x1 == 2.

The special characteristic of unsigned char* is that 6.2.6.1p4 defines
C's object model, for non-character types with a size of 'n' bytes, in
terms of treating them as arrays of n unsigned chars - such an array is
defined as constituting the _object representation_ of such types. The
standard guarantees that an object may be copied into such an array, and
gives memcpy() as an example of how this may be done. The fact that
memcpy is given as an example, rather than requiring the use of memcpy()
to perform such a copy, implies that if I were to write a function named
my_memcpy() which matched the other specifications given by the standard
for memcpy(), it must also be usable for copying the object
representation. In other words, the ability to copy object
representations is not a magical additional ability of memcpy(), but is
merely a side-effect of the fact that the behavior of memcpy() is
defined in terms of copying arrays of unsigned char.

If run-time bounds checking were applied to unsigned char* in the most
extreme way otherwise permitted by 6.5.6p8, it would prevent my_memcpy()
from working properly. The undefined behavior allowed by 6.5.6p8 for
adding too large of an integer value to a pointer value, is trumped by
the defined behavior provided by 6.2.6.1p4 for copying an entire object
as an array of unsigned char, even if that object is a multi-dimensional
array of some other type.
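
A my_memcpy() of the kind described might look like the sketch below. It is not taken from any library; it simply copies the object representation one unsigned char at a time, which is the access the argument above relies on.

```c
#include <assert.h>
#include <stddef.h>

/* Copy n bytes from src to dst through unsigned char lvalues.
   The regions must not overlap, as with memcpy(). */
void *my_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(dst != NULL && src != NULL);
    while (n--)
        *d++ = *s++;
    return dst;
}
```

With double a[2][1] and double flat[2], the call my_memcpy(flat, a, sizeof a) walks all sizeof a bytes of the two-dimensional array without ever forming a double* that steps from one row into the next.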
You go on to say that these limits are meaningless for void *, but at
some point, useful void *s are converted back. Do the limits that are
then imposed derive from the original pointer or can they get lost?
I.e. is (double *)(void *)dp different in what it can access to
(double *)(void *)(unsigned char *)dp?
The standard is not at all clear about what happens in most pointer
conversions, including those. However, I see no technical difficulty
with retaining the bounds-checking information inside a pointer
throughout a long series of intermediate pointer conversions. The
bounds-checking information cannot be used if the current pointer type
is "unsigned char*", and it's meaningless if the current pointer type
is "void*", but it can still reside inside such pointers, hidden, waiting
for conversion to a type which does permit run-time bounds-checking. It
is 6.5.6p8 which makes such bounds-checking legal.
Sep 2 '08 #85
Tim Rentsch wrote:
James Kuyper <ja*********@verizon.net> writes:
>James Tursa wrote:
...
>>OK, that's fine for objects, but that doesn't answer my question. What
is it about 2-dimensional (or multi-dimensional) arrays of double that
does not allow them to be stepped through with a double* ?
Ultimately, nothing more or less than the fact that the standard says
that the behavior is undefined. Because the behavior is undefined,
compilers are allowed to generate code that might fail if such stepping
is attempted (though this is rather unlikely). More importantly,
compilers are allowed to generate code that assumes that such stepping
will not be attempted, and therefore fails catastrophically if it
actually is attempted - the most plausible mode of failure is a failure
to check for aliasing.

Specific details:

Given

double array[2][1];
double *p = (double*)array;

If there is code which sets array[1][i] to one value, and p[j] to
another value, the compiler is not required to consider the possibility
that p[j] and array[1][i] might point at the same location in memory.
It's allowed to keep either value in a register, or to keep the two
values in different registers. It's not required to make the next
reference to array[1][i] give the same value as the next reference to p[j].

This is because the behavior would be undefined if 'i' and 'j' had
values that might ordinarily cause you to expect array[1][i] and p[j]
to refer to the same location. Note: this convoluted wording is
necessary, because if 'i' and 'j' have such values, then at least one of
the two expressions has undefined behavior, rendering it meaningless to
talk about which location that expression actually refers to.

You're starting with the conclusion, and then "proving" the
conclusion. This conclusion isn't consistent with other
behavior and language in the standard.
I was not trying to prove that the behavior was undefined. I was trying
to explain how it is that the fact that the behavior is undefined can
make it dangerous to rely upon such code.

I've presented my argument that the behavior IS undefined elsewhere,
most recently in the response I just posted to Ben Bacarisse.

....
>>... And
ultimately, I would also ask if it is safe/conforming to use memcpy or
the like to copy values from/to such an array wholesale. e.g., is it
Yes, it is, and the reason is that the standard explicitly allows access
to the entirety of an object through lvalues of "unsigned char", and the
behavior of memcpy() is defined in terms of operations on "unsigned
char" lvalues. There is no similar exemption for "double*".

Irrelevant, because that's talking about whether a memory access
can have undefined behavior because of an invalid representation.
It's just as illegal to access outside of an array using unsigned
char as it is using double. The only question is, what memory
may be accessed. Since 'array' is what was converted, any memory in
array may be accessed.
There are multiple arrays involved, and the question is - which array is
the one which 6.5.6p8 is referring to? A careful examination of 6.5.6p8
reveals that what it says constitutes utter nonsense unless the element
type of the relevant array is the same as the type pointed at by the
pointer. Example:

int matrix[3][5];
int *pi = matrix[1];

For purposes of integer additions to "pi", if the array referred to by
6.5.6p8 were "matrix" rather than matrix[1], then because pi points at
the second element of matrix, matrix[1], pi+1 would have to point at the
third element, matrix[2]. If "matrix" were the relevant array, then the
largest amount which could be added to matrix[1] would be 2, not 5,
because the length of matrix is only 3, while the length of matrix[1] is 5.

This is a wholly indefensible interpretation of 6.5.6p8. The only array
that it could possibly be referring to, when applied to "pi", is matrix[1].

People have claimed that there is a one-dimensional array of 15 ints
which should be used when applying 6.5.6p8, but I see no such array
declared anywhere in the above code.
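
Under this reading, a pointer obtained from matrix[row] may be stepped across that row's five elements and compared against the position one past the row's end, and no further. The sketch below stays inside those limits; the function name and the enum constants are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { NROWS = 3, NCOLS = 5 };

/* Sum one row of matrix without stepping pi outside that row. */
int sum_row(int matrix[NROWS][NCOLS], size_t row)
{
    int *pi, *end;
    int sum = 0;

    assert(row < NROWS);
    pi = matrix[row];    /* points at matrix[row][0]               */
    end = pi + NCOLS;    /* one past the end of the row: may be    */
                         /* formed and compared, never dereferenced */
    while (pi != end)
        sum += *pi++;
    return sum;
}
```

Here sizeof matrix / sizeof matrix[0] is 3 while sizeof matrix[1] / sizeof matrix[1][0] is 5, which is the mismatch the argument above turns on.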
Sep 2 '08 #86
pete said:
Richard Heathfield wrote:
>pete said:

<snip>
>>I don't think it really matters much why memcpy is a string function.
What matters here, is that it is.

Why do you think so? The Standard doesn't say so, so what is your
justification for thinking memcpy is a string function?

Two reasons.

1 Because the standard's description of what the functions
which have names starting with mem, do,
is covered under this section of the standard:

7.21.1 String function conventions
Thank you. I'm sorry, but I don't find this particularly compelling...
2 and because K&R2 says so:

K&R2, Appendix B3:
There are two groups of string functions
defined in the header <string.h>.
The first have names beginning with str;
the second have names beginning with mem.
....and I find this even less compelling. The term "string function", as
applied to mem*, is a misnomer.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Sep 2 '08 #87
Richard Heathfield wrote:
vi******@gmail.com said:
>On Sep 2, 8:57 am, Richard Heathfield <r...@see.sig.invalid> wrote:
>>vipps...@gmail.com said:
On Sep 2, 8:23 am, Keith Thompson <ks...@mib.org> wrote:

<snip>
>>>>Perhaps, but in C string has a very specific meaning.
And "string literal" has a completely different one.
No, it doesn't. It is a specification, that's all. A string literal /is/
a string, /and/ it's a literal. Hence, string literal.
A string literal need not be a string; for example, "hello\0world"
is not a string.

Right. It's several strings. (I count twelve.) I should have said "a string
literal contains at least one string".

Now show me a string literal that doesn't contain *any* strings.
char hello[5] = "Hello";

--
Er*********@sun.com
Sep 2 '08 #88
On Tue, 02 Sep 2008 10:14:45 -0400, Eric Sosman <Er*********@sun.com>
wrote:
>Richard Heathfield wrote:
>vi******@gmail.com said:
>>On Sep 2, 8:57 am, Richard Heathfield <r...@see.sig.invalid> wrote:
vipps...@gmail.com said:
On Sep 2, 8:23 am, Keith Thompson <ks...@mib.org> wrote:

<snip>
>>>>>Perhaps, but in C string has a very specific meaning.
And "string literal" has a completely different one.
No, it doesn't. It is a specification, that's all. A string literal /is/
a string, /and/ it's a literal. Hence, string literal.
A string literal need not be a string; for example, "hello\0world"
is not a string.

Right. It's several strings. (I count twelve.) I should have said "a string
literal contains at least one string".

Now show me a string literal that doesn't contain *any* strings.

char hello[5] = "Hello";
hello is not a literal. The string literal used to initialize it, if
it does exist in the object module (it need not), will certainly
contain the terminating '\0'. The code that initializes hello with
the literal will not copy the '\0'.

--
Remove del for email
Sep 2 '08 #89
Barry Schwarz <sc******@dqel.com> wrote:
On Tue, 02 Sep 2008 10:14:45 -0400, Eric Sosman <Er*********@sun.com>
Richard Heathfield wrote:
Right. It's several strings. (I count twelve.) I should have said "a string
literal contains at least one string".

Now show me a string literal that doesn't contain *any* strings.
char hello[5] = "Hello";

hello is not a literal. The string literal used to initialize it, if
it does exist in the object module (it need not), will certainly
contain the terminating '\0'. The code that initializes hello with
the literal will not copy the '\0'.
static char hello[5]="Hello";

Richard
Sep 2 '08 #90
Barry Schwarz wrote:
On Tue, 02 Sep 2008 10:14:45 -0400, Eric Sosman <Er*********@sun.com>
wrote:
>Richard Heathfield wrote:
>>vi******@gmail.com said:
On Sep 2, 8:57 am, Richard Heathfield <r...@see.sig.invalid> wrote:
vipps...@gmail.com said:
>On Sep 2, 8:23 am, Keith Thompson <ks...@mib.org> wrote:
<snip>

>>Perhaps, but in C string has a very specific meaning.
>And "string literal" has a completely different one.
No, it doesn't. It is a specification, that's all. A string literal /is/
a string, /and/ it's a literal. Hence, string literal.
A string literal need not be a string; for example, "hello\0world"
is not a string.
Right. It's several strings. (I count twelve.) I should have said "a string
literal contains at least one string".

Now show me a string literal that doesn't contain *any* strings.
char hello[5] = "Hello";

hello is not a literal.
Agreed. Neither is char, [, 5, ], =, ;, or the white space.
Everything else in the source line is a string literal.
The string literal used to initialize it, if
it does exist in the object module (it need not), will certainly
contain the terminating '\0'.
Chapter and verse?
The code that initializes hello with
the literal will not copy the '\0'.
It certainly cannot "copy the '\0'," just as it cannot copy
a three-kilogram slab of luminiferous ether. As far as I can tell,
the Standard says the same thing about the existence of the former
and the latter, to wit, nothing at all.

--
Er*********@sun.com
Sep 2 '08 #91
Eric Sosman said:
Richard Heathfield wrote:
>vi******@gmail.com said:
>>On Sep 2, 8:57 am, Richard Heathfield <r...@see.sig.invalid> wrote:
vipps...@gmail.com said:
On Sep 2, 8:23 am, Keith Thompson <ks...@mib.org> wrote:

<snip>
>>>>>Perhaps, but in C string has a very specific meaning.
And "string literal" has a completely different one.
No, it doesn't. It is a specification, that's all. A string literal
/is/ a string, /and/ it's a literal. Hence, string literal.
A string literal need not be a string; for example, "hello\0world"
is not a string.

Right. It's several strings. (I count twelve.) I should have said "a
string literal contains at least one string".

Now show me a string literal that doesn't contain *any* strings.

char hello[5] = "Hello";
I refer you to 3.1.4 of C89: "A null character is then appended", or
6.4.5(5): "In translation phase 7, a byte or code of value zero is
appended to each multibyte character sequence that results from a string
literal or literals." This certainly applies to "Hello".

I also refer you to 3.5.7 of C89 or 6.7.8(14) of C99 - the wording that
follows is from C99, but C89 is identical except that it says "members"
rather than "elements": "An array of character type may be initialized by
a character string literal, optionally enclosed in braces. Successive
characters of the character string literal (including the terminating null
character if there is room or if the array is of unknown size) initialize
the elements of the array."

This sentence clearly indicates that a string literal has a terminating
null character.

Care to try again?

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Sep 2 '08 #92
Eric Sosman said:
Barry Schwarz wrote:
>On Tue, 02 Sep 2008 10:14:45 -0400, Eric Sosman <Er*********@sun.com>
wrote:
>>Richard Heathfield wrote:
<snip>
>>>Now show me a string literal that doesn't contain *any* strings.
char hello[5] = "Hello";

hello is not a literal.

Agreed. Neither is char, [, 5, ], =, ;, or the white space.
Everything else in the source line is a string literal.
Agreed. And that string literal contains a string.
> The string literal used to initialize it, if
it does exist in the object module (it need not), will certainly
contain the terminating '\0'.

Chapter and verse?
Never mind the object module. That (probably!) isn't written in C. It's the
source that counts. See my parallel reply for C&V on whether "Hello"
contains a string.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Sep 2 '08 #93
James Kuyper <ja*********@verizon.net> writes:
Ben Bacarisse wrote:
>James Kuyper <ja*********@verizon.net> writes:
>>Tim Rentsch wrote:
...
The very same argument applies if we don't use xa but just use
a directly; the value '(void*)a' has the same extent as the
array a. And so must '(double*)a', or any other pointer
conversion of a, provided of course that alignment requirements
are satisfied.
That's where your argument breaks down. A double* is governed by rules
about the limits of pointer addition that char* is specifically
exempted from, and which are meaningless for void*. I've described the
problem in more detail in another branch of this discussion, so I
won't repeat the description here.

I can't find the other message so I'll have to ask here. Are you
saying that converting a double * to, say, unsigned char * permits one
to access parts of an array that are not accessible via the double *?
What are the limits on addition that one is exempted from and where is
this permission granted?

The description of how addition of an integer to a pointer works in
6.5.6p8 is entirely in terms of positions within (and one beyond the
end of) an array of the pointed-at type. The only array of 'double'
declared anywhere in the program that contains the position pointed at
by (double*)a is a[0]. Therefore, the limits based upon array length
imposed by 6.5.6p8 refer to the length of the array a[0] (which is 1),
not the length of 'a' itself (which is 2), and not the length of a
1-dimensional array which could have been allocated in the same memory
as 'a', which would have had a length of 2x1 == 2.
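
As a minimal sketch of a traversal that stays inside the limits described above, the function below indexes each inner array separately, so no double* is ever advanced further than one past the end of a[i]. The name `sum_rowwise` and the `COLS` macro are hypothetical and chosen only for illustration; whether the flattened `(double *)a` walk is also valid is exactly what this thread disputes.

```c
#include <stddef.h>

#define COLS 1  /* inner dimension, matching double array[2][1] */

/* Sums a rows x COLS array by subscripting each inner array on its own.
   Each a[i][j] is computed from a pointer into a[i], and j never exceeds
   the length of that inner array. */
double sum_rowwise(double (*a)[COLS], size_t rows)
{
    double total = 0.0;
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < COLS; j++)
            total += a[i][j];
    }
    return total;
}
```

Calling `sum_rowwise(array, 2)` on the array from the original post visits the same two elements as f() does, without converting the array to a double*.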

The special characteristic of unsigned char* is that 6.2.6.1p4 defines
C's object model, for non-character types with a size of 'n' bytes, in
terms of treating them as arrays of n unsigned chars - such an array
is defined as constituting the _object representation_ of such
types. The standard guarantees that an object may be copied into such
an array, and gives memcpy() as an example of how this may be
done. The fact that memcpy is given as an example, rather than
requiring the use of memcpy() to perform such a copy, implies that if
I were to write a function named my_memcpy() which matched the other
specifications given by the standard for memcpy(), it must also be
usable for copying the object representation. In other words, the
ability to copy object representations is not a magical additional
ability of memcpy(), but is merely a side-effect of the fact that the
behavior of memcpy() is defined in terms of copying arrays of unsigned
char.

If run-time bounds checking were applied to unsigned char* in the most
extreme way otherwise permitted by 6.5.6p8, it would prevent
my_memcpy() from working properly. The undefined behavior allowed by
6.5.6p8 for adding too large of an integer value to a pointer value,
is trumped by the defined behavior provided by 6.2.6.1p4 for copying
an entire object as an array of unsigned char, even if that object is
a multi-dimensional array of some other type.
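
A my_memcpy() of the kind described could look like the sketch below. It is one possible byte-wise implementation, not the library's; like memcpy(), it assumes the two regions do not overlap.

```c
#include <stddef.h>

/* Copies n bytes from src to dst through unsigned char pointers,
   treating both objects as arrays of unsigned char (their object
   representations). Returns dst, as memcpy() does. */
void *my_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}
```

With `double a[2][1]` and `double b[2][1]`, the call `my_memcpy(b, a, sizeof a)` walks an unsigned char* across both inner arrays, which is the case the argument above says must work.
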
If I understand your point of view, you are saying that any bounds
checking applied to character pointers must be relaxed so that it
applies (if at all) only to the largest enclosing object, because a
char * can range over the whole object's representation.

I like your line of reasoning and I find it persuasive.

To clarify: in 6.5.6p9 (about pointer difference) you presumably take
the view that &a[0][0] and &a[2][0] are not "in the same array" and
therefore can't be subtracted (or compared)? This is entirely
logical, given your argument above, but rather counter-intuitive given
the normal meaning of the term.
>You go on to say that these limits are meaningless for void *, but at
some point, useful void *s are converted back. Do the limits that are
then imposed derive from the original pointer or can they get lost?
I.e. is (double *)(void *)dp different in what it can access to
(double *)(void *)(unsigned char *)dp?

The standard is not at all clear about what happens in most pointer
conversions, including those. However, I see no technical difficulty
with retaining the bounds-checking information inside a pointer
throughout a long series of intermediate pointer conversions. The
bounds-checking information cannot be used if the current pointer type
is "unsigned char*", and it's meaningless if the current pointer type
is "void*",
I don't think it is meaningless for void *s. It could, for example,
be used to check if a comparison is defined or not. The relational
operators are defined in terms of array elements but not in relation
to pointer arithmetic, so comparing void *s is permitted and the
bounds information could be used to raise an error in those situations
where the objects pointed to are not in the same array.
but it can still reside inside such pointers, hidden, waiting
for conversion to a type which does permit run-time
bounds-checking. It is 6.5.6p8 which makes such bounds-checking legal.
--
Ben.
Sep 2 '08 #94
Ben Bacarisse wrote:
James Kuyper <ja*********@verizon.net> writes:
....
The description of how addition of an integer to a pointer works in
6.5.6p8 is entirely in terms of positions within (and one beyond the
end of) an array of the pointed-at type. The only array of 'double'
declared anywhere in the program that contains the position pointed at
by (double*)a is a[0]. Therefore, the limits based upon array length
imposed by 6.5.6p8 refer to the length of the array a[0] (which is 1),
not the length of 'a' itself (which is 2), and not the length of a
1-dimensional array which could have been allocated in the same memory
as 'a', which would have had a length of 2x1 == 2.

The special characteristic of unsigned char* is that 6.2.6.1p4 defines
C's object model, for non-character types with a size of 'n' bytes, in
terms of treating them as arrays of n unsigned chars - such an array
is defined as constituting the _object representation_ of such
types. The standard guarantees that an object may be copied into such
an array, and gives memcpy() as an example of how this may be
done. The fact that memcpy is given as an example, rather than
requiring the use of memcpy() to perform such a copy, implies that if
I were to write a function named my_memcpy() which matched the other
specifications given by the standard for memcpy(), it must also be
usable for copying the object representation. In other words, the
ability to copy object representations is not a magical additional
ability of memcpy(), but is merely a side-effect of the fact that the
behavior of memcpy() is defined in terms of copying arrays of unsigned
char.

If run-time bounds checking were applied to unsigned char* in the most
extreme way otherwise permitted by 6.5.6p8, it would prevent
my_memcpy() from working properly. The undefined behavior allowed by
6.5.6p8 for adding too large of an integer value to a pointer value,
is trumped by the defined behavior provided by 6.2.6.1p4 for copying
an entire object as an array of unsigned char, even if that object is
a multi-dimensional array of some other type.

If I understand your point of view, you are saying that any bounds
checking applied to character pointers must be relaxed so that it
applies (if at all) only to the largest enclosing object, because a
char * can range over the whole object's representation.
Exactly.
I like your line of reasoning and I find it persuasive.

To clarify: in 6.5.6p9 (about pointer difference) you presumably take
the view that &a[0][0] and &a[2][0] are not "in the same array" and
therefore can't be subtracted (or compared)? This is entirely
logical, given your argument above, but rather counter-intuitive given
the normal meaning of the term.
Yes, that is how I interpret 6.5.6p9.
The standard is not at all clear about what happens in most pointer
conversions, including those. However, I see no technical difficulty
with retaining the bounds-checking information inside a pointer
throughout a long series of intermediate pointer conversions. The
bounds-checking information cannot be used if the current pointer type
is "unsigned char*", and it's meaningless if the current pointer type
is "void*",

I don't think it is meaningless for void *s. It could, for example,
be used to check if a comparison is defined or not.
You're right; I wasn't thinking about comparisons. Also, I had
forgotten that the behavior was undefined when comparing pointers that
do not point into (or one past the end of) the same array - I'd
thought it produced an unspecified result. Offhand, I can't think of
any reason why it's undefined behavior, but maybe someone else can
come up with an example of an implementation where it would have been
problematic to have such comparisons merely return an unspecified
value.
Sep 2 '08 #95
On 1 Sep 2008 at 23:01, Keith Thompson wrote:
pete <pf*****@mindspring.com> writes:
>Keith Thompson wrote:
>>memcpy, for example, is
a function that doesn't have anything directly to do with strings.

Has it occurred to you that "library functions" are called that,
because they are part of the library,
and not because of what they do?

Of course.

So you're saying that memcpy is a string function because it's
declared in <string.h>? The same reasoning implies that size_t is a
string type, and NULL is a string macro.
FFS... it must be, like, 6 months or something since we last went
through this completely absurd argument that excites such passion in
the breasts of the clc pedants club.

Sep 2 '08 #96
On Tue, 02 Sep 2008 14:41:27 GMT, rl*@hoekstra-uitgeverij.nl (Richard
Bos) wrote:
>Barry Schwarz <sc******@dqel.com> wrote:
>On Tue, 02 Sep 2008 10:14:45 -0400, Eric Sosman <Er*********@sun.com>
>Richard Heathfield wrote:
Right. It's several strings. (I count twelve.) I should have said "a string
literal contains at least one string".

Now show me a string literal that doesn't contain *any* strings.

char hello[5] = "Hello";

hello is not a literal. The string literal used to initialize it, if
it does exist in the object module (it need not), will certainly
contain the terminating '\0'. The code that initializes hello with
the literal will not copy the '\0'.

static char hello[5]="Hello";
I don't see that this makes any difference.

--
Remove del for email
Sep 2 '08 #97
On Tue, 02 Sep 2008 11:09:37 -0400, Eric Sosman <Er*********@sun.com>
wrote:
>Barry Schwarz wrote:
>On Tue, 02 Sep 2008 10:14:45 -0400, Eric Sosman <Er*********@sun.com>
wrote:
>>Richard Heathfield wrote:
vi******@gmail.com said:
On Sep 2, 8:57 am, Richard Heathfield <r...@see.sig.invalid> wrote:
>vipps...@gmail.com said:
>>On Sep 2, 8:23 am, Keith Thompson <ks...@mib.org> wrote:
<snip>

>>>Perhaps, but in C string has a very specific meaning.
>>And "string literal" has a completely different one.
>No, it doesn't. It is a specification, that's all. A string literal /is/
>a string, /and/ it's a literal. Hence, string literal.
A string literal need not be a string; for example, "hello\0world"
is not a string.
Right. It's several strings. (I count twelve.) I should have said "a string
literal contains at least one string".

Now show me a string literal that doesn't contain *any* strings.
char hello[5] = "Hello";

hello is not a literal.

Agreed. Neither is char, [, 5, ], =, ;, or the white space.
Everything else in the source line is a string literal.
> The string literal used to initialize it, if
it does exist in the object module (it need not), will certainly
contain the terminating '\0'.

Chapter and verse?
6.4.5-5 seems to fit. 6.4.5-6 adds confirmation.
>
> The code that initializes hello with
the literal will not copy the '\0'.

It certainly cannot "copy the '\0'," just as it cannot copy
a three-kilogram slab of luminiferous ether. As far as I can tell,
the Standard says the same thing about the existence of the former
and the latter, to wit, nothing at all.
So you think the initialization of an automatic array of char does not
involve any code to copy the initial value into the array? How does
recursion work if code is not involved?

--
Remove del for email
Sep 2 '08 #98
On Tue, 02 Sep 2008 06:26:22 +0000, Richard Heathfield wrote:
Now show me a string literal that doesn't contain *any* strings.
#if 0
"Hello"
#endif
Sep 2 '08 #99
Barry Schwarz wrote:
On Tue, 02 Sep 2008 11:09:37 -0400, Eric Sosman <Er*********@sun.com>
wrote:
Barry Schwarz wrote:
On Tue, 02 Sep 2008 10:14:45 -0400, Eric Sosman <Er*********@sun.com>
wrote:
....
> char hello[5] = "Hello";

hello is not a literal.
Agreed. Neither is char, [, 5, ], =, ;, or the white space.
Everything else in the source line is a string literal.
The string literal used to initialize it, if
it does exist in the object module (it need not), will certainly
contain the terminating '\0'.
Chapter and verse?

6.4.5-5 seems to fit. 6.4.5-6 adds confirmation.
That applies only to code like

char *hello = "Hello";

When a string literal is used as an initializer for a char array,
6.7.8p14 is the relevant clause, and according to that clause a '\0'
comes into play only if the array has enough room for it, or if
the array size is unknown. In this case, the array has a known length
of 5, which is not sufficient room for the terminating null; therefore
a terminating null is not even required to exist.
It certainly cannot "copy the '\0'," just as it cannot copy
a three-kilogram slab of luminiferous ether. As far as I can tell,
the Standard says the same thing about the existence of the former
and the latter, to wit, nothing at all.

So you think the initialization of an automatic array of char does not
involve any code to copy the initial value into the array? How does
recursion work if code is not involved?
He didn't say that code was not involved, he said that a '\0' was not
involved. One obvious possibility is that the initialization is
achieved by copying 5 chars from some location which need not contain
a null character after the fifth char.
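
The difference 6.7.8p14 makes can be observed in the initialized arrays themselves. The sketch below is C only (C++ rejects the exact-fit initializer), and the function names are hypothetical. It says nothing about what an implementation keeps in the object module, only about what ends up in the array.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if a null character occurs within the first n bytes of buf. */
int has_terminator(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}

/* Known size 5: there is no room for the terminating null, so only the
   five characters 'H' 'e' 'l' 'l' 'o' initialize the array. */
int exact_fit_has_terminator(void)
{
    char hello[5] = "Hello";
    return has_terminator(hello, sizeof hello);
}

/* Unknown size: the terminating null is included, giving 6 elements. */
int unsized_has_terminator(void)
{
    char hello[] = "Hello";
    return has_terminator(hello, sizeof hello);
}
```

The first function returns 0 and the second returns 1, so the exact-fit array does not hold a string even though it was initialized from a string literal.
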
Sep 2 '08 #100
