multidimensional arrays and cute tricks

Charles Banas

i've got an interesting peice of code i'm maintaining, and i'd like to
get some opinions and comments on it, hopefully so i can gain some
sort of insight as to why this works.

at the top of the function (which was translated from Fortran code),
among other heinous and numerous declarations, is this bit:

static float bbuff[TBUF_SIZE][60][60];
static int bkey[TBUF_SIZE];
static int buse[TBUF_SIZE];
static int bmsg[TBUF_SIZE];
static int bptr;
int ii, jj;
float buff[60][60];

register char *r1buff;
register char *r2buff;

the purpose of bbuff is to make available multiple terrain buffers so
that it doesn't have to repeatedly read the terrain data file (which
is small and insignificant, so it's not that much of a bottleneck).
performance aside, here's a snippet from a section that returns one of
the full buffers in lieu of reading the original data file:

if (stat&TBUF_USE)
for (i = 0;i < TBUF_SIZE;i++)
{
if ((*key == bkey[i]) && buse[i])
{
r1buff = (char *) & (buff[0][0]);
r2buff = (char *) & (bbuff[i][0][0]);

for (j = 0;j < 3600;j++, r1buff++, r2buff++)
*r1buff = *r2buff;

/* memcpy((char *)&(buff[0][0]),
(char *)&(bbuff[i][0][0]),
sizeof(float)*3600); */

/* transpose index buffer */
for (ii = 0; ii < 60 ; ii++)
for (jj = 0; jj < 60; jj++)
sbuff[jj][ii] = buff[ii][jj];
if (fd != NULL)
fclose(fd);
return (bmsg[i]);
}
}

now you can see that the memcpy was originally tried, but later
dumped, likely because of how C arrays work. but the for loop caught
my attention: they're basically just copying bewteen two
de-referenced pointers. is this kosher? i know it works with both
Visual C++ 6 and GCC 3.3, but to me, it doesn't look like it should
work. can someone explain to me why the buffer copies correctly and
successfully despite the rather strange-looking method?

Nov 14 '05 #1

Subscribe Reply

6649

Chris Torek

In article <news:a7************************@posting.google.co m>
Charles Banas <gr******@the-junkyard.net> writes:
[snippage; just key parts retained]

static float bbuff[TBUF_SIZE][60][60];
float buff[60][60];
register char *r1buff;
register char *r2buff;

r1buff = (char *) & (buff[0][0]);
r2buff = (char *) & (bbuff[i][0][0]);

for (j = 0;j < 3600;j++, r1buff++, r2buff++)
*r1buff = *r2buff;

/* memcpy((char *)&(buff[0][0]),
(char *)&(bbuff[i][0][0]),
sizeof(float)*3600); */

now you can see that the memcpy was originally tried, but later
dumped, likely because of how C arrays work. but the for loop caught
my attention: they're basically just copying bewteen two
de-referenced pointers. is this kosher?
Let me answer this in code order, rather than question order.

First, the loop version runs j from 0 up to (but not including)
3600, and 3600 is 60*60. The two pointers here have type "char *",
and the loop is indeed similar to the commented-out memcpy --
but they are also quite different. The "for" loop copies 3600
bytes (C bytes, "sizeof char" being 1 even if CHAR_BIT > 8).

If the memcpy() call were to be uncommented, it would copy
3600 * sizeof(float) bytes.

These are the same if and only if sizeof(float) is 1. On most
implementations, including all those supported by GCC, sizeof(float)
is at least 4.

As for whether the loop is "kosher", the strict ANSI/ISO C answer
is apparently "no". After looping over the first 60*sizeof(float)
bytes in the arrays named buff[0] and bbuff[i][0], the next accesses
are to elements in adjacent but different arrays, buff[1] and
bbuff[i][1]. Some over in comp.std.c say this is allowed to
fail with an "out of array bounds" type of exception. Somewhat
more realistically, plain char might be signed char, which might
have problems copying data -- on ones' complement systems like
the Univac, all-1s bit patterns (-0) might be turned into all-0s
(+0), ruining the "float" value. This part is easily fixed, by
using "unsigned char *" instead of plain "char *".

Certainly the memcpy() call, and most likely even just use of
"unsigned char *", will avoid any sort of "out of array bounds"
exception that might be allowed. (I remain somewhat skeptical that
they even *are* allowed.) On most practical machines, too, even
plain "char *" copies will "work right".
i know it works with both
Visual C++ 6 and GCC 3.3, but to me, it doesn't look like it should
work. can someone explain to me why the buffer copies correctly and
successfully despite the rather strange-looking method?

The fact that C does not *have* multidimensional arrays is important.
Instead, C has arrays whose elements are in turn arrays. Here
bbuff is an object whose type is "array TBUF_SIZE of <something>",
and the <something> is in turn another array type: "array 60 of
<something2>". Here <something2> is yet another array type: "array
60 of float", and now it stops, because "float" is a fundamental
C type. Arrays in C (which are by definition "1 dimensional" even
if the element-type is another array) are just contiguously-arranged
elements: a[0] is immediately followed in memory by a[1], which is
immediately followed by a[2], and so on. (This is why so many
"struct"s have padding built into them at the end: in order to go
into an array, a[1] *must* come right after a[0], so if a[0] is a
struct, and that struct would need padding between adjacent array
elements, the padding must be part of the struct type. The array
is not allowed to introduce the padding!)

Now, if a[0] is immediately followed by a[1], and if a[0] is itself
an array of (say) size 2, then a[0][0] is immediately followed by
a[0][1], which -- logically -- must be immediately followed by
a[1][0]. In other words, the memory-picture:

+-----------------------+
a: | a[0][0] | a[0][1] |
| a[1][0] | a[1][1] |
+-----------------------+

can always be "linearized":

+-----------------------------------------------+
a: | a[0][0] | a[0][1] | a[1][0] | a[1][1] |
+-----------------------------------------------+

In the above code, copying 3600 / sizeof(float) -- probably 3600/4
or 900 -- elements starting from bbuff[i][0][0] will copy bbuff[i][0][0]
through bbuff[i][14][59] into buff[0][0] through buff[14][59]
respectively (assuming the 900 number is correct; and note that
900 is 15 * 60). Apparently this suffices for whatever tasks are
being run now.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Nov 14 '05 #2

Barry Schwarz

On 21 Jan 2004 13:22:36 -0800, gr******@the-junkyard.net (Charles
Banas) wrote:

i've got an interesting peice of code i'm maintaining, and i'd like to
get some opinions and comments on it, hopefully so i can gain some
sort of insight as to why this works.

at the top of the function (which was translated from Fortran code),
among other heinous and numerous declarations, is this bit:

static float bbuff[TBUF_SIZE][60][60];
static int bkey[TBUF_SIZE];
static int buse[TBUF_SIZE];
static int bmsg[TBUF_SIZE];
static int bptr;
int ii, jj;
float buff[60][60];

register char *r1buff;
register char *r2buff;

the purpose of bbuff is to make available multiple terrain buffers so
that it doesn't have to repeatedly read the terrain data file (which
is small and insignificant, so it's not that much of a bottleneck).
performance aside, here's a snippet from a section that returns one of
the full buffers in lieu of reading the original data file:

if (stat&TBUF_USE)
for (i = 0;i < TBUF_SIZE;i++)
{
if ((*key == bkey[i]) && buse[i])
{
r1buff = (char *) & (buff[0][0]);
r2buff = (char *) & (bbuff[i][0][0]);

for (j = 0;j < 3600;j++, r1buff++, r2buff++)
*r1buff = *r2buff;

/* memcpy((char *)&(buff[0][0]),
(char *)&(bbuff[i][0][0]),
sizeof(float)*3600); */
Since r1buff and r2buff are char*, the loop only copies 3600 bytes,
not 3600 floats. If you want to copy 3600 floats, delete the loop and
uncomment the call to memcpy.

Since memcpy takes void*, the casts are completely superfluous.

/* transpose index buffer */
for (ii = 0; ii < 60 ; ii++)
for (jj = 0; jj < 60; jj++)
sbuff[jj][ii] = buff[ii][jj];
if (fd != NULL)
fclose(fd);
return (bmsg[i]);
}
}

now you can see that the memcpy was originally tried, but later
dumped, likely because of how C arrays work. but the for loop caught
This comment makes no sense. The for loop is exactly the same as the
memcpy except it is off by a factor of sizeof(float).
my attention: they're basically just copying bewteen two
de-referenced pointers. is this kosher? i know it works with both
Visual C++ 6 and GCC 3.3, but to me, it doesn't look like it should
work. can someone explain to me why the buffer copies correctly and
successfully despite the rather strange-looking method?

Actually, that particular for loop is almost exactly how many books
attempt to explain memcpy (usually as some kind of exercise). Please
voice a specific concern that we can address. Why do you think it
should not work?
<<Remove the del for email>>

Nov 14 '05 #3

Charles Banas

Barry Schwarz <sc******@deloz.net> wrote in message news:<bu**********@216.39.134.8>...

Since r1buff and r2buff are char*, the loop only copies 3600 bytes,
not 3600 floats. If you want to copy 3600 floats, delete the loop and
uncomment the call to memcpy.
i'm wary of changing the code because of the function it performs.
Since memcpy takes void*, the casts are completely superfluous.
not surprising, becuase i don't think the programmer who "fixed" it
knew what he was doing. i won't name him specifically, but he works
at the FCC.

/* transpose index buffer */
for (ii = 0; ii < 60 ; ii++)
for (jj = 0; jj < 60; jj++)
sbuff[jj][ii] = buff[ii][jj];
if (fd != NULL)
fclose(fd);
return (bmsg[i]);
}
}

now you can see that the memcpy was originally tried, but later
dumped, likely because of how C arrays work. but the for loop caught

This comment makes no sense.

i'll rephrase then: i think the memcpy was dumped because C arrays
aren't always stored linearly.
The for loop is exactly the same as the
memcpy except it is off by a factor of sizeof(float).

which is something i missed when i looked through it.

my attention: they're basically just copying bewteen two
de-referenced pointers. is this kosher? i know it works with both
Visual C++ 6 and GCC 3.3, but to me, it doesn't look like it should
work. can someone explain to me why the buffer copies correctly and
successfully despite the rather strange-looking method?

Actually, that particular for loop is almost exactly how many books
attempt to explain memcpy (usually as some kind of exercise). Please
voice a specific concern that we can address. Why do you think it
should not work?

it's a multidimensional array. so in theory, it's not copying the
data verbatim (as it would have in the original Fortran), but rather
is copying the pointers in addition to the data. but i may be
misunderstanding the layout in memory.

are the pointers stored in an array in memory immediately before the
actual data? if so, that would make sense. but that still relies
heavily on the data being linearly stored in one location. something
that surprised me and left me quite unsettled.

Nov 14 '05 #4

Charles Banas

Chris Torek <no****@torek.net> wrote in message news:<bu*********@enews1.newsguy.com>...

In article <news:a7************************@posting.google.co m>
Charles Banas <gr******@the-junkyard.net> writes:
[snippage; just key parts retained]
static float bbuff[TBUF_SIZE][60][60];
float buff[60][60];
register char *r1buff;
register char *r2buff;

r1buff = (char *) & (buff[0][0]);
r2buff = (char *) & (bbuff[i][0][0]);

for (j = 0;j < 3600;j++, r1buff++, r2buff++)
*r1buff = *r2buff;

/* memcpy((char *)&(buff[0][0]),
(char *)&(bbuff[i][0][0]),
sizeof(float)*3600); */

now you can see that the memcpy was originally tried, but later
dumped, likely because of how C arrays work. but the for loop caught
my attention: they're basically just copying bewteen two
de-referenced pointers. is this kosher?
Let me answer this in code order, rather than question order.

First, the loop version runs j from 0 up to (but not including)
3600, and 3600 is 60*60. The two pointers here have type "char *",
and the loop is indeed similar to the commented-out memcpy --
but they are also quite different. The "for" loop copies 3600
bytes (C bytes, "sizeof char" being 1 even if CHAR_BIT > 8).

If the memcpy() call were to be uncommented, it would copy
3600 * sizeof(float) bytes.

i missed that, but i see it now.
These are the same if and only if sizeof(float) is 1. On most
implementations, including all those supported by GCC, sizeof(float)
is at least 4.

As for whether the loop is "kosher", the strict ANSI/ISO C answer
is apparently "no". After looping over the first 60*sizeof(float)
bytes in the arrays named buff[0] and bbuff[i][0], the next accesses
are to elements in adjacent but different arrays, buff[1] and
bbuff[i][1]. Some over in comp.std.c say this is allowed to
fail with an "out of array bounds" type of exception. Somewhat
more realistically, plain char might be signed char, which might
have problems copying data -- on ones' complement systems like
the Univac, all-1s bit patterns (-0) might be turned into all-0s
(+0), ruining the "float" value. This part is easily fixed, by
using "unsigned char *" instead of plain "char *".
i hadn't noticed that either. but strangely, the code seems to work
fine, returning expected values.

it may surprise you, but i'm instructed not to change this code -
because it's the code we were given by the FCC for the purpose of
calculating height above average terrain, and this is dealing with
60x60 blocks of actual terrain data.
Certainly the memcpy() call, and most likely even just use of
"unsigned char *", will avoid any sort of "out of array bounds"
exception that might be allowed. (I remain somewhat skeptical that
they even *are* allowed.) On most practical machines, too, even
plain "char *" copies will "work right".
which is why i was concerned. it "works right" now, but my concern is
whether the multidimensional array is actually consistently stored in
memory in proper order in every case. it's not something i would want
to depend on. but it seems to have worked on a VAX, a Sun
SparcStation, and on our Windows machines.

i know it works with both
Visual C++ 6 and GCC 3.3, but to me, it doesn't look like it should
work. can someone explain to me why the buffer copies correctly and
successfully despite the rather strange-looking method?

The fact that C does not *have* multidimensional arrays is important.
Instead, C has arrays whose elements are in turn arrays. Here
bbuff is an object whose type is "array TBUF_SIZE of <something>",
and the <something> is in turn another array type: "array 60 of
<something2>". Here <something2> is yet another array type: "array
60 of float", and now it stops, because "float" is a fundamental
C type. Arrays in C (which are by definition "1 dimensional" even
if the element-type is another array) are just contiguously-arranged
elements: a[0] is immediately followed in memory by a[1], which is
immediately followed by a[2], and so on. (This is why so many
"struct"s have padding built into them at the end: in order to go
into an array, a[1] *must* come right after a[0], so if a[0] is a
struct, and that struct would need padding between adjacent array
elements, the padding must be part of the struct type. The array
is not allowed to introduce the padding!)

aha! i didn't know this was defined behavior. it's not behavior i
would have depended on, at least.
Now, if a[0] is immediately followed by a[1], and if a[0] is itself
an array of (say) size 2, then a[0][0] is immediately followed by
a[0][1], which -- logically -- must be immediately followed by
a[1][0]. In other words, the memory-picture:

+-----------------------+
a: | a[0][0] | a[0][1] |
| a[1][0] | a[1][1] |
+-----------------------+

can always be "linearized":

+-----------------------------------------------+
a: | a[0][0] | a[0][1] | a[1][0] | a[1][1] |
+-----------------------------------------------+

In the above code, copying 3600 / sizeof(float) -- probably 3600/4
or 900 -- elements starting from bbuff[i][0][0] will copy bbuff[i][0][0]
through bbuff[i][14][59] into buff[0][0] through buff[14][59]
respectively (assuming the 900 number is correct; and note that
900 is 15 * 60). Apparently this suffices for whatever tasks are
being run now.

apparently, but again, it's not something i would have depended on.
there are cases where an array of arrays may not be linearly addressed
in memory, but in fact may be out of order.

but i am quite surprised i didn't notice the casts to (char *). this
code should in fact be copying all 3600 elements, and not just 900. i
don't know why it was changed, and it "works" now, so i don't really
have authority to "fix" it.

but thank you for clarifying some things i misunderstood.

Nov 14 '05 #5

Chris Torek

In article <a7**************************@posting.google.com >
Charles Banas <gr******@the-junkyard.net> writes:

i'll rephrase then: i think the memcpy was dumped because C arrays
aren't always stored linearly.
C arrays *are* always linear. C imposes a "locally flat" memory
structure upon all systems that implement C. There is no guarantee
that one array object's address space be comparable in any way with
some other array object's, but each object lives within some
(perhaps small-as-possible and widely-scattered) "flat" region.
it's a multidimensional array. so in theory, it's not copying the
data verbatim (as it would have in the original Fortran), but rather
is copying the pointers in addition to the data. but i may be
misunderstanding the layout in memory.

are the pointers stored in an array in memory immediately before the
actual data?

There are no pointers in memory here, only the array. ("There is
no Dana, only Zuul". :-) ) When you mention the name of an array
in a value context, the C compiler must *construct* a pointer value.
See <http://web.torek.net/torek/c/pa.html>.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Nov 14 '05 #6

Charles Banas

Chris Torek <no****@torek.net> wrote in message news:<bu*********@enews2.newsguy.com>...

In article <a7**************************@posting.google.com >
Charles Banas <gr******@the-junkyard.net> writes:
i'll rephrase then: i think the memcpy was dumped because C arrays
aren't always stored linearly.

C arrays *are* always linear. C imposes a "locally flat" memory
structure upon all systems that implement C. There is no guarantee
that one array object's address space be comparable in any way with
some other array object's, but each object lives within some
(perhaps small-as-possible and widely-scattered) "flat" region.

let me see if i understand correctly. for:

int array[2][60];
int *parray = array;

then &parray[60] == &array[1][0] in all cases?

it's a multidimensional array. so in theory, it's not copying the
data verbatim (as it would have in the original Fortran), but rather
is copying the pointers in addition to the data. but i may be
misunderstanding the layout in memory.

are the pointers stored in an array in memory immediately before the
actual data?

There are no pointers in memory here, only the array. ("There is
no Dana, only Zuul". :-) ) When you mention the name of an array
in a value context, the C compiler must *construct* a pointer value.
See <http://web.torek.net/torek/c/pa.html>.

that makes sense for arrays allocated on the stack, but honestly, i
didn't know this was the case.

thanks!

-- Charles Banas

Nov 14 '05 #7

Barry Schwarz

On 23 Jan 2004 12:13:29 -0800, gr******@the-junkyard.net (Charles
Banas) wrote:

Chris Torek <no****@torek.net> wrote in message news:<bu*********@enews2.newsguy.com>...
In article <a7**************************@posting.google.com >
Charles Banas <gr******@the-junkyard.net> writes:
>i'll rephrase then: i think the memcpy was dumped because C arrays
>aren't always stored linearly.
C arrays *are* always linear. C imposes a "locally flat" memory
structure upon all systems that implement C. There is no guarantee
that one array object's address space be comparable in any way with
some other array object's, but each object lives within some
(perhaps small-as-possible and widely-scattered) "flat" region.

let me see if i understand correctly. for:

int array[2][60];
int *parray = array;

This is a syntax error. parray is an int*. In this context, array
evaluates to a pointer to the first element with type pointer to
element. The first element of array is array[0] which has type
pointer to array of 60 int. In order for this assignment to work, you
would need
int (*parray)[60] = array;

then &parray[60] == &array[1][0] in all cases?
Once your premise is wrong, your conclusions fall apart.

>it's a multidimensional array. so in theory, it's not copying the
>data verbatim (as it would have in the original Fortran), but rather
>is copying the pointers in addition to the data. but i may be
>misunderstanding the layout in memory.
>
>are the pointers stored in an array in memory immediately before the
>actual data?

There are no pointers in memory here, only the array. ("There is
no Dana, only Zuul". :-) ) When you mention the name of an array
in a value context, the C compiler must *construct* a pointer value.
See <http://web.torek.net/torek/c/pa.html>.

that makes sense for arrays allocated on the stack, but honestly, i
didn't know this was the case.

thanks!

-- Charles Banas

<<Remove the del for email>>

Nov 14 '05 #8

glen herrmannsfeldt

Charles Banas wrote:

Chris Torek <no****@torek.net> wrote in message news:<bu*********@enews2.newsguy.com>...
(snip)

i'll rephrase then: i think the memcpy was dumped because C arrays
aren't always stored linearly.
In the sense given here, they are always stored linearly.

C arrays *are* always linear. C imposes a "locally flat" memory
structure upon all systems that implement C. There is no guarantee
that one array object's address space be comparable in any way with
some other array object's, but each object lives within some
(perhaps small-as-possible and widely-scattered) "flat" region.
(snip)
let me see if i understand correctly. for:

int array[2][60];
int *parray = array;

then &parray[60] == &array[1][0] in all cases?
I believe you have to say it in the form

int *parray=(int*)array;

That is sort of saying "do it anyway, even if I shouldn't".
it's a multidimensional array. so in theory, it's not copying the
data verbatim (as it would have in the original Fortran), but rather
is copying the pointers in addition to the data. but i may be
misunderstanding the layout in memory. are the pointers stored in an array in memory immediately before the
actual data?

There are no pointers in memory here, only the array. ("There is
no Dana, only Zuul". :-) ) When you mention the name of an array
in a value context, the C compiler must *construct* a pointer value.
See <http://web.torek.net/torek/c/pa.html>.

If you say array[1], which is the same as &array[1][0], the
compiler will create the proper pointer value, but there are
(normally) not any pointers stored in memory.

I was told that some Fortran compilers on the PDP-11 would
index arrays through pointers, instead of calculating the
offset using multiplication. This was transparent to the
user, the array itself must be allocated contiguously, as
it must in C. I presume C compilers could create such a hidden
pointer array to speed up the calculation (on machines with
a slow multiply), but it isn't normally done, and you shouldn't
be able to tell if it were.
that makes sense for arrays allocated on the stack, but honestly, i
didn't know this was the case.

Well, there is another way to do this problem, which is for
the program itself to allocate pointers. In that case, you
would have something declared

float ***a;

Which could still be referenced as a[i][j][k].

Personally, I disagree with Chris that C doesn't have multidimensional
arrays, though he is likely right. They may call them by a
different name, but they have the properties that multidimensional
arrays should have.

-- glen

Nov 14 '05 #9

Dave Thompson

On 24 Jan 2004 00:10:58 GMT, Barry Schwarz <sc******@deloz.net> wrote:

On 23 Jan 2004 12:13:29 -0800, gr******@the-junkyard.net (Charles
Banas) wrote:
Chris Torek <no****@torek.net> wrote in message news:<bu*********@enews2.newsguy.com>... <snip>
C arrays *are* always linear. <snip>
let me see if i understand correctly. for:

int array[2][60];
int *parray = array;
This is a syntax error. parray is an int*. In this context, array

Technically it's a constraint violation not a syntax error. But it is
wrong for the reason you state.
evaluates to a pointer to the first element with type pointer to
element. The first element of array is array[0] which has type
pointer to array of 60 int. In order for this assignment to work, you
would need
int (*parray)[60] = array;

Or, more likely, int *parray = array[0] /* or equivalent and arguably
clearer &array[0][0] */.

then &parray[60] == &array[1][0] in all cases?

I believe this is officially guaranteed by the new wording of 6.5.9p6
with footnote 91, and in practice was always true due to the
(required) memory model of contiguous (and unpadded) arrays.
Technically it isn't guaranteed you can _access_ array[1][0] by
parray[60] etc., but I don't believe there is or ever will be an
implementation that doesn't except debugging and _maybe_JVM kludges,
partly because I believe it _is) guaranteed to access the (same)
representation using any of (uchar*)array[1]
((uchar*)array)+60*intsize
(uchar*)parray+60*intsize
et seq

<snip> There are no pointers in memory here, only the array. ("There is
no Dana, only Zuul". :-) ) When you mention the name of an array
in a value context, the C compiler must *construct* a pointer value.
See <http://web.torek.net/torek/c/pa.html>.

that makes sense for arrays allocated on the stack, but honestly, i
didn't know this was the case.

Stack or not doesn't matter; it makes sense for the C definition of
array (and pointer, particularly to element) across the board.

- David.Thompson1 at worldnet.att.net

Nov 14 '05 #10

multidimensional arrays and cute tricks

Similar topics