In article <11*********************@h48g2000cwc.googlegroups.com>,
<el*********@yahoo.com> wrote:
>i was studying the virtual memory and the page fault concept in
>particularly [see,
>http://www.cs.man.ac.uk/~rizos/CS205...-02/lect11.pdf ]
>Considering this concept and the two versions code given below. Assume
>that the matrix "a" is stored row by row , and considering that the
>program is given only 1024 frames, version A below leads to 1024*1024
>page fault whereas version B leads to only 1024 Page Fault, thus run
>faster.
>I have never paid attention to this issue. as a programmer do we have
>to?
Yes.
>is the compiler intelligent enough to order the loop execution in
>the most effective way, i.e. switch automatically version A like code
>to version B
Not usually, no.
*Some* compilers might be able to do it in the special case
of assigning a constant: in particular, some compilers designed
to generate parallel operations may be able to use their
loop analysis ("code motion", "strength reduction", and the like)
to determine how to minimize the cache effects. A compiler that was
smart about cache effects might even introduce a padding column so
that a[x][0] and a[x+1][0] do not compete for the same cache set.
Compilers are *allowed* to be smart about such things, but by no
means should you assume that they are.
Consider this situation:
double a[1024][1024]; /* already initialized to something */
/* version C */
double total = 0.; for (j=0;j<1024;j++) for (i=0;i<1024;i++) total += a[i][j];
/* version D */
double total = 0.; for (i=0;i<1024;i++) for (j=0;j<1024;j++) total += a[i][j];
Version C cannot be turned into version D unless the compiler can
determine in advance that all of the values are the same or that the
matrix is symmetric. If it cannot establish one of those conditions,
then it must do the work in the order specified, because adding
doubles in a different order can lead to different round-off errors.
The parallel compilers I mentioned earlier would usually have options
that control the amount of latitude they have for reordering operations
where the reordering might end up with a different result due to
round-off.
>int a[1024][1024];
>/* version A */
>for (j=0; j<1024; j++)
>  for (i=0;i<1024; i++)
>    a[i][j]=0;
>/* version B */
>for (i=0; i<1024; i++)
>  for (j=0;j<1024; j++)
>    a[i][j]=0;
--
All is vanity. -- Ecclesiastes