Bytes | Software Development & Data Engineering Community

removing a loop causes it to go at half the speed?

Hi

I was doing a simple test of the speed of a "maths" operation and when I
tested it I found that removing the loop that initialises the data array
for the operation caused the whole program to spend twice the time to
complete. If the loop is included it takes about 7.48 seconds to
complete, but when removed it takes about 11.48 seconds.

Does anybody have a suggestion as to why this is so and whether I can
trust the results of the code as it is below?

/tom
The code was compiled on linux 2.6.3 with gcc 3.3.2 and glibc 2.2.3
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

int main(int argc, char *argv[])
{
    int total = 0;
    int count = 65500;
    signed int data[count];

    /* Initialising array loop */
    for (int c = 0; c < count; c++) {
        data[c] = 1 + (int) (2000000000.0 * rand() / (RAND_MAX + 1.0));
    }

    struct timeval start_time;
    struct timeval end_time;

    gettimeofday(&start_time, NULL);

    for (int d = 0; d < 50000; d++) {
        for (int c = 0; c < count; c++) {
            total += data[c];
        }
    }
    gettimeofday(&end_time, NULL);

    double t1 = (start_time.tv_sec * 1000.0) + (start_time.tv_usec / 1000.0);
    double t2 = (end_time.tv_sec * 1000.0) + (end_time.tv_usec / 1000.0);

    printf("Elapsed time (ms): %.6lf\n", t2 - t1);
    printf("Total: %d\n", total);

    return 0;
}
Mar 15 '06 #1

"tom fredriksen" <to*@nospam.org> wrote in message
news:44********@news.broadpark.no...
I was doing a simple test of the speed of a "maths" operation and when I
tested it I found that removing the loop that initialises the data array
for the operation caused the whole program to spend twice the time to
complete. If the loop is included it takes about 7.48 seconds to
complete, but when removed it takes about 11.48 seconds.
<snip>


Just a guess: the array data[] is signed, and when uninitialized the
summation has to add many negative values, which probably use more or
slower instructions than when you initialize the array with unsigned values.
Rod Pemberton
Mar 15 '06 #2
tom fredriksen wrote:
. . . I found that removing the loop that initialises the data array
for the operation caused the whole program to spend twice the time to
complete. . . . Does anybody have a suggestion as to why this is so?
. . .
for(int d=0; d<50000; d++) {
for(int c=0; c<count; c++) {
total += data[c];
}
}


Maybe the time is spent servicing overflow interrupts, and the
uninitialised data[] gives more of them. Maybe your compiler
is not smart enough to replace the d-loop by total *= 50000
(both times). It is not always clear from the source code just
where the measured run time is being spent.
--

Mar 15 '06 #3
bert wrote:
tom fredriksen wrote:
. . . I found that removing the loop that initialises the data array
for the operation caused the whole program to spend twice the time to
complete. . . .

Maybe the time is spent servicing overflow interrupts, and the
uninitialised data[] gives more of them. <snip>


I tested with changing the data and total variables to unsigned and it
did not matter.

Additionally, I tried changing the compiler arguments

-O2 -Wall -D_LARGEFILE64_SOURCE -std=gnu99

But that had no effect either.

/tom
Mar 15 '06 #4
tom fredriksen wrote:
I was doing a simple test of the speed of a "maths" operation and when I
tested it I found that removing the loop that initialises the data array
for the operation caused the whole program to spend twice the time to
complete. <snip>


The time you measure is the real time. It also contains time spent by other
applications or by the system. Perhaps the array needs to be swapped in on
the first access.

Try running both versions of the program with "time myProg".

Try a loop that reads through the array once instead of the initialisation.

BTW: Are you sure that the stack is big enough to hold your variables?

/Jan-Hinnerk

Mar 15 '06 #5

"Rod Pemberton" <do*********@sorry.bitbuck.cmm> wrote in message
news:dv***********@news3.infoave.net...

"tom fredriksen" <to*@nospam.org> wrote in message
news:44********@news.broadpark.no...
<snip>

Okay, I compiled your code for DJGPP and I'm not seeing this issue. There
is a slight variation in execution times, but nothing like what you
describe. Nor are they different enough to tell which is faster or slower.
The only way I got a huge difference was compiling one as gcc -O and the
other as gcc -O2. If not an optimization problem: I did notice that GCC
shifted more work to the floating-point instructions when the loop was
present. Perhaps, due to instruction pairing/pipelining, caching, etc.,
the floating-point init just runs faster for you...
Rod Pemberton
Mar 15 '06 #6
On Wednesday 15 March 2006 16:14, tom fredriksen opined (in
<44********@news.broadpark.no>):
I was doing a simple test of the speed of a "maths" operation and when
I tested it I found that removing the loop that initialises the data
array for the operation caused the whole program to spend twice the
time to complete. If the loop is included it takes about 7.48 seconds
to complete, but when removed it takes about 11.48 seconds.

Does anybody have a suggestion as to why this is so and whether I can
trust the results of the code as it is below?


I'd suggest brushing up on your algebra. Since when is 11.48 "twice the"
7.48? In my book it's closer to 50% larger ("twice larger" being 100%).

;-)

--
BR, Vladimir

Williams and Holland's Law:
If enough data is collected, anything may be proven by
statistical methods.

Mar 15 '06 #7
Rod Pemberton wrote:
Okay, I compiled your code for DJGPP and I'm not seeing this issue. There
is a slight variation in execution times, but nothing like what you
describe. Nor are they different enough to tell which is faster or slower.
The only way I got a huge difference was compiling one as gcc -O and the
other as gcc -O2.
Interesting. I compiled without optimisation and with -O, which gave me
times of 22 seconds and 11 seconds respectively. Removing the loop made
no difference. Both runs included the loop, which with -O2 had given me
7 seconds previously.

Out of curiosity, what times did you get and what system are you using?
If not an optimization problem, I did notice that GCC
shifted more work to the floating point instructions when the loop was
present. Perhaps, due to instruction pairing/pipelineing, caching, etc.,
the floating point init just runs faster for you...


Since that happens before the measured loop, that's not an issue. The
time print statement confirms that the operation takes double the time.

/tom
Mar 15 '06 #8
Jan-Hinnerk Dumjahn wrote:
The time you measure is the real time. It also contains time spend by other
applications or the system. Perhaps the array needs to be swapped in on the
first access.

Try running both versions of the program with "time myProg".
it gave me, for the 7 seconds run:
6.35user 0.00system 0:07.14elapsed 88%CPU

and for the 11 seconds run:
9.98user 0.00system 0:11.07elapsed 90%CPU
Try a loop that reads through the array once instead of the initialisation.


I changed the loop to the following:

for(int c=0; c<count; c++) {
data2[c] = data[c];
}

and then printed some of the elements at the end of the program so as
not to have it optimised away. That changed the execution time to the
same for both cases. So it seems that because of using floating point in
the rand statement, it works faster. Maybe gcc optimised the program to
use floating point instead of integer operations, which is then faster.

Since it must be an integer operation, that leads to the following
question: how do I modify the statement to use integers instead?

data[c]=1.0+(unsigned int) (2000000000.0*rand()/(RAND_MAX+1.0))

just removing ".0" does not make it slower as one would expect.

/tom
Mar 15 '06 #9
Vladimir S. Oka wrote:
On Wednesday 15 March 2006 16:14, tom fredriksen opined (in
<44********@news.broadpark.no>):
<snip>


I'd suggest brushing up on your algebra. Since when is 11.48 "twice the"
7.48? In my book it's closer to 50% larger ("twice larger" being 100%).


I did not say twice the larger, I said twice the time:/

/tom
Mar 15 '06 #10

"tom fredriksen" <to*@nospam.org> wrote in message
news:44********@news.broadpark.no...
Rod Pemberton wrote:
<snip>

Out of curiosity, what times did you get and what system are you using?


I didn't record the times, but the -O2 runs were in the 45 second range
and the -O runs were in the 90 second range.
I ran them under Win98SE, 500MHz K6-2, DJGPP v2.03 w/GCC v3.41.
<snip>


Since that happens before the measured loop, that's not an issue. The
time print statement confirms that the operation takes double the time.


Do you have a dual-core CPU? I've seen recent complaints in the assembly
language NGs that RDTSC returns incorrect information for dual-core
CPUs. It might be that the timing routine is using RDTSC.

I had to restructure the code slightly since you were using some C99
features. But I didn't think it was critical. I moved the declarations
and init of c and d out of the for loops to the top of main. Also, I
moved the declarations of start_time and end_time to the top of main.
For the version I ran, I looked at the assembly from gcc -O2 -S for
three versions of your program: no loop, loop setting all to zero, and
loop with the rand() stuff. The assembly code was very different. The
no-loop version used mostly integer instructions. The loop-zero version
used about half integer and half floating point. And the one with
rand() used maybe 70% floating point instructions.
Rod Pemberton
Mar 15 '06 #11
Rod Pemberton wrote:
"tom fredriksen" <to*@nospam.org> wrote in message
<snip>


Do you have a dual-core CPU? I've seen recent complaints in the assembly
language NG's that the RDTSC returns incorrect information for dual-core
CPUs. It might be that timing routine is using RDTSC.


No, it's a single-core AMD Athlon 2GHz from a couple of years ago.
I had to restructure the code slightly since you were using some C99
features. <snip>
I tested that but it gave no difference either.
Out of curiosity, what times did you get and what system are you using?


I didn't record the times, but the -O2 runs were in the 45 second range
and the -O runs were in the 90 second range.
I ran them under Win98SE, 500MHz K6-2, DJGPP v2.03 w/GCC v3.41.


This is highly suspicious... a K6 is slower than an Athlon 2GHz, is it
not? So how come you are getting such low times?
Could you post the code you used, and check your times.

BTW did you use gcc? I used these args with gcc

-O2 -Wall -D_LARGEFILE64_SOURCE -std=gnu99

/tom
Mar 15 '06 #12

"tom fredriksen" <to*@nospam.org> wrote in message
news:44********@news.broadpark.no...
Rod Pemberton wrote:
"tom fredriksen" <to*@nospam.org> wrote in message
<snip>


this is highly suspicious... a k6 is slower than an Athlon 2GHz is it
not? so how come you are getting such low times?
Could you post the code you used, and check your times.


Times are below. I deleted the code, but just copied it again. The only
differences are that the declarations of c and d are no longer in the
for() loops, i.e.:

int c, d;
for (c = 0; c < count; c++) {
...
for (d = 0; d < 50000; d++) {
    for (c = 0; c < count; c++) {
BTW did you use gcc? I used these args with gcc

-O2 -Wall -D_LARGEFILE64_SOURCE -std=gnu99


Gcc v3.41, yes, but DJGPP uses its own libc. I keep forgetting that
DJGPP has its own libc, which is different from GNU libc. That could
_easily_ be a factor. Also, CPU speeds increase by roughly a factor of
2 per generation, so 1/4 of 44 ~ 11.

(2 runs, w/loop)
gcc -O2
Elapsed time (ms): 42790.000000
Elapsed time (ms): 43450.000000

(2 runs, no loop)
gcc -O2
Elapsed time (ms): 44160.000000
Elapsed time (ms): 44050.000000

(2 runs, w/loop)
gcc -Wall -O2 -D_LARGEFILE64_SOURCE -std=gnu99
Elapsed time (ms): 44600.000000
Elapsed time (ms): 43340.000000

(3 runs, no loop)
gcc -Wall -O2 -D_LARGEFILE64_SOURCE -std=gnu99
Elapsed time (ms): 39600.000000
Elapsed time (ms): 42840.000000
Elapsed time (ms): 40970.000000
It might be time to try a linux NG to see if someone with a similar system
to yours is getting the same issue...
Rod Pemberton
Mar 15 '06 #13
On Wed, 15 Mar 2006 17:14:38 +0100, tom fredriksen <to*@nospam.org> wrote:
Hi

I was doing a simple test of the speed of a "maths" operation and when I
tested it I found that removing the loop that initialises the data array
for the operation caused the whole program to spend twice the time to
complete. If the loop is included it takes about 7.48 seconds to
complete, but when removed it takes about 11.48 seconds.


Hello Tom,

This is most likely because your initialisation is outside of the timed
code. Initialising the data causes the memory to be moved into the
cache (main memory accesses cost more time than cached accesses), so
the next access (in the loop) is much faster. So, when you don't
initialise your data, all the time it takes to move the 64k of main
memory into the cache is added to the loop instead, and you get a
corresponding slow-down.

byefornow
laura

--
echo moc.12klat@daehriaf_arual|sed 's/\(.\)\(.\),*/\2,\1/g;h;s/,//g;/@t/q;G;D'
Mar 15 '06 #14
On Wed, 15 Mar 2006 21:15:11 +0100, in comp.lang.c , tom fredriksen
<to*@nospam.org> wrote:
On Wednesday 15 March 2006 16:14, tom fredriksen opined (in
<44********@news.broadpark.no>):
I was doing a simple test of the speed of a "maths" operation and when
I tested it I found that removing the loop that initialises the data
array for the operation caused the whole program to spend twice the
time to complete. If the loop is included it takes about 7.48 seconds
to complete, but when removed it takes about 11.48 seconds.

I did not say twice the larger, I said twice the time:/


Assuming you're using base 10, twice 7.48 isn't 11.48.

And to answer your question, who knows? In the "loopless" case you're
invoking undefined behaviour by accessing uninitialised memory, so the
extra time could be spent phoning the Kremlin or counting your
diskdrive sectors, or pretty much anything.

Mark McIntyre
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Mar 15 '06 #15
On Wed, 15 Mar 2006 23:01:06 +0000 (UTC), la************@talk21.com (laura fairhead) wrote:
<snip>


Hi Again,

Sorry, that's an absolutely stupid answer. I hadn't read the code well
enough before posting. The caching overhead couldn't add that sort of
time to the loop, obviously. Because of the very complex interaction
between the high-level code, the compiler and the machine code it
generates, and the CPU and memory subsystems, this could be oh so many
things...

One possibility (more realistic than my previous idiotic post ;) is
that the code in the non-loop version ends up poorly aligned, on a CPU
that is sensitive to alignment and with a compiler that doesn't align
loops by default. That's one possibility out of a great many, to be
sure; you'd have to analyse it, but I would start by looking at the
assembly listings.

byefornow
laura

--
echo moc.12klat@daehriaf_arual|sed 's/\(.\)\(.\),*/\2,\1/g;h;s/,//g;/@t/q;G;D'


Mar 15 '06 #16
On 2006-03-15, tom fredriksen <to*@nospam.org> wrote:
I was doing a simple test of the speed of a "maths" operation and when I
tested it I found that removing the loop that initialises the data array
for the operation caused the whole program to spend twice the time to
complete. If the loop is included it takes about 7.48 seconds to
complete, but when removed it takes about 11.48 seconds.

Does anybody have a suggestion as to why this is so and whether I can
trust the results of the code as it is below?

[...]
int total = 0;
int count = 65500;
signed int data[count];


<OT purpose=stop_wild_speculation>

It is because of the virtual memory system. When you initialize the
array beforehand, it forces the virtual memory system to materialize
memory for the pages appearing in the array. When you do not, this
materialization happens in the timed loop, and then of course it's
slower.

You should be able to confirm this by touching some but not all of the
pages beforehand and observing the resulting timings. Hint: page size
for i386 family processors is 4096 bytes. And you won't see it at all
using DJGPP, which is DOS-based and doesn't have a virtual memory
system.

</OT>

I feel compelled to observe that your program exhibits undefined
behavior... and so strictly speaking, reasoning about its performance
properties is meaningless. And you absolutely cannot trust the
results.

Perhaps you meant "total" to be unsigned?

--
- David A. Holland
(the above address works if unscrambled but isn't checked often)
Mar 15 '06 #17
tom fredriksen wrote:

I was doing a simple test of the speed of a "maths" operation and when I
tested it I found that removing the loop that initialises the data array
for the operation caused the whole program to spend twice the time to
complete.


Your program causes undefined behaviour if you remove the
initialization, because it reads uninitialized variables.

Mar 16 '06 #18
In article <44********@news.broadpark.no> tom fredriksen <to*@nospam.org> writes:
Does anybody have a suggestion as to why this is so and whether I can
trust the results of the code as it is below?
Cache effects.

.... /* Initialising array loop */
for(int c=0; c<count; c++) {
data[c]=1+(int) (2000000000.0*rand()/(RAND_MAX+1.0));
}


When initialising, your array data comes into the cache, so all further
operations work on the cache. When you do not initialise, your data
has to be brought into the cache during the operations, and that takes time.

Try the same with an initialisation of data[c] to 1. Also try timing
with the initialisation included or not. In that case you will see
three timings:
a) No initialisation.
b) Initialisation to 1, not timed.
c) Initialisation to 1, included in timing.
You will see that (a) is approximately equal to (c) and that (b) is
shorter.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Mar 16 '06 #19
In article <44***************@news.btinternet.com> la************@talk21.com (laura fairhead) writes:
<snip>
Sorry, that's an absolutely stupid answer.


No, it is not absolutely stupid at all. But now I think I know the
answer. The timing of the addition operation depends on the values
involved.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Mar 16 '06 #20
Old Wolf wrote:
tom fredriksen wrote:
I was doing a simple test of the speed of a "maths" operation and when I
tested it I found that removing the loop that initialises the data array
for the operation caused the whole program to spend twice the time to
complete.


Your program causes undefined behaviour if you remove the
initialization, because it reads uninitialized variables.


:) sorry but using an uninitialised array does not cause undefined
behaviour. It only means the values of the array are not preset. An
array is just a sequence of memory positions, which have been used
previously by some other part of the program or the system, hence the
value it has is what was there before I allocated it. There is no
undefined behaviour in that. If I had used the value in some expression,
then I would get undefined behaviour because the expression is based on
values I don't know and which might be illegal for the case.

/tom
Mar 16 '06 #21
Dik T. Winter wrote:
<snip>


Please read my other post.
Mar 16 '06 #22
tom fredriksen <to*@nospam.org> wrote:
Old Wolf wrote:
<snip>
Your program causes undefined behaviour if you remove the
initialization, because it reads uninitialized variables.


:) sorry but using an uninitialised array does not cause undefined
behaviour.


Yes, it does.
It only means the values of the array are not preset. An array is just
a sequence of memory positions,
Yes, but...
which have been used previously by some other part of the program or the
system,
....no. An uninitialised array is allowed to contain trap values.
hence the value it has is what was there before I allocated it. There is no
undefined behaviour in that.


There is nothing in the Standard which guarantees anything whatsoever
about the memory pattern in memory you just allocated, and reading
uninitialised memory _does_ cause undefined behaviour.

Richard
Mar 16 '06 #23
tom fredriksen said:
Old Wolf wrote:
<snip>
Your program causes undefined behaviour if you remove the
initialization, because it reads uninitialized variables.


:) sorry but using an uninitialised array does not cause undefined
behaviour.


It does if you read those "values" before writing them. And, if you remove
your loop, that's precisely what you do.

It only means the values of the array are not preset. An
array is just a sequence of memory positions, which have been used
previously by some other part of the program or the system, hence the
value it has is what was there before I allocated it. There is no
undefined behaviour in that.
At that point, I agree, but...
If I had used the value in some expression,
signed int data[count];

(loop removed, because that's the case we're discussing)

struct timeval start_time;
struct timeval end_time;

gettimeofday(&start_time, NULL);

for(int d=0; d<50000; d++) {
for(int c=0; c<count; c++) {
total += data[c];

ping - you use the value in some expression.
then I would get undefined behaviour because the expression is based on
values I don't know and which might be illegal for the case.


s/would//

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Mar 16 '06 #24
Richard Heathfield wrote:

<snip>
s/would//


Wrong, it does not cause undefined behaviour, because the values used
do not require a specific set of values or a value range. It's just a
set of ints with a 32-bit value, nothing special about that. It's the
interpretation that matters, not the value itself. And since my program
does not need to interpret the value, there is no undefined behaviour.

If I had required the program to have values in e.g. the ASCII range
and used a function that required ASCII values, then it would have
undefined behaviour. Or if the array was required to have either 0 or 1
as a value and I used it in a boolean expression when it had negative
values, that would cause undefined behaviour.

Undefined behaviour only exists if the program or any of the functions
in the program has a definition of what are legal values and the array
does not contain those legal values.

The reason it is stated that it causes undefined behaviour is because
the programmer has to decide what behaviour it should have in those
cases. Since the inventors of the language could not possibly foresee
each program's behaviour, it was decided that it is more convenient to
say it can cause undefined behaviour.

/tom
Mar 16 '06 #25
tom fredriksen said:
<snip>


Wrong, it does not cause undefined behaviour because the values used
does not require a specific set of values or a value range.


Wrong.

C89 draft:

* Undefined behavior --- behavior, upon use of a nonportable or
erroneous program construct, of erroneous data, or of
indeterminately-valued objects, for which the Standard imposes no
requirements.

C99 final:

"Certain object representations need not represent a value of the object
type. If the stored value of an object has such a representation and is
read by an lvalue expression that does not have character type, the
behavior is undefined."

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Mar 16 '06 #26
Rod Pemberton wrote:
"tom fredriksen" <to*@nospam.org> wrote in message
this is highly suspicious... a k6 is slower than an Athlon 2GHz is it
not? so how come you are getting such low times?
Sorry, for some reason I was thinking my results were in minutes. But
it makes sense now:)
BTW did you use gcc? I used these args with gcc

-O2 -Wall -D_LARGEFILE64_SOURCE -std=gnu99
Gcc v3.41 yes, but DJGPP uses its own libc, which is different from GNU
libc. That could _easily_ be a factor. Also, CPU speeds increase by
roughly a factor of 2 per generation (2), so 1/4 of 44 ~ 11.

(2 runs, w/loop)
gcc -O2
Elapsed time (ms): 42790.000000
Elapsed time (ms): 43450.000000

(2 runs, no loop)
gcc -O2
Elapsed time (ms): 44160.000000
Elapsed time (ms): 44050.000000

(2 runs, w/loop)
gcc -Wall -O2 -D_LARGEFILE64_SOURCE -std=gnu99
Elapsed time (ms): 44600.000000
Elapsed time (ms): 43340.000000

(3 runs, no loop)
gcc -Wall -O2 -D_LARGEFILE64_SOURCE -std=gnu99
Elapsed time (ms): 39600.000000
Elapsed time (ms): 42840.000000
Elapsed time (ms): 40970.000000


After some head banging I realise that your numbers are comparatively
correct.

I find the reason to be twofold: 1) the CPU, 2) the compiler/system. The
Athlon is quite different from, and newer than, a K6; that should explain
some of it. The other reason is that, given the cpu and system, the
compiler optimises differently, leading the Athlon/Linux test to behave
differently than the K6/Windows one.

When I replaced the rand() expression with a guaranteed integer
expression, the numbers between the two tests were equal. So there is
something in the rand() call causing it to behave radically differently.
I looked at the generated assembler and I didn't find it was that
different, but still the execution was different... interesting.

Another thing I just noticed is that if I create the program to contain
both versions at the same time, there is no difference: both are at 7
seconds. But if I remove the float version it's back up to 11 seconds. I
am going mad now...

There is definitely something going on with the rand statement causing
it to affect the entire program.

data[c]=1.0+(unsigned int) (2000000000.0*rand()/(RAND_MAX+1.0));

The complete code is at the bottom, I fixed it to be like yours.
Can anybody test this code on an equivalent system and a one that is
different but newer than k6? My system is Linux 2.6.3 with gcc 3.3.2

It might be time to try a linux NG to see if someone with a similar system
to yours is getting the same issue...


Linux NG? never heard of it.. do explain...

/tom

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

void test_orig()
{
    unsigned int total = 0;
    int count = 65500;
    unsigned int data[count];
    struct timeval start_time;
    struct timeval end_time;
    int c, d;
    double t1, t2;

    for (c = 0; c < count; c++) {
        data[c] = 1.0 + (unsigned int) (2000000000.0 * rand() / (RAND_MAX + 1.0));
    }

    gettimeofday(&start_time, NULL);

    for (d = 0; d < 50000; d++) {
        for (c = 0; c < count; c++) {
            total += data[c];
        }
    }
    gettimeofday(&end_time, NULL);

    t1 = (start_time.tv_sec * 1000) + (start_time.tv_usec / 1000.0);
    t2 = (end_time.tv_sec * 1000) + (end_time.tv_usec / 1000.0);

    printf("Elapsed time (ms): %.6lf\n", t2 - t1);
    printf("Total: %u\n", total);
}

void test_int()
{
    unsigned int total = 0;
    int count = 65500;
    unsigned int data[count];    /* deliberately left uninitialised */
    unsigned int data2[count];
    struct timeval start_time;
    struct timeval end_time;
    int c, d;
    double t1, t2;

    for (c = 0; c < count; c++) {
        data2[c] = data[c];
    }
    gettimeofday(&start_time, NULL);

    for (d = 0; d < 50000; d++) {
        for (c = 0; c < count; c++) {
            total += data[c];
        }
    }
    gettimeofday(&end_time, NULL);

    t1 = (start_time.tv_sec * 1000) + (start_time.tv_usec / 1000.0);
    t2 = (end_time.tv_sec * 1000) + (end_time.tv_usec / 1000.0);

    printf("Elapsed time (ms): %.6lf\n", t2 - t1);
    printf("Total: %u\n", total);

    for (c = 0; c < 100; c++) {
        printf("data2: %u ", data2[c]);
    }
    printf("\n");
}

int main(int argc, char *argv[])
{
    /* test_orig(); */

    test_int();

    return 0;
}

Mar 16 '06 #27
Richard Heathfield wrote:
tom fredriksen said:
Wrong, it does not cause undefined behaviour because the values used
does not require a specific set of values or a value range.


Wrong.

C89 draft:

* Undefined behavior --- behavior, upon use of a nonportable or
erroneous program construct, of erroneous data, or of
indeterminately-valued objects, for which the Standard imposes no
requirements.

C99 final:

"Certain object representations need not represent a value of the object
type. If the stored value of an object has such a representation and is
read by an lvalue expression that does not have character type, the
behavior is undefined."


That does not contradict what I am saying. These statements only define
what undefined behaviour is, in context of the standard and the
language, nothing more. It does not say using such a value must demand
undefined behaviour.

/tom
Mar 16 '06 #28
tom fredriksen wrote:
Richard Heathfield wrote:
tom fredriksen said:
Wrong, it does not cause undefined behaviour because the values used
does not require a specific set of values or a value range.


Wrong.

C89 draft:

* Undefined behavior --- behavior, upon use of a nonportable or
erroneous program construct, of erroneous data, or of
indeterminately-valued objects, for which the Standard imposes no
requirements.

C99 final:

"Certain object representations need not represent a value of the object
type. If the stored value of an object has such a representation and is
read by an lvalue expression that does not have character type, the
behavior is undefined."


That does not contradict what I am saying. These statements only define
what undefined behaviour is, in context of the standard and the
language, nothing more. It does not say using such a value must demand
undefined behaviour.


It says using such a value /produces/ undefined behaviour: if you do
it, you can no longer appeal to the C standard for the remaining
behaviour of your code. An implementation can do /anything it likes/.

Now, you may happen to know - or believe - that your implementation
will do something harmless. And you may be right. But this is not
because of the semantics of C, but because of some implementation-specific
behaviour, which in turn may be accidental.

When we say that such-and-such is undefined behaviour, we mean that
the behaviour of the program can no longer be predicted from the
semantics of C, and that the standard allows /anything/, even
unreasonable or unphysical behaviour. Luckily, most implementations
are constrained by physics, otherwise we'd all require much bigger
nostrils.

--
Chris "sparqling" Dollin
"Who do you serve, and who do you trust?"
Mar 16 '06 #29
Chris Dollin wrote:
tom fredriksen wrote:
Richard Heathfield wrote:
tom fredriksen said:

Wrong, it does not cause undefined behaviour because the values used
does not require a specific set of values or a value range.
Wrong.

C89 draft:

* Undefined behavior --- behavior, upon use of a nonportable or
erroneous program construct, of erroneous data, or of
indeterminately-valued objects, for which the Standard imposes no
requirements.

C99 final:

"Certain object representations need not represent a value of the object
type. If the stored value of an object has such a representation and is
read by an lvalue expression that does not have character type, the
behavior is undefined."

That does not contradict what I am saying. These statements only define
what undefined behaviour is, in context of the standard and the
language, nothing more. It does not say using such a value must demand
undefined behaviour.


It says using such a value /produces/ undefined behaviour: if you do
it, you can no longer appeal to the C standard for the remaining
behaviour of your code. An implementation can do /anything it likes/.

Now, you may happen to know - or believe - that your implementation
will do something harmless. And you may be right. But this is not
because of the semantics of C, but because of some implementation-specific
behaviour, which in turn may be accidental.

When we say that such-and-such is undefined behaviour, we mean that
the behaviour of the program can no longer be predicted from the
semantics of C,


Can you then explain to me how it is that you think the behaviour of my
code can not be predicted? Are you actually telling me that just because
the array is uninitialised my program will fail/have undefined
behaviour, independent of how I use the array? lol:)

Before you answer, you should know that the reason for the problem is
that the rand() statement causes the compiler to output float operations
instead of integer operations. I tested it by replacing the statement in
the loop with a guaranteed integer statement. So the undefined behaviour
argument has exploded in its own face. See code below.

/tom

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

void test_orig()
{
    unsigned int total = 0;
    int count = 65500;
    unsigned int data[count];
    struct timeval start_time;
    struct timeval end_time;
    int c, d;
    double t1, t2;

    for (c = 0; c < count; c++) {
        data[c] = 1.0 + (unsigned int) (2000000000.0 * rand() / (RAND_MAX + 1.0));
    }

    gettimeofday(&start_time, NULL);

    for (d = 0; d < 50000; d++) {
        for (c = 0; c < count; c++) {
            total += data[c];
        }
    }
    gettimeofday(&end_time, NULL);

    t1 = (start_time.tv_sec * 1000) + (start_time.tv_usec / 1000.0);
    t2 = (end_time.tv_sec * 1000) + (end_time.tv_usec / 1000.0);

    printf("Elapsed time (ms): %.6lf\n", t2 - t1);
    printf("Total: %u\n", total);
}

void test_int()
{
    unsigned int total = 0;
    int count = 65500;
    unsigned int data[count];    /* deliberately left uninitialised */
    unsigned int data2[count];
    struct timeval start_time;
    struct timeval end_time;
    int c, d;
    double t1, t2;

    for (c = 0; c < count; c++) {
        data2[c] = data[c];
    }
    gettimeofday(&start_time, NULL);

    for (d = 0; d < 50000; d++) {
        for (c = 0; c < count; c++) {
            total += data[c];
        }
    }
    gettimeofday(&end_time, NULL);

    t1 = (start_time.tv_sec * 1000) + (start_time.tv_usec / 1000.0);
    t2 = (end_time.tv_sec * 1000) + (end_time.tv_usec / 1000.0);

    printf("Elapsed time (ms): %.6lf\n", t2 - t1);
    printf("Total: %u\n", total);

    for (c = 0; c < 100; c++) {
        printf("data2: %u ", data2[c]);
    }
    printf("\n");
}

int main(int argc, char *argv[])
{
    /* test_orig(); */

    test_int();

    return 0;
}
/tom
Mar 16 '06 #30
tom fredriksen wrote:

(long quotes)
Chris Dollin wrote:
tom fredriksen wrote:
Richard Heathfield wrote:
tom fredriksen said:

> Wrong, it does not cause undefined behaviour because the values used
> does not require a specific set of values or a value range.
Wrong.

C89 draft:

* Undefined behavior --- behavior, upon use of a nonportable or
erroneous program construct, of erroneous data, or of
indeterminately-valued objects, for which the Standard imposes no
requirements.

C99 final:

"Certain object representations need not represent a value of the
object type. If the stored value of an object has such a representation
and is read by an lvalue expression that does not have character type,
the behavior is undefined."
That does not contradict what I am saying. These statements only define
what undefined behaviour is, in context of the standard and the
language, nothing more. It does not say using such a value must demand
undefined behaviour.


It says using such a value /produces/ undefined behaviour: if you do
it, you can no longer appeal to the C standard for the remaining
behaviour of your code. An implementation can do /anything it likes/.

Now, you may happen to know - or believe - that your implementation
will do something harmless. And you may be right. But this is not
because of the semantics of C, but because of some
implementation-specific behaviour, which in turn may be accidental.

When we say that such-and-such is undefined behaviour, we mean that
the behaviour of the program can no longer be predicted from the
semantics of C,


Can you then explain to me how it is that you think the behaviour of my
code can not be predicted?


Certainly.

When you don't initialise the data array, the line:

total += data[c]

references an indeterminately-valued object. This produces undefined
behaviour. Therefore, the further behaviour of your code cannot be
predicted /from the C standard/: you must use additional information.

--
Chris "sparqling" Dollin
"Who do you serve, and who do you trust?"
Mar 16 '06 #31
Chris Dollin wrote:
Can you then explain to me how it is that you think the behaviour of my
code can not be predicted?
Certainly.

When you don't initialise the data array, the line:

total += data[c]

references an indeterminately-valued object.


Correct.
This produces undefined behaviour.
Therefore, the further behaviour of your code cannot be
predicted /from the C standard/: you must use additional information.


You see, even though the standard defines it as undefined behaviour,
that does not practically make it undefined behaviour. It is only defined as
such because the writers of the standard are not in a position to guess
such a variable's value or use for all possible future programs.

I think there is no point continuing the discussion. Because if you only
look at it literally, as you do, then you are correct, I don't dispute
that. But if you consider the pragmatics of it, in addition to the
literal meaning, which you should when you are programming, then you are
wrong.

/tom

Mar 16 '06 #32
tom fredriksen wrote:
Chris Dollin wrote:
Can you then explain to me how it is that you think the behaviour of my
code can not be predicted?
Certainly.

When you don't initialise the data array, the line:

total += data[c]

references an indeterminately-valued object.


Correct.
This produces undefined behaviour.
> Therefore, the further behaviour of your code cannot be
> predicted /from the C standard/: you must use additional information.


You see, even though the standard defines it as undefined behaviour,
does not practically make it undefined behaviour.


In what way have I not been explicit about this?

"the further behaviour of your code cannot be predicted /from the C
standard/: you must use additional information."

"Undefined behaviour" in this newsgroup /means/ "behaviour left undefined
by the [relevant] C standard". So, for portability, /avoid/ UB, because
even if you know (or think you know) what will happen on the machine you
happen to be running on today, you don't know what will happen tomorrow,
or on a different machine elsewhere.
It is only defined as such because the writers of the standard are
not in a position to guess such a variables value or use for all
possible future programs.
It's defined as such because different implementations have been known to
do different things with this, or similar, situations, it's very hard
to define the exact line to cross, and even if you /could/ it probably
wouldn't help.

Here's a well-known type of undefined behaviour:

int eg() { int i = 17; return i++ + ++i; }

An interesting variety of "answers" are available, all correct.
I think there is no point continuing the discussion. Because if you only
look at it literally, as you do, then you are correct, I dont dispute
that. But if you consider the pragmatics of it, in addition to the
literal meaning, which you should when you are programming, then you are
wrong.


The pragmatics are

* avoid code with undefined behaviour
* if you can't, justify it with additional standards
* if you can't, have sanity checks and document it

--
Chris "sparqling" Dollin
"Who do you serve, and who do you trust?"
Mar 16 '06 #33
Chris Dollin wrote:
tom fredriksen wrote:

total += data[c]


Explain to me how this statement causes undefined behaviour and how its
behaviour will be different on different architectures.
The difference here is whether data[c] has a value set by me instead of
a random value from some historic use of that memory address.

The only thing happening here is an addition of two binary values from a
memory address to another memory address or register, as defined in
the language specification for the "+=" operator.

I can imagine a system where the compiler or the architecture causes a
program using an uninitialised variable to abort or raise an exception /
interrupt etc. If that's what you mean then agreed, that can happen, but
I don't think it's a good idea.

/tom
Mar 16 '06 #34

tom fredriksen wrote:
Chris Dollin wrote:
tom fredriksen wrote:
>>> total += data[c]


Explain to me how this statement causes undefined behaviour and how its
behaviour will be different on different architectures.
The difference here is whether data[c] has a value set by me instead of
a random value from some historic use of that memory address.


Why are you so sure the same memory area would have been used before,
and even if it was, that it's not going to contain something that's a
trap representation for the type that `data` above is? That, exactly,
is why it's *undefined* behaviour. As far as the C Standard is concerned,
the memory may have just been shipped from Vladivostok, after being
used on a Russian nuclear submarine.
The only thing happening here is an addition of two binary values from a
memory address to another memory address or register, as defined in
the language specification for the "+=" operator.
One of which may be a trap representation for the type...
I can imagine a system where the compiler or the architecture causes a
program using an uninitialised variable to abort or cast an exception /
interrupt etc. if thats what you mean then agreed, than can happen, but
I don't think its a good idea.


Stop and think. There may exist bit patterns that are invalid (trap
representation) for a certain type, even if they are perfectly valid
for some other type. Your `data` array, which I believe is of type that
may have trap representation(s), may have been allocated the space
containing just such bit pattern (e.g. from previous use by `unsigned
char` which cannot have trap values, padding bits notwithstanding).

--
BR, Vladimir

Mar 16 '06 #35
tom fredriksen wrote:
Chris Dollin wrote:
tom fredriksen wrote:
>>> total += data[c]


Explain to me how this statement causes undefined behaviour


/because the standard says so/.

Really. If `data[c]` has not been given a value (in a way sanctioned
by the standard), then the Standard says that all bets are now off,
it can't say anything about the future behaviour of the code.

That's all that "undefined behaviour" means (according to the C
standard). It means that the Standard doesn't specify the future
behaviour of the code - it's not defined, it's un-defined, the
implementation is at liberty to prescribe the behaviour in any
way it likes without fear of contradiction.
and how its
behaviour will be different on different architectures.
It will be different if the architecture, or the implementation,
says so.
The difference here is whether data[c] has a value set by me instead of
a random value from some historic use of that memory address.
Well, no. The difference is that the implementation can do what it
likes. It could arrange to set all the elements to 0, or -1, or
0xdeadbeef, or __TRAP before you access it. The implementation
could arrange that the array is given its own memory segment
without read access, and only grant read access when at least
one write has been given. The compiler could spot that you'll
accessed uninitialised memory and, just before you do so, plant
code:

puts( "you were going to read uninitialised memory!" );
exit( 17 );
The only thing happening here is an addition of two binary values from a
memory address to another memory address or register, as defined in
the language specification for the "+=" operator.
... which says that anything is allowed to happen at this point.
I can imagine a system where the compiler or the architecture causes a
program using an uninitialised variable to abort or cast an exception /
interrupt etc. if thats what you mean
It's one of the things that is /permitted/.
then agreed, than can happen, but I don't think its a good idea.


But people who think it /is/ a good idea can use implementations
that /do/ do it, and those implementations are conformant (in that
respect).

Myself, I'd like to be /able/ to use an implementation that spots
bad reads and bad writes; it's nice to know I'm allowed to.

--
Chris "sparqling" Dollin
"Who do you serve, and who do you trust?"
Mar 16 '06 #36
Chris Dollin wrote:
tom fredriksen wrote:
Chris Dollin wrote:
tom fredriksen wrote:
>>> total += data[c]
Explain to me how this statement causes undefined behaviour


It will be different if the architecture, or the implementation,
says so.


That's not really an argument, it's just a contradiction. Please give a
factual example of how it can cause undefined behaviour.
The difference here is whether data[c] has a value set by me instead of
a random value from some historic use of that memory address.
Well, no. The difference is that the implementation can do what it
likes. It could arrange to set all the elements to 0, or -1, or
0xdeadbeef, or __TRAP before you access it.

What is a __TRAP on, for example, an x86 or a ppc? In the eyes of a cpu
which deals with 32-bit integer datatypes, it's just another 32-bit value.
The implementation
could arrange that the array is given its own memory segment
without read access, and only grant read access when at least
one write has been given. The compiler could spot that you'll
accessed uninitialised memory and, just before you do so, plant
code:

puts( "you were going to read uninitialised memory!" );
exit( 17 );


Could you point me to any system that does anything like this for the
given example?

/tom
Mar 16 '06 #37
tom fredriksen wrote:
Chris Dollin wrote:
tom fredriksen wrote:
Chris Dollin wrote:
tom fredriksen wrote:
>>> total += data[c]

Explain to me how this statement causes undefined behaviour
It will be different if the architecture, or the implementation,
says so.


Thats not really an argument, its just a contradiction.


It's not a contradiction. Look, you're asking the wrong question:
how could it happen on some machine? /It doesn't matter./ What
matters is that the behaviour is not constrained by the standard;
the implementation is free to choose what to do.
Please give an
factual example how it can cause undefined behaviour.
You seem to think that "undefined behaviour" is something specific.
It isn't. /Defined/ behaviour is specific. The "cause" of undefined
behaviour is doing something that the standard says produces
undefined behaviour.
The difference here is whether data[c] has a value set by me instead of
a random value from some historic use of that memory address.


Well, no. The difference is that the implementation can do what it
likes. It could arrange to set all the elements to 0, or -1, or
0xdeadbeef, or __TRAP before you access it.


What is a __TRAP on f.ex an x86 or a ppc?


Whatever the implementation says it is (if it exists).
In the eyes of a cpu which
deals with 32 bit integers datatypes, just another 32 bit value.


[Just in passing ... your data array was an `int`, right? So it can be
a 16-bit integer, not a 32-bit one.]

Or just another 64-bit value with 32 value bits, 31 padding bits,
and a single "unassigned" bit set and read appropriately. Slower,
less compact, but handy for debugging.
The implementation
could arrange that the array is given its own memory segment
without read access, and only grant read access when at least
one write has been given. The compiler could spot that you'll
accessed uninitialised memory and, just before you do so, plant
code:

puts( "you were going to read uninitialised memory!" );
exit( 17 );


Could you point me to any system that does anything like this for the
given example?


No. Does that matter? Won't you believe that that latitude is granted
without a system to hand that actually exploits it?

[A quick Google suggests that running under Valgrind or Saber-C would
come plausibly close.]

A final note: if `data[]` is uninitialised, `total += data[c]` might
very easily overflow, if the pseudo-random values left lying around
inside [if that's what happened] were big enough. In which case, you
have another cause of undefined behaviour ...

--
Chris "sparqling" Dollin
"Who do you serve, and who do you trust?"
Mar 16 '06 #38
Chris Dollin wrote:
. . .
A final note: if `data[]` is uninitialised, `total += data[c]` might
very easily overflow . . .


At last, someone has picked up on what I posted very early in
this thread. Overflow interrupts need to be serviced, and this
consumes extra TIME which, depending on the implementation,
may be charged to the process in which they occurred.
--

Mar 16 '06 #39

Chris Dollin wrote:
You seem to think that "undefined behaviour" is something specific.
It isn't. /Defined/ behaviour is specific. The "cause" of undefined
behaviour is doing something that the standard says produces
undefined behaviour.


No I don't think that. But undefined behaviour in my book is behaviour
not defined: if you tell the computer to do something then it's defined.
The compiler controls the undefined behaviour, so it's not
undefined anymore, portability or not.

Let's just agree that we to some extent disagree; we have different
viewpoints. I agree that in the standard it's undefined, but in my
opinion it is defined because of the above statement.

/tom
Mar 16 '06 #40
Chris Dollin wrote:
A final note: if `data[]` is uninitialised, `total += data[c]` might
very easily overflow, if the pseudo-random values left lying around
inside [if that's what happened] were big enough. In which case, you
have another cause of undefined behaviour ...


I forgot to comment this.

The undefined behaviour here is if an overflow occurs and there is no
mechanism to handle that overflow, what happens with the OF bit then?
In my program that was of no concern, so the "undefined behaviour" is
irrelevant.

/tom
Mar 16 '06 #41
tom fredriksen <to*@nospam.org> wrote:
Chris Dollin wrote:
You seem to think that "undefined behaviour" is something specific.
It isn't. /Defined/ behaviour is specific. The "cause" of undefined
behaviour is doing something that the standard says produces
undefined behaviour.


No I don't think that. But undefined behaviour in my book is behaviour
not defined, if you tell the computer to do something then its defined.
The compiler controls the undefined behaviour so, therefore its not
undefined anymore, portability or not.

Lets just agree that we to some extent disagree, we have different
viewpoints. I agree that in the standard its undefined, but in my
opinion it is defined because of the above statement.


The whole issue simply boils down to this:

There are more implementations, tom frederiksen, in Heaven and Earth,
than are dreamt of in thy philosophy.

And if you think that you can rely on always having an implementation
that does what your limited experience on desktop personal computers
tells you it "should" do, get your code away from my computer. It cannot
be trusted.

Richard
Mar 16 '06 #42
tom fredriksen <to*@nospam.org> wrote:
Chris Dollin wrote:
A final note: if `data[]` is uninitialised, `total += data[c]` might
very easily overflow, if the pseudo-random values left lying around
inside [if that's what happened] were big enough. In which case, you
have another cause of undefined behaviour ...


I forgot to comment this.

The undefined behaviour here is if an overflow occurs and there is no
mechanism to handle that overflow, what happens with the OF bit then?


There is no overflow bit. Your program is immediately terminated with
the status E_INT_OVERFLOW.

Oh, not on your implementation? Gosh, implementations being different -
what _is_ this world coming to...

Richard
Mar 16 '06 #43
tom fredriksen wrote:
Chris Dollin wrote:
A final note: if `data[]` is uninitialised, `total += data[c]` might
very easily overflow, if the pseudo-random values left lying around
inside [if that's what happened] were big enough. In which case, you
have another cause of undefined behaviour ...
I forgot to comment this.

The undefined behaviour here is if an overflow occurs and there is no
mechanism to handle that overflow, what happens with the OF bit then?


There need not be an overflow bit. There might, for example, be an
immediate exception.
In my program that was of no concern, so the "undefined behaviour" is
irrelevant.


Not if you're running on an overflow-trapping implementation it isn't.

It's like this: playing games with uninitialised data isn't safe.
In some cases, you get away with it, but the more you push it, the
more likely that the revolving blades will chop out your (hopefully,
metaphorical) eyes.

Pragmatics say: don't /do/ that.

--
Chris "sparqling" Dollin
"Who do you serve, and who do you trust?"
Mar 16 '06 #44

"Richard Heathfield" <in*****@invalid.invalid> wrote in message
news:dv**********@nwrdmz03.dmz.ncs.ea.ibs-infra.bt.com...

Actually, I think Tom's interpretation for his implementation is correct.
Take another look:
C89 draft:

* Undefined behavior --- behavior, upon use of a nonportable or
erroneous program construct, of erroneous data,
The construct and data aren't erroneous for his implementation. But they
may be elsewhere.
or of
indeterminately-valued objects,
They aren't indeterminately-valued for his implementation.
for which the Standard imposes no
requirements.
Therefore, valid for his implementation. Maybe UB elsewhere.

C99 final:

"Certain object representations need not represent a value of the object
type. If the stored value of an object has such a representation
The object types do represent a value of the object type in his
implementation.
and is
read by an lvalue expression that does not have character type,
It has a character type in his implementation.
the
behavior is undefined."


Therefore, not undefined for his implementation, but may be UB elsewhere.

The key is that his code is correct for his environment.
Rod Pemberton
Mar 16 '06 #45

"tom fredriksen" <to*@nospam.org> wrote in message
news:44********@news.broadpark.no...
Before you answer, you should know that the reason for the problem is
that the rand() statement causes the compiler to output float operations
instead of integer operations. I tested it by replacing the statement in
the loop with a guaranteed integer statement. So the undefined behaviour
argument has exploded in its own face. See code below.

/tom
Good! You found the problem. I think that concludes with my prior
statements being correct. :-)
"tom fredriksen" <to*@nospam.org> wrote in message
news:44********@news.broadpark.no... Rod Pemberton wrote:
"tom fredriksen" <to*@nospam.org> wrote in message
For the
version I ran, I looked at the assembly from gcc -O2 -S for three versions of your program: no loop, loop set all to zero, loop set rand() stuff. The assembly code was very different. The no loop version used mostly integer assembly intructions. The loop zero version used about half integer and
half floating. And the one with the rand() used maybe 70% floating point intructions.


I tested that but it gave no difference either.

Later,

Rod Pemberton
Mar 16 '06 #46

"David Holland" <dh******@reverse-this--edu.harvard.fas> wrote in message
news:sl*********************@ls04.fas.harvard.edu. ..
On 2006-03-15, tom fredriksen <to*@nospam.org> wrote:
> I was doing a simple test of the speed of a "maths" operation and when I
> tested it I found that removing the loop that initialises the data array
> for the operation caused the whole program to spend twice the time to
> complete. If the loop is included it takes about 7.48 seconds to
> complete, but when removed it takes about 11.48 seconds.
>
> Does anybody have a suggestion as to why this is so and whether I can
> trust the results of the code as it is below?
>
> [...]
> int total = 0;
> int count = 65500;
> signed int data[count];
<OT purpose=stop_wild_speculation>

It is because of the virtual memory system. When you initialize the
array beforehand, it forces the virtual memory system to materialize
memory for the pages appearing in the array. When you do not, this
materialization happens in the timed loop, and then of course it's
slower.

You should be able to confirm this by touching some but not all of the
pages beforehand and observing the resulting timings. Hint: page size
for i386 family processors is 4096 bytes. And you won't see it at all
using DJGPP, which is DOS-based and doesn't have a virtual memory
system.


FYI, DJGPP does use a virtual memory system by default. The primary DPMI
host for DJGPP uses hardware paging and has virtual memory support: CWSDPMI.
Also, when a DJGPP application is running in a Windows dosbox, it uses the
DPMI host and virtual memory of the OS. You can, of course, use other
non-paging DPMI hosts such as PMODEDJ (aka, PMODETSR). OpenWatcom, on the
other hand, supports many DPMI hosts and/or DOS extenders. The ones I'm
aware of are non-paging and have no virtual memory support.
</OT>


Rod Pemberton
Mar 16 '06 #47
tom fredriksen <to*@nospam.org> writes:
Chris Dollin wrote:
A final note: if `data[]` is uninitialised, `total += data[c]` might
very easily overflow, if the pseudo-random values left lying around
inside [if that's what happened] were big enough. In which case, you
have another cause of undefined behaviour ...


I forgot to comment this.

The undefined behaviour here is if an overflow occurs and there is no
mechanism to handle that overflow, what happens with the OF bit then?
In my program that was of no concern, so the "undefined behaviour" is
irrelevant.


What the heck is an "OF" bit? If it's a condition code, it's
something specific to your system. We're talking about the C
programming language, not about whatever single specific
implementation of it you happen to be using.

If a signed integer arithmetic operation overflows, the C standard
says the behavior is undefined. That doesn't mean that it traps, it
means that the language standard does not define the behavior. It can
wrap around, it can saturate, it can abort your program, it can yield
42, it can make demons fly out of your nose, or it can behave in some
manner that depends on the phase of the moon.

This program:

#include <limits.h>
int main(void)
{
int i = INT_MAX;
i = i + 1;
return 0;
}

exhibits undefined behavior. That is not a statement about what it
might do on your implementation, and any demonstration of what it
happens to do on your implementation does not affect the fact that it
invokes undefined behavior. Until and unless you understand that, you
don't really understand C.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Mar 16 '06 #48
tom fredriksen <to*@nospam.org> writes:
Richard Heathfield wrote:
tom fredriksen said:
Wrong, it does not cause undefined behaviour because the values used
do not require a specific set of values or a value range.

Wrong.
C89 draft:
* Undefined behavior --- behavior, upon use of a nonportable or
erroneous program construct, of erroneous data, or of
indeterminately-valued objects, for which the Standard imposes no
requirements.
C99 final:
"Certain object representations need not represent a value of the
object type. If the stored value of an object has such a
representation and is read by an lvalue expression that does not
have character type, the behavior is undefined."


That does not contradict what I am saying. These statements only
define what undefined behaviour is, in the context of the standard and
the language, nothing more. They do not say that using such a value
must produce undefined behaviour.


Undefined behavior is behavior that is not defined. It's entirely
consistent with doing exactly what you expected (whatever that might
be).

But there is an interesting point buried in all this.

The C99 standard's definition of undefined behavior does not mention
"indeterminately-valued objects".

For relative simplicity, I'm going to use unsigned short rather than
signed int.

int main(void)
{
unsigned short x; /* the value of x is indeterminate, C99 6.2.4p5 */
unsigned short y;
y = x; /* undefined behavior? */
return 0;
}

If unsigned short happens to have a trap representation, it's possible
that x has that trap representation; if so, the assignment definitely
produces undefined behavior.

But what if unsigned short has no trap representations and no padding
bits? It's sometimes possible to prove this:

#include <limits.h>
#include <stdio.h>
int main(void)
{
unsigned short x;
unsigned short y;
if ( CHAR_BIT == 8 &&
sizeof(unsigned short) == 2 &&
USHRT_MAX == 65535 )
{
printf("unsigned short has no padding bits\n");
y = x;
}
else {
printf("unsigned short may have padding bits\n");
}
return 0;
}

The test is not exhaustive; if it fails, unsigned short may or may not
have padding bits.

But if the test succeeds, and the program prints "unsigned short has
no padding bits", we *know* that there are no possible trap
representations for unsigned short (since the range of values uses all
the bits). Then the assignment "y = x;" does not, I believe, invoke
undefined behavior (in C99; it might in C90).

Some definitions:

undefined behavior

behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this International Standard imposes
no requirements

indeterminate value

either an unspecified value or a trap representation

unspecified value

valid value of the relevant type where this International Standard
imposes no requirements on which value is chosen in any instance
NOTE An unspecified value cannot be a trap representation.

And C99 6.2.6.1p5:

Certain object representations need not represent a value of the
object type. If the stored value of an object has such a
representation and is read by an lvalue expression that does not
have character type, the behavior is undefined. If such a
representation is produced by a side effect that modifies all or
any part of the object by an lvalue expression that does not have
character type, the behavior is undefined. Such a representation
is called a _trap representation_.

So, if my analysis is correct, it's safe to use the value of an
uninitialized object if the object's type has no trap representations.
But it's not safe (or at least not portable) to assume that a type has
no trap representations unless it's unsigned char. (You can prove, in
some cases, that an integer type has no trap representations, but it's
not likely to be worth the effort.)

In any case, using the value of an uninitialized object almost
certainly indicates a bug in your program, whether the behavior is
actually undefined or not.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Mar 16 '06 #49
In article <44********@news.broadpark.no>
tom fredriksen <to*@nospam.org> wrote:
Can you then explain to me how it is that you think the behaviour of my
code can not be predicted?
Well, if you port it to a Burroughs A-machine, and the bit pattern
in the "int" array is a trapping floating-point value[%], the first
time you try to use the value you will get a trap and your program
will terminate.

Of course, this is "predicting" the behavior of your code, but I bet
this behavior does not match what *you* expected.

[% The A-series machines use floating-point for *everything*,
including integers. They just make sure that the value stored
in the floating-point variable is in fact an integer, by trapping
overflows and underflows, and truncating off the digits beyond
the decimal point in other cases.]
>... I tested it by replacing the statement in
>the loop with a guaranteed integer statement. So the undefined behaviour
>argument has exploded in its own face. See code below.


The point of undefined behavior is twofold; see
<http://web.torek.net/torek/c/index.html>. Stepping outside
the C standard is not *bad* (but not *good* either). It is
just "undefined". If your implementation goes on to define the
behavior, great. But you are getting "implementation behavior",
not "C behavior". Just be aware of exactly what you are depending
on, and when it might (or definitely will not) stop working.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Mar 16 '06 #50

This thread has been closed and replies have been disabled. Please start a new discussion.
