
Two implementations of simple math equation yield different results

I need to implement the following calculation:
f = (a*b) + (c*d)
where a,b,c,d are given double values and f is a double variable for
the result

I found out that using two different implementations gives two
different results:
1) In the first implementation I compute the entire equation in a
single expression
2) In the second implementation I store the intermediate results in
variables, and then use them to compute the final result.
The results differ by a tiny amount (about 1E-15). I suspect that the
difference comes from round-off in how the compiler evaluates the
expression.
Nonetheless, I need to know the reason for the difference as I need to
have an absolute reference for optimization work.

The two ways are as follows:

1)
Calculating two double values and then summing them to a single double
value
double f1 = (a*b) + (c*d);

2)
Calculating two double values and storing them to two double variables
Summing the two double variables to a single double value
double e1 = (a*b);
double e2 = (c*d);
double f2 = e1 + e2;

The difference is calculated as follows:
double diff = f2-f1;

I expect the difference to be exactly 0 but instead, I get a tiny
value.
Could someone explain the reason for differences in the result?

I’m compiling with gcc

Thanks
Avi
Jun 27 '08 #1
Avi <av**************@shaw.ca> wrote in news:d948d5e6-307b-4a17-a20e-
I need to implement the following calculation:
f = (a*b) + (c*d)
where a,b,c,d are given double values and f is a double variable for
the result

I found out that using two different implementations gives two
different results:
1) In the first implementation I compute the entire equation in a
single expression
2) In the second implementation I store the intermediate results in
variables, and then use them to compute the final result.
The results differ by a tiny amount (about 1E-15). I suspect that the
difference comes from round-off in how the compiler evaluates the
expression.
Nonetheless, I need to know the reason for the difference as I need to
have an absolute reference for optimization work.
The reason for the difference is most probably that the processor's
internal floating-point registers are 80-bit (common on x86), but the
doubles in memory are only 64-bit. So the results will differ slightly
depending on whether the intermediate results are stored to memory in
between or kept in registers.

gcc has an option for effectively emulating 64-bit variables in the
80-bit registers, but this will slow the program down, as you would
guess. A better approach is to make your algorithm and tests robust, in
the sense that such small changes cannot cause large changes in the
final results (like failing a test).

hth
Paavo
Jun 27 '08 #2
On May 13, 7:16 am, Avi <avner-moshkov...@shaw.ca> wrote:
I need to implement the following calculation:
f = (a*b) + (c*d)
where a,b,c,d are given double values and f is a double
variable for the result
I found out that using two different implementations gives two
different results:
1) In the first implementation I compute the entire equation in memory
You mean in the CPU and its registers. Variables are in memory.
2) In the second implementation I store intermediate to
variables, and then load them to compute the final result.
The results differ by a tiny amount (1E-15). I suspect that
the reason for the difference is from round off errors of the
compiler.
The C++ standard allows a compiler to use extended precision for
intermediate results, only "forcing" the result to the final
precision when it is assigned to a variable.
Nonetheless, I need to know the reason for the difference as I
need to have an absolute reference for optimization work.
The two ways are as follows:
1)
Calculating two double values and then summing them to a
single double value
double f1 = (a*b) + (c*d);
All calculations may be done in extended precision.
2)
Calculating two double values and storing them to two double variables
Summing the two double variables to a single double value
double e1 = (a*b);
This forces the result to double.
double e2 = (c*d);
As does this as well.
double f2 = e1 + e2;
The difference is calculated as follows:
double diff = f2-f1;
I expect the difference to be exactly 0 but instead, I get a
tiny value. Could someone explain the reason for differences
in the result?
It's very implementation dependent.
I'm compiling with gcc
More to the point, you're compiling on a machine which uses
extended precision in its floating point unit, and its floating
point registers. (An Intel, perhaps.)

The reason for this freedom is speed. Forcing the results of
each operation to be rounded to double can be done, but would
slow things down considerably (at least on Intel). And for the
most part, increased precision is not considered a defect.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #3
On May 13, 2:42 am, James Kanze <james.ka...@gmail.com> wrote:
On May 13, 7:16 am, Avi <avner-moshkov...@shaw.ca> wrote:
double f2 = e1 + e2;
The difference is calculated as follows:
double diff = f2-f1;
I expect the difference to be exactly 0 but instead, I get a
tiny value. Could someone explain the reason for differences
in the result?

It's very implementation dependent.
I'm compiling with gcc

More to the point, you're compiling on a machine which uses
extended precision in its floating point unit, and its floating
point registers. (An Intel, perhaps.)

The reason for this freedom is speed. Forcing the results of
each operation to be rounded to double can be done, but would
slow things down considerably (at least on Intel). And for the
most part, increased precision is not considered a defect.
But the extended floating point precision is a defect if - as in this
case - consistent numeric results are required. Fortunately, all Intel
x86 chips since the Pentium 4 support double-precision (64-bit)
floating point in their SSE2 units. So, the solution would be to have
the compiler generate SSE floating-point instructions ("-msse2
-mfpmath=sse") instead of 387 instructions. According to the gcc
documentation, this strategy has other benefits as well:

"The resulting code should be considerably faster in the majority of
cases and avoid the numerical instability problems of 387 code, but
may break some existing code that expects temporaries to be 80-bit."

Greg
Jun 27 '08 #4
Avi wrote:
One way to identify and ignore small differences is to set a small
threshold, say:
double epsilon = 1E-6;
and compare the difference against the threshold as follows:

if (std::abs(diff) < epsilon)
{
    // difference is considered as noise
}
else
{
    // difference is significant
}

[..]
Is there a systematic threshold that can be used to define when the
differences are noise and when they are significant?
Or should I just pick a number which is small enough in my mind (say
epsilon = 1E-6)?
It depends on your problem domain. Some compare absolute differences
(like you're suggesting here) and are happy. Some make the diff value
relative to the range (by dividing the diff by the larger of the two
values). If all the numbers are from the same range you're going to
be fine if you just pick the value that "fits" your range. The value
of the numeric epsilon only relates to values around 1 (it is the gap
between 1 and the next representable value), so scale it to your range.

V
--
Jun 27 '08 #5

Thanks - the replies explained the reason for the difference
and helped me set up a mechanism to distinguish between round-off noise
and significant differences.

Avi
Jun 27 '08 #6
On May 13, 12:15 pm, Greg Herlihy <gre...@mac.com> wrote:
On May 13, 2:42 am, James Kanze <james.ka...@gmail.com> wrote:
On May 13, 7:16 am, Avi <avner-moshkov...@shaw.ca> wrote:
double f2 = e1 + e2;
The difference is calculated as follows:
double diff = f2-f1;
I expect the difference to be exactly 0 but instead, I get a
tiny value. Could someone explain the reason for differences
in the result?
It's very implementation dependent.
I'm compiling with gcc
More to the point, you're compiling on a machine which uses
extended precision in its floating point unit, and its floating
point registers. (An Intel, perhaps.)
The reason for this freedom is speed. Forcing the results of
each operation to be rounded to double can be done, but would
slow things down considerably (at least on Intel). And for the
most part, increased precision is not considered a defect.
But the extended floating point precision is a defect if - as
in this case - consistent numeric results are required.
By defect, I presume you mean with regards to what is needed,
not according to the standard. I'm not good enough in numerics to
argue one way or the other; intuitively, I would agree with the
specialists.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #7