Bytes IT Community

Trouble with integer floating point conversion

P: n/a
Hi all!

The following piece of code has (for me) completely unexpected behaviour.
(I compile it with gcc-Version 4.0.3)
Something goes wrong with the integer to float conversion.
Maybe somebody out there understands what happens.
Essentially, when I subtract the (double) function value GRID_POINT(2) from
a variable that was assigned the same value beforehand, I get a non-zero
result, and I really do not understand why.

The program prints
5.000000000000000000e-01; -2.775557561562891351e-17;
0.000000000000000000e+00

And the comparison
if(temp1==GRID_POINT(2))
has negative outcome.

When I replace
((double)(i)) by 2.0
everything behaves as expected. So the output is
5.000000000000000000e-01; 0.000000000000000000e+00; 0.000000000000000000e+00

But: even if the integer to float conversion is inexact (which, I think,
should not be the case) something like
temp1 = GRID_POINT(2);
temp3 = temp1-GRID_POINT(2);
should still result in temp3==0.0, whatever function GRID_POINT does.

What do you think?

Thank you!

---------------------------------------------------
#include <stdio.h>

double GRID_POINT(int i);

double GRID_POINT(int i)
{
return ( 0.1 + ( (80.1-0.1)/(400.0) )*((double)(i)) );
}

int main (void) {

double temp1, temp2, temp3;

temp1 = GRID_POINT(2);
temp2 = GRID_POINT(2);
temp3 = temp1-GRID_POINT(2);

printf("%.18e; %.18e; %.18e\n", temp1, temp3, temp1-temp2 );

if(temp1==GRID_POINT(2)){
printf("these two are equal\n");
}

return 0;
}
---------------------------------------------------
Dec 12 '07 #1
39 Replies


P: n/a
I should add that the problem described below does not occur when the
Intel compiler is used.
When compiling with gcc I don't use any optimisation options.
This does not solve my problem, however.

rembremading wrote:
[original question and program quoted in full; snipped]
Dec 12 '07 #2

P: n/a
rembremading wrote:
I should add, that the problem, described below, does not occur, when the
Intel compiler is used.
When compiling with gcc I don't use any optimisation options.
This does not solve my problem, however.
The Intel compiler probably sets the precision of the machine
to 64 bits.

The gcc compiler and other compilers like lcc-win leave the precision
of the machine at 80 bits.

This means that the calculations are done using MORE precision
than double precision, which can lead to different results.

You can set the precision of the machine yourself to 64 bits
within the gcc run time.
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 12 '07 #3

P: n/a
On Dec 12, 9:05 pm, jacob navia <ja...@nospam.com> wrote:
rembremading wrote:
I should add, that the problem, described below, does not occur, when the
Intel compiler is used.
When compiling with gcc I don't use any optimisation options.
This does not solve my problem, however.

The Intel compiler sets probably the precision of the machine
to 64 bits.

The gcc compiler and other compilers like lcc-win set the precision
of the machine at 80 bits.

This means that the calculations are done using MORE precision
than double precision what can lead to different results.

You can set the precision of the machine yourself to 64 bits
within the gcc run time.
Much better to assume only what the Standard guarantees about floating-
point operations, and use third-party libraries like GMP if you need
extra precision.
Dec 12 '07 #4

P: n/a
Fr************@googlemail.com wrote:
On Dec 12, 9:05 pm, jacob navia <ja...@nospam.com> wrote:
>rembremading wrote:
>>I should add, that the problem, described below, does not occur, when the
Intel compiler is used.
When compiling with gcc I don't use any optimisation options.
This does not solve my problem, however.
The Intel compiler sets probably the precision of the machine
to 64 bits.

The gcc compiler and other compilers like lcc-win set the precision
of the machine at 80 bits.

This means that the calculations are done using MORE precision
than double precision what can lead to different results.

You can set the precision of the machine yourself to 64 bits
within the gcc run time.

Much better to assume only what the Standard guarantees about floating-
point operations, and use third-party libraries like GMP if you need
extra precision.
The standard doesn't guarantee *anything* at all.

Myself I *thought* that the standard implied IEEE 754 but people
here pointed me to an obscure sentence that allows an implementation to
get away with that too, and implement some other kind of weird
floating point.

Since the standard doesn't force even IEEE754, there is *nothing*
the C language by itself can guarantee the user.

To come back to the original poster problem.

The machine has two ways of doing floating point operations:
1) 80 bits internal precision
2) 64 bits internal precision.

Microsoft and Intel use (2), gcc and lcc-win use (1).

This means that when values are stored only in floating point
registers and NOT stored into memory, the same computation can
yield *different* results as the user has noticed.

The final fix to this is to set the machine to use 64-bit
precision ONLY. This can be done with a few assembler instructions.
lcc-win provides a function to do this; maybe gcc does too.

Another possibility is to use the floating point environment functions
to read the environment, figure out where the precision bit
is, and then modify that and write it back to the floating
point unit.

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 12 '07 #5

P: n/a
rembremading <rembremad...@gmx.net> wrote:
Hi all!

The following piece of code has (for me) completely
unexpected behaviour.
(I compile it with gcc-Version 4.0.3)
Something goes wrong with the integer to float conversion.
Did you apply -ffloat-store?

Without that (and others), gcc is not a conforming compiler
on many architectures.

--
Peter
Dec 13 '07 #6

P: n/a
Flash Gordon wrote:
jacob navia wrote, On 12/12/07 21:50:
.... snip ...
>
>The final fix to this is to set the machine to use only 64 bits
precision ONLY. This can be done with a few assembler instructions.
lcc-win provides a function to do this, maybe gcc does too.

Of course, changing the mode in which the processor operates behind
the back of the compiler can cause interesting effects. Interesting
as in the ancient Chinese curse, that is.
>Another possibility is to use the floating point environment
functions to read the environment, figure out where the precision
bit is, and then modify that and write it back to the floating
point unit.

Or better yet assume only what is guaranteed or use a library
designed to provide better guarantees if you need them. Or limit
yourself to compilers that guarantee the behaviour you require if
it is acceptable to so limit yourself.
All this silly arguing, and nobody bothers mentioning the
applicable section of the standard. From N869, about float.h:

[#10] The values given in the following list shall be
replaced by implementation-defined constant expressions with
(positive) values that are less than or equal to those
shown:

   -- the difference between 1 and the least value greater
      than 1 that is representable in the given floating
      point type, b^(1-p)

      FLT_EPSILON   1E-5
      DBL_EPSILON   1E-9
      LDBL_EPSILON  1E-9

   -- minimum normalized positive floating-point number,
      b^(emin-1)

      FLT_MIN   1E-37
      DBL_MIN   1E-37
      LDBL_MIN  1E-37
--
Merry Christmas, Happy Hanukah, Happy New Year
Joyeux Noel, Bonne Annee.
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Dec 13 '07 #7

P: n/a
jacob navia <ja***@nospam.com> writes:
Flash Gordon wrote:
[...]
>>Since the standard doesn't force even IEEE754, there is *nothing*
the C language by itself can guarantee the user.
(I think jacob wrote the above; the attribution was snipped.)
>>
Are you seriously trying to claim that the only way to provide any
form of guarantee on floating point is to enforce IEEE754?

If there isn't even an accepted standard that is enforced, how
can you guarantee anything.
[...]
>
How could the standard guarantee ANYTHING about the precision of
floating point calculations when it doesn't even guarantee a common
standard?

Yes I am seriously saying that the absence of an enforced standard
makes any guarantee IMPOSSIBLE.
First of all, the C standard says that accuracy of floating-point
operations is implementation-defined, though the implementation is
allowed to say that the accuracy is unknown. This doesn't refute
jacob's claim, but it does mean that a particular implementation is
allowed to make guarantees that the standard doesn't make.

It's entirely possible for a language standard to make firm guarantees
about floating-point accuracy without mandating adherence to one
specific floating-point representation. For example, the Ada standard
has a detailed parameterized floating-point model with specific
precision requirements; Ada's requirements are satisfied by an IEEE
754 implementation, but can also be (and have been) satisfied by a
number of other floating-point representations. For example, in Ada
1.0 + 1.0 is guaranteed to be exactly equal to 2.0.

The C standard could have followed the same course (and since it was
written several years after the first Ada standard, I'm not entirely
sure why it didn't). But instead, the authors of the C standard chose
to leave floating-point precision issues up to each implementation.

In practice, each C implementation and each Ada implementation usually
just uses whatever floating-point representation and operations are
provided by the underlying hardware. (Exception: some systems
probably use software floating-point, and some others might need to do
some tweaking on top of what the hardware provides.) In most cases,
the precision provided by the hardware is as good as what the language
standard *could* have guaranteed.

Yes, jacob, it's true that the C standard makes no guarantees about
floating-point precision. It does make some guarantees about
floating-point range, which seems to contradict your claim above that
"there is *nothing* the C language by itself can guarantee the user".
(Perhaps in context it was sufficiently clear that you were talking
only about precision; I'm not going to bother to go back up the thread
to check.)

As for *why* the C standard doesn't require IEEE 754, there have been
plenty of computers that support other representations, and it would
have been absurd for the C standard to require IEEE 754 on systems
that don't provide it. (Examples: VAX, older Cray vector systems, and
IBM mainframes each have their own floating-point formats; there are
undoubtedly more examples.) The intent of IEEE 754 is to define a
single universal floating-point standard, and we're headed in that
direction, but we're not there yet -- and we certainly weren't there
when the C89 standard was being written.

[...]
This is an example of how the "regulars" spread nonsense just with the
only objective of "contradicting jacob" as they announced here
yesterday.
Tedious rant ignored.

--
Keith Thompson (The_Other_Keith) <ks***@mib.org>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Dec 13 '07 #8

P: n/a
What would be the other options which make gcc a conforming compiler?

-Andreas

Peter Nilsson wrote:
rembremading <rembremad...@gmx.net> wrote:
>Hi all!

The following piece of code has (for me) completely
unexpected behaviour.
(I compile it with gcc-Version 4.0.3)
Something goes wrong with the integer to float conversion.

Did you apply -ffloat-store?

Without that (and others), gcc is not a conforming compiler
on many architectures.

--
Peter
Dec 13 '07 #9

P: n/a
On Dec 13, 4:31 am, Keith Thompson <ks...@mib.org> wrote:
jacob navia <ja...@nospam.com> writes:
[Keith Thompson's reply quoted in full; snipped]
Yes, jacob, it's true that the C standard makes no guarantees about
floating-point precision. It does make some guarantees about
floating-point range, which seems to contradict your claim above that
"there is *nothing* the C language by itself can guarantee the user".
(Perhaps in context it was sufficiently clear that you were talking
only about precision; I'm not going to bother to go back up the thread
to check.)
You quoted it: "How could the standard guarantee ANYTHING about
the precision of floating point calculations..."
[...]
This is an example of how the "regulars" spread nonsense just with the
only objective of "contradicting jacob" as they announced here
yesterday.

Tedious rant ignored.
So you just demonstrated that JN's paranoia is not *quite* paranoia.
Your post is exactly what he refers to (using strong words like
"nonsense", since JN gets pretty emotional in this grand war).

The world will stop, and all newbies will get confused and die, if you
stop arguing with Jacob, sure. You simply have to do it again and
again. Even when you don't have a point! Oh well.
Dec 13 '07 #10

P: n/a
Peter Nilsson wrote:
rembremading <rembremad...@gmx.net> wrote:
>Hi all!

The following piece of code has (for me) completely
unexpected behaviour.
(I compile it with gcc-Version 4.0.3)
Something goes wrong with the integer to float conversion.

Did you apply -ffloat-store?
I did that on the testcase. It didn't help.

The manual page states for -ffloat-store
" Do not store floating point variables in registers, and inhibit other
options that might change whether a floating point value is taken from
a register or memory."

It doesn't appear to alter how a floating point return value from a
function (presumably in a register) is handled. This is probably covered
by the further comment in the manual page :-

... "a few programs rely on the precise definition of IEEE floating
point. Use -ffloat-store for such programs, _after modifying them to
store all pertinent intermediate computations into variables_."

So to answer the original poster (As best as I understand it) -
This is due to the register holding the return value having greater
precision than the defined precision of "double".

Dec 13 '07 #11

P: n/a
rembremading wrote:
Hi all!

The following piece of code has (for me) completely unexpected behaviour.
(I compile it with gcc-Version 4.0.3)
Something goes wrong with the integer to float conversion.
Maybe somebody out there understands what happens.
Essentially, when I subtract the (double) function value GRID_POINT(2) from
a variable which has been assigned the same value before this gives a
non-zero result and I really do not understand why.
http://www.network-theory.co.uk/docs...cintro_70.html has a
detailed discussion...
Dec 13 '07 #12

P: n/a
ym******@gmail.com wrote:
On Dec 13, 4:31 am, Keith Thompson <ks...@mib.org> wrote:
....
>Yes, jacob, it's true that the C standard makes no guarantees about
floating-point precision. It does make some guarantees about
floating-point range, which seems to contradict your claim above that
"there is *nothing* the C language by itself can guarantee the user".
(Perhaps in context it was sufficiently clear that you were talking
only about precision; I'm not going to bother to go back up the thread
to check.)

You quoted it: "How could the standard guarantee ANYTHING about
the precision of floating point calculations..."
Well, actually, the numerical limits section does specify the precision.
However, getting an inaccurate value with high precision is not
something most numerical analysts would be happy about.
Dec 13 '07 #13

P: n/a
James Kuyper wrote:
ym******@gmail.com wrote:
>On Dec 13, 4:31 am, Keith Thompson <ks...@mib.org> wrote:
...
>>Yes, jacob, it's true that the C standard makes no guarantees about
floating-point precision. It does make some guarantees about
floating-point range, which seems to contradict your claim above that
"there is *nothing* the C language by itself can guarantee the user".
(Perhaps in context it was sufficiently clear that you were talking
only about precision; I'm not going to bother to go back up the thread
to check.)

You quoted it: "How could the standard guarantee ANYTHING about
the precision of floating point calculations..."

Well, actually, the numerical limits section does specify the precision.
However, getting an inaccurate value with high precision is not
something most numerical analysts would be happy about.
It is completely meaningless.

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 13 '07 #14

P: n/a
rembremading wrote:
[snip]

Can you tell me if you use the following program the result is better?

This program sets the precision to 64 bits instead of
80 bits. (If I have no bugs... )

void SetPrecision64(void)
{
void (*fn)(void);
unsigned char code[] = {
0x50, // push %eax
0xd9,0x3c,0x24, // fnstcw (%esp,1)
0x8b,0x04,0x24, // mov (%esp,1),%eax
0x0f,0xba,0x34,0x24,0x08, // btrl $0x8,(%esp,1)
0x66,0x81,0x0c,0x24,0x00,0x02,// orw $0x200,(%esp,1)
0xd9,0x2c,0x24, // fldcw (%esp,1)
0x59, // pop %ecx
0xc3 // ret
};
fn = (void (*)(void))code;
fn();
}

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 13 '07 #15

P: n/a
jacob navia wrote:
James Kuyper wrote:
>ym******@gmail.com wrote:
>>On Dec 13, 4:31 am, Keith Thompson <ks...@mib.org> wrote:
...
>>>Yes, jacob, it's true that the C standard makes no guarantees about
floating-point precision. It does make some guarantees about
floating-point range, which seems to contradict your claim above that
"there is *nothing* the C language by itself can guarantee the user".
(Perhaps in context it was sufficiently clear that you were talking
only about precision; I'm not going to bother to go back up the thread
to check.)

You quoted it: "How could the standard guarantee ANYTHING about
the precision of floating point calculations..."

Well, actually, the numerical limits section does specify the
precision. However, getting an inaccurate value with high precision is
not something most numerical analysts would be happy about.

It is completely meaningless.
3.9975222353 is a very precise but inaccurate answer to the question
"what is 2.0 + 2.0?".

The numerical limits section describes and puts minimum requirements on
the precision; while section 5.2.4.2.2 makes explicit the fact that the
standard imposes no requirements on the accuracy.
Dec 13 '07 #16

P: n/a
rembremading <re**********@gmx.net> writes:
(1) double temp1, temp2;
(2) temp1 = any_function();
(3) temp2 = temp1 - any_function();

can lead to a result temp2 != 0.0 because temp1 is stored in memory and
the result of any_function() in (2) is therefore rounded to the closest 64
bit number, whereas the result of any_function() in (3) is kept in the
register only, such that the difference in (3) can have a finite
outcome.
<obsession type="favourite">
You mean non-zero rather than finite both here and later on.
</obsession>

--
Ben.
Dec 13 '07 #17

P: n/a
In article <fj**********@aioe.org>, ja***@nospam.org writes:
....
Are you seriously trying to claim that the only way to provide any form
of guarantee on floating point is to enforce IEEE754?

If there isn't even an accepted standard that is enforced, how
can you guarantee anything.
Have a look at the Ada standard, where that is precisely the case. The
C standard only goes halfway there: it properly defines a model for
floating-point representation, but gives no guarantee about the
precision within that model, which is something the Ada standard does
do. What happens in Ada systems is that, if the maximal model for the
floating-point hardware does not meet the precision requirements, the
model is reduced.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Dec 13 '07 #18

P: n/a
rembremading wrote:
The following piece of code has (for me) completely unexpected
behaviour.

This is cross-posted to comp.std.c. See the final paragraph for a
standards question. This is the second posting attempt.
[original question and program quoted in full; snipped]
The original code temp3 may be non-zero because of extra precision:

5.2.4.2.2:
The values of operations with floating operands and values subject to the
usual arithmetic conversions and of floating constants are evaluated to a
format whose range and precision may be greater than required by the
type. The use of evaluation formats is characterized by the
implementation-defined value of FLT_EVAL_METHOD.

The standard requires that values be rounded to their specified
precision by storing into a variable of the type or cast to the type
(See 5.1.2.3 footnote 4). Neither of these requirements are in effect
for the use of GRID_POINT(2) within the statement computing temp3.

The compiler is probably passing the evaluated value back in a floating
point register with more precision than a double variable.

Regarding the effect of replacing (double)i by the constant 2, you
permit the compiler to evaluate the expression in GRID_POINT() at
compile time. The compiler probably is then propagating that constant
to the assignments in main. What is apparently different in the
compile-time evaluation is that the value computed in GRID_POINT() gets
rounded to double precision before it is used in main.

Is the compiler rounding of the constant value, compared with the
register passing with extra precision, conforming? 5.1.2.3 footnote 4
says "an implicit spilling of a register is not permitted to alter the
value." Is the compile time evaluation, vs. runtime evaluation, an
implicit spilling of a register? If not, what is an "implicit spilling
of a register"?

--
Thad
Dec 13 '07 #19

P: n/a
jacob navia wrote:
James Kuyper wrote:
>jacob navia wrote:
>>To the subject of precision, the standard says:
<quote>
The accuracy of the floating-point operations (+, -, *, /) and of the
library functions in <math.h> and <complex.h> that return floating-point
results is implementation defined.

The implementation may state that the accuracy is unknown
<end quote>

i.e. IT CONFIRMS what I am saying. There aren't any guarantees at all.

There are no guarantees about the accuracy for implementations that
don't pre-define __STDC_IEC_559__. The Numerical Limits section he
cited does, however, contain a great many guarantees about things
other than the accuracy.

That is fine but we are speaking about the accuracy here
While the original poster's problem was with the accuracy of the
calculations, your very first response was about the precision, not the
accuracy. An 80-bit number can hold an inaccurate number with much
greater precision than an 64-bit number; in itself that doesn't make the
number any more accurate.

The standard does say things about the precision. The values for
LDBL_EPSILON, DBL_EPSILON, and FLT_EPSILON depend very directly on the
precision of floating point operations, and the standard sets a maximum
value for the magnitude of that constant, which implies a minimum number
of bits of precision.

The standard also says things about when it's permissible to use a
higher precision type for intermediate steps in calculations, and it
seems to be precisely that issue which is the actual reason for the
discrepancies that the original poster reported. See 5.2.4.2.2p8. If
FLT_EVAL_METHOD were 0, then he shouldn't have seen those discrepancies.
For any other value of FLT_EVAL_METHOD, such discrepancies are always a
possibility.
Dec 13 '07 #20

P: n/a
jacob navia wrote:
rembremading wrote:
[snip]

Can you tell me if you use the following program the result is better?
http://www.network-theory.co.uk/docs...cintro_70.html
(previously cited) gives a briefer version, which may be specific to
gcc, but that's what the OP has the issue with ....
Dec 13 '07 #21

P: n/a
Yes, your assembler code works fine, although I don't understand what it
does :-)
For my purposes, however, it will be better to choose the constants such
that they are exactly representable, as outlined below.
Every additional function call, at this place in the code, is unwanted
(although an inlined version of this function would be quite fast, I guess).
I would not want to activate the 64 bit computations for the entire program.
(Actually, for the most part of it, calculations with higher precision are
very welcome, as long as they do not affect the speed.)
Nevertheless it is an interesting piece of code, which might be useful in
the future.

jacob navia wrote:
rembremading wrote:
[snip]

Can you tell me if you use the following program the result is better?

This program sets the precision to 64 bits instead of
80 bits. (If I have no bugs... )

void SetPrecision64(void)
{
void (*fn)(void);
unsigned char code[] = {
0x50, // push %eax
0xd9,0x3c,0x24, // fnstcw (%esp,1)
0x8b,0x04,0x24, // mov (%esp,1),%eax
0x0f,0xba,0x34,0x24,0x08, // btrl $0x8,(%esp,1)
0x66,0x81,0x0c,0x24,0x00,0x02,// orw $0x200,(%esp,1)
0xd9,0x2c,0x24, // fldcw (%esp,1)
0x59, // pop %ecx
0xc3 // ret
};
fn = (void (*)(void))code;
fn();
}
Dec 13 '07 #22

P: n/a
rembremading wrote:
[previous post quoted in full; snipped]
You need to call it ONCE when your program STARTS.

Then, all calculations are done in 64 bits only
and not in 80.

That code sets the precision flag of the floating point unit
to 64 bits, i.e. double precision.

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 13 '07 #23

P: n/a
rembremading wrote:
Yes, your assembler code works fine, although I don't understand what it
does :-)
Then read http://www.network-theory.co.uk/docs...cintro_70.html

It answers your questions fairly comprehensively and gives a simple
way of disabling extended precision for the duration of a program's
execution.

The mechanism outlined is the simplest way of "fixing" your problem.
Dec 13 '07 #24

P: n/a
Yes, I understand that I would have to call it only once, but in the main
triply nested for loop of my program I have to do something like

res = ( (1.0+f1)*(1.0+f2)*f3*f4 - f1*f2*(1.0+f3)*(1.0+f4) );

Where f1..f4 are function values (not of the GRID_POINT function), which can
take values from 1.0e-100 to 1.0e2.
In this case, presumably, the intermediate values are stored in the register
and it is obvious that res is highly sensitive to the numerical precision
of f1..f4 and the intermediate results of the computation.
Therefore, if I can get the computation with 80 bit precision instead of 64
bit (at the same cost), I definitely prefer the former.
Although the final result is rounded to 64 bit precision, these extra digits
can change a lot.

Because my system has dimension ~1e3 and the loop is triply nested I cannot
use arbitrary-precision math libraries. I cannot even use long double type
variables on my machine.

Now, the GRID_POINT function has to be evaluated when f1..f4 are computed.
Therefore I would have to switch 64 bit precision on (and off again) in
every single step. That is what I meant.
That code sets the precision flag of the floating point unit
to 64 bits, i.e. double precision.
I understood that, but I do not understand how the function does it
(possibly lack of assembler knowledge)

There might also be a lack of English knowledge, which could be the reason
for misunderstandings. Sorry for that.

-Andreas
jacob navia wrote:
rembremading wrote:
>Yes, your assembler code works fine, although I don't understand what it
does :-)
For my purposes, however, it will be better to choose the constants such
that they are exactly representable, as outlined below.
Every additional function call, at this place in the code, is unwanted.
(although an inlined version of this function would be quite fast, I
guess) I would not want to activate the 64 bit computations for the
entire program. Actually for the most part of it calculation with higher
precision are very wellcome, as long as it does not affect the speed)
Nevertheless it is an interesting piece of code, which might be usefull
in the future.

You need it call it ONCE when your program STARTS.

Then, all calculations are done in 64 bits only
and not in 80.

That code sets the precision flag of the floating point unit
to 64 bits, i.e. double precision.
Dec 13 '07 #25

P: n/a
rembremading wrote:
Yes, I understand that I would have to call it only once, but in the main
triply nested for loop of my program I have to do something like

res = ( (1.0+f1)*(1.0+f2)*f3*f4 - f1*f2*(1.0+f3)*(1.0+f4) );

Where f1..f4 are function values (not of the GRID_POINT function), which can
take values from 1.0e-100 to 1.0e2.
In this case, presumably, the intermediate values are stored in the register
and it is obvious that res is highly sensitive to the numerical precision
of f1..f4 and the intermediate results of the computation.
Therefore, if I can get the computation with 80 bit precision instead of 64
bit (at the same cost), I definitely prefer the former.
Although the final result is rounded to 64 bit precision, these extra digits
can change a lot.

Because my system has dimension ~1e3 and the loop is triply nested I cannot
use arbitrary-precision math libraries. I cannot even use long double type
variables on my machine.

Now, the GRID_POINT function has to be evaluated when f1..f4 are computed.
Therefore I would have to switch 64 bit precision on (and off again) in
every single step. That is what I meant.
>That code sets the precision flag of the floating point unit
to 64 bits, i.e. double precision.

I understood that, but I do not understand how the function does it
(possibly lack of assembler knowledge)

There might also be a lack of English knowledge, which could be the reason
for misunderstandings. Sorry for that.

-Andreas
Your problem is a very complicated one, Andreas. Maybe you will get
better advice in sci.math.num-analysis, a group that is dedicated to this
kind of problem. I have seen discussions about problems of this type there,
and they have a math level that goes WAY beyond what we have here in
comp.lang.c.

I think you have to rearrange the terms in your equation to make them
less sensitive to roundoff errors. I think in sci.math.num-analysis
you will find people that can help you to do that.
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Dec 13 '07 #26

P: n/a
ym******@gmail.com writes:
On Dec 13, 4:31 am, Keith Thompson <ks...@mib.org> wrote:
>jacob navia <ja...@nospam.com> writes:
Flash Gordon wrote:
[...]
>>Since the standard doesn't force even IEEE754, there is *nothing*
the C language by itself can guarantee the user.

(I think jacob wrote the above; the attribution was snipped.)
>Are you seriously trying to claim that the only way to provide any
form of guarantee on floating point is to enforce IEEE754?
If there isn't even an accepted standard that is enforced, how
can you guarantee anything?

[...]
How could the standard guarantee ANYTHING about the precision of
floating point calculations when it doesn't even guarantee a common
standard?
Yes I am seriously saying that the absence of an enforced standard
makes any guarantee IMPOSSIBLE.
[...]
>Yes, jacob, it's true that the C standard makes no guarantees about
floating-point precision. It does make some guarantees about
floating-point range, which seems to contradict your claim above that
"there is *nothing* the C language by itself can guarantee the user".
(Perhaps in context it was sufficiently clear that you were talking
only about precision; I'm not going to bother to go back up the thread
to check.)

You quoted it: "How could the standard guarantee ANYTHING about
the precision of floating point calculations..."
I was referring to jacob's earlier statement, which did not mention
precision:
| Since the standard doesn't force even IEEE754, there is *nothing*
| the C language by itself can guarantee the user.
>[...]
This is an example of how the "regulars" spread nonsense just with the
only objective of "contradicting jacob" as they announced here
yesterday.

Tedious rant ignored.

So you just demonstrated that JN's paranoia is not *quite* paranoia.
Your post is exactly what he refers to (using strong words like
"nonsense", since JN gets pretty emotional in this grand war).

World will stop, and all newbies will get confused and die if you
stop arguing with Jacob, sure. You simply have to do it again and
again. Even when you don't have a point! Oh well.
Look again, I *agreed* with some of what jacob was saying; the
C standard doesn't make any guarantees about the precsion of
floating-point operations.

--
Keith Thompson (The_Other_Keith) <ks***@mib.org>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Dec 13 '07 #27

P: n/a
Keith Thompson <ks***@mib.org> writes:
ym******@gmail.com writes:
>On Dec 13, 4:31 am, Keith Thompson <ks...@mib.org> wrote:
[...]
>>Yes, jacob, it's true that the C standard makes no guarantees about
floating-point precision. It does make some guarantees about
floating-point range, which seems to contradict your claim above that
"there is *nothing* the C language by itself can guarantee the user".
(Perhaps in context it was sufficiently clear that you were talking
only about precision; I'm not going to bother to go back up the thread
to check.)

You quoted it: "How could the standard guarantee ANYTHING about
the precision of floating point calculations..."

I was referring to jacob's earlier statement, which did not mention
precision:
| Since the standard doesn't force even IEEE754, there is *nothing*
| the C language by itself can guarantee the user.
And, to be clear, we should have been referring to accuracy rather
than precision.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Dec 13 '07 #28


P: n/a
In article <fj**********@aioe.org>
Mark Bluemel <ma**********@pobox.com> wrote:
>... read http://www.network-theory.co.uk/docs...cintro_70.html

It answers your questions fairly comprehensively and gives a simple
way of disabling extended precision for the duration of a program's
execution.
It also notes, in passing, that setting the precision field in
the FPU control word does *not* affect the exponent range. This
means that the in-FPU computations are *still* not quite what
one might expect from an IEEE implementation.

I have the same note (with an example) at the
end of <http://web.torek.net/torek/c/numbers.html>.

As someone else noted else-thread, one can get "proper" IEEE floating
point on the x86 using gcc's "-ffloat-store" flag. This does,
however, slow down computation.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Dec 13 '07 #30

P: n/a
Chris Torek wrote:
In article <fj**********@aioe.org>
Mark Bluemel <ma**********@pobox.com> wrote:
>... read http://www.network-theory.co.uk/docs...cintro_70.html

It answers your questions fairly comprehensively and gives a simple
way of disabling extended precision for the duration of a program's
execution.

It also notes, in passing, that setting the precision field in
the FPU control word does *not* affect the exponent range. This
means that the in-FPU computations are *still* not quite what
one might expect from an IEEE implementation.

I have the same note (with an example) at the
end of <http://web.torek.net/torek/c/numbers.html>.

As someone else noted else-thread, one can get "proper" IEEE floating
point on the x86 using gcc's "-ffloat-store" flag. This does,
however, slow down computation.
And still doesn't work for the OP's testcase :-

$ cat gr.c
#include <stdio.h>

double GRID_POINT(int i) {
return ( 0.1 + ( (80.1-0.1)/(400.0) )*((double)(i)) );
}

int main (void) {
double temp1 = GRID_POINT(2);
double temp2 = temp1-GRID_POINT(2);

printf("%.18e; %.18e\n", temp1, temp2);

return 0;
}

$ cc -ffloat-store gr.c -o gr

$ ./gr
5.000000000000000000e-01; -2.775557561562891351e-17

As I commented else-thread, -ffloat-store doesn't appear to alter how
a floating point return value from a function (presumably in a register)
is handled. This is probably covered by the comment in the manual page
:-

... "a few programs rely on the precise definition of IEEE floating
point. Use -ffloat-store for such programs, _after modifying them to
store all pertinent intermediate computations into variables_."
Dec 13 '07 #31

P: n/a
Mark Bluemel <ma**********@pobox.com> writes:
Chris Torek wrote:
<snip>
>As someone else noted else-thread, one can get "proper" IEEE floating
point on the x86 using gcc's "-ffloat-store" flag. This does,
however, slow down computation.

And still doesn't work for the OP's testcase :-

$ cat gr.c
#include <stdio.h>

double GRID_POINT(int i) {
return ( 0.1 + ( (80.1-0.1)/(400.0) )*((double)(i)) );
}

int main (void) {
double temp1 = GRID_POINT(2);
double temp2 = temp1-GRID_POINT(2);

printf("%.18e; %.18e\n", temp1, temp2);

return 0;
}

$ cc -ffloat-store gr.c -o gr

$ ./gr
5.000000000000000000e-01; -2.775557561562891351e-17
A data point. It does here:

$ gcc -ffloat-store gr.c -o gr
$ ./gr
5.000000000000000000e-01; 0.000000000000000000e+00
$ gcc gr.c -o gr
$ ./gr
5.000000000000000000e-01; -2.775557561562891351e-17
$ gcc --version
gcc (GCC) 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)
<snip>

--
Ben.
Dec 13 '07 #32

P: n/a
On Thu, 13 Dec 2007 18:59:35 +0000, Mark Bluemel wrote:
Chris Torek wrote:
>>
As someone else noted else-thread, one can get "proper" IEEE floating
point on the x86 using gcc's "-ffloat-store" flag. This does, however,
slow down computation.

And still doesn't work for the OP's testcase :-

$ cat gr.c
#include <stdio.h>

double GRID_POINT(int i) {
return ( 0.1 + ( (80.1-0.1)/(400.0) )*((double)(i)) );
}

int main (void) {
double temp1 = GRID_POINT(2);
double temp2 = temp1-GRID_POINT(2);

printf("%.18e; %.18e\n", temp1, temp2);

return 0;
}

$ cc -ffloat-store gr.c -o gr

$ ./gr
5.000000000000000000e-01; -2.775557561562891351e-17
This is OT, but the above happens because GRID_POINT(2) in the second call
is an 'unnamed temporary', and is not necessarily flushed to memory by
gcc (even with -ffloat-store).

See http://arxiv.org/abs/cs/0701192 - "The pitfalls of verifying
floating-point computations".

-Alok
Dec 14 '07 #33

P: n/a
>Chris Torek wrote:
>As someone else noted else-thread, one can get "proper" IEEE floating
point on the x86 using gcc's "-ffloat-store" flag. This does,
however, slow down computation.
In article <fj**********@aioe.org>
Mark Bluemel <ma**********@pobox.com> wrote:
>And still doesn't work for the OP's testcase ... [snip code; but it
reappears below, modified slightly]
>As I commented else-thread -ffloat-store doesn't appear to alter how
a floating point return value from a function (presumably in a register)
is handled.
This is allowed by the Standard -- you have to store the result
in a temporary variable, or use a cast, to trim off "extra"
precision.
>This is probably covered by the comment in the manual page :-
... "a few programs rely on the precise definition of IEEE floating
point. Use -ffloat-store for such programs, _after modifying them to
store all pertinent intermediate computations into variables_."
If gcc fails to store the number through memory (on the x86, that
is; this is how -ffloat-store trims the excess precision and
exponents) on a cast, that would be a bug.

On my Linux box here, the bug is not showing up, with the same
compilation options you used, or indeed any others. Then again,
I probably have a different Linux, different version of gcc, and
different libraries. (In particular I think this Linux and/or
libc defaults C programs to double, not extended, precision in
the FPU control word.)

Here is the modified test-case; compile with -DCAST_AWAY_EXCESS to
test the effect of adding a cast.

% cat gr.c
#include <stdio.h>

double GRID_POINT(int i) {
return ( 0.1 + ( (80.1-0.1)/(400.0) )*((double)(i)) );
}

#ifdef USE_CAST
#define CAST_AWAY_EXCESS (double)
#else
#define CAST_AWAY_EXCESS /* nothing */
#endif

int main(void) {
double temp1 = GRID_POINT(2);
double temp2 = temp1 - CAST_AWAY_EXCESS GRID_POINT(2);

printf("%.18e; %.18e\n", temp1, temp2);

return 0;
}
% cc -o gr -ffloat-store gr.c
% ./gr
5.000000000000000000e-01; 0.000000000000000000e+00
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Dec 14 '07 #34

P: n/a
Ben Bacarisse wrote:
Mark Bluemel <ma**********@pobox.com> writes:
>Chris Torek wrote:
<snip>
>>As someone else noted else-thread, one can get "proper" IEEE floating
point on the x86 using gcc's "-ffloat-store" flag. This does,
however, slow down computation.
And still doesn't work for the OP's testcase :-
[Snip]
>
A data point. It does here:

$ gcc -ffloat-store gr.c -o gr
$ ./gr
5.000000000000000000e-01; 0.000000000000000000e+00
$ gcc gr.c -o gr
$ ./gr
5.000000000000000000e-01; -2.775557561562891351e-17
$ gcc --version
gcc (GCC) 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)

Looks like I need a later gcc then :-)
Dec 14 '07 #35

P: n/a
James Kuyper wrote:
ym******@gmail.com wrote:
... snip ...
>
>You quoted it: "How could the standard guarantee ANYTHING about
the precision of floating point calculations..."

Well, actually, the numerical limits section does specify the
precision. However, getting an inaccurate value with high
precision is not something most numerical analysts would be
happy about.
Please describe how you implement infinite precision in a finite
number of bits.

--
Merry Christmas, Happy Hanukah, Happy New Year
Joyeux Noel, Bonne Annee.
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Dec 14 '07 #36

P: n/a
CBFalconer wrote, On 13/12/07 15:05:
James Kuyper wrote:
>ym******@gmail.com wrote:
... snip ...
>>You quoted it: "How could the standard guarantee ANYTHING about
the precision of floating point calculations..."
Well, actually, the numerical limits section does specify the
precision. However, getting an inaccurate value with high
precision is not something most numerical analysts would be
happy about.

Please describe how you implement infinite precision in a finite
number of bits.
That has nothing to do with James's comment. You do not need infinite
precision to get guaranteed accuracy. You do not need infinite precision
to get high precision.
--
Flash Gordon
Dec 14 '07 #37

P: n/a
CBFalconer wrote:
James Kuyper wrote:
ym******@gmail.com wrote:
... snip ...
You quoted it: "How could the standard guarantee ANYTHING about
the precision of floating point calculations..."
Well, actually, the numerical limits section does specify the
precision. However, getting an inaccurate value with high
precision is not something most numerical analysts would be
happy about.

Please describe how you implement infinite precision in a finite
number of bits.
Do you have any particular reason for posing such an absurd challenge?
It's not relevant to any point that I, or anyone else, has made so far
in this discussion. Maybe if you explain more what you're looking for,
I might be able to help.

Dec 14 '07 #38

P: n/a
CBFalconer <cb********@yahoo.com> writes:
James Kuyper wrote:
>ym******@gmail.com wrote:
... snip ...
>>
>>You quoted it: "How could the standard guarantee ANYTHING about
the precision of floating point calculations..."

Well, actually, the numerical limits section does specify the
precision. However, getting an inaccurate value with high
precision is not something most numerical analysts would be
happy about.

Please describe how you implement infinite precision in a finite
number of bits.
I think you misunderstood James's use of the phrase "inaccurate
value".

For example, given a precision of 5 digits after the decimal point,
3.14159 is an accurate value of pi (to within the stated precision);
3.12345 is an inaccurate value of pi. Infinite precision is not
required for what's being discussed.

--
Keith Thompson (The_Other_Keith) <ks***@mib.org>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Dec 15 '07 #39

P: n/a
On Wed, 12 Dec 2007 16:09:12 +0100, rembremading
<re**********@gmx.net> wrote:
>Hi all!

The following piece of code has (for me) completely unexpected behaviour.
(I compile it with gcc-Version 4.0.3)
Something goes wrong with the integer to float conversion.
Maybe somebody out there understands what happens.
Essentially, when I subtract the (double) function value GRID_POINT(2) from
a variable which has been assigned the same value before this gives a
non-zero result and I really do not understand why.

The program prints
5.000000000000000000e-01; -2.775557561562891351e-17;
0.000000000000000000e+00

And the comparison
if(temp1==GRID_POINT(2))
has negative outcome.

When I replace
((double)(i)) by 2.0
everything behaves as expected. So the output is
5.000000000000000000e-01; 0.000000000000000000e+00; 0.000000000000000000e+00

But: even if the integer to float conversion is inexact (which, I think,
should not be the case) something like
temp1 = GRID_POINT(2);
temp3 = temp1-GRID_POINT(2);
should still result in temp3==0.0, whatever function GRID_POINT does.

What do You think?
I think you should output the internal representation of the doubles
in hex so you can see the actual results of your computation, not what
printf thinks it should display.
>
Thank you!

---------------------------------------------------
#include <stdio.h>

double GRID_POINT(int i);

double GRID_POINT(int i)
{
return ( 0.1 + ( (80.1-0.1)/(400.0) )*((double)(i)) );
What purpose do you think the cast serves?

All the parentheses are superfluous except the pair around the
subtraction.

Do you really think 80.1 - .1 is exactly equal to 80.0?

Even if it is on your system, do you think 80.0/400.0 is exactly equal
to 0.2? Can any floating point value be exactly equal to 0.2 on your
system? (It is possible on mine but very few other people here have a
system with decimal floating point; most use binary.)
>}

int main (void) {

double temp1, temp2, temp3;

temp1 = GRID_POINT(2);
temp2 = GRID_POINT(2);
temp3 = temp1-GRID_POINT(2);

printf("%.18e; %.18e; %.18e\n", temp1, temp3, temp1-temp2 );

if(temp1==GRID_POINT(2)){
printf("these two are equal\n");
}

return 0;
}
---------------------------------------------------

Remove del for email
Dec 16 '07 #40
