
# Loss of precision assigning floating point values?

How can I check if assignment of a float to a double (or vice versa)
will result in loss of precision?

Jul 23 '05 #1
BigMan wrote:
How can I check if assignment of a float to a double (or vice versa)
will result in loss of precision?

Assignment of a float to a double is always precise. Assignment of
a double to a float will cause loss of precision if the double has
non-zero bits in the part of its mantissa beyond the size of the
mantissa of the float. There is no platform-independent check that
you could do, it all depends on the representation of the float and
double types.

V
Jul 23 '05 #2
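
For what it's worth, the representation the post above refers to can be queried through std::numeric_limits. The following is a minimal sketch, not a definitive check; the helper names extra_double_digits and both_iec559 are made up for illustration.

```cpp
#include <limits>

// Significand ("mantissa") width of each type, counted in digits of its radix.
constexpr int float_digits  = std::numeric_limits<float>::digits;
constexpr int double_digits = std::numeric_limits<double>::digits;

// Hypothetical helper: how many significand digits a double carries
// beyond a float. A non-zero digit in that tail is what gets lost
// when a double is narrowed to a float.
constexpr int extra_double_digits() {
    return double_digits - float_digits;
}

// Hypothetical helper: true when the implementation reports that both
// types follow IEC 559 (IEEE 754).
constexpr bool both_iec559() {
    return std::numeric_limits<float>::is_iec559
        && std::numeric_limits<double>::is_iec559;
}
```

On an implementation where both_iec559() is true, float has 24 significand bits and double has 53, so extra_double_digits() is 29. Other representations may report different numbers.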

Victor Bazarov wrote:
BigMan wrote:
How can I check if assignment of a float to a double (or vice versa) will result in loss of precision?

Assignment of a float to a double is always precise. Assignment of
a double to a float will cause loss of precision if the double has
non-zero bits in the part of its mantissa beyond the size of the
mantissa of the float. There is no platform-independent check that
you could do, it all depends on the representation of the float and
double types.

V

double d = ...;
float f = (float)d;
bool loss = (d != (double)f);

?

Regards,
Bogdan Sintoma

Jul 23 '05 #3
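
The round-trip comparison posted above can be wrapped in a function so that it works on a temporary and leaves the caller's variables alone. This is only a sketch under a few assumptions: the name converts_to_float_exactly is invented here, NaN and infinities are treated as "no loss" by convention, values outside the float range are reported as lossy without converting them, and the compiler is assumed not to keep the intermediate float in extra precision.

```cpp
#include <limits>

// Hypothetical helper: true when converting d to float and back to
// double gives exactly d, i.e. the narrowing would lose nothing.
constexpr bool converts_to_float_exactly(double d) {
    // NaN never compares equal to itself; treated as "no loss" here.
    if (d != d) return true;

    // Infinities have a float counterpart; treated as "no loss" here.
    constexpr double inf = std::numeric_limits<double>::infinity();
    if (d == inf || d == -inf) return true;

    // Finite values beyond the float range cannot be represented, and
    // converting them is undefined behaviour, so do not even try.
    constexpr double float_max = std::numeric_limits<float>::max();
    if (d > float_max || d < -float_max) return false;

    // The round trip itself, done on a scratch value.
    const float scratch = static_cast<float>(d);
    return static_cast<double>(scratch) == d;
}
```

For example, 0.5 passes the check, while 0.1 and 16777217.0 (2^24 + 1) do not on an IEEE 754 implementation.
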
Bogdan Sintoma wrote:
Victor Bazarov wrote:
BigMan wrote:
How can I check if assignment of a float to a double (or vice versa)
will result in loss of precision?

Assignment of a float to a double is always precise. Assignment of
a double to a float will cause loss of precision if the double has
non-zero bits in the part of its mantissa beyond the size of the
mantissa of the float. There is no platform-independent check that
you could do, it all depends on the representation of the float and
double types.

V

double d = ...;
float f = (float)d;
bool loss = (d != (double)f);

?

I think this measures if the assignment _has_resulted_ in a loss of
precision, not if it _will_ result, don't you agree?

V
Jul 23 '05 #4

Victor Bazarov wrote:
Bogdan Sintoma wrote:
Victor Bazarov wrote:
BigMan wrote:
How can I check if assignment of a float to a double (or vice versa)
will result in loss of precision?

Assignment of a float to a double is always precise. Assignment of
a double to a float will cause loss of precision if the double has
non-zero bits in the part of its mantissa beyond the size of the
mantissa of the float. There is no platform-independent check that
you could do, it all depends on the representation of the float and
double types.

V

double d = ...;
float f = (float)d;
bool loss = (d != (double)f);

?

I think this measures if the assignment _has_resulted_ in a loss of
precision, not if it _will_ result, don't you agree?

Agree :), but those are the implementation details of 'some' function
that checks whether the conversion of a double into a float will result
in loss of precision ;).

Bogdan

Jul 23 '05 #5
On Tue, 12 Apr 2005 09:27:22 -0400 in comp.lang.c++, Victor Bazarov
<v.********@comAcast.net> wrote,
mantissa of the float. There is no platform-independent check that
you could do, it all depends on the representation of the float and
double types.

What then is wrong with the following reasoning? If, after assigning
the double to a float, the float compares equal to the double, then no
precision was lost, otherwise it was.

Of course I would never suggest using float if you can afford double.

Jul 23 '05 #6
David Harmon wrote:
On Tue, 12 Apr 2005 09:27:22 -0400 in comp.lang.c++, Victor Bazarov
<v.********@comAcast.net> wrote,
mantissa of the float. There is no platform-independent check that
you could do, it all depends on the representation of the float and
double types.

What then is wrong with the following reasoning? If, after assigning
the double to a float, the float compares equal to the double, then no
precision was lost, otherwise it was.

Nothing is wrong except that the OP wanted to check if assignment "will
result in loss", not if it "has resulted in loss".

Of course I would never suggest using float if you can afford double.

Yes, but for large quantities of data, like coordinates that are read from
a file and are known to never have more than 5 digits, say, there is no
need to try to "afford double". And often storing only floats saves quite
a significant amount of memory. Intermediate operations on those values
can be performed in doubles, storing resulting values back could mean
losing some precision... There are many ways to reduce the error of math
operations on FP numbers, but they are not the subject of this thread, so
I'll shut up for now.

V
Jul 23 '05 #7
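
The storage pattern described above (keep the bulk data as float, do the intermediate arithmetic in double, narrow only when storing back) might look like the sketch below. The function name mean_of is made up for illustration, and the non-empty input is an assumption enforced with assert.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: mean of float data, accumulated in double.
// Each float -> double widening is exact, so the only rounding caused
// by the narrower type happens once, at the final conversion.
float mean_of(const std::vector<float>& values) {
    assert(!values.empty());  // assumption: caller passes at least one value

    double sum = 0.0;
    for (float v : values) {
        sum += v;  // v is widened to double before the addition
    }

    const double mean = sum / static_cast<double>(values.size());
    return static_cast<float>(mean);  // single narrowing step
}
```

Whether the final narrowing loses anything still depends on the data; the point of the sketch is only that the loss is confined to one place.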
On Tue, 12 Apr 2005 12:24:33 -0400 in comp.lang.c++, Victor Bazarov
<v.********@comAcast.net> wrote,
What then is wrong with the following reasoning? If, after assigning
the double to a float, the float compares equal to the double, then no
precision was lost, otherwise it was.

Nothing is wrong except that the OP wanted to check if assignment "will
result in loss", not if it "has resulted in loss".

If you do it using a scratch variable, the test predicts whether it
will result in loss when you do it with the real thing.

Jul 23 '05 #8
1. Does the C++ standard say that assigning a float to a double never
results in loss of precision? If so, where does it say so?
on FP numbers?

Jul 23 '05 #9
BigMan wrote:

1. Does the C++ standard say that assigning a float to a double never
results in loss of precision? If so, where does it say so?
No. How could it?
Assume: sizeof( double ) == 8, sizeof( float ) == 4
How can one pack 8 bytes into 4 without losing anything?
on FP numbers?

"What Every Computer Scientist Should Know About Floating-Point Arithmetic"
http://docs-pdf.sun.com/800-7895/800-7895.pdf

HTML Version
http://docs.sun.com/source/806-3568/ncg_goldberg.html
Having said that, here is my standard advice:

Until you know what you are doing and have the knowledge to do
it, better forget that there is a data type float. Always
* you know what you are heading at
* you have the knowledge and are willing to fight that beast
* you have a very very very very good reason to use float.

--
Karl Heinz Buchegger
Jul 23 '05 #10
On Wed, 13 Apr 2005 10:28:11 +0200, Karl Heinz Buchegger
BigMan wrote:

1. Does the C++ standard say that assigning a float to a double never
results in loss of precision? If so, where does it say so?

No. How could it.
Assume: sizeof( double ) == 8, sizeof( float ) == 4
How can one pack 8 bytes into 4 without losing anything?

imho, BigMan meant it the other way round: assign a float to a double...
Jul 23 '05 #11
ulrich wrote:

On Wed, 13 Apr 2005 10:28:11 +0200, Karl Heinz Buchegger
BigMan wrote:

1. Does the C++ standard say that assigning a float to a double never
results in loss of precision? If so, where does it say so?

No. How could it.
Assume: sizeof( double ) == 8, sizeof( float ) == 4
How can one pack 8 bytes into 4 without losing anything?

imho, BigMan meant it the other way round: assign a float to a double...

Sorry. Obviously I didn't pay close attention.

--
Karl Heinz Buchegger
Jul 23 '05 #12
BigMan wrote:
1. Does the C++ standard say that assigning a float to a double never
results in loss of precision? If so, where does it say so?
It does not. Actually, it says the opposite: "An rvalue of type float
can be converted to an rvalue of type double. The value is unchanged."
(4.6/1)
on FP numbers?

Any book in line with "C++ for Scientists and Engineers". "The Art of
Computer Programming". What Karl recommended. Other good books, many

V
Jul 23 '05 #13
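
The "value is unchanged" wording quoted above can be spot-checked at compile time. The sketch below is an illustration only; survives_widening is a made-up name, and the three probe values are arbitrary picks, not an exhaustive test.

```cpp
#include <limits>

// Hypothetical helper: true when widening f to double and narrowing it
// back yields the original float, which is what "the value is
// unchanged" implies for the float -> double direction.
constexpr bool survives_widening(float f) {
    return static_cast<float>(static_cast<double>(f)) == f;
}

// A few probes: an inexact decimal, the largest finite float, and the
// smallest positive (denormal) float.
static_assert(survives_widening(0.1f), "0.1f must survive widening");
static_assert(survives_widening(std::numeric_limits<float>::max()),
              "largest float must survive widening");
static_assert(survives_widening(std::numeric_limits<float>::denorm_min()),
              "smallest denormal must survive widening");
```

Note that 0.1f survives even though it is not equal to the decimal 0.1: widening preserves the float's value, it does not recover digits the float never had.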
On 2005-04-13 10:02:23 -0400, Victor Bazarov <v.********@comAcast.net> said:
BigMan wrote:
1. Does the C++ standard say that assigning a float to a double never
results in loss of precision? If so, where does it say so?
It does not.

Really?
Actually, it says the opposite: "An rvalue of type float
can be converted to an rvalue of type double. The value is unchanged."
(4.6/1)

Isn't that exactly what BigMan said? If the value is "unchanged" how
can it be less precise? Wouldn't a loss of precision count as changing
the value?
--
Clark S. Cox, III
cl*******@gmail.com

Jul 23 '05 #14
Clark S. Cox III wrote:
On 2005-04-13 10:02:23 -0400, Victor Bazarov <v.********@comAcast.net>
said:
BigMan wrote:
1. Does the C++ standard say that assigning a float to a double never
results in loss of precision? If so, where does it say so?

It does not.

Really?
Actually, it says the opposite: "An rvalue of type float
can be converted to an rvalue of type double. The value is unchanged."
(4.6/1)

Isn't that exactly what BigMan said? If the value is "unchanged" how can
it be less precise? Wouldn't a loss of precision count as changing the
value?

What do you want from me? So, I didn't read the question very carefully.
He got his answer. The Standard does say the assignment never results in
loss of precision, and I quoted it and gave him the paragraph number.
Jul 23 '05 #15
On 2005-04-13 11:24:19 -0400, Victor Bazarov <v.********@comAcast.net> said:
Clark S. Cox III wrote:
On 2005-04-13 10:02:23 -0400, Victor Bazarov <v.********@comAcast.net> said:
BigMan wrote:

1. Does the C++ standard say that assigning a float to a double never
results in loss of precision? If so, where does it say so?
It does not.

Really?
Actually, it says the opposite: "An rvalue of type float
can be converted to an rvalue of type double. The value is unchanged."
(4.6/1)

Isn't that exactly what BigMan said? If the value is "unchanged" how
can it be less precise? Wouldn't a loss of precision count as changing
the value?

What do you want from me? So, I didn't read the question very carefully.
He got his answer. The Standard does say the assignment never results in
loss of precision, and I quoted it and gave him the paragraph number.

No worries, I thought that I had missed something.

--
Clark S. Cox, III
cl*******@gmail.com

Jul 23 '05 #16
BigMan wrote:
How can I check if assignment of a float to a double (or vice versa)
will result in loss of precision?

You probably mean accuracy.

In the typical implementation,
double precision is more precise than single [float] precision
so conversion from double to float *always* reduces precision
and conversion from float to double *always* increases precision.
Accuracy depends upon the number of significant digits [bits].
If all of the significant digits can be represented accurately
by type float, then accuracy will be maintained in the conversion
from double to float. If conversion from double to float
causes significant digits to be discarded,
the resulting inaccuracy may cause trouble.

You may be concerned about whether the conversion is *exact* or not.
IEEE floating-point arithmetic specifies an inexact exception
(signaled when a result has to be rounded),
and you may be able to trap an inexact conversion.

Consult your man pages for signal

SIGNAL(2)            Linux Programmer’s Manual            SIGNAL(2)

NAME
       signal - ANSI C signal handling

SYNOPSIS
       #include <signal.h>

       typedef void (*sighandler_t)(int);

       sighandler_t signal(int signum, sighandler_t handler);

and look for

SIGFPE

Some C++ compilers have options
for trapping inexact floating-point operations.
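
As an alternative to installing a SIGFPE handler, the inexact status flag can be polled through the C++11 <cfenv> header. The sketch below is hedged accordingly: narrowing_is_inexact is a made-up name, FE_INEXACT is only defined where the implementation supports it, the value is assumed to be within float range, and the values are routed through volatile variables because optimizers do not always honour accesses to the floating-point environment.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: reports whether narrowing d to float raised the
// inexact flag. Assumes d is finite and within the range of float.
// Side effect: the FE_INEXACT flag is cleared and possibly set again.
bool narrowing_is_inexact(double d) {
    volatile double src = d;  // keeps the conversion from being folded away

    const int cleared = std::feclearexcept(FE_INEXACT);
    assert(cleared == 0);     // zero means the flag was cleared
    (void)cleared;

    volatile float dst = static_cast<float>(src);  // the conversion under test
    (void)dst;

    return std::fetestexcept(FE_INEXACT) != 0;
}
```

On a typical IEEE 754 platform this is expected to report false for 0.5 and true for 0.1, but it depends on hardware and compiler support, so the plain round-trip comparison discussed earlier in the thread remains the more portable check.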