
Floating Point Precision

 P: n/a Hi, I have a question about floating point precision in C. What is the minimum distinguishable difference between 2 floating point numbers? Does this differ for various computers? Is this the EPSILON? I know that in float.h, FLT_EPSILON is defined to be 10^-5. Does this mean that the computer cannot distinguish between 2 numbers that differ by less than this epsilon? A problem I am seeing is a difference in values from a floating point computation for a run on a Windows machine compared to a run on a Linux machine. The values differ by 10^-6. Thanks for any help, Michael Nov 15 '05 #1
15 Replies

 P: n/a mi*************@gmail.com wrote: Hi, I have a question about floating point precision in C. What is the minimum distinguishable difference between 2 floating point numbers? Does this differ for various computers? The only reason it would be the same is that most computers support IEEE 754, at least to this extent. This is already Off Topic for c.l.c. Is this the EPSILON? I know in float.h a FLT_EPSILON is defined to be 10^-5. Does this mean that the computer cannot distinguish between 2 numbers that differ by less than this epsilon? FLT_EPSILON is the positive difference between 1.0f and the next higher representable number of the float data type. I would be disappointed in any C textbook which did not explain this. A problem I am seeing is a difference in values from a floating point computation for a run on a Windows machine compared to a run on a Linux machine. The values differ by 10^-6. You could expect such differences, in float data, even between two versions of the same compiler, or between different optimization or code generation options of the same compiler, even on the same OS. If you want these differences to be smaller, use the double data type. Check what the FAQ says on this subject. Nov 15 '05 #2

 P: n/a mi*************@gmail.com wrote: Hi, I have a question about floating point precision in C. What is the minimum distinguishable difference between 2 floating point numbers? Does this differ for various computers? Is this the EPSILON? I know in float.h a FLT_EPSILON is defined to be 10^-5. Does this mean that the computer cannot distinguish between 2 numbers that differ by less than this epsilon? A problem I am seeing is a difference in values from a floating point computation for a run on a Windows machine compared to a run on a Linux machine. The values differ by 10^-6. I suggest you have a look at and especially the reference from there to the Goldberg paper, to understand how floating point numbers work. Some notes: The "next bigger" or "next smaller" number from a given floating point number p is not always at the same distance but depends on p. EPSILON is the smallest number eps such that 1+eps != 1, so 1+eps is the next number after 1. If the base of the floating point types is b, then the next number after b is b*(1+eps) and not b+eps. Right in the same vein, there are numbers which cannot be represented by floating point numbers (e.g. the numbers between 1 and 1+eps), so errors are introduced and propagated throughout your computations; there are rounding errors as well, so you basically need a little bit of numerical analysis to know that your results are still reasonably accurate. Equality is not a relation you should rely on. Even working with relative errors can give you a headache when working with sets of potentially equal values. C does not guarantee much about floating point numbers. The few guarantees you do have are mainly the limits given in <float.h> -- everything else depends on your implementation (which often comprises platform, operating system, and compiler). On a related note: The "natural" floating point type in C is double. Use float only if you have severe memory problems or can _prove_ that the accuracy is sufficient for your purposes. 
(Hardly surprising, it often is not.) Cheers Michael -- E-Mail: Mine is an /at/ gmx /dot/ de address. Nov 15 '05 #3

 P: n/a In article <11**********************@o13g2000cwo.googlegroups.com>, wrote: I have a question about floating point precision in C. What is the minimum distinguishable difference between 2 floating point numbers? You could probably work it out in terms of FLT_RADIX and FLT_MANT_DIG but you have the small problem that the value you are describing is not representable as a normalized number -- you might have the case where A > B and yet A - B is not representable in normalized form. Does this differ for various computers? Yes, definitely. Is this the EPSILON? I know in float.h a FLT_EPSILON is defined to be 10^-5. Does this mean that the computer cannot distinguish between 2 numbers that differ by less than this epsilon? No, FLT_EPSILON is such that 1.0 is distinguishable from 1.0 + FLT_EPSILON. A problem I am seeing is a difference in values from a floating point computation for a run on a Windows machine compared to a run on a Linux machine. The values differ by 10^-6. The absolute value doesn't tell us much -- if you are working with values in the range 1E50 then 1E-6 is minuscule, but if you are working with values in the range 1E-30 then 1E-6 is huge. There are a lot of different reasons why computations come out differently on different computers -- too many to list them all in one message. As an example: on Pentiums, the native double precision size is 80 bits but the C double precision size is 64 bits. If some steps of the calculations are carried out at 80 bits, you can end up with different results. There are sometimes compiler options that control whether native-size register-to-register calculations are allowed, or whether the machine must round to the storable precision at each stage. But that's far from the only reason. -- Entropy is the logarithm of probability -- Boltzmann Nov 15 '05 #4

 P: n/a Have a look at ftp://ftp.quitt.net/Outgoing/goldbergFollowup.pdf -- #include _ Kevin D Quitt USA 91387-4454 96.37% of all statistics are made up Nov 15 '05 #5

 P: n/a Michael Mair wrote: EPSILON is the smallest number eps such that 1+eps != 1, so 1+eps is the next number after 1. If the base of the floating point types is b, then the next number after b is b*(1+eps) and not b+eps. The epsilon value should be the difference between 1 and the next representable number after 1. But consider the value x, defined as three quarters of the epsilon value. float x = 0.75 * FLT_EPSILON; now, your condition (1 + x) != 1 is very likely to be true. The result of the addition on the left hand side is not representable, but it should round to the closest representable value, which is the next value after 1, even though x is less than FLT_EPSILON. Perhaps the following condition is better? eps > 0 && (1 + eps) - 1 == eps Here's what I get on my computer: FLT_EPSILON is 0.000000119209289550781250000000 x is 0.000000089406967163085937500000 1 + x is 1.000000119209289550781250000000 (1 + x) - 1 is 0.000000119209289550781250000000 x is 3/4 of FLT_EPSILON. Adding x to 1 rounds up to 1 + FLT_EPSILON. Taking 1 back off again leaves the true FLT_EPSILON, not the 3/4 of it that we started with. -- Simon. Nov 15 '05 #6

 P: n/a mi*************@gmail.com wrote On 09/09/05 14:02,: Hi, I have a question about floating point precision in C. What is the minimum distinguishable difference between 2 floating point numbers? Does this differ for various computers? The smallest discernible difference depends on the magnitude of the numbers: the computer can surely distinguish 1.0 from 1.1, but might not be able to tell 100000000000000.0 from 100000000000000.1 even though the two pairs of values differ (mathematically speaking) by the same amount. You've got to be concerned with relative error, not with absolute error. And yes: The relative error (loosely speaking, the precision) will differ from one machine to another. Is this the EPSILON? I know in float.h a FLT_EPSILON is defined to be 10^-5. Does this mean that the computer cannot distinguish between 2 numbers that differ by less than this epsilon? First, I think you've misunderstood what FLT_EPSILON is. It is the difference between 1.0f and the "next" float value, the smallest float larger than 1.0f that is distinguishable from 1.0f. That is, FLT_EPSILON is one way of describing the precision of float values on the system at hand. Note that although 1.0f is distinguishable from 1.0f+FLT_EPSILON, 1000000.0f need not be distinguishable from 1000000.0f+FLT_EPSILON. Second, FLT_EPSILON is not necessarily 1E-5: the C Standard requires that it be no greater than 1E-5, but permits lower values (greater precision) for machines that support them. A problem I am seeing is a difference in values from a floating point computation for a run on a Windows machine compared to a run on a Linux machine. The values differ by 10^-6. Without knowing what the values are, there's no way to tell whether a difference of 1E-6 is huge or tiny. If the values are supposed to be Planck's Constant (~6.6E-34), 1E-6 represents an enormous error. If they're supposed to be Avogadro's Number (~6.0E23) the difference is completely insignificant. 
For the purposes of argument, let's say the values are in the vicinity of 1. Then a difference of 1E-6 in float arithmetic on a machine where FLT_EPSILON is 1E-5 is nothing to worry about; you've already done better than you had any right to expect. Beyond that, we get into the analysis of the origins and propagation of errors, a field known as "Numerical Analysis." The topic is simple at first but deceptively so, because it fairly rapidly becomes the stuff of PhD theses. A widely-available paper called (IIRC) "What Every Computer Scientist Should Know about Floating-Point Arithmetic" would be worth your while to read. -- Er*********@sun.com Nov 15 '05 #7

 P: n/a Simon Biber wrote: Michael Mair wrote: EPSILON is the smallest number eps such that 1+eps != 1, so 1+eps is the next number after 1. If the base of the floating point types is b, then the next number after b is b*(1+eps) and not b+eps. The epsilon value should be the difference between 1 and the next representable number after 1. Yep, I was imprecise. Let M be the set of all floating point numbers representable by the respective floating point type; then EPSILON = min {eps \in M | eps > 0 and 1+eps != 1}. But consider the value x, defined as three quarters of the epsilon value. float x = 0.75 * FLT_EPSILON; now, your condition (1 + x) != 1 is very likely to be true. The result of the addition on the left hand side is not representable, but it should round to the closest representable value, which is the next value after 1, even though x is less than FLT_EPSILON. This is a question of the rounding mode. Perhaps the following condition is better? eps > 0 && (1 + eps) - 1 == eps Yes, indeed. My mistake was that I had the classical eps = 1.0S; while ((T) (1.0S+eps/FLT_RADIX) != 1.0S) eps /= FLT_RADIX; in mind (where S is the appropriate type suffix or nothing for type T; the cast can be necessary for avoiding excess precision ->FLT_EVAL_METHOD. The usual caveats for gcc and FP arithmetics on x86 and similar apply, though.) Here's what I get on my computer: FLT_EPSILON is 0.000000119209289550781250000000 x is 0.000000089406967163085937500000 1 + x is 1.000000119209289550781250000000 (1 + x) - 1 is 0.000000119209289550781250000000 x is 3/4 of FLT_EPSILON. Adding x to 1 rounds up to 1 + FLT_EPSILON. Taking 1 back off again leaves the true FLT_EPSILON, not the 3/4 of it that we started with. See above. Round to zero is possible (->FLT_ROUNDS). Cheers Michael -- E-Mail: Mine is an /at/ gmx /dot/ de address. Nov 15 '05 #8

 P: n/a Michael Mair wrote: Simon Biber wrote: Michael Mair wrote: EPSILON is the smallest number eps such that 1+eps != 1, so 1+eps is the next number after 1. If the base of the floating point types is b, then the next number after b is b*(1+eps) and not b+eps. The epsilon value should be the difference between 1 and the next representable number after 1. Yep, I was imprecise. Let M be the set of all floating point numbers representable by the respective floating point type; then EPSILON = min {eps \in M | eps > 0 and 1+eps != 1}. That still suffers from the rounding mode issue. There are many possible values of eps that are members of M, are greater than zero but less than the true epsilon value, and when added to one may round up to a value that is not equal to 1. Unless you mean the + operator to be an abstract mathematical thing that can return any real number, rather than the one that must operate within the given floating-point type. You need to make clear whether: +: M X M -> M (+ is of a type that maps a pair of M to a single M) or: +: M X M -> R (+ is of a type that maps a pair of M to a real) -- Simon. Nov 15 '05 #9

 P: n/a mi*************@gmail.com wrote on 09/09/05 : What is the minimum distinguishable difference between 2 floating point numbers? Does this differ for various computers? float : FLT_EPSILON (<float.h>) double : DBL_EPSILON (<float.h>) [C99] long double : LDBL_EPSILON (<float.h>) Is this the EPSILON? Yup. I know in float.h a FLT_EPSILON is defined to be 10^-5. On /this/ implementation. Does this mean that the computer cannot distinguish between 2 numbers that differ by less than this epsilon? On this computer, I dunno. On /this/ implementation of the C language, yes. A problem I am seeing is a difference in values from a floating point computation for a run on a Windows machine compared to a run on a Linux machine. The values differ by 10^-6. Could be. In general terms, floating point representation is (nearly) always an approximation. Use 'double' for better precision. (C99 supports long double). -- Emmanuel The C-FAQ: http://www.eskimo.com/~scs/C-faq/faq.html The C-library: http://www.dinkumware.com/refxc.html "C is a sharp tool" Nov 15 '05 #10

 P: n/a On Sat, 10 Sep 2005 18:17:30 +0200, "Emmanuel Delahaye" wrote: mi*************@gmail.com wrote on 09/09/05 : What is the minimum distinguishable difference between 2 floating point numbers? Does this differ for various computers? float : FLT_EPSILON (<float.h>) double : DBL_EPSILON (<float.h>) [C99] long double : LDBL_EPSILON (<float.h>) Is this the EPSILON? Yup. I know in float.h a FLT_EPSILON is defined to be 10^-5. On /this/ implementation. Does this mean that the computer cannot distinguish between 2 numbers that differ by less than this epsilon? On this computer, I dunno. On /this/ implementation of the C language, yes. On this implementation and only for numbers between 1 and 1+epsilon. It will more than likely distinguish between 10^-9 and 10^-8, two numbers which differ by something much less than epsilon. A problem I am seeing is a difference in values from a floating point computation for a run on a Windows machine compared to a run on a Linux machine. The values differ by 10^-6. Could be. In general terms, floating point representation is (nearly) always an approximation. Use 'double' for better precision. (C99 supports long double). Nov 15 '05 #11

 P: n/a Simon Biber wrote: Michael Mair wrote: Simon Biber wrote: Michael Mair wrote: EPSILON is the smallest number eps such that 1+eps != 1, so 1+eps is the next number after 1. If the base of the floating point types is b, then the next number after b is b*(1+eps) and not b+eps. The epsilon value should be the difference between 1 and the next representable number after 1. Yep, I was imprecise. Let M be the set of all floating point numbers representable by the respective floating point type; then EPSILON = min {eps \in M | eps > 0 and 1+eps != 1}. That still suffers from the rounding mode issue. There are many possible values of eps that are members of M, are greater than zero but ess than the true epsilon value, and when added to one may round up to a value that is not equal to 1. Gah. I wanted 1+eps \in M -- I should post at times when I am really awake :-/ Unless you mean the + operator to be an abstract mathematical thing that can return any real number, rather than the one that must operate within the given floating-point type. You need to make clear whether: +: M X M -> M (+ is of a type that maps a pair of M to a single M) or: +: M X M -> R (+ is of a type that maps a pair of M to a real) I meant the latter but worked as if I had the former :-( Cheers Michael -- E-Mail: Mine is an /at/ gmx /dot/ de address. Nov 15 '05 #12

 P: n/a Emmanuel Delahaye wrote: (C99 supports long double). C89 does too. -- pete Nov 15 '05 #13

 P: n/a wrote in message news:11**********************@o13g2000cwo.googlegroups.com... Hi, I have a question about floating point precision in C. What is the minimum distinguishable difference between 2 floating point numbers? Does this differ for various computers? Is this the EPSILON? I know in float.h a FLT_EPSILON is defined to be 10^-5. Does this mean that the computer cannot distinguish between 2 numbers that differ by less than this epsilon? A problem I am seeing is a difference in values from a floating point computation for a run on a Windows machine compared to a run on a Linux machine. The values differ by 10^-6. Thanks for any help, Michael This is a long post and I know there will be "comments" :) People have lots of issues with the way Microsoft handles floating point numbers on Windows systems. IMHO it sucks. It seems that hacks left in from Intel's Pentium FPU problems may account for some of the weirdness. So now for the long part of this post. Here is a program that attempts to find the number of "real" bits in the floating point support by using only standard C functionality. Well, you know that's a lie about "standard C" whenever Microsoft and Windows are involved.

/* file: flt_precision.c

   Find number of significant bits in the floating point fraction.

   Sample output for Microsoft VC6:
     float size 4
     double size 8
     long double size 8
     Max delta for float 16777215, bits 24
     Max delta for double 9007199254740991, bits 53
     Max delta for long double 9007199254740991, bits 53

   Sample output for gcc 2.95.3:
     float size 4
     double size 8
     long double size 12
     Max delta for float 16777215, bits 24
     Max delta for double 9007199254740991, bits 53
     Max delta for long double 18446744073709551615, bits 64

   Notes: The EPSILON value for each float data type should be in your
   float.h standard library. You should check to make sure that your
   implementation matches your library. The Microsoft compiler for Windows
   does not support long double at greater resolution than double.
*/

#include <stdio.h>
#include <float.h>

static void printFloat(float *f, int bits)
{
    printf("Max delta for float %.0f, bits %d\n", *f, bits);
}

static void printDouble(double *d, int bits)
{
    printf("Max delta for double %.f, bits %d\n", *d, bits);
}

/* This code will take some explaining.
   1) printf does not deal with long floats accurately.
      The solution to number one is to store the floating point fraction
      as an unsigned integer.
   2) Microsoft does not support long long data types.
      The solution to number two is to use a Microsoft non-portable
      data type. */
static void printLongDouble(long double *ld, int bits)
{
/* #define MICROSOFT_STUPID_C */
#ifdef MICROSOFT_STUPID_C
    unsigned _int64 delta;
    delta = (unsigned _int64)(*ld);
    printf("Max delta for long double %I64u, bits %d\n", delta, bits);
#else
    unsigned long long delta;
    delta = (unsigned long long)(*ld);
    printf("Max delta for long double %llu, bits %d\n", delta, bits);
#endif
}

int main(int argc, char *argv[])
{
    float f, f2, fp1, fd2;
    double d, d2, dp1, dd2;
    long double ld, ld2, ldp1, ldd2;
    int bits;

    printf("float size %d\n", (int)sizeof(f));
    printf("double size %d\n", (int)sizeof(d));
    printf("long double size %d\n", (int)sizeof(ld));

    f = 1.0; f2 = 2.0; fp1 = 1.0; fd2 = 1.0; bits = 0;
    do {
        bits++;
        fd2 = fd2 / f2;
        fp1 = f + fd2;
    } while (f != fp1);
    f = ((f / (fd2 * f2) - f) * f2) + f;
    printFloat(&f, bits);

    d = 1.0; d2 = 2.0; dp1 = 1.0; dd2 = 1.0; bits = 0;
    do {
        bits++;
        dd2 = dd2 / d2;
        dp1 = d + dd2;
    } while (d != dp1);
    d = ((d / (dd2 * d2) - d) * d2) + d;
    printDouble(&d, bits);

    ld = 1.0; ld2 = 2.0; ldp1 = 1.0; ldd2 = 1.0; bits = 0;
    do {
        bits++;
        ldd2 = ldd2 / ld2;
        ldp1 = ld + ldd2;
    } while (ld != ldp1);
    ld = ((ld / (ldd2 * ld2) - ld) * ld2) + ld;
    printLongDouble(&ld, bits);

    return 0;
}

Nov 15 '05 #14

 P: n/a "Keyser Soze" wrote in message news:k%*****************@newssvr21.news.prodigy.com... People have lots of issues with the way Microsoft handles floating point numbers on Windows systems. IMHO it sucks. Emphasis on the H, I assume. You haven't shown anything wrong with it. It seems that hacks left in from Intel's Pentium FPU problems may account for some of the weirdness. You haven't shown any "hacks" or "weirdness". So now for the long part of this post. Here is a program that attempts to find the number of "real" bits in the floating point support by using only standard C functionality. Well, you know that's a lie about "standard C" whenever Microsoft and Windows are involved. You mean that *you* are lying about your code being Standard C, because you don't know how to solve this particular problem in Standard C under Windows? I've worked with Microsoft C for nearly 20 years now, particularly in the area of conformance. IME, it conforms very well, certainly for the past decade. They chose a while back to give long double the same representation as double, for better consistency across multiple architectures as I understand it. While this sacrifices some possible functionality on Intel architectures, it *is* conforming. So what's your point? P.J. Plauger Dinkumware, Ltd. http://www.dinkumware.com Nov 15 '05 #15

 P: n/a In article "Keyser Soze" writes: .... f = 1.0; f2 = 2.0; fp1 = 1.0; fd2 = 1.0; bits = 0; do { bits++; fd2 = fd2 / f2; fp1 = f + fd2; } while (f != fp1); When the rounding mode is round to +inf this only terminates when fd2 did underflow, and so equals 0. And if underflow only gives the smallest positive floating point number, it will not terminate. f = ((f / (fd2 * f2) - f) * f2) + f; Divide by zero exception. As I remember from something similar I wrote a long, long time ago, the stopping criterion should be while (fp1 - f != fd2), and I think with that criterion your next calculation can be simplified. But be afraid of pre-adjusting processors that truncate during the pre-adjust. -- dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131 home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/ Nov 15 '05 #16
