Neil Kurzman wrote:

What people are trying to say is that in floating point

    float f;
    f = 1.0 + 1.0;

and somehow f = 1.99999999 (newbies gasp and old-timers say "So?"),
so in floating point 1 + 1 <> 2.

This is how floating point math works in all languages on all compilers.

You have the right general idea, but unfortunately your example is
completely wrong.

There is some range within which floating point numbers can represent
integers precisely (for float, this will typically be +/- 2^24, since a
typical float has a 24-bit significand). Outside this range, there will
be a gap between one integer and the next larger one that can be
represented. As the magnitude of the number grows, so does the size of
the gap from one number to the next, so the total number of significant
digits that can be represented remains roughly constant.

(The number of bits does remain constant, but the binary to decimal
conversion can vary the number of decimal digits the number converts
to, and when you look at it in decimal, you often get a half digit of
precision, i.e. a final digit that's not completely right, but not
completely wrong either. If it's a five, the correct value might really
be four or six, but definitely isn't one or nine.)
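To put numbers on the growing gap, here is a minimal sketch that measures the distance from a float to the next larger one. It assumes IEEE 754 single precision and a positive, finite input below FLT_MAX; the names next_above and gap_above are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Next representable float above x.
   Assumption: IEEE 754 single precision, x positive, finite and below
   FLT_MAX. Under that assumption, adjacent bit patterns are adjacent
   values. */
float next_above(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);  /* reinterpret the bits safely */
    bits += 1;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Size of the gap between x and the next larger float. */
float gap_above(float x)
{
    return next_above(x) - x;
}
```

Under those assumptions, gap_above(1.0f) is 2^-23, gap_above(8388608.0f) is exactly 1, and gap_above(16777216.0f) is exactly 2, which is why odd integers above 2^24 can't be stored in a float.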

Toward the center of the range (e.g. between -1 and 1), you have more
bits available than are necessary to represent the integer part. In
this case, some places after the decimal point represent real data.

As soon as we get to fractions, however, we run into another problem:
we're representing the number as a binary fraction. The only fractions
that can be represented precisely are those whose denominators, in
lowest terms, are powers of 2 (1/2, 3/4, 5/8 and so on). Anything else,
including everyday values like 1/10, turns into a repeating binary
fraction that can't be represented precisely in any finite number of
bits.
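The rule about which fractions terminate in binary can be checked with plain integer arithmetic. This is only a sketch: the function names are invented for the example, the denominator is assumed to be nonzero, and a "true" answer only means the binary expansion is finite, not that it necessarily fits in the bits a float or double has.

```c
#include <stdbool.h>

/* Greatest common divisor, used to reduce the fraction to lowest terms. */
unsigned long gcd_ul(unsigned long a, unsigned long b)
{
    while (b != 0) {
        unsigned long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* True if num/den has a finite binary expansion, i.e. the denominator
   in lowest terms is a power of two. Assumption: den != 0. */
bool has_finite_binary_expansion(unsigned long num, unsigned long den)
{
    den /= gcd_ul(num, den);        /* reduce to lowest terms */
    return (den & (den - 1)) == 0;  /* power-of-two test */
}
```

So 3/4 and 5/10 (which reduces to 1/2) pass, while 1/10 and 1/3 do not.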

So, 1.0f + 1.0f will always equal exactly 2.0f, but if you continue
adding 1.0f often enough, you'll reach a point at which it no longer
changes the value of the result AT ALL! For example, a loop like this:

    float current = 1.0f, previous = 0.0f;

    while (current != previous) {
        previous = current;
        current += 1.0f;
    }

will exit in well under a second on a typical machine, with current
stuck at 2^24 (16777216) if float has the usual 24-bit significand.
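For anyone who wants to see where that loop stops rather than take it on faith, here is a sketch that wraps it in a function and returns the final value. The name stall_point is made up, and the volatile qualifier is my addition: it is there to force every sum to be stored as a real float, in case the compiler would otherwise keep it in a wider register.

```c
/* Runs the loop from the post and returns the value at which adding
   1.0f stops having any effect.
   Assumption: float is IEEE 754 single precision with the default
   round-to-nearest-even mode. */
float stall_point(void)
{
    volatile float current = 1.0f;  /* volatile: store each sum as a float */
    float previous = 0.0f;

    while (current != previous) {
        previous = current;
        current += 1.0f;
    }
    return current;
}
```

Under those assumptions the function returns 16777216.0f: 16777217 is not representable, so the sum rounds back down to the value it started from.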

With doubles you get the same basic characteristics, though with a lot
more precision, including a much larger range within which integers can
all be represented precisely (typically up to 2^53).
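Looping all the way to the double limit would take far too long, but the threshold can be tested directly. A small sketch, assuming IEEE 754 float and double; the function names are invented, and volatile is again used to keep the sum from living in a wider register.

```c
#include <stdbool.h>

/* True if adding 1 to x leaves a float unchanged. */
bool float_absorbs_one(float x)
{
    volatile float sum = x + 1.0f;  /* force rounding to float */
    return sum == x;
}

/* True if adding 1 to x leaves a double unchanged. */
bool double_absorbs_one(double x)
{
    volatile double sum = x + 1.0;  /* force rounding to double */
    return sum == x;
}
```

With those assumptions, float_absorbs_one(16777216.0f) is true while double_absorbs_one(16777216.0) is false; a double doesn't start absorbing the 1 until 9007199254740992.0, which is 2^53.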

One other minor detail: by default, C and C++ output functions will
print either a float or a double to the same precision, which is 6
digits. More or less by chance, this happens to be about the limit a
typical float can represent, so almost ANY sort of error in a float
will show up immediately.

As mentioned above, a double typically has quite a bit more precision,
around 15 digits in fact. This means that if you print out with the
default precision, a double will look like it's giving a precisely
correct result until or unless your result has gotten far enough off to
destroy the precision of well over half of its digits.
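Here is a rough way to see the default precision hiding an error. The helper below is hypothetical (not a library function); it formats two values with a plain %g, which uses 6 significant digits, and reports whether they differ in value but look the same in print.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if value and reference are different numbers that nevertheless
   print identically with printf's default %g precision. */
bool error_hidden_by_default_format(double value, double reference)
{
    char a[64], b[64];

    snprintf(a, sizeof a, "%g", value);
    snprintf(b, sizeof b, "%g", reference);
    return value != reference && strcmp(a, b) == 0;
}
```

An error of 1e-12 in 0.1 is invisible this way (both print as 0.1), while an error in the sixth digit, such as 0.100001 against 0.1, shows up at once. Printing with something like %.17g is the usual way to see what a double really holds.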

--

Later,

Jerry.

The universe is a figment of its own imagination.