An almost-infinite number of numbers can be represented as a finite binary sequence. For sequences of length N there are 2^N possibilities.

That was the starting point of my research. At the time I wanted nothing but to write a simple rounding procedure, with the limitations imposed by computer math spelled out in advance, like "here it will be pretty exact, here it will not, and this is why". Later I noticed - or thought I noticed - a semi-mystical correlation of these limits with Nalimov's semantic spectra when ported onto computer programs, so the problem acquired some linguistic - and thus personal - interest. That is not showing off, I just wanted to explain why I'm spending more time on this than was originally planned - with the manual rounding question itself shifted to the periphery.

The answer to "What decimal float can be stored exactly in IEEE-754 form?" turned out to be surprisingly hard to find for a non-mathematician. The majority of online resources simply state that "Not all numbers can be stored exactly in IEEE-754. For instance N cannot." I gathered a whole collection of these "for instance"s from all around and nearly went berserk over it. By pure chance, while checking for the exact English math terms, I found the definition of the dyadic fraction (a.k.a. dyadic rational)

<http://en.wikipedia.org/wiki/Dyadic_fraction>

and my question got answered for the Turing machine, thus - besides

everything else - for unlimited internal storage space:

Any decimal number representable as a dyadic rational is also representable as a finite binary sequence, and thus can be stored exactly in binary form.

Including the definition of dyadic rational into the theorem itself,

the "1st VK's theorem" would be:

---

Any decimal number can be stored exactly in binary form if and only if the decimal number in question is representable as the rational X/2^Y (X divided by 2 to the power of Y), where X is an integer and Y is a natural number including 0.

---

This way the reduced-form vulgar fraction can always tell whether the number is representable exactly in binary form:

92 = 92/1 = 92/2^0 <= dyadic fraction, so can be stored exactly

9.2 = 9/1 + 2/10 = 90/10 + 2/10 = 92/10 = 46/5 <= non-dyadic, so

cannot be stored exactly
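The two checks above can be sketched in script form. The helper names below are my own invention, not from any library; the idea is simply to reduce the fraction and see whether the denominator is a power of two.

```javascript
// Check whether numerator/denominator is a dyadic rational,
// i.e. whether the reduced denominator is a power of two.
function gcd(a, b) { return b === 0 ? a : gcd(b, a % b); }

function isDyadic(numerator, denominator) {
  var d = denominator / gcd(numerator, denominator); // reduce the fraction
  while (d % 2 === 0) d /= 2;                        // strip factors of two
  return d === 1;                                    // power of two left nothing else
}

console.log(isDyadic(92, 1));   // true:  92 = 92/2^0, stored exactly
console.log(isDyadic(92, 10));  // false: 9.2 = 46/5, cannot be stored exactly
```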

As the joke with the "1st VK's theorem" is getting a bit old, we can call it something more serious from now on, for instance "the dyadic subset theorem".

It is interesting to point out that the theorem applies to any binary storage medium, so it doesn't depend on the exact standard or size: IEEE-754 single, double or n-byte is irrelevant.

This way, for any set N of real numbers expressed in decimal numeral form, the subset of numbers stored exactly in binary form consists of the numbers R for which the equation R = X/2^Y can be solved.

Math-savvy readers may try to find a formula for the subset size relative to the whole set size. Empirically it is possible to say that binary storage is a rather unfortunate medium for decimal input, so most of the time any engine will be losing/factoring/restoring and overall playing with the precision.

> An IEEE Double can have two signs, 2^52 mantissas, and almost 2^11
> ordinary exponents; or it can be NaN or +Infinity or -Infinity. It can
> be +0 or -0, but I forget where they fit into that scheme.

IEEE-754 is rather tricky in this aspect, and soon comes the "2nd VK's theorem" :-)

But before that some grounds:

An IEEE-754-DP-FP (double-precision floating-point) number takes 64 bits of storage space, with the msb for the sign, 11 bits for the exponent and 52 bits for the mantissa: 1 + 11 + 52 = 64
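The 1 + 11 + 52 split can be inspected directly from script. The helper below is my own sketch (the field names are descriptive, not from any standard); it relies on the standard DataView object to reinterpret the 8 bytes of a double.

```javascript
// Split an IEEE-754 double into its sign, exponent and mantissa fields.
function decodeDouble(x) {
  var view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x);                // big-endian byte order by default
  var hi = view.getUint32(0);           // sign + exponent + top mantissa bits
  return {
    sign: hi >>> 31,                    // 1 bit
    exponent: (hi >>> 20) & 0x7FF,      // 11 bits, stored in biased form
    mantissaHigh: hi & 0xFFFFF,         // top 20 of the 52 mantissa bits
    mantissaLow: view.getUint32(4)      // remaining 32 mantissa bits
  };
}

var d = decodeDouble(1.0);
console.log(d.sign);                        // 0
console.log(d.exponent);                    // 1023 (the bias, see Trick 1)
console.log(d.mantissaHigh, d.mantissaLow); // 0 0  (the leading 1 is hidden)
```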

Trick 1:

Because the exponent part must be able to represent both positive and negative exponents, it is not stored directly but as the result of the addition ActualValue + 1023 (biased by 1023). This way, if the IEEE-754-DP-FP form has the exponent part 1023, then the actual exponent is 1023 - 1023 = 0, and if the exponent part is 1, then the actual exponent is 1 - 1023 = -1022.

Trick 2:

The mantissa is presumed to be in normalized form, i.e. with the radix point placed after the first non-zero digit: 1.23, 2.0456 etc. Because in the binary system the only non-zero digit is 1, this allows the first bit not to be stored but simply presumed (the "hidden" bit). It means that if the mantissa part stores, say,

0010100000001010000000101000000010100000001010000011

then the actual bit sequence it represents is

1.0010100000001010000000101000000010100000001010000011

This allows IEEE-754-DP-FP to have 53 bits of mantissa with only 52 physical bits.
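The 53-bit effective mantissa is easy to observe from script: every integer up to 2^53 fits exactly, and the very next integer no longer does.

```javascript
// 52 stored bits + 1 hidden bit = 53 significant bits.
var limit = Math.pow(2, 53);       // 9007199254740992

console.log(limit + 1 === limit);  // true: 2^53 + 1 needs 54 bits, rounds back
console.log(limit - 1 === limit);  // false: 2^53 - 1 still fits in 53 bits
```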

Trick 3:

If the exponent part contains only zeros: 00000000000, then both tricks 1 and 2 are overridden and the stored number becomes denormalized. In this case the exponent is assumed to be -1022 and the mantissa doesn't have any hidden bit: "what you see is what you get". Such a shift helps to operate with very small numbers; in particular it keeps comparison operations for very small values correct.
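The denormal range is visible from script: the smallest positive double is 2^-1074 (the -1022 exponent scaled down by the 52 mantissa bits), and halving it falls off the bottom of the format.

```javascript
// Number.MIN_VALUE is the smallest denormal: all-zero exponent, mantissa 0...01.
var smallest = Number.MIN_VALUE;               // 5e-324 = 2^-1074

console.log(smallest > 0);                     // true: still distinct from 0
console.log(smallest === Math.pow(2, -1074));  // true
console.log(smallest / 2 === 0);               // true: below the denormal range
```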

Extra 1

The normalized numbers always have the "hidden" a.k.a. "implicit" bit added - see Trick 2. Because of that, the only way to represent zero (0) in IEEE-754 is by using a denormalized number. Indeed, 0 in IEEE-754 is the number where both the exponent and mantissa parts contain all zeros. Because the sign bit is always present, it still can be set to 0 or 1: thus there can be a positive zero 0 and a negative zero -0.

In JavaScript the engine is instructed to say that -0 == 0, which matches the IEEE-754 comparison rules themselves: the two zeros compare as equal. How (and whether) the sign of zero is exposed elsewhere is neither required nor uniform across programming languages. Say, in Perl -0 != 0, to the endless surprise of newbies.
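In JavaScript the sign bit of zero survives even though == hides it; division is the classic way to tell the two zeros apart.

```javascript
// -0 and 0 compare equal, but division exposes the sign bit.
console.log(-0 === 0);          // true: comparison hides the difference
console.log(1 / -0);            // -Infinity
console.log(1 / 0);             // Infinity
console.log(1 / -0 === 1 / 0);  // false: the sign bit was there all along
```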

Extra 2

An IEEE-754 number with the exponent all 1s and the mantissa all 0s has a special meaning: it denotes Infinity. With the sign bit cleared or set we get +Infinity and -Infinity.

Extra 3

An IEEE-754 number with the exponent all 1s and at least one mantissa bit set denotes NaN (not a number). There are many kinds of NaN depending on the mantissa bit pattern. The most common one is the Quiet NaN (QNaN) with the msb of the mantissa set:

? 11111111111 1... further whatever
S EEEEEEEEEEE M...

QNaN simply informs that the performed operation has no mathematically defined return value. You get a QNaN when, say, performing parseInt('foobar', 10);

There is also the Signalling NaN (SNaN) with the msb of the mantissa cleared:

? 11111111111 0... further whatever, but at least one bit set to 1
S EEEEEEEEEEE M...

SNaN is used to raise exceptions in math operations; AFAIK it is not used in ECMAScript implementations.

Because the mantissa content is not regulated for NaN except for the msb, there can potentially be billions of NaN values. This way I'm taking back my older opinion that "there are no physical reasons behind the rule that NaN != NaN". In fact there is one, and a good one.
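That rule is trivially observable: since many bit patterns are NaN, self-inequality is the one portable test that works for all of them.

```javascript
// Any operation with no defined result yields a quiet NaN,
// and no NaN compares equal to anything - including itself.
var q = parseInt('foobar', 10);

console.log(q !== q);          // true: NaN is the only value not equal to itself
console.log(isNaN(q));         // true
console.log(0 / 0 !== 0 / 0);  // true: a different operation, same behaviour
```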

> Any number which can be expressed exactly as a sum of powers of 0.5 (or
> 2.0) can be held exactly,

That is the "1st VK's theorem" reversed - expressed via an Egyptian-fraction-like method. IMO the definition over the dyadic rational is clearer and stricter, though I may be biased.

> provided that the range of the powers does not exceed about 53, and the
> extreme powers are neither too big nor too small (960 to -960 should be
> safe).

Right, the "1st VK's theorem" - in any form - applies to the Turing machine. On a real PC, for IEEE-754-DP-FP, we are hitting the mantissa storage space limit of 52+1 bits.

So the actual set N of real decimal numbers represented exactly in IEEE-754-DP-FP will be a subset of a subset: first the numbers satisfying the dyadic subset theorem, then the subset of those satisfying the storage limit condition.
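A quick illustration of the second filter: 1 + 2^-53 is a dyadic rational (it is (2^53 + 1)/2^53), so on a Turing machine it would be exact - but it needs 54 significant bits, one more than the double format has.

```javascript
// Dyadic, yet not storable: the 54-bit value 1 + 2^-53 rounds away its tail.
console.log(1 + Math.pow(2, -53) === 1);  // true:  the small term is lost
console.log(1 + Math.pow(2, -52) === 1);  // false: 53 significant bits still fit
```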

I'm playing with it right now.

btw I'm reading your site as well.