
>>> (1.0/10.0) + (2.0/10.0) + (3.0/10.0)
0.60000000000000009
>>> 6.0/10.0
0.59999999999999998
Is using the decimal module the best way around this? (I'm expecting the first
sum to match the second). It seems anachronistic that decimal takes strings as
input, though.
Help much appreciated;
Rory

Rory Campbell-Lange
<ro**@campbelllange.net>
<www.campbelllange.net>  

Rory Campbell-Lange wrote:
Is using the decimal module the best way around this? (I'm
expecting the first sum to match the second). It seems
anachronistic that decimal takes strings as input, though.
What's your problem with the result, or what's your goal? Such
precision errors with floating point numbers are normal because the
precision is limited technically.
For floats a and b, you'd seldom say "if a == b:" (because it's
often false as in your case) but rather
"if a  b < threshold:" for a reasonable threshold value which
depends on your application.
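A minimal transcript of that idea, applied to the two sums above (the threshold value here is just an illustrative choice, and abs() guards against either ordering):
>>> a = (1.0/10.0) + (2.0/10.0) + (3.0/10.0)
>>> b = 6.0/10.0
>>> a == b
False
>>> abs(a - b) < 1e-9
True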
Also check the recent thread "bizarre floating point output".
Regards,
Björn

BOFH excuse #333:
A plumber is needed, the network drain is clogged  

At Monday 8/1/2007 19:20, Bjoern Schliessmann wrote:
>Rory Campbell-Lange wrote:
Is using the decimal module the best way around this? (I'm
expecting the first sum to match the second). It seems
anachronistic that decimal takes strings as input, though.
[...] Also check the recent thread "bizarre floating point output".
And the last section of the Python Tutorial, "Floating Point
Arithmetic: Issues and Limitations".

Gabriel Genellina
Softlab SRL
__________________________________________________
Ask. Answer. Discover.
Everything you wanted to know, and things you never imagined,
is on Yahoo! Answers (Beta).
Try it now! http://www.yahoo.com.ar/respuestas

On Jan 8, 3:30 pm, Rory Campbell-Lange <r...@campbelllange.net> wrote:
>>> (1.0/10.0) + (2.0/10.0) + (3.0/10.0)
0.60000000000000009
>>> 6.0/10.0
0.59999999999999998
Is using the decimal module the best way around this? (I'm expecting the first
sum to match the second).
Probably not. Decimal arithmetic is NOT a cure-all for floating-point
arithmetic errors.
>>> Decimal(1) / Decimal(3) * Decimal(3)
Decimal("0.9999999999999999999999999999")
>>> Decimal(2).sqrt() ** 2
Decimal("1.999999999999999999999999999")
It seems anachronistic that decimal takes strings as
input, though.
How else would you distinguish Decimal('0.1') from
Decimal('0.1000000000000000055511151231257827021181583404541015625')?
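In newer Pythons (2.7 and 3.2 onward, so after this thread) the constructor also accepts a float directly and captures its exact binary value, which makes the distinction visible:
>>> from decimal import Decimal
>>> Decimal('0.1')
Decimal('0.1')
>>> Decimal(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')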

Rory Campbell-Lange wrote:
>
 Is using the decimal module the best way around this? (I'm
 expecting the first sum to match the second). It seems
 anachronistic that decimal takes strings as input, though.
As Dan Bishop says, probably not. The introduction to the decimal
module makes exaggerated claims of accuracy, amounting to propaganda.
It is numerically no better than binary, and has some advantages
and some disadvantages.
Also check the recent thread "bizarre floating point output".
No, don't. That is about another matter entirely, and will merely
confuse you. I have a course on computer arithmetic, and am just
now writing one on Python numerics, and confused people may contact
me  though I don't guarantee to help.
Regards,
Nick Maclaren.  

On Tue, 2007-01-09 at 11:38 +0000, Nick Maclaren wrote:
Rory Campbell-Lange wrote:
>
 Is using the decimal module the best way around this? (I'm
 expecting the first sum to match the second). It seems
 anachronistic that decimal takes strings as input, though.
As Dan Bishop says, probably not. The introduction to the decimal
module makes exaggerated claims of accuracy, amounting to propaganda.
It is numerically no better than binary, and has some advantages
and some disadvantages.
Please elaborate. Which exaggerated claims are made, and how is decimal
no better than binary?
Carsten  

[Rory Campbell-Lange]
>>Is using the decimal module the best way around this? (I'm expecting the first sum to match the second). It seems anachronistic that decimal takes strings as input, though.
[Nick Maclaren]
>As Dan Bishop says, probably not. The introduction to the decimal module makes exaggerated claims of accuracy, amounting to propaganda. It is numerically no better than binary, and has some advantages and some disadvantages.
[Carsten Haese]
Please elaborate. Which exaggerated claims are made,
Well, just about any technical statement can be misleading if not qualified
to such an extent that the only people who can still understand it knew it
to begin with <0.8 wink>. The most dubious statement here to my eyes is
the intro's "exactness carries over into arithmetic". It takes a world of
additional words to explain exactly what it is about the example given (0.1
+ 0.1 + 0.1 - 0.3 = 0 exactly in decimal fp, but not in binary fp) that
does, and does not, generalize. Roughly, it does generalize to one
important real-life use case: adding and subtracting any number of decimal
quantities delivers the exact decimal result, /provided/ that precision is
set high enough that no rounding occurs.
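For instance, a small transcript of that 0.1 + 0.1 + 0.1 - 0.3 example (exact output formatting varies a little between Python versions):
>>> from decimal import Decimal
>>> 0.1 + 0.1 + 0.1 - 0.3
5.5511151231257827e-17
>>> Decimal('0.1') + Decimal('0.1') + Decimal('0.1') - Decimal('0.3')
Decimal('0.0')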
and how is decimal no better than binary?
Basically, they both lose info when rounding does occur. For example,
>>> import decimal
>>> 1 / decimal.Decimal(3)
Decimal("0.3333333333333333333333333333")
>>> _ * 3
Decimal("0.9999999999999999999999999999")
That is, (1/3)*3 != 1 in decimal. The reason why is obvious "by eyeball",
but only because you have a lifetime of experience working in base 10. A
bit ironically, the rounding in binary just happens to be such that (1/3)*3
does equal 1:
>>> 1./3
0.33333333333333331
>>> _ * 3
1.0
It's not just * and /. The real thing at work in the 0.1 + 0.1 + 0.1 - 0.3
example is representation error, not sloppy +/-: 0.1 and 0.3 can't be
/represented/ exactly as binary floats to begin with. Much the same can
happen if you instead use inputs exactly representable in base 2 but
not in base 10 (and while there are none such if precision is infinite,
precision isn't infinite):
>>> x = decimal.Decimal(1) / 2**90
>>> print x
8.077935669463160887416100508E-28
>>> print x + x + x - 3*x # not exactly 0
1E-54
The same in binary f.p. is exact, because 1./2**90 is exactly representable
in binary fp:
>>> x = 1. / 2**90
>>> print x # this displays an inexact decimal approx. to 1./2**90
8.07793566946e-028
>>> print x + x + x - 3*x # but the binary arithmetic is exact
0.0
If you boost decimal's precision high enough, then this specific example is
also exact using decimal; but with the default precision of 28, 1./2**90
can't be represented exactly in decimal to begin with; e.g.,
>>> decimal.Decimal(1) / 2**90 * 2**90
Decimal("0.9999999999999999999999999999")
All forms of fp are subject to representation and rounding errors. The
biggest practical difference here is that the `decimal` module is not
subject to representation error for "natural" decimal quantities, provided
precision is set high enough to retain all the input digits. That's worth
something to many apps, and is the whole ball of wax for some apps - but
leaves a world of possible "surprises" nevertheless.  
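As a sketch of that last point, boosting the precision makes the 1/2**90 example above come out exact (63 digits is just enough to hold 5**90, and hence 1/2**90, exactly):
>>> import decimal
>>> decimal.Decimal(1) / 2**90 * 2**90 == 1   # default 28-digit precision
False
>>> decimal.getcontext().prec = 63
>>> decimal.Decimal(1) / 2**90 * 2**90 == 1
True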

Nick Maclaren wrote:
No, don't. That is about another matter entirely,
It isn't.
Regards,
Björn

BOFH excuse #366:
ATM cell has no roaming feature turned on, notebooks can't connect  

In article <Xn***********************@216.196.97.136>,
Tim Peters <ti*****@comcast.net> writes:
>
Well, just about any technical statement can be misleading if not qualified
to such an extent that the only people who can still understand it knew it
to begin with <0.8 wink>. The most dubious statement here to my eyes is
the intro's "exactness carries over into arithmetic". It takes a world of
additional words to explain exactly what it is about the example given (0.1
+ 0.1 + 0.1 - 0.3 = 0 exactly in decimal fp, but not in binary fp) that
does, and does not, generalize. Roughly, it does generalize to one
important real-life use case: adding and subtracting any number of decimal
quantities delivers the exact decimal result, /provided/ that precision is
set high enough that no rounding occurs.
Precisely. There is one other such statement, too: "Decimal numbers can
be represented exactly." What it MEANS is that numbers with a short
representation in decimal can be represented exactly in decimal, which
is tautologous, but many people READ it to say that numbers that they
are interested in can be represented exactly in decimal. Such as pi,
sqrt(2), 1/3 and so on ....
 and how is decimal no better than binary?
>
Basically, they both lose info when rounding does occur. For example,
Yes, but there are two ways in which binary is superior. Let's skip
the superior 'smoothness', as being too arcane an issue for this group,
and deal with the other. In binary, calculating the midpoint of two
numbers (a very common operation) is guaranteed to be within the range
defined by those numbers, or to over/underflow.
Neither (x+y)/2.0 nor (x/2.0+y/2.0) are necessarily within the range
(x,y) in decimal, even for the most respectable values of x and y.
This was a MAJOR "gotcha" in the days before binary became standard,
and will clearly return with decimal.
Regards,
Nick Maclaren.  

"Carsten Haese" <ca*****@uniqsys.comwrote in message
news:11*********************@dot.uniqsys.com...
 On Tue, 2007-01-09 at 11:38 +0000, Nick Maclaren wrote:
 As Dan Bishop says, probably not. The introduction to the decimal
 module makes exaggerated claims of accuracy, amounting to propaganda.
 It is numerically no better than binary, and has some advantages
 and some disadvantages.

 Please elaborate. Which exaggerated claims are made, and how is decimal
 no better than binary?
As to the latter question: calculating with decimals instead of binaries
eliminates conversion errors introduced when one has *exact* decimal
inputs, such as in financial calculations (which were the motivating use
case for the decimal module). But it does not eliminate errors inherent in
approximating reals with (a limited set of) rationals. Nor does it
eliminate errors inherent in approximation algorithms (such as using a
finite number of terms of an infinite series).
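A small illustration of that first point, with quantities that are exact in decimal:
>>> from decimal import Decimal
>>> 0.10 + 0.20 == 0.30
False
>>> Decimal('0.10') + Decimal('0.20') == Decimal('0.30')
True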
Terry Jan Reedy  

On 1/9/07, Tim Peters <ti*****@comcast.net> wrote:
Well, just about any technical statement can be misleading if not qualified
to such an extent that the only people who can still understand it knew it
to begin with <0.8 wink>.
+1 QTOW

Cheers,
Simon B si***@brunningonline.net  

Bjoern Schliessmann wrote:
Nick Maclaren wrote:
>No, don't. That is about another matter entirely,
It isn't.
Actually it really is. That thread is about the difference between
str(some_float) and repr(some_float) and why str(some_tuple) uses the repr() of
its elements.

Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
 Umberto Eco  

In article <ma***************************************@python.org>,
Robert Kern <ro*********@gmail.com> writes:

No, don't. That is about another matter entirely,

 It isn't.
>
Actually it really is. That thread is about the difference between
str(some_float) and repr(some_float) and why str(some_tuple) uses the repr() of
its elements.
Precisely. And it also applies to strings, which I had failed to
notice:
>>print ("1","2")
('1', '2')
>>print "1", "2"
1 2
Regards,
Nick Maclaren.  

[Tim Peters]
....
>Well, just about any technical statement can be misleading if not qualified to such an extent that the only people who can still understand it knew it to begin with <0.8 wink>. The most dubious statement here to my eyes is the intro's "exactness carries over into arithmetic". It takes a world of additional words to explain exactly what it is about the example given (0.1 + 0.1 + 0.1 - 0.3 = 0 exactly in decimal fp, but not in binary fp) that does, and does not, generalize. Roughly, it does generalize to one important real-life use case: adding and subtracting any number of decimal quantities delivers the exact decimal result, /provided/ that precision is set high enough that no rounding occurs.
[Nick Maclaren]
Precisely. There is one other such statement, too: "Decimal numbers
can be represented exactly." What it MEANS is that numbers with a
short representation in decimal can be represented exactly in decimal,
which is tautologous, but many people READ it to say that numbers that
they are interested in can be represented exactly in decimal. Such as
pi, sqrt(2), 1/3 and so on ....
Huh. I don't read it that way. If it said "numbers can be ..." I
might, but reading that way seems to require effort to overlook the
"decimal" in "decimal numbers can be ...".
[attribution lost]
>>and how is decimal no better than binary?
>Basically, they both lose info when rounding does occur. For example,
Yes, but there are two ways in which binary is superior. Let's skip
the superior 'smoothness', as being too arcane an issue for this
group,
With 28 decimal digits used by default, few apps would care about this
anyway.
and deal with the other. In binary, calculating the midpoint
of two numbers (a very common operation) is guaranteed to be within
the range defined by those numbers, or to over/underflow.
Neither (x+y)/2.0 nor (x/2.0+y/2.0) are necessarily within the range
(x,y) in decimal, even for the most respectable values of x and y.
This was a MAJOR "gotcha" in the days before binary became standard,
and will clearly return with decimal.
I view this as being an instance of "lose info when rounding does
occur". For example,
>>> import decimal as d
>>> s = d.Decimal("." + "9" * d.getcontext().prec)
>>> s
Decimal("0.9999999999999999999999999999")
>>> (s+s)/2
Decimal("1.000000000000000000000000000")
>>> s/2 + s/2
Decimal("1.000000000000000000000000000")
"The problems" there are due to rounding error:
>>> s/2 # "the problem" in s/2+s/2 is that s/2 rounds up to exactly 1/2
Decimal("0.5000000000000000000000000000")
>>> s+s # "the problem" in (s+s)/2 is that s+s rounds up to exactly 2
Decimal("2.000000000000000000000000000")
It's always something ;)  

In article <Xn**********************@216.196.97.136>,
Tim Peters <ti*****@comcast.net> writes:
>
Huh. I don't read it that way. If it said "numbers can be ..." I
might, but reading that way seems to require effort to overlook the
"decimal" in "decimal numbers can be ...".
I wouldn't expect YOU to read it that way, but I can assure you from
experience that many people do. What it MEANS is "Numbers with a
short representation in decimal can be represented exactly in decimal
arithmetic", which is tautologous. What they READ it to mean is
"One advantage of representing numbers in decimal is that they can be
represented exactly", and they then assume that also applies to pi,
sqrt(2), 1/3 ....
The point is that the "decimal" could apply equally well to the external
or internal representation and, if you aren't fairly clued-up in this
area, it is easy to choose the wrong one.
>and how is decimal no better than binary?
>
Basically, they both lose info when rounding does occur. For
example,
>
 Yes, but there are two ways in which binary is superior. Let's skip
 the superior 'smoothness', as being too arcane an issue for this
 group,
>
With 28 decimal digits used by default, few apps would care about this
anyway.
Were you in the computer arithmetic area during the "base wars" of the
1960s and 1970s that culminated with binary winning out? A lot of very
well-respected numerical analysts said that larger bases led to a
faster buildup of error (independent of the precision). My limited
investigations indicated that there was SOME truth in that, but it
wasn't a major matter; I never saw the matter settled definitively.
 and deal with the other. In binary, calculating the midpoint
 of two numbers (a very common operation) is guaranteed to be within
 the range defined by those numbers, or to over/underflow.

 Neither (x+y)/2.0 nor (x/2.0+y/2.0) are necessarily within the range
 (x,y) in decimal, even for the most respectable values of x and y.
 This was a MAJOR "gotcha" in the days before binary became standard,
 and will clearly return with decimal.
>
I view this as being an instance of "lose info when rounding does
occur". For example,
No, absolutely NOT! This is an orthogonal matter, and is about the
loss of an important invariant when using any base above 2.
Back in the days when there were multiple bases, virtually every
programmer who wrote large numerical code got caught by it at least
once, and many got caught several times (it has multiple guises).
For example, take the following algorithm for binary chop:
while 1 :
    c = (a+b)/2
    if f(c) < y :
        if c == b :
            break
        b = c
    else :
        if c == a :
            break
        a = c
That works in binary, but in no base above 2 (assuming that I haven't
made a stupid error writing it down). In THAT case, it is easy to fix
for decimal, but there are ways that it can show up that can be quite
tricky to fix.
Regards,
Nick Maclaren.  

[Tim Peters]
....
>Huh. I don't read it that way. If it said "numbers can be ..." I might, but reading that way seems to require effort to overlook the "decimal" in "decimal numbers can be ...".
[Nick Maclaren]
I wouldn't expect YOU to read it that way,
Of course I meant "putting myself in others' shoes, I don't ...".
but I can assure you from experience that many people do.
Sure. Possibly even most. Short of writing a long & gentle tutorial,
can that be improved? Alas, most people wouldn't read that either <0.5
wink>.
What it MEANS is "Numbers with a short representation in decimal
"short" is a red herring here: Python's Decimal constructor ignores the
precision setting, retaining all the digits you give. For example, if
you pass a string with a million decimal digits, you'll end up with a
very fat Decimal instance - no info is lost.
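A quick way to see that (as_tuple() exposes the stored digits):
>>> import decimal
>>> decimal.getcontext().prec
28
>>> len(decimal.Decimal('1.' + '3' * 100).as_tuple().digits)
101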
can be represented exactly in decimal arithmetic", which is
tautologous. What they READ it to mean is "One advantage of
representing numbers in decimal is that they can be represented
exactly", and they then assume that also applies to pi, sqrt(2),
1/3 ....
The point is that the "decimal" could apply equally well to the
external or internal representation and, if you aren't fairly
clued-up in this area, it is easy to choose the wrong one.
Worse, I expect most people have no real idea that there's a possible
difference between internal and external representations. This is often
given as a selling point for decimal arithmetic: it's WYSIWYG in ways
binary fp can't be (short of inventing power-of-2 fp representations for
I/O, which few people would use).
[attribution lost]
>>>>and how is decimal no better than binary?
[Tim]
>>>Basically, they both lose info when rounding does occur. For example,
[Nick]
>>Yes, but there are two ways in which binary is superior. Let's skip the superior 'smoothness', as being too arcane an issue for this group,
>With 28 decimal digits used by default, few apps would care about this anyway.
Were you in the computer arithmetic area during the "base wars" of the
1960s and 1970s that culminated with binary winning out?
Yes, although I came in on the tail end of that and never actually used
a nonbinary machine.
A lot of very wellrespected numerical analysts said that larger bases
led to a faster buildup of error (independent of the precision). My
limited investigations indicated that there was SOME truth in that,
but it wasn't a major matter; I never saw the matter settled
definitively.
My point was that 28 decimal digits of precision is far greater than
supplied even by 64-bit binary floats today (let alone the smaller sizes
in most common use back in the 60s and 70s). "Pollution" of low-order
bits is far less of a real concern when there are some number of
low-order bits you don't care about at all.
>>and deal with the other. In binary, calculating the midpoint of two numbers (a very common operation) is guaranteed to be within the range defined by those numbers, or to over/underflow.
Neither (x+y)/2.0 nor (x/2.0+y/2.0) are necessarily within the range (x,y) in decimal, even for the most respectable values of x and y. This was a MAJOR "gotcha" in the days before binary became standard, and will clearly return with decimal.
>I view this as being an instance of "lose info when rounding does occur". For example,
No, absolutely NOT!
Of course it is. If there were no rounding errors, the computed result
would be exactly right - that's darned near tautological too. You
snipped the examples I gave showing exactly where and how rounding error
created the problems in (x+y)/2 and x/2+y/2 for some specific values of
x and y using decimal arithmetic. If you don't like those examples,
supply your own, and if you get a similarly surprising result you'll
find rounding error(s) occur(s) in yours too.
It so happens that rounding errors in binary fp can't lead to the same
counterintuitive /outcome/, essentially because x+x == y+y implies x ==
y in base 2 fp, which is indeed a bit of magic specific to base 2. The
fact that there /do/ exist fp x and y such that x != y yet x+x == y+y in
bases > 2 is entirely due to fp rounding error losing info.
This is an orthogonal matter,
Disagree.
and is about the loss of an important invariant when using any base
above 2.
It is that.
Back in the days when there were multiple bases, virtually every
programmer who wrote large numerical code got caught by it at least
once, and many got caught several times (it has multiple guises).
For example, take the following algorithm for binary chop:
while 1 :
    c = (a+b)/2
    if f(c) < y :
        if c == b :
            break
        b = c
    else :
        if c == a :
            break
        a = c
That works in binary, but in no base above 2 (assuming that I haven't
made a stupid error writing it down). In THAT case, it is easy to fix
for decimal, but there are ways that it can show up that can be quite
tricky to fix.
If you know a < b, doing
c = a + (b-a)/2
instead of
c = (a+b)/2
at least guarantees (ignoring possible overflow) a <= c <= b. As shown
last time, it's not even always the case that (x+x)/2 == x in decimal fp
(or in any fp base > 2, for that matter).
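To make that concrete, a short transcript reusing the earlier all-nines value (the variable names are just for illustration):
>>> import decimal as d
>>> a = b = d.Decimal("." + "9" * d.getcontext().prec)
>>> (a + b) / 2 <= b        # the plain midpoint escapes the interval [a, b]
False
>>> a + (b - a) / 2 <= b    # the rearranged form stays inside
True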

In article <Xn***********************@216.196.97.136>,
Tim Peters <ti*****@comcast.net> writes:
>
Sure. Possibly even most. Short of writing a long & gentle tutorial,
can that be improved? Alas, most people wouldn't read that either <0.5
wink>.
Yes. Improved wording would be only slightly longer, and it is never
appropriate to omit all negative aspects. The truth, the whole truth
and nothing but the truth :)
Worse, I expect most people have no real idea that there's a possible
difference between internal and external representations. This is often
given as a selling point for decimal arithmetic: it's WYSIWYG in ways
binary fp can't be (short of inventing power-of-2 fp representations for
I/O, which few people would use).
Right. Another case when none of the problems show up on dinky little
examples but do in real code :(
 A lot of very wellrespected numerical analysts said that larger bases
 led to a faster buildup of error (independent of the precision). My
 limited investigations indicated that there was SOME truth in that,
 but it wasn't a major matter; I never saw the matter settled
 definitively.
>
My point was that 28 decimal digits of precision is far greater than
supplied even by 64bit binary floats today (let alone the smaller sizes
in mostcommon use back in the 60s and 70s). "Pollution" of loworder
bits is far less of a real concern when there are some number of low
order bits you don't care about at all.
Yes, but that wasn't their point. It was that in (say) iterative
algorithms, the error builds up by a factor of the base at every step.
If it wasn't for the fact that errors build up, almost all programs
could ignore numerical analysis and still get reliable answers!
Actually, my (limited) investigations indicated that such an error
buildup was extremely rare - I could achieve it only in VERY artificial
programs. But I did find that the errors built up faster for higher
bases, so that a reasonable rule of thumb is that 28 digits with a decimal
base was comparable to (say) 80 bits with a binary base.
And, IN GENERAL, programs won't be using 128-bit IEEE representations.
Given Python's overheads, there is no reason not to, unless the hardware
is catastrophically slower (which is plausible).
If you know a < b, doing
>
 c = a + (b-a)/2
>
instead of
>
 c = (a+b)/2
>
at least guarantees (ignoring possible overflow) a <= c <= b. As shown
last time, it's not even always the case that (x+x)/2 == x in decimal fp
(or in any fp base > 2, for that matter).
Yes. Back in the days before binary floating-point started to dominate,
we taught that as a matter of routine, but it has not been taught to
all users of floating-point for a couple of decades. Indeed, a lot of
modern programmers regard having to distort simple expressions in that
way as anathema.
It isn't a major issue, because our experience from then is that it is
both teachable and practical, but it IS a way in which any base above
2 is significantly worse than base 2.
Regards,
Nick Maclaren.  

"Nick Maclaren" <nm**@cus.cam.ac.ukwrote:
Yes, but that wasn't their point. It was that in (say) iterative
algorithms, the error builds up by a factor of the base at every step.
If it wasn't for the fact that errors build up, almost all programs
could ignore numerical analysis and still get reliable answers!
Actually, my (limited) investigations indicated that such an error
buildup was extremely rare  I could achieve it only in VERY artificial
programs. But I did find that the errors built up faster for higher
bases, so that a reasonable rule of thumb is that 28 digits with a decimal
base was comparable to (say) 80 bits with a binary base.
I would have thought that this sort of thing was a natural consequence
of rounding errors - if I round (or worse truncate) a binary, I can be off
by at most one, with an expectation of a half of a least significant digit,
while if I use hex digits, my expectation is around eight, and for decimal
around five...
So it would seem natural that errors would propagate
faster on big base systems, AOTBE, but this may be
a naive view..
 Hendrik  

In article <ma***************************************@python.org>,
"Hendrik van Rooyen" <ma**@microcorp.co.za> writes:
>
I would have thought that this sort of thing was a natural consequence
of rounding errors  if I round (or worse truncate) a binary, I can be off
by at most one, with an expectation of a half of a least significant digit,
while if I use hex digits, my expectation is around eight, and for decimal
around five...
>
So it would seem natural that errors would propagate
faster on big base systems, AOTBE, but this may be
a naive view..
Yes, indeed, and that is precisely why the "we must use binary" camp won
out. The problem was that computers of the early 1970s were not quite
powerful enough to run real applications with simulated floating-point
arithmetic. I am one of the half-dozen people who did ANY actual tests
on real numerical code, but there may have been some work since!
Nowadays, it would be easy, and it would make quite a good PhD. The
points to look at would be the base and the rounding rules (including
IEEE rounding versus probabilistic versus last bit forced[*]). We know
that the use or not of denormalised numbers and the exact details of
true rounding make essentially no difference.
In a world ruled by reason rather than spin, this investigation
would have been done before claiming that decimal floatingpoint is an
adequate replacement for binary for numerical work, but we don't live
in such a world. No matter. Almost everyone in the area agrees that
decimal floatingpoint isn't MUCH worse than binary, from a numerical
point of view :)
[*] Assuming signed magnitude, calculate the answer truncated towards
zero but keep track of whether it is exact. If not, force the last
bit to 1. An old, cheap approximation to rounding.
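A toy sketch of that rule for non-negative operands (fixed-point rather than floating-point, and the function name and scaling are made up here purely for illustration):

def last_bit_forced_div(num, den, bits=24):
    # Truncate num/den towards zero, keeping `bits` fractional bits,
    # then force the lowest retained bit to 1 if anything was discarded.
    q, r = divmod(num << bits, den)
    if r:
        q |= 1
    return q / float(1 << bits)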
Regards,
Nick Maclaren.  

"Nick Maclaren" <nm**@cus.cam.ac.ukwrote:
>
In article <ma***************************************@python.org>,
"Hendrik van Rooyen" <ma**@microcorp.co.za> writes:
>
I would have thought that this sort of thing was a natural consequence
of rounding errors  if I round (or worse truncate) a binary, I can be off
by at most one, with an expectation of a half of a least significant digit,
while if I use hex digits, my expectation is around eight, and for decimal
around five...
>
So it would seem natural that errors would propagate
faster on big base systems, AOTBE, but this may be
a naive view..
Yes, indeed, and that is precisely why the "we must use binary" camp won
out. The problem was that computers of the early 1970s were not quite
powerful enough to run real applications with simulated floatingpoint
arithmetic. I am one of the halfdozen people who did ANY actual tests
on real numerical code, but there may have been some work since!
*grin* - I was around at that time, and some of the inappropriate habits
almost forced by the lack of processing power still linger in my mind,
like "Don't use division if you can possibly avoid it - it's EXPENSIVE!"
- it seems so silly nowadays.
>
Nowadays, it would be easy, and it would make quite a good PhD. The
points to look at would be the base and the rounding rules (including
IEEE rounding versus probabilistic versus last bit forced[*]). We know
that the use or not of denormalised numbers and the exact details of
true rounding make essentially no difference.
In a world ruled by reason rather than spin, this investigation
would have been done before claiming that decimal floatingpoint is an
adequate replacement for binary for numerical work, but we don't live
in such a world. No matter. Almost everyone in the area agrees that
decimal floatingpoint isn't MUCH worse than binary, from a numerical
point of view :)
As an old slide rule user I can agree with this - if you know the order
of the answer, and maybe two points after the decimal, it will tell you
if the bridge will fall down or not. Having an additional fifty decimal
places of accuracy does not really add any real information in these
cases. It's nice of course if it's free, like it has almost become - but
I think people get mesmerized by the numbers, without giving any
thought to what they mean - which is probably why we often see
threads complaining about the "error" in the fifteenth decimal place...
>[*] Assuming signed magnitude, calculate the answer truncated towards
zero but keep track of whether it is exact. If not, force the last
bit to 1. An old, cheap approximation to rounding.
This is not so cheap - it's good solid reasoning in my book -
after all, "something" is a lot more than "nothing" and should
not be thrown away...
 Hendrik  

In article <ma***************************************@python.org>,
"Hendrik van Rooyen" <ma**@microcorp.co.za> writes:
>
*grin*  I was around at that time, and some of the inappropriate habits
almost forced by the lack of processing power still linger in my mind,
like  "Don't use division if you can possibly avoid it,  its EXPENSIVE!"
 it seems so silly nowadays.
Yes, indeed, but that one is actually still with us! Integer division
is done by software on a few systems, and floating-point division is
often not vectorisable or pipelines poorly. But, except for special
cases of little relevance to Python, it is not the poison that it was
back then.
As an old slide rule user  I can agree with this  if you know the order
of the answer, and maybe two points after the decimal, it will tell you
if the bridge will fall down or not. Having an additional fifty decimal
places of accuracy does not really add any real information in these
cases. Its nice of course if its free, like it has almost become  but
I think people get mesmerized by the numbers, without giving any
thought to what they mean  which is probably why we often see
threads complaining about the "error" in the fifteenth decimal place..
Agreed. But the issue is really error buildup, and algorithms that are
numerically 'unstable' - THEN, such subtle differences do matter. You
still aren't interested in more than a few digits in the result, but you
may have to sweat blood to get them.
[*] Assuming signed magnitude, calculate the answer truncated towards
 zero but keep track of whether it is exact. If not, force the last
 bit to 1. An old, cheap approximation to rounding.

This is not so cheap  its good solid reasoning in my book 
after all, "something" is a lot more than "nothing" and should
not be thrown away...
The "cheap" means "cheap in hardware"  it needs very little logic,
which is why it was used on the old, discretelogic, machines.
I have been told by hardware people that implementing IEEE 754 rounding
and denormalised numbers needs a horrific amount of logic  which is
why only IBM do it all in hardware. And the decimal formats are
significantly more complicated.
What I don't know is how much precision this approximation loses when
used in real applications, and I have never found anyone else who has
much of a clue, either.
Regards,
Nick Maclaren.  

[Nick Maclaren]
>... Yes, but that wasn't their point. It was that in (say) iterative algorithms, the error builds up by a factor of the base at every step. If it wasn't for the fact that errors build up, almost all programs could ignore numerical analysis and still get reliable answers!
Actually, my (limited) investigations indicated that such an error buildup was extremely rare  I could achieve it only in VERY artificial programs. But I did find that the errors built up faster for higher bases, so that a reasonable rule of thumb is that 28 digits with a decimal base was comparable to (say) 80 bits with a binary base.
[Hendrik van Rooyen]
I would have thought that this sort of thing was a natural consequence
of rounding errors  if I round (or worse truncate) a binary, I can be
off by at most one, with an expectation of a half of a least
significant digit, while if I use hex digits, my expectation is around
eight, and for decimal around five...
Which, in all cases, is a half ULP at worst (when rounding - as
everyone does now).
So it would seem natural that errors would propagate
faster on big base systems, AOTBE, but this may be
a naive view..
I don't know of any current support for this view. In the bad old days,
such things were often confused by architectures that mixed non-binary
bases with "creative" rounding rules (like truncation indeed), and it
could be hard to know where to "pin the blame".
What you will still see stated is variations on Kahan's telegraphic
"binary is better than any other radix for error analysis (but not very
much)", listed as one of two techincal advantages for binary fp in: http://www.cs.berkeley.edu/~wkahan/MktgMath.pdf
It's important to note that he says "error analysis", not "error
propagation"  regardless of base in use, rounding is good to <= 1/2
ULP. A fuller elementary explanation of this can be found in David
Goldberg's widely available "What Every Computer Scientist Should Know
About FloatingPoint", in its "Relative Error and Ulps" section. The
short course is that rigorous forward error analysis of fp algorithms is
usually framed in terms of relative error: given a computed
approximation x' to the mathematically exact result x, what's the
largest possible absolute value of the mathematical
r = (x' - x)/x
(the relative error of x')? This framework gets used because it's
more-or-less tractable, starting by assuming inputs are exact (or not, in
which case you start by bounding the inputs' relative errors), then
successively computing relative errors for each step of the algorithm.
Goldberg's paper, and Knuth volume 2, contain many introductory examples
of rigorous analysis using this approach.
Analysis of relative error generally goes along independent of FP base.
It's at the end, when you want to transform a statement about relative
error into a statement about error as measured by ULPs (units in the
last place), where the base comes in strongly. As Goldberg explains,
the larger the fp base the sloppier the relative-error-converted-to-ULPs
bound is - but this is by a constant factor independent of the
algorithm being analyzed, hence Kahan's "... better ... but not very
much". In more words from Goldberg:
Since epsilon [a measure of relative error] can overestimate the
effect of rounding to the nearest floatingpoint number by the
wobble factor of B [the FP base, like 2 for binary or 10 for
decimal], error estimates of formulas will be tighter on machines
with a small B.
When only the order of magnitude of rounding error is of interest,
ulps and epsilon may be used interchangeably, since they differ by
at most a factor of B.
So that factor of B is irrelevant to most apps most of the time. For a
combination of an fp algorithm + set of inputs near the edge of giving
gibberish results, of course it can be important. Someone using
Python's decimal implementation has an often very effective workaround
then, short of writing a more robust fp algorithm: just boost the
precision.  
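As a concrete instance of that r = (x' - x)/x quantity, here is the relative error of the double nearest 1/10 (a sketch assuming a Python recent enough for Fraction to accept a float):
>>> from fractions import Fraction
>>> x = Fraction(1, 10)          # the exact mathematical value
>>> x_prime = Fraction(0.1)      # the binary double actually stored
>>> abs(x_prime - x) / x < Fraction(1, 2**53)   # within the half-ULP relative bound for doubles
True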

"Dennis Lee Bieber" <wl*****@ix.netcom.comwrote:
{My 8th grade teacher was a bit worried at seeing me with a slipstick
<G>; and my HighSchool Trig/Geometry teacher only required 3 significant
digits for answers  even though half the class had calculators by
then}
LOL - I haven't seen the word "slipstick" for yonks...
I recall an SF character known as "Slipstick Libby",
who was supposed to be a Genius - but I forget
the setting and the author.
It is something that has become quietly extinct, and
we did not even notice.
We should start a movement for reviving them -
on grounds of their "greenness" - they use no
batteries...
Fat chance.
 Hendrik  

"Nick Maclaren" <nm**@cus.cam.ac.ukwrote:
The "cheap" means "cheap in hardware"  it needs very little logic,
which is why it was used on the old, discretelogic, machines.
I have been told by hardware people that implementing IEEE 754 rounding
and denormalised numbers needs a horrific amount of logic  which is
why only IBM do it all in hardware. And the decimal formats are
significantly more complicated.
What I don't know is how much precision this approximation loses when
used in real applications, and I have never found anyone else who has
much of a clue, either.
I would suspect that this is one of those questions which are simple
to ask, but horribly difficult to answer - I mean, if the hardware has
thrown it away, how do you study it - you need somehow two
different parallel engines doing the same stuff, and comparing the
results, or you have to write a big simulation, and then you bring
your simulation errors into the picture - There be Dragons...
 Hendrik  

"Tim Peters" <ti*****@comcast.netwrote:
[... full quote of Tim Peters' post on relative error, ULPs, and the factor-of-B wobble snipped; it appears in full above ...]
Thanks Tim, for taking the trouble - really nice explanation.
My basic error of thinking (? - more like gut feel) was that the
bigger bases somehow lose "more bits" at every round,
forgetting that half a microvolt is still half a microvolt, whether
it is rounded in binary, decimal, or hex...
 Hendrik  

In article <ma***************************************@python.org>,
"Hendrik van Rooyen" <ma**@microcorp.co.za> writes:
 "Tim Peters" <ti*****@comcast.net> wrote:
>
 What you will still see stated is variations on Kahan's telegraphic
 "binary is better than any other radix for error analysis (but not very
 much)", listed as one of two techincal advantages for binary fp in:

 http://www.cs.berkeley.edu/~wkahan/MktgMath.pdf
Which I believe to be the final statement of the matter. It was a minority
view 30 years ago, but I now know of little dissent.
He has omitted that midpoint invariant as a third advantage of binary,
but I agree that it could be phrased as "one or two extra mathematical
invariants hold for binary (but not very important ones)".
My basic error of thinking ( ?  more like gut feel ) was that the
bigger bases somehow lose "more bits" at every round,
forgetting that half a microvolt is still half a microvolt, whether
it is rounded in binary, decimal, or hex...
That is not an error, but only a mistake :)
Yes, you have hit the nail on the head. Some people claimed that some
important algorithms did that, and that binary was consequently much
better. If it were true, then the precision you would need would be
pro rata to the case - so the decimal equivalent of 64-bit binary would
need 160 bits.
Experience failed to confirm their viewpoint, and the effect was seen
in only artificial algorithms (sorry - I can no longer remember the
examples and am reluctant to waste time trying to reinvent them). But
it was ALSO found that the converse was not QUITE true, either, and the
effective numerical precision is not FULLY independent of the base.
So, at a wild guesstimate, 64-bit decimal will deliver a precision
comparable to about 56-bit binary, and will cause significant numerical
problems to a FEW applications. Hence people will have to convert to
the much more expensive 128-bit decimal format for such work.
Bloatware rules. All your bits are belong to us.
Regards,
Nick Maclaren.  

In article <ma***************************************@python.org>,
"Hendrik van Rooyen" <ma**@microcorp.co.za> writes:

I would suspect that this is one of those questions which are simple
to ask, but horribly difficult to answer  I mean  if the hardware has
thrown it away, how do you study it  you need somehow two
different parallel engines doing the same stuff, and comparing the
results, or you have to write a big simulation, and then you bring
your simulation errors into the picture  There be Dragons...
No. You just emulate floating-point in software and throw a switch
selecting between the two rounding rules.
Regards,
Nick Maclaren.  

"Hendrik van Rooyen" <ma**@microcorp.co.zawrote:
>"Nick Maclaren" <nm**@cus.cam.ac.ukwrote:
>What I don't know is how much precision this approximation loses when used in real applications, and I have never found anyone else who has much of a clue, either.
I would suspect that this is one of those questions which are simple to ask, but horribly difficult to answer  I mean  if the hardware has thrown it away, how do you study it  you need somehow two different parallel engines doing the same stuff, and comparing the results, or you have to write a big simulation, and then you bring your simulation errors into the picture  There be Dragons...
Actually, this is a very well studied part of computer science called
"interval arithmetic". As you say, you do every computation twice, once to
compute the minimum, once to compute the maximum. When you're done, you
can be confident that the true answer lies within the interval.
For people just getting into it, it can be shocking to realize just how
wide the interval can become after some computations.
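For instance, a toy sketch of the idea (a hypothetical class; a serious implementation would also apply directed rounding to each endpoint operation, which this ignores):

class Interval:
    # A closed interval [lo, hi] carried through arithmetic.
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)
    def __mul__(self, other):
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(p), max(p))
    def __repr__(self):
        return "[%g, %g]" % (self.lo, self.hi)

A couple of operations already show the width growing:
>>> x = Interval(0.9, 1.1)
>>> x * x
[0.81, 1.21]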

Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.  

"Dennis Lee Bieber" <wl*****@ix.netcom.com>wrote:
On Sun, 14 Jan 2007 07:18:11 +0200, "Hendrik van Rooyen"
<ma**@microcorp.co.zadeclaimed the following in comp.lang.python:
I recall an SF character known as "Slipstick Libby",
who was supposed to be a Genius  but I forget
the setting and the author.
Robert Heinlein. Appears in a few of the Lazarus Long books.
It is something that has become quietly extinct, and
we did not even notice.
And get collector prices - http://www.sphere.bc.ca/test/sruniverse.html
Thanks Dennis - Fascinating site!
 Hendrik  

In article <f4********************************@4ax.com>,
Tim Roberts <ti**@probo.com> writes:
"Hendrik van Rooyen" <ma**@microcorp.co.za> wrote:
>
What I don't know is how much precision this approximation loses when
used in real applications, and I have never found anyone else who has
much of a clue, either.
>
I would suspect that this is one of those questions which are simple
to ask, but horribly difficult to answer  I mean  if the hardware has
thrown it away, how do you study it  you need somehow two
different parallel engines doing the same stuff, and comparing the
results, or you have to write a big simulation, and then you bring
your simulation errors into the picture  There be Dragons...
>
Actually, this is a very well studied part of computer science called
"interval arithmetic". As you say, you do every computation twice, once to
compute the minimum, once to compute the maximum. When you're done, you
can be confident that the true answer lies within the interval.
The problem with it is that it is an unrealistically pessimal model,
and there are huge classes of algorithm that it can't handle at all;
anything involving iterative convergence for a start. It has been
around for yonks (I first dabbled with it 30+ years ago), and it has
never reached viability for most real applications. In 30 years, it
has got almost nowhere.
Don't confuse interval methods with interval arithmetic, because you
don't need the latter for the former, despite the claims that you do.
For people just getting into it, it can be shocking to realize just how
wide the interval can become after some computations.
Yes. Even when you can prove (mathematically) that the bounds are
actually quite tight :)
Regards,
Nick Maclaren.  

Nick Maclaren wrote:
The problem with it is that it is an unrealistically pessimal model,
and there are huge classes of algorithm that it can't handle at all;
anything involving iterative convergence for a start. It has been
around for yonks (I first dabbled with it 30+ years ago), and it has
never reached viability for most real applications. In 30 years, it
has got almost nowhere.
Don't confuse interval methods with interval arithmetic, because you
don't need the latter for the former, despite the claims that you do.
For people just getting into it, it can be shocking to realize just how
wide the interval can become after some computations.
Yes. Even when you can prove (mathematically) that the bounds are
actually quite tight :)
I've been experimenting with a fixed-point interval type in Python. I
expect many algorithms would require you to explicitly
round/collapse/whatever-term the interval as they go along, essentially
making it behave like a float. Do you think it'd be suitable for
general use, assuming you didn't mind the explicit rounding?
Unfortunately I lack a math background, so it's unlikely to progress
past an experiment.  

In article <11**********************@m58g2000cwm.googlegroups.com>,
"Rhamphoryncus" <rh****@gmail.com> writes:
>
I've been experimenting with a fixed-point interval type in Python. I
expect many algorithms would require you to explicitly
round/collapse/whatever-term the interval as they go along, essentially
making it behave like a float.
Yes, quite.
Do you think it'd be suitable for
general use, assuming you didn't mind the explicit rounding?
I doubt it. Sorry.
Unfortunately I lack a math background, so it's unlikely to progress
past an experiment.
As the same is true for what plenty of people have done, despite them
having good backgrounds in mathematics, don't feel inferior!
Regards,
Nick Maclaren.  

Tim Peters wrote:
... Alas, most people wouldn't read that either <0.5 wink>.
Oh the loss, you missed the chance for a <0.499997684987 wink>.
Scott David Daniels sc***********@acm.org  

"Nick Maclaren" <nm**@cus.cam.ac.ukwrote:
[Tim Roberts]
Actually, this is a very well studied part of computer science called
"interval arithmetic". As you say, you do every computation twice, once to
compute the minimum, once to compute the maximum. When you're done, you
can be confident that the true answer lies within the interval.
>
The problem with it is that it is an unrealistically pessimal model,
and there are huge classes of algorithm that it can't handle at all;
anything involving iterative convergence for a start. It has been
around for yonks (I first dabbled with it 30+ years ago), and it has
never reached viability for most real applications. In 30 years, it
has got almost nowhere.
Don't confuse interval methods with interval arithmetic, because you
don't need the latter for the former, despite the claims that you do.
For people just getting into it, it can be shocking to realize just how
wide the interval can become after some computations.
Yes. Even when you can prove (mathematically) that the bounds are
actually quite tight :)
This sounds like one of those pesky:
"but you should be able to do better"  kinds of things...
 Hendrik  

In article <ma***************************************@python.org>,
"Hendrik van Rooyen" <ma**@microcorp.co.za> writes:
>
[ Interval arithmetic ]
>
 For people just getting into it, it can be shocking to realize just how
 wide the interval can become after some computations.

 Yes. Even when you can prove (mathematically) that the bounds are
 actually quite tight :)
>
This sounds like one of those pesky:
"but you should be able to do better"  kinds of things...
It's worse :(
It is rather like global optimisation (including linear programming
etc.) The algorithms that are guaranteed to work are so catastrophically
slow that they are of theoretical interest only, but almost every
practical problem can be solved "well enough" with a hack, IF it is
coded by someone who understands both the problem and global
optimisation.
This is why the "statistical" methods (so disliked by Kahan) are used.
In a fair number of cases, they give reasonable estimates of the error.
In others, they give a false sense of security :(
Regards,
Nick Maclaren.