
When (32-bit) double precision isn't precise enough

I've been trying to perform a calculation that has been running into
an underflow (insufficient precision) problem in Microsoft Excel, which
calculates using at most 15 significant digits. For this purpose, that
isn't enough.

I was reading a book about some of the financial scandals of the
1990s called "Inventing Money: The Story of Long-Term Capital Management
and the Legends Behind it" by Nicholas Dunbar. On page 95, he mentions
that the 1987 stock-market crash was designated in economists' computer
models as a "20-sigma event". That is, their models (which obviously were
badly flawed!) put such an event in the exponential "tails" of a normal
Gaussian distribution, outside the range +/- 20 standard deviations from
the mean. In other words - vastly unlikely.

So I wanted to know: just how unlikely is that? Where did they get
the data to support that idea, anyway?

I put together an Excel spreadsheet, attempting to calculate this
probability. But after about 8 sigma I ran into the limits of Excel's
15-digit precision and could go no further. Since the odds of an 8-sigma
event are about 819 trillion to 1 against, a 20-sigma event is obviously
going to be so unlikely that it's not even worth talking about; it would
probably never happen in the lifetime of the universe. Further evidence
that those computer models had some serious problems.

To perform that calculation, I used Excel's built-in error function
ERF(), and here's the output:

s Confidence Interval Probability Odds against

1.0 68.268949131685700% 31.7310508683143000000% 3
1.5 86.638554269553900% 13.3614457304461000000% 7
2.0 95.449972950730900% 4.55002704926911000000% 22
2.5 98.758066814134400% 1.24193318586556000000% 81
3.0 99.730020387427800% 0.26997961257220200000% 370
3.5 99.953474183672100% 0.04652581632794690000% 2,149
4.0 99.993665751560700% 0.00633424843932140000% 15,787
4.5 99.999320465373800% 0.00067953462623560100% 147,160
5.0 99.999942669685300% 0.00005733031473997840% 1,744,278
5.5 99.999996202087500% 0.00000379791249560668% 26,330,254
6.0 99.999999802682500% 0.00000019731752898267% 506,797,346
6.5 99.999999991968000% 0.00000000803199728949% 12,450,203,405
7.0 99.999999999744000% 0.00000000025596191833% 390,683,116,666
7.5 99.999999999993600% 0.00000000000638378239% 15,664,694,356,071
8.0 99.999999999999900% 0.00000000000000000000% 818,836,295,885,545
8.5 100.00000000000000% 0.00000000000000000000% #DIV/0!
. . . .
s =ERF(An/SQRT(2)) =1-Bn =1/Cn

(The 'n' in the cell formulas above represents the row number).

Evidently this calculation hits the limit of Excel's computational
precision at about 8 sigma.

For what it's worth, ERF(z) is defined as 2/sqrt(pi) * INT[(0,z)
exp(-t²) dt], and the area under the "normal" Bell curve from -z to z is
just ERF(z/sqrt(2)). There's a discussion of it here:
http://mathworld.wolfram.com/Erf.html, here:
http://mathworld.wolfram.com/ConfidenceInterval.html, and here:
http://jove.prohosting.com/~skripty/page_295.htm.
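
In LaTeX notation, the definition and the identities the table uses are:

\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\,dt,
\qquad
P(|X-\mu| \le s\sigma) = \operatorname{erf}\!\bigl(s/\sqrt{2}\bigr),
\qquad
P(|X-\mu| > s\sigma) = 1 - \operatorname{erf}\!\bigl(s/\sqrt{2}\bigr)
                     = \operatorname{erfc}\!\bigl(s/\sqrt{2}\bigr),

for a normal variable X with mean \mu and standard deviation \sigma.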

So I decided to try to tackle this in C. I downloaded a package
called the GNU Scientific Library (http://www.gnu.org/software/gsl/) -
there's a nice precompiled binary for Microsoft Visual C++ at
http://www.network-theory.co.uk/gsl/freedownloads.html. I wrote a little
piece of code to try it out:

==========================================================================
#include <stdio.h>
#include <gsl/gsl_math.h>      /* M_SQRT2 */
#include <gsl/gsl_sf_erf.h>    /* gsl_sf_erf */

int main(void)
{
    double odds;
    double index;
    double sigma;
    double result;

    for (index = 1.0; index <= 8.5; index += 0.5)
    {
        sigma = index / M_SQRT2;
        result = 1 - gsl_sf_erf(sigma);   /* cancels badly as erf -> 1 */
        odds = 1 / result;
        /* \345 is the sigma glyph in the OEM console codepage (CP437) */
        printf("P(%2.1f \345) = %.18E \t %.18E\n", index, result, odds);
    }
    return 0;
}
==========================================================================

And here's its output:

P(1.0 s) = 3.173105078629142600E-001 3.151487187534375500E+000
P(1.5 s) = 1.336144025377161700E-001 7.484223115226846800E+000
P(2.0 s) = 4.550026389635841700E-002 2.197789450799282900E+001
P(2.5 s) = 1.241933065155231800E-002 8.051963733448130300E+001
P(3.0 s) = 2.699796063260206900E-003 3.703983473449563900E+002
P(3.5 s) = 4.652581580710801700E-004 2.149344364311446000E+003
P(4.0 s) = 6.334248366623995700E-005 1.578719276732396800E+004
P(4.5 s) = 6.795346249477418600E-006 1.471595358480670300E+005
P(5.0 s) = 5.733031437360480700E-007 1.744277893686913900E+006
P(5.5 s) = 3.797912495606681200E-008 2.633025382119182500E+007
P(6.0 s) = 1.973175289826656400E-009 5.067973459610119500E+008
P(6.5 s) = 8.031997289492665000E-011 1.245020340467724800E+010
P(7.0 s) = 2.559619183273298400E-012 3.906831166662759400E+011
P(7.5 s) = 6.383782391594650100E-014 1.566469435607129100E+013
P(8.0 s) = 1.221245327087672200E-015 8.188362958855447500E+014
P(8.5 s) = 0.000000000000000000E+000 1.#INF00000000000000E+000
Same limits as in Excel, it seems. GSL's gsl_sf_erf function takes a
double-precision argument and returns double. My Visual C++ compiler (v.
6.0) specifies that both double and long double use an 8-byte
representation: "The long double contains 80 bits: 1 for sign, 15 for
exponent, and 64 for mantissa. Its range is +/- 1.2E4932 with at least 19
digits of precision."

Strangely enough, I only seem to be getting 17 digits of precision.
That is in fact all a 64-bit IEEE double can deliver: its 53-bit
significand is good for about 15-17 decimal digits, and as noted above
VC++ maps long double onto the same 8-byte format. Regardless, I estimate
that this calculation will require at least 38 additional digits of
precision. Interesting: according to IEEE 754, the condition for
"positive underflow" (single precision) shouldn't happen for positive
numbers greater than about 1.4E-045 or so. For double precision, it
should happen only for positive numbers less than about 1.0E-308.
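
Here is a quick check of the two limits in play; it's the spacing of
doubles near 1.0 (DBL_EPSILON), not underflow near 0.0 (DBL_MIN), that
makes 1 - ERF() bottom out:

#include <stdio.h>
#include <float.h>

int main(void)
{
    /* Near 1.0, doubles are spaced DBL_EPSILON apart (~2.2e-16), so
       any quantity within ~1.1e-16 of 1.0 rounds to exactly 1.0 */
    printf("DBL_EPSILON     = %g\n", DBL_EPSILON);
    printf("1 - (1 - 1e-17) = %g\n", 1.0 - (1.0 - 1e-17)); /* prints 0 */

    /* Underflow to zero happens near 0.0, at a vastly smaller scale */
    printf("DBL_MIN         = %g\n", DBL_MIN);              /* ~2.2e-308 */
    return 0;
}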

So here's what I'd like to know.

Are there 64-bit implementations of something like GSL which would
produce more precise output on appropriate OS/hardware platforms like
WinXP-64, Solaris v. 7-9, Tru64 Unix, Linux, etc? Has anyone
implemented a 128-bit long long double datatype or equivalent?

Can a calculation like this be performed with some kind of arbitrary-
precision numeric package (something like Michael Ring's MAPM or PHP
BCMath), or evaluated directly by Mathematica or the java.math package?
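
To make the question concrete, here is the sort of thing I have in mind,
sketched against GNU MPFR (https://www.mpfr.org/) and its documented
mpfr_erfc function; I haven't tried it, so treat it as an illustration
(link with -lmpfr -lgmp):

#include <stdio.h>
#include <mpfr.h>

int main(void)
{
    mpfr_t x;

    mpfr_init2(x, 350);                /* 350 bits of working precision */
    mpfr_sqrt_ui(x, 2, MPFR_RNDN);     /* x = sqrt(2) */
    mpfr_ui_div(x, 20, x, MPFR_RNDN);  /* x = 20 / sqrt(2) */
    mpfr_erfc(x, x, MPFR_RNDN);        /* x = erfc(x), approx. 5.507e-89 */
    mpfr_printf("%.50Re\n", x);
    mpfr_clear(x);
    return 0;
}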

Or maybe I'm just confused and there's an easier way to do this which
I'm just not seeing.

By the way, I don't know of anyone who's bothered to tabulate the
values of this function nearly this far: most such texts (Zwillinger's CRC
Handbooks, Abramowitz & Stegun, Gradshteyn & Ryzhik, etc) only go to about
4.0 sigma.

Any ideas?

-- Dave Schulman (ca******@gate.net)
Nov 13 '05 #1
DAVID SCHULMAN wrote:

[code and output snipped]

Can a calculation like this be performed with some kind of arbitrary-
precision numeric package (something like Michael Ring's MAPM or PHP
BCMath), or evaluated directly by Mathematica or the java.math package?
Mathematica, MathCAD, MATLAB, Maple all can do such stuff.
Here is what Maple spits out:
Digits := 190: 1 - erf(20.); 1 / %;

0.539586561160790e-175

0.185327076687888932317857224343679932565592017483138279103445269\
2808207933709765329642460892658867968117131478014046442587\
5897092558267970813224336350626158308236331344839925105294\
06688155039e176

(Note that 1 - erf(20.) is erfc(20) itself; the 20-sigma probability on
the normal scale is 1 - erf(20/sqrt(2)), which the follow-ups below
compute as about 5.5e-89.)

Jirka

Nov 13 '05 #2
"DAVID SCHULMAN" <da*************@verizon.net> writes:

[Long explanation snipped. The poster wants to calculate the exclusion
probability corresponding to 20 standard deviations of a normal
distribution.]

In C99, you can use the `erfc' function. Many C89 implementations provide
it as an extension.

#include <stdio.h>
#include <math.h>

int main (void)
{
printf ("%e\n", erfc (20.0 / sqrt (2.0)));
return 0;
}

This program prints `5.507248e-89'.

Note that although the /mathematical/ definition of `erfc(x)' is
`1.0 - erf(x)', this is not a good way to actually calculate the function.
The `erf' function would have to return a value which is 5.507248e-89 less
than 1.0, which would require at least 294 significant bits to represent.
Usually, floating point numbers are much less precise in C, so `erf' just
returns the closest value it can represent, 1.0.
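
In LaTeX terms, the bit count follows directly from the size of the tail:

\log_2 \frac{1}{5.507 \times 10^{-89}} = 88.26 \times \log_2 10 \approx 293.2,

so no fewer than 294 significand bits can distinguish 1.0 - 5.507e-89
from 1.0.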
So I decided to try to tackle this in C. I downloaded a package
called the GNU Scientific Library (http://www.gnu.org/software/gsl/) [...]

[GSL test code snipped]
While I'm not familiar with the GNU Scientific Library, I would guess that
it also provides an `erfc' function if it provides an `erf' function.
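
If so, the original loop needs only a one-line change to avoid the
cancellation entirely. A sketch, assuming gsl_sf_erfc is declared
alongside gsl_sf_erf in <gsl/gsl_sf_erf.h> with the same calling
convention:

#include <stdio.h>
#include <gsl/gsl_math.h>      /* M_SQRT2 */
#include <gsl/gsl_sf_erf.h>    /* gsl_sf_erfc (assumed) */

int main(void)
{
    double s;

    /* erfc(s/sqrt(2)) is the two-sided tail computed directly, so
       nothing is lost subtracting from 1.0; the result stays well
       above DBL_MIN even at 20 sigma (~5.5e-89) */
    for (s = 1.0; s <= 20.0; s += 0.5)
        printf("P(%4.1f s) = %.15E\n", s, gsl_sf_erfc(s / M_SQRT2));
    return 0;
}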
My Visual C++ compiler (v. 6.0) specifies that both double and long
double use an 8-byte representation: "The long double contains 80 bits:
1 for sign, 15 for exponent, and 64 for mantissa. Its range is +/-
1.2E4932 with at least 19 digits of precision."


As explained above `erf(20.0/sqrt(2.0))' can only be represented as a
value different from 1.0 with at least 294 bits for the mantissa.

Martin
Nov 13 '05 #3
Use the lcc-win32 C compiler.
Here is a small program that calculates using 350-bit precision. Math
functions take a 'q' (qfloat) suffix. lcc-win32 supports 350-bit
extended floats as a standard extension.

#include <math.h>
#include <stdio.h>
#include <qfloat.h>
int main(void)
{
printf("%.100qe\n",erfcq(20/sqrtq(2.0q)));
return 0;
}

Output
---------
5.5072482372124673901512455617149306656149954695186611353987433091698372417656070673834468842358561507e-89

Lcc-win32 can be downloaded at no charge from
http://www.cs.virginia.edu/~lcc-win32
"DAVID SCHULMAN" <da*************@verizon.net> wrote in message
news:t4*****************@nwrddc03.gnilink.net...
[original post quoted in full; snipped]

Nov 13 '05 #4

"DAVID SCHULMAN" <da*************@verizon.net> wrote in message
news:t4*****************@nwrddc03.gnilink.net...
[original post quoted in full; snipped]

Whatever method you used to compute those values of the normal
integral, they are not very accurate.

Here are the true values to 15 places, listed with those you posted:

True, to 15 places Your posted values
1.0 3.173105078629141e-01 3.173105078629142600E-001
1.5 1.336144025377161e-01 1.336144025377161700E-001
2.0 4.550026389635841e-02 4.550026389635841700E-002
2.5 1.241933065155227e-02 1.241933065155231800E-002
3.0 2.699796063260189e-03 2.699796063260206900E-003
3.5 4.652581580710501e-04 4.652581580710801700E-004
4.0 6.334248366623984e-05 6.334248366623995700E-005
4.5 6.795346249460121e-06 6.795346249477418600E-006
5.0 5.733031437583878e-07 5.733031437360480700E-007
5.5 3.797912493177544e-08 3.797912495606681200E-008
6.0 1.973175290075396e-09 1.973175289826656400E-009
6.5 8.032001167718236e-11 8.031997289492665000E-011
7.0 2.559625087771670e-12 2.559619183273298400E-012
7.5 6.381783345821792e-14 6.383782391594650100E-014
8.0 1.244192114854357e-15 1.221245327087672200E-015
8.5 1.895906964440664e-17 0.000000000000000000E+000
9.0 2.257176811907681e-19
9.5 2.098903015072521e-21
10.0 1.523970604832105e-23
10.5 8.638012635618461e-26
11.0 3.821319148997350e-28
11.5 1.319154289222600e-30
12.0 3.552964224200000e-33

Why most of those who deal with the normal integral in probability
theory are still stuck with the historical baggage of the error function
is a puzzle to me, as is the poor quality of the results one gets from
standard library implementations of erf(). (One of the most common
is based on Algorithm AS 66, Appl. Statist. (1973), Vol. 22, p. 424,
by Hill, which gives only 6-8 digit accuracy.)

Here is a listing of my method:

/*
Marsaglia Complementary Normal Distribution Function
cPhi(x) = integral from x to infinity of exp(-.5*t^2)/sqrt(2*pi), x<15
15-digit accuracy for x<15, returns 0 for x>15.
*/
#include <math.h>

double cPhi(double x){
 long double v[]={0.,.65567954241879847154L,
  .42136922928805447322L,.30459029871010329573L,
  .23665238291356067062L,.19280810471531576488L,
  .16237766089686746182L,.14010418345305024160L,
  .12313196325793229628L,.10978728257830829123L,
  .99028596471731921395e-1L,.90175675501064682280e-1L,
  .82766286501369177252e-1L,.76475761016248502993e-1L,
  .71069580538852107091e-1L,.66374235823250173591e-1L};
 long double h,a,b,z,t,sum,pwr;
 int i,j;
 if(x>15.) return (0.);
 if(x<-15.) return (1.);
 j=fabs(x)+1.;              /* nearest tabulated point z=j, 1<=j<=15 */
 z=j;
 h=fabs(x)-z;               /* offset from z, with -1 < h <= 0 */
 a=v[j];                    /* a = R(z), b = R'(z) = z*R(z) - 1 */
 b=z*a-1.;
 pwr=1.;
 sum=a+h*b;
 for(i=2;i<60;i+=2){        /* two-term recursion for Taylor coefficients */
  a=(a+z*b)/i;
  b=(b+z*a)/(i+1);
  pwr=pwr*h*h;
  t=sum;
  sum=sum+pwr*(a+h*b);
  if(sum==t) break; }
 sum=sum*exp(-.5*x*x-.91893853320467274178L); /* R(x)*phi(x); the constant
                                                 is ln(sqrt(2*pi)) */
 if(x<0.) sum=1.-sum;
 return ((double) sum);
}
/* end of listing */

The method is based on defining phi(x)=exp(-x^2/2)/sqrt(2*pi) and

R(x)=cPhi(x)/phi(x).

The function R(x) is well-behaved, and the terms of its Taylor
series are readily obtained by a two-term recursion. With an accurate
representation of R(x) at, say, x = 0, 1, 2, ..., 15, a simple evaluation
of the Taylor series at intermediate points provides up to
15 digits of accuracy.
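
A quick driver (my sketch, not part of the listing) reproduces the
two-sided values tabulated above:

#include <stdio.h>

double cPhi(double x);   /* Marsaglia's listing above */

int main(void)
{
    double s;

    /* cPhi is the one-sided tail; 2*cPhi(s) gives the two-sided
       probability P(|X| > s*sigma) listed in the table */
    for (s = 8.0; s <= 12.0; s += 0.5)
        printf("%5.1f  %.15e\n", s, 2.0 * cPhi(s));
    return 0;
}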
An article describing the method will be in the new version of
my Diehard CDROM. A new version of the Diehard tests
of randomness (but not yet the new DVDROM) is at
http://www.csis.hku.hk/~diehard/

One other point about your posting:

As is often done, you mistake odds for probabilities.
The odds for an event should be represented as the ratio
of p to 1-p, not 1/p. Thus if a bookie estimates the
probability that a certain horse will win as .2, then
the odds are 2 to 3 for and 3 to 2 against.
Of course when p is close to zero, the ratio 1/p is close
to (1-p)/p, but it is probably a sound practice to maintain
the distinction between odds and probabilities.

George Marsaglia
Nov 13 '05 #5

"George Marsaglia" <ge*@stat.fsu.edu> wrote in message news:wq********************@comcast.com...


As is often done, you mistake odds for probabilities.
The odds for an event should be represented as the ratio
of p to 1-p, not 1/p. Thus if a bookie estimates the
probability that a certain horse will win as .2, then
the odds are 2 to 3 for and 3 to 2 against.
Of course when p is close to zero, the ratio 1/p is close
to (1-p)/p, but it is probably a sound practice to maintain
the distinction between odds and probabilities.


Typo? P = 0.2 = 1/5 <=> odds of 4-1 against.

John.
Nov 13 '05 #6
