
floating point over-/under-flow

<puts on Compiler Vendor hat>

I've recently discovered that [at least two of] our compilers don't
make any attempt to handle floating point overflow in add/subtract/
multiply/divide, with the result that programmers who don't range-
check their data can e.g. multiply two very tiny values and end up
with a very large one. This is (AFAICT) quite fine according to the
Standard but is certainly a QoI issue and I'd like to raise ours.

I believe that it's the programmer's job to know what possible ranges
his values could take, and to check they're sane before performing
operations upon them. If this is done, then overflow and underflow
can never happen. Rarely is this done, of course, so the compromise
suggested by TPTB is to offer a safety net in the form of adding
inexpensive checks to the generated code which can propagate status
back to the user.

The approach I'm taking is to detect overflow or underflow and set a
flag (in the implementation namespace) as appropriate, but leave the
[invalid] value in the result. This way, if it really is important to
the user to eke that last bit (pun not intended) of information out of
the operation then the way is clear for them to do so. For example, a
multiplication which overflows will end up with a value pow(2,256)
smaller than the correct [unrepresentable] value, thanks to wraparound
of our 8-bit signed binary exponent, but the mantissa will still have
full precision - some recovery may be possible.
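
In user code, checking such a flag might look something like this (a
sketch only - __fp_status and the mask names are illustrative
placeholders in the implementation namespace, not a shipped interface):

    #include <stdio.h>

    /* hypothetical implementation-namespace status word and masks */
    extern volatile unsigned char __fp_status;
    #define __FP_OVF 0x01   /* overflow  */
    #define __FP_UNF 0x02   /* underflow */

    double scaled(double a, double b)
    {
        double r;
        __fp_status = 0;               /* clear before the operation */
        r = a * b;
        if (__fp_status & __FP_OVF) {
            /* r holds the wrapped result: full-precision mantissa,
               but a factor of pow(2,256) too small */
            fprintf(stderr, "overflow in scaled()\n");
        }
        return r;
    }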

An alternative approach (since we use an IEEE-754-compatible
representation already) is to go the whole hog and implement
Infinities, NaNs, and Denormals. At this time we don't want to do this
because it appears to be a significant cost (see below) for not much
return. In any case, how to deal with those types of value in C is
still not defined, so any solution of this type would still be Not C.
I'm interested to hear the c.l.c set of viewpoints on silent vs. noisy
underflow and overflow - in this case, coercing values to 0 or Inf,
vs. leaving them as is and setting a user-checkable flag.
('Very-noisy' would be throwing an exception of some kind, also
undefined by the Standards.) What practices do you use to avoid
hitting overflow and underflow? Is our plan to let the dodgy value
continue to exist of any conceivable value?
For those who've read this far: we make cross-compilers for [often
tiny; <256 bytes of RAM is common] CPUs commonly used in embedded
systems, and we do our damnedest to make them meet C90 and are inching
toward C99 in places. Floating point has become popular, some
deficiencies have been found, and Muggins put his hand up to fix them.

mlp
Nov 14 '05 #1
3 Replies



"Mark L Pappin" <ml*@acm.org> wrote in message
news:m3************@Claudio.Messina...
> <puts on Compiler Vendor hat>
>
> I've recently discovered that [at least two of] our compilers don't
> make any attempt to handle floating point overflow in add/subtract/
> multiply/divide, with the result that programmers who don't range-
> check their data can e.g. multiply two very tiny values and end up
> with a very large one.
[...]
> The approach I'm taking is to detect overflow or underflow and set a
> flag (in the implementation namespace) as appropriate, but leave the
> [invalid] value in the result. This way, if it really is important to
> the user to eke that last bit (pun not intended) of information out of
> the operation then the way is clear for them to do so. For example, a
> multiplication which overflows will end up with a value pow(2,256)
> smaller than the correct [unrepresentable] value, thanks to wraparound
> of our 8-bit signed binary exponent, but the mantissa will still have
> full precision - some recovery may be possible.
[...]
> I'm interested to hear the c.l.c set of viewpoints on silent vs. noisy
> underflow and overflow - in this case, coercing values to 0 or Inf,
> vs. leaving them as is and setting a user-checkable flag.
> ('Very-noisy' would be throwing an exception of some kind, also
> undefined by the Standards.) What practices do you use to avoid
> hitting overflow and underflow? Is our plan to let the dodgy value
> continue to exist of any conceivable value?

Long, long ago, before C, the usual practice for such hardware was to throw
an exception. If the application wished to continue, it had to provide an
exception handler. The application thus made the choice itself: set the
result to 0 or Inf and continue silently, issue a diagnostic, or take a
branch to process the out-of-range value. Continuing without fixing the
value up seems unlikely to be useful. When IEEE-754 came in, the argument
was that this exception handling was an unnecessary burden.
Nov 14 '05 #2

In message <m3************@Claudio.Messina>
Mark L Pappin <ml*@acm.org> wrote:
> An alternative approach (since we use an IEEE-754-compatible
> representation already) is to go the whole hog and implement
> Infinities, NaNs, and Denormals. At this time we don't want to do this
> because it appears to be a significant cost (see below) for not much
> return. In any case, how to deal with those types of value in C is
> still not defined, so any solution of this type would still be Not C.
Actually, C99 does fully define how to deal with all those values for
IEEE754, in Annex F. Only "signalling NaNs" aren't covered.

The main problem with implementation of NaNs etc is the extra cost
of handling them as INPUTs to operations. All your operations will need
to be able to detect them and propagate them accordingly, otherwise there's
no point generating them in the first place. This can be expensive.
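
To make that input-side cost concrete, a soft-float multiply that honours
special values has to start with something like this before it ever reaches
the ordinary arithmetic path (a sketch; isnan/isinf/signbit and the
NAN/INFINITY macros are C99 <math.h>):

    #include <math.h>

    double sf_mul(double a, double b)
    {
        /* classify both operands before any ordinary arithmetic */
        if (isnan(a) || isnan(b))
            return NAN;                    /* propagate a quiet NaN */
        if (isinf(a) || isinf(b)) {
            if (a == 0.0 || b == 0.0)
                return NAN;                /* Inf * 0 is invalid */
            return (signbit(a) != signbit(b)) ? -INFINITY : INFINITY;
        }
        /* ... ordinary (cheap) multiply path ... */
        return a * b;
    }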

I'd be tempted to say that the basic operations should signal SIGFPE in the
event of an error (i.e. enable Invalid Operation, Divide By Zero and Overflow
traps, in IEEE754 terminology), and kill the program. This means NaNs and
Infinities can't get into the system. I've used such systems quite
extensively.
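
From the program's side the trap-and-die model is about this simple (a
minimal sketch; whether and when the runtime raises SIGFPE, and what a
handler may safely do, is implementation-defined):

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>

    static void fpe_handler(int sig)
    {
        (void)sig;
        fprintf(stderr, "floating point exception - aborting\n");
        abort();
    }

    int main(void)
    {
        signal(SIGFPE, fpe_handler);
        /* ... any overflow, invalid operation or divide-by-zero
           now terminates loudly instead of wrapping silently ... */
        return 0;
    }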

If you intend to just set a flag and continue execution, then I feel it's
important that some sort of indicator value be left in the result, rather
than just leaving it wrapped. If you can't manage "NaN" or "Inf", then at the
very least it should be "HUGE_VAL", as per the <math.h> functions.
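
Something in the spirit of that <math.h> convention - return +/-HUGE_VAL
and set errno to ERANGE - might look like this (a sketch; overflowed()
stands in for whatever status check the implementation provides):

    #include <math.h>
    #include <errno.h>

    extern int overflowed(void);   /* hypothetical status check */

    double mul_sat(double a, double b)
    {
        double r = a * b;
        if (overflowed()) {
            errno = ERANGE;
            r = ((a < 0) != (b < 0)) ? -HUGE_VAL : HUGE_VAL;
        }
        return r;
    }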

As for denormals, flushing all tiny results to zero is a reasonable thing to
do if you can't implement denormals properly. Kahan wouldn't approve, as it
leaves a massive hole in the number line, but it's far better than your
current "tiny * tiny -> huge".

By the way, the overflow flags you're describing are defined in <fenv.h> in
C99 - there's no need for you to invent your own, I believe.

You may want to consider the FENV #pragmas. I'm not sure that FENV_ACCESS
will give you any leeway on shrinking your code size, but you should have a
look at that area. Otherwise some sort of compiler option or private #pragma
to suppress the extra code bloat of proper range checking will probably be in
order in your situation.
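
For reference, the user-side C99 idiom those flags support (assuming the
implementation defines the exception macros and actually reports the
exceptions):

    #include <fenv.h>
    #include <stdio.h>

    #pragma STDC FENV_ACCESS ON

    double careful_mul(double a, double b)
    {
        double r;
        feclearexcept(FE_OVERFLOW | FE_UNDERFLOW);
        r = a * b;
        if (fetestexcept(FE_OVERFLOW))
            fprintf(stderr, "overflow\n");
        if (fetestexcept(FE_UNDERFLOW))
            fprintf(stderr, "underflow\n");
        return r;
    }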
> Floating point has become popular, some deficiencies have been found, and
> Muggins put his hand up to fix them.


Well, this will be an important life lesson for you.

--
Kevin Bracey, Principal Software Engineer
Tematic Ltd Tel: +44 (0) 1223 503464
182-190 Newmarket Road Fax: +44 (0) 1728 727430
Cambridge, CB5 8HE, United Kingdom WWW: http://www.tematic.com/
Nov 14 '05 #3

> I've recently discovered that [at least two of] our compilers don't
> make any attempt to handle floating point overflow in add/subtract/
> multiply/divide, with the result that programmers who don't range-
> check their data can e.g. multiply two very tiny values and end up
> with a very large one. This is (AFAICT) quite fine according to the
> Standard but is certainly a QoI issue and I'd like to raise ours.
What floating point hardware / software allows you to multiply two
small floating-point numbers and end up with a huge one? The
implementations I know of either end up with zero or a small number.
IEEE-754 certainly requires that. You also can't multiply two large
numbers and end up with a small one on any hardware I know of.
Overflow sticks at either Inf or the largest possible value, or
traps. Now you can still end up with an answer that is a large
number of orders of magnitude wrong mathematically, but it won't
appear to be a small, reasonable result when it overflows.

I consider your hardware / software emulation broken, although ANSI
C doesn't.
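
To illustrate, on any IEEE-754 implementation:

    #include <stdio.h>
    #include <float.h>

    int main(void)
    {
        printf("%g\n", DBL_MAX * 2.0);     /* inf - overflow saturates */
        printf("%g\n", 1e-200 * 1e-200);   /* 0 - underflow never wraps up */
        return 0;
    }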
> I believe that it's the programmer's job to know what possible ranges
> his values could take, and to check they're sane before performing
> operations upon them.
The problem here is that individually reasonable values may produce
a collectively unreasonable result. For example, linear interpolation
using two points that are nearly identical. Roundoff error in
decimal conversion can also change a reasonable situation to an
unreasonable one (e.g. it can turn division by zero, which you've
handled, into division by a value that was supposed to be zero but,
thanks to roundoff, isn't - a case you didn't handle because in that
corner case the roundoff error was larger than expected).
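
A concrete instance of the interpolation trap - every input passes a
naive per-value range check, yet the result overflows:

    /* linear interpolation: fine until x1 - x0 is tiny */
    double lerp_at(double x0, double y0, double x1, double y1, double x)
    {
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0);
    }

    /* lerp_at(0.0, 0.0, 1e-308, 1.0, 2.0) divides 2.0 by 1e-308
       and overflows, though every argument is individually modest */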
> If this is done, then overflow and underflow
> can never happen. Rarely is this done, of course, so the compromise
> suggested by TPTB is to offer a safety net in the form of adding
> inexpensive checks to the generated code which can propagate status
> back to the user.
>
> The approach I'm taking is to detect overflow or underflow and set a
> flag (in the implementation namespace) as appropriate, but leave the
> [invalid] value in the result.
I believe the appropriate action is to substitute +Inf, -Inf,
+DBL_MAX, or -DBL_MAX for overflow, or cause a trap.

Underflow to zero is problematical as it provides no clear way to
check for an error, but then it isn't always obvious when an underflow
*IS* an error. Is 0.10 - 0.10 underflowing to zero an error?
Or is it a case of "I had a dime and I spent it"?
> This way, if it really is important to
> the user to eke that last bit (pun not intended) of information out of
> the operation then the way is clear for them to do so. For example, a
> multiplication which overflows will end up with a value pow(2,256)
> smaller than the correct [unrepresentable] value, thanks to wraparound
> of our 8-bit signed binary exponent, but the mantissa will still have
> full precision - some recovery may be possible.
Do you know if anyone is actually trying to do that kind of recovery?
It's very, very system-specific. How often do you deal in numbers
that are within 10 orders of magnitude of DBL_MAX? I consider it
dangerous.

> An alternative approach (since we use an IEEE-754-compatible
> representation already) is to go the whole hog and implement
> Infinities, NaNs, and Denormals.
Overflow to +/- DBL_MAX might cover most of the issues if the
whole IEEE implementation is too expensive.

> At this time we don't want to do this
> because it appears to be a significant cost (see below) for not much
> return. In any case, how to deal with those types of value in C is
> still not defined, so any solution of this type would still be Not C.
How costly is it to check for the binary exponent overflow
and overflow to +/- DBL_MAX and underflow to 0?
> I'm interested to hear the c.l.c set of viewpoints on silent vs. noisy
> underflow and overflow - in this case, coercing values to 0 or Inf,
> vs. leaving them as is and setting a user-checkable flag.
> ('Very-noisy' would be throwing an exception of some kind, also
> undefined by the Standards.) What practices do you use to avoid
> hitting overflow and underflow?
In most cases, range-checking the result to within sane values is
sufficient, given overflow to +/- Inf or +/- DBL_MAX. Legitimate
values are usually smaller in absolute value than the 10th root of
DBL_MAX.

It would be really nasty having 1e+200 * 1e+200 end up being 1e+4
(two outrageous values ending up with a sane-looking product).
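
A sketch of that kind of check (isfinite is C99; the 10th-root threshold
is a rule of thumb, about 6e30 for IEEE double, not a universal constant):

    #include <math.h>
    #include <float.h>

    int result_is_sane(double x)
    {
        return isfinite(x) && fabs(x) < pow(DBL_MAX, 0.1);
    }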
> Is our plan to let the dodgy value
> continue to exist of any conceivable value?
I think it is of negative value, unless someone really needs to
use the full range. How often do you deal with numbers like 1e+300?

If someone really wants to check this hidden flag, couldn't they
also fetch the dodgy value from some other hidden place, while the
normal result is set to DBL_MAX?
> For those who've read this far: we make cross-compilers for [often
> tiny; <256 bytes of RAM is common] CPUs commonly used in embedded
> systems, and we do our damnedest to make them meet C90 and are inching
> toward C99 in places. Floating point has become popular, some
> deficiencies have been found, and Muggins put his hand up to fix them.


Gordon L. Burditt
Nov 14 '05 #4
