Hello,
I have a problem with float comparisons. Yes, I have read FAQ point
14.4ff.
Consider this little code snippet:
#include <stdio.h>
#include <stdlib.h>   /* for EXIT_SUCCESS */
int main(int argc, char *argv[]) {
    double a = 3 ;
    double b = 5 ;
    if( (a/b) > 0.6)
        printf(">\n") ;
    else
        printf("<=\n") ;
    return EXIT_SUCCESS ;
}
As you might guess, it sometimes returns ">", and sometimes "<=".
Now it is very important to me that the program behaves in a really, really
repeatable fashion. The threshold may be arbitrary, but what I want to
avoid at all costs are different results depending on, for example, which
compiler options I use.
In my "real world" program, a and b are ints, so I could solve the
problem by doing
if( (a * 10) > (b * 6) )
But my question is: if it _were_ doubles, what should I do then? How to
get a fully deterministic statement, independent of architecture,
compilers and stuff?
Regards,
January
--
Fire-fox your konqueror!
In article <cc**********@sagnix.uni-muenster.de>,
<NO************@uni-muenster.de> wrote:
> Hello,
> I have a problem with float comparisons. Yes, I have read FAQ point 14.4ff.
> Consider this little code snippet:
> #include <stdio.h>
> int main(int argc, char *argv[]) { double a = 3 ; double b = 5 ;
> if( (a/b) > 0.6) printf(">\n") ; else printf("<=\n") ;
> return EXIT_SUCCESS ; }
> As you might guess, it sometimes returns ">", and sometimes "<=".
> Now it is very important to me that the program behaves in a really, really repeatable fashion. The threshold may be arbitrary, but what I want to avoid at all costs are different results depending on, for example, which compiler options I use.
> In my "real world" program, a and b are ints, so I could solve the problem by doing
> if( (a * 10) > (b * 6) )
> But my question is: if it _were_ doubles, what should I do then? How to get a fully deterministic statement, independent of architecture, compilers and stuff?
From what I can see, you are attempting to perform an equality check against
a floating point number. That is a fundamentally broken idea.
You may want to write a function that compares two floating point numbers
and returns -1, 0, 1 for the cases of
A definitely less than B
A approximately equal to B
A definitely greater than B
where the "approximately equal to" is true if A and B are within 2 or 3 ULPs
of each other (you decide the threshold). Then whenever you want to compare
two floating point numbers, you call this function and check its return
result.
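A minimal sketch of such a three-way comparison, offered as an illustration and not as anyone's production code. It assumes IEEE 754 binary64 doubles and that neither argument is NaN; the names ordered_bits, fuzzy_cmp and max_ulps are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Map a double onto an integer scale that increases with the value
   of the double. Assumes IEEE 754 binary64 and a non-NaN argument. */
static int64_t ordered_bits(double x)
{
    int64_t i;
    memcpy(&i, &x, sizeof i);           /* reinterpret the bit pattern */
    return (i < 0) ? INT64_MIN - i : i; /* negative values count downward */
}

/* Returns -1, 0 or 1: a definitely less than b, a within max_ulps
   representable steps of b, or a definitely greater than b. */
int fuzzy_cmp(double a, double b, uint64_t max_ulps)
{
    int64_t ia = ordered_bits(a);
    int64_t ib = ordered_bits(b);
    /* Unsigned subtraction avoids signed overflow for distant values. */
    uint64_t dist = (ia > ib) ? (uint64_t)ia - (uint64_t)ib
                              : (uint64_t)ib - (uint64_t)ia;
    if (dist <= max_ulps)
        return 0;
    return (a < b) ? -1 : 1;
}
```

A call such as fuzzy_cmp(x, y, 3) treats x and y as equal when they are at most three representable doubles apart. Note that this only moves the fuzzy edge; it does not make the outcome independent of compiler or architecture.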
John Cochran <jd*@smof.fiawol.org> wrote:
> From what I can see, you are attempting
I'm attempting to find out what the proper way of solving such problems is.
In my program, I did manage to forget the floating point numbers and use
integers, but that might not always be the case -- and that is why I ask.
> to perform an equality check against a floating point number. That is a fundamentally broken idea.
Well, I know. But somehow sometimes you need to check e.g. whether your
calculated statistics p is greater than, for example, 0.95. My question
is: what do you do then, if you want deterministic behaviour?
> You may want to write a function that compares two floating point numbers and returns -1, 0, 1 for the cases of
> A definitely less than B
> A approximately equal to B
> A definitely greater than B
> where the "approximately equal to" is true if A and B are within 2 or 3 ULPs of each other (you decide the threshold). Then whenever you want to compare two floating point numbers, you call this function and check its return result.
Yes, I know, but this still does not give me deterministic behaviour. Say,
I have a population of 1000 data points, my program runs and finds that 123
of them are statistically significant. Now someone else takes it, and
finds around twice as many statistically significant data points, because
of one f...loating comparison! How do you deal with such problems -- this
is my question.
j.
--
Fire-fox your konqueror!
<NO************@uni-muenster.de> wrote in message
news:cc**********@sagnix.uni-muenster.de...
> John Cochran <jd*@smof.fiawol.org> wrote:
>> From what I can see, you are attempting
> I'm attempting to find out what the proper way of solving such problems is. In my program, I did manage to forget the floating point numbers and use integers, but that might not always be the case -- and that is why I ask.
>> to perform an equality check against a floating point number. That is a fundamentally broken idea.
What you want to do is compare the absolute value of their difference to
some threshold.
#define EPSILON 0.00001
if (fabs(d2-d1) < EPSILON)
{
// They're essentially equal
}
else
{
// They're not equal
}
Andre
Andre Charron <an***@tenbase.com> wrote:
> What you want to do is compare the absolute value of their difference to some threshold.
> #define EPSILON 0.00001
> if (fabs(d2-d1) < EPSILON)
You think this will be deterministic? Look: the problem is not in finding
whether or not two values are equal.
The problem is
1) deciding whether a value is greater than a threshold
2) doing so in a _deterministic_, compiler-independent way
Adding an epsilon value does not change this problem, it just shifts its
boundaries.
OK, let me try to explain it like this.
What happens if, mathematically, a + epsilon = b? Then |a - b| = epsilon. In
your program, you compare fabs(a - b) with epsilon. This is not a
deterministic operation -- you are comparing two floats. With one compiler
fabs(a - b) might be smaller than epsilon, with another -- greater.
See? I don't really want to know for sure whether a > b -- b is a
threshold, b is arbitrarily chosen. It can as well be b + epsilon.
Instead, I want the program to always make the same decision, regardless
of whether I use compiler X or Y.
j.
--
Fire-fox your konqueror!
In article <pB*********************@read2.cgocable.net>,
Andre Charron <an***@tenbase.com> wrote:
> <NO************@uni-muenster.de> wrote in message news:cc**********@sagnix.uni-muenster.de...
>> John Cochran <jd*@smof.fiawol.org> wrote:
>>> From what I can see, you are attempting
>> I'm attempting to find out what the proper way of solving such problems is. In my program, I did manage to forget the floating point numbers and use integers, but that might not always be the case -- and that is why I ask.
>>> to perform an equality check against a floating point number. That is a fundamentally broken idea.
> What you want to do is compare the absolute value of their difference to some threshold.
> #define EPSILON 0.00001
> if (fabs(d2-d1) < EPSILON)
> {
> // They're essentially equal
> }
> else
> {
> // They're not equal
> }
> Andre
Not quite. I would suggest
if (fabs((d2-d1)/max(fabs(d1),fabs(d2))) < EPSILON) {
// Equal
} else {
// Not equal
}
as the comparison. This allows it to adjust with the scale of the numbers.
But given the original posting, the poster seems to want the exact same
behavior for equality checks. This cannot be guaranteed even if you perform
a close range check. The indeterminate behavior will be shifted from where
the numbers are nearly equal to where the numbers are at the edges of the
range checks.
Overall, use the <, >, <=, >= operators and be aware that the edge between
< and >= is a bit "fuzzy" when it comes to floating point numbers. You
cannot guarantee identical behavior between compilers or compiler options.
>> What you want to do is compare the absolute value of their difference to some threshold.
>> #define EPSILON 0.00001
>> if (fabs(d2-d1) < EPSILON)
> You think this will be deterministic? Look: the problem is not in finding whether or not two values are equal.
> The problem is
> 1) deciding whether a value is greater than a threshold
> 2) doing so in a _deterministic_, compiler-independent way
Take a step back from this problem. You want to *RE-MEASURE* your
data, then recalculate whether it is greater than a threshold.
Are you going to get deterministic results? No. Every time you
take a ruler and measure something you're going to get slightly
different results, even if the length of it is NOT changing with
time.
You're never going to get rid of edge-effect problems unless you
measure AND calculate with infinite precision. And infinite-precision
measurements are very difficult (and expensive) to do. A guy named
Heisenberg had some interesting things to say on this subject.
> Adding an epsilon value does not change this problem, it just shifts its boundaries.
It also says that, if compiler differences make a significant difference
in the results, your results are complete mush and should be discarded.
> OK, let me try to explain it like this.
> What happens if, mathematically, a + epsilon = b? Then |a - b| = epsilon. In your program, you compare fabs(a - b) with epsilon. This is not a deterministic operation -- you are comparing two floats. With one compiler fabs(a - b) might be smaller than epsilon, with another -- greater.
> See? I don't really want to know for sure whether a > b -- b is a threshold, b is arbitrarily chosen. It can as well be b + epsilon.
> Instead, I want the program to always make the same decision, regardless of whether I use compiler X or Y.
In other words, you want IDENTICAL garbage, rather than different-smelling
garbage. But it's still garbage. And I prefer to KNOW my results are
garbage rather than covering it up.
Gordon L. Burditt
In article <cc**********@smof.fiawol.org>,
John Cochran <jd*@smof.fiawol.org> wrote:
> In article <pB*********************@read2.cgocable.net>, Andre Charron <an***@tenbase.com> wrote:
>> What you want to do is compare the absolute value of their difference to some threshold.
>> #define EPSILON 0.00001
>> if (fabs(d2-d1) < EPSILON)
> Not quite. I would suggest
> if (fabs((d2-d1)/max(fabs(d1),fabs(d2))) < EPSILON) {
> as the comparison. This allows it to adjust with the scale of the numbers.
I would suggest:
if (fabs(d2-d1)/(1.0+max(fabs(d1),fabs(d2))) < EPSILON) {
That will take care of the case when d1 and d2 are both zero or
almost zero.
--
rr
In article <cc**********@pc18.math.umbc.edu>,
Rouben Rostamian <ro****@pc18.math.umbc.edu> wrote:
> In article <cc**********@smof.fiawol.org>, John Cochran <jd*@smof.fiawol.org> wrote:
>> In article <pB*********************@read2.cgocable.net>, Andre Charron <an***@tenbase.com> wrote:
>>> What you want to do is compare the absolute value of their difference to some threshold.
>>> #define EPSILON 0.00001
>>> if (fabs(d2-d1) < EPSILON)
>> Not quite. I would suggest
>> if (fabs((d2-d1)/max(fabs(d1),fabs(d2))) < EPSILON) {
>> as the comparison. This allows it to adjust with the scale of the numbers.
> I would suggest:
> if (fabs(d2-d1)/(1.0+max(fabs(d1),fabs(d2))) < EPSILON) {
> That will take care of the case when d1 and d2 are both zero or almost zero.
Reasonable.
However, I suspect that zero needs to be treated independently. Your
solution would make all comparisons where both numbers have an absolute
value below EPSILON claim to be equal. This is most likely not what you
want.
For example, I have an EPSILON of 1e-12 (i.e. I will treat as equal any
two numbers that match to within 12 significant digits). This will allow
for a reasonable amount of slop in the lower digits, assuming that my
floating point math is precise to within 15 digits.
What you gave would consider 1e-13 and 1e-14 to be equal even though the
floating point math is well within its ability to represent both numbers
to within 15 digits and it's going nowhere near any of its limits.
In article <cc**********@sagnix.uni-muenster.de>, NO************@uni-muenster.de wrote:
> double a = 3 ; double b = 5 ;
> if( (a/b) > 0.6) printf(">\n") ; else printf("<=\n") ;
> As you might guess, it sometimes returns ">", and sometimes "<=".
> Now it is very important to me that the program behaves in a really, really repeatable fashion
There is something seriously wrong with your computer if the same
computer returns ">", and "<=" for the same expression. Are you running
on a Pentium? There was a bug with Pentium's division several years ago
-- perhaps you've got a bad chip. Intel said that they would replace the
chips if the user was affected, but most people are not.
It is understandable if you get a different result on different types of
machines.
In article <cc**********@smof.fiawol.org>,
John Cochran <jd*@smof.fiawol.org> wrote:
> In article <cc**********@pc18.math.umbc.edu>, Rouben Rostamian <ro****@pc18.math.umbc.edu> wrote:
>> In article <cc**********@smof.fiawol.org>, John Cochran <jd*@smof.fiawol.org> wrote:
>>> In article <pB*********************@read2.cgocable.net>, Andre Charron <an***@tenbase.com> wrote:
>>>> What you want to do is compare the absolute value of their difference to some threshold.
>>>> #define EPSILON 0.00001
>>>> if (fabs(d2-d1) < EPSILON)
>>> Not quite. I would suggest
>>> if (fabs((d2-d1)/max(fabs(d1),fabs(d2))) < EPSILON) {
>>> as the comparison. This allows it to adjust with the scale of the numbers.
>> I would suggest:
>> if (fabs(d2-d1)/(1.0+max(fabs(d1),fabs(d2))) < EPSILON) {
>> That will take care of the case when d1 and d2 are both zero or almost zero.
> Reasonable. However, I suspect that zero needs to be treated independently. Your solution would make all comparisons where both numbers have an absolute value below EPSILON claim to be equal. This is most likely not what you want.
> For example, I have an EPSILON of 1e-12 (i.e. I will treat as equal any two numbers that match to within 12 significant digits). This will allow for a reasonable amount of slop in the lower digits, assuming that my floating point math is precise to within 15 digits.
> What you gave would consider 1e-13 and 1e-14 to be equal even though the floating point math is well within its ability to represent both numbers to within 15 digits and it's going nowhere near any of its limits.
I see what you mean. Perhaps, as you suggest, the separate treatment
of the zero is the best solution. It's ugly, but that's the way it is.
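One possible shape for that separate treatment, as a hedged sketch only: a relative test written as a multiplication so nothing is divided by zero, plus an explicit absolute tolerance for values at or near zero. The function name close_enough and the parameters rel_eps and abs_eps are invented for this example, and the right tolerances depend on the application. Since max() is not a standard C function, the larger magnitude is picked by hand.

```c
#include <math.h>
#include <stdbool.h>

/* True if d1 and d2 differ by at most abs_eps, or by at most
   rel_eps times the larger of their magnitudes. */
bool close_enough(double d1, double d2, double rel_eps, double abs_eps)
{
    double diff = fabs(d2 - d1);
    double m1 = fabs(d1);
    double m2 = fabs(d2);
    double scale = (m1 > m2) ? m1 : m2;

    if (diff <= abs_eps)            /* covers d1 == d2 == 0 and near-zero values */
        return true;
    return diff <= rel_eps * scale; /* no division, so no 0/0 */
}
```

With abs_eps set to 0.0, only exact equality counts near zero, so 1e-13 and 1e-14 stay distinct for rel_eps = 1e-12; with abs_eps = 1e-12 they compare as equal. The caller decides which behaviour is wanted.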
--
rr
In article <cc**********@sagnix.uni-muenster.de> NO************@uni-muenster.de writes:
> I have a problem with float comparisons. Yes, I have read FAQ point 14.4ff.
> ....
> int main(int argc, char *argv[]) { double a = 3 ; double b = 5 ; if( (a/b) > 0.6) printf(">\n") ; else printf("<=\n") ;
> ....
> Now it is very important to me that the program behaves in a really, really repeatable fashion. The threshold may be arbitrary, but what I want to avoid at all costs are different results depending on what compiler
> ....
> But my question is: if it _were_ doubles, what should I do then? How to get a fully deterministic statement, independent of architecture, compilers and stuff?
The short answer is: you can't. The long answer is: because there are
(still) many floating point applications and many compilers around,
floating point arithmetic is not deterministic once you cross compilers
or architectures. There are too many things not specified:
1. How is the value 0.6 converted by the compiler to a floating-point value?
2. How is a/b rounded to a proper floating-point value?
3. Is a/b kept in higher precision?
to mention three.
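As a partial illustration of point 3 only, here is a sketch that forces both sides of the comparison through a stored double before comparing. It relies on the C99 rule that a store to a double object discards any extra precision, and it reports FLT_EVAL_METHOD from <float.h> so that a build can reveal how it evaluates expressions. The function names are invented for this example. It does nothing about points 1 and 2, so it narrows the variation but does not remove it.

```c
#include <float.h>
#include <stdbool.h>

/* True if a/b is greater than threshold once both values have been
   rounded to plain double. */
bool quotient_exceeds(double a, double b, double threshold)
{
    /* A volatile double must be stored to memory, so the quotient
       cannot stay in a wider register across the comparison. */
    volatile double q = a / b;
    volatile double t = threshold;
    return q > t;
}

/* FLT_EVAL_METHOD: 0 means operations are evaluated in the type of the
   operands, 2 means in long double, a negative value means unknown. */
int eval_method(void)
{
    return (int)FLT_EVAL_METHOD;
}
```

On a platform where the literal 0.6 and the quotient 3.0/5.0 both round to the same double, quotient_exceeds(3.0, 5.0, 0.6) is false, but the standard leaves enough latitude that this is an observation about common implementations and not a guarantee.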
The better question is *why* you want to have a deterministic
division of floating point values. I think your problem is probably
better solved with scaled integers.
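A small sketch of the scaled-integer idea, generalising the (a * 10) > (b * 6) test from the original posting. It assumes b and den are positive and widens to 64 bits so the products cannot overflow; the name ratio_exceeds and its parameters are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a/b > num/den, computed with integers only.
   Assumes b > 0 and den > 0 so cross-multiplying keeps the direction
   of the inequality. */
bool ratio_exceeds(int32_t a, int32_t b, int32_t num, int32_t den)
{
    /* The product of two 32-bit values always fits in 64 bits. */
    return (int64_t)a * den > (int64_t)b * num;
}
```

Here ratio_exceeds(3, 5, 6, 10) asks whether 3/5 > 0.6 and is false on every conforming implementation, because integer arithmetic involves no rounding.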
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
In article <cc**********@sagnix.uni-muenster.de> NO************@uni-muenster.de writes:
>> to perform an equality check against a floating point number. That is a fundamentally broken idea.
> Well, I know. But somehow sometimes you need to check e.g. whether your calculated statistics p is greater than, for example, 0.95. My question is: what do you do then, if you want deterministic behaviour?
Do you want deterministic behaviour if you are doing statistics?
> Yes, I know, but this still does not give me deterministic behaviour. Say, I have a population of 1000 data points, my program runs and finds that 123 of them are statistically significant. Now someone else takes it, and finds around twice as many statistically significant data points, because of one f...loating comparison! How do you deal with such problems -- this is my question.
You see immediately that about 123 data points are on the edge of being
statistically significant or not. I would consider that quite a large
number in a population of 1000 data points! And I would distrust any
statistics made on either of the assumptions of significance. When so
many data points are on the edge, there is something seriously wrong
with the methodology.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Dik T. Winter <Di********@cwi.nl> wrote:
>> Well, I know. But somehow sometimes you need to check e.g. whether your calculated statistics p is greater than, for example, 0.95. My question is: what do you do then, if you want deterministic behaviour?
> Do you want deterministic behaviour if you are doing statistics?
Oh, definitely. I expect a deterministic behaviour from my software.
Because if it is not, it will bias the statistics. I might even find a
statistical significance because of some biased randomness in floating
operations.
>> Yes, I know, but this still does not give me deterministic behaviour. Say, I have a population of 1000 data points, my program runs and finds that 123 of them are statistically significant. Now someone else takes it, and finds around twice as many statistically significant data points, because of one f...loating comparison! How do you deal with such problems -- this is my question.
> You see immediately that about 123 data points are on the edge of being statistically significant or not. I would consider that quite a large number in a population of 1000 data points! And I would distrust any
OK, let's say there are 1e9 data points, you feel better now? :-)
> statistics made on either of the assumptions of significance. When so many data points are on the edge, there is something seriously wrong with the methodology.
Gosh, you need not take an abstract example so seriously. Forget about
the examples.
Is there, generally speaking, a way to make deterministic floating point
calculations in C? It is just a theoretical question. Maybe the answer
is: no, there is no such way in any language, because no computer can do
real-number arithmetic, it can only approximate it. Or, more likely, the
answer is: it can be done, but it is hard to do, and you lose precision by
orders of magnitude.
j.
--
Fire-fox your konqueror!
Dik T. Winter <Di********@cwi.nl> wrote:
> The better question is on *why* you do want to have a deterministic division of floating point values.
Short answer is: because I'm a curious person, that's why. I just want to
know.
> I think your problem is probably better solved with scaled integers.
As I mentioned in the original posting, that is precisely what I did.
j.
--
Fire-fox your konqueror!
NO************@uni-muenster.de wrote:
> Is there, generally speaking, a way to make deterministic floating point calculations
No. Never mind the language. Not unless you juggle the bits yourself.
Richard
Gordon Burditt <go***********@burditt.org> wrote:
> Take a step back from this problem. You want to *RE-MEASURE* your data, then recalculate whether it is greater than a threshold. Are you going to get deterministic results? No. Every time you take a ruler and measure something you're going to get slightly different results, even if the length of it is NOT changing with time.
This is not true. Count your fingers. Count them again. And? Got
different results? :-) Ever heard of non-parametric statistics? Your
measurements can be infinitely precise, since they can be integer.
However, your statistics is not integer.
Anyway -- I had my problem solved before I sent the first posting; my
interest is purely academic.
> You're never going to get rid of edge-effect problems unless you measure AND calculate with infinite precision. And infinite-precision measurements are very difficult (and expensive) to do. A guy named Heisenberg had some interesting things to say on this subject.
What can I say. I am a biologist. I do infinite-precision measurements on
a daily basis. Like: counting the number of amino-acids in a sequence.
Or: calculating the number of sequences with the feature x.
This does not change a bit the need for floating precision, you know, since
statistical reasoning may require it.
>> Adding an epsilon value does not change this problem, it just shifts its boundaries.
> It also says that, if compiler differences make a significant difference in the results, your results are complete mush and should be discarded.
Really? Shouldn't think so. Rather that the way I'm approaching the
problem is mush.
j.
--
Fire-fox your konqueror!
> But my question is: if it _were_ doubles, what should I do then? How to get a fully deterministic statement, independent of architecture, compilers and stuff?
Not only do compilers differ, but you might even get different results
on different architectures.
You'd have to write your own floating point math routines. And you'd
have to control everything down to the bit level to be sure it would
work properly, especially on multiple architectures. A few years ago I
did some messing around with floating point stuff on a few
architectures. As I recall, SPARC and PowerPC produced identical
results, but x86 (not a Pentium! :) ) was different.
Having your own routines would give you the freedom to get as much
precision as you want, and you'd be able to ensure that the results are
deterministic and the same on all architectures, but it would be
pretty slow.
In article <cc**********@smof.fiawol.org>, jd*@smof.fiawol.org (John Cochran) wrote:
> Not quite. I would suggest
> if (fabs((d2-d1)/max(fabs(d1),fabs(d2))) < EPSILON) {
> // Equal
> } else {
> // Not equal
> }
> as the comparison. This allows it to adjust with the scale of the numbers.
And if you are lucky, your program will crash when d1 = d2 = 0. That's
if you are lucky; if you are out of luck it will cause your program to
give the wrong answer in a rare case, which leads to an error costing a
customer of your company millions, they find the error, sue your
employer and you get fired.
Why not
fabs (d2 - d1) <= EPSILON * max (fabs (d1), fabs (d2))
And then you can think a moment whether it makes any difference whether
you use fabs (d1), fabs (d2) or the larger of both on the right side -
if they are close then it makes no difference, if the values are far
apart then it makes no difference either! So just write
fabs (d2 - d1) <= EPSILON * fabs (d1)
In article <tu************************@news.sf.sbcglobal.net>,
Ken Turkowski <tu**@worldserver.com> wrote:
> In article <cc**********@sagnix.uni-muenster.de>, NO************@uni-muenster.de wrote:
>> double a = 3 ; double b = 5 ;
>> if( (a/b) > 0.6) printf(">\n") ; else printf("<=\n") ;
>> As you might guess, it sometimes returns ">", and sometimes "<=".
>> Now it is very important to me that the program behaves in a really, really repeatable fashion
> There is something seriously wrong with your computer if the same computer returns ">", and "<=" for the same expression. Are you running on a Pentium? There was a bug with Pentium's division several years ago -- perhaps you've got a bad chip. Intel said that they would replace the chips if the user was affected, but most people are not.
> It is understandable if you get a different result on different types of machines.
1. He never mentioned identical or different implementations. The C code
itself can produce different results, depending on the implementation.
2. The same compiler with different compiler options may produce
different results as well.