
# Comparison between doubles

Hi

I recently had to write a small piece of code in a competition, but my
code was rejected because it failed one of the test cases.

The problem was: we are given a vector of strings, each consisting
only of '1' and '2' characters ("12122", "2121", and so on). I had to
find the string in which the percentage of '1's is minimum. The problem
and solution are both trivial, but I was told that comparing doubles
with < or > does not always guarantee a correct solution, i.e. it can
fail for numbers differing by a very small value.

Also, if two strings have the same minimal percentage of '1's, the
string with the smallest index is to be returned (i.e. if the string at
index 0 is identical to the one at index 4, then 0 is to be returned).

The code goes something like this:

#include <string>
#include <vector>

using namespace std;

class Elections {
public:
    int visit(vector<string> like);
};

int Elections::visit(vector<string> like)
{
    double min = 1000, temp = 0;
    int index = 0;

    for (size_t i = 0; i < like.size(); i++)
    {
        temp = 0;

        // note: the inner bound must be like[i].size() (the string's
        // length), not like.size() (the number of strings)
        for (size_t j = 0; j < like[i].size(); j++)
            if (like[i][j] == '1')
                temp++;

        temp = temp / like[i].size();

        if (min > temp)
        {
            min = temp;
            index = (int)i;
        }
    }
    return index;
}

The test case for which my solution fails is
{"11122211222111111112121", "11122211222111111112121",
"11122211222111111112121", "11122211222111111112121",
"11122211222111111112121", "11122211222111111112121",
"11122211222111111112121", "11122211222111111112121",
"11122211222111111112121", "11122211222111111112121"}

The above case has all identical strings, yet according to the system
test it returned 9 (instead of index 0). I tried running the code under
a different compiler and the answer I got was correct (i.e. 0).

Besides the floating-point comparison ambiguity, I am not sure what
other reason could have led to different results under two different
compilers.

Somesh

Jul 23 '05 #1
ch************@gmail.com wrote:

[original post quoted in full - snipped]

try this

#define SMALL_NON_ZERO 1e-16 /* or something else small */
#define DBL_EQ(X,Y) ( (X) - (Y) < SMALL_NON_ZERO ) /* X == Y */
#define DBL_LT(X,Y) ( (Y) - (X) > SMALL_NON_ZERO ) /* X < Y */
#define DBL_GT(X,Y) ( (X) - (Y) > SMALL_NON_ZERO ) /* X > Y */

just an idea,
I've used this only for DBL_EQ, but assume it should extend to > and < also.
Jul 23 '05 #2

Aaron Gage wrote:

#define SMALL_NON_ZERO 1e-16 /* or something else small */
#define DBL_EQ(X,Y) ( (X) - (Y) < SMALL_NON_ZERO ) /* X == Y */
#define DBL_LT(X,Y) ( (Y) - (X) > SMALL_NON_ZERO ) /* X < Y */
#define DBL_GT(X,Y) ( (X) - (Y) > SMALL_NON_ZERO ) /* X > Y */

just an idea,
I've used this only for DBL_EQ, but assume it should extend to > and < also.

just one more thing: how small do you think SMALL_NON_ZERO should
be? And does this have any bearing on the portability of the code,
i.e. will it be compiler- or platform-dependent?

also, is there a way to avoid these approximations?

are there operators that allow bitwise operations on double? Maybe
those could avoid the need to make approximations.

regards

Somesh

Jul 23 '05 #3
Aaron Gage wrote:

try this

#define SMALL_NON_ZERO 1e-16 /* or something else small */
#define DBL_EQ(X,Y) ( (X) - (Y) < SMALL_NON_ZERO ) /* X == Y */

#define DBL_EQ(X,Y) ( fabs( (X) - (Y) ) < SMALL_NON_ZERO ) /* X == Y */
--
Karl Heinz Buchegger
Jul 23 '05 #4
ch************@gmail.com wrote:

Aaron Gage wrote:

#define SMALL_NON_ZERO 1e-16 /* or something else small */
#define DBL_EQ(X,Y) ( (X) - (Y) < SMALL_NON_ZERO ) /* X == Y */
#define DBL_LT(X,Y) ( (Y) - (X) > SMALL_NON_ZERO ) /* X < Y */
#define DBL_GT(X,Y) ( (X) - (Y) > SMALL_NON_ZERO ) /* X > Y */

just an idea,
I've used this only for DBL_EQ, but assume it should extend to > and < also.

just one more thing: how small do you think SMALL_NON_ZERO should
be?

That depends entirely on your application. The problem with this sort
of thing is that there is no 'one size fits all' value. It depends on
the underlying characteristics of double (how many bits are in the
representation) and on the history of the numbers (how many and which
floating-point operations were done before you use the values for testing).

"Floating point operations are like moving piles of sand.
Every time you do it, you lose a little sand and pick up
a little dirt."

and does this have any bearing on the portability of the code, i.e.
will it be compiler- or platform-dependent?

Yes. Definitely. It depends on the floating-point format and on how
many bits the representation has.

also, is there a way to avoid these approximations?

No. In some cases one can get away with rational arithmetic. But since
there are numbers which don't have an exact fractional representation
(e.g. sqrt(2)), it is impossible to represent every number as a fraction
(not even in theory, let alone with the limited bits in a computer).

are there operators that allow bitwise operations on double??maybe
those can avoid the need to make approximations.

http://www.petebecker.com/journeymansshop.html

and of course the classical:
"What Every Computer Scientist Should Know About Floating-Point Arithmetic"
http://docs-pdf.sun.com/800-7895/800-7895.pdf

--
Karl Heinz Buchegger
Jul 23 '05 #5
ch************@gmail.com wrote:
Aaron Gage wrote:

#define SMALL_NON_ZERO 1e-16 /* or something else small */
#define DBL_EQ(X,Y) ( (X) - (Y) < SMALL_NON_ZERO ) /* X == Y */
#define DBL_LT(X,Y) ( (Y) - (X) > SMALL_NON_ZERO ) /* X < Y */
#define DBL_GT(X,Y) ( (X) - (Y) > SMALL_NON_ZERO ) /* X > Y */

just an idea,
I've used this only for DBL_EQ, but assume it should extend to > and < also.

just one more thing: how small do you think SMALL_NON_ZERO should
be? And does this have any bearing on the portability of the code,
i.e. will it be compiler- or platform-dependent?

also, is there a way to avoid these approximations?

are there operators that allow bitwise operations on double? Maybe
those could avoid the need to make approximations.

I would get rid of the doubles and use integer values throughout. The
only other change needed would be to apply a "scaling factor", say 100
or 1000, when calculating the ratio of 1's to 2's in a given string.

In other words, rewrite the function along these lines (LONG_MAX
needs <climits>):

int Elections::visit(const vector<string>& like)
{
    long min = LONG_MAX;
    int index = 0;
    const long kScalingFactor = 100;

    for (size_t i = 0; i < like.size(); i++)
    {
        long temp = 0;

        // again, iterate over the characters of like[i], not the vector
        for (size_t j = 0; j < like[i].size(); j++)
            if (like[i][j] == '1')
                temp++;

        temp = (temp * kScalingFactor) / like[i].size();

        if (min > temp)
        {
            min = temp;
            index = (int)i;
        }
    }
    return index;
}

Jul 23 '05 #6
<ch************@gmail.com> wrote in message
The problem was: we are given a vector of strings, each consisting
only of '1' and '2' characters ("12122", "2121", and so on). I had to
find the string in which the percentage of '1's is minimum. The problem
and solution are both trivial, but I was told that comparing doubles
with < or > does not always guarantee a correct solution, i.e. it can
fail for numbers differing by a very small value.

Perhaps you should disregard future advice from this source, as it's
nonsense.

It is certainly true that when you do a floating-point computation, you have
to allow for the possibility that the result will not have infinite
precision. But on every computer I've ever encountered, once you have
floating-point results, comparing those results will behave reasonably in
terms of ordering. By that I mean that for any three floating-point numbers
x, y, and z (assuming that x, y, and z are all numbers rather than NaN):

x==x
if x == y then y == x
if x == y and y == z, then x == z
if x<y and y<z, then x<z
Exactly one of x<y, x>y, and x==y is true

and so on. These properties are required for all computers that implement
IEEE floating-point, which includes all machines that run the Microsoft or
Apple operating systems.

So there is absolutely nothing wrong with using < in the usual way to find
the largest of a collection of floating-point numbers. It is true that if
the numbers are not exactly what you intended, because of computational
inaccuracy, the largest number might not be the one you expected to be
largest. But the only way to address that problem is to think carefully
about how you're doing your calculations, not by pretending that numbers are
equal when they're not.

In fact, using approximate comparison might yield nonsensical answers.
Imagine a program that uses approximate comparison, in which there is a
value named delta with the following properties:

1+delta == 1
1-delta == 1
1-delta != 1+delta

Now consider an algorithm that finds the largest element of a sequence, and
whenever it finds two equal elements, it picks the first one. If we apply
that algorithm to

1, 1+delta

then it will yield 1 as its result, because 1 will appear to be equal to
1+delta. If, on the other hand, we apply it to

1-delta, 1, 1+delta

then it will first compare 1-delta with 1 and use 1-delta as its
intermediate result (because 1-delta comes first); then it will compare
1-delta with 1+delta, find they are unequal, and yield 1+delta as its final
result.

In other words, it is possible for the "largest element" to be different
depending on the value of other elements, which is absurd.

This is an example of why I think that approximate comparison is a bad idea
in this context.
Jul 23 '05 #7
ch************@gmail.com wrote:
just one more thing: how small do you think SMALL_NON_ZERO should
be? And does this have any bearing on the portability of the code,
i.e. will it be compiler- or platform-dependent?

Usually on TopCoder competitions they use 1E-9 as the smallest difference.

also, is there a way to avoid these approximations?

Yes, fixed-point arithmetic, for example. But the solution would most
probably be more complex (since there is no such thing in the standard
library, which is all that is available at TopCoder).

are there operators that allow bitwise operations on double? Maybe
those could avoid the need to make approximations.

No, those cannot avoid the need for approximations, because a
floating-point number is itself an approximation.
regards

Somesh

I think the best way for you in this situation is to learn how to
handle approximations properly.

Regards,
Vyacheslav

Jul 23 '05 #8
>
just one more thing: how small do you think SMALL_NON_ZERO should
be? And does this have any bearing on the portability of the code,
i.e. will it be compiler- or platform-dependent?

also, is there a way to avoid these approximations?

are there operators that allow bitwise operations on double? Maybe
those could avoid the need to make approximations.

Look at std::numeric_limits<>::epsilon()
P.Krumins

Jul 23 '05 #9
Let's first define a simple struct.

struct pair12
{
    int num1s;
    int num2s;

    pair12() : num1s(0), num2s(0) {}

    pair12& operator()( char ch )
    {
        switch ( ch )
        {
            case '1': ++num1s; break;
            case '2': ++num2s; break;
            default: break; // do nothing
        }
        return *this;
    }
};

Now let's create a comparison between them.

bool operator< ( const pair12& left, const pair12& right )
{
return ( left.num1s * right.num2s < left.num2s * right.num1s );
}

// below probably not necessary
bool operator== ( const pair12& left, const pair12& right )
{
return ( left.num1s * right.num2s == left.num2s * right.num1s );
}

Now let's create a transformer from const std::string & to pair12

struct pair12FromString
{
pair12 operator() ( const std::string & s )
{
return std::for_each( s.begin(), s.end(), pair12() );
}
};

Now let's do a transform given a std::vector< std::string > (which can
be passed by const reference as we don't modify it).

std::vector<pair12>::size_type find_min_element( const std::vector<
std::string > & v )
{
std::vector< pair12 > vTemp;
vTemp.reserve( v.size() );
std::transform( v.begin(), v.end(),
std::back_inserter( vTemp ), pair12FromString() );

return std::distance( vTemp.begin(),
std::min_element( vTemp.begin(), vTemp.end() ) );
}

Ensure you include all relevant headers, of course.
std::min_element must return the first in the sequence if there are
some that compare equal - by the standard.

Jul 23 '05 #10
actually the proportion should be:

bool operator< ( const pair12& left, const pair12& right )
{
return ( left.num1s * (right.num2s+right.num1s)
< (left.num2s+left.num1s) * right.num1s );
}

when the digit is '1' I increase numdigits and num1s and when
the digit is '2' I only increase numdigits.

although I think mathematically it makes no difference: if
a/(a+b) < c/(c+d), then surely a/b must be less than c/d, since
a(c+d) < c(a+b) gives ac + ad < ca + cb, and eliminating the ac
we get ad < cb. QED. So my comparison function is fine.

Anyway, in spite of rounding errors, if you divide 4.0 by 9.0 you
should get the same result every time - it doesn't round randomly, it
uses a fixed way of rounding. So your formula should have worked.

Jul 23 '05 #11
