468,117 Members | 1,536 Online

# How safe is a set of floats?

I want to generate all the fractions between 1 and limit (with
limit>1) in an orderly fashion, without duplicates.

def all_ratios(limit):
s = set()
hi = 1.0
lo = 1.0
while True:
if hi/lo not in s:
yield (hi,lo)
hi += 1
if hi/lo limit:
lo += 1
hi = lo

I use a set to keep from giving duplicates; but is this safe? In C
they always tell you not to trust floating point equality comparisons,
since they may not work as you expect. My code seems fine for the
limited amount I've tested, but I'm curious: is there a gaurantee
about sets of floats? Or a warning?

Thanks,

Tom

May 4 '07 #1
9 1553
Thomas Nelson <th*@mail.utexas.eduwrote:
I want to generate all the fractions between 1 and limit (with
limit>1) in an orderly fashion, without duplicates.

def all_ratios(limit):
s = set()
hi = 1.0
lo = 1.0
while True:
if hi/lo not in s:
yield (hi,lo)
hi += 1
if hi/lo limit:
lo += 1
hi = lo

I use a set to keep from giving duplicates; but is this safe? In C
they always tell you not to trust floating point equality comparisons,
since they may not work as you expect. My code seems fine for the
limited amount I've tested, but I'm curious: is there a gaurantee
about sets of floats? Or a warning?
sets of floats work exactly like sets of anything else and thus in
particular they DO intrinsically rely on == comparisons, i.e., exact
equality checks (just like dicts whose keys are floats, etc).

In your code, some "fractions" that actually differ from others you're
previously seen will in fact be skipped because they don't differ _by
enough_ -- i.e. they do compare == to within the limited precision of
floating-point computations. But if you do want to be yielding floats,
and never want to yield the (num, denom) tuples for two items that *as
float* compare ==, there's nothing you can do about that issue.

My main suggestion to you actually would be to compute hi/lo ONCE per
iteration rather than 3 times -- I detest repetition in principle and
here it may be costing you a few nanoseconds' speed:-)

[[If you don't truly care about whether the fractions you yield do
compare as == "as floats", you might e.g. use gmpy.mpq rather than
Alex
May 4 '07 #2
On May 4, 9:50 am, a...@mac.com (Alex Martelli) wrote:
Thomas Nelson <t...@mail.utexas.eduwrote:
I want to generate all the fractions between 1 and limit (with
limit>1) in an orderly fashion, without duplicates.
def all_ratios(limit):
s = set()
hi = 1.0
lo = 1.0
while True:
if hi/lo not in s:
yield (hi,lo)
hi += 1
if hi/lo limit:
lo += 1
hi = lo
I use a set to keep from giving duplicates; but is this safe? In C
they always tell you not to trust floating point equality comparisons,
since they may not work as you expect. My code seems fine for the
limited amount I've tested, but I'm curious: is there a gaurantee
about sets of floats? Or a warning?

sets of floats work exactly like sets of anything else and thus in
particular they DO intrinsically rely on == comparisons, i.e., exact
equality checks (just like dicts whose keys are floats, etc).

In your code, some "fractions" that actually differ from others you're
previously seen will in fact be skipped because they don't differ _by
enough_ -- i.e. they do compare == to within the limited precision of
floating-point computations. But if you do want to be yielding floats,
and never want to yield the (num, denom) tuples for two items that *as
float* compare ==, there's nothing you can do about that issue.

My main suggestion to you actually would be to compute hi/lo ONCE per
iteration rather than 3 times -- I detest repetition in principle and
here it may be costing you a few nanoseconds' speed:-)

[[If you don't truly care about whether the fractions you yield do
compare as == "as floats", you might e.g. use gmpy.mpq rather than

Alex- Hide quoted text -

- Show quoted text -
Does set membership test for equality ("==") or identity ("is")? I
just did some simple class tests, and it looks like sets test for
identity. So if I were to create a Rational class in which
Rational(1,2) and Rational(2,4) both evaluate to 0.5, such that
Rational(1,2) == Rational(2,4) evaluates to True, a set of such
Rationals would still hold both instances.

-- Paul
May 4 '07 #3
On May 4, 3:21 pm, Thomas Nelson <t...@mail.utexas.eduwrote:
I want to generate all the fractions between 1 and limit (with
limit>1) in an orderly fashion, without duplicates.

def all_ratios(limit):
s = set()
hi = 1.0
lo = 1.0
while True:
if hi/lo not in s:
yield (hi,lo)
hi += 1
if hi/lo limit:
lo += 1
hi = lo

I use a set to keep from giving duplicates; but is this safe? In C
they always tell you not to trust floating point equality comparisons,
since they may not work as you expect. My code seems fine for the
limited amount I've tested, but I'm curious: is there a gaurantee
about sets of floats? Or a warning?
There won't be either, but you actually don't need to store the
previous fractions. All you need to verify is that the denominator
and numerator are relatively prime (i.e. their gcd is 1). That could
be implemented as:

------------------------------------
from itertools import count

def gcd(x, y):
while x:
x, y = y % x, x
return y

def all_ratios(limit):
for d in count(1):
for n in xrange(d, int(limit*d) + 1):
if gcd(d, n) == 1:
yield n, d
------------------------------------

HTH

--
Arnaud

May 4 '07 #4
Paul McGuire wrote:
Does set membership test for equality ("==") or identity ("is")?
As Alex said, equality:
>>a = 0.0
b = -0.0
a is b
False
>>a == b
True
>>set([a, b])
set([0.0])

Peter

May 4 '07 #5
On May 4, 5:04 pm, Paul McGuire <p...@austin.rr.comwrote:
Does set membership test for equality ("==") or identity ("is")? I
just did some simple class tests, and it looks like sets test for
identity.
Sets are like dictionaries, they test for equality:
>>a=1,2
b=1,2
a is b
False
>>a in set([b])
True

--
Arnaud

May 4 '07 #6
On May 4, 11:50 am, Arnaud Delobelle <arno...@googlemail.comwrote:
On May 4, 5:04 pm, Paul McGuire <p...@austin.rr.comwrote:
Does set membership test for equality ("==") or identity ("is")? I
just did some simple class tests, and it looks like sets test for
identity.

Sets are like dictionaries, they test for equality:
>a=1,2
b=1,2
a is b
False
>a in set([b])

True

--
Arnaud
Just to beat this into the ground, "test for equality" appears to be
implemented as "test for equality of hashes". So if you want to
implement a class for the purposes of set membership, you must
implement a suitable __hash__ method. It is not sufficient to
implement __cmp__ or __eq__, which I assumed "test for equality" would
make use of. Not having a __hash__ method in my original class caused
my initial confusion.

So would you suggest that any class implemented in a general-purpose
class library should implement __hash__, since one cannot anticipate
when a user might want to insert class instances into a set? (It
certainly is not on my current checklist of methods to add to well-
behaved classes.)
-- Paul

May 4 '07 #7
Paul McGuire wrote:
Just to beat this into the ground, "test for equality" appears to be
implemented as "test for equality of hashes". So if you want to
implement a class for the purposes of set membership, you must
implement a suitable __hash__ method. It is not sufficient to
implement __cmp__ or __eq__, which I assumed "test for equality" would
make use of. Not having a __hash__ method in my original class caused
my initial confusion.
As with dictionaries, only items with the same hash are considered for
equality testing.
So would you suggest that any class implemented in a general-purpose
class library should implement __hash__, since one cannot anticipate
when a user might want to insert class instances into a set? (It
certainly is not on my current checklist of methods to add to well-
behaved classes.)
A meaningful implementation would also have to make sure that the attributes
used to calculate hash and equality don't change over time.

No, I wouldn't bother because YAGNI.

Peter
May 4 '07 #8
On May 4, 10:15 am, Paul McGuire <p...@austin.rr.comwrote:
Just to beat this into the ground, "test for equality" appears to be
implemented as "test for equality of hashes". So if you want to
implement a class for the purposes of set membership, you must
implement a suitable __hash__ method. It is not sufficient to
implement __cmp__ or __eq__, which I assumed "test for equality" would
make use of. Not having a __hash__ method in my original class caused
my initial confusion.
overriding __hash__ (even to raise NotImplementedError) is always wise
if you have override __eq__. And of course __hash__ is necessary for
using hashtable-based structures (how else could it determine whether
objects are equal? compare against every existing element?)

Finally, two objects which return the same __hash__ but return False
for __eq__ are, of course, unequal. sets/dicts do not simply "test
for equality of hashes"
So would you suggest that any class implemented in a general-purpose
class library should implement __hash__, since one cannot anticipate
when a user might want to insert class instances into a set? (It
certainly is not on my current checklist of methods to add to well-
behaved classes.)
a class should be only inserted into a set if it is immutable, and
thus designed to such. User's might also execute 'del x.attr', so
perhaps you should start each method with a series of hasattr()
checks...

-Mike

May 8 '07 #9
On 4 May 2007 07:21:49 -0700, Thomas Nelson <th*@mail.utexas.eduwrote:
I want to generate all the fractions between 1 and limit (with
limit>1) in an orderly fashion, without duplicates.
Might I suggest the Stern-Brocot tree
(http://en.wikipedia.org/wiki/Stern-Brocot_tree)
It will eliminate the need for sets as the algorithm gurantees: "Every
positive rational number can be found in this tree exactly once and in
lowest terms". The order will be different than your algorithm,
though.

#An overly simplified fraction class for this example:
class Fraction:
def __init__ (self, num, den):
self.num = num
self.den = den
def __repr__ (self):
return '%(num)d/%(den)d' % self.__dict__

def all_ratios(limit):
seq = [Fraction(1,1), Fraction(limit,1)]
while True:
newseq = seq[:1]
pairs = [seq[x:x+2] for x in range(len(seq)-1)]
for pair in pairs:
#find the mediant value between each pair in the series
newval = Fraction(pair[0].num+pair[1].num, pair[0].den+pair[1].den)
yield newval
newseq.append(newval)
newseq.append(pair[1])
seq = newseq
-Dave
May 9 '07 #10

### This discussion thread is closed

Replies have been disabled for this discussion.