Bytes | Developer Community
Annoying behaviour of the != operator

Can anybody please give me a decent justification for this:

class A(object):
    def __init__(self, a):
        self.a = a

    def __eq__(self, other):
        return self.a == other.a

s = A(3)
t = A(3)
print s == t    # prints: True
print s != t    # prints: True

I just spent a long, long time tracking down a bug in a program that
results from this behaviour.

Surely the != operator should, if no __ne__ method is present for
either object, check to see if an __eq__ method is defined, and if so,
return its negation?

Actually, that brings me to a wider question - why does __ne__ exist at
all? Surely it's completely inconsistent and unnecessary to have
separate equals and not-equals methods on an object? a != b should just
be a short way of writing not (a == b). The fact that the two can give a
different answer seems to me to be utterly unintuitive and a massive
pitfall for beginners (such as myself).

Jul 19 '05 #1
Jordan Rastrick wrote:
Surely the != operator should, if no __ne__ method is present for
either object, check to see if an __eq__ method is defined, and if so,
return its negation?

Actually, that brings me to a wider question - why does __ne__ exist at
all? Surely it's completely inconsistent and unnecessary to have
separate equals and not-equals methods on an object? a != b should just
be a short way of writing not (a == b). The fact that the two can give a
different answer seems to me to be utterly unintuitive and a massive
pitfall for beginners (such as myself).


I agree that it's confusing. I didn't even understand this behavior
myself until I went and looked it up:

http://docs.python.org/ref/customization.html

The rationale for this behavior is in PEP 207 -- Rich Comparisons:

http://python.fyxm.net/peps/pep-0207.html

Personally, I implement the __cmp__ method when I want my objects to be
comparable, so I've never run into this problem.

Dave
Jul 19 '05 #2
No, why should Python assume that if you use != without supplying a
__ne__ that this is what you want? Without direction it will compare
the two objects by identity, which is the default behavior.

So, s != t is True because the ids of the two objects are different.
The same applies to, for example s > t and s < t. Do you want Python to
be smart and deduce that you want to compare one variable within the
object if you don't create __gt__ and __lt__? I do not want Python to
do that.

Regards,
M

Jul 19 '05 #3
Jordan Rastrick wrote:
I just spent a long, long time tracking down a bug in a program that
results from this behaviour.

Surely the != operator should, if no __ne__ method is present for
either object, check to see if an __eq__ method is defined, and if so,
return its negation?

Actually, that brings me to a wider question - why does __ne__ exist at
all? Surely it's completely inconsistent and unnecessary to have
separate equals and not-equals methods on an object? a != b should just
be a short way of writing not (a == b). The fact that the two can give a
different answer seems to me to be utterly unintuitive and a massive
pitfall for beginners (such as myself).


surely reading the documentation would be a great way to avoid
pitfalls?

http://docs.python.org/ref/customization.html

__lt__, __le__ (etc)

New in version 2.1. These are the so-called "rich comparison" methods,
and are called for comparison operators in preference to __cmp__() below.

/.../

There are no implied relationships among the comparison operators. The
truth of x==y does not imply that x!=y is false. Accordingly, when defining
__eq__, one should also define __ne__ so that the operators will behave
as expected.

/.../

__cmp__

Called by comparison operations if rich comparison (see above) is not
defined. Should return a negative integer if self < other, zero if self
== other, a positive integer if self > other. /.../

for a number of situations where __ne__ cannot be derived from __eq__,
see:

http://www.python.org/peps/pep-0207.html

</F>

Jul 19 '05 #4
Mahesh wrote:
No, why should Python assume that if you use != without supplying a
__ne__ that this is what you want?
Because every single time I've used __ne__, that *is* what I want.
Without direction it will compare
the two objects which is the default behavior.


It's also the default behavior that x == y and x != y are mutually
exclusive.

Jul 19 '05 #5
Well, I don't really want the objects to be comparable. In fact, to
quote that PEP you linked:

An additional motivation is that frequently, types don't have a
natural ordering, but still need to be compared for equality.
Currently such a type *must* implement comparison and thus define
an arbitrary ordering, just so that equality can be tested.

I don't want to order the objects. I just want to be able to say if one
is equal to the other.

Here's the justification given:

The == and != operators are not assumed to be each other's
complement (e.g. IEEE 754 floating point numbers do not satisfy
this). It is up to the type to implement this if desired.
Similar for < and >=, or > and <=; there are lots of examples
where these assumptions aren't true (e.g. tabnanny).

Well, never, ever use equality or inequality operations with floating
point numbers anyway, in any language, as they are notoriously
unreliable due to the inherent inaccuracy of floating point. That's
another pitfall, I'll grant, but it's a pretty well-known one to anyone
with programming experience. So I don't think that's a major use case.

Are there any other reasonable examples people can give where it makes
sense for != and == not to be each other's complement?

Besides, the argument given doesn't justify the approach taken. Even if
you want to allow objects, in rare cases, to override __ne__ in a way
that's inconsistent with __eq__, fine, do so. But that's no reason for
Python not to fall back on __eq__ in the cases where __ne__ is not
defined, rather than reverting to object identity comparison (at least
I assume that's what it's doing).

To draw a parallel:

In Java, the compareTo method in a class is allowed to be inconsistent
with the equals method, but those who write such code are advised to
heavily advertise this fact in their documentation and source, so as
not to catch unwary users off guard.

In Python, the *default* behaviour is for the equals and not-equals
operations to disagree if the __eq__ method happens to be overridden
(which it definitely should be in a great number of cases).

Surely this isn't the right way to do things.

How many classes out there have the following redundant,
cluttering piece of code?

def __ne__(self, other):
    return not self.__eq__(other)

Worse still, how many classes don't have this code, but should, and are
therefore harbouring highly confusing potential bugs?
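For completeness, a runnable sketch of that boilerplate applied to the thread's original example. (As it happens, Python 3 later made exactly this fallback the default: `object.__ne__` returns the negation of `__eq__` unless `__eq__` returns `NotImplemented`.)

```python
class A(object):
    def __init__(self, a):
        self.a = a

    def __eq__(self, other):
        return self.a == other.a

    def __ne__(self, other):
        # Explicitly mirror __eq__, since Python 2 will not do it for us.
        return not self.__eq__(other)

s, t = A(3), A(3)
print(s == t)  # True
print(s != t)  # False, once __ne__ is spelled out
```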

Unless someone can explain some sort of problem that arises from having
!= take advantage of a __eq__ method where present, I'd suggest that it
should do so in Python 2.5.

I'd be surprised if such a change broke so much as a single line of
existing Python code.

Jul 19 '05 #6
Fredrik Lundh wrote:
for a number of situations where __ne__ cannot be derived from __eq__,
see:

http://www.python.org/peps/pep-0207.html


That "number" being "one"?

I can see only one comment that seems to describe that situation, where
it refers to "IEEE 754 floating point numbers do not satisfy [== being
the complement of !=]".

(Though that may be justification enough for the feature...)

-Peter
Jul 19 '05 #7
Just because a behaviour is documented doesn't mean it's not
counterintuitive, potentially confusing, and unnecessary.

I have spent a fair amount of time reading the Python docs. I have not
yet memorised them. I may have read this particular section of the
reference manual, or I may have not, I can't remember. This is the
first time I've had cause to override the __eq__ operator, and so the
first time I've encountered the behaviour in a concrete setting, which
I feel is a far more effective way to learn than to read all the
documentation cover to cover.

Where are the 'number of situations' where __ne__ cannot be derived
from __eq__? Is it just the floating point one? I must admit, I've
missed any others.

And no, I don't think I really want Python to assume anything much -
explicit is better than implicit after all. If I don't override __eq__,
or __gt__, or __cmp__, i certainly don't expect Python to infer the
behaviour I want.

But I explicitly provided a method to test equality. And look at the
plain English meaning of the term "not equals": I think it's pretty
reasonable.

As far as __gt__, __lt__, __ge__, and the rest go, the answer is
simple. Anyone defining things the 'obvious' way would override
__cmp__, and sidestep the whole issue. Those who want the less obvious
behaviour can provide it explicitly.

But as PEP 207 says, you shouldn't have to override __cmp__ if you don't
want to provide an arbitrary ordering. And it's the people doing the
non-obvious thing, having __ne__ and __eq__ inconsistent, who should
have to code it and state it explicitly.

If I want an orderable object, I'll just define __cmp__. If I want an
equal-comparable object, I should just be able to define __eq__. If I
want anything fancy, I can go ahead and explicitly write methods for
__gt__, __ne__, etc.

Jul 19 '05 #8
Jordan Rastrick wrote:
Unless someone can explain some sort of problem that arises from having
!= take advantage of a __eq__ method where present, I'd suggest that it
should do so in Python 2.5.
If you're serious about this proposal, please formalize it in a PEP.

Things to specify:

How extensive are these changes? Is it just !=, or if __eq__ is not
defined, will it be equal to "not __ne__()"? __lt__/__gt__?
__le__/__ge__? Do you simulate __le__ with '__lt__ or __eq__'? How about
__lt__ with '__le__ and __ne__'?

I'd be surprised if such a change broke so much as a single line of
existing Python code.


Well, since it currently raises an error when __ne__ is not defined, I'd
say you're right on that account. The only corner case is when people
would rely on 'a != b' to be implicitly translated to 'b != a'. If 'a !=
b' gets translated to 'not a == b', this may change semantics if the two
are not equivalent.
Jul 19 '05 #9
"Jordan Rastrick" <jr*******@student.usyd.edu.au> wrote in message
news:11**********************@g14g2000cwa.googlegroups.com...
Well, I don't really want the objects to be comparable. In fact, to
quote that PEP you linked:

An additional motivation is that frequently, types don't have a
natural ordering, but still need to be compared for equality.
Currently such a type *must* implement comparison and thus define
an arbitrary ordering, just so that equality can be tested.

I don't want to order the objects. I just want to be able to say if one
is equal to the other.

Here's the justification given:

The == and != operators are not assumed to be each other's
complement (e.g. IEEE 754 floating point numbers do not satisfy
this). It is up to the type to implement this if desired.
Similar for < and >=, or > and <=; there are lots of examples
where these assumptions aren't true (e.g. tabnanny).

Well, never, ever use equality or inequality operations with floating
point numbers anyway, in any language, as they are notoriously
unreliable due to the inherent inaccuracy of floating point. That's
another pitfall, I'll grant, but it's a pretty well-known one to anyone
with programming experience. So I don't think that's a major use case.
Two floating point numbers are equal if they are within epsilon of each
other. Epsilon, of course, depends on the application. I do floating
point compares quite nicely in PyFit, with intuitive results. It simply
requires you to know what you're doing rather than blindly following
the herd over the cliff.
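A minimal sketch of the epsilon-comparison idea (the helper name and tolerance here are an illustration, not PyFit's actual API):

```python
def float_eq(a, b, epsilon=1e-9):
    """Treat two floats as equal if they differ by less than epsilon."""
    return abs(a - b) < epsilon

x = 0.1 + 0.2
print(x == 0.3)          # False: exact comparison trips over rounding error
print(float_eq(x, 0.3))  # True: epsilon comparison gives the intuitive answer
```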

That is, however, not the reason for the statement. The reason for the
statement is that the feature was motivated by the numerics package,
which has a number of special requirements, floating point exceptional
values only being one. Read the PEP.

John Roth


Jul 19 '05 #10
I'd suggest the only necessary change is: if objects a, b both define
__eq__ and not __ne__, then a != b should return not (a == b).

If a class defines __ne__ but not __eq__, well, that seems pretty
perverse to me. I don't especially care one way or another how that's
resolved, to be honest.

The ordering comparisons (__lt__, __ge__ etc) are fine as they are I'd
say, since they only come up in the rare cases where __cmp__ isn't
sufficient.

I'll wait for a bit more discussion on here before starting a PEP - as
I've said I'm only a beginner, and a more experienced Pythonista may yet
give a perfectly good argument in favour of the current behaviour.

Jul 19 '05 #11
Jordan,

On 8 Jun 2005 11:44:43 -0700, Jordan Rastrick
<jr*******@student.usyd.edu.au> wrote:
But I explicitly provided a method to test equality. And look at the
plain English meaning of the term "not equals": I think it's pretty
reasonable.


Indeed. Furthermore, it seems quite silly that these would be different:

a != b
not (a == b)

To be fair, though, other languages have peculiarities with equality.
Consider this Java code:

String s1 = "a";
String s2 = "a";
String s3 = new String("a");
String s4 = new String("a");

s1 == s2; // true
s1.equals(s2); // true

s1 == s3; // false
s1.equals(s3); // true

s3 == s4; // false
s3.equals(s4); // true

Doesn't make it any less silly, though.

--
Matt Warden
Miami University
Oxford, OH, USA
http://mattwarden.com
This email proudly and graciously contributes to entropy.
Jul 19 '05 #12
Jordan Rastrick wrote:
Are there any other reasonable examples people can give where it makes
sense for != and == not to be each other's complement?


__eq__ and __ne__ implement *rich* comparisons. They don't have to
return only True or False.

In [1]:import Numeric

In [2]:a = Numeric.array([1, 2, 3, 4, 5])

In [3]:b = Numeric.array([1, 0, 4, 0, 5])

In [4]:a == b
Out[4]:array([1, 0, 0, 0, 1],'b')

In [5]:a != b
Out[5]:array([0, 1, 1, 1, 0],'b')

In [6]:(a != b) == (not (a == b))
Out[6]:array([1, 0, 0, 0, 1],'b')

In [7]:not (a == b)
Out[7]:False

In [8]:not (a != b)
Out[8]:False

--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

Jul 19 '05 #13
"Jordan Rastrick" wrote:
I'd suggest the only necessary change is: if objects a, b both define
__eq__ and not __ne__, then a != b should return not (a == b).

If a class defines __ne__ but not __eq__, well, that seems pretty
perverse to me. I don't especially care one way or another how that's
resolved, to be honest.

The ordering comparisons (__lt__, __ge__ etc) are fine as they are I'd
say, since they only come up in the rare cases where __cmp__ isn't
sufficient.

I'll wait for a bit more discussion on here before starting a PEP - as
I've said I'm only a beginner, and a more experienced Pythonista may yet
give a perfectly good argument in favour of the current behaviour.


I've been bitten by this particular wart more than once, so I wrote a
metaclass for automating the addition of the missing rich comparisons
with the expected semantics: == and != are complementary, and given
either of them and one of <, >, <=, >=, the other three are defined.
You can check it out at
http://rafb.net/paste/results/ymLfVo81.html. Corrections and comments
are welcome.
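The paste link above is long dead; here is a minimal reconstruction of the ==/!= half of the idea (class names and scope are my own illustration, not George's actual code):

```python
class AutoComparisons(type):
    """Metaclass that fills in __ne__ as the complement of __eq__
    when a class defines __eq__ but not __ne__."""
    def __init__(cls, name, bases, namespace):
        super(AutoComparisons, cls).__init__(name, bases, namespace)
        if "__eq__" in namespace and "__ne__" not in namespace:
            cls.__ne__ = lambda self, other: not self.__eq__(other)

# Python 3 spelling; under Python 2 one would set __metaclass__ instead.
class Point(object, metaclass=AutoComparisons):
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return self.x == other.x

print(Point(3) == Point(3))  # True
print(Point(3) != Point(3))  # False: the metaclass supplied __ne__
```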

Regards,
George

Jul 19 '05 #14
Well, I'll admit I haven't ever used the Numeric module, but since
PEP207 was submitted and accepted, with Numeric as apparently one of
its main motivations, I'm going to assume that the pros and cons for
having == and ilk return things other than True or False have already
been discussed at length and that argument settled. (I suppose there's a
reason why Numeric arrays weren't just given the same behaviour as
builtin lists, and then simple non-special named methods to do the
'rich' comparisons.)

But again, it seems like a pretty rare and marginal use case, compared
to simply wanting to see if some object a is equal to (in a non object
identity sense) object b.

The current situation seems to be essentially use __cmp__ for normal
cases, and use the rich operations, __eq__, __gt__, __ne__, and rest,
only in the rare cases. Also, if you define one of them, make sure you
define all of them.

There's no room for the case of objects where the == and != operators
should return a simple True or False, and are always each others
complement, but <, >= and the rest give an error. I haven't written
enough Python to know for sure, but based on my experience in other
languages I'd guess this case is vastly more common than all others put
together.

I'd be prepared to bet that anyone defining just __eq__ on a class, but
none of __cmp__, __ne__, __gt__ etc, wants a != b to return the
negation of a.__eq__(b). It can't be any worse than the current case of
having == work as the __eq__ method describes but != work by
object identity.

So far, I stand by my suggested change.

Jul 19 '05 #15


Jordan Rastrick wrote:
Just because a behaviour is documented doesn't mean it's not
counterintuitive, potentially confusing, and unnecessary.

I have spent a fair amount of time reading the Python docs. I have not
yet memorised them. I may have read this particular section of the
reference manual, or I may have not, I can't remember. This is the
first time I've had cause to override the __eq__ operator, and so the
first time I've encountered the behaviour in a concrete setting, which
I feel is a far more effective way to learn than to read all the
documentation cover to cover.

Where are the 'number of situations' where __ne__ cannot be derived
from __eq__? Is it just the floating point one? I must admit, I've
missed any others.


For certain classes you can define behaviour like this:

>>> x = Expr()
>>> let(x+2 == 0)
x+2 == 0
>>> solve(let(x+2 == 0))
x == -2
>>> x.val
-2

My Expr class implements __eq__ and __ne__ in the following way:

def __eq__(self, other):
    self._wrapPredicate("==", other)
    return self._iseq(other)

def __ne__(self, other):
    self._wrapPredicate("!=", other)
    return not self._iseq(other)
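For illustration, a hypothetical stripped-down version of such a class (the `Predicate` helper and its names are an invention for this sketch, not Kay's actual implementation) showing why a derived `not __eq__` would be wrong here:

```python
class Predicate(object):
    """A symbolic comparison node rather than a boolean answer."""
    def __init__(self, lhs, op, rhs):
        self.lhs, self.op, self.rhs = lhs, op, rhs

    def __repr__(self):
        return "%s %s %r" % (self.lhs.name, self.op, self.rhs)

class Expr(object):
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return Predicate(self, "==", other)   # builds an expression

    def __ne__(self, other):
        return Predicate(self, "!=", other)   # must be defined separately

    def __hash__(self):
        return hash(self.name)

x = Expr("x")
print(x == 0)        # x == 0   (a Predicate object, not a bool)
print(not (x == 0))  # False -- negation collapses the expression to a bool
```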

It would be hard to use operator overloading to create expressions if
there are a lot of implicit assumptions. On the other hand, I agree
with you about the default behaviour of __ne__ and that it should be
related locally to the class that overloads __eq__ and not to some
global interpreter-defined behaviour.

Kay

Jul 19 '05 #16
Matt Warden wrote:
Jordan,

On 8 Jun 2005 11:44:43 -0700, Jordan Rastrick
<jr*******@student.usyd.edu.au> wrote:
But I explicitly provided a method to test equality. And look at the
plain English meaning of the term "not equals": I think it's pretty
reasonable.

Indeed. Furthermore, it seems quite silly that these would be different:

a != b
not (a == b)


It's only "silly" if one sticks to strict Boolean semantics or
implicitly assumes the law of the excluded middle
(http://en.wikipedia.org/wiki/Excluded_middle), the principle of
bivalence (http://en.wikipedia.org/wiki/Principle_of_bivalence), or the
law of noncontradiction
(http://en.wikipedia.org/wiki/Law_of_non-contradiction). Despite "law"
status, it is possible (and useful) to imagine situations where they
don't hold. (À la non-Euclidean geometry.)

The main problem is that Python is trying to stick at least three
different concepts onto the same set of operators: equivalence (are
these two objects the same?), ordering (in a sorted list, which comes
first?), and mathematical "size". This gives the wacky world where
"[(1,2), (3,4)].sort()" works, whereas "[1+2j, 3+4j].sort()" doesn't.

Luckily another related concept, identity, has already been separated
out (the 'is' operator). It would be nice (but I'm not holding my breath)
if the complete issue gets resolved with Python 3000.
Jul 19 '05 #17
Peter Hansen wrote:
I can see only one comment that seems to describe that situation, where it refers to "IEEE 754 floating point numbers do not satisfy [==
being the complement of !=]".
(Though that may be justification enough for the feature...)

To my naive eye, that possibility seems like justification for the
language to not -enforce- that (not (a == b)) == (a != b), but for the
vast majority of cases this is true. Perhaps the language should offer
the sensible default of (!=) == (not ==) if one of them but not the
other is overridden, but still allow overriding of both.

This would technically break backwards compatibility, because it changes
default behavior, but I can't think of any good reason (from a python
newbie perspective) for the current counterintuitive behavior to be the
default. Possibly punt this to Python 3.0?
Jul 19 '05 #18
I'm a Maths and Philosophy undergraduate first and foremost, with
Computer Science as a tacked-on third; I've studied a fair bit of logic
both informally and formally, and am familiar with things such as the
non-necessity of the law of the excluded middle in an arbitrary
propositional calculus framework.

I can now also see the usefulness of overriding != and == to do things
other than simple equality comparison. Kay's Expr class seems
like a particularly elegant and beautiful example! (and seems to me a
much better justification for rich comparisons than the rather mundane
typing-reduction case of Numeric arrays)

So I'm not arguing Python should demand != and not (a == b) return the
same thing all the time (although I did question this in my original
post). My argument is simply one of pragmatism - cases where this is
not the case are the highly unusual ones, and so they should be the
ones forced to write separate __eq__ and __ne__ methods. In *every*
example that's been raised so far - floats, Numeric.array, Expr,
(hypothetically) some unusual symbolic logic program without the law of
excluded middle - these methods are both by necessity independently
defined, and my suggestion would not change the status quo at all.

Mahesh raised the argument some posts back that Python should not 'just
guess' what you want. But the problem is, it *already does* - it
guesses you want object identity comparison if you haven't written
__ne__. But if __ne__ is not provided, then the negation of

a==b

is *surely* a better guess for a != b than the negation of

a is b

As always, explicit is better than implicit. But if we're going to be
implicit, let's be implicit in the way that makes the most sense. I
can't stand Perl's autoconversion from strings to integers. But how
much worse would it be if strings were auto-converted to, for example,
the sum of the ordinal value of their ascii characters?

OK, that's about as compelling as I can make the argument. If Perl
bashing won't sway Python fans over, I don't know what will :)

P.S. Excuse the excessive presence of typos throughout all my posts;
it's been a long night.

Jul 19 '05 #19
Jordan Rastrick wrote:
Mahesh raised the argument some posts back that Python should not 'just
guess' what you want. But the problem is, it *already does* - it
guesses you want object identity comparison if you haven't written
__ne__. But if __ne__ is not provided, then the negation of

a==b

is *surely* a better guess for a != b than the negation of

a is b


The problem arises that, in the presence of rich comparisons, (a == b)
is not always a boolean value, while (a is b) is always a boolean value.
I *would* prefer that (a != b) raise an error when __ne__ isn't
provided, but such is life until 3.0.

--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

Jul 19 '05 #20
I understand that what makes perfect sense to me might not make perfect
sense to you but it seems a sane default. When you compare two objects,
what is that comparison based on? In the explicit is better than
implicit world, Python can only assume that you *really* do want to
compare objects unless you tell it otherwise. The only way it knows how
to compare two objects is to compare object identities.

I am against making exceptions for corner cases and I do think making
__ne__ implicitly assume not __eq__ is a corner case.

Maybe you think that it takes this explicit is better than implicit
philosophy too far and acts dumb but I think it is acting consistently.

Cheers,
Mahesh

Jul 19 '05 #21
"Jordan Rastrick" <jr*******@student.usyd.edu.au> wrote in message
news:11**********************@z14g2000cwz.googlegroups.com...
Well, I'll admit I haven't ever used the Numeric module, but since
PEP207 was submitted and accepted, with Numeric as apparently one of
its main motivations, I'm going to assume that the pros and cons for
having == and ilk return things other than True or False have already
been discussed at length and that argument settled. (I suppose there's a
reason why Numeric arrays weren't just given the same behaviour as
builtin lists, and then simple non-special named methods to do the
'rich' comparisons.)
They were - read the PEP again. That's the behavior they wanted
to get away from.
But again, it seems like a pretty rare and marginal use case, compared
to simply wanting to see if some object a is equal to (in a non object
identity sense) object b.

The current situation seems to be essentially use __cmp__ for normal
cases, and use the rich operations, __eq__, __gt__, __ne__, and rest,
only in the rare cases. Also, if you define one of them, make sure you
define all of them.

There's no room for the case of objects where the == and != operators
should return a simple True or False, and are always each others
complement, but <, >= and the rest give an error. I haven't written
enough Python to know for sure, but based on my experience in other
languages I'd guess this case is vastly more common than all others put
together.

I'd be prepared to bet that anyone defining just __eq__ on a class, but
none of __cmp__, __ne__, __gt__ etc, wants a != b to return the
negation of a.__eq__(b). It can't be any worse than the current case of
having == work as the __eq__ method describes but != work by
object identity.
To quote Calvin Coolidge: You lose.

The primary open source package I work on, PyFit, always wants to
do an equal comparison, and never needs to do a not equal. It also has
no use for ordering comparisons. I do not equals as a matter of symmetry
in case someone else wants them, but I usually have no need of them.
Strict XP says I shouldn't do them without a customer request.
So far, I stand by my suggested change.
I think most of your justification is simple chicken squawking, but write
the PEP anyway. I'd suggest tightening it to say that if __eq__ is
defined, and if neither __ne__ nor __cmp__ is defined, then use
__eq__ and return the negation if and only if the result of __eq__
is a boolean. Otherwise raise the current exception.

I wouldn't suggest the reverse, though. Defining __ne__ and not
defining __eq__ is simply perverse.

John Roth


Jul 19 '05 #22
Mahesh wrote:
I understand that what makes perfect sense to me might not make perfect
sense to you but it seems a sane default. When you compare two objects,
what is that comparison based on? In the explicit is better than
implicit world, Python can only assume that you *really* do want to
compare objects unless you tell it otherwise. The only way it knows how
to compare two objects is to compare object identities.


This isn't the issue here. I agree that object identity comparison is
a good default equality test. The issue is whether this default should
be thought of as

# your approach (and the current implementation)
def __eq__(self, other):
    return self is other

def __ne__(self, other):
    return self is not other

or

# my approach
def __eq__(self, other):
    return self is other

def __ne__(self, other):
    return not (self == other)

My approach simplifies the implementation (i.e., requires one fewer
method to be overridden) of classes for which (x != y) == not (x == y).
This is a very common case.

Your approach simplifies the implementation of classes for which
equality tests are based on data but inequality tests are based on
identity (or vice-versa). I can't think of a single situation in which
this is useful.

Jul 19 '05 #23
Christopher Subich wrote:
Perhaps the language should offer
the sensible default of (!=) == (not ==) if one of them but not the
other is overridden, but still allow overriding of both.
I believe that's exactly what Jordan is promoting and, having been
bitten in exactly the same way I would support the idea. On the other
hand, I was bitten only _once_ and I suspect Jordan will never be bitten
by it again either. It's pretty hard to forget this wart once you
discover it, but I think the real reason to want to have it excised is
that a large number of people will have to learn this the hard way,
documentation (thankfully) not being shoved down one's throat as one
starts intrepidly down the road of overriding __eq__ for the first time.
This would technically break backwards compatibility, because it changes
default behavior, but I can't think of any good reason (from a python
newbie perspective) for the current counterintuitive behavior to be the
default. Possibly punt this to Python 3.0?


I'd support an effort to fix it in 2.5 actually. I suspect nobody will
pipe up with code that would actually be broken by it, though some code
(as John Roth points out) doesn't *need* to have the automatic __ne__
even if it wouldn't break because of it.

-Peter
Jul 19 '05 #24
Robert Kern wrote:
The problem arises that, in the presence of rich comparisons, (a == b)
is not always a boolean value, while (a is b) is always a boolean value.


But that still doesn't mean that in a case where a == b (via __eq__)
returns a non-boolean, __ne__ would not be defined as well. In other
words, there's _nothing_ preventing this "fix" from being made to
provide saner behaviour in the most common case (which happens to pose
the greatest risk of inadvertent mistakes for those who aren't aware of
the requirement to define both) while still allowing the cases that need
unusual behaviour to get it by (as they already surely do!) defining
both __ne__ and __eq__.

-Peter
Jul 19 '05 #25
On Wed, 08 Jun 2005 11:01:27 -0700, Mahesh wrote:
No, why should Python assume that if you use != without supplying a
__ne__ that this is what you want? Without direction it will compare
the two objects by identity, which is the default behavior.
Why should Python assume that != means "not is" instead of "not equal"?

That seems like an especially perverse choice given that the operator is
actually called "not equal".
So, s != t is True because the ids of the two objects are different.
The same applies to, for example s > t and s < t. Do you want Python to
be smart and deduce that you want to compare one variable within the
object if you don't create __gt__ and __lt__? I do not want Python to
do that.


That is an incorrect analogy. The original poster doesn't want Python to
guess which attribute to do comparisons by. He wants "!=" to be
defined as "not equal" if not explicitly overridden with a __ne__ method.

If there are no comparison methods defined, then and only then does it
make sense for == and != to implicitly test object identity.

I'm all for the ability to override the default behaviour. But surely
sensible and intuitive defaults are important?
--
Steven
Jul 19 '05 #26
Jordan Rastrick wrote:
But I explicitly provided a method to test equality.


Actually, no, you didn't. You provided a method to define
the meaning of the operator spelled '==' when applied to your
object. That's the level of abstraction at which Python's
__xxx__ methods work. They don't make any semantic assumptions.

It's arguable that there should perhaps be some default
assumptions made, but the Python developers seem to have
done the Simplest Thing That Could Possibly Work, which
isn't entirely unreasonable.

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg
Jul 19 '05 #27
Jordan Rastrick wrote:
Where are the 'number of situations' where __ne__ cannot be derived
from __eq__? Is it just the floating point one? I must admit, I've
missed any others.


The floating point one is just an example, it's not meant
to be the entire justification.

Some others:

* Numeric arrays, where comparisons return an array of
booleans resulting from applying the comparison to each
element.

* Computer algebra systems and such like, which return a
parse tree as a result of evaluating an expression.
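A toy sketch of the first case (the class name is illustrative, not from any particular library): when __eq__ returns an elementwise result rather than a single bool, __ne__ cannot be derived as its simple negation.

```python
class BoolArray:
    """Toy array whose comparisons are elementwise, as in Numeric/NumPy."""
    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        return [a == b for a, b in zip(self.values, other.values)]

    def __ne__(self, other):
        # The elementwise complement -- NOT the same as `not (self == other)`,
        # which would try to collapse a whole list into one bool.
        return [a != b for a, b in zip(self.values, other.values)]

x = BoolArray([1, 2, 3])
y = BoolArray([1, 0, 3])
print(x == y)  # [True, False, True]
print(x != y)  # [False, True, False]
```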

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg
Jul 19 '05 #28
Rocco Moretti wrote:
The main problem is that Python is trying to stick at least three
different concepts onto the same set of operators: equivalence (are
these two objects the same?), ordering (in a sorted list, which comes
first?), and mathematical "size".
A possible compromise would be to add a new special method,
such as __equal__, for use by == and != when there is no
__eq__ or __ne__. Then there would be three clearly separated
levels of comparison: (1) __cmp__ for ordering, (2) __equal__
for equivalence, (3) __eq__ etc. for unrestricted semantics.
This gives the wacky world where
"[(1,2), (3,4)].sort()" works, whereas "[1+2j, 3+4j].sort()" doesn't.


To solve that, I would suggest a fourth category of "arbitrary
ordering", but that's probably Py3k material.

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg
Jul 19 '05 #29
On 2005-06-08, Mahesh <su****@gmail.com> wrote:
No, why should Python assume that if you use != without supplying a
__ne__ that this is what you want? Without direction it will compare
the two objects which is the default behavior.

So, s != t is True because the ids of the two objects are different.
The same applies to, for example s > t and s < t. Do you want Python to
be smart and deduce that you want to compare one variable within the
object if you don't create __gt__ and __lt__? I do not want Python to
do that.


Python is already smart. It deduces what you want with the += operator
even if you haven't defined an __iadd__ method. If python can be smart
with that, I don't see python being smart with !=.
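The += fallback Antoon mentions can be checked directly (a minimal sketch; Acc is an invented name): when no __iadd__ is defined, a += b is evaluated as a = a + b via __add__.

```python
class Acc:
    """Defines __add__ but no __iadd__."""
    def __init__(self, v):
        self.v = v

    def __add__(self, other):
        return Acc(self.v + other)

a = Acc(1)
a += 2  # no __iadd__, so Python falls back to __add__ and rebinds the name
print(a.v)  # 3
```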

--
Antoon Pardon
Jul 19 '05 #30
On Thu, 09 Jun 2005 15:50:42 +1200,
Greg Ewing <gr**@cosc.canterbury.ac.nz> wrote:
Rocco Moretti wrote:
The main problem is that Python is trying to stick at least three
different concepts onto the same set of operators: equivalence (are
these two objects the same?), ordering (in a sorted list, which comes
first?), and mathematical "size".
A possible compromise would be to add a new special method,
such as __equal__, for use by == and != when there is no
__eq__ or __ne__. Then there would be three clearly separated
levels of comparison: (1) __cmp__ for ordering, (2) __equal__
for equivalence, (3) __eq__ etc. for unrestricted semantics.
This gives the wacky world where
"[(1,2), (3,4)].sort()" works, whereas "[1+2j, 3+4j].sort()" doesn't.


Python inherits that wackiness directly from the (often wacky) world of
Mathematics.

IMO, the true wackiness is that

[ AssertionError, (vars, unicode), __name__, apply ].sort( )

"works," too. Python refusing to sort my list of complex numbers is a
Good Thing.
To solve that, I would suggest a fourth category of "arbitrary
ordering", but that's probably Py3k material.


Four separate classes of __comparison__ methods in a language that
doesn't (and can't and shouldn't) preclude or warn about rules regarding
which methods "conflict" with which other methods? I do not claim to be
an expert, but that doesn't seem very Pythonic to me.

AIUI, __cmp__ exists for backwards compatibility, and __eq__ and friends
are flexible enough to cover any possible comparison scheme.

Why make the rules, the documentation, and the implementation even more
"interesting" than they already are?

Regards,
Dan

--
Dan Sommers
<http://www.tombstonezero.net/dan/>
Jul 19 '05 #31
Greg Ewing <gr**@cosc.canterbury.ac.nz> writes:
Rocco Moretti wrote:
> This gives the wacky world where
"[(1,2), (3,4)].sort()" works, whereas "[1+2j, 3+4j].sort()" doesn't.


To solve that, I would suggest a fourth category of "arbitrary
ordering", but that's probably Py3k material.


We've got that: use hash().
[1+2j, 3+4j].sort(key=hash)

Using the key= arg in sort means you can do other stuff easily of course:

by real part:
import operator
[1+2j, 3+4j].sort(key=operator.attrgetter('real'))

by size:
[1+2j, 3+4j].sort(key=abs)

and since .sort() is stable, for those numbers where the key is the
same, the order will stay the same.
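Collected as one runnable sketch (using sorted() so the results are easy to see; list.sort() behaves the same way in place):

```python
import operator

nums = [3+4j, 1+2j]

# Arbitrary but consistent order via hash:
by_hash = sorted(nums, key=hash)

# By real part:
by_real = sorted(nums, key=operator.attrgetter('real'))

# By magnitude:
by_size = sorted(nums, key=abs)

print(by_real)  # [(1+2j), (3+4j)]
print(by_size)  # [(1+2j), (3+4j)]
```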

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
Jul 19 '05 #32
David M. Cooke wrote:
To solve that, I would suggest a fourth category of "arbitrary
ordering", but that's probably Py3k material.


We've got that: use hash().
[1+2j, 3+4j].sort(key=hash)


What about objects that are not hashable?

The purpose of arbitrary ordering would be to provide
an ordering for all objects, whatever they might be.

Greg

Jul 19 '05 #33
greg wrote:
David M. Cooke wrote:
To solve that, I would suggest a fourth category of "arbitrary
ordering", but that's probably Py3k material.


We've got that: use hash().
[1+2j, 3+4j].sort(key=hash)


What about objects that are not hashable?

The purpose of arbitrary ordering would be to provide
an ordering for all objects, whatever they might be.


How about id(), then?

And so the circle is completed...

--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

Jul 19 '05 #34
On Thu, 09 Jun 2005 08:10:09 -0400, Dan Sommers wrote:
The main problem is that Python is trying to stick at least three
different concepts onto the same set of operators: equivalence (are
these two objects the same?), ordering (in a sorted list, which comes
first?), and mathematical "size". [snip]
This gives the wacky world where
"[(1,2), (3,4)].sort()" works, whereas "[1+2j, 3+4j].sort()" doesn't.


Python inherits that wackiness directly from (often wacky) world of
Mathematics.

IMO, the true wackiness is that

[ AssertionError, (vars, unicode), __name__, apply ].sort( )

"works," too. Python refusing to sort my list of complex numbers is a
Good Thing.


Only if you understand sorting as being related to the mathematical
sense of size, rather than the sense of ordering. The two are not the
same!

If you were to ask, "which is bigger, 1+2j or 3+4j?" then you
are asking a question about mathematical size. There is no unique answer
(although taking the absolute value must surely come close) and the
expression 1+2j > 3+4j is undefined.

But if you ask "which should come first in a list, 1+2j or 3+4j?" then you
are asking about a completely different thing. The usual way of sorting
arbitrary chunks of data within a list is by dictionary order, and in
dictionary order 1+2j comes before 3+4j because 1 comes before 3.

This suggests that perhaps sort needs a keyword argument "style", one of
"dictionary", "numeric" or "datetime", which would modify how sorting
would compare keys.

Perhaps in Python 3.0.
--
Steven.

Jul 19 '05 #35
Dan Sommers wrote:
On Thu, 09 Jun 2005 15:50:42 +1200,
Greg Ewing <gr**@cosc.canterbury.ac.nz> wrote:

Rocco Moretti wrote:
The main problem is that Python is trying to stick at least three
different concepts onto the same set of operators: equivalence (are
these two objects the same?), ordering (in a sorted list, which comes
first?), and mathematical "size".
This gives the wacky world where
"[(1,2), (3,4)].sort()" works, whereas "[1+2j, 3+4j].sort()" doesn't.


Python inherits that wackiness directly from (often wacky) world of
Mathematics.

IMO, the true wackiness is that

[ AssertionError, (vars, unicode), __name__, apply ].sort( )

"works," too. Python refusing to sort my list of complex numbers is a
Good Thing.


The "wackyness" I refered to wasn't that a list of complex numbers isn't
sortable, but the inconsistent behaviour of list sorting. As you
mentioned, an arbitraty collection of objects in a list is sortable, but
as soon as you throw a complex number in there, you get an exception.

One way to handle that is to refuse to sort anything that doesn't have a
"natural" order. But as I understand it, Guido decided that being able
to sort arbitrary lists is a feature, not a bug. But you can't sort ones
with complex numbers in them, because you also want '1+3j<3+1j' to raise
an error.
Four separate classes of __comparison__ methods in a language that
doesn't (and can't and shouldn't) preclude or warn about rules regarding
which methods "conflict" with which other methods? I do not claim to be
an expert, but that doesn't seem very Pythonic to me.
What "conflict"? Where are you getting the doesn't/can't/shouldn't
prescription from?

Which method you use depends on what you want to achieve:

(Hypothetical Scheme)
Object Identity? - use 'is'
Mathematical Ordering? - use '__eq__' & friends
Object Equivalence? - use '__equiv__'
Arbitrary Ordering? (e.g. for list sorting) - use '__order__'

The only caveat is to define sensible defaults for the cases where one
function is not defined. But that shouldn't be too hard.

__equiv__ -> __eq__ -> is
__order__ -> __lt__/__cmp__
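Those fallbacks can be sketched in ordinary Python without language support (every name here - equiv, order_key, __equiv__, __order__ - is hypothetical, taken from the scheme above, not a real protocol):

```python
def equiv(a, b):
    """Equivalence: prefer a hypothetical __equiv__, fall back to ==, then identity."""
    if hasattr(a, '__equiv__'):
        return a.__equiv__(b)
    try:
        return a == b
    except TypeError:
        return a is b

def order_key(obj):
    """Arbitrary ordering: prefer a hypothetical __order__, else the object
    itself (which relies on its own comparisons during the sort)."""
    if hasattr(obj, '__order__'):
        return obj.__order__()
    return obj

print(equiv(1, 1.0))                     # True, via ==
print(sorted([3, 1, 2], key=order_key))  # [1, 2, 3]
```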
AIUI, __cmp__ exists for backwards compatibility, and __eq__ and friends
are flexible enough to cover any possible comparison scheme.


Except if you want the situation where "[1+2j, 3+4j].sort()" works, and
'1+3j < 3+1j' fails.
I think the issue is that you're thinking along the lines of mathematical
numbers, where the four different comparisons collapse to one. Object
identity? There is only one 'two' - heck, in pure mathematics, there
isn't even a 'float two'/'int two' difference. Equivalence *is*
mathematical equality, and the "arbitrary ordering" is easily defined as
"true" ordering. It's only when you break away from mathematics that you
see the divergence in behavior.
Jul 19 '05 #36
"Rocco Moretti" wrote:
One way to handle that is to refuse to sort anything that doesn't have a
"natural" order. But as I understand it, Guido decided that being able
to sort arbitrary lists is a feature, not a bug.


He has changed his mind since then
(http://mail.python.org/pipermail/pyt...ne/045111.html) but
it was already too late.

Waiting-for-python-3K'ly yours
George

Jul 19 '05 #37
Steven D'Aprano wrote:
....
If you were to ask, "which is bigger, 1+2j or 3+4j?" then you
are asking a question about mathematical size. There is no unique answer
(although taking the absolute value must surely come close) and the
expression 1+2j > 3+4j is undefined.

But if you ask "which should come first in a list, 1+2j or 3+4j?" then you
are asking about a completely different thing. The usual way of sorting
arbitrary chunks of data within a list is by dictionary order, and in
dictionary order 1+2j comes before 3+4j because 1 comes before 3.

This suggests that perhaps sort needs a keyword argument "style", one of
"dictionary", "numeric" or "datetime", which would modify how sorting
would compare keys.

Perhaps in Python 3.0.


What's wrong with the Python 2.4 approach of
>>> clist = [7+8j, 3+4j, 1+2j, 5+6j]
>>> clist.sort(key=lambda z: (z.real, z.imag))
>>> clist
[(1+2j), (3+4j), (5+6j), (7+8j)]

?

Jul 19 '05 #38

"Rocco Moretti" <ro**********@hotpop.com> wrote in message
news:d8**********@news.doit.wisc.edu...
The "wackyness" I refered to wasn't that a list of complex numbers isn't
sortable, but the inconsistent behaviour of list sorting. As you
mentioned, an arbitraty collection of objects in a list is sortable, but
as soon as you throw a complex number in there, you get an exception.


This 'wackyness' is an artifact resulting from Python being 'improved'
after its original design. When Guido added complex numbers as a builtin
type, he had to decide whether to make them sortable or not. There were
reasons to go either way. ... and the discussion has continued ever since
;-)

Terry J. Reedy

Jul 19 '05 #39
George Sakkis wrote:
"Rocco Moretti" wrote:

One way to handle that is to refuse to sort anything that doesn't have a
"natural" order. But as I understand it, Guido decided that being able
to sort arbitrary lists is a feature, not a bug.

He has changed his mind since then
(http://mail.python.org/pipermail/pyt...ne/045111.html) but
it was already too late.


The indicated message sidesteps the crux of the issue. It confirms that
arbitrary *comparisons* between objects are considered a wart, but it
says nothing about arbitrary *ordering* of objects.

None > True --> Wart
[None, True].sort() --> ????

The point that I've been trying to get across is that the two issues are
conceptually separate. (That's not to say that Guido might now consider
the latter a wart, too.)
Jul 19 '05 #40
On Fri, 10 Jun 2005 09:50:56 -0500,
Rocco Moretti <ro**********@hotpop.com> wrote:
Dan Sommers wrote:
On Thu, 09 Jun 2005 15:50:42 +1200,
Greg Ewing <gr**@cosc.canterbury.ac.nz> wrote:
Rocco Moretti wrote:

The main problem is that Python is trying to stick at least three
different concepts onto the same set of operators: equivalence (are
these two objects the same?), ordering (in a sorted list, which comes
first?), and mathematical "size".
This gives the wacky world where
"[(1,2), (3,4)].sort()" works, whereas "[1+2j, 3+4j].sort()" doesn't.

Python inherits that wackiness directly from (often wacky) world of
Mathematics.
IMO, the true wackiness is that
[ AssertionError, (vars, unicode), __name__, apply ].sort( )
"works," too. Python refusing to sort my list of complex numbers is a
Good Thing. The "wackyness" I refered to wasn't that a list of complex numbers isn't
sortable, but the inconsistent behaviour of list sorting. As you
mentioned, an arbitraty collection of objects in a list is sortable, but
as soon as you throw a complex number in there, you get an exception.
Yes, I agree: it is inconsistent.
One way to handle that is to refuse to sort anything that doesn't have
a "natural" order. But as I understand it, Guido decided that being
able to sort arbitrary lists is a feature, not a bug. But you can't
sort ones with complex numbers in them, because you also want
'1+3j<3+1j' to raise an error.
As George Sakkis noted, Guido has since recanted. Unfortunately, in
this case, the time machine would have broken too much existing code.
Four separate classes of __comparison__ methods in a language that
doesn't (and can't and shouldn't) preclude or warn about rules
regarding which methods "conflict" with which other methods? I do
not claim to be an expert, but that doesn't seem very Pythonic to me. What "conflict"? Where are you getting the doesn't/can't/shouldn't
prescription from?
Perhaps "conflict" wasn't quite the right word. For example, if I
define __ne__ and __equal__ and __lt__, then which method(s) should
Python use if I later use a <= or a >= operator?

"Doesn't" and "can't" (call me pessimistic) comes from all the issues
and disagreements we're having in this thread, and "shouldn't" comes
from the Zen:

Explicit is better than implicit.
In the face of ambiguity, refuse the temptation to guess.
Which method you use depends on what you want to achieve:

(Hypothetical Scheme)
Object Identity? - use 'is'
Mathematical Ordering? - use '__eq__' & friends
Object Equivalence? - use '__equiv__'
Arbitrary Ordering? (e.g. for list sorting) - use '__order__'
So which method would python use when sorting a list that happens to
consist only of numbers? or a list that contains mostly integers and a
few complex numbers?
The only caveat is to define sensible defaults for the cases where one
function is not defined. But that shouldn't be too hard.
At the risk of repeating myself:

Explicit is better than implicit.
In the face of ambiguity, refuse the temptation to guess.

Also, as I noted in a previous discussion on an unrelated topic, it
seems that we all have our own notions and limits of explicitness
and implicitness.
__equiv__ -> __eq__ -> is
__order__ -> __lt__/__cmp__
AIUI, __cmp__ exists for backwards compatibility, and __eq__ and friends
are flexible enough to cover any possible comparison scheme.

Except if you want the situation where "[1+2j, 3+4j].sort()" works, and
'1+3j < 3+1j' fails.
I'm sticking with my position that both should fail, unless you
*explicity* tell sort what to do (since for now, we all seem to agree
that the other one should fail). If I have an application that thinks
it has to sort a list of arbitrary objects, then I have to be clever
enough to help.
I think the issue is that you're thinking along the lines of mathematical
numbers, where the four different comparisons collapse to one. Object
identity? There is only one 'two' - heck, in pure mathematics, there
isn't even a 'float two'/'int two' difference. Equivalence *is*
mathematical equality, and the "arbitrary ordering" is easily defined
as "true" ordering. It's only when you break away from mathematics that
you see the divergence in behavior.


IIRC, there was a discussion about overhauling of all of Python's
numbers to make them act more like the mathematical entities that they
represent rather than the Python objects that they are. The long/int
"consolidation," better handling of integer division, and the Decimal
class came out of that discussion. But that still leaves the issue of
what to do with 1 < "foo" and "bar" > 2j and 3j < 4j.

Regards,
Dan

--
Dan Sommers
<http://www.tombstonezero.net/dan/>
Jul 19 '05 #41
Robert Kern <rk***@ucsd.edu> writes:
greg wrote:
David M. Cooke wrote:
To solve that, I would suggest a fourth category of "arbitrary
ordering", but that's probably Py3k material.

We've got that: use hash().
[1+2j, 3+4j].sort(key=hash)

What about objects that are not hashable?
The purpose of arbitrary ordering would be to provide
an ordering for all objects, whatever they might be.


How about id(), then?

And so the circle is completed...


Or something like

def uniquish_id(o):
    try:
        return hash(o)
    except TypeError:
        return id(o)

hash() should be the same across interpreter invocations, whereas id()
won't.

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
Jul 19 '05 #42
If a behavior change is possible at all, I think a more reasonable
behavior would be:

if any rich comparison methods are defined, always use rich comparisons
(and throw an exception if the required rich comparison method is not
available).

This would at least have the benefit of letting users know what code it
had broken when they try to run it :)

Regards,
Pat

Jul 19 '05 #43

Before I answer, let me clarify my position. I am NOT advocating any
change for the 2.x series. I'm not even forwarding any particular
proposal for 3.0/3000. My key (and close to sole) point is that behavior
of > & < is conceptually distinct from ordering in a sorted list, even
though the behaviors coincide for the most common case of numbers.
One way to handle that is to refuse to sort anything that doesn't have
a "natural" order. But as I understand it, Guido decided that being
able to sort arbitrary lists is a feature, not a bug. But you can't
sort ones with complex numbers in them, because you also want
'1+3j<3+1j' to raise an error.

As George Sakkis noted, Guido has since recanted. Unfortunately, in
this case, the time machine would have broken too much existing code.


As I mentioned in response, the referenced email only mentions the
ability to use < & > in comparing arbitrary objects. My key point is
that this is conceptually different from disallowing sorting on a
heterogeneous list. There are ways to disallow one, while still allowing
the other.
"Doesn't" and "can't" (call me pessimistic) comes from all the issues
and disagreements we're having in this thread, and "shouldn't" comes
from the Zen:

Explicit is better than implicit.
In the face of ambiguity, refuse the temptation to guess.


Well, Python already "guesses implicitly", every time you do a "1 + 2.0"
or an "a + b", where 'a' doesn't define '__add__' and 'b' defines
'__radd__' - The trick is to be clear and explicit (in the documentation)
every time you are implicit (in the running program). It's not guessing -
it's a precisely defined part of the language.
AIUI, __cmp__ exists for backwards compatibility, and __eq__ and friends
are flexible enough to cover any possible comparison scheme.
Except if you want the situation where "[1+2j, 3+4j].sort()" works, and
'1+3j < 3+1j' fails.

I'm sticking with my position that both should fail, unless you
*explicity* tell sort what to do (since for now, we all seem to agree
that the other one should fail). If I have an application that thinks
it has to sort a list of arbitrary objects, then I have to be clever
enough to help.


Even if you decide to disallow sorting heterogeneous lists, you still
have the problem of what to do with a user-defined homogeneous list where
__lt__ doesn't return a boolean. More so:
Except if you want the situation where "[a, b].sort()" works, and
'a < b' fails.


If you combine list sorting and >/< together, there is no way someone
who wants >/< on a specific class to return a non-boolean value to have
a homogeneous list of those objects sort the way they want them to.

If, on the other hand, you split the two and provide a sensible and
explicitly defined fall-back order, then someone who doesn't care about
the distinction can carry on as if they were combined. In addition,
those people who want to sort lists of objects which return non-booleans
for >/< can do that too.

BTW, the optional parameter for the sort function is not a suitable
alternative. The main problem with it is where the sort order is
encoded. The extra parameter in the sort has to be provided *every
place* where sort is called, instead of a single place in the definition
of the object. And you can't override the sort order in your user-defined
class, because sort is an operation on the list, not the
objects in it.

And please don't say that always explicitly specifying the sort order is
a good thing, unless you want the sort order to *always* be specified,
even with builtins. (Sorting numbers? Ascending/Descending/Magnitude? -
Strings? ASCII/EBCDIC/Alphabetical/Case Insensitive/Accents at the end or
interspersed? -- Okay, a little petulant, but the real issue is
lists/tuples/dicts. Why isn't the sort order on those explicit?)
All that said, I'm not Guido, and in his wisdom he may decide that
having a sort order different from that given by >/< is an attractive
nuisance, even with user-defined objects. He may then decide to disallow
it like he disallows gotos and free-form indentation.
Jul 19 '05 #44
Max
Jordan Rastrick wrote:
I don't want to order the objects. I just want to be able to say if one
is equal to the other.

Here's the justification given:

The == and != operators are not assumed to be each other's
complement (e.g. IEEE 754 floating point numbers do not satisfy
this). It is up to the type to implement this if desired.
Similar for < and >=, or > and <=; there are lots of examples
where these assumptions aren't true (e.g. tabnanny).

Well, never, ever use equality or inequality operations with floating
point numbers anyway, in any language, as they are notoriously
unreliable due to the inherent inaccuracy of floating point. That's
another pitfall, I'll grant, but it's a pretty well known one to anyone
with programming experience. So I don't think that's a major use case.


I think this is referring to IEEE 754's NaN equality test, which
basically states that x==x is false if-and-only-if x.isNaN() is true.
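That NaN behaviour is easy to observe directly (a sketch in modern Python):

```python
nan = float('nan')
print(nan == nan)  # False: IEEE 754 says NaN compares unequal to everything
print(nan != nan)  # True
```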
Jul 19 '05 #45
Max wrote:
Jordan Rastrick wrote:
Well, never, ever use equality or inequality operations with floating
point numbers anyway, in any language, as they are notoriously
unreliable due to the inherent inaccuracy of floating point. That's
another pitfall, I'll grant, but it's a pretty well known one to anyone
with programming experience. So I don't think that's a major use case.


I think this is referring to IEEE 754's NaN equality test, which
basically states that x==x is false if-and-only-if x.isNaN() is true.


No. He means exactly what he says: when using floats,
it is risky to compare one float to another with
equality, not just NaNs.

This is platform-dependent: I remember the old "Standard
Apple Numerics Environment" (SANE) making the claim
that testing equality on Macintoshes was safe. And I've
just spent a fruitless few minutes playing around with
Python on Solaris trying to find a good example. So it
is quite possible to work with floats for *ages* before
being bitten by this.

In general, the problem occurs like this:

Suppose your floating point numbers have six decimal
digits of accuracy. Then a loop like this may never
terminate:

x = 1.0/3
while x != 1.0: # three times one third equals one
    print x
    x += 1.0/3

It will instead print:
0.333333
0.666666
0.999999
1.333332
1.666665
and keep going.

(You can easily see the result yourself with one of
those cheap-and-nasty calculators with 8 significant
figures. 1/3*3 is not 1.)

It is a lot harder to find examples on good, modern
systems with lots of significant figures, but it can
happen.

Here is a related problem. Adding two floats together
should never give one of the original numbers unless
the other one is zero, correct? Then try this:

py> x = 1.0
py> y = 1e-16 # small, but not *that* small
py> y == 0.0
False
py> x+y == x
True
py> x-x+y == x+y-x
False

(Again, platform dependent, your mileage may vary.)

Or try the same calculations with x=1e12 and y=1e-6.

In general, the work-around to these floating point
issues is to avoid floating point in favour of exact
algebraic calculations whenever possible. If you can't
avoid floats:

- always sum numbers from smallest to largest;

- never compare equality but always test whether some
number is within a small amount of your target;

- try to avoid adding or subtracting numbers of wildly
differing magnitudes; and

- be aware of the risk of errors blowing out and have
strategies in place to manage the size of the error.
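The second point - compare within a tolerance instead of testing exact equality - looks like this in practice (a sketch; math.isclose arrived much later, in Python 3.5, while the manual absolute-tolerance form works anywhere):

```python
import math

a = 0.1 + 0.2
print(a == 0.3)             # False: classic binary rounding artifact
print(abs(a - 0.3) < 1e-9)  # True: manual absolute-tolerance test
print(math.isclose(a, 0.3)) # True: relative-tolerance test (Python 3.5+)
```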
--
Steven.

Jul 19 '05 #46
