Difference between 'is' and '=='

mwql

Hey guys, this may be a stupid question, but I can't seem to find the
answer anywhere online. When is the right time to use 'is' and when
should we use '=='?

Thanks a lot~

Mar 27 '06 #1

Rene Pijlman

mwql:

Hey guys, this may be a stupid question, but I can't seem to find the
answer anywhere online. When is the right time to use 'is' and when
should we use '=='?

http://docs.python.org/ref/comparisons.html

--
René Pijlman

Mar 27 '06 #2

Max M

mwql wrote:

Hey guys, this may be a stupid question, but I can't seem to find the
answer anywhere online. When is the right time to use 'is' and when
should we use '=='?

"is" is like id(obj1) == id(obj2)

>>> 100+1 == 101
True
>>> 100+1 is 101
False

They don't have the same id. (Think of id as memory addresses.)
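
A minimal sketch of what that means, assuming Python 3 (the thread itself predates it). The helper name `same_object` is made up for illustration, and comparing ids is only meaningful while both objects are still alive:

```python
def same_object(x, y):
    """Return True when x and y are the very same object."""
    return id(x) == id(y)

first = [1, 2, 3]
second = [1, 2, 3]   # equal contents, but a separate list object
alias = first        # a second name for the same list

print(first == second, first is second, same_object(first, second))  # True False False
print(first == alias, first is alias, same_object(first, alias))     # True True True
```

Lists are used here instead of integers because their identity does not depend on any caching the interpreter may do.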

--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science

Phone: +45 66 11 84 94
Mobile: +45 29 93 42 96

Mar 27 '06 #3

Fuzzyman

mwql wrote:

Hey guys, this may be a stupid question, but I can't seem to find the
answer anywhere online. When is the right time to use 'is' and when
should we use '=='?

Thanks a lot~

'==' is the equality operator. It is used to test if two objects are
'equal'.

'is' is the identity operator. It is used to test if two
names/references point to the same object.

>>> a = {'a': 3}
>>> b = {'a': 3}
>>> a == b
True
>>> a is b
False
>>> c = a
>>> a is c
True

The two dictionaries a and b are equal, but are separate objects.
(Under the hood, Python uses 'id' to determine identity).

When you bind another name 'c' to point to dictionary a, they *are* the
same object - so a *is* c.

One place the 'is' operator is commonly used is when testing for None.
You only ever have one instance of 'None', so

a is None

is quicker than

a == None

(It only needs to check identity not value.)
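
Speed aside, there is a correctness angle too. A small sketch, assuming Python 3 and a made-up class `AlwaysEqual`, of how `==` can be fooled by a custom `__eq__` while `is` cannot:

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""
    def __eq__(self, other):
        return True

thing = AlwaysEqual()
print(thing == None)   # True: __eq__ was consulted and said yes
print(thing is None)   # False: identity cannot be overridden
```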

I hope that helps.

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml

Mar 27 '06 #4

Joel Hedlund

> "is" is like id(obj1) == id(obj2)
<snip>

(Think of id as memory addresses.)

Which means that "is" comparisons in general will be faster than ==
comparisons. According to PEP8 (python programming style guidelines) you should
use 'is' when comparing to singletons like None. I take this to also include
constants and such. That allows us to take short cuts through known terrain,
such as in the massive_computations function below:

--------------------------------------------------------------
import time

class LotsOfData(object):
    def __init__(self, *data):
        self.data = data
    def __eq__(self, o):
        time.sleep(2) # time consuming computations...
        return self.data == o.data

KNOWN_DATA = LotsOfData(1,2)
same_data = KNOWN_DATA
equal_data = LotsOfData(1,2)
other_data = LotsOfData(2,3)

def massive_computations(data = KNOWN_DATA):
    if data is KNOWN_DATA:
        return "very quick answer"
    elif data == KNOWN_DATA:
        return "quick answer"
    else:
        time.sleep(10) # time consuming computations...
        return "slow answer"

print "Here we go!"
print massive_computations()
print massive_computations(same_data)
print massive_computations(equal_data)
print massive_computations(other_data)
print "Done."
--------------------------------------------------------------

Cheers,
Joel

Mar 27 '06 #5

Roy Smith

In article <e0**********@news.lysator.liu.se>,
Joel Hedlund <jo**********@gmail.com> wrote:

Which means that "is" comparisons in general will be faster than ==
comparisons.

I thought that == automatically compared identity before trying to compare
the values. Or am I thinking of some special case, like strings?

Mar 27 '06 #6

Peter Hansen

Roy Smith wrote:

In article <e0**********@news.lysator.liu.se>,
Joel Hedlund <jo**********@gmail.com> wrote:
Which means that "is" comparisons in general will be faster than ==
comparisons.

I thought that == automatically compared identity before trying to compare
the values. Or am I thinking of some special case, like strings?

You must be thinking of a special case:

>>> class A:
...     def __cmp__(self, other): return 1
...
>>> a = A()
>>> a is a
True
>>> a == a
False
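
For readers on Python 3, where `__cmp__` no longer exists, a rough equivalent using `__eq__` (the class name `NeverEqual` is invented for this sketch). It also shows the special case Roy may have had in mind: built-in containers do check identity before equality when comparing their items.

```python
class NeverEqual:
    """Hypothetical class that is not even equal to itself."""
    def __eq__(self, other):
        return False

a = NeverEqual()
print(a is a)        # True
print(a == a)        # False: __eq__ decides, there is no identity shortcut
print([a] == [a])    # True: list comparison tries identity of the items first
print(a in [a])      # True, for the same reason
```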
-Peter

Mar 27 '06 #7

Clemens Hepper

Roy Smith wrote:

In article <e0**********@news.lysator.liu.se>,
Joel Hedlund <jo**********@gmail.com> wrote:
Which means that "is" comparisons in general will be faster than ==
comparisons.

I thought that == automatically compared identity before trying to compare
the values. Or am I thinking of some special case, like strings?

Even for strings there is a performance difference:

>>> timeit.Timer("'a'=='a'").timeit()
0.26859784126281738
>>> timeit.Timer("'a' is 'a'").timeit()
0.21730494499206543

mfg
- eth

Mar 27 '06 #8

Dan Sommers

On Mon, 27 Mar 2006 14:52:46 +0200,
Joel Hedlund <jo**********@gmail.com> wrote:

... According to PEP8 (python programming style guidelines) you should
use 'is' when comparing to singletons like None. I take this to also
include constants and such ...

This does *not* also mean constants and such:

Python 2.4.2 (#1, Feb 22 2006, 08:02:53)
[GCC 4.0.1 (Apple Computer, Inc. build 5247)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> a = 123456789
>>> a == 123456789
True
>>> a is 123456789
False

Regards,
Dan

--
Dan Sommers
<http://www.tombstonezero.net/dan/>
"I wish people would die in alphabetical order." -- My wife, the genealogist

Mar 27 '06 #9

mwql

It's really strange,

if
a = 1
b = 1
a is b ==> True

The same thing applies for strings, but not for dicts, lists or tuples.
I think the 'is' operator is useful for objects only, not for primitive
types.
I think I solved the mystery behind my buggy code =)

Mar 27 '06 #10

Benji York

mwql wrote:

It's really strange,

if
a = 1
b = 1
a is b ==> True

the same thing applies for strings

Not quite:

>>> 'abc' is 'abc'
True
>>> 'abc' is 'ab' + 'c'
False
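
A hedged sketch of the same point in Python 3: whether two equal strings share one object is up to the interpreter, unless you intern them explicitly with `sys.intern`. The variable names are only illustrative.

```python
import sys

literal = "abc"
built = "".join(["ab", "c"])     # assembled at run time

print(literal == built)          # True: same characters
print(literal is built)          # implementation detail, do not rely on it
# Interning maps equal strings to one shared object.
print(sys.intern(literal) is sys.intern(built))   # True
```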

--
Benji York

Mar 27 '06 #11

Clemens Hepper

Dan Sommers wrote:

This does *not* also mean constants and such:

Python 2.4.2 (#1, Feb 22 2006, 08:02:53)
[GCC 4.0.1 (Apple Computer, Inc. build 5247)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 123456789
>>> a == 123456789
True
>>> a is 123456789
False

It's strange: Python seems to cache constants from 0 to 99:

for x in xrange(1000):
    if not eval("%d"%x) is eval("%d"%x):
        print x

For me it printed 100-999.

- eth

Mar 27 '06 #12

Diez B. Roggisch

mwql wrote:

It's really strange,

if
a = 1
b = 1
a is b ==> True

The same thing applies for strings, but not for dicts, lists or tuples.
I think the 'is' operator is useful for objects only, not for primitive
types.
I think I solved the mystery behind my buggy code =)

The reason that "is" works for small numbers is that these are cached for
performance reasons. Try

>>> a = 1000000
>>> b = 1000000
>>> a is b
False

So - your conclusion is basically right: use is on (complex) objects, not on
numbers and strings and other built-ins. The exception from the rule is
None - that should only exist once, so

foo is not None

is considered better style than foo == None.

Diez

Mar 27 '06 #13

Felipe Almeida Lessa

On Mon, 2006-03-27 at 08:23 -0500, Dan Sommers wrote:

On Mon, 27 Mar 2006 14:52:46 +0200,
Joel Hedlund <jo**********@gmail.com> wrote:
... According to PEP8 (python programming style guidelines) you should
use 'is' when comparing to singletons like None. I take this to also
include constants and such ...

This does *not* also mean constants and such:

Python 2.4.2 (#1, Feb 22 2006, 08:02:53)
[GCC 4.0.1 (Apple Computer, Inc. build 5247)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 123456789
>>> a == 123456789
True
>>> a is 123456789
False

Not that kind of constant, but this one:

Python 2.4.2 (#2, Nov 20 2005, 17:04:48)
[GCC 4.0.3 20051111 (prerelease) (Debian 4.0.2-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> CONST = 123456789
>>> a = CONST
>>> a == CONST
True
>>> a is CONST
True

--
Felipe.

Mar 27 '06 #14

filipwasilewski

Clemens Hepper wrote:

It's strange: python seem to cache constants from 0 to 99:

That's true. The Python API doc says that Python keeps an array of
integer objects for all integers between -1 and 100. See
http://docs.python.org/api/intObjects.html.
This also seems to be true for integers from -5 to -2 on (ActiveState)
Python 2.4.2.

--
fw

Mar 27 '06 #15

Donn Cave

In article <48************@uni-berlin.de>,
"Diez B. Roggisch" <de***@nospam.web.de> wrote:
....

So - your conclusion is basically right: use is on (complex) objects, not on
numbers and strings and other built-ins. The exception from the rule is
None - that should only exist once, so

foo is not None

is considered better style than foo == None.

But even better style is just `foo' or `not foo'. Or not,
depending on what you're thinking.

The key point between `is' and `==' has already been made -
- use `is' to compare identity
- use `==' to compare value

It's that simple, and it's hard to add to this without
potentially layering some confusion on it. While Python's
implementation makes the use of identity with small numbers
a slightly more complicated issue, there isn't a lot of
practical difference. To take a common case that has already
been mentioned here, if I define some constant symbolic values
as small integers, as long as I take care that their values
are distinct, I can reasonably use identity and ignore this
technical weakness. I can assume that no one is going to
supply randomly selected integers in this context. Meanwhile,
the use of identity clarifies the intent.

Depending, of course, on what the intent may be, which brings
us to None, and a point about values in Python that was brought
to a fairly brilliant light some years back by someone we don't
hear from often here any more, unfortunately.

- use `is' to compare identity
- use `==' to compare value
- use neither to test for `somethingness'

I'm not going to try to elucidate the theory of something and
nothing in Python, but suffice it to say that there are places
where it may be better to write

if not expr:

than

if expr is None:

or worse yet,

if expr == False:

That's what I think, anyway.

Donn Cave, do**@u.washington.edu

Mar 27 '06 #16

Terry Reedy

"Clemens Hepper" <et*******@gmx.net> wrote in message
news:e0*********@news2.open-news-network.org...

It's strange: python seem to cache constants from 0 to 99:

The Python specification allows but does not require such behind-the-scenes
implementation optimization hacks. As released, CPython 2.4 caches -5 to
99, I believe. In 2.5, the upper limit was increased to 256. The limits
are in a pair of #define statements in the int object source file. Anyone
who compiles from source can adjust as desired (though the corresponding
test will fail unless also adjusted ;-).
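
For anyone who wants to probe this on their own interpreter, a small sketch in Python 3. The function name `not_shared` is invented here, and the result is an implementation detail that varies between interpreters and versions, so no particular output is promised:

```python
def not_shared(candidates):
    """Return the values for which two run-time-built ints are distinct objects."""
    distinct = []
    for n in candidates:
        # int(str(n)) creates each value at run time, so the compiler
        # cannot merge two identical literals into one object.
        first = int(str(n))
        second = int(str(n))
        if first is not second:
            distinct.append(n)
    return distinct

found = not_shared(range(-10, 300))
# Show the edges of whatever range this interpreter does not share.
print(found[:3], found[-3:])
```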

I think the visibility of this implementation detail from Python code is an
example of a leaky abstraction. For more, see
http://www.joelonsoftware.com/articl...tractions.html

Terry Jan Reedy

Mar 27 '06 #17

Erik Max Francis

Terry Reedy wrote:

The Python specification allows but does not require such behind-the-scenes
implementation optimization hacks. As released, CPython 2.4 caches -5 to
99, I believe. In 2.5, the upper limit was increased to 256. The limits
are in a pair of #define statements in the int object source file. Anyone
who compiles from source can adjust as desired (though the corresponding
test will fail unless also adjusted ;-).

I think the visibility of this implementation detail from Python code is an
example of a leaky abstraction. For more, see
http://www.joelonsoftware.com/articl...tractions.html

I don't see that as quite the same thing. That you can fiddle around
with the `is` operator to investigate how small integers are cached
doesn't really reveal any profound underlying abstraction, except that
maybe all Python entities are true objects and that integers are
immutable, which are things hopefully everyone was already aware of.

If you're trying to test integer equality, you should be using the `==`
operator, not the `is` operator, so what you find out about how things
are caching is really irrelevant.

--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
And covenants, without the sword, are but words and of no strength to
secure a man at all. -- Thomas Hobbes, 1588-1679

Mar 27 '06 #18

Rene Pijlman

Terry Reedy:

The Python specification allows but does not require such behind-the-scenes
implementation optimization hacks. As released, CPython 2.4 caches -5 to
99, I believe. In 2.5, the upper limit was increased to 256. The limits
are in a pair of #define statements in the int object source file. Anyone
who compiles from source can adjust as desired (though the corresponding
test will fail unless also adjusted ;-).

I think the visibility of this implementation detail from Python code is an
example of a leaky abstraction. For more, see
http://www.joelonsoftware.com/articl...tractions.html

Joel has his abstractions wrong. TCP doesn't guarantee reliable delivery;
unlike IP, it delivers reliably or it tells you it cannot. SQL performance
doesn't break abstraction, since performance isn't part of SQL. You _can_
drive as fast when it's raining.

Now, about identity and equality. As you know, identity implies equality,
but equality doesn't imply identity. You seem to assume that when identity
is not implied, it should be undetectable. But why?

Here's A, there's B, they may or may not be identical, they may or may not
be equal.

What are you suggesting?

1. If A and B are of comparable types, and A equals B, we should not be
allowed to evaluate "A is B".

2. If A and B are of comparable types, and A equals B, A should be B.

3. If A and B are of comparable types, and A equals B, A should not be B.

4. If A and B are of comparable types, and A equals B, "A is B" should not
evaluate to a boolean value.

5. When A and B are of comparable types, we should not be allowed to
evaluate "A == B" :-)

--
René Pijlman

Mar 27 '06 #19

Dan Sommers

On Mon, 27 Mar 2006 11:08:36 -0300,
Felipe Almeida Lessa <fe**********@gmail.com> wrote:

On Mon, 2006-03-27 at 08:23 -0500, Dan Sommers wrote:
On Mon, 27 Mar 2006 14:52:46 +0200,
Joel Hedlund <jo**********@gmail.com> wrote:
> ... According to PEP8 (python programming style guidelines) you should
> use 'is' when comparing to singletons like None. I take this to also
> include constants and such ...
This does *not* also mean constants and such:

Python 2.4.2 (#1, Feb 22 2006, 08:02:53)
[GCC 4.0.1 (Apple Computer, Inc. build 5247)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 123456789
>>> a == 123456789
True
>>> a is 123456789
False

Not that kind of constant, but this one:

Python 2.4.2 (#2, Nov 20 2005, 17:04:48)
[GCC 4.0.3 20051111 (prerelease) (Debian 4.0.2-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> CONST = 123456789
>>> a = CONST
>>> a == CONST
True
>>> a is CONST
True

That's a little misleading, and goes back to the questions of "what is
assignment in Python?" and "What does it mean for an object to be
mutable?"

The line "a = CONST" simply gives CONST a new name. After that, "a is
CONST" will be True no matter what CONST was. Under some circumstances,
I can even change CONST, and "a is CONST" will *still* be True.

>>> CONST = range(22)
>>> a = CONST
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
>>> a is CONST
True
>>> CONST[12] = 'foo'
>>> a is CONST
True

Right off the top of my head, I can't think of a way to make "a = b; a
is b" return False.
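
The same idea as a short Python 3 sketch (where `range` must be wrapped in `list` to get a mutable list). Assignment binds a second name to the existing object, while an explicit copy creates a new one:

```python
CONST = list(range(5))
a = CONST              # binds a second name; no copy is made

CONST[2] = "foo"       # mutate the object through one name
print(a)               # [0, 1, 'foo', 3, 4]
print(a is CONST)      # True

b = list(CONST)        # an explicit copy is a new object
print(b == CONST, b is CONST)   # True False
```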

Regards,
Dan

--
Dan Sommers
<http://www.tombstonezero.net/dan/>
"I wish people would die in alphabetical order." -- My wife, the genealogist

Mar 28 '06 #20

Felipe Almeida Lessa

On Mon, 2006-03-27 at 21:05 -0500, Dan Sommers wrote:

Right off the top of my head, I can't think of a way to make "a = b; a
is b" return False.

Sorry for being so --quiet. I will try to be more --verbose.

I can think of two types of constants:
1) Those defined in the language, like True, None, 0 and the like.
2) Those defined on your code.

You said type 1 can be used with "is", you're right:

>>> a = 100
>>> a is 100
False

I said type 2 can (maybe "should"?) be used with "is", and AFAICT I'm
right as well:

>>> b = a
>>> b is a
True

That said, you can do things like:

>>> import socket
>>> a = socket.AF_UNIX
>>> a is socket.AF_UNIX
True

That kind of constant can be used with "is". But if you don't want to be
as prone to errors as I am, use "is" only when you really know for sure
that you're dealing with singletons.

HTH,

--
Felipe.

Mar 28 '06 #21

alex23

Felipe Almeida Lessa wrote:

I said [constants defined in your code] can (maybe "should"?) be used with "is", and
AFAICT I'm right as well:
>>> b = a
>>> b is a
True

You should _never_ use 'is' to check for equivalence of value. Yes, due
to the implementation of CPython the behaviour you quote above does
occur, but it doesn't mean quite what you seem to think it does.

Try this:

UPPERLIMIT = 100
i = 0
while not (i is UPPERLIMIT):
    i+=1
    print i

Comparing a changing variable to a pre-defined constant seems a lot
more general a use case than sequential binding & comparison...and as
this should show, 'is' does _not_ catch these cases.
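
A bounded variant of that loop, sketched in Python 3 so that it cannot run forever. The helper `count_up`, its `cap` parameter and the larger limit of 1000 are assumptions added for this illustration:

```python
import operator

def count_up(limit, matches, cap):
    """Count from 0 until matches(i, limit) is true, giving up at cap."""
    i = 0
    while not matches(i, limit) and i < cap:
        i += 1
    return i

UPPERLIMIT = 1000   # assumed value, larger than the 100 used above
print(count_up(UPPERLIMIT, operator.eq, cap=5000))   # stops at 1000
print(count_up(UPPERLIMIT, operator.is_, cap=5000))  # may run all the way to the cap
```

With `operator.eq` the loop stops as soon as the values match. With `operator.is_` it stops only if the interpreter happens to hand back the same object, which is not something to count on.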

- alex23

Mar 28 '06 #22

Antoon Pardon

On 2006-03-27, Donn Cave wrote <do**@u.washington.edu>:

In article <48************@uni-berlin.de>,
"Diez B. Roggisch" <de***@nospam.web.de> wrote:
...
So - your conclusion is basically right: use is on (complex) objects, not on
numbers and strings and other built-ins. The exception from the rule is
None - that should only exist once, so

foo is not None

is considered better style than foo == None.

But even better style is just `foo' or `not foo'. Or not,
depending on what you're thinking.

No it is not. When you need None to be treated specially,
that doesn't imply you want to treat zero numbers or empty
sequences as special too.
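
To make the distinction concrete, a small Python 3 sketch (the helper `describe` is made up) showing that the three tests discussed here classify the same values differently:

```python
def describe(value):
    """Show how three different tests classify the same value."""
    return {
        "is None": value is None,
        "== False": value == False,
        "not value": not value,
    }

for sample in (None, 0, "", [], False, 42):
    print(repr(sample), describe(sample))
```

Only `None` passes the `is None` test, while `0`, `""`, `[]` and `False` all pass `not value`, so the tests are not interchangeable.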

--
Antoon Pardon

Mar 28 '06 #23

Felipe Almeida Lessa

On Mon, 2006-03-27 at 23:02 -0800, alex23 wrote:

> Felipe Almeida Lessa wrote:
>> I said [constants defined in your code] can (maybe "should"?) be used with "is", and
>> AFAICT I'm right as well:
>> >>> b = a
>> >>> b is a
>> True
>
> You should _never_ use 'is' to check for equivalence of value. Yes, due
> to the implementation of CPython the behaviour you quote above does
> occur, but it doesn't mean quite what you seem to think it does.

/me not checking for value. I'm checking for identity. Suppose "a" is a
constant. I want to check if "b" is the same constant.

> Try this:
>
> UPPERLIMIT = 100
> i = 0
> while not (i is UPPERLIMIT):
>     i+=1
>     print i
>
> Comparing a changing variable to a pre-defined constant seems a lot
> more general a use case than sequential binding & comparison...and as
> this should show, 'is' does _not_ catch these cases.

That's *another* kind of constant. I gave you the example of
socket.AF_UNIX, the kind of constant I'm talking about. Are you going to
sequentially create numbers until you find it? Of course not.

The problem with Python (and other languages like Java) is that we don't
have a type like an enum (yet) so we have to define constants in our
code. By doing an "is" instead of a "==" you *can* catch some errors.
For example, a very dummy function (picked MSG_EOR as its value is
greater than 99):

---
from socket import MSG_EOR, MSG_WAITALL

def test(type):
    if type is MSG_EOR:
        print "This *is* MSG_EOR"
    elif type == MSG_EOR:
        print "This may be MSG_EOR"
    else:
        print "*Not* MSG_EOR"
---

Now testing it:

>>> test(MSG_EOR)
This *is* MSG_EOR

Fine, but:

>>> print MSG_EOR
128
>>> test(128)
This may be MSG_EOR

This is a mistake. Here I knew 128 == MSG_EOR, but what if that was a
coincidence of some other function I created? I would *never* catch that
bug as the function that tests for MSG_EOR expects any integer. By
testing with "is" you test for *that* integer, the one defined on your
module and that shouldn't go out of it anyway.

Of course using an enum should make all said here obsolete.

--
Felipe.

Mar 28 '06 #24

Joel Hedlund

> This does *not* also mean constants and such:
> <snip>
> >>> a = 123456789
> >>> a == 123456789
> True
> >>> a is 123456789
> False

I didn't mean that kind of constant. I meant named constants with defined
meaning, as in the example that I cooked up in my post. More examples: os.R_OK,
or more complex ones like mymodule.DEFAULT_CONNECTION_CLASS.

Sorry for causing unnecessary confusion.

Cheers!
/Joel Hedlund

Mar 28 '06 #25

Joel Hedlund

>> You should _never_ use 'is' to check for equivalence of value. Yes, due
>> to the implementation of CPython the behaviour you quote above does
>> occur, but it doesn't mean quite what you seem to think it does.
>
> /me not checking for value. I'm checking for identity. Suppose "a" is a
> constant. I want to check if "b" is the same constant.

/me too. That's what my example was all about. I was using identity to a known
CONSTANT (in caps as per python naming conventions :-) to sidestep costly value
equality computations.

> By doing an "is" instead of a "==" you *can* catch some errors.
> <snip>
> By testing with "is" you test for *that* integer, the one defined on your
> module and that shouldn't go out of it anyway.

I totally agree with you on this point. Anything that helps guard against
"stealthed" errors is a good thing by my standards.

Cheers!
/Joel Hedlund

Mar 28 '06 #26

Joel Hedlund

>> Not that kind of constant, but this one:
>>
>> Python 2.4.2 (#2, Nov 20 2005, 17:04:48)
>> [GCC 4.0.3 20051111 (prerelease) (Debian 4.0.2-4)] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> CONST = 123456789
>> >>> a = CONST
>> >>> a == CONST
>> True
>> >>> a is CONST
>> True
>
> That's a little misleading, and goes back to the questions of "what is
> assignment in Python?" and "What does it mean for an object to be
> mutable?"
>
> The line "a = CONST" simply gives CONST a new name. After that, "a is
> CONST" will be True no matter what CONST was. Under some circumstances,
> I can even change CONST, and "a is CONST" will *still* be True.

Anyone who thinks it's a good idea to change a CONST that's not in a module
that they have full control over must really know what they're doing or suffer
the consequences. Most often, the consequences will be nasty bugs.

Cheers!
/Joel Hedlund

Mar 28 '06 #27

Joel Hedlund

> a is None
>
> is quicker than
>
> a == None

I think it's not such a good idea to focus on speed gains here, since they
really are marginal (max 2 seconds total after 10000000 comparisons):

>>> import timeit
>>> print timeit.Timer("a == None", "a = 1").timeit(int(1e7))
4.19580316544
>>> print timeit.Timer("a == None", "a = None").timeit(int(1e7))
3.20231699944
>>> print timeit.Timer("a is None", "a = 1").timeit(int(1e7))
2.37486410141
>>> print timeit.Timer("a is None", "a = None").timeit(int(1e7))
2.48372101784

Your observation is certainly correct, but I think it's better applied to more
complex comparisons (say for example comparisons between gigantic objects or
objects where value equality determination requires a lot of nontrivial
computations). That's where any real speed gains can be found. PEP8 tells me
it's better style to write "a is None" and that's good enough for me. Otherwise
I try to stay away from speed microoptimisations as much as possible since they
generally result in less readable code, which in turn often results in an
overall speed loss because code maintenance will be harder.
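
To repeat the measurement on a current interpreter, a rough Python 3 sketch. The helper `time_expr` and the iteration count are choices made for this example, and the numbers will differ from machine to machine:

```python
import timeit

def time_expr(expr, setup, number=100_000):
    """Return the seconds taken to evaluate expr `number` times."""
    return timeit.timeit(expr, setup=setup, number=number)

for expr in ("a == None", "a is None"):
    for setup in ("a = 1", "a = None"):
        elapsed = time_expr(expr, setup)
        print(f"{expr!r:12} with {setup!r:10}: {elapsed:.4f}s")
```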

Cheers!
/Joel Hedlund

Mar 28 '06 #28

Tim Churches

Am I correct in thinking that there is no longer any link from anywhere
on the Python Web site at http://www.python.org to the Daily Python-URL
at http://www.pythonware.com/daily/ ? There is no sign of it on the
Community page, nor any reference to it at http://planet.python.org/

I'm sure there was a link to it last time I looked.

Tim C

Mar 28 '06 #29

Peter Otten

Tim Churches wrote:

Am I correct in thinking that there is no longer any link from anywhere
on the Python Web site at http://www.python.org to the Daily Python-URL
at http://www.pythonware.com/daily/ ? There is no sign of it on the
Community page, nor any reference to it at http://planet.python.org/

See http://www.python.org/links/
All links provided on that page would benefit from a short description.

Peter

Mar 28 '06 #30

Peter Hansen

Joel Hedlund wrote:

This does *not* also mean constants and such:

<snip>
>>> a = 123456789
>>> a == 123456789
True
>>> a is 123456789
False

I didn't mean that kind of constant. I meant named constants with defined
meaning, as in the example that I cooked up in my post. More examples: os.R_OK,
or more complex ones like mymodule.DEFAULT_CONNECTION_CLASS.

If it weren't for the current CPython optimization (caching small
integers) this code, which it appears you would support writing, would fail:

if (flags & os.R_OK) is os.R_OK:
    pass  # do something

while this, on the other hand, is not buggy, because it correctly uses
equality comparison when identity comparison is not called for:

if (flags & os.R_OK) == os.R_OK:
    pass  # do something
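
The equality form wrapped in a small helper, as a Python 3 sketch; the function name `may_read` is made up for illustration:

```python
import os

def may_read(flags):
    """True when every bit of os.R_OK is set in flags."""
    return (flags & os.R_OK) == os.R_OK

print(may_read(os.R_OK | os.W_OK))   # True
print(may_read(os.W_OK))             # False
```

The result of `flags & os.R_OK` is a freshly computed integer, so only its value can be relied on, not its identity.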

(I think you should give it up... you're trying to push a rope.)

-Peter

Mar 28 '06 #31

Steven D'Aprano

On Tue, 28 Mar 2006 12:12:52 +0200, Joel Hedlund wrote:

I try to stay away from speed microoptimisations as much as possible since they
generally result in less readable code, which in turn often results in an
overall speed loss because code maintenance will be harder.

+1 QOTW
--
Steven.

Mar 28 '06 #32

Ross Ridge

Felipe Almeida Lessa wrote:

That said, you can do things like:

>>> import socket
>>> a = socket.AF_UNIX
>>> a is socket.AF_UNIX
True

That kind of constants can be used with "is". But if don't want to be
prone to errors as I do, use "is" only when you really know for sure
that you're dealing with singletons.

It's only safe to compare address family values with socket.AF_UNIX
using "is", if small integers are guaranteed to be singletons, and
socket.AF_UNIX has one of those small values. Otherwise, address
family values equal in value to socket.AF_UNIX can be generated using
different objects. There's no requirement that the socket module or
anything else return values using the same object that the
socket.AF_UNIX constant uses.

Consider this example using the socket.IPPROTO_RAW constant:

>>> socket.getaddrinfo("localhost", None, socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)[0][2] is socket.IPPROTO_RAW
False
>>> socket.getaddrinfo("localhost", None, socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)[0][2] == socket.IPPROTO_RAW
True

Ross Ridge

Mar 28 '06 #33

Felipe Almeida Lessa

On Tue, 2006-03-28 at 15:18 -0800, Ross Ridge wrote:
[snip]

Consider this example using the socket.IPPROTO_RAW constant:
>>> socket.getaddrinfo("localhost", None, socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)[0][2] is socket.IPPROTO_RAW
False
>>> socket.getaddrinfo("localhost", None, socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)[0][2] == socket.IPPROTO_RAW
True

Ok, you win. It's not safe to do "is" checks on these kinds of
constants.

--
Felipe.

Mar 28 '06 #34

Joel Hedlund

> If it weren't for the current CPython optimization (caching small
> integers)

This has already been covered elsewhere in this thread. Read up on it.

> this code which it appears you would support writing
>
> if (flags & os.R_OK) is os.R_OK:

I do not.

You compare a module.CONSTANT to the result of an expression (flags & os.R_OK).
Expressions are not names bound to objects, the identity of which is what I'm
talking about. This example does not apply. Also, the identity check in my
example has a value equality fallback. Yours doesn't, so it really does not apply.

> (I think you should give it up... you're trying to push a rope.)

I'm not pushing anything. I just don't like being misquoted.

Cheers,
Joel Hedlund

Mar 29 '06 #35

Joel Hedlund

> There's no requirement that the socket module or
> anything else return values using the same object that the
> socket.AF_UNIX constant uses.

Ouch. That's certainly an eye-opener.

For me, this means several things, and I'd really like to hear people's
thoughts about them.

It basically boils down to "don't ever use 'is' unless pushed into a corner,
and never mind what PEP8 says about it".

So here we go... *takes deep breath*

Identity checks can only be done safely to compare a variable to a defined
builtin singleton such as None. Since this is only marginally faster than a
value equality comparison, there is little practical reason for doing so.
(Except for the sake of following PEP8, more of that below).

You cannot expect to ever have identity between a value returned by a
function/method and a CONSTANT defined in the same package/module, if you do
not have complete control over that module. Therefore, such identity checks
should always be given a value equality fallback. In most cases the identity
check will not be significantly faster than a value equality check, so for the
sake of readability it's generally a good idea to skip the identity check and
just do a value equality check directly. (Personally, I don't think it's good
style to define constants and not be strict about how you use them, but that's
on a side note and not very relevant to this discussion)

It may be a good idea to use identity checks for variables vs CONSTANTs defined
in the same module/package, if it's Your module/package and you have complete
control over it. Felipe Almeida Lessa provided a good argument for this earlier
in this thread:

> Here I knew 128 == MSG_EOR, but what if that was a
> coincidence of some other function I created? I would *never* catch that
> bug as the function that tests for MSG_EOR expects any integer. By
> testing with "is" you test for *that* integer, the one defined on your
> module and that shouldn't go out of it anyway.

However it may be a bad idea to do so, since it may lure you into a false sense
of security, so you may start to unintentionally misuse 'is' in an unsafe manner.

So the only motivated use of 'is' would then be the one shown in my first
example with the massive_computations() function: as a shortcut past costly
value equality computations where the result is known, and with an added value
equality fallback for safety. Preferably, the use of identity should then also
be motivated in a nearby comment.

My conclusion is then that using 'is' is a bad habit and leads to less readable
code. You should never use it, unless it leads to a *measurable* gain in
performance, in which case it should also be given a value equality fallback and a
comment. And lastly, PEP8 should be changed to reflect this.

Wow... that got a bit long and I applaud you for getting this far! :-) Thanks
for taking the time to read it.

So what are your thoughts about this, then?

Cheers!
/Joel Hedlund

Mar 29 '06 #36

Joel Hedlund

sorry

You compare a module.CONSTANT to the result of an expression

s/an expression/a binary operation/

/joel

Joel Hedlund wrote:

If it weren't for the current CPython optimization (caching small
integers)

This has already been covered elsewhere in this thread. Read up on it.

this code which it appears you would support writing

if (flags & os.R_OK) is os.R_OK:

I do not.

You compare a module.CONSTANT to the result of an expression (flags & os.R_OK).
Expressions are not names bound to objects, the identity of which is what I'm
talking about. This example does not apply. Also, the identity check in my
example has a value equality fallback. Yours doesn't, so it really does not apply.
> (I think you should give it up... you're trying to push a rope.)

I'm not pushing anything. I just don't like being misquoted.

Cheers,
Joel Hedlund

Mar 29 '06 #37

Fredrik Lundh

Joel Hedlund wrote:

> For me, this means several things, and I'd really like to hear people's
> thoughts about them.
>
> It basically boils down to "don't ever use 'is' unless pushed into a corner,
> and never mind what PEP8 says about it".

nonsense.

> Identity checks can only be done safely to compare a variable to a defined
> builtin singleton such as None.

utter nonsense.

> You cannot expect to ever have identity between a value returned by a
> function/method and a CONSTANT defined in the same package/module, if you do
> not have complete control over that module.

or if the documentation guarantees that you can use "is" (e.g. by specifying
that you get back the object you passed in, or by specifying that a certain
object is a singleton, etc).

> Therefore, such identity checks should always be given a value equality
> fallback.

if the documentation guarantees that you can use "is", you don't need any
"value equality fallback".

> My conclusion is then that using 'is' is a bad habit and leads to less readable
> code. You should never use it, unless it leads to a *measurable* gain in
> performance, in which case it should also be given a value equality fallback and a
> comment. And lastly, PEP8 should be changed to reflect this.
>
> Wow... that got a bit long and I applaud you for getting this far! :-) Thanks
> for taking the time to read it.
>
> So what are your thoughts about this, then?

you need to spend more time relaxing, and less time making up arbitrary
rules for others to follow.

read the PEP and the documentation. use "is" when you want object identity,
and you're sure it's the right thing to do. don't use it when you're not sure.
any other approach would be unpythonic.
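
As one illustration of a documented guarantee of the "you get back the object you passed in" kind, here is a small sketch built on getattr(), which is documented to return the supplied default when the attribute does not exist. The names Config, describe and _missing are made up for this example.

```python
# Private marker object; no other object can ever be identical to it.
_missing = object()


class Config:
    timeout = None  # None is a legitimate, meaningful value here


def describe(obj, name):
    # getattr() hands back the default object itself when the attribute
    # is absent, so an identity test against the marker is safe.
    value = getattr(obj, name, _missing)
    if value is _missing:
        return "not set"
    return "set to %r" % (value,)


print(describe(Config, "timeout"))  # set to None
print(describe(Config, "retries"))  # not set
```

No value equality fallback is needed here, because the only way to obtain _missing from getattr() is through the default argument.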

</F>

Mar 29 '06 #38

Max M

Joel Hedlund wrote:

There's no requirement that the socket module or
anything else return values using the same object that the
socket.AF_UNIX constant uses.

Ouch. That's certainly an eyeopener.

For me, this means several things, and I'd really like to hear people's
thoughts about them.

It basically boils down to "don't ever use 'is' unless pushed into a
corner, and nevermind what PEP8 says about it".

Identity checks are often used for checking input parameters of a function:

def somefunc(val=None):
    if val is None:
        val = []
    do_stuff(val)

Or if None is a possible parameter you can use your own object as a marker::

_marker = []

def somefunc(val=_marker):
    if val is _marker:
        val = []
    do_stuff(val)
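
A runnable variant of the marker idiom, as a sketch only. It assumes object() for the marker instead of a list (so the marker cannot be mutated by accident) and returns the value instead of calling do_stuff, which is not defined in the post.

```python
_marker = object()  # unique marker; only this exact object is identical to it


def somefunc(val=_marker):
    # Identity is the right test here: an equality test against a list
    # marker would also match any empty list the caller passes in,
    # since [] == [] is true.
    if val is _marker:
        val = []  # a fresh list for every call
    return val


print(somefunc())      # []
print(somefunc(None))  # None, because None is now an ordinary argument
```

With this pattern a caller can pass None deliberately and the function can still tell that apart from "no argument given".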

--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science

Phone: +45 66 11 84 94
Mobile: +45 29 93 42 96

Mar 29 '06 #39

Joel Hedlund

>> For me, this means several things, and I'd really like to hear people's
>> thoughts about them.

> you need to spend more time relaxing, and less time making up arbitrary
> rules for others to follow.

I'm very relaxed, thank you. I do not make up rules for others to follow. I ask
for other people's opinions so that I can reevaluate my views.

I do respect your views, as I clearly can see you have been helpful and
constructive in earlier discussions in this newsgroup. So therefore if you
think my statements are nonsense, there's a good chance you're right. And
that's why I posted. To hear what other people think.

Sorry if I came off stiff and belligerent because that certainly wasn't my intent.

> read the PEP and the documentation.

Always do.

> use "is" when you want object identity,
> and you're sure it's the right thing to do. don't use it when you're not sure.
> any other approach would be unpythonic.

Right.

Chill!
/Joel Hedlund

</F>

Mar 29 '06 #40

Duncan Booth

Joel Hedlund wrote:

It basically boils down to "don't ever use 'is' unless pushed into a
corner, and nevermind what PEP8 says about it".

A quick grep[*] of the Python library shows the following common use-cases
for 'is'. The library isn't usually a good indicator of current style
though: a lot of it was forced to do things differently at the time when it
was written. (Also, my search includes site-packages so this isn't all
standard lib: I found 5091 uses of the keyword 'is').

Comparisons against None ('<expr> is None' or '<expr> is not None') are by far
the commonest.

Comparisons against specific types or classes are the next most common (in
the standard library '<expr> is type([])', although if the code were
rewritten today and didn't need backward compatibility it could just do
'<expr> is list').

Comparisons against sentinel or marker objects, e.g. in the standard library
this is usually a constant set to [].

Other singletons e.g. NotImplemented

code.py has:
    if filename and type is SyntaxError:
which might be better style if it used 'issubclass' rather than 'is'.

cookielib.py has:
    if port is None and port_specified is True:
naughty.

difflib.py uses:
    if a is self.a:
        return
in SequenceMatcher to avoid recomputing related values when changing one or
other of the sequences.

doctest does some fairly advanced identity testing. It also has:

    SUCCESS, FAILURE, BOOM = range(3) # `outcome` state
    ...
    if outcome is SUCCESS:
        ...
    elif outcome is FAILURE:
        ...
    elif outcome is BOOM:

fnmatch.py uses 'is' on some module names to optimise out a function call.

optparse.py uses 'is' to test for non-default options, but it has some
pretty dubious ways of generating the values it tests against:
    NO_DEFAULT = ("NO", "DEFAULT")
    SUPPRESS_HELP = "SUPPRESS"+"HELP"
    SUPPRESS_USAGE = "SUPPRESS"+"USAGE"
    ....
    if default_value is NO_DEFAULT or default_value is None:
        default_value = self.NO_DEFAULT_VALUE
    ....
    if not option.help is SUPPRESS_HELP:
    ....
    elif usage is SUPPRESS_USAGE:

sre defines a bunch of constant strings like an enumeration and uses 'is'
to test them.

threading.py does identity checking on threads:
    me = currentThread()
    if self.__owner is me:

Non standard library:

Spambayes has:
    if val is True:
        val = "Yes"
    elif val is False:
        val = "No"
when displaying configuration options to the user.

zsi does things like:
    if item.isSimple() is True:
        if item.content.isRestriction() is True:
            self.content = RestrictionContainer()
        elif item.content.isUnion() is True:
            self.content = UnionContainer()
        elif item.content.isList() is True:
            self.content = ListContainer()
ick.

[*] I discovered a neat feature I didn't know my editor had: grepping for
"<[c:python-keyword>is" finds all occurrences of the keyword 'is' while
ignoring it everywhere it isn't a keyword.
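
Two of the patterns in the survey above behave differently from their more common alternatives, which a short sketch can make concrete. MyList and flag_label are names invented for this illustration; they do not come from the libraries mentioned.

```python
class MyList(list):
    """Hypothetical list subclass, used only to show the difference."""


items = MyList([1, 2])

# Identity on types asks "is this exactly list?", not "is this a list?".
exact = type(items) is list             # False for a subclass instance
compatible = isinstance(items, list)    # True for a subclass instance


def flag_label(val):
    # "is True" / "is False" match only the two bool singletons, so other
    # truthy or falsy values (1, 0, "yes", []) fall through untouched.
    if val is True:
        return "Yes"
    elif val is False:
        return "No"
    return repr(val)


print(exact, compatible)
print(flag_label(True), flag_label(1), flag_label(0))
```

Whether that strictness is wanted depends on the call site: it is deliberate in the Spambayes case (only real booleans are relabelled), while on the return value of a predicate method it mostly adds noise.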

Mar 29 '06 #41

Joel Hedlund

>[*] I discovered a neat feature I didn't know my editor had: grepping for

"<[c:python-keyword>is"

Neat indeed. Which editor is that?

Thanks for a quick and comprehensive answer, btw.

Cheers!
/Joel

Mar 29 '06 #42

Duncan Booth

Joel Hedlund wrote:

[*] I discovered a neat feature I didn't know my editor had: grepping
for "<[c:python-keyword>is"

Neat indeed. Which editor is that?

Epsilon from www.lugaru.com. The drawback is that it costs real money
although you can try the beta for the next version until it is released.

Mar 29 '06 #43

Adam DePrince

On Mon, 2006-03-27 at 17:17 -0500, Terry Reedy wrote:

"Clemens Hepper" <et*******@gmx.net> wrote in message
news:e0*********@news2.open-news-network.org...
It's strange: python seem to cache constants from 0 to 99:

The Python specification allows but does not require such behind-the-scenes
implementation optimization hacks. As released, CPython 2.4 caches -5 to
99, I believe. In 2.5, the upper limit was increased to 256. The limits
are in a pair of #define statements in the int object source file. Anyone
who compiles from source can adjust as desired (though the corresponding
test will fail unless also adjusted ;-).
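
A small sketch of how this caching can show through in practice. The integers are built at run time with int() so that compile-time constant merging does not interfere; the specific values 100 and 100000 are arbitrary choices for the illustration.

```python
# Build the integers at run time so the compiler cannot share constants.
small_a, small_b = int("100"), int("100")
large_a, large_b = int("100000"), int("100000")

# Equality is defined by the language and holds in both cases.
print(small_a == small_b, large_a == large_b)

# Identity depends on whether the implementation happens to reuse objects.
# On CPython the first is typically True and the second False, but the
# language promises neither result.
print(small_a is small_b, large_a is large_b)
```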

I think the visibility of this implementation detail from Python code is an
example of a leaky abstraction. For more, see
http://www.joelonsoftware.com/articl...tractions.html

I disagree wholeheartedly with this. == and is are two very different
operators that have very different meaning. It just happens that the
logical operation

(a is b ) -> (a == b )

is always True.

There is no abstraction going on here; a==b is not an abstract version
of a is b. They are different operations.

a == b tests if the values of objects at a and b are equal. a and b
might point to the same darn object, or they might be different objects, but
the question we are asking is if they have the same value.

The is operator is different, you use it if you are interested in
introspecting the language.

Some people have noticed that 1 + 1 is 2 will return True and 1 + 100
is 101 returns False. This isn't a "leaky abstraction" unless you
wrongly consider is to be an analogy for ==.

Python has certain optimizations ... small strings and numbers are
"interned," that is canonical copies are maintained and efforts to
create fresh objects result in the old cached copies being returned,
albeit with higher reference counts. This saves memory and time; time
because for any intern-able objects the truth of the first test is a
realistic possibility.
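
For strings, interning can also be requested explicitly, which turns the identity test into something dependable. The sketch below assumes Python 3, where the function lives at sys.intern (in the Python 2 of this thread's era it was the builtin intern); the strings themselves are arbitrary.

```python
import sys

# Two equal strings built separately at run time.
a = "".join(["hello", " ", "world"])
b = "".join(["hello", " ", "world"])

equal = a == b  # True: same value
# Whether "a is b" holds at this point is an implementation detail.

# After explicit interning, equal strings map to one canonical object.
ia = sys.intern(a)
ib = sys.intern(b)
interned_same = ia is ib

print(equal, interned_same)  # True True
```

This is the only situation in which comparing strings with 'is' is backed by a guarantee, and even then both sides must have gone through the interning call.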

if a is b:
    return True
if hash( a ) != hash( b ):
    return False
.... Now do proper equality testing. ...
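
The outline above can be turned into a runnable function. This is only a sketch of the idea, with the hypothetical name fast_equal; it assumes both arguments are hashable and is not a description of how any particular interpreter implements ==.

```python
def fast_equal(a, b):
    """Equality test with an identity shortcut and a hash pre-check.

    Assumes both arguments are hashable; unhashable arguments such as
    lists would raise TypeError in the hash() step.
    """
    if a is b:
        return True    # same object: skip the value comparison entirely
    if hash(a) != hash(b):
        return False   # equal objects must have equal hashes
    return a == b      # hashes agree: do the proper equality test


word = "".join(["sp", "am"])
print(fast_equal("spam", word))      # True
print(fast_equal((1, 2), (1, 3)))    # False
```

The identity shortcut is only valid for objects that compare equal to themselves. For an object whose __eq__ says otherwise, fast_equal(x, x) returns True even though x == x is False.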

As for "joelonsoftware's" leaky abstraction article, I respectfully
disagree. His leaky abstractions are merely inappropriate analogies.
TCP is perfectly reliable with respect to its definition of
reliability.

As for wipers abstracting away the rain ...

Well over a decade ago I recall walking to lunch with my supervisor, a
truly masterful C programmer. We worked in Manhattan, a land where two
way streets are the exception. When crossing each street he would look
the wrong way, look the correct way and then stare the wrong way again.
Upon noticing my inquisitorial expression, he answered "A good
programmer always looks both ways when crossing a one way street."

I'm uncertain that a quip about abstracting away the rain would have
prompted the same adjective "masterful" now 10+ years in the future.

- Adam DePrince

Apr 3 '06 #44

Roy Smith

Adam DePrince <ad***********@gmail.com> wrote:

It just happens that the
logical operation

(a is b ) -> (a == b )

is always True.

Only for small values of "always". You can always do pathological
things with operators:

class Foo:
    def __eq__(self, other):
        return False

f = Foo()
print(f is f)
print(f == f)

frame:play$ ./is.py
True
False

This may even be useful. What if you were trying to emulate SQL's
NULL? NULL compares false to anything, even itself. To test for
NULLness, you have to use the special "is NULL" operator.
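
A rough sketch of such a NULL-like object. The names _Null, NULL and is_null are invented for this illustration, and returning False from both comparisons is a simplification of SQL's behaviour.

```python
class _Null:
    """Hypothetical NULL-like marker: no comparison with it is ever true."""

    def __eq__(self, other):
        return False

    def __ne__(self, other):
        return False

    def __repr__(self):
        return "NULL"


NULL = _Null()


def is_null(value):
    # Identity is the only reliable test, since == always answers False.
    return value is NULL


rows = [1, NULL, 3]
print(NULL == NULL)                           # False
print(is_null(NULL))                          # True
print([r for r in rows if not is_null(r)])    # [1, 3]
```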

Apr 3 '06 #45

Dave Hansen

On 3 Apr 2006 10:37:11 -0400 in comp.lang.python, ro*@panix.com (Roy
Smith) wrote:

Adam DePrince <ad***********@gmail.com> wrote:
It just happens that the
logical operation

(a is b ) -> (a == b )

is always True.

Only for small values of "always". You can always do pathological
things with operators:

class Foo:
    def __eq__(self, other):
        return False

f = Foo()
print(f is f)
print(f == f)

frame:play$ ./is.py
True
False

This may even be useful. What if you were trying to emulate SQL's
NULL? NULL compares false to anything, even itself. To test for
NULLness, you have to use the special "is NULL" operator.

Another instance where this may be useful is IEEE-754 NaN. I don't
have fpconst to verify if that's the case, but I would expect
NaN is NaN to be true, but NaN == NaN to be false.

Regards,
-=Dave

--
Change is inevitable, progress is not.

Apr 3 '06 #46

Jon Ribbens

In article <e0**********@panix2.panix.com>, Roy Smith wrote:

This may even be useful. What if you were trying to emulate SQL's
NULL? NULL compares false to anything, even itself.

Strictly speaking, comparing NULL to anything gives NULL, not False.

Apr 3 '06 #47

Duncan Booth

Adam DePrince wrote:

It just happens that the
logical operation

(a is b ) -> (a == b )

is always True.

That is incorrect:

inf = 1e300*1e300
nan = inf-inf
nan is nan, nan==nan

(True, False)
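
The same result can be reproduced without the overflow arithmetic. This sketch assumes a Python version where float("nan") and math.isnan are available, which postdates the interpreter used in this thread.

```python
import math

nan = float("nan")

print(nan is nan)       # True: one object compared with itself
print(nan == nan)       # False: IEEE-754 NaN is not equal to anything
print(math.isnan(nan))  # True: the intended way to test for NaN
print(nan in [nan])     # True: containment checks identity before equality
```

The last line shows that the built-in containers themselves rely on identity as a shortcut, which is why a NaN can be found in a list that holds that same object even though it never compares equal.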

Apr 3 '06 #48
