class with __len__ member fools boolean usage "if x:" ; bad coding style?

george young

[Python 2.3.3, x86 linux]
I had developed the habit of using the neat python form:
if someinstance:
    someinstance.memb()

because it seems cleaner than "if someinstance is not None".
{please no flames about "is not None" vs. "!= None" ...}

This seemed like a good idea at the time :(). Twice, recently, however,
as my app grew, I thought, hmm... it would make things clearer if I gave
this container class a __len__ member and maybe a __getitem__. Great,
code looks nicer now... crash, crash, many expletives deleted...

It's great to be able to say containerinstance[seq] instead of
containerinstance.steps[seq], but I've also changed the semantics of
(containerinstance) in a boolean context. My app breaks only in the
seldom case that the container is empty.
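
A minimal sketch of the effect described above. The class names `Steps` and `StepsWithLen` and the step values are made up for illustration; only the `steps` attribute comes from the post.

```python
class Steps:
    """Hypothetical container, before sequence emulation is added."""
    def __init__(self, steps=()):
        self.steps = list(steps)


class StepsWithLen(Steps):
    """The same container after adding __len__ and __getitem__."""
    def __len__(self):
        return len(self.steps)

    def __getitem__(self, index):
        return self.steps[index]


plain_empty = Steps()
sized_empty = StepsWithLen()
sized_full = StepsWithLen(["etch", "rinse"])

# Without __len__ (and without __nonzero__/__bool__) every instance is true.
print(bool(plain_empty))        # True
# With __len__, an empty instance is false, so "if x:" no longer
# means the same thing as "if x is not None:".
print(bool(sized_empty))        # False
print(bool(sized_full))         # True
print(sized_empty is not None)  # True
```

So any existing "if containerinstance:" test silently changes meaning the moment __len__ is added, and it only shows when the container happens to be empty.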

Obviously I know how to fix the code, but I'm wondering if this isn't
a message that "if containerinstance:" is not a good coding practice.
Or maybe that one should never *add* sequence emulation to a class
that's already in use. It may look like adding a __len__ and __getitem__
is just *extending* the functionality of the class, but in fact it
strongly *changes* the semantics of existing usage of the class.

I know the ref manual mentions this behaviour, but I begin to wonder at
the wisdom of a language design and common usage pattern of
(containerinstance) instead of (len(containerinstance)) and
(containerinstance is None) as a boolean expression.

Comments? Suggestions?

Jul 18 '05 #1


François Pinard

[george young]

> I had developed the habit of using the neat python form:
> if someinstance:
>     someinstance.memb()
>
> It's great to be able to say containerinstance[seq] instead of
> containerinstance.steps[seq], but I've also changed the semantics of
> (containerinstance) in a boolean context. My app breaks only in the
> seldom case that the container is empty.

Sequences, when empty, are False in boolean contexts. So, the empty
string, the empty list, the empty tuple are all False.

If you do not specify anything "special" (:-), your own objects are
always True. If you specify `__len__', Python will consider your object
in the spirit of a sequence, where zero length means False. If you do
not want Python to see your object in the spirit of a sequence in
boolean contexts, you might have to add a method:

def __nonzero__(self):
    return True

to tell Python that your own objects are always True. (You might of
course use `__nonzero__' to implement more subtle concepts of Truth and
Falsity.)
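
A minimal sketch of that suggestion, in case it helps. The `Container` class is hypothetical. One assumption worth flagging: the thread is about Python 2.3, where the hook is `__nonzero__`; Python 3 looks for `__bool__` instead, so the sketch defines both names.

```python
class Container(object):
    """Hypothetical container that stays true even when empty."""
    def __init__(self, steps=()):
        self.steps = list(steps)

    def __len__(self):
        return len(self.steps)

    def __getitem__(self, index):
        return self.steps[index]

    def __bool__(self):        # consulted by Python 3
        return True

    __nonzero__ = __bool__     # the name Python 2 consults


empty = Container()
print(len(empty))    # 0
print(bool(empty))   # True
```

The explicit truth method wins over __len__, so "if containerinstance:" keeps its old meaning while len() and indexing still work.
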

> Comments? Suggestions?

There is nothing wrong in writing `if x:', as long as you decide
yourself what it should mean. But you have to let Python know what
your decision was. If you do not say anything, Python has its ways for
guessing, which are chosen so to be useful on the average case, and also
well documented.

--
François Pinard http://www.iro.umontreal.ca/~pinard

Jul 18 '05 #2

John Roth

"george young" <gr*@ll.mit.edu> wrote in message
news:78**************************@posting.google.com...

> [Python 2.3.3, x86 linux]
> I had developed the habit of using the neat python form:
> if someinstance:
>     someinstance.memb()
>
> because it seems cleaner than "if someinstance is not None".
> {please no flames about "is not None" vs. "!= None" ...}
>
> This seemed like a good idea at the time :(). Twice, recently, however,
> as my app grew, I thought, hmm... it would make things clearer if I gave
> this container class a __len__ member and maybe a __getitem__. Great,
> code looks nicer now... crash, crash, many expletives deleted...
>
> It's great to be able to say containerinstance[seq] instead of
> containerinstance.steps[seq], but I've also changed the semantics of
> (containerinstance) in a boolean context. My app breaks only in the
> seldom case that the container is empty.
>
> Obviously I know how to fix the code, but I'm wondering if this isn't
> a message that "if containerinstance:" is not a good coding practice.

Almost. The message is that testing for None, however
you're doing it, is a Code Smell in the sense defined in
the Refactoring book. If some attribute is supposed to have
a Foo object, then it should have a Foo or a subclass of
Foo, not None.

Sometimes there's no way around it, but whenever you find
yourself testing for None, consider using a Null Object instead.
A Null Object is a subclass of the normal object you would
be expecting, but one that has methods and attributes that
handle the exceptional case cleanly.
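
A rough sketch of the Null Object idea described above. `Foo`, `NullFoo` and `run` are placeholder names, and the return values are invented purely so the behaviour is visible.

```python
class Foo(object):
    """Hypothetical 'real' object."""
    def __init__(self, name):
        self.name = name

    def memb(self):
        return "working on %s" % self.name


class NullFoo(Foo):
    """Null Object: same interface as Foo, but does nothing useful."""
    def __init__(self):
        Foo.__init__(self, name="")

    def memb(self):
        # The "nothing there" case is handled here, not at every call site.
        return ""


def run(foo):
    # No "if foo is not None:" test is needed; every Foo answers memb().
    return foo.memb()


print(run(Foo("wafer 7")))   # working on wafer 7
print(repr(run(NullFoo())))  # ''
```

Callers then never test for None at all, so the truth value of the object stops mattering.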

Of course, there are a couple of very pretty idioms for
handling optional parameters that depend on tests for None,
but they're definitely special cases, and they also break if the
real parameter can be False.
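
One guess at the kind of idiom meant here, sketched with hypothetical function names: the short "or" form replaces any false argument, while the explicit "is None" test only replaces a missing one.

```python
def append_step_or(step, steps=None):
    # "or" idiom: short, but replaces *any* false value, not just None.
    steps = steps or []
    steps.append(step)
    return steps


def append_step_is_none(step, steps=None):
    # Explicit test: only a missing argument gets the default.
    if steps is None:
        steps = []
    steps.append(step)
    return steps


shared = []                        # an empty, and therefore false, list
append_step_or("etch", shared)
print(shared)                      # [] (the caller's list was silently replaced)
append_step_is_none("etch", shared)
print(shared)                      # ['etch']
```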

John Roth

Jul 18 '05 #3

george young

"John Roth" <ne********@jhrothjr.com> wrote in message news:<10*************@news.supernews.com>...

"george young" <gr*@ll.mit.edu> wrote in message
news:78**************************@posting.google.com...
> ....
>
> Obviously I know how to fix the code, but I'm wondering if this isn't
> a message that "if containerinstance:" is not a good coding practice.

> Almost. The message is that testing for None, however
> you're doing it, is a Code Smell in the sense defined in
> the Refactoring book. If some attribute is supposed to have
> a Foo object, then it should have a Foo or a subclass of
> Foo, not None.
>
> Sometimes there's no way around it, but whenever you find
> yourself testing for None, consider using a Null Object instead.
> A Null Object is a subclass of the normal object you would
> be expecting, but one that has methods and attributes that
> handle the exceptional case cleanly.
>
> Of course, there are a couple of very pretty idioms for
> handling optional parameters that depend on tests for None,
> but they're definitely special cases, and they also break if the
> real parameter can be False.

Null Object seems like a perfect fit for this. I was unaware of it.
I read the original GOF book, but not much since then on patterns.
Thanks very much!

-- George

Jul 18 '05 #4

Heiko Wundram

On Tuesday, 29 June 2004 07:59, Peter Otten wrote:

> Now you have an object that neither behaves consistently as a boolean nor
> as a sequence, I fear you're in for even subtler bugs...

That isn't necessarily true... Given the following example, I'd say what
__nonzero__ and __len__ implement is quite understandable, and if documented,
the programmer isn't in for any bug:

<code>

import time

class ID(object):

    def __init__(self):
        self.__id = "test"

    def __len__(self):
        return len(self.__id)

class Host(ID):

    def __init__(self):
        ID.__init__(self)  # also set up the ID part, so len(host) works
        self.__timeout = time.time() + 30

    def __nonzero__(self):
        # True while the host has not yet timed out
        return self.__timeout >= time.time()

</code>

nonzero and len implement something completely different, where __len__ is an
operator on the underlying ID of a Host, and __nonzero__ is an operator on
the Host itself, to check whether the Host has timed out.

It doesn't make sense to have __nonzero__ refer to the ID (which is always
nonzero, a string), and it neither makes sense to have __len__ refer to the
Host (which doesn't have a length), so the situation here is pretty clear
(IMHO).

But, as always, documentation is better than guessing. ;)

Heiko.

Jul 18 '05 #5

Peter Otten

Heiko Wundram wrote:

> On Tuesday, 29 June 2004 07:59, Peter Otten wrote:
> > Now you have an object that neither behaves consistently as a boolean nor
> > as a sequence, I fear you're in for even subtler bugs...
>
> That isn't necessarily true... Given the following example, I'd say what
> __nonzero__ and __len__ implement is quite understandable, and if
> documented, the programmer isn't in for any bug:
>
> ....
>
> nonzero and len implement something completely different, where __len__ is
> an operator on the underlying ID of a Host, and __nonzero__ is an operator
> on the Host itself, to check whether the Host has timed out.

In Python, you don't normally check for a timeout (google for LBYL), you'd
rather throw an exception. This avoids problems like

h = Host()
if h:
    sleep(justLongEnoughForHostToTimeOut)
    h.someOperation() # fails due to timeout
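
For contrast, a rough sketch of the exception-based style. `HostTimeout`, `some_operation`, `contact` and the `lifetime` parameter are names invented for the example, not anything from Heiko's code.

```python
import time


class HostTimeout(Exception):
    """Hypothetical exception raised once a host's timeout has passed."""


class Host(object):
    def __init__(self, lifetime=30.0):
        self._deadline = time.time() + lifetime

    def some_operation(self):
        # The check happens at the moment of use, so there is no gap
        # between testing the host and operating on it.
        if time.time() > self._deadline:
            raise HostTimeout("host timed out")
        return "ok"


def contact(host):
    try:
        return host.some_operation()
    except HostTimeout:
        return "needs ping"


print(contact(Host(lifetime=30.0)))   # ok
print(contact(Host(lifetime=-1.0)))   # needs ping
```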

> It doesn't make sense to have __nonzero__ refer to the ID (which is always
> nonzero, a string), and it neither makes sense to have __len__ refer to
> the Host (which doesn't have a length), so the situation here is pretty
> clear (IMHO).

A __len__() without a __getitem__() method doesn't make sense to me. But
maybe your example is just too terse...

> But, as always, documentation is better than guessing. ;)

No amount of documentation can heal an unintuitive API.
The convention of using bool(o) as an abbreviation of o.isValid() for
non-sequences and of len(o) != 0 for sequences seems natural to me. Mixing
these two meanings or even adding "was this parameter provided" as a third
one will result in highly ambiguous code that is bound to break.

Peter

Jul 18 '05 #6

Donn Cave

In article <cb*************@news.t-online.com>,
Peter Otten <__*******@web.de> wrote:

> Heiko Wundram wrote:
>
> ....
>
> > But, as always, documentation is better than guessing. ;)
>
> No amount of documentation can heal an unintuitive API.
> The convention of using bool(o) as an abbreviation of o.isValid() for
> non-sequences and of len(o) != 0 for sequences seems natural to me. Mixing
> these two meanings or even adding "was this parameter provided" as a third
> one will result in highly ambiguous code that is bound to break.

I agree, but I think part of the problem is trying to milk too
much from all of these features.

The time to implement __nonzero__, __getitem__ etc. is when
there's a clear need for them, like an application context
where this polymorphism is needed to make things work. If you
do it because you think it looks nicer, don't complain when
it breaks things because you brought it on yourself with this
fuzzy thinking. (Using "you" in the generic sense.)

Donn Cave, do**@u.washington.edu

Jul 18 '05 #7

Heiko Wundram

On Tuesday, 29 June 2004 20:34, Peter Otten wrote:

> In Python, you don't normally check for a timeout (google for LBYL), you'd
> rather throw an exception. This avoids problems like
>
> h = Host()
> if h:
>     sleep(justLongEnoughForHostToTimeOut)
>     h.someOperation() # fails due to timeout

The operations on the host don't fail if the timeout has expired, in my use
case. It's just that a host has a timeout, which signals the surrounding
code, that this host needs to be contacted in the next run of ping signals.

What I can do to get these hosts now looks like the following:

ping_hosts = [h for h in hosts if not h]

That's what I call nice and concise, and at least for me the meaning is clear
by just looking at the code. If the host has expired (not host), add it to
the list of hosts to ping now.

> A __len__() without a __getitem__() method doesn't make sense to me. But
> maybe your example is just too terse...

Why should you need a __getitem__() if you have __len__() defined? In my use
case, the ID (of a host/data-item, etc.) is not retrievable by single
character (that would make no sense), but the length of the ID is
significant, as it can signal important information about the protocol to use
to contact the host, etc.

So, I can now write:

someid = host
myid_old = myhost.old_id
myid_new = myhost.new_id

if len(someid) == 26:
    dist = myid_old ^ someid
elif len(someid) == 30:
    dist = myid_new ^ someid
else:
    raise ValueError("Invalid host.")

(where __xor__ is again a method defined on two IDs, returning the numerical
distance, the binary xor between the two numbers)

I would never call this unintuitive, as in effect hosts are just IDs which
have additional data (namely the IP/port pair), and can be used in any
context in the program where IDs are wanted. And IDs can be asked for their
length (to decide what to do with it). This doesn't just mean Hosts, also
Data Items are IDs with additional data, which can be used just as well here.

> No amount of documentation can heal an unintuitive API.
> The convention of using bool(o) as an abbreviation of o.isValid() for
> non-sequences and of len(o) != 0 for sequences seems natural to me. Mixing
> these two meanings or even adding "was this parameter provided" as a third
> one will result in highly ambiguous code that is bound to break.

I can understand that in the normal case you would want to code something
either as a sequence, or as a non-sequence. But, in this case, you have two
classes, which have different use cases, but are derived from another (and
deriving a Host from an ID isn't strange, at least for me). And for code
which does something with the ID (and is specifically marked as such), it's
pretty fair to use the ID part of the class which is passed in (which in this
case are __len__ and __xor__) to make it concise, while where you use hosts,
you take the host protocol to get at the host data (__nonzero__).

I don't see anything unintuitive in this... Actually, it makes the code look
cleaner, IMHO.

Heiko.

Jul 18 '05 #8

Donn Cave

In article <ma**************************************@python.org>,
Heiko Wundram <he*****@ceosg.de> wrote:

> The operations on the host don't fail if the timeout has expired, in my use
> case. It's just that a host has a timeout, which signals the surrounding
> code, that this host needs to be contacted in the next run of ping signals.
>
> What I can do to get these hosts now looks like the following:
>
> ping_hosts = [h for h in hosts if not h]
>
> That's what I call nice and concise, and at least for me the meaning is clear
> by just looking at the code. If the host has expired (not host), add it to
> the list of hosts to ping now.
>
> ....
>
> I can understand that in the normal case you would want to code something
> either as a sequence, or as a non-sequence. But, in this case, you have two
> classes, which have different use cases, but are derived from another (and
> deriving a Host from an ID isn't strange, at least for me). And for code
> which does something with the ID (and is specifically marked as such), it's
> pretty fair to use the ID part of the class which is passed in (which in this
> case are __len__ and __xor__) to make it concise, while where you use hosts,
> you take the host protocol to get at the host data (__nonzero__).
>
> I don't see anything unintuitive in this... Actually, it makes the code look
> cleaner, IMHO.

It's the __nonzero__ part that hurts. Insofar as you have
explained it, an expired() method would be much clearer
than __nonzero__(), and it would make more sense. e.g.,

ping_hosts = [h for h in hosts if h.expired()]

This approach lets you have more than one boolean attribute,
because they can each have their own name.
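
A small sketch of the named-method approach. Only expired() comes from the suggestion above; `has_address`, the `address` attribute and the `lifetime` parameter are assumptions added to show a second boolean property living alongside it.

```python
import time


class Host(object):
    """Hypothetical host with named predicates instead of __nonzero__."""
    def __init__(self, address, lifetime=30.0):
        self.address = address
        self._deadline = time.time() + lifetime

    def expired(self):
        return time.time() > self._deadline

    def has_address(self):
        return bool(self.address)


hosts = [
    Host("10.0.0.1"),                 # still fresh
    Host("10.0.0.2", lifetime=-1.0),  # already expired
    Host("", lifetime=-1.0),          # expired, and nowhere to send a ping
]

ping_hosts = [h for h in hosts if h.expired()]
pingable = [h for h in hosts if h.expired() and h.has_address()]

print(len(ping_hosts))                # 2
print([h.address for h in pingable])  # ['10.0.0.2']
```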

Donn Cave, do**@u.washington.edu

Jul 18 '05 #9

Jeff Epler

> On Tuesday, 29 June 2004 20:34, Peter Otten wrote:
> > A __len__() without a __getitem__() method doesn't make sense to me. But
> > maybe your example is just too terse...

On Wed, Jun 30, 2004 at 12:53:15AM +0200, Heiko Wundram wrote:
> Why should you need a __getitem__() if you have __len__() defined? In my use
> case, the ID (of a host/data-item, etc.) is not retrievable by single
> character (that would make no sense), but the length of the ID is
> significant, as it can signal important information as on the protocol to use
> to contact the host, etc.

I agree with Herr Otten. __len__ is intended to be implemented by
container objects. http://docs.python.org/ref/sequence-types.html
If your object doesn't permit item access (o[i] or o[k]) then I think
people will be surprised to find that len(o) does not cause a TypeError.
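
To make the expectation concrete, a tiny sketch with two made-up classes, `NoLen` and `LenOnly`. `LenOnly` stands in for an object shaped like the ID example: it answers len() but not indexing.

```python
class NoLen(object):
    """Hypothetical object with neither __len__ nor item access."""


class LenOnly(object):
    """Defines __len__ but no __getitem__."""
    def __len__(self):
        return 4


def describe(obj):
    try:
        return len(obj)
    except TypeError:
        return "no len()"


print(describe(NoLen()))     # no len()
print(describe(LenOnly()))   # 4

try:
    LenOnly()[0]
except TypeError:
    print("len() works, indexing does not")
```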

Of course, you're permitted to write any program you like, and Python
will even execute some of them and give results that please you. You're
not required to write only programs that don't surprise anybody.

Jeff


Jul 18 '05 #10

Greg Ewing

Heiko Wundram wrote:

> ping_hosts = [h for h in hosts if not h]
>
> That's what I call nice and concise, and at least for me the meaning is clear
> by just looking at the code.

I suspect it's only clear to you because you wrote it.
I would find that piece of code *extremely* confusing --
it looks like it's going to return a list full of Nones!

If, on the other hand, it were written something like

ping_hosts = [h for h in hosts if h.expired()]

the meaning would be crystal clear to everyone, I think.

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg

Jul 18 '05 #11
