Relying on the behaviour of empty container in conditional statements

Hi,

my colleagues and I recently held a coding style review.
All of the code we produced is used in-house on a commercial project.
One of the minor issues I raised was the common idiom of specifying:

<pre>
if len(x) > 0:
    do_something()
</pre>
Instead of using the language-defined behaviour, as stated by PEP8:

<pre>
- For sequences, (strings, lists, tuples), use the fact that empty
  sequences are false.

  Yes: if not seq:
       if seq:

  No: if len(seq)
      if not len(seq)
</pre>

Without wishing to start a flame war, what are others' opinions on
this? Is using "len" safer? If so, why?

Arguments that have been presented for using <code>len(x) > 0</code> to
test emptiness of a container include:
- It's safer
- Not relying on weird behaviour of the language
- Explicit is better than implicit (as stated by the 'this' module, Zen
of Python)

My own feeling is that I am willing to work with the behaviours defined
by Python, and treat the use of len in these cases as excessive
duplication (this is, however, quite a minor point, I agree).

Note that I have much more experience with the language (6-7 years),
whilst the majority of my colleagues have about 1-2 years' experience.
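
For the built-in containers the two spellings should give the same answer. A minimal sketch of that equivalence follows; the sample values and the helper name `describe` are only illustrative and do not come from our code base:

```python
# For built-in containers, truth testing and an explicit length check agree.
samples = ["", "abc", [], [0], (), (None,), {}, {"k": 1}, set(), {0}]

for obj in samples:
    # bool(obj) is what "if obj:" evaluates; len(obj) > 0 is the longer spelling.
    assert bool(obj) == (len(obj) > 0)


def describe(seq):
    """Return 'empty' or 'non-empty' using the PEP 8 idiom (hypothetical helper)."""
    if not seq:
        return "empty"
    return "non-empty"
```

Note that a list holding a single falsy item, such as `[0]`, still counts as non-empty: the test looks at the container, not at its contents.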

Jul 11 '06 #1
9 Replies


horizon5 wrote:
> Without wishing to start a flame war, what are others' opinions on
> this? Is using "len" safer? If so, why?

I fail to see why it would be safer. But I clearly see a drawback to
explicitly testing the length of objects: it doesn't work on unsized
objects (like None or 0 or False etc.), so it makes code less generic
(whether this is a problem or not in a given context depends of course on
the context).
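
The drawback can be sketched as follows. The three helper names are hypothetical and exist only for this illustration:

```python
def has_items_len(x):
    """Emptiness test spelled with len(); only works for sized objects."""
    return len(x) > 0


def has_items_truth(x):
    """Emptiness test spelled with truth testing; works for any object."""
    return bool(x)


def raises_type_error(func, arg):
    """Report whether calling func(arg) raises TypeError."""
    try:
        func(arg)
    except TypeError:
        return True
    return False
```

With these, `raises_type_error(has_items_len, None)` reports a failure, while `has_items_truth(None)` simply answers false. Whether silently accepting None is desirable depends, as said, on the context.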

> Arguments that have been presented for using <code>len(x) > 0</code> to
> test emptiness of a container include:
> - It's safer

cf. above

> - Not relying on weird behaviour of the language

It's not a "weird behaviour", it's a well-defined and documented
behaviour. And it's not specific to Python.

> - Explicit is better than implicit (as stated by the 'this' module, Zen
> of Python)

Given that this behaviour is well defined and documented, using "if
[not] seq" is perfectly explicit.

> My own feeling is that I am willing to work with the behaviours defined
> by Python,

and use the common Python idiom.

> and treat the use of len in these cases as excessive
> duplication (this is, however, quite a minor point, I agree).

It's also not idiomatic and less generic.

> Note that I have much more experience with the language (6-7 years),
> whilst the majority of my colleagues have about 1-2 years' experience.

Do they still write code like the following?

    if someBooleanExpression == True:
        return True
    else:
        return False

If yes, time to look for another place to work IMHO. Else, what do they
think of the above snippet, vs:

    return someBooleanExpression

Do they think the first one is safer and/or more explicit ?-)

--
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in 'o****@xiludom.gro'.split('@')])"
Jul 11 '06 #2

On Tue, 11 Jul 2006 04:52:42 -0700, horizon5 wrote:
> Without wishing to start a flame war, what are others' opinions on
> this? Is using "len" safer? If so, why?

What do you mean by "safer"? What sort of bad consequences are you trying
to avoid? What sort of errors do your colleagues think writing "if seq"
encourages?

Just waving hands in the air and saying "Doing foo is safer" is
meaningless. Safer than what? What are the costs and benefits of doing
foo? What are the consequences of failure?

> Arguments that have been presented for using <code>len(x) > 0</code> to
> test emptiness of a container include:
> - It's safer

On the contrary, the longer test isn't safer, it is more dangerous because
there is a whole family of potential bugs that can occur when using that
idiom, but which can't occur in the shorter test.

For example, there are typos like writing "len(x) < 0" by mistake. (Or,
for that matter, len(x) > 9.)

Do you have a function ln defined in the current scope? Then you'd better
hope that you don't mistype len(x) as ln(x).

Sure, the chances of these sorts of errors are small -- but not zero.
Write enough tests, and you will surely make them. But they are impossible
to make in the shorter "if x" idiom.

> - Not relying on weird behaviour of the language

There is nothing weird about it. Languages which don't accept arbitrary
objects as truth values are weird. Every object should know whether or not
it is nonempty/true or empty/false.

> - Explicit is better than implicit (as stated by the 'this' module, Zen
> of Python)

"if x:" is explicit. It is explicitly asking, does x evaluate as true in a
Boolean context or as false?

In an object-oriented framework, it is bad practice for the developer to
concern himself with what makes an object true or false. Instead, you
should simply ask the object, are you true?

What happens when you decide that using a list (or tuple) for x is not the
right way to solve your problem? You change it to something like, say, a
binary tree. Now len(x) > 0 is undefined, because trees don't have a
length. You could ask if height(x) > 0 but calculating the height of a
tree is, in general, expensive. Better to simply ask the tree if it is
true or false.

Or, you subclass list to use a sentinel value. There may be some good
reason for not wanting to redefine the __len__ method of the subclass, so
the length of the subclassed list is ALWAYS positive, never zero. If you
have peppered your code with tests like "if len(x) > 0", you're in
trouble. You're in even greater trouble if some of those tests might be
called by *either* regular lists or by the new subclass. Now you have to
start turning your tests into something awful like this:

    if (x.__class__ == ListWithSentinal and len(x) > 1) or len(x) > 0:

But if the subclass redefines the __nonzero__ method, and you use "if x"
as your test, you're done. One change, in one place, compared to
potentially hundreds of changes scattered all over your code.
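
A rough sketch of that sentinel scenario, under stated assumptions: the class name `ListWithSentinel`, its `SENTINEL` attribute and the `process` function are made up for illustration, and the code uses the Python 3 hook `__bool__`, which replaced the Python 2 name `__nonzero__`.

```python
class ListWithSentinel(list):
    """Hypothetical list subclass that always ends with a sentinel item."""

    SENTINEL = object()

    def __init__(self, items=()):
        super().__init__(items)
        # The sentinel means len() is never zero for this class.
        self.append(self.SENTINEL)

    def __bool__(self):
        # True only if there is something besides the sentinel.
        return len(self) > 1

    __nonzero__ = __bool__  # Python 2 spelling of the same hook


def process(x):
    """Caller code: one truth test that works for both kinds of list."""
    return "has data" if x else "empty"
```

Here `process` needs no change when a plain list is swapped for the subclass, whereas a caller written with `len(x) > 0` would treat a sentinel-only list as non-empty. The sketch ignores details a real subclass would need, such as keeping the sentinel last after later appends.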

There are times where your code is constrained by your data type, e.g.
tests like "if len(x) > 5:" or similar. But they are comparatively rare.
Why force simple true/false tests into that idiom?

> My own feeling is that I am willing to work with the behaviours defined
> by Python, and treat the use of len in these cases as excessive
> duplication (this is, however, quite a minor point, I agree).
>
> Note that I have much more experience with the language (6-7 years),
> whilst the majority of my colleagues have about 1-2 years' experience.

Sounds like they are still programming C or Java in Python.

Good luck teaching these young whipper-snappers. *wink*

--
Steven.

Jul 11 '06 #3

horizon5 wrote:
> [...]
> Without wishing to start a flame war, what are others' opinions on
> this? Is using "len" safer? If so, why?

All objects evaluate to a boolean, but not all objects support len.
So if someone passes x = iter([]) to your method, then len(x) will
result in an exception, whereas 'if x:' will happily (and wrongly)
answer true.
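
A small sketch of the iterator case. The helper `is_exhausted` is a hypothetical name, and it is only one possible way to test an iterator; it consumes an item when the iterator is not empty.

```python
empty_iter = iter([])

# A list iterator defines neither __len__ nor __bool__, so it is always true.
assert bool(empty_iter) is True

# len() refuses the iterator instead of guessing.
try:
    len(empty_iter)
    len_failed = False
except TypeError:
    len_failed = True


def is_exhausted(iterator):
    """Hypothetical helper: test emptiness by trying to pull one item.

    Note that a successful pull consumes that item from the iterator.
    """
    sentinel = object()
    return next(iterator, sentinel) is sentinel
```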

On the other hand, maybe you should be able to accept iterators, and
then the test would look different anyway.

> My own feeling is that I am willing to work with the behaviours defined
> by Python, and treat the use of len in these cases as excessive
> duplication (this is, however, quite a minor point, I agree).

I find

    if parameters:

perfectly acceptable, as it allows parameters to be None as well.
On the other hand, I can't stand interpreting integers as booleans, so I
wouldn't write

    if count:

Everyone has their personal quirks.

> Note that I have much more experience with the language (6-7 years),
> whilst the majority of my colleagues have about 1-2 years' experience.

I guess they're coming from Java and not from Perl :-)

Daniel
Jul 11 '06 #4

On Tuesday 11 July 2006 13:52, horizon5 wrote:
> Arguments that have been presented for using <code>len(x) > 0</code> to
> test emptiness of a container include:
> - It's safer
> - Not relying on weird behaviour of the language
> - Explicit is better than implicit (as stated by the 'this' module, Zen
> of Python)

Too bad.
From the docs:
"""
__nonzero__(self)
Called to implement truth value testing, and the built-in operation bool();
should return False or True, or their integer equivalents 0 or 1. When this
method is not defined, __len__() is called, if it is defined (see below). If
a class defines neither __len__() nor __nonzero__(), all its instances are
considered true.
"""

So bool(container) *is* the test for emptiness for all containers in
Python.
What is weird is to not follow the semantics of the language.

"if len(container):" means "if the container's length is not zero", while
"if container:" means "if the container is not empty".
Using the second is far better because any given container can implement a
faster algorithm to test its emptiness (the __nonzero__ method for any
container in Python).
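
The lookup order quoted above can be sketched like this. All three class names are hypothetical, and the code uses `__bool__`, which is the Python 3 name for the `__nonzero__` hook in the quoted documentation.

```python
class OnlyLen:
    """Defines __len__ only: truth testing falls back to len() != 0."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n


class Neither:
    """Defines neither hook: every instance is considered true."""


class LinkedList:
    """Hypothetical singly linked list: __len__ is O(n), __bool__ is O(1)."""

    def __init__(self, items=()):
        self.head = None
        # Build (item, next_node) pairs from the back so order is preserved.
        for item in reversed(list(items)):
            self.head = (item, self.head)

    def __len__(self):
        count, node = 0, self.head
        while node is not None:
            count += 1
            node = node[1]
        return count

    def __bool__(self):
        # No need to walk the whole list just to know whether it is empty.
        return self.head is not None
```

For `LinkedList`, "if container:" answers in constant time, while "if len(container) > 0:" walks every node first.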
--
_____________

Maric Michaud
_____________

Aristote - www.aristote.info
3 place des tapis
69004 Lyon
Tel: +33 426 880 097
Jul 11 '06 #5

horizon5 wrote:
> Note that I have much more experience with the language (6-7 years),
> whilst the majority of my colleagues have about 1-2 years' experience.

I've been programming in python for years and I've always used this
form

if not seq:
if seq:

rather than the other and never had any problems.

Anyone presenting arguments in favor of the len() form is IMHO, not
"getting it". AFAIK, python is not "smart" enough to optimize away the
(totally unnecessary) call to len(), so the programmer should do it for
herself.

The "pythonic" form is safe, not weird, and just as explicit.

There's no more point to using the len() form than there is to saying
"seq[len(seq)-1]" rather than just "seq[-1]" to get the last item of a
sequence.

My $0.02

Peace,
~Simon

Jul 11 '06 #6

Simon Forman wrote:
> I've been programming in python for years and I've always used this
> form
>
> if not seq:
> if seq:
>
> rather than the other and never had any problems.
>
> Anyone presenting arguments in favor of the len() form is IMHO, not
> "getting it". AFAIK, python is not "smart" enough to optimize away the
> (totally unnecessary) call to len(), so the programmer should do it for
> herself.
>
> The "pythonic" form is safe, not weird, and just as explicit.

I know that that is the consensus, and I mostly use the form without len(),
but somehow I still feel it is not as explicit. In

    if seq:

there is no distinction between seq being None on the one hand and seq
being a valid empty sequence on the other hand.

I feel that that is an important distinction, and it's the reason I find
myself using len() from time to time (even though I can't think of a use
case right now).

--
If I have been able to see further, it was only because I stood
on the shoulders of giants. -- Isaac Newton

Roel Schroeven
Jul 11 '06 #7

Interesting replies, thank you all.
Since the language defines the behaviour, I'm all for using it (when
appropriate).

> there is no distinction between seq being None on the one hand and seq
> being a valid empty sequence on the other hand.
>
> I feel that that is an important distinction, and it's the reason I find
> myself using len() from time to time (even though I can't think of a use
> case right now).

Depends on the context.

    def action1(container):
        if container:
            do_something()
        else:
            do_something_else()

    def action2(mutator):
        container = list()
        mutator(container)
        if container:
            do_something()
        else:
            do_something_else()

    def action3(container):
        n = len(container)
        if n > 0:
            report("%i items left in container" % n)
        else:
            alert("container is empty")

In these examples:
action1 receives the container, so we *might* have received a None, in
which case we would not get the error that we might expect. The meaning is
lost, so action1 could say:

    if container is not None and container:

Or it could be bad design, and one could compose do_something and
do_something_else to take the container reference and dispatch
accordingly (these aren't concrete examples, so it will depend on context).

In action2, however, the function is responsible for creating the
container, and thus (with good unit tests, as always) we should be able
to assume that mutator does not destroy the reference to container, and
the if is fine.

In action3, we have a use for the length of container, so we might as
well use it.

So, IMHO, there's no problem with using the behaviour of the language,
where appropriate.
As other contributors have mentioned, using the defined behaviour
(as opposed to invoking len)
can alleviate the need to make as many changes.
The problem for me is: how to communicate the benefits of adhering to
a language, rather than fighting against it, either because it doesn't
feel right or is not explicit enough for people who may not be aware of
the language's behaviour. Of course, over time, we read docs, and come
to learn to love the things a language gives us (esp. in Python ;).

I don't think changing jobs is the answer; who says that in my next job
they won't hire new people who aren't veteran Pythonistas?
I'm sure the long-term solution lies in education, but I'm not sure
what the short-term solution is.

And don't I sound arrogant? Programmers feel threatened by someone who
has more experience, and feel inadequate (I did, at least). So, perhaps
I should read up on some communication case studies. Can anyone point
me to such resources?

Thanks

Jul 11 '06 #8

On Tue, 11 Jul 2006 19:43:57 +0000, Roel Schroeven wrote:
> I know that that is the consensus, and I mostly use the form without len(),
> but somehow I still feel it is not as explicit. In
>
> if seq:
>
> there is no distinction between seq being None on the one hand and seq
> being a valid empty sequence on the other hand.

If seq can be None as well as a sequence, doing a test "if len(seq) > 0"
won't save you because len(None) will fail. You need an explicit test
for seq being None:

    if seq is not None and len(seq) > 0

Or even better:

    if seq

which Just Works regardless of the type of seq.
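
If None and an empty sequence really do need different handling, the None check can be spelled out while emptiness is still left to truth testing. A minimal sketch, where the function name `summarize` and its return strings are invented for illustration:

```python
def summarize(seq=None):
    """Hypothetical function whose argument may be None or a sequence."""
    if seq is None:
        # Explicit identity test: "nothing was passed" is its own case.
        return "no sequence given"
    if not seq:
        # Truth test: a real but empty sequence.
        return "empty sequence"
    return "%d item(s)" % len(seq)
```

Here len() is used only where the count itself is needed, not as the emptiness test.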

> I feel that that is an important distinction, and it's the reason I find
> myself using len() from time to time (even though I can't think of a use
> case right now).

If you are writing code where the argument in question could be None or a
list, then obviously you need to explicitly test for it being None -- but
that doesn't mean you have to explicitly test for the length of the list.
Likewise if the argument could be an integer as well as a sequence, you
need a more complex test again:

    if x is not None and (isinstance(x, int) and x > 0) or len(x) > 0

(and I hope I got that right... so many potential bugs in that "safer"
code)

Or, the right way to do it:

    if x

which Just Works for any x without worrying about type-checking or
combining boolean expressions or the precedence of operators.

Explicit doesn't mean you have to spell out every last detail. We don't
normally write:

    x.__getitem__(0)              instead of  x[0]
    s.split(None, -1)             instead of  s.split()
    range(0, 100, 1)              instead of  range(100)
    locals().__getitem__('name')  instead of  name

because they are "more explicit". There are times when more is too much.

The idiom "if seq" is a well-understood, well-defined, explicit test of
whether the sequence is true in a Boolean context, which for many
containers (lists, tuples, dicts, etc.) corresponds to them having
non-zero length, but that is not an invariant. There are potential
container types that either don't define a length or always have a
non-zero length even when they are false.
--
Steven.

Jul 11 '06 #9

Steven D'Aprano wrote:
> If seq can be None as well as a sequence, doing a test "if len(seq) > 0"
> won't save you because len(None) will fail. You need an explicit test
> for seq being None:
>
> if seq is not None and len(seq) > 0
>
> Or even better:
>
> if seq
>
> which Just Works regardless of the type of seq.

Yes, true.

I agree that testing in a Boolean context works best in those cases.
After a good night's sleep I remember why I felt uneasy doing it like
that: I feel that 'if seq' should be synonymous with 'if seq is not None',
but I can't explain why. No rational reasons, I think; it's probably just
from being used to C and C++, where 'if (p)' in pointer contexts is used
as a synonym for 'if (p != NULL)'.

In general I don't have too many problems using Python idioms instead of
C or C++ idioms, but apparently my years of experience in
these languages sometimes show through in Python. Luckily my BASIC habits
died out long ago.

--
If I have been able to see further, it was only because I stood
on the shoulders of giants. -- Isaac Newton

Roel Schroeven
Jul 12 '06 #10
