# Iterator length

Often I need to know the length of an iterator. Here is a silly example:
>>> l = (i for i in xrange(100) if i & 1)
len() can't tell me:
>>> len(l)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'generator' has no len()

This is a bad solution, since it may need too much memory:
>>> len(list(l))
This is a simple solution in a modern Python (run on a fresh generator,
because the previous call has already consumed l):
>>> sum(1 for _ in l)
50

This is a faster solution (and Psyco helps even more):

def leniter(iterator):
    """leniter(iterator): return the length of an iterator,
    consuming it."""
    if hasattr(iterator, "__len__"):
        return len(iterator)
    nelements = 0
    for _ in iterator:
        nelements += 1
    return nelements

Would it be a good idea to extend the built-in len function to cover
this situation too?

Bye,
bearophile

Jan 18 '07 #1
be************@lycos.com wrote:
> Often I need to tell the len of an iterator, this is a stupid example:
> >>> l = (i for i in xrange(100) if i & 1)
>
> len isn't able to tell it:
> >>> len(l)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: object of type 'generator' has no len()
>
> This is a bad solution, it may need too much memory, etc:
> >>> len(list(l))
>
> This is a simple solution in a modern Python:
> >>> sum(1 for _ in l)
> 50
>
> This is a faster solution (and Psyco helps even more):
>
> def leniter(iterator):
>     """leniter(iterator): return the length of an iterator,
>     consuming it."""
>     if hasattr(iterator, "__len__"):
>         return len(iterator)
>     nelements = 0
>     for _ in iterator:
>         nelements += 1
>     return nelements
>
> Is it a good idea to extend the functionalities of the built-in len
> function to cover such situation too?
>
> Bye,
> bearophile

Is this a rhetorical question? If not, try this:
>>> x = (i for i in xrange(100) if i & 1)
>>> if leniter(x): print x.next()

George

Jan 19 '07 #2
George Sakkis:
> Is this a rhetorical question? If not, try this:

It wasn't a rhetorical question.

> >>> x = (i for i in xrange(100) if i & 1)
> >>> if leniter(x): print x.next()

What's your point? Maybe you mean that it consumes the given iterator?
I am aware of that; it's written in the function docstring too. But
sometimes you don't need the elements of a given iterator, you just
need to know how many elements it has. A very simple example:

s = "aaabbbbbaabbbbbb"
from itertools import groupby
print [(h,leniter(g)) for h,g in groupby(s)]
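
For readers on Python 3, the run-length idea above can be sketched as follows. This is only a sketch under stated assumptions: `leniter` is restated here in Python 3 syntax so that the block runs on its own, and `run_lengths` is a hypothetical helper name that does not appear in the thread.

```python
from itertools import groupby


def leniter(iterator):
    """Return the number of items in an iterator, consuming it."""
    if hasattr(iterator, "__len__"):
        return len(iterator)
    nelements = 0
    for _ in iterator:
        nelements += 1
    return nelements


def run_lengths(s):
    # Hypothetical helper: pair each run's key with the size of its group.
    # Each group is a lazy iterator, so it is counted without building a list.
    return [(key, leniter(group)) for key, group in groupby(s)]


print(run_lengths("aaabbbbbaabbbbbb"))
# [('a', 3), ('b', 5), ('a', 2), ('b', 6)]
```

Each group is counted before `groupby` advances to the next key, which matters because advancing invalidates the previous group.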

Bye,
bearophile

Jan 19 '07 #3
be************@lycos.com writes:
> But sometimes you don't need the elements of a given iterator, you
> just need to know how many elements it has.

AFAIK, the iterator protocol doesn't allow for that.

Bear in mind, too, that there's no way to tell from outside that an
iterator even has a finite length; also, many finite-length iterators
have termination conditions that preclude knowing the number of
iterations until the termination condition actually happens.
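
Both cases can be illustrated with a small Python 3 sketch. The names `squares_below`, `naturals` and `finite` are made up for this example; only `itertools.count` and `itertools.takewhile` come from the standard library.

```python
from itertools import count, takewhile


def squares_below(limit):
    # Finite iterator whose end is only discovered by running it:
    # it stops at the first square that reaches the limit.
    return takewhile(lambda n: n < limit, (k * k for k in count()))


naturals = count()           # infinite: counting its items would never finish
finite = squares_below(50)

# Neither iterator advertises a length.
print(hasattr(naturals, "__len__"))  # False
print(hasattr(finite, "__len__"))    # False

# The length of the finite one is known only after consuming it.
print(sum(1 for _ in finite))        # 8
```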

--
\ "When a well-packaged web of lies has been sold to the masses |
`\ over generations, the truth will seem utterly preposterous and |
_o__) its speaker a raving lunatic." -- Dresden James |
Ben Finney

Jan 19 '07 #4
At Thursday 18/1/2007 20:26, be************@lycos.com wrote:
> def leniter(iterator):
>     """leniter(iterator): return the length of an iterator,
>     consuming it."""
>     if hasattr(iterator, "__len__"):
>         return len(iterator)
>     nelements = 0
>     for _ in iterator:
>         nelements += 1
>     return nelements
>
> Is it a good idea to extend the functionalities of the built-in len
> function to cover such situation too?

I don't think so, because it may consume the iterator, and that's a
big side effect that one would not expect from the built-in len().
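
A short Python 3 sketch of that side effect, using the generator from the first post with `range` in place of `xrange`. The helper name `odd_numbers` is an assumption made for this example.

```python
def odd_numbers():
    # Python 3 spelling of the generator from the first post.
    return (i for i in range(100) if i & 1)


it = odd_numbers()
counted = sum(1 for _ in it)   # counting walks the generator to its end
leftover = list(it)            # so nothing is left afterwards

print(counted)                 # 50
print(leftover)                # []
print(next(it, "exhausted"))   # exhausted
```

A caller who expected to read the elements after asking for the length would find none.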
--
Gabriel Genellina
Softlab SRL


Jan 19 '07 #5
On Thu, 18 Jan 2007 16:55:39 -0800, bearophileHUGS wrote:
> What's your point? Maybe you mean that it consumes the given iterator?
> I am aware of that, it's written in the function docstring too. But
> sometimes you don't need the elements of a given iterator, you just
> need to know how many elements it has. A very simple example:
>
> s = "aaabbbbbaabbbbbb"
> from itertools import groupby
> print [(h,leniter(g)) for h,g in groupby(s)]

s isn't an iterator. It's a sequence, a string, and an iterable, but not
an iterator.

I hope you know what sequences and strings are :-)

An iterable is anything that can be iterated over -- it includes sequences
and iterators.

An iterator, on the other hand, is something with the iterator protocol,
that is, it has a next() method and raises StopIteration when it's done.
>>> s = "aaabbbbbaabbbbbb"
>>> s.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'str' object has no attribute 'next'

An iterator should return itself if you pass it to iter():
>>> iter(s) is s
False
>>> it = iter(s); iter(it) is it
True
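
The protocol described above can be sketched with a tiny hand-written iterator. This assumes Python 3, where the method is spelled `__next__` rather than `next`; the `Countdown` class is a hypothetical example, not something from the thread.

```python
class Countdown:
    """Hypothetical iterator that yields n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # An iterator returns itself from iter().
        return self

    def __next__(self):
        # Python 3 name for the next() method mentioned in the post.
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


it = Countdown(3)
print(iter(it) is it)   # True
print(list(it))         # [3, 2, 1]

s = "aaabbbbbaabbbbbb"
print(iter(s) is s)     # False: a str is iterable, but it is not an iterator
```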

You've said that you understand len of an iterator will consume the
iterator, and that you don't think that matters. It might not matter in
a tiny percentage of cases, but it will certainly matter all the rest
of the time!

And let's not forget, in general you CAN'T calculate the length of an
iterator, not even in theory:

import random

def randnums():
    while random.random() != 0.123456789:
        yield "Not finished yet"
    yield "Finished"

What should len(randnums()) return?

One last thing which people forget... iterators can have a length, the
same as any other object, if they have a __len__ method:
>>> s = "aaabbbbbaabbbbbb"
>>> it = iter(s)
>>> len(it)
16

So, if you want the length of an arbitrary iterator, just call len()
and deal with the exception.
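
The session above appears to reflect an older CPython release in which some built-in iterators still exposed `__len__`. In Python 3, calling len() on a string iterator raises TypeError, and the nearest equivalent is `operator.length_hint()` (available since Python 3.4), which returns an estimate without consuming anything. A minimal sketch, assuming CPython 3; the value is a hint, so it should not be treated as an exact length in general:

```python
from operator import length_hint

s = "aaabbbbbaabbbbbb"
it = iter(s)

try:
    len(it)
except TypeError as exc:
    # Python 3 iterators do not support len().
    print("len() failed:", exc)

# length_hint() asks the iterator for an estimate and leaves it untouched.
print(length_hint(it))       # 16 on CPython
next(it)
print(length_hint(it))       # 15 on CPython

# Generators offer no hint, so the supplied default is returned.
gen = (c for c in s)
print(length_hint(gen, 0))   # 0
```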

--
Steven.

Jan 19 '07 #6
Steven D'Aprano:
>> s = "aaabbbbbaabbbbbb"
>> from itertools import groupby
>> print [(h,leniter(g)) for h,g in groupby(s)]
>
> s isn't an iterator. It's a sequence, a string, and an iterable, but not
> an iterator.

If you look more closely you will see that I call leniter() on g, not on s.
g is the iterator whose length I need to compute.

> I hope you know what sequences and strings are :-)

Well, I still know little about the C implementation of CPython
iterators :-)

But I agree with the rest of what you say: iterators can be very
general things, and there are too many drawbacks and dangers, so it's
better to keep leniter() as a function separate from len(), for
specialized use.

Bye and thank you,
bearophile

Jan 19 '07 #7
On Fri, 19 Jan 2007 05:04:01 -0800, bearophileHUGS wrote:
> Steven D'Aprano:
>> s = "aaabbbbbaabbbbbb"
>> from itertools import groupby
>> print [(h,leniter(g)) for h,g in groupby(s)]
>>
>> s isn't an iterator. It's a sequence, a string, and an iterable, but not
>> an iterator.
>
> If you look better you can see that I use the leniter() on g, not on s.
> g is the iterator I need to compute the len of.

Oops, yes you're right. But since g is not an arbitrary iterator, one can
easily do this:

print [(h,len(list(g))) for h,g in groupby(s)]

No need for a special function.

>> I hope you know what sequences and strings are :-)
>
> Well, I know little still about the C implementation of CPython
> iterators :-)
>
> But I agree with the successive things you say, iterators may be very
> general things, and there are too many drawbacks/dangers, so it's
> better to keep leniter() as a function separated from len(), with
> specialized use.

I don't think it's better to have leniter() at all. If you, the iterator
creator, know enough about the iterator to be sure it has a predictable
length, you know how to calculate it. Otherwise, iterators in general
don't have a predictable length even in principle.

--
Steven.

Jan 19 '07 #8
Steven D'Aprano:
> since g is not an arbitrary iterator, one can easily do this:
> print [(h,len(list(g))) for h,g in groupby(s)]
> No need for a special function.

If you look at my first post you can see that I showed that
solution too, but it creates a list that may be long and may use a
lot of memory, and then throws it away each time. I think that's a
bad solution. It also goes against the philosophy of iterators, which
were created to avoid handling real lists of items.

> If you, the iterator
> creator, know enough about the iterator to be sure it has a predictable
> length, you know how to calculate it.

I don't agree. Sometimes I know I have a finite iterator, but I may
not know how many elements it yields (and sometimes there may be a lot).
See the simple example with groupby.

Bye,
bearophile

Jan 19 '07 #9
