
Re: Peek inside iterator (is there a PEP about this?)

Luis Zarrabeitia wrote:
Hi there.

For most use cases I think about, the iterator protocol is more than enough.
However, in a few cases, I've needed some ugly hacks.

Ex 1:

a = iter([1,2,3,4,5]) # assume you got the iterator from a function and
b = iter([1,2,3]) # these two are just examples.

then,

zip(a,b)

has a different side effect from

zip(b,a)

After the execution, in the first case, iterator a contains just [5]; in the
second, it contains [4,5]. I think the second one is correct (the 5 was never
used, after all). I tried to implement my 'own' zip, but there is no way to
know the length of the iterator (obviously), and there is also no way
to 'rewind' a value after calling 'next'.

Interesting observation. Iterators are intended for 'iterate through
once and discard' usages. To zip a long sequence with several short
sequences, either use itertools.chain(short sequences) or put the short
sequences as the first zip arg.
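
The asymmetry described above can be reproduced with a short Python 3 sketch. The helper name leftover_after_zip is made up for illustration; it only wraps the built-in zip and reports what remains in each iterator afterwards.

```python
def leftover_after_zip(first, second):
    """Zip two lists through fresh iterators; return (pairs, rest1, rest2)."""
    it1, it2 = iter(first), iter(second)
    pairs = list(zip(it1, it2))
    return pairs, list(it1), list(it2)

long_items, short_items = [1, 2, 3, 4, 5], [1, 2, 3]

# Long iterator first: zip pulls 4 from it before it finds the short one empty.
pairs_a, rest_long_a, _ = leftover_after_zip(long_items, short_items)
# Short iterator first: zip stops before touching the long one again.
pairs_b, _, rest_long_b = leftover_after_zip(short_items, long_items)

print(rest_long_a)  # [5]
print(rest_long_b)  # [4, 5]
```

zip asks its arguments for items from left to right, so an argument that runs out earlier in that order shields the ones after it.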

Ex 2:

Will this iterator yield any value? Like with most iterables, a construct

if iterator:
    # do something

would be a very convenient thing to have, instead of wrapping a 'next' call in
a try...except and consuming the first item.

To test without consuming, wrap the iterator in a trivial-to-write
one_ahead or peek class such as has been posted before.

Ex 3:

if any(iterator):
    # do something ... but the first true value was already consumed and
    # cannot be reused. "Any" cannot peek inside the iterator without
    # consuming the value.

If you are going to do something with the true value, use a for loop and
break. If you just want to peek inside, use a sequence (list(iterator)).
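
A minimal Python 3 sketch of the for-loop-and-break suggestion; the function name first_true and its default parameter are assumptions made for the example.

```python
def first_true(iterator, default=None):
    """Return the first truthy item, leaving later items in the iterator."""
    for item in iterator:
        if item:
            break
    else:
        # The loop finished without break: no truthy item was found.
        return default
    return item

it = iter([0, "", 7, 8, 9])
value = first_true(it)
print(value)     # 7
print(list(it))  # [8, 9]
```

Unlike any(), the loop hands back the value that made the test succeed, so nothing is lost.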

Instead,

i1, i2 = tee(iterator)
if any(i1):
    # do something with i2

This effectively makes two partial lists and tosses one. That may or
may not be a better idea.
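
The tee approach, written out as a runnable Python 3 sketch. The function name items_if_any_true is hypothetical; itertools.tee is the real standard-library call.

```python
from itertools import tee

def items_if_any_true(iterator):
    """Return every item as a list if at least one is truthy, else []."""
    probe, replay = tee(iterator)
    if any(probe):            # consumes probe up to the first truthy item
        return list(replay)   # replay still starts from the first item
    return []

print(items_if_any_true(iter([0, 0, 3, 4])))  # [0, 0, 3, 4]
print(items_if_any_true(iter([0, 0])))        # []
```

tee buffers whatever probe has read until replay catches up, which is the "partial list" cost mentioned above. The original iterator should not be used again once it has been passed to tee.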

Question/Proposal:

Has there been any PEP regarding the problem of 'peeking' inside an iterator?

Iterators are not sequences and, in general, cannot be made to act like
them. The iterator protocol is a bare-minimum, least-common-denominator
requirement for inter-operability. You can, of course, add methods to
iterators that you write for the cases where one-ahead or random access
*is* possible.

Knowing if the iteration will end or not, and/or accessing the next value,
without consuming it? Is there any (simple, elegant) way around it?

That much is trivial. As suggested above, write a wrapper with the
exact behavior you want. A sample (untested):

class one_ahead:
    "Self.peek is the next item or undefined"
    def __init__(self, iterator):
        try:
            self.peek = next(iterator)
            self._it = iterator
        except StopIteration:
            pass
    def __bool__(self):
        return hasattr(self, 'peek')
    def __next__(self):  # __next__ in 3.0; 2.x names this method next
        try:
            # Use 'item', not 'next', so the builtin next() is not shadowed.
            item = self.peek
            try:
                self.peek = next(self._it)
            except StopIteration:
                del self.peek
            return item
        except AttributeError:
            raise StopIteration

Terry Jan Reedy

Oct 1 '08 #1
2 Replies


Terry's is close. In 2.5 you need '__nonzero__' instead of '__bool__', the
missing '__iter__', a method named 'next', and 'self._it.next( )'.

Then just define your own 'peekzip'. Short:

def peekzip( *itrs ):
    while 1:
        if not all( itrs ):
            raise StopIteration
        yield tuple( [ itr.next( ) for itr in itrs ] )

In some cases, you could require 'one_ahead' instances in peekzip, or
create them yourself in new iterators.

Here is your output: The first part uses zip, the second uses peekzip.

[(1, 1), (2, 2), (3, 3)]
5
[(1, 1), (2, 2), (3, 3)]
4

4 is what you expect.

Here's the full code.

class one_ahead(object):
    "Self.peek is the next item or undefined"
    def __init__(self, iterator):
        try:
            self.peek = iterator.next( )
            self._it = iterator
        except StopIteration:
            pass
    def __nonzero__(self):
        return hasattr(self, 'peek')
    def __iter__(self):
        return self
    def next(self): # 2.x name; this method is __next__ in 3.0
        try:
            next = self.peek
            try:
                self.peek = self._it.next( )
            except StopIteration:
                del self.peek
            return next
        except AttributeError:
            raise StopIteration

a= one_ahead( iter( [1,2,3,4,5] ) )
b= one_ahead( iter( [1,2,3] ) )
print zip( a,b )
print a.next()

def peekzip( *itrs ):
    while 1:
        if not all( itrs ):
            raise StopIteration
        yield tuple( [ itr.next( ) for itr in itrs ] )

a= one_ahead( iter( [1,2,3,4,5] ) )
b= one_ahead( iter( [1,2,3] ) )
print list( peekzip( a,b ) )
print a.next()
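
For anyone on Python 3, here is a rough port of the same idea. It is a sketch, not the code from this thread: the class name OneAhead and the _MISSING sentinel are my own choices, and it tracks the look-ahead with a sentinel instead of hasattr. The generator ends by falling off the loop, since raising StopIteration inside a generator is an error in current Python 3.

```python
class OneAhead:
    """Iterator wrapper that always holds the next item in reserve."""
    _MISSING = object()  # marks "no item left"

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._peek = next(self._it, self._MISSING)

    def __bool__(self):
        # True while at least one more item can be produced.
        return self._peek is not self._MISSING

    def __iter__(self):
        return self

    def __next__(self):
        if self._peek is self._MISSING:
            raise StopIteration
        item = self._peek
        self._peek = next(self._it, self._MISSING)
        return item

def peekzip(*iterators):
    """Like zip over OneAhead objects, but consumes nothing on the last round."""
    # The 'iterators and' guard avoids an endless loop when called with no arguments.
    while iterators and all(iterators):
        yield tuple([next(it) for it in iterators])

a = OneAhead([1, 2, 3, 4, 5])
b = OneAhead([1, 2, 3])
print(list(peekzip(a, b)))  # [(1, 1), (2, 2), (3, 3)]
print(next(a))              # 4
```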

There's one more option, which is to create your own 'push-backable'
class, which accepts a 'previous( item )' message.

(Unproduced)
>>> a= push_backing( iter( [1,2,3,4,5] ) )
>>> a.next( )
1
>>> a.next( )
2
>>> a.previous( 2 )
>>> a.next( )
2
>>> a.next( )
3
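
One way such a push-back class might look, sketched in Python 3. Nothing here was posted in the thread: the class name PushBacking and the stack-based storage are assumptions; only the 'previous( item )' message comes from the session above.

```python
class PushBacking:
    """Iterator wrapper that lets the caller hand items back."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._pushed = []  # stack of items handed back by the caller

    def __iter__(self):
        return self

    def __next__(self):
        # Items that were pushed back come out first, most recent first.
        if self._pushed:
            return self._pushed.pop()
        return next(self._it)

    def previous(self, item):
        """Push an item back so that the next call to next() returns it."""
        self._pushed.append(item)

a = PushBacking([1, 2, 3, 4, 5])
print(next(a))  # 1
print(next(a))  # 2
a.previous(2)
print(next(a))  # 2
print(next(a))  # 3
```

Because the caller supplies the item, the wrapper needs no look-ahead of its own.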

Oct 2 '08 #2

On Wed, 01 Oct 2008 16:14:09 -0400, Terry Reedy wrote:
Iterators are intended for 'iterate through once and discard' usages.

Also for reading files, which are often seekable.

I don't disagree with the rest of your post; I just thought I'd make the
observation that if the data you are iterating over supports random
access, it's possible to write an iterator that also supports random
access.
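
As an illustration only, here is a Python 3 sketch of an iterator over an indexable sequence that can also be repositioned. The class name SeekableIter and the seek/tell method names are borrowed from file objects for the example; they are not part of the iterator protocol.

```python
class SeekableIter:
    """Iterator over a sequence that can also jump to any position."""

    def __init__(self, sequence):
        self._seq = sequence  # must support len() and integer indexing
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._seq):
            raise StopIteration
        item = self._seq[self._pos]
        self._pos += 1
        return item

    def seek(self, position):
        """Move so that the next item returned is sequence[position]."""
        self._pos = position

    def tell(self):
        """Return the index of the item that next() would return."""
        return self._pos

it = SeekableIter("abcde")
print(next(it), next(it))  # a b
it.seek(0)                 # rewind
print(next(it))            # a
it.seek(3)
print(list(it))            # ['d', 'e']
```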

--
Steven
Oct 2 '08 #3
