Re: Peek inside iterator (is there a PEP about this?)

Luis Zarrabeitia wrote:
> Hi there.
>
> For most use cases I think about, the iterator protocol is more than enough.
> However, in a few cases, I've needed some ugly hacks.
>
> Ex 1:
>
> a = iter([1,2,3,4,5]) # assume you got the iterator from a function and
> b = iter([1,2,3])     # these two are just examples.
>
> then,
>
> zip(a,b)
>
> has a different side effect from
>
> zip(b,a)
>
> After the execution, in the first case, iterator a contains just [5]; in the
> second, it contains [4,5]. I think the second one is correct (the 5 was never
> used, after all). I tried to implement my 'own' zip, but there is no way to
> know the length of the iterator (obviously), and there is also no way
> to 'rewind' a value after calling 'next'.

Interesting observation. Iterators are intended for 'iterate through
once and discard' usages. To zip a long sequence with several short
sequences, either use itertools.chain(short sequences) or put the short
sequences as the first zip arg.
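
For anyone who wants to see the side effect from Ex 1 directly, here is a minimal sketch. It assumes Python 3, where zip is lazy and has to be forced with list(); the variable names pairs_ab, left_ab and so on are made up for the illustration.

```python
# Python 3: zip is lazy, so list() forces it to run to completion.
a = iter([1, 2, 3, 4, 5])
b = iter([1, 2, 3])
pairs_ab = list(zip(a, b))   # zip pulls 4 from a before discovering b is empty
left_ab = list(a)            # what is still inside a afterwards

a = iter([1, 2, 3, 4, 5])
b = iter([1, 2, 3])
pairs_ba = list(zip(b, a))   # b runs out first, so a is never asked for 4
left_ba = list(a)

print(pairs_ab, left_ab)     # [(1, 1), (2, 2), (3, 3)] [5]
print(pairs_ba, left_ba)     # [(1, 1), (2, 2), (3, 3)] [4, 5]
```

This is why putting the shorter iterators first keeps the longer one from losing an item.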

> Ex 2:
>
> Will this iterator yield any value? Like with most iterables, a construct
>
> if iterator:
>     # do something
>
> would be a very convenient thing to have, instead of wrapping a 'next' call in
> a try...except and consuming the first item.

To test without consuming, wrap the iterator in a trivial-to-write
one_ahead or peek class such as has been posted before.

> Ex 3:
>
> if any(iterator):
>     # do something ... but the first true value was already consumed and
>     # cannot be reused. "Any" cannot peek inside the iterator without
>     # consuming the value.

If you are going to do something with the true value, use a for loop and
break. If you just want to peek inside, use a sequence (list(iterator)).
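
A rough sketch of the for-loop-and-break suggestion, assuming Python 3. The sample data and the names found and rest are only for illustration.

```python
iterator = iter([0, '', 3, 0, 5])

# Keep the first true value in hand instead of letting any() swallow it.
for item in iterator:
    if item:
        found = item
        break
else:
    found = None  # the loop finished without finding a true value

rest = list(iterator)  # whatever follows the first true value is still available
print(found, rest)     # 3 [0, 5]
```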

> Instead,
>
> i1, i2 = tee(iterator)
> if any(i1):
>     # do something with i2

This effectively makes two partial lists and tosses one. That may or
may not be a better idea.
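
To make the tee behaviour concrete, here is a small sketch assuming Python 3 and itertools.tee. The sample data and the names has_true and replay are invented for the example.

```python
from itertools import tee

iterator = iter([0, 0, 3, 4, 5])
i1, i2 = tee(iterator)

has_true = any(i1)   # consumes 0, 0, 3 from i1; tee buffers those items for i2
replay = list(i2)    # i2 still starts from the very first item

print(has_true, replay)  # True [0, 0, 3, 4, 5]
```

The buffering is the cost: everything i1 has consumed stays in memory until i2 catches up.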

> Question/Proposal:
>
> Has there been any PEP regarding the problem of 'peeking' inside an iterator?

Iterators are not sequences and, in general, cannot be made to act like
them. The iterator protocol is a bare-minimum, least-common-denominator
requirement for inter-operability. You can, of course, add methods to
iterators that you write for the cases where one-ahead or random access
*is* possible.

> Knowing if the iteration will end or not, and/or accessing the next value,
> without consuming it? Is there any (simple, elegant) way around it?

That much is trivial. As suggested above, write a wrapper with the
exact behavior you want. A sample (untested)

class one_ahead():
    "Self.peek is the next item or undefined"
    def __init__(self, iterator):
        try:
            self.peek = next(iterator)
            self._it = iterator
        except StopIteration:
            pass
    def __bool__(self):
        return hasattr(self, 'peek')
    def __next__(self): # 3.0, 2.6?
        try:
            item = self.peek  # not named 'next': that would shadow the builtin called below
            try:
                self.peek = next(self._it)
            except StopIteration:
                del self.peek
            return item
        except AttributeError:
            raise StopIteration

Terry Jan Reedy

Oct 1 '08 #1
Terry's is close. In 2.5 you need '__nonzero__' instead of '__bool__',
the missing '__iter__', a 'next' method instead of '__next__', and
'self._it.next( )' instead of 'next(self._it)'.

Then just define your own 'peekzip'. Short:

def peekzip( *itrs ):
    while 1:
        if not all( itrs ):
            raise StopIteration
        yield tuple( [ itr.next( ) for itr in itrs ] )

In some cases, you could require 'one_ahead' instances in peekzip, or
create them yourself in new iterators.

Here is your output: The first part uses zip, the second uses peekzip.

[(1, 1), (2, 2), (3, 3)]
5
[(1, 1), (2, 2), (3, 3)]
4

4 is what you expect.

Here's the full code.

class one_ahead(object):
    "Self.peek is the next item or undefined"
    def __init__(self, iterator):
        try:
            self.peek = iterator.next( )
            self._it = iterator
        except StopIteration:
            pass
    def __nonzero__(self):
        return hasattr(self, 'peek')
    def __iter__(self):
        return self
    def next(self): # 2.x spelling; '__next__' in 3.0
        try:
            next = self.peek
            try:
                self.peek = self._it.next( )
            except StopIteration:
                del self.peek
            return next
        except AttributeError:
            raise StopIteration

a= one_ahead( iter( [1,2,3,4,5] ) )
b= one_ahead( iter( [1,2,3] ) )
print zip( a,b )
print a.next()

def peekzip( *itrs ):
    while 1:
        if not all( itrs ):
            raise StopIteration
        yield tuple( [ itr.next( ) for itr in itrs ] )

a= one_ahead( iter( [1,2,3,4,5] ) )
b= one_ahead( iter( [1,2,3] ) )
print list( peekzip( a,b ) )
print a.next()
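
The code above is written for Python 2.5. A rough Python 3 port might look like the sketch below; it is an adaptation, not what was posted. Two things change beyond the method names: the generator ends with a plain return, because raising StopIteration inside a generator body is turned into a RuntimeError on Python 3.7 and later (PEP 479), and an empty argument list is treated as "nothing to zip", which is an assumption added here. The names zipped and leftover are illustrative.

```python
class one_ahead:
    """self.peek is the next item, or is undefined once the source is exhausted."""
    def __init__(self, iterator):
        self._it = iterator
        try:
            self.peek = next(self._it)
        except StopIteration:
            pass

    def __bool__(self):
        # True while at least one more item is available.
        return hasattr(self, 'peek')

    def __iter__(self):
        return self

    def __next__(self):
        if not self:
            raise StopIteration
        item = self.peek
        try:
            self.peek = next(self._it)
        except StopIteration:
            del self.peek
        return item


def peekzip(*itrs):
    # Assumes every argument is a one_ahead instance, so truth-testing
    # tells us whether an item is available without consuming it.
    if not itrs:
        return
    while all(itrs):
        yield tuple([next(itr) for itr in itrs])


a = one_ahead(iter([1, 2, 3, 4, 5]))
b = one_ahead(iter([1, 2, 3]))
zipped = list(peekzip(a, b))
leftover = next(a)
print(zipped)    # [(1, 1), (2, 2), (3, 3)]
print(leftover)  # 4
```

Note that a one_ahead always holds one item it has already pulled from the underlying iterator, so the raw iterator should not be read directly once it is wrapped.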

There's one more option, which is to create your own 'push-backable'
class, which accepts a 'previous( item )' message.

(Unproduced)
>>> a= push_backing( iter( [1,2,3,4,5] ) )
>>> a.next( )
1
>>> a.next( )
2
>>> a.previous( 2 )
>>> a.next( )
2
>>> a.next( )
3
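
One possible way to write such a class, as a sketch only, since no implementation was posted. It is in Python 3 spelling (so next(a) rather than a.next( )), and the internal _pushed list is an assumption of this sketch; only the names push_backing and previous come from the session above.

```python
class push_backing:
    """Hypothetical sketch: an iterator wrapper that accepts pushed-back items."""
    def __init__(self, iterable):
        self._it = iter(iterable)
        self._pushed = []  # most recently pushed-back item is served first

    def __iter__(self):
        return self

    def __next__(self):
        if self._pushed:
            return self._pushed.pop()
        return next(self._it)

    def previous(self, item):
        # Hand an item back; the following next() call returns it again.
        self._pushed.append(item)


a = push_backing(iter([1, 2, 3, 4, 5]))
first = next(a)
second = next(a)
a.previous(second)
again = next(a)
third = next(a)
print(first, second, again, third)  # 1 2 2 3
```

With this, a hand-written zip can push back the item it pulled too early from the longer iterator.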

Oct 2 '08 #2
On Wed, 01 Oct 2008 16:14:09 -0400, Terry Reedy wrote:
> Iterators are intended for 'iterate through once and discard' usages.

Also for reading files, which are often seekable.

I don't disagree with the rest of your post; I just thought I'd make the
observation that if the data you are iterating over supports random
access, it's possible to write an iterator that also supports random
access.
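
As an illustration of that observation, here is a minimal sketch, assuming Python 3 and a source that supports indexing and len(), such as a list. The class name SeekableIter and its peek and seek methods are hypothetical.

```python
class SeekableIter:
    """Hypothetical sketch: iterate over a sequence while allowing repositioning."""
    def __init__(self, seq):
        self._seq = seq
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._seq):
            raise StopIteration
        item = self._seq[self._pos]
        self._pos += 1
        return item

    def peek(self, default=None):
        # Look at the next item without advancing.
        if self._pos < len(self._seq):
            return self._seq[self._pos]
        return default

    def seek(self, pos):
        # Jump to an arbitrary position, much like file.seek().
        self._pos = pos


it = SeekableIter([10, 20, 30])
head = next(it)
peeked = it.peek()
it.seek(0)
everything = list(it)
print(head, peeked, everything)  # 10 20 [10, 20, 30]
```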

--
Steven
Oct 2 '08 #3
