inverse of izip

So I know that zip(*) is the inverse of zip(), e.g.:

>>> zip(*zip(range(10), range(10)))
[(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]

What's the inverse of izip? Of course, I could use zip(*) or izip(*),
e.g.:
>>> zip(*itertools.izip(range(10), range(10)))
[(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]
>>> x, y = itertools.izip(*itertools.izip(range(10), range(10)))
>>> x, y
((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9))

But then I get a pair of tuples, not a pair of iterators. Basically,
I want to convert an iterator of tuples into a tuple of iterators.

Steve

Jul 18 '05 #1

Satchidanand Haridas

Steven Bethard wrote:

So I know that zip(*) is the inverse of zip(), e.g.:

zip(*zip(range(10), range(10)))

[(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]

What's the inverse of izip? Of course, I could use zip(*) or izip(*),
e.g.:
zip(*itertools.izip(range(10), range(10)))

[(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]

x, y = itertools.izip(*itertools.izip(range(10), range(10)))
x, y

((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9))

But then I get a pair of tuples, not a pair of iterators. Basically,
I want to convert an iterator of tuples into a tuple of iterators.

Steve

---------------------------------

>>> a = itertools.izip(*itertools.izip(range(10), range(10)))
>>> a
<itertools.izip object at 0x40164f2c>
>>> a.next()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>> a.next()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>> a.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration

-----------------------------
Regards,
Satchit
----
Satchidanand Haridas (sharidas at zeomega dot com)

ZeOmega (www.zeomega.com)
Open Minds' Open Solutions

#20,Rajalakshmi Plaza,
South End Road,
Basavanagudi,
Bangalore-560 004, India

Jul 18 '05 #2

Steven Bethard

On Thu, 19 Aug 2004 12:07:48 +0530, Satchidanand Haridas
<sh******@zeomega.com> wrote:

>>> a = itertools.izip(*itertools.izip(range(10), range(10)))
>>> a
<itertools.izip object at 0x40164f2c>
>>> a.next()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>> a.next()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>> a.next()

I'm assuming you popped this one off without actually reading my
email. No worries - it happens sometimes. You'll note, however, that
this is exactly what I said didn't work:

Steven Bethard wrote:

>>> x, y = itertools.izip(*itertools.izip(range(10), range(10)))
>>> x, y
((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9))

But then I get a pair of tuples, not a pair of iterators. Basically,
I want to convert an iterator of tuples into a tuple of iterators.

I want the elements returned by the itertools.izip object to be
iterators, not tuples or lists.

Steve

--
You can wordify anything if you just verb it.
- Bucky Katt, Get Fuzzy

Jul 18 '05 #3

Steven Bethard

Steven Bethard <steven.bethard <at> gmail.com> writes:

What's the inverse of izip? Of course, I could use zip(*) or izip(*),
e.g.:
>>> zip(*itertools.izip(range(10), range(10)))
[(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]
>>> x, y = itertools.izip(*itertools.izip(range(10), range(10)))
>>> x, y
((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9))

But then I get a pair of tuples, not a pair of iterators. Basically,
I want to convert an iterator of tuples into a tuple of iterators.

Sorry to respond to myself, but after playing around with itertools for a
while, this seems to work:

>>> import itertools
>>> starzip = lambda iterables: ((tuple[i] for tuple in itr) for i, itr in enumerate(itertools.tee(iterables)))
>>> starzip(itertools.izip(range(10), range(10)))
<generator object at 0x008DED28>
>>> x, y = starzip(itertools.izip(range(10), range(10)))
>>> x
<generator object at 0x008E1058>
>>> y
<generator object at 0x008E1080>
>>> list(x)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(y)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Seems like a bit of work for the inverse of izip though so I'll wait to see if
anyone else has a better solution. (Not to mention, it wouldn't be a single
line solution if I wasn't using 2.4...)

Steve

Jul 18 '05 #4

Satchidanand Haridas

Hi,

How about using iter() to get another solution like the following:

>>> starzip2 = lambda it: tuple([iter(x) for x in itertools.izip(*it)])
>>> l,m = starzip2(itertools.izip(range(10),range(10)))
>>> l
<tupleiterator object at 0x4016802c>
>>> m
<tupleiterator object at 0x4016896c>
>>> list(l)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(m)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Thanks,

Satchit

Steven Bethard wrote:

Steven Bethard <steven.bethard <at> gmail.com> writes:

What's the inverse of izip? Of course, I could use zip(*) or izip(*),
e.g.:

>>> zip(*itertools.izip(range(10), range(10)))
[(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]
>>> x, y = itertools.izip(*itertools.izip(range(10), range(10)))
>>> x, y
((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9))

But then I get a pair of tuples, not a pair of iterators. Basically,
I want to convert an iterator of tuples into a tuple of iterators.

Sorry to respond to myself, but after playing around with itertools for a
while, this seems to work:

>>> import itertools
>>> starzip = lambda iterables: ((tuple[i] for tuple in itr) for i, itr in enumerate(itertools.tee(iterables)))
>>> starzip(itertools.izip(range(10), range(10)))
<generator object at 0x008DED28>
>>> x, y = starzip(itertools.izip(range(10), range(10)))
>>> x
<generator object at 0x008E1058>
>>> y
<generator object at 0x008E1080>
>>> list(x)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(y)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Seems like a bit of work for the inverse of izip though so I'll wait to see if
anyone else has a better solution. (Not to mention, it wouldn't be a single
line solution if I wasn't using 2.4...)

Steve

Jul 18 '05 #5

Steven Bethard

Satchidanand Haridas <sharidas <at> zeomega.com> writes:

How about using iter() to get another solution like the following:
>>> starzip2 = lambda it: tuple([iter(x) for x in itertools.izip(*it)])
>>> l,m = starzip2(itertools.izip(range(10),range(10)))
>>> l
<tupleiterator object at 0x4016802c>
>>> m
<tupleiterator object at 0x4016896c>
>>> list(l)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(m)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Unfortunately, I think this exhausts the iterators too early because it
applies * to the iterator:

>>> def range10():
...     for i in range(10):
...         yield i
...     print "exhausted"
...
>>> l,m = starzip2(itertools.izip(range10(),range10()))
exhausted

I believe we only get one "exhausted" because as soon as one iterator is used
up with izip, the next iterator is discarded. But we are hitting "exhausted"
before we ever ask for an element from the starzip2 iterators, so it looks to
me like all the pairs from the first iterator are read into memory before the
second iterators are ever accessed...
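
A minimal sketch of the same effect, assuming Python 3, where the built-in zip() is lazy in the way itertools.izip() was in Python 2. The names pairs and log are made up for this illustration. The generator records when it runs dry, and that happens during the call itself, before anything is requested from the result:

```python
def pairs(n, log):
    """Yield (i, i) pairs and note in `log` when the generator runs dry."""
    for i in range(n):
        yield (i, i)
    log.append("exhausted")


log = []

# Star-unpacking has to build the full argument list before zip() is called,
# so the generator is drained right here.
columns = zip(*pairs(3, log))
drained_at_call = list(log)

# Only now do we ask the result for its items.
result = list(columns)
```

Here drained_at_call already contains the "exhausted" marker, even though columns had not been touched when it was recorded.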

Steve

Jul 18 '05 #6

Peter Otten

Steven Bethard wrote:

Steven Bethard <steven.bethard <at> gmail.com> writes:
What's the inverse of izip? Of course, I could use zip(*) or izip(*),
e.g.:
>>> zip(*itertools.izip(range(10), range(10)))

[(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]
>>> x, y = itertools.izip(*itertools.izip(range(10), range(10)))
>>> x, y

((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9))

But then I get a pair of tuples, not a pair of iterators. Basically,
I want to convert an iterator of tuples into a tuple of iterators.

Sorry to respond to myself, but after playing around with itertools for a
while, this seems to work:
>>> import itertools
>>> starzip = lambda iterables: ((tuple[i] for tuple in itr) for i, itr in enumerate(itertools.tee(iterables)))
>>> starzip(itertools.izip(range(10), range(10)))
<generator object at 0x008DED28>
>>> x, y = starzip(itertools.izip(range(10), range(10)))
>>> x
<generator object at 0x008E1058>
>>> y
<generator object at 0x008E1080>
>>> list(x)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(y)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Seems like a bit of work for the inverse of izip though so I'll wait to
see if anyone else has a better solution. (Not to mention, it wouldn't be
a single line solution if I wasn't using 2.4...)

Because Python supports function definitions you only have to do it once :-)

However, your sample data is badly chosen. Unless I have made a typo
repeating your demo, you are getting the same (last) sequence twice due to
late binding of i.

>>> import itertools as it
>>> def starzip(iterables):
...     return ((t[i] for t in itr) for (i, itr) in
...             enumerate(it.tee(iterables)))
...
>>> map(list, starzip(it.izip("123", "abc")))
[['1', '2', '3'], ['a', 'b', 'c']]
>>> x, y = starzip(it.izip("123", "abc"))
>>> list(x)
['a', 'b', 'c']
>>> list(y)
['a', 'b', 'c']

Here's my fix.

# requires Python 2.4
def cut(itr, index):
    # avoid late binding of index
    return (item[index] for item in itr)

def starzip(tuples):
    a, b = it.tee(tuples)
    try:
        tuple_len = len(a.next())
    except StopIteration:
        raise ValueError(
            "starzip() does not allow an empty sequence as argument")
    t = it.tee(b, tuple_len)
    return (cut(itr, index) for (index, itr) in enumerate(t))

a, b, c = starzip(it.izip("abc", [1,2,3], "xyz"))
print a, b, c
assert list(a) == list("abc")
assert list(b) == [1, 2, 3]
assert list(c) == list("xyz")
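
For readers on Python 3, a hedged port of the fix above. It assumes zip() in place of izip() and next(probe) in place of a.next(), and it returns a tuple rather than a generator of generators, which is a small deviation from the code above. The names probe, rest and width are chosen here for readability; otherwise the structure is the same.

```python
import itertools


def cut(rows, index):
    # `index` is a parameter of cut(), so each generator keeps its own value.
    return (row[index] for row in rows)


def starzip(rows):
    """Turn an iterator of equal-length tuples into a tuple of iterators."""
    probe, rest = itertools.tee(rows)
    try:
        # Peek at the first row to learn how many columns there are.
        width = len(next(probe))
    except StopIteration:
        raise ValueError(
            "starzip() does not allow an empty sequence as argument") from None
    branches = itertools.tee(rest, width)
    return tuple(cut(branch, index) for index, branch in enumerate(branches))


a, b, c = starzip(zip("abc", [1, 2, 3], "xyz"))
col_c = list(c)  # the columns can be consumed in any order
col_a = list(a)
col_b = list(b)
```

As in the original, peeking at the first row means an empty input cannot be handled, because the number of columns is unknown.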

Peter

Jul 18 '05 #7

Satchidanand Haridas

Steven Bethard wrote:

Satchidanand Haridas <sharidas <at> zeomega.com> writes:

How about using iter() to get another solution like the following:
>>> starzip2 = lambda it: tuple([iter(x) for x in itertools.izip(*it)])

>>> l,m = starzip2(itertools.izip(range(10),range(10)))

>>> l

<tupleiterator object at 0x4016802c>
>>> m

<tupleiterator object at 0x4016896c>
>>> list(l)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(m)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Unfortunately, I think this exhausts the iterators too early because it
applies * to the iterator:

Could you expand on what you mean by exhaust the iterators too early?

The reason I ask is that the * operator is applied to
((1,1),(2,2),....(9,9)). The operation of the
itertools.izip(range10(),range10()) is completed before the * operation
is applied. And the iter() simply converts the result of the inverse
izip operation into an iterator. I hope the above was not too
confusing. :-)

I am trying to understand a little about the izip myself. Thanks.

Satchit

>>> def range10():
...     for i in range(10):
...         yield i
...     print "exhausted"
...
>>> l,m = starzip2(itertools.izip(range10(),range10()))
exhausted

I believe we only get one "exhausted" because as soon as one iterator is used
up with izip, the next iterator is discarded. But we are hitting "exhausted"
before we ever ask for an element from the starzip2 iterators, so it looks to
me like all the pairs from the first iterator are read into memory before the
second iterators are ever accessed...

Steve

Could you expand on what you mean by exhaust the iterators too early?

The reason I ask is that the * operator is applied to the tuple
((1,1),(2,2),...(9,9)). Actually to the iterator which is called 10
times, each time returning (i,i) for 0<=i<10. When the iterator is
called the 11th time, it prints "exhausted".

So the operation of the itertools.izip(range10(),range10()) is completed
and "exhausted" is printed before the * operation is applied. The iter()
simply converts the result of the inverse izip operation into an
iterator. I hope the above was not too confusing. :-)

I am trying to understand what goes on inside izip myself. Thanks.

Regards,
Satchit

Jul 18 '05 #8

Steven Bethard

Peter Otten <__*******@web.de> wrote in message news:<cg*************@news.t-online.com>...

However, your sample data is badly chosen. Unless I have made a typo
repeating your demo, you are getting the same (last) sequence twice due to
late binding of i.

[snip]

>>> map(list, starzip(it.izip("123", "abc")))
[['1', '2', '3'], ['a', 'b', 'c']]
>>> x, y = starzip(it.izip("123", "abc"))
>>> list(x)
['a', 'b', 'c']
>>> list(y)
['a', 'b', 'c']

I knew there was something funny about binding in generators, but I
couldn't remember what... Could you explain why 'map(list, ...)'
works, but 'x, y = ...' doesn't? I read the PEP, but I'm still not
clear on this point.

Thanks,

Steve

Jul 18 '05 #9

Steven Bethard

Satchidanand Haridas <sh******@zeomega.com> wrote in message news:<ma**************************************@python.org>...

Could you expand on what you mean by exhaust the iterators too early?

The reason I ask is that the * operator is applied to
((1,1),(2,2),....(9,9)). The operation of the
itertools.izip(range10(),range10()) is completed before the * operation
is applied. And the iter() simply converts the result of the inverse
izip operation into an iterator. I hope the above was not too
confusing. :-)

Yeah, the difference is a little subtle here. What we have before you
use the * operator is an iterator that will yield (1,1) then (2,2) up
to (9,9). Note that we don't actually have the tuple
((1,1),(2,2),....(9,9)) yet, just an iterator that will produce the
same elements. If your list is very large and you don't want to keep
it all in memory at once, it's crucial that we have the iterator here,
not the tuple.

When you use the * operator, Python converts the iterable following
the * into the argument list of the function. This means that if
you're using an iterable, it reads all of the elements of the iterable
into memory at once. That's why my range10 iterators printed
"exhausted" after the * application -- all their elements had been
read into memory. Again, if your list is very large, this is a bad
thing because you now have all the elements of the list in memory at
the same time. My other solution (well, Peter Otten's correction of
my solution) never has the whole list in memory at the same time --
each time enumerate generates a tuple and its index, each of the
iterators returned by starzip generates its appropriate items.[*]

Steve

[*] Of course, if you exhaust one of the iterators before the others,
itertools.tee's implicit cache will actually store all the elements,
so starzip would really only be efficient if you wanted to iterate
through the sub-iterators in lockstep. This means you'd probably want
to itertools.izip them back together at some point, but being able to
starzip them means you can wrap the individual iterators with extra
functionality if necessary.
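
A small sketch of that buffering, assuming Python 3 and an invented source() generator that records each pair it produces. While the two branches advance in lockstep, only one pair has been pulled from the source; draining one branch forces tee to hold the remaining pairs for the other:

```python
import itertools

pulled = []


def source():
    """Yield (i, 10 * i) pairs, recording each one as it is produced."""
    for i in range(5):
        pulled.append(i)
        yield (i, 10 * i)


left, right = itertools.tee(source())
xs = (pair[0] for pair in left)
ys = (pair[1] for pair in right)

# Lockstep: one item from each branch costs one pair from the source.
first = (next(xs), next(ys))
after_lockstep = list(pulled)

# Draining xs alone pulls every remaining pair, which tee keeps for ys.
rest_of_xs = list(xs)
after_draining_xs = list(pulled)
rest_of_ys = list(ys)
```

The pulled list only shows how far the source has advanced; the buffered pairs themselves live inside the tee objects.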

Jul 18 '05 #10

Peter Otten

Steven Bethard wrote:

Peter Otten <__*******@web.de> wrote in message
news:<cg*************@news.t-online.com>...
However, your sample data is badly chosen. Unless I have made a typo
repeating your demo, you are getting the same (last) sequence twice due
to late binding of i.

[snip]
>>> map(list, starzip(it.izip("123", "abc")))

[['1', '2', '3'], ['a', 'b', 'c']]
>>> x, y = starzip(it.izip("123", "abc"))
>>> list(x)

['a', 'b', 'c']
>>> list(y)

['a', 'b', 'c']
>>>

I knew there was something funny about binding in generators, but I
couldn't remember what... Could you explain why 'map(list, ...)'
works, but 'x, y = ...' doesn't? I read the PEP, but I'm still not
clear on this point.

Maybe the following example can illustrate what I think is going on:

import itertools as it

def starzip(iterables):
    return ((t[i] for t in itr)
            for (i, itr) in
            enumerate(it.tee(iterables)))

# the order of calls equivalent to map(list, starzip(...))
s = starzip(it.izip("abc", "123"))
x = s.next()
# the local variable i in starzip() shared by x and y
# is now 0
print x.next(),
print x.next(),
print x.next()
y = s.next()
# i is now 1, but because no further calls to x.next()
# will occur it doesn't matter
print y.next(),
print y.next(),
print y.next()

s = starzip(it.izip("abc", "123"))
x = s.next() # i is 0
y = s.next() # i is 1
# both x and y yield t[1]
print x.next(),
print x.next(),
print x.next()

print y.next(),
print y.next(),
print y.next()

You can model the nested generator expressions' behaviour with the following
function - which I think is much clearer.

def starzip(iterables):
    def inner(itr):
        for t in itr:
            yield t[i]

    for (i, itr) in enumerate(it.tee(iterables)):
        yield inner(itr)

Note how itr is passed explicitly, i.e. it is not affected by later
rebindings in starzip(), whereas i is looked up in inner()'s surrounding
namespace at every yield.
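
The same effect can be reproduced on Python 3. This sketch assumes zip() in place of izip(), and the name starzip_buggy is made up here. Consuming each inner generator as soon as it is handed out works, while unpacking first advances the outer generator to the end, so both inner generators see the final value of i:

```python
import itertools


def starzip_buggy(rows):
    # `i` is a free variable of the inner generator expression; it is looked
    # up each time an item is produced, not when the generator is created.
    return ((t[i] for t in branch)
            for i, branch in enumerate(itertools.tee(rows)))


# Each column is consumed while `i` still has the matching value.
eager = [list(col) for col in starzip_buggy(zip("123", "abc"))]

# Unpacking runs the outer generator to completion first, leaving i == 1.
x, y = starzip_buggy(zip("123", "abc"))
late_x, late_y = list(x), list(y)
```

With these inputs, eager holds both columns correctly, whereas late_x and late_y both hold the letters.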

Peter

Jul 18 '05 #11

Steven Bethard

Peter Otten <__peter__ <at> web.de> writes:

You can model the nested generator expressions' behaviour with the following
function - which I think is much clearer.

def starzip(iterables):
    def inner(itr):
        for t in itr:
            yield t[i]

    for (i, itr) in enumerate(it.tee(iterables)):
        yield inner(itr)

Note how itr is passed explicitly, i.e. it is not affected by later
rebindings in starzip(), whereas i is looked up in inner()'s surrounding
namespace at every yield.

Thanks, that was really helpful! It also clarifies why your solution works
right; your code basically does:

def starzip(iterables):
    def inner(itr, i):
        for t in itr:
            yield t[i]

    for i, itr in enumerate(itertools.tee(iterables)):
        yield inner(itr, i)

where i is now passed explicitly too.

Thanks again,

Steve

Jul 18 '05 #12

Satchidanand Haridas

Hi Steve,

Thanks for the explanation. I understand izip a little better now.

Regards,
Satchit

Steven Bethard wrote:

Satchidanand Haridas <sh******@zeomega.com> wrote in message news:<ma**************************************@python.org>...

Could you expand on what you mean by exhaust the iterators too early?

The reason I ask is that the * operator is applied to
((1,1),(2,2),....(9,9)). The operation of the
itertools.izip(range10(),range10()) is completed before the * operation
is applied. And the iter() simply converts the result of the inverse
izip operation into an iterator. I hope the above was not too
confusing. :-)

Yeah, the difference is a little subtle here. What we have before you
use the * operator is an iterator that will yield (1,1) then (2,2) up
to (9,9). Note that we don't actually have the tuple
((1,1),(2,2),....(9,9)) yet, just an iterator that will produce the
same elements. If your list is very large and you don't want to keep
it all in memory at once, it's crucial that we have the iterator here,
not the tuple.

When you use the * operator, Python converts the iterable following
the * into the argument list of the function. This means that if
you're using an iterable, it reads all of the elements of the iterable
into memory at once. That's why my range10 iterators printed
"exhausted" after the * application -- all their elements had been
read into memory. Again, if your list is very large, this is a bad
thing because you now have all the elements of the list in memory at
the same time. My other solution (well, Peter Otten's correction of
my solution) never has the whole list in memory at the same time --
each time enumerate generates a tuple and its index, each of the
iterators returned by starzip generates its appropriate items.[*]

Steve

[*] Of course, if you exhaust one of the iterators before the others,
itertools.tee's implicit cache will actually store all the elements,
so starzip would really only be efficient if you wanted to iterate
through the sub-iterators in lockstep. This means you'd probably want
to itertools.izip them back together at some point, but being able to
starzip them means you can wrap the individual iterators with extra
functionality if necessary.

Jul 18 '05 #13
