inverse of izip

So I know that zip(*) is the inverse of zip(), e.g.:

>>> zip(*zip(range(10), range(10)))
[(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]

What's the inverse of izip? Of course, I could use zip(*) or izip(*),
e.g.:

>>> zip(*itertools.izip(range(10), range(10)))
[(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]
>>> x, y = itertools.izip(*itertools.izip(range(10), range(10)))
>>> x, y
((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9))

But then I get a pair of tuples, not a pair of iterators. Basically,
I want to convert an iterator of tuples into a tuple of iterators.

Steve
Jul 18 '05 #1
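
For readers on Python 3, the question can be restated with a minimal sketch. It assumes the built-in zip() stands in for Python 2's itertools.izip (which no longer exists in Python 3); the variable names are illustrative only and not taken from the post.

```python
# Python 3 sketch (assumption: zip() plays the role of Python 2's izip).
pairs = zip(range(10), range(10))   # a lazy iterator of 2-tuples

# Star-unpacking transposes the pairs, but every column comes back as a
# fully built tuple rather than as a lazy iterator.
x, y = zip(*pairs)

print(type(x).__name__, type(y).__name__)
print(x == tuple(range(10)))
```

The transposition itself is correct; what the thread is looking for is a way to get the same columns without building them in full first.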

Steven Bethard wrote:
> So I know that zip(*) is the inverse of zip(), e.g.:
>
> >>> zip(*zip(range(10), range(10)))
> [(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]
>
> What's the inverse of izip? Of course, I could use zip(*) or izip(*),
> e.g.:
>
> >>> zip(*itertools.izip(range(10), range(10)))
> [(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]
> >>> x, y = itertools.izip(*itertools.izip(range(10), range(10)))
> >>> x, y
> ((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9))
>
> But then I get a pair of tuples, not a pair of iterators. Basically,
> I want to convert an iterator of tuples into a tuple of iterators.
>
> Steve

---------------------------------
>>> a = itertools.izip(*itertools.izip(range(10), range(10)))
>>> a
<itertools.izip object at 0x40164f2c>
>>> a.next()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>> a.next()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>> a.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration

-----------------------------
Regards,
Satchit
----
Satchidanand Haridas (sharidas at zeomega dot com)

ZeOmega (www.zeomega.com)
Open Minds' Open Solutions

#20,Rajalakshmi Plaza,
South End Road,
Basavanagudi,
Bangalore-560 004, India

Jul 18 '05 #2
On Thu, 19 Aug 2004 12:07:48 +0530, Satchidanand Haridas
<sh******@zeomega.com> wrote:
> >>> a = itertools.izip(*itertools.izip(range(10), range(10)))
> >>> a
> <itertools.izip object at 0x40164f2c>
> >>> a.next()
> (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
> >>> a.next()
> (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
> >>> a.next()

I'm assuming you popped this one off without actually reading my
email. No worries - it happens sometimes. You'll note, however, that
this is exactly what I said didn't work:

Steven Bethard wrote:
> >>> x, y = itertools.izip(*itertools.izip(range(10), range(10)))
> >>> x, y
> ((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9))
>
> But then I get a pair of tuples, not a pair of iterators. Basically,
> I want to convert an iterator of tuples into a tuple of iterators.


I want the elements returned by the itertools.izip object to be
iterators, not tuples or lists.

Steve

--
You can wordify anything if you just verb it.
- Bucky Katt, Get Fuzzy
Jul 18 '05 #3
Steven Bethard <steven.bethard <at> gmail.com> writes:
> What's the inverse of izip? Of course, I could use zip(*) or izip(*),
> e.g.:
>
> >>> zip(*itertools.izip(range(10), range(10)))
> [(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]
> >>> x, y = itertools.izip(*itertools.izip(range(10), range(10)))
> >>> x, y
> ((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9))
>
> But then I get a pair of tuples, not a pair of iterators. Basically,
> I want to convert an iterator of tuples into a tuple of iterators.


Sorry to respond to myself, but after playing around with itertools for a
while, this seems to work:

>>> import itertools
>>> starzip = lambda iterables: ((tuple[i] for tuple in itr) for i, itr in enumerate(itertools.tee(iterables)))
>>> starzip(itertools.izip(range(10), range(10)))
<generator object at 0x008DED28>
>>> x, y = starzip(itertools.izip(range(10), range(10)))
>>> x
<generator object at 0x008E1058>
>>> y
<generator object at 0x008E1080>
>>> list(x)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(y)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Seems like a bit of work for the inverse of izip though so I'll wait to see if
anyone else has a better solution. (Not to mention, it wouldn't be a single
line solution if I wasn't using 2.4...)

Steve
Jul 18 '05 #4
Hi,

How about using iter() to get another solution like the following:

>>> starzip2 = lambda it: tuple([iter(x) for x in itertools.izip(*it)])
>>> l,m = starzip2(itertools.izip(range(10),range(10)))
>>> l
<tupleiterator object at 0x4016802c>
>>> m
<tupleiterator object at 0x4016896c>
>>> list(l)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(m)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Thanks,

Satchit

----
Satchidanand Haridas (sharidas at zeomega dot com)

ZeOmega (www.zeomega.com)
Open Minds' Open Solutions

#20,Rajalakshmi Plaza,
South End Road,
Basavanagudi,
Bangalore-560 004, India


Jul 18 '05 #5
Satchidanand Haridas <sharidas <at> zeomega.com> writes:
> How about using iter() to get another solution like the following:
>
> >>> starzip2 = lambda it: tuple([iter(x) for x in itertools.izip(*it)])
> >>> l,m = starzip2(itertools.izip(range(10),range(10)))
> >>> l
> <tupleiterator object at 0x4016802c>
> >>> m
> <tupleiterator object at 0x4016896c>
> >>> list(l)
> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
> >>> list(m)
> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Unfortunately, I think this exhausts the iterators too early because it
applies * to the iterator:

>>> def range10():
...     for i in range(10):
...         yield i
...     print "exhausted"
...
>>> l,m = starzip2(itertools.izip(range10(),range10()))
exhausted

I believe we only get one "exhausted" because as soon as one iterator is used
up with izip, the next iterator is discarded. But we are hitting "exhausted"
before we ever ask for an element from the starzip2 iterators, so it looks to
me like all the pairs from the first iterator are read into memory before the
second iterators are ever accessed...

Steve

Jul 18 '05 #6
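
The early-exhaustion effect described above can be reproduced on Python 3 with a small sketch. It assumes zip() in place of izip and records the "exhausted" event in a list instead of printing it; the `events` list and the `consumed_early` name are hypothetical helpers added for illustration.

```python
events = []   # hypothetical log standing in for the print statement

def range10():
    for i in range(10):
        yield i
    events.append("exhausted")

def starzip2(it):
    # Star-unpacking reads every pair from `it` before zip() is even called.
    return tuple(iter(column) for column in zip(*it))

l, m = starzip2(zip(range10(), range10()))

# Snapshot taken before a single item has been requested from l or m.
consumed_early = list(events)
print(consumed_early)
```

The snapshot already contains one "exhausted" entry, which matches the observation in the post: the source was drained during the call, not while iterating over `l` or `m`.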
Steven Bethard wrote:
> Steven Bethard <steven.bethard <at> gmail.com> writes:
>> What's the inverse of izip? Of course, I could use zip(*) or izip(*),
>> e.g.:
>>
>> >>> zip(*itertools.izip(range(10), range(10)))
>> [(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]
>> >>> x, y = itertools.izip(*itertools.izip(range(10), range(10)))
>> >>> x, y
>> ((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9))
>>
>> But then I get a pair of tuples, not a pair of iterators. Basically,
>> I want to convert an iterator of tuples into a tuple of iterators.
>
> Sorry to respond to myself, but after playing around with itertools for a
> while, this seems to work:
>
> >>> import itertools
> >>> starzip = lambda iterables: ((tuple[i] for tuple in itr) for i, itr in enumerate(itertools.tee(iterables)))
> >>> starzip(itertools.izip(range(10), range(10)))
> <generator object at 0x008DED28>
> >>> x, y = starzip(itertools.izip(range(10), range(10)))
> >>> x
> <generator object at 0x008E1058>
> >>> y
> <generator object at 0x008E1080>
> >>> list(x)
> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
> >>> list(y)
> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>
> Seems like a bit of work for the inverse of izip though so I'll wait to
> see if anyone else has a better solution. (Not to mention, it wouldn't
> be a single line solution if I wasn't using 2.4...)


Because Python supports function definitions you only have to do it once :-)

However, your sample data is badly chosen. Unless I have made a typo
repeating your demo, you are getting the same (last) sequence twice due to
late binding of i.

>>> import itertools as it
>>> def starzip(iterables):
...     return ((t[i] for t in itr) for (i, itr) in
...             enumerate(it.tee(iterables)))
...
>>> map(list, starzip(it.izip("123", "abc")))
[['1', '2', '3'], ['a', 'b', 'c']]
>>> x, y = starzip(it.izip("123", "abc"))
>>> list(x)
['a', 'b', 'c']
>>> list(y)
['a', 'b', 'c']


Here's my fix.

# requires Python 2.4
def cut(itr, index):
    # avoid late binding of index
    return (item[index] for item in itr)

def starzip(tuples):
    a, b = it.tee(tuples)
    try:
        tuple_len = len(a.next())
    except StopIteration:
        raise ValueError(
            "starzip() does not allow an empty sequence as argument")
    t = it.tee(b, tuple_len)
    return (cut(itr, index) for (index, itr) in enumerate(t))

a, b, c = starzip(it.izip("abc", [1,2,3], "xyz"))
print a, b, c
assert list(a) == list("abc")
assert list(b) == [1, 2, 3]
assert list(c) == list("xyz")

Peter

Jul 18 '05 #7
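
A possible Python 3 port of the fix above, offered as a sketch rather than as the author's code. The assumptions are: zip() replaces izip, next(a) replaces a.next(), and the result is returned as a tuple instead of a generator so that it can be indexed; the logic otherwise follows the posted version.

```python
import itertools

def cut(itr, index):
    # index is a parameter here, so each column keeps its own value
    # (this is what avoids the late-binding problem).
    return (item[index] for item in itr)

def starzip(tuples):
    # Peek at the first item through one tee branch to learn the width.
    a, b = itertools.tee(tuples)
    try:
        tuple_len = len(next(a))
    except StopIteration:
        raise ValueError(
            "starzip() does not allow an empty sequence as argument") from None
    # One independent branch per column, each wrapped in its own cut().
    branches = itertools.tee(b, tuple_len)
    return tuple(cut(itr, index) for index, itr in enumerate(branches))

cols = starzip(zip("abc", [1, 2, 3], "xyz"))
print([list(col) for col in cols])
```

As in the original, an empty input is rejected because the number of columns cannot be inferred from it.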
Steven Bethard wrote:
> Satchidanand Haridas <sharidas <at> zeomega.com> writes:
>> How about using iter() to get another solution like the following:
>>
>> >>> starzip2 = lambda it: tuple([iter(x) for x in itertools.izip(*it)])
>> >>> l,m = starzip2(itertools.izip(range(10),range(10)))
>> >>> l
>> <tupleiterator object at 0x4016802c>
>> >>> m
>> <tupleiterator object at 0x4016896c>
>> >>> list(l)
>> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>> >>> list(m)
>> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>
> Unfortunately, I think this exhausts the iterators too early because it
> applies * to the iterator:

Could you expand on what you mean by exhaust the iterators too early?

The reason I ask is that the * operator is applied to
((1,1),(2,2),....(9,9)). The operation of the
itertools.izip(range10(),range10()) is completed before the * operation
is applied. And the iter() simply converts the result of the inverse
izip operation into an iterator. I hope the above was not too
confusing. :-)

I am trying to understand a little about the izip myself. Thanks.

Satchit

> >>> def range10():
> ...     for i in range(10):
> ...         yield i
> ...     print "exhausted"
> ...
> >>> l,m = starzip2(itertools.izip(range10(),range10()))
> exhausted
>
> I believe we only get one "exhausted" because as soon as one iterator is used
> up with izip, the next iterator is discarded. But we are hitting "exhausted"
> before we ever ask for an element from the starzip2 iterators, so it looks to
> me like all the pairs from the first iterator are read into memory before the
> second iterators are ever accessed...
>
> Steve


Could you expand on what you mean by exhaust the iterators too early?

The reason I ask is that the * operator is applied to the tuple
((1,1),(2,2),...(9,9)). Actually to the iterator which is called 10
times, each time returning (i,i) for 0<=i<10. When the iterator is
called the 11th time, it prints "exhausted".

So the operation of the itertools.izip(range10(),range10()) is completed
and "exhausted" is printed before the * operation is applied. The iter()
simply converts the result of the inverse izip operation into an
iterator. I hope the above was not too confusing. :-)

I am trying to understand what goes on inside izip myself. Thanks.

Regards,
Satchit


Jul 18 '05 #8
Peter Otten <__*******@web.de> wrote in message news:<cg*************@news.t-online.com>...
> However, your sample data is badly chosen. Unless I have made a typo
> repeating your demo, you are getting the same (last) sequence twice due to
> late binding of i.
>
> [snip]
>
> >>> map(list, starzip(it.izip("123", "abc")))
> [['1', '2', '3'], ['a', 'b', 'c']]
> >>> x, y = starzip(it.izip("123", "abc"))
> >>> list(x)
> ['a', 'b', 'c']
> >>> list(y)
> ['a', 'b', 'c']


I knew there was something funny about binding in generators, but I
couldn't remember what... Could you explain why 'map(list, ...)'
works, but 'x, y = ...' doesn't? I read the PEP, but I'm still not
clear on this point.

Thanks,

Steve
Jul 18 '05 #9
Satchidanand Haridas <sh******@zeomega.com> wrote in message news:<ma**************************************@python.org>...
> Could you expand on what you mean by exhaust the iterators too early?
>
> The reason I ask is that the * operator is applied to
> ((1,1),(2,2),....(9,9)). The operation of the
> itertools.izip(range10(),range10()) is completed before the * operation
> is applied. And the iter() simply converts the result of the inverse
> izip operation into an iterator. I hope the above was not too
> confusing. :-)


Yeah, the difference is a little subtle here. What we have before you
use the * operator is an iterator that will yield (1,1) then (2,2) up
to (9,9). Note that we don't actually have the tuple
((1,1),(2,2),....(9,9)) yet, just an iterator that will produce the
same elements. If your list is very large and you don't want to keep
it all in memory at once, it's crucial that we have the iterator here,
not the tuple.

When you use the * operator, Python converts the iterable following
the * into the argument list of the function. This means that if
you're using an iterable, it reads all of the elements of the iterable
into memory at once. That's why my range10 iterators printed
"exhausted" after the * application -- all their elements had been
read into memory. Again, if your list is very large, this is a bad
thing because you now have all the elements of the list in memory at
the same time. My other solution (well, Peter Otten's correction of
my solution) never has the whole list in memory at the same time --
each time enumerate generates a tuple and its index, each of the
iterators returned by starzip generates its appropriate item.[*]

Steve
[*] Of course, if you exhaust one of the iterators before the others,
itertools.tee's implicit cache will actually store all the elements,
so starzip would really only be efficient if you wanted to iterate
through the sub-iterators in lockstep. This means you'd probably want
to itertools.izip them back together at some point, but being able to
starzip them means you can wrap the individual iterators with extra
functionality if necessary.
Jul 18 '05 #10
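
The lockstep caveat in the footnote can be observed directly. The following Python 3 sketch is illustrative only: `source`, `column`, and the `pulled` log are hypothetical names, and the two columns are built with the same tee-plus-index approach discussed in the thread.

```python
import itertools

pulled = []   # records every item actually drawn from the source

def source():
    for i in range(5):
        pulled.append(i)
        yield (i, i * 10)

def column(itr, index):
    # index is bound per call, so each column keeps its own position
    return (item[index] for item in itr)

left, right = (column(itr, i)
               for i, itr in enumerate(itertools.tee(source(), 2)))
before = list(pulled)             # nothing has been drawn yet

first = (next(left), next(right))
after_one_step = list(pulled)     # lockstep: a single source item drawn

rest_left = list(left)            # running ahead forces tee to buffer
after_left_done = list(pulled)    # the source is now fully drained
rest_right = list(right)          # served from tee's internal buffer

print(before, after_one_step, after_left_done)
```

Stepping both columns together draws one source item at a time; exhausting `left` first drains the source and leaves the remaining items buffered inside tee until `right` consumes them.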
Steven Bethard wrote:
> Peter Otten <__*******@web.de> wrote in message
> news:<cg*************@news.t-online.com>...
>> However, your sample data is badly chosen. Unless I have made a typo
>> repeating your demo, you are getting the same (last) sequence twice due
>> to late binding of i.
>
> [snip]
>
>> >>> map(list, starzip(it.izip("123", "abc")))
>> [['1', '2', '3'], ['a', 'b', 'c']]
>> >>> x, y = starzip(it.izip("123", "abc"))
>> >>> list(x)
>> ['a', 'b', 'c']
>> >>> list(y)
>> ['a', 'b', 'c']
>
> I knew there was something funny about binding in generators, but I
> couldn't remember what... Could you explain why 'map(list, ...)'
> works, but 'x, y = ...' doesn't? I read the PEP, but I'm still not
> clear on this point.


Maybe the following example can illustrate what I think is going on:

import itertools as it

def starzip(iterables):
    return ((t[i] for t in itr)
            for (i, itr) in
            enumerate(it.tee(iterables)))

# the order of calls equivalent to map(list, starzip(...))
s = starzip(it.izip("abc", "123"))
x = s.next()
# the local variable i in starzip() shared by x and y
# is now 0
print x.next(),
print x.next(),
print x.next()
y = s.next()
# i is now 1, but because no further calls to x.next()
# will occur it doesn't matter
print y.next(),
print y.next(),
print y.next()

s = starzip(it.izip("abc", "123"))
x = s.next() # i is 0
y = s.next() # i is 1
# both x and y yield t[1]
print x.next(),
print x.next(),
print x.next()

print y.next(),
print y.next(),
print y.next()

You can model the nested generator expressions' behaviour with the following
function - which I think is much clearer.

def starzip(iterables):
    def inner(itr):
        for t in itr:
            yield t[i]

    for (i, itr) in enumerate(it.tee(iterables)):
        yield inner(itr)

Note how itr is passed explicitly, i.e. it is not affected by later
rebindings in starzip() whereas i is looked up in inner()'s surrounding
namespace at every yield.

Peter

Jul 18 '05 #11
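
The same late-binding behaviour can still be reproduced on Python 3. The sketch below assumes zip() in place of izip and uses the hypothetical name `starzip_buggy` to make clear that it deliberately keeps the flaw under discussion.

```python
import itertools

def starzip_buggy(iterables):
    # `itr` is evaluated when each inner generator is created, but `i` is
    # looked up in the enclosing scope each time an item is produced.
    return ((t[i] for t in itr)
            for i, itr in enumerate(itertools.tee(iterables)))

# Consuming each column right after it is created: i still has the
# value that belongs to that column.
one_at_a_time = [list(col) for col in starzip_buggy(zip("abc", "123"))]

# Unpacking first advances the outer generator to the end, so i is
# already 1 by the time either column is read.
x, y = starzip_buggy(zip("abc", "123"))
both = (list(x), list(y))

print(one_at_a_time)
print(both)
```

The first result holds the two expected columns; in the second, both columns come out as the digits, which mirrors the difference between map(list, ...) and x, y = ... described in the post.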
Peter Otten <__peter__ <at> web.de> writes:
> You can model the nested generator expressions' behaviour with the following
> function - which I think is much clearer.
>
> def starzip(iterables):
>     def inner(itr):
>         for t in itr:
>             yield t[i]
>
>     for (i, itr) in enumerate(it.tee(iterables)):
>         yield inner(itr)
>
> Note how itr is passed explicitly, i.e. it is not affected by later
> rebindings in starzip() whereas i is looked up in inner()'s surrounding
> namespace at every yield.


Thanks, that was really helpful! It also clarifies why your solution works
right; your code basically does:

def starzip(iterables):
    def inner(itr, i):
        for t in itr:
            yield t[i]

    for i, itr in enumerate(itertools.tee(iterables)):
        yield inner(itr, i)

where i is now passed explicitly too.

Thanks again,

Steve

Jul 18 '05 #12
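
For completeness, a Python 3 rendering of the version with i passed explicitly. This is a sketch assuming zip() in place of izip; like the posted code, it relies on tee's default of two branches and therefore only handles pairs.

```python
import itertools

def starzip(iterables):
    def inner(itr, i):
        # i is now a parameter of inner(), fixed when inner() is called
        for t in itr:
            yield t[i]

    # tee() defaults to two branches, so this handles 2-tuples only
    for i, itr in enumerate(itertools.tee(iterables)):
        yield inner(itr, i)

x, y = starzip(zip("abc", "123"))
letters, digits = list(x), list(y)
print(letters, digits)
```

Unpacking into x and y before reading either column now gives the letters and the digits separately, because each inner generator carries its own index.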
Hi Steve,

Thanks for the explanation. I understand izip a little better now.

Regards,
Satchit


Jul 18 '05 #13
