Efficiently Split A List of Tuples

Richard

I have a large list of two element tuples. I want two separate
lists: One list with the first element of every tuple, and the
second list with the second element of every tuple.

Each tuple contains a datetime object followed by an integer.

Here is a small sample of the original list:

((datetime.datetime(2005, 7, 13, 16, 0, 54), 315),
(datetime.datetime(2005, 7, 13, 16, 6, 12), 313),
(datetime.datetime(2005, 7, 13, 16, 16, 45), 312),
(datetime.datetime(2005, 7, 13, 16, 22), 315),
(datetime.datetime(2005, 7, 13, 16, 27, 18), 312),
(datetime.datetime(2005, 7, 13, 16, 32, 35), 307),
(datetime.datetime(2005, 7, 13, 16, 37, 51), 304),
(datetime.datetime(2005, 7, 13, 16, 43, 8), 307))

I know I can use a 'for' loop and create two new lists
using 'newList1.append(x)', etc. Is there an efficient way
to create these two new lists without using a slow for loop?

r

Jul 21 '05 #1


Paul Rubin

Richard <no**@pacbell.net> writes:

I have a large list of two element tuples. I want two separate
lists: One list with the first element of every tuple, and the
second list with the second element of every tuple.

I know I can use a 'for' loop and create two new lists
using 'newList1.append(x)', etc. Is there an efficient way
to create these two new lists without using a slow for loop?

Not really. You could get a little cutesy with list comprehensions
to keep the code concise, but the underlying process would be about
the same:

a = ((1,2), (3, 4), (5, 6), (7, 8), (9, 10))
x,y = [[z[i] for z in a] for i in (0,1)]
# x is now [1, 3, 5, 7, 9] and y is [2, 4, 6, 8, 10]
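
For data shaped like the sample in the question, the same idea can be sketched with one comprehension per column. This is only an illustration: the names rows, times and counts are made up here and do not come from the thread.

```python
import datetime

# A few rows in the shape described in the question: (datetime, int).
rows = [
    (datetime.datetime(2005, 7, 13, 16, 0, 54), 315),
    (datetime.datetime(2005, 7, 13, 16, 6, 12), 313),
    (datetime.datetime(2005, 7, 13, 16, 16, 45), 312),
]

# One list comprehension per column; each makes its own pass over rows.
times = [row[0] for row in rows]
counts = [row[1] for row in rows]

print(counts)  # [315, 313, 312]
```

Each comprehension is still a loop under the hood, which is the point being made above.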

Jul 21 '05 #2

Peter Hansen

Richard wrote:

I have a large list of two element tuples. I want two separate
lists: One list with the first element of every tuple, and the
second list with the second element of every tuple.

Variant of Paul's example:

a = ((1,2), (3, 4), (5, 6), (7, 8), (9, 10))
zip(*a)

or
[list(t) for t in zip(*a)] if you need lists instead of tuples.

(I believe this is something Guido considers an "abuse of *args", but I
just consider it an elegant use of zip() considering how the language
defines *args. YMMV)
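
A note for anyone running this on Python 3 (the thread predates it): zip() there returns a lazy iterator rather than a list, so the result has to be unpacked or wrapped before it can be inspected. A minimal sketch, assuming Python 3:

```python
a = ((1, 2), (3, 4), (5, 6), (7, 8), (9, 10))

# Unpacking the zip object directly gives two tuples.
first, second = zip(*a)

# Wrapping each column in list() gives two lists instead.
x, y = map(list, zip(*a))

print(first)  # (1, 3, 5, 7, 9)
print(y)      # [2, 4, 6, 8, 10]
```

Unpacking into exactly two names assumes the input is non-empty and every tuple has two elements.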

-Peter

Jul 21 '05 #3

Joseph Garvin

Peter Hansen wrote:

(I believe this is something Guido considers an "abuse of *args", but I
just consider it an elegant use of zip() considering how the language
defines *args. YMMV)

-Peter

An abuse?! That's one of the most useful things to do with it. It's
transpose.

Jul 21 '05 #4

Peter Hansen

Joseph Garvin wrote:

Peter Hansen wrote:
(I believe this is something Guido considers an "abuse of *args", but
I just consider it an elegant use of zip() considering how the
language defines *args. YMMV)

-Peter

An abuse?! That's one of the most useful things to do with it. It's
transpose.

Note that it's considered (as I understand) an abuse of "*args", not an
abuse of "zip". I can see a difference...

-Peter

Jul 21 '05 #5

Richard

On Wed, 13 Jul 2005 20:53:58 -0400, Peter Hansen wrote:

a = ((1,2), (3, 4), (5, 6), (7, 8), (9, 10))
zip(*a)

This seems to work. Thanks.

Where do I find documentation on "*args"?

Jul 21 '05 #6

Peter Hansen

Richard wrote:

On Wed, 13 Jul 2005 20:53:58 -0400, Peter Hansen wrote:
a = ((1,2), (3, 4), (5, 6), (7, 8), (9, 10))
zip(*a)
This seems to work. Thanks.

Where do I find documentation on "*args"?

In the language reference: http://docs.python.org/ref/calls.html#calls
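
As a quick illustration of what that section describes, here is a small sketch of * on both sides of a call. The function name show is made up for the example.

```python
def show(*args):
    # In a definition, *args collects the positional arguments into a tuple.
    return args

pairs = [(1, 2), (3, 4)]

# In a call, *pairs spreads the list into separate positional arguments,
# so these two calls are equivalent.
assert show(*pairs) == show((1, 2), (3, 4))

print(show(*pairs))  # ((1, 2), (3, 4))
```

This is why zip(*a) works: each inner tuple of a becomes one argument to zip().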

-Peter

Jul 21 '05 #7

Raymond Hettinger

> Variant of Paul's example:

a = ((1,2), (3, 4), (5, 6), (7, 8), (9, 10))
zip(*a)

or

[list(t) for t in zip(*a)] if you need lists instead of tuples.

[Peter Hansen] (I believe this is something Guido considers an "abuse of *args", but I
just consider it an elegant use of zip() considering how the language
defines *args. YMMV)

It is somewhat elegant in terms of expressiveness; however, it is also
a bit disconcerting in light of the underlying implementation.

All of the tuples are loaded one-by-one onto the argument stack. For a
few elements, this is no big deal. For large datasets, it is a less
than ideal way of transposing data.

Guido's reaction makes sense when you consider that most programmers
would cringe at a function definition with thousands of parameters.
There is a sense that this doesn't scale-up very well (with each Python
implementation having its own limits on how far you can push this
idiom).
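
One way to transpose a large list of pairs without spreading it across an argument list is a plain single-pass loop. The sketch below is only an illustration of that alternative; split_pairs is a hypothetical helper name, not something from the standard library.

```python
def split_pairs(pairs):
    """Split an iterable of 2-tuples into two lists in a single pass."""
    firsts, seconds = [], []
    for first, second in pairs:
        firsts.append(first)
        seconds.append(second)
    return firsts, seconds

# Works with any iterable, including a generator, because the pairs are
# consumed one at a time instead of being passed as separate arguments.
xs, ys = split_pairs((i, i * i) for i in range(5))

print(xs)  # [0, 1, 2, 3, 4]
print(ys)  # [0, 1, 4, 9, 16]
```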

Raymond

Jul 21 '05 #8

Raymond Hettinger

[Richard]

I know I can use a 'for' loop and create two new lists
using 'newList1.append(x)', etc. Is there an efficient way
to create these two new lists without using a slow for loop?

If trying to optimize before writing and timing code, then at least
validate your assumptions. In Python, for-loops are blazingly fast.
They are almost never the bottleneck. Python is not Matlab --
"vectorizing" for-loops only pays off when a high-speed functional
happens to exactly match your needs (in this case, zip() happens to be a
good fit).

Even when a functional offers a speed-up, much of the gain is likely
due to implementation-specific optimizations which allocate result
lists all at once rather than building them one at a time.

Also, for all but the most simple inner-loop operations, the for-loop
overhead is almost always dominated by the time to execute the operation
itself.

Executive summary: Python's for-loops are both elegant and fast. It
is a mistake to habitually avoid them.
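
To check the assumption on your own machine rather than guess, something like the following timeit sketch could be used. It assumes Python 3; the data size, the repeat count and the function names are arbitrary choices for illustration, and no particular result is implied.

```python
import timeit

data = [(i, i + 1) for i in range(100_000)]

def with_loop():
    xs, ys = [], []
    for x, y in data:
        xs.append(x)
        ys.append(y)
    return xs, ys

def with_zip():
    xs, ys = zip(*data)
    return list(xs), list(ys)

# Both approaches must agree before their timings mean anything.
assert with_loop() == with_zip()

for fn in (with_loop, with_zip):
    seconds = timeit.timeit(fn, number=5)
    print(f"{fn.__name__}: {seconds:.3f} s for 5 runs")
```

The numbers will vary by interpreter version and data shape, which is the reason to measure.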

Raymond

Jul 21 '05 #9

Ron Adam

Raymond Hettinger wrote:

Variant of Paul's example:

a = ((1,2), (3, 4), (5, 6), (7, 8), (9, 10))
zip(*a)

or

[list(t) for t in zip(*a)] if you need lists instead of tuples.

[Peter Hansen]
(I believe this is something Guido considers an "abuse of *args", but I
just consider it an elegant use of zip() considering how the language
defines *args. YMMV)

It is somewhat elegant in terms of expressiveness; however, it is also
a bit disconcerting in light of the underlying implementation.

All of the tuples are loaded one-by-one onto the argument stack. For a
few elements, this is no big deal. For large datasets, it is a less
than ideal way of transposing data.

Guido's reaction makes sense when you consider that most programmers
would cringe at a function definition with thousands of parameters.
There is a sense that this doesn't scale-up very well (with each Python
implementation having its own limits on how far you can push this
idiom).

Raymond

Currently we can implicitly unpack a tuple or list by using an
assignment. How is that any different than passing arguments to a
function? Does it use a different mechanism?

(Warning, going into what-if land.)

There's a question relating to the above also so it's not completely in
outer space. :-)

We can't use the * syntax anywhere but in function definitions and
calls. I was thinking the other day that using * in function calls is
kind of inconsistent as it's not used anywhere else to unpack tuples.
And it does the opposite of what it means in the function definitions.

So I was thinking, in order to have explicit packing and unpacking
outside of function calls and function definitions, we would need
different symbols because using * in other places would conflict with
the multiply and exponent operators. Also pack and unpack should not be
the same symbols for obvious reasons. Using different symbols doesn't
conflict with * and ** in function calls as well.

So for the following examples, I'll use '~' as pack and '^' as unpack.

~ looks like a small 'N', for put stuff 'in'.
^ looks like an up arrow, as in take stuff out.

(Yes, I know they are already used elsewhere. Currently those are
binary operators. The '^' is used with sets also. I did say this is a
"what-if" scenario. Personally I think the binary operators could be
made methods of a bit type, then they, including the '>>' '<<' pair,
could be freed up and put to better use. The '<<' would make a nice
symbol for getting values from an iterator. The '>>' is already used in
print as redirect.)

Simple explicit unpacking would be:

(This is a silly example, I know it's not needed here but it's just to
show the basic pattern.)

x = (1,2,3)
a,b,c = ^x # explicit unpack, take stuff out of x

So, then you could do the following.

zip(^a) # unpack 'a' and give its items to zip.

Would that use the same underlying mechanism as using "*a" does? Is it
also the same implicit unpacking method used in an assignment using
'='? Would it be any less "a bit disconcerting in light of the
underlying implementation"?

Other possible ways to use them outside of function calls:

Sequential unpacking..

x = [(1,2,3)]
a,b,c = ^^x -> a=1, b=2, c=3

Or..

x = [(1,2,3),4]
a,b,c,d = ^x[0],x[1] -> a=1, b=2, c=3, d=4

I'm not sure what it should do if you try to unpack an item not in a
container. I expect it should give an error because a tuple or list was
expected.

a = 1
x = ^a # error!

Explicit packing would not be as useful as we can put ()'s or []'s
around things. One example that comes to mind at the moment is using it
to create single item tuples.

x = ~1 -> (1,)

Possible converting strings to tuples?

a = 'abcd'
b = ~^a -> ('a','b','c','d') # explicit unpack and repack

and:

b = ~a -> ('abcd',) # explicit pack whole string

for:

b = a, -> ('abcd',) # trailing comma is needed here.
# This is an error opportunity IMO

Choice of symbols aside, packing and unpacking are a very big part of
Python, it just seems (to me) like having an explicit way to express it
might be a good thing.

It doesn't do anything that can't already be done, of course. I think
it might make some code easier to read, and possibly avoid some errors.

Would there be any (other) advantages to it beside the syntax sugar?

Is it a horrible idea for some unknown reason I'm not seeing? (Other
than the symbol choices breaking current code. Maybe other symbols
would work just as well?)

Regards,
Ron

Jul 21 '05 #10

Simon Dahlbacka

Oooh.. you make my eyes bleed. IMO that proposal is butt ugly (and
looks like the C++.NET perversions.)

Jul 21 '05 #11

Raymond Hettinger

[Ron Adam]

Currently we can implicitly unpack a tuple or list by using an
assignment. How is that any different than passing arguments to a
function? Does it use a different mechanism?

It is the same mechanism, so it is also only appropriate for low
volumes of data:

a, b, c = args # three elements, no problem
f(*xrange(1000000)) # too much data, not scalable, bad idea

Whenever you get the urge to write something like the latter, then take
it as a cue to be passing iterators instead of unpacking giant tuples.

Raymond
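
A rough sketch of the difference between the two calling styles, assuming Python 3 (range instead of xrange). The function names total_unpacked and total_iterable are invented for this example.

```python
def total_unpacked(*values):
    # Every value arrives as a separate positional argument.
    return sum(values)

def total_iterable(values):
    # The caller hands over one object; items are consumed one at a time.
    return sum(values)

print(total_unpacked(*range(5)))          # 10
print(total_iterable(range(1_000_000)))   # 499999500000
```

The second form keeps working as the input grows, because nothing has to be expanded into an argument list first.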

Jul 21 '05 #12

Steven D'Aprano

On Sun, 17 Jul 2005 19:38:29 -0700, Raymond Hettinger wrote:

Executive summary: Python's for-loops are both elegant and fast. It
is a mistake to habitually avoid them.

And frequently much more readable and maintainable than the alternatives.

I cringe when I see well-meaning people trying to turn Python into Perl,
by changing perfectly good, fast, readable pieces of code into
obfuscated one-liners simply out of some perverse desire to optimize for
the sake of optimization.
--
Steven.

Jul 21 '05 #13

Ron Adam

Raymond Hettinger wrote:

[Ron Adam]
Currently we can implicitly unpack a tuple or list by using an
assignment. How is that any different than passing arguments to a
function? Does it use a different mechanism?

It is the same mechanism, so it is also only appropriate for low
volumes of data:

a, b, c = args # three elements, no problem
f(*xrange(1000000)) # too much data, not scalable, bad idea

Whenever you get the urge to write something like the latter, then take
it as a cue to be passing iterators instead of unpacking giant tuples.

Raymond

Ah... that's what I expected. So it's better to transfer a single
reference or object than a huge list of separated items. I suppose that
would be easy to see in byte code.

In examples like the above, the receiving function would probably be
defined with *args also and not individual arguments. So is it
unpacked, transferred to the function, and then repacked, or unpacked,
repacked and then transferred to the function?

And if the * is used on both sides, couldn't it be optimized to skip the
unpacking and repacking? But then it would need to make a copy wouldn't
it? That should still be better than passing individual references.
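
The observable behaviour can at least be checked from Python itself. The sketch below says nothing about how the interpreter does it internally; it only shows what the callee ends up holding. The function name receive is made up for the example.

```python
def receive(*args):
    return args

items = [1, 2, 3]
received = receive(*items)

# The callee gets a tuple, regardless of what kind of sequence was unpacked.
print(type(received).__name__)  # tuple

# Later changes to the caller's list do not show up in that tuple.
items.append(4)
print(received)  # (1, 2, 3)
```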

Cheers,
Ron

Jul 21 '05 #14

Ron Adam

Simon Dahlbacka wrote:

Oooh.. you make my eyes bleed. IMO that proposal is butt ugly (and
looks like the C++.NET perversions.)

I haven't had the displeasure of using C++.NET, fortunately.

point = [5,(10,20,5)]

size,t = point
x,y,z = t

size,x,y,z = point[0], point[1][0], point[1][1], point[1][2]

size,x,y,z = point[0], ^point[1] # Not uglier than the above.

size,(x,y,z) = point # Not as nice as this.

I forget sometimes that this last one is allowed, so ()'s on the left of
the assignment is an explicit unpack. Seems I've tried to reinvent the
wheel yet again.
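
For later readers: Python 3.5 and newer accept * inside a tuple on the right-hand side of an assignment, which comes close to the '^' spelling above. A small sketch assuming Python 3.5+; the names with a 2 suffix exist only to compare the two forms.

```python
point = [5, (10, 20, 5)]

# Nested target list, as in the last example above.
size, (x, y, z) = point

# Python 3.5+ also allows * in the tuple being built on the right.
size2, x2, y2, z2 = point[0], *point[1]

assert (size, x, y, z) == (size2, x2, y2, z2) == (5, 10, 20, 5)
```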

Cheers,
Ron

Jul 21 '05 #15
