
# Indexing list of lists

This may have been asked before but I can't find it. If I have
a rectangular list of lists, say, l = [[1,10],[2,20],[3,30]], is
there a handy syntax for retrieving the ith item of every sublist?
I know about [i[0] for i in l] but I was hoping for something more
like l[;0].

Hilde
Jul 18 '05 #1
hi*************@yahoo.de (Hilde Roth) writes:
This may have been asked before but I can't find it. If I have
a rectangular list of lists, say, l = [[1,10],[2,20],[3,30]], is
there a handy syntax for retrieving the ith item of every sublist?
I know about [i[0] for i in l] but I was hoping for something more
like l[;0].

If you need that kind of thing a lot, look at Numeric (or its
replacement, numarray), or perhaps the standard library's array
module.
John
Jul 18 '05 #2
hi*************@yahoo.de (Hilde Roth) wrote:
This may have been asked before but I can't find it. If I have
a rectangular list of lists, say, l = [[1,10],[2,20],[3,30]], is
there a handy syntax for retrieving the ith item of every sublist?
I know about [i[0] for i in l] but I was hoping for something more
like l[;0].

>>> l = [[1,10],[2,20],[3,30]]
>>> zip(*l)[0]
(1, 2, 3)

--
David Eppstein http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science
Jul 18 '05 #3
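Under Python 3, `zip` returns a lazy iterator rather than a list, so the `zip(*l)[0]` idiom above needs an explicit conversion first. A minimal sketch, assuming Python 3; the variable names are illustrative only:

```python
rows = [[1, 10], [2, 20], [3, 30]]

# zip(*rows) yields one tuple per column; list() materialises all of them.
columns = list(zip(*rows))
first_column = columns[0]

# If only the first column is needed, next() avoids building the others.
first_only = next(zip(*rows))

print(first_column)  # (1, 2, 3)
print(first_only)    # (1, 2, 3)
```

Either way the rows are still unpacked as arguments to `zip`, so this is no cheaper than the original for very long lists.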
Hilde Roth wrote:
This may have been asked before but I can't find it. If I have
a rectangular list of lists, say, l = [[1,10],[2,20],[3,30]], is
there a handy syntax for retrieving the ith item of every sublist?
I know about [i[0] for i in l] but I was hoping for something more
like l[;0].

If efficiency is not an issue and/or you need
[item[index] for item in theList] for more than one index at a time, you can
do:
>>> s = [[1,2],[3,4]]
>>> t = zip(*s)
>>> t
[(1, 3), (2, 4)]
>>> t[1]
(2, 4)

This creates a transposed (?) copy of the "matrix". The side effect of
creating tuples instead of inner lists should do no harm if you need only

Peter
Jul 18 '05 #4
Thanks for the suggestion but zip is not nice for large lists
and as for array/numpy, although I chose a numeric example in
the posting, I don't see why only numeric arrays should enjoy
the benefit of such a notation.

l[;0] is illegal right now but does anyone know of any other bit
of syntax it might conflict with if proposed as an extension?

Hilde
Jul 18 '05 #5
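For large lists, one way around the transposed copy that `zip(*l)` builds is to walk the rows lazily. The sketch below is only an illustration, assuming Python 3; `column` is a hypothetical helper name, not something in the standard library:

```python
from operator import itemgetter


def column(rows, i):
    """Return a lazy iterator over the i-th item of every row."""
    return map(itemgetter(i), rows)


rows = [[1, 10], [2, 20], [3, 30]]

# Items are produced one at a time; nothing is transposed up front.
total = sum(column(rows, 1))
second = list(column(rows, 1))

print(total)   # 60
print(second)  # [10, 20, 30]
```

Because `itemgetter` only relies on subscripting, this works for rows holding any kind of object, not just numbers.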
Hilde Roth wrote:
> Thanks for the suggestion but zip is not nice for large lists
> and as for array/numpy, although I chose a numeric example in
> the posting, I don't see why only numeric arrays should enjoy
> the benefit of such a notation.
>
> l[;0] is illegal right now but does anyone know of any other bit
> of syntax it might conflict with if proposed as an extension?

I think that in alist[from:to:step] the step argument is already overkill.
If there is sufficient demand for column extraction, I would rather make it
a method of list, as alist[:columnIndex] can easily be confused with
alist[;toIndex] (or was it the other way round :-). Would you allow
slicing, too, or make slicing and column extraction mutually exclusive?
Here's how to extract rows 2,4,6 and then columns 4 to 5:

m = n[2:7:2;4:6] # not valid python

Also, ";" is already used (though seldom found in real code) as an alternate
way to delimit statements.
So your suggestion might further complicate the compiler without compelling
benefits over the method approach.

Peter
Jul 18 '05 #6
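The "method of list" idea can be tried today with a small subclass. This is a rough sketch under Python 3, not a proposal for the built-in type; the class name `Table` and the method name `column` are made up for the example:

```python
class Table(list):
    """A list of rows with a column() method (hypothetical name)."""

    def column(self, index):
        # index may be an int or a slice; it is applied to each row in turn.
        return [row[index] for row in self]


n = Table([[1, 10, 100], [2, 20, 200], [3, 30, 300], [4, 40, 400]])

first = n.column(0)
# Rows 1 and 3, then columns 1 onward: slice the rows, then each row.
block = Table(n[1:4:2]).column(slice(1, None))

print(first)  # [1, 2, 3, 4]
print(block)  # [[20, 200], [40, 400]]
```

Slicing a `Table` returns a plain list in Python 3, which is why the sliced rows are wrapped in `Table` again before calling `column`.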
> Would you allow slicing, too, or make slicing and column extraction
> mutually exclusive?

dimensions of the array, which there doesn't seem to be any handy way
of doing right now. So yes I would allow both: within each ";" separated
index group, the current syntax (whatever it is) would apply.

> Also, ";" is already used (though seldom found in real code) as an
> alternate way to delimit statements.

I doubt it can be found within square brackets, which is not difficult
to disambiguate.

While we are at it, I also don't understand why sequences can't be
used as indices. Why not, say, l[[2,3]] or l[(2, 3)]? Why a special
slice concept? To me, it's not just the step argument in the slice
that is overkill...

Hilde
Jul 18 '05 #7
Hilde Roth wrote:
Would you allow slicing, too, or make slicing and column extraction
mutually exclusive?
dimensions of the array, which there doesn't seem to be any handy way

Like it or not, there are no "different dimensions", just lists of lists of
lists... so the N dimension case would resolve to (N-1) 2 dimension
operations.
While we are at it, I also don't understand why sequences can't be
used as indices. Why not, say, l[[2,3]] or l[(2, 3)]? Why a special
slice concept? To me, it's not just the step argument in the slice
that is overkill...

(a) Why not alist[[2, -3, 7]]?
OK with me.

[alist[2], alist[-3], alist[7]]

and

[alist[i] for i in [2, -3, 7]]

are not particularly cumbersome, though.

(b) Why a special slice concept?
It covers the most common cases of list item extraction with a concise
syntax.

Peter
Jul 18 '05 #8
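For picking several arbitrary positions without spelling out each subscript, the standard library's `operator.itemgetter` is one option. A short sketch, assuming Python 3; the sample data is invented:

```python
from operator import itemgetter

alist = list(range(0, 100, 10))  # [0, 10, ..., 90]

# itemgetter with several indices returns a tuple of the selected items.
pick = itemgetter(2, -3, 7)
picked = pick(alist)

print(picked)  # (20, 70, 70)
```

Note that the result is a tuple, and that with a single index `itemgetter` returns the bare item rather than a one-element tuple.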
hi*************@yahoo.de (Hilde Roth) writes:
While we are at it, I also don't understand why sequences can't be
used as indices. Why not, say, l[[2,3]] or l[(2, 3)]? Why a special
slice concept? To me, it's not just the step argument in the slice
that is overkill...

1. It will be more typing and harder to visually parse

l[:3] would be l[(0, 3)]
l[3:] would be l[(3,-1)]

2. Slicing a two-dimensional object will not be possible, as the notation
you proposed is already used just for that (e.g. l[1,2], which is equivalent
to l[(1,2)], see below), and Numeric and numarray use it. See what
happens with an object which on indexing just returns the index:

>>> class C:
...     def __getitem__(self, i): return i
...
>>> c = C()
>>> c[3]
3
>>> c[:3]
slice(0, 3, None)
>>> # multi dimensional indexing
>>> c[1,3]
(1, 3)
>>> c[1:3,3:5]
(slice(1, 3, None), slice(3, 5, None))

--

=*= Lukasz Pankowski =*=
Jul 18 '05 #9
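The same experiment can be repeated under Python 3 to see exactly what object the subscript syntax hands to `__getitem__`. A minimal sketch; the class name `Echo` is arbitrary:

```python
class Echo:
    """Return whatever object the subscript syntax passes in."""

    def __getitem__(self, index):
        return index


c = Echo()

print(c[3])          # 3
print(c[:3])         # slice(None, 3, None)
print(c[1, 3])       # (1, 3)
print(c[1:3, 3:5])   # (slice(1, 3, None), slice(3, 5, None))
print(c[1, ..., 2])  # (1, Ellipsis, 2)
```

Python 3 reports an omitted bound as `None`, where the session quoted above shows `slice(0, 3, None)`. The comma form simply builds a tuple, so `c[1, 3]` and `c[(1, 3)]` pass the same object.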
hi*************@yahoo.de (Hilde Roth) writes:
[...]
> of doing right now. So yes I would allow both: within each ";" separated
> index group, the current syntax (whatever it is) would apply.

Ain't going to happen. If you want that kind of thing without forking
your own version of Python, Numeric/numarray is the closest you'll get
(no special syntax, but lots of useful functions and 'ufuncs').

[...]
> While we are at it, I also don't understand why sequences can't be
> used as indices. Why not, say, l[[2,3]] or l[(2, 3)]? Why a special

[...]

I'd guess you can subclass numarray's arrays and get this behaviour.
Or simply write your own sequence object and override __getitem__.

I'd guess it's highly unlikely ever to be part of the standard
sequence protocol, though.
John
Jul 18 '05 #10
hi*************@yahoo.de (Hilde Roth) writes:
> Thanks for the suggestion but zip is not nice for large lists
> and as for array/numpy, although I chose a numeric example in
> the posting, I don't see why only numeric arrays should enjoy
> the benefit of such a notation.

Numeric (and presumably numarray) can handle arbitrary Python objects.

> l[;0] is illegal right now but does anyone know of any other bit
> of syntax it might conflict with if proposed as an extension?

BTW, just remembered that Numeric/numarray *does* use commas for
multi-dimensional indexing. Apparently (glancing at the language ref)
that's not actually indexing with a tuple, but part of Python's
syntax. Same goes for that other obscure bit of Python syntax, the
ellipsis, as used by Numeric: foo[a, ..., b].
John
Jul 18 '05 #11
jj*@pobox.com (John J. Lee) writes:
[...]
(no special syntax, but lots of useful functions and 'ufuncs').

[...]

Oops, wrong. See my other post.
John
Jul 18 '05 #12
> Like it or not, there are no "different dimensions", just lists of lists
> of lists...

You are being too literal. A list of lists is like a 2D array from an
indexing point of view, a list of lists of lists like a 3D array etc.
E.g., (((1,10),(2,20),(3,30)),((-1,'A'),(-2,'B'),(-3,'C'))) is a
2 x 3 x 2 rectangular data structure and has 3 dimensions. Hence,
e.g., l[0;2;1] ~ l[0][2][1] = 30

> [alist[2], alist[-3], alist[7]] and [alist[i] for i in [2, -3, 7]]

I agree that comprehensions alleviate the problem to an extent.
However the first notation is definitely cumbersome for all but the
shortest index lists.

> It covers the most common cases of list item extraction with a concise
> syntax.

Maybe but
1/ it is more or less redundant: the (x)range syntax could have been
extended with the same effect
2/ it lacks generality since it can only generate arithmetic progressions

Hilde
Jul 18 '05 #13
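The relationship between slices and ranges raised here can be made concrete: a slice object can be converted to the equivalent range for a given sequence length with `slice.indices`. A small sketch, assuming Python 3 and invented sample data:

```python
l = list(range(0, 100, 10))  # [0, 10, ..., 90]

s = slice(1, 8, 2)

# A slice object can be turned into the equivalent range for a given length.
start, stop, step = s.indices(len(l))
via_range = [l[i] for i in range(start, stop, step)]

print(l[s])       # [10, 30, 50, 70]
print(via_range)  # [10, 30, 50, 70]

# Arbitrary (non-arithmetic) index sequences need a comprehension.
primes = [2, 3, 5, 7]
by_primes = [l[i] for i in primes]
print(by_primes)  # [20, 30, 50, 70]
```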
> It will be harder to parse.

No because in many cases the reason why you want to use the syntax
l[seq] is that you already have seq, so you would refer to it by name
and "l[s]" is definitely not hard to parse.

> l[:3] would be l[(0, 3)]
> l[3:] would be l[(3,-1)]

Not at all. I was suggesting to use a semi-colon not a colon. Thus if
l is (10,20,30,40), l[:3] -> (10,20,30) whereas l[(0,3)] -> (10, 40),
i.e., same as in your class-based example, minus the parentheses, which
I now realize are superfluous (odd that python allows you to omit the
parentheses but not the commas).

> 2. Slicing two dimensional object will not be possible as the notion
> you proposed is used just for that (ex. l[1,2] which is equivalent
> to l[(1,2)] see below), and Numeric and numarray use it.

Same misunderstanding as above, I believe. Let, e.g., l be
((1,10,100),(2,20,200),(3,30,300),(4,40,400)). Then l[2:; 1:] ->
[(30, 300), (40, 400)]. This is equivalent to [i[1:] for i in l[2:]]
but, at least to me, it is the index notation that is easier to parse.

Incidentally, it strikes me that there are a lot of superfluous
commas. Why not just (1 10 100) or even 1 10 100 instead of (1,10,100)?
The commas do make the expression harder to parse visually.

> # multi dimensional indexing
> c[1,3]

I disagree that this is "multidimensional". You are passing a list
and getting back a list, so this is still flat. I think you are
confusing dimensionality and cardinality.

> Numeric and numarray use it

This is good if it is true (I have yet to look at these two because
my work is not primarily numerical) but, again, the restriction of this
syntax to arrays, and arrays of numeric values at that, strikes me
as completely arbitrary.

The bottom line is that python claims to be simple but has a syntax
which, in this little corner at least, is neither simple nor regular:
xranges and slices are both sequence abstractions but are used in
different contexts and have different syntaxes; C-style arrays are
treated differently than lists of lists of ..., although they are
conceptually equivalent; numeric structures are treated differently
than non-numeric ones etc etc

Oh well. Maybe a future PEP will streamline all that.

Hilde
Jul 18 '05 #14
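The proposed `l[2:; 1:]` has no equivalent syntax, but its stated meaning can be wrapped in a function. The following is a sketch only, assuming Python 3; `block` is a hypothetical helper name and the selectors are ordinary `slice` objects:

```python
def block(rows, row_sel, col_sel):
    """Apply row_sel to the outer sequence, then col_sel to each row.

    Hypothetical helper; both selectors are ordinary slice objects.
    """
    return [row[col_sel] for row in rows[row_sel]]


l = ((1, 10, 100), (2, 20, 200), (3, 30, 300), (4, 40, 400))

# Stand-in for the proposed l[2:; 1:]
result = block(l, slice(2, None), slice(1, None))

print(result)  # [(30, 300), (40, 400)]
```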
Hilde Roth wrote:
Like it or not, there are no "different dimensions", just lists of lists
of lists...

You are being too literal. A list of lists is like a 2D array from an
indexing point of view, a list of lists of lists like a 3D array etc.
E.g., (((1,10),(2,20),(3,30)),((-1,'A'),(-2,'B'),(-3,'C'))) is a
2 x 3 x 2 rectangular data structure and has 3 dimensions. Hence,
e.g., l[0;2;1] ~ l[0][2][1] = 30

Only if all the sublists are of the same length, which is guaranteed for
a multi-dimensional array, but not for a list of lists.

What do you expect a[;1] to return if a = [[], [1, 2, 3], [4], 5]?

That's why Numeric has a specific type for multi-dimensional arrays.

David

Jul 18 '05 #15
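What the plain comprehension does with that ragged list can be checked directly. A minimal sketch, assuming Python 3; `column` is a hypothetical helper standing in for the proposed `a[;1]`:

```python
a = [[], [1, 2, 3], [4], 5]


def column(rows, i):
    """Hypothetical helper: the i-th item of every row."""
    return [row[i] for row in rows]


try:
    column(a, 1)
    outcome = "ok"
except IndexError:
    # Raised by the first row, [], which has no item 1.
    outcome = "IndexError"
except TypeError:
    # Would be raised by the last element, 5, which is not subscriptable,
    # if the earlier rows had been long enough.
    outcome = "TypeError"

print(outcome)  # IndexError
```

The failure only surfaces when the offending row is reached, since a list of lists carries no guarantee about the shape of its elements.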
> Numeric (and presumably numarray) can handle arbitrary Python objects.

Oh. I will look at them pronto, then. As for:

> I'd guess it's highly unlikely ever to be part of the standard
> sequence protocol, though.

Passing a slice is passing an abstraction of a sequence (since indexing
a list with a slice returns a list). Given that, it seems hard to justify
not accepting the sequence itself... Passing the abstraction rather than
the thing itself should be an optimization, as in xrange vs. range, left
to the discretion of the programmer.

-- O.L.
Jul 18 '05 #16
hi*************@yahoo.de (Hilde Roth) writes:

No because in many cases the reason why you want to use the syntax
l[seq] is that you already have seq, so you would refer to it by name
and "l[s]" is definitely not hard to parse.
So you want l[seq] to be a shorter way for current

[l[i] for i in seq]

Which is pythonic as it is explicit, easier to read if the code is not
written by you yesterday, although in some situations your interpretation
of l[seq] might be guessable from the context.

Not at all. I was suggesting to use a semi-colon not a colon. Thus if
l is (10,20,30,40), l[:3] -> (10,20,30) whereas l[(0,3)] -> (10, 40),
i.e., same as in your class-based example, minus the parentheses, which
I now realize are superfluous (odd that python allows you to omit the
parentheses but not the commas).
Sorry for my misunderstanding. Yes, it would be nice to have the
possibility to index a sequence with a list of indices. Here is a pure
Python (>= 2.2) implementation of the idea:

class List(list):

    def __getitem__(self, index):
        if isinstance(index, (tuple, list)):
            return [list.__getitem__(self, i) for i in index]
        else:
            return list.__getitem__(self, index)

>>> l = List(range(0, 100, 10))
>>> l[0,2,3]
[0, 20, 30]

but in this simple version, using both commas and slices will not work as
expected:

>>> l[0,1,7:]
[0, 10, [70, 80, 90]]

2. Slicing two dimensional object will not be possible as the notion
you proposed is used just for that (ex. l[1,2] which is equivalent
to l[(1,2)] see below), and Numeric and numarray use it.
Same misunderstanding as above, I believe. Let, e.g., l be
((1,10,100),(2,20,200),(3,30,300),(4,40,400)). Then l[2:; 1:] ->
[(30, 300), (40, 400)]. This is equivalent to [i[1:] for i in l[2:]]
but, at least to me, it is the index notation that is easier to parse.

Incidentally, it strikes me that there are a lot of superfluous
commas. Why not just (1 10 100) or even 1 10 100 instead of (1,10,100)?
The commas do make the expression harder to parse visually.

This will work as long as there are no expressions in the sequence.
If there are, it is harder to read (and may be more error prone):

(1 + 3 4 + 2)

>>> # multi dimensional indexing
>>> c[1,3]
I disagree that this is "multidimensional". You are passing a list
and getting back a list, so this is still flat. I think you are
confusing dimensionality and cardinality.

That is the notion of multidimensional indexing in Python.
Numeric and numarray use it

This is good if it is true (I have yet to look at these two because
my work is not primarily numerical) but, again, the restriction of this
syntax to arrays, and arrays of numeric values at that, strikes me
as completely arbitrary.

Here is an example of a two-dimensional Numeric array and its indexing:

>>> from Numeric import *
>>> reshape(arange(9), (3,3))
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> a = reshape(arange(9), (3,3))
>>> a[1][0]
3
>>> a[1,0]
3
>>> a[(1,0)]
3

So currently indexing with a sequence has a settled meaning of
multidimensional indexing. If lists and tuples allowed indexing by
sequence, then either:

1. it might be confused with multidimensional indexing of numeric
types (the same notation for two different things)

2. it will require rework of multidimensional indexing, maybe with your
semicolon notation, but will introduce incompatibilities (backward and
forward)

The bottom line is that python claims to be simple but has a syntax
which, in this little corner at least, is neither simple nor regular:
xranges and slices are both sequence abstractions but are used in
different contexts and have different syntaxes; C-style arrays are
treated differently than lists of lists of ..., although they are
conceptually equivalent; numeric structures are treated differently
than non-numeric ones etc etc

With multidimensional arrays and lists of lists you may write a[i][j]
in both cases, so it is consistent; with arrays you may also write a[i,j]
if you know that you have a 2-dim array in your hand.

--

=*= Lukasz Pankowski =*=
Jul 18 '05 #17
Lukasz Pankowski wrote:
> Sorry for my misunderstanding, yes it would be nice to have a
> possibility to index a sequence with a list of indices, here is a pure
> Python (>= 2.2) implementation of the idea:
>
> class List(list):
>
>     def __getitem__(self, index):
>         if isinstance(index, (tuple, list)):
>             return [list.__getitem__(self, i) for i in index]
>         else:
>             return list.__getitem__(self, index)
>
> >>> l = List(range(0, 100, 10))
> >>> l[0,2,3]
> [0, 20, 30]

This is nice :-)

> but in this simple using both commas and slices will not work as
> expected
>
> >>> l[0,1,7:]
> [0, 10, [70, 80, 90]]

Your implementation can be extended to handle slices and still remains
simple:

class List3(list):
    def __getitem__(self, index):
        if hasattr(index, "__getitem__"):  # is index list-like?
            result = []
            for i in index:
                if hasattr(i, "start"):  # is i slice-like?
                    result.extend(list.__getitem__(self, i))
                else:
                    result.append(list.__getitem__(self, i))
            return result
        else:
            return list.__getitem__(self, index)

I have used hasattr() instead of isinstance() tests because I think that the
"we take everything that behaves like X" approach is more pythonic than
"must be an X instance".
While __getitem__() is fairly basic for lists, I am not sure if start can be
considered mandatory for slices, though.

Peter
Jul 18 '05 #18
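A variant of the extended class can be exercised under Python 3 to confirm that mixed indices and slices come back flattened. This sketch swaps the `hasattr()` checks for explicit `isinstance()` tests, which is a deliberate simplification rather than the approach argued for above; the class name `MultiList` is made up:

```python
class MultiList(list):
    """List that accepts several indices and slices in one subscript."""

    def __getitem__(self, index):
        if isinstance(index, (tuple, list)):
            result = []
            for i in index:
                if isinstance(i, slice):
                    # A slice contributes all of its items, flattened.
                    result.extend(list.__getitem__(self, i))
                else:
                    result.append(list.__getitem__(self, i))
            return result
        return list.__getitem__(self, index)


l = MultiList(range(0, 100, 10))

print(l[0, 2, 3])   # [0, 20, 30]
print(l[0, 1, 7:])  # [0, 10, 70, 80, 90]
print(l[[2, -3]])   # [20, 70]
print(l[4])         # 40
```

The `isinstance()` version rejects index containers other than tuples and lists, which is the trade-off against the duck-typed checks.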
> Only if all the sublists are of the same length, which is guaranteed for
> a multi-dimensional array, but not for a list of lists.

This is a red herring.

> What do you expect a[;1] to return if a = [[], [1, 2, 3], [4], 5]?

Whatever error python returns if you ask, e.g., for (1 2 3)[4].

Hilde
Jul 18 '05 #19
> Only if all the sublists are of the same length, which is guaranteed for
> a multi-dimensional array, but not for a list of lists.

This is a red herring.

> What do you expect a[;1] to return if a = [[], [1, 2, 3], [4], 5]?

Whatever error python returns if you ask, e.g., for (1,2,3)[4].

Hilde
Jul 18 '05 #20
> So you want l[seq] to be a shorter way for current
>
> [l[i] for i in seq]
>
> Which is pythonic as it is explicit, easier to read if the code is not
> written by you yesterday

But consistency is pythonic, too ;-) and, as I pointed out in another message,
if you accept l[sequence_abstraction], why not l[actual_sequence]??

Hilde
Jul 18 '05 #21
Hilde Roth wrote:
Only if all the sublists are of the same length, which is guaranteed for
a multi-dimensional array, but not for a list of lists.

This is a red herring.

No, it's not. It is a hint that, despite the similarity of notation
between Matrix[i][j] and NestedList[i][j], there is something
fundamentally different between the two. See below for another example.

What do you expect a[;1] to return if a = [[], [1, 2, 3], [4], 5]?

Whatever error python returns if you ask, e.g., for (1 2 3)[4].

Hilde

Okay, here's a better example which I thought of just after I posted my

Given

x = [[1, 2], [3, 4]]

the statement

x[0] = [5, 6]

will result in a nested list x = [ [5, 6], [3, 4] ]. If you think of x
as a multi-dimensional array represented by a nested list, then I've
just replaced a row of the original array

1 2
3 4

with a new row, yielding

5 6
3 4

If we had an x[;0] notation, then you would expect to be able to do the
same thing to replace a column:

x[;0] = [7, 8]

Unfortunately, there is no pre-existing list representing the first
column of x, so x[;0] has to return a new list [1, 3], and assigning to
that new list has no effect on x.

Again, my point is that nested lists are a fundamentally different
structure than multi-dimensional arrays. For simple things like
x[i][j], you can use a nested list to represent a multi-dimensional
array, but if you actually want to manipulate a two-dimensional array
like a matrix, you are better off using a class designed for that
purpose, like the ones defined in Numeric Python.

David

Jul 18 '05 #22
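The copy-versus-view point can be demonstrated with plain nested lists. A small sketch, assuming Python 3; `get_column` and `set_column` are hypothetical helper names:

```python
def get_column(rows, j):
    """Return a new list holding column j (a copy, not a view)."""
    return [row[j] for row in rows]


def set_column(rows, j, values):
    """Write values into column j of the nested list, in place."""
    for row, value in zip(rows, values):
        row[j] = value


x = [[1, 2], [3, 4]]

col = get_column(x, 0)
col[0] = 99          # modifies only the copy
print(x)             # [[1, 2], [3, 4]]

set_column(x, 0, [7, 8])
print(x)             # [[7, 2], [8, 4]]
```

Replacing a column therefore takes an explicit loop over the rows, whereas replacing a row is a single assignment to `x[i]`.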
