
# Sorting a multidimensional array by multiple keys

Hello everyone,

Can I sort a multidimensional array in Python by multiple sort keys? A
little code sample would be nice!

Thx,
Rehceb

Mar 31 '07 #1
Rehceb Rotkiv wrote:
> can I sort a multidimensional array in Python by multiple sort keys? A
> little code sample would be nice!

You can pass a function as argument to the sort method of a list.
The function should take two arguments and return -1, 0 or 1 as
comparison result. Just like the cmp function.

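This will sort the objects in the list obj_lst by their id attributes:

def sorter(a, b):
    # cmp() returns -1, 0 or 1 (Python 2 builtin)
    return cmp(a.id, b.id)

# Python 2: list.sort() accepts a comparison function as its first argument
obj_lst.sort(sorter)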
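
For readers on Python 3, where the cmp builtin and the comparison-function argument to sort no longer exist, the same idea can be sketched with functools.cmp_to_key. This is only an illustration: the Obj class and its sample ids are hypothetical stand-ins for whatever obj_lst really holds.

```python
from functools import cmp_to_key

class Obj:
    """Hypothetical stand-in for the objects stored in obj_lst."""
    def __init__(self, id):
        self.id = id

def sorter(a, b):
    # Return -1, 0 or 1, mimicking what cmp(a.id, b.id) did in Python 2.
    return (a.id > b.id) - (a.id < b.id)

obj_lst = [Obj(3), Obj(1), Obj(2)]
# cmp_to_key wraps the two-argument comparison so it can be used as key=.
obj_lst.sort(key=cmp_to_key(sorter))
sorted_ids = [o.id for o in obj_lst]
```

For a single attribute, a plain key function such as operator.attrgetter("id") is usually simpler than a comparison function.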

Thomas
Mar 31 '07 #2
Rehceb Rotkiv:
> can I sort a multidimensional array in Python by multiple sort keys? A
> little code sample would be nice!

If you want a good answer you have to give me/us more details, and an
example too.

Bye,
bearophile

Mar 31 '07 #3
> If you want a good answer you have to give me/us more details, and an
> example too.

OK, here is some example data:

reaction is BUT by the
sodium , BUT it is
sea , BUT it is
this manner BUT the dissolved
pattern , BUT it is
rapid , BUT it is

As each line consists of 5 words, I would break up the data into an array
of five-field-arrays (Would you use lists or tuples or a combination in
Python?). The word "BUT" would be in the middle, with two fields/words
left and two fields/words right of it. I then want to sort this list by

- field 3
- field 4
- field 1
- field 0

in this hierarchy. This is the desired result:

pattern , BUT it is
rapid , BUT it is
sea , BUT it is
sodium , BUT it is
reaction is BUT by the
this manner BUT the dissolved

The first 4 lines all could not be sorted by fields 3 & 4, as they are
identical ("it", "is"), so they have been sorted first by field 1 (which
is also identical: ",") and then by field 0:

pattern
rapid
sea
sodium

I hope I have explained this in an understandable way. It would be cool
if you could show me how this can be done in Python!

Regards,
Rehceb
Mar 31 '07 #4
Wait, I made a mistake. The correct result would be

reaction is BUT by the
pattern , BUT it is
rapid , BUT it is
sea , BUT it is
sodium , BUT it is
this manner BUT the dissolved

because "by the" comes before and "the dissolved" after "it is". Sorry
for the confusion.
Mar 31 '07 #5
Rehceb Rotkiv wrote:
>> If you want a good answer you have to give me/us more details, and an
>> example too.
>
> OK, here is some example data:
>
> reaction is BUT by the
> sodium , BUT it is
> sea , BUT it is
> this manner BUT the dissolved
> pattern , BUT it is
> rapid , BUT it is
>
> As each line consists of 5 words, I would break up the data into an array
> of five-field-arrays (Would you use lists or tuples or a combination in
> Python?). The word "BUT" would be in the middle, with two fields/words
> left and two fields/words right of it. I then want to sort this list by
>
> - field 3
> - field 4
> - field 1
> - field 0

You're probably looking for the key= argument to list.sort(). If your
function simply returns the fields in the order above, I believe you get
the right thing::
>>> s = '''\
... reaction is BUT by the
... sodium , BUT it is
... sea , BUT it is
... this manner BUT the dissolved
... pattern , BUT it is
... rapid , BUT it is
... '''
>>> word_lists = [line.split() for line in s.splitlines()]
>>> def key(word_list):
...     return word_list[3], word_list[4], word_list[1], word_list[0]
...
>>> word_lists.sort(key=key)
>>> word_lists
[['reaction', 'is', 'BUT', 'by', 'the'],
 ['pattern', ',', 'BUT', 'it', 'is'],
 ['rapid', ',', 'BUT', 'it', 'is'],
 ['sea', ',', 'BUT', 'it', 'is'],
 ['sodium', ',', 'BUT', 'it', 'is'],
 ['this', 'manner', 'BUT', 'the', 'dissolved']]
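
The key function above relies on the fact that Python compares tuples element by element, so later elements only matter when the earlier ones are equal. A small sketch of that behaviour, using three rows taken from the example data (the names rows, ordered and first_words are just for illustration):

```python
rows = [
    ["sodium", ",", "BUT", "it", "is"],
    ["reaction", "is", "BUT", "by", "the"],
    ["pattern", ",", "BUT", "it", "is"],
]

def key(word_list):
    # Most significant field first: 3, then 4, then 1, then 0.
    return word_list[3], word_list[4], word_list[1], word_list[0]

# Tuples compare left to right; "by" < "it" decides the first comparison.
assert ("by", "the", "is", "reaction") < ("it", "is", ",", "pattern")
# The first three elements tie here, so field 0 breaks the tie.
assert ("it", "is", ",", "pattern") < ("it", "is", ",", "sodium")

ordered = sorted(rows, key=key)
first_words = [row[0] for row in ordered]
```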

STeVe
Mar 31 '07 #6
Rehceb Rotkiv <re****@no.spam.plz> wrote:
> Wait, I made a mistake. The correct result would be
>
> reaction is BUT by the
> pattern , BUT it is
> rapid , BUT it is
> sea , BUT it is
> sodium , BUT it is
> this manner BUT the dissolved
>
> because "by the" comes before and "the dissolved" after "it is". Sorry
> for the confusion.

>>> data = [
... "reaction is BUT by the",
... "sodium , BUT it is",
... "sea , BUT it is",
... "this manner BUT the dissolved",
... "pattern , BUT it is",
... "rapid , BUT it is",
... ]
>>> data = [s.split() for s in data]
>>> from pprint import pprint
>>> pprint(data)
[['reaction', 'is', 'BUT', 'by', 'the'],
 ['sodium', ',', 'BUT', 'it', 'is'],
 ['sea', ',', 'BUT', 'it', 'is'],
 ['this', 'manner', 'BUT', 'the', 'dissolved'],
 ['pattern', ',', 'BUT', 'it', 'is'],
 ['rapid', ',', 'BUT', 'it', 'is']]
>>> from operator import itemgetter
>>> data.sort(key=itemgetter(0))
>>> data.sort(key=itemgetter(1))
>>> data.sort(key=itemgetter(4))
>>> data.sort(key=itemgetter(3))
>>> pprint(data)
[['reaction', 'is', 'BUT', 'by', 'the'],
 ['pattern', ',', 'BUT', 'it', 'is'],
 ['rapid', ',', 'BUT', 'it', 'is'],
 ['sea', ',', 'BUT', 'it', 'is'],
 ['sodium', ',', 'BUT', 'it', 'is'],
 ['this', 'manner', 'BUT', 'the', 'dissolved']]
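
The four successive sorts work because list.sort() is stable: sorting by the least significant field first and the most significant field last keeps the earlier orderings as tie-breakers. A minimal sketch of that pattern follows. The helper name sort_by_fields is made up for illustration, and the check against a single tuple key is only there to show that both approaches agree on this data.

```python
from operator import itemgetter

def sort_by_fields(rows, fields):
    """Hypothetical helper: sort rows in place, most significant field first."""
    # Apply the passes in reverse priority; stability preserves the
    # order established by earlier (less significant) passes on ties.
    for index in reversed(fields):
        rows.sort(key=itemgetter(index))
    return rows

data = [s.split() for s in [
    "reaction is BUT by the",
    "sodium , BUT it is",
    "sea , BUT it is",
    "this manner BUT the dissolved",
    "pattern , BUT it is",
    "rapid , BUT it is",
]]

# Reference result from a single sort with a tuple key.
expected = sorted(data, key=lambda row: (row[3], row[4], row[1], row[0]))
result = sort_by_fields(list(data), [3, 4, 1, 0])
```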
Mar 31 '07 #7
On Mar 31, 6:42 am, Rehceb Rotkiv <reh...@no.spam.plz> wrote:

(snipped)
> As each line consists of 5 words, I would break up the data into an array
> of five-field-arrays (Would you use lists or tuples or a combination in
> Python?). The word "BUT" would be in the middle, with two fields/words
> left and two fields/words right of it. I then want to sort this list by
>
> - field 3
> - field 4
> - field 1
> - field 0

import StringIO

buf = """
reaction is BUT by the
sodium , BUT it is
sea , BUT it is
this manner BUT the dissolved
pattern , BUT it is
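rapid , BUT it is
""".lstrip()

# Python 2: StringIO.StringIO wraps the string in a file-like object
mockfile = StringIO.StringIO(buf)

# keep the original line as the last element so it can be printed after sorting
tokens = [line.split() + [line] for line in mockfile]
tokens.sort(key=lambda l: (l[3], l[4], l[1], l[0]))
for l in tokens:
    # trailing comma: the line already ends with a newline
    print l[-1],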
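
The code above is Python 2. A rough Python 3 port, assuming io.StringIO as the replacement for the old StringIO module and collecting the sorted lines in a variable named output (a name chosen here for illustration), might look like this:

```python
import io

buf = """
reaction is BUT by the
sodium , BUT it is
sea , BUT it is
this manner BUT the dissolved
pattern , BUT it is
rapid , BUT it is
""".lstrip()

mockfile = io.StringIO(buf)

# Split each line into words and append the untouched line as the last item.
tokens = [line.split() + [line] for line in mockfile]
tokens.sort(key=lambda l: (l[3], l[4], l[1], l[0]))

# Each stored line still carries its newline, so join without a separator.
output = "".join(l[-1] for l in tokens)
print(output, end="")
```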
--
Hope this helps,
Steven

Mar 31 '07 #8
Duncan Booth wrote:
> >>> from operator import itemgetter
> >>> data.sort(key=itemgetter(0))
> >>> data.sort(key=itemgetter(1))
> >>> data.sort(key=itemgetter(4))
> >>> data.sort(key=itemgetter(3))

Or, in Python 2.5:

>>> data.sort(key=itemgetter(3, 4, 1, 0))

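
This works because itemgetter called with several indices returns a tuple of the looked-up items, which then sorts like any other tuple key. A short sketch using one row from the example data (the variable names are illustrative only):

```python
from operator import itemgetter

row = ["pattern", ",", "BUT", "it", "is"]

key = itemgetter(3, 4, 1, 0)
# With more than one index, itemgetter returns a tuple in the given order.
key_value = key(row)

# Equivalent hand-written key, for comparison.
manual_value = (row[3], row[4], row[1], row[0])
```
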
Peter
Mar 31 '07 #9

Regards,
Rehceb
Mar 31 '07 #10
