Connecting Tech Pros Worldwide Forums | Help | Site Map

Sorting a multidimensional array by multiple keys

Rehceb Rotkiv
Guest
 
Posts: n/a
#1: Mar 31 '07
Hello everyone,

can I sort a multidimensional array in Python by multiple sort keys? A
litte code sample would be nice!

Thx,
Rehceb


=?UTF-8?B?VGhvbWFzIEtyw7xnZXI=?=
Guest
 
Posts: n/a
#2: Mar 31 '07

re: Sorting a multidimensional array by multiple keys


Rehceb Rotkiv schrieb:
Quote:
can I sort a multidimensional array in Python by multiple sort keys? A
litte code sample would be nice!
You can pass a function as argument to the sort method of a list.
The function should take two arguments and return -1, 0 or 1 as
comparison result. Just like the cmp function.

This will objects in list obj_lst by their id attributes:

def sorter(a, b):
return cmp(a.id, b.id)

obj_lst.sort(sorter)

Thomas
bearophileHUGS@lycos.com
Guest
 
Posts: n/a
#3: Mar 31 '07

re: Sorting a multidimensional array by multiple keys


Rehceb Rotkiv:
Quote:
can I sort a multidimensional array in Python by multiple sort keys? A
litte code sample would be nice!
If you want a good answer you have to give me/us more details, and an
example too.

Bye,
bearophile

Rehceb Rotkiv
Guest
 
Posts: n/a
#4: Mar 31 '07

re: Sorting a multidimensional array by multiple keys


If you want a good answer you have to give me/us more details, and an
Quote:
example too.
OK, here is some example data:

reaction is BUT by the
sodium , BUT it is
sea , BUT it is
this manner BUT the dissolved
pattern , BUT it is
rapid , BUT it is

As each line consists of 5 words, I would break up the data into an array
of five-field-arrays (Would you use lists or tuples or a combination in
Python?). The word "BUT" would be in the middle, with two fields/words
left and two fields/words right of it. I then want to sort this list by

- field 3
- field 4
- field 1
- field 0

in this hierarchy. This is the desired result:

pattern , BUT it is
rapid , BUT it is
sea , BUT it is
sodium , BUT it is
reaction is BUT by the
this manner BUT the dissolved

The first 4 lines all could not be sorted by fields 3 & 4, as they are
identical ("it", "is"), so they have been sorted first by field 1 (which
is also identical: ",") and then by field 0:

pattern
rapid
sea
sodium

I hope I have explained this in an understandable way. It would be cool
if you could show me how this can be done in Python!

Regards,
Rehceb
Rehceb Rotkiv
Guest
 
Posts: n/a
#5: Mar 31 '07

re: Sorting a multidimensional array by multiple keys


Wait, I made a mistake. The correct result would be

reaction is BUT by the
pattern , BUT it is
rapid , BUT it is
sea , BUT it is
sodium , BUT it is
this manner BUT the dissolved

because "by the" comes before and "the dissolved" after "it is". Sorry
for the confusion.
Steven Bethard
Guest
 
Posts: n/a
#6: Mar 31 '07

re: Sorting a multidimensional array by multiple keys


Rehceb Rotkiv wrote:
Quote:
Quote:
>If you want a good answer you have to give me/us more details, and an
>example too.
>
OK, here is some example data:
>
reaction is BUT by the
sodium , BUT it is
sea , BUT it is
this manner BUT the dissolved
pattern , BUT it is
rapid , BUT it is
>
As each line consists of 5 words, I would break up the data into an array
of five-field-arrays (Would you use lists or tuples or a combination in
Python?). The word "BUT" would be in the middle, with two fields/words
left and two fields/words right of it. I then want to sort this list by
>
- field 3
- field 4
- field 1
- field 0
You're probably looking for the key= argument to list.sort(). If your
function simply returns the fields in the order above, I believe you get
the right thing::
Quote:
Quote:
Quote:
>>s = '''\
.... reaction is BUT by the
.... sodium , BUT it is
.... sea , BUT it is
.... this manner BUT the dissolved
.... pattern , BUT it is
.... rapid , BUT it is
.... '''
Quote:
Quote:
Quote:
>>word_lists = [line.split() for line in s.splitlines()]
>>def key(word_list):
.... return word_list[3], word_list[4], word_list[1], word_list[0]
....
Quote:
Quote:
Quote:
>>word_lists.sort(key=key)
>>word_lists
[['reaction', 'is', 'BUT', 'by', 'the'],
['pattern', ',', 'BUT', 'it', 'is'],
['rapid', ',', 'BUT', 'it', 'is'],
['sea', ',', 'BUT', 'it', 'is'],
['sodium', ',', 'BUT', 'it', 'is'],
['this', 'manner', 'BUT', 'the', 'dissolved']]

STeVe
Duncan Booth
Guest
 
Posts: n/a
#7: Mar 31 '07

re: Sorting a multidimensional array by multiple keys


Rehceb Rotkiv <rehceb@no.spam.plzwrote:
Quote:
Wait, I made a mistake. The correct result would be
>
reaction is BUT by the
pattern , BUT it is
rapid , BUT it is
sea , BUT it is
sodium , BUT it is
this manner BUT the dissolved
>
because "by the" comes before and "the dissolved" after "it is". Sorry
for the confusion.
Quote:
Quote:
Quote:
>>data = [
"reaction is BUT by the",
"sodium , BUT it is",
"sea , BUT it is",
"this manner BUT the dissolved",
"pattern , BUT it is",
"rapid , BUT it is",
]
Quote:
Quote:
Quote:
>>data = [ s.split() for s in data]
>>from pprint import pprint
>>pprint(data)
[['reaction', 'is', 'BUT', 'by', 'the'],
['sodium', ',', 'BUT', 'it', 'is'],
['sea', ',', 'BUT', 'it', 'is'],
['this', 'manner', 'BUT', 'the', 'dissolved'],
['pattern', ',', 'BUT', 'it', 'is'],
['rapid', ',', 'BUT', 'it', 'is']]
Quote:
Quote:
Quote:
>>from operator import itemgetter
>>data.sort(key=itemgetter(0))
>>data.sort(key=itemgetter(1))
>>data.sort(key=itemgetter(4))
>>data.sort(key=itemgetter(3))
>>pprint(data)
[['reaction', 'is', 'BUT', 'by', 'the'],
['pattern', ',', 'BUT', 'it', 'is'],
['rapid', ',', 'BUT', 'it', 'is'],
['sea', ',', 'BUT', 'it', 'is'],
['sodium', ',', 'BUT', 'it', 'is'],
['this', 'manner', 'BUT', 'the', 'dissolved']]
attn.steven.kuo@gmail.com
Guest
 
Posts: n/a
#8: Mar 31 '07

re: Sorting a multidimensional array by multiple keys


On Mar 31, 6:42 am, Rehceb Rotkiv <reh...@no.spam.plzwrote:

(snipped)
Quote:
As each line consists of 5 words, I would break up the data into an array
of five-field-arrays (Would you use lists or tuples or a combination in
Python?). The word "BUT" would be in the middle, with two fields/words
left and two fields/words right of it. I then want to sort this list by
>
- field 3
- field 4
- field 1
- field 0


import StringIO

buf = """
reaction is BUT by the
sodium , BUT it is
sea , BUT it is
this manner BUT the dissolved
pattern , BUT it is
rapid , BUT it is
""".lstrip()

mockfile = StringIO.StringIO(buf)

tokens = [ line.split() + [ line ] for line in mockfile ]
tokens.sort(key=lambda l: (l[3], l[4], l[1], l[0]))
for l in tokens:
print l[-1],


--
Hope this helps,
Steven

Peter Otten
Guest
 
Posts: n/a
#9: Mar 31 '07

re: Sorting a multidimensional array by multiple keys


Duncan Booth wrote:
Quote:
Quote:
Quote:
>>>from operator import itemgetter
>>>data.sort(key=itemgetter(0))
>>>data.sort(key=itemgetter(1))
>>>data.sort(key=itemgetter(4))
>>>data.sort(key=itemgetter(3))
Or, in Python 2.5:
Quote:
Quote:
Quote:
>>data.sort(key=itemgetter(3, 4, 1, 0))
Peter
Rehceb Rotkiv
Guest
 
Posts: n/a
#10: Mar 31 '07

re: Sorting a multidimensional array by multiple keys


Thank you all for your helpful solutions!

Regards,
Rehceb
Duncan Booth
Guest
 
Posts: n/a
#11: Mar 31 '07

re: Sorting a multidimensional array by multiple keys


Peter Otten <__peter__@web.dewrote:
Quote:
Duncan Booth wrote:
>
Quote:
Quote:
>>>>from operator import itemgetter
>>>>data.sort(key=itemgetter(0))
>>>>data.sort(key=itemgetter(1))
>>>>data.sort(key=itemgetter(4))
>>>>data.sort(key=itemgetter(3))
>
Or, in Python 2.5:
>
Quote:
Quote:
>>>data.sort(key=itemgetter(3, 4, 1, 0))
>
Thanks, I'd forgotten itemgetter had that strangley assymmetric behaviour
of returning either a single value or a tuple.
Paulo da Silva
Guest
 
Posts: n/a
#12: Apr 2 '07

re: Sorting a multidimensional array by multiple keys


Rehceb Rotkiv escreveu:
Quote:
Hello everyone,
>
can I sort a multidimensional array in Python by multiple sort keys? A
litte code sample would be nice!
class MyList(list):
# This is the index of the element to be compared
CmpIndex=0

# Comparision methods
@staticmethod
def __cmp_pars(x,y):
if isinstance(x,MyList):
x=x[MyList.CmpIndex]
if isinstance(y,MyList):
y=y[MyList.CmpIndex]
return x,y
def __cmp__(self,other):
s,o=MyList.__cmp_pars(self,other)
return cmp(s,o)
def __lt__(self,other):
s,o=MyList.__cmp_pars(self,other)
return s<o
def __le__(self,other):
s,o=MyList.__cmp_pars(self,other)
return s<=o
def __gt__(self,other):
s,o=MyList.__cmp_pars(self,other)
return s>o
def __ge__(self,other):
s,o=MyList.__cmp_pars(self,other)
return s>=o
def __eq__(self,other):
s,o=MyList.__cmp_pars(self,other)
return s==o
def __ne__(self,other):
s,o=MyList.__cmp_pars(self,other)
return s!=o

Use:

x=MyList(<list of lists>)
MyList.CmpIndex=2 # Compare by index 2
x.sort()

May be there is a better solution ...
HTH
Paulo
Alex Martelli
Guest
 
Posts: n/a
#13: Apr 2 '07

re: Sorting a multidimensional array by multiple keys


Thomas Krüger <newsgroups@nospam.nowire.orgwrote:
Quote:
Rehceb Rotkiv schrieb:
Quote:
can I sort a multidimensional array in Python by multiple sort keys? A
litte code sample would be nice!
>
You can pass a function as argument to the sort method of a list.
The function should take two arguments and return -1, 0 or 1 as
comparison result. Just like the cmp function.
>
This will objects in list obj_lst by their id attributes:
>
def sorter(a, b):
return cmp(a.id, b.id)
>
obj_lst.sort(sorter)
A MUCH better way to obtain exactly the same semantics would be:

def getid(a):
return a.id

obj_list.sort(key=getid)


Alex
=?ISO-8859-1?Q?Thomas_Kr=FCger?=
Guest
 
Posts: n/a
#14: Apr 2 '07

re: Sorting a multidimensional array by multiple keys


Alex Martelli schrieb:
Quote:
Thomas Krüger <newsgroups@nospam.nowire.orgwrote:
Quote:
>def sorter(a, b):
> return cmp(a.id, b.id)
>>
>obj_lst.sort(sorter)
>
A MUCH better way to obtain exactly the same semantics would be:
>
def getid(a):
return a.id
>
obj_list.sort(key=getid)
Frankly speaking the purpose of the example was to show how to pass a
function as argument for the sort method.
Your code may be more efficient but it explains something different.

Thomas
Steven Bethard
Guest
 
Posts: n/a
#15: Apr 2 '07

re: Sorting a multidimensional array by multiple keys


Thomas Krüger wrote:
Quote:
Alex Martelli schrieb:
Quote:
>Thomas Krüger <newsgroups@nospam.nowire.orgwrote:
Quote:
>>def sorter(a, b):
>> return cmp(a.id, b.id)
>>>
>>obj_lst.sort(sorter)
>A MUCH better way to obtain exactly the same semantics would be:
>>
>def getid(a):
> return a.id
>>
>obj_list.sort(key=getid)
>
Frankly speaking the purpose of the example was to show how to pass a
function as argument for the sort method.
Your code may be more efficient but it explains something different.
Yes, but there's almost never a reason to use the cmp= argument to
sort() anymore. It's almost always better to use the key= argument.

STeVe
bearophileHUGS@lycos.com
Guest
 
Posts: n/a
#16: Apr 2 '07

re: Sorting a multidimensional array by multiple keys


Steven Bethard:
Quote:
there's almost never a reason to use the cmp= argument to
sort() anymore. It's almost always better to use the key= argument.
I always use key now, but maybe cmp uses less memory. There can be few
situations where cmp is better still.

Bye,
bearophile

Alex Martelli
Guest
 
Posts: n/a
#17: Apr 2 '07

re: Sorting a multidimensional array by multiple keys


Steven Bethard <steven.bethard@gmail.comwrote:
Quote:
Thomas Krüger wrote:
Quote:
Alex Martelli schrieb:
Quote:
Thomas Krüger <newsgroups@nospam.nowire.orgwrote:
>def sorter(a, b):
> return cmp(a.id, b.id)
>>
>obj_lst.sort(sorter)
A MUCH better way to obtain exactly the same semantics would be:
>
def getid(a):
return a.id
>
obj_list.sort(key=getid)
Frankly speaking the purpose of the example was to show how to pass a
function as argument for the sort method.
Your code may be more efficient but it explains something different.
>
Yes, but there's almost never a reason to use the cmp= argument to
sort() anymore. It's almost always better to use the key= argument.
Exactly. Passing a "comparison function" rather than a key-extraction
function is rarely a good idea -- except for some extremely complicated
cases involving e.g. certain string (or more complicated) fields needing
to be sorted in the reverse direction from others. In practice, the
existence of that argument and its prominent position as the first
positional argument is due entirely to backwards compatibility issues.


Alex
Closed Thread