Sorting a list

Hi everyone. If I have a list of tuples, and each tuple is in the form:

(year, text) as in ('1995', 'This is a citation.')

How can I sort the list so that they are in chronological order based on
the year? Is there a better way to do this than making a list of tuples?

(So far I have a text file and on each line is a citation. I use an RE
to search for the year, then put this year and the entire citation in a
tuple, and add this tuple to a list. Perhaps this step can be changed to
be more efficient when I then need to sort them by date and write a new
file with the citations in order.)

Thanks.
Feb 1 '07 #1
John Salerno wrote:
Hi everyone. If I have a list of tuples, and each tuple is in the form:

(year, text) as in ('1995', 'This is a citation.')

How can I sort the list so that they are in chronological order
L.sort()

--
"kad imaš 7 godina glup si ko kurac, sve je predobro: autići i bageri u
kvartu.. to je život"
Drito Konj
Feb 1 '07 #2
John Salerno wrote:
Hi everyone. If I have a list of tuples, and each tuple is in the form:

(year, text) as in ('1995', 'This is a citation.')

How can I sort the list so that they are in chronological order based on
the year?
Calling sort() on the list should just work.
Is there a better way to do this than making a list of tuples?
Depends...
(So far I have a text file and on each line is a citation. I use an RE
to search for the year, then put this year and the entire citation in a
tuple, and add this tuple to a list.
You don't tell how these lines are formatted, but it's possible that you
don't even need a regexp here. But wrt/ sorting, the list of tuples with
the sort key as first element is one of the best solutions.
Feb 1 '07 #3
Bruno Desthuilliers wrote:
You don't tell how these lines are formatted, but it's possible that you
don't even need a regexp here. But wrt/ sorting, the list of tuples with
the sort key as first element is one of the best solutions.
Ah, so simply using sort() will default to the first element of each tuple?

The citations are like this:

lastname, firstname. (year). title. other stuff.
Feb 1 '07 #4
Bruno Desthuilliers wrote:
John Salerno wrote:
>Hi everyone. If I have a list of tuples, and each tuple is in the form:

(year, text) as in ('1995', 'This is a citation.')

How can I sort the list so that they are in chronological order based
on the year?

Calling sort() on the list should just work.
Amazing, it was that easy. :)
Feb 1 '07 #5
John Salerno wrote:
Bruno Desthuilliers wrote:
>John Salerno wrote:
>>Hi everyone. If I have a list of tuples, and each tuple is in the form:

(year, text) as in ('1995', 'This is a citation.')

How can I sort the list so that they are in chronological order based
on the year?

Calling sort() on the list should just work.

Amazing, it was that easy. :)
Here's what I did:

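import re

# Read every citation line from the source file.
file = open('newrefs.txt')
text = file.readlines()
file.close()

newfile = open('sortedrefs.txt', 'w')
refs = []

# A four-digit year in parentheses, e.g. "(1995)".
pattern = re.compile(r'\(\d{4}\)')

for line in text:
    # Assumes every line contains a year; search() returns None otherwise.
    year = pattern.search(line).group()
    refs.append((year, line))

# Tuples sort on the year first, then on the citation text.
refs.sort()

for ref in refs:
    newfile.write(ref[1])

newfile.close()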
Feb 1 '07 #6
John Salerno wrote:
Bruno Desthuilliers wrote:
>John Salerno wrote:
>>Hi everyone. If I have a list of tuples, and each tuple is in the form:

(year, text) as in ('1995', 'This is a citation.')

How can I sort the list so that they are in chronological order based
on the year?

Calling sort() on the list should just work.

Amazing, it was that easy. :)
One more thing. What if I want them in reverse chronological order? I
tried reverse() but that seemed to put them in reverse alphabetical
order based on the second element of the tuple (not the year).
Feb 1 '07 #7
John Salerno wrote:
Bruno Desthuilliers wrote:
>You don't tell how these lines are formatted, but it's possible that
you don't even need a regexp here. But wrt/ sorting, the list of
tuples with the sort key as first element is one of the best solutions.


Ah, so simply using sort() will default to the first element of each tuple?
Yes. Then on the second value if the first compares equal, etc...
The citations are like this:

lastname, firstname. (year). title. other stuff.
Then you theoretically don't even need regexps:
>>> line = "lastname, firstname. (year). title. other stuff."
>>> line.split('.')[1].strip().strip('()')
'year'

But since you may have a dot in the "lastname, firstname" part, I'd
still go for a regexp here just to make sure.
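
To make the difference concrete, here is a minimal sketch comparing the two approaches. The sample citations, the helper names and the capturing-group pattern are made up for illustration; they are not from the original file.

```python
import re

# Hypothetical sample lines in the "lastname, firstname. (year). title." layout.
citations = [
    "Smith, John. (1995). A title. Other stuff.",
    "Doe, J. R. (2001). Another title. Other stuff.",  # initials add extra dots
]

# Four digits in parentheses; the group captures the year without the parentheses.
year_pattern = re.compile(r"\((\d{4})\)")

def year_by_split(line):
    # Assumes the year is always the second dot-separated field.
    return line.split(".")[1].strip().strip("()")

def year_by_regex(line):
    # Returns None when no parenthesised four-digit year is found.
    match = year_pattern.search(line)
    return match.group(1) if match else None

split_years = [year_by_split(c) for c in citations]  # ['1995', 'R']
regex_years = [year_by_regex(c) for c in citations]  # ['1995', '2001']
```

The split version picks up the initial "R" for the second citation, which is the failure mode described above; the regexp does not care how many dots come before the year.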
Feb 1 '07 #8
John Salerno wrote:
Bruno Desthuilliers wrote:
>John Salerno wrote:
>>Hi everyone. If I have a list of tuples, and each tuple is in the form:

(year, text) as in ('1995', 'This is a citation.')

How can I sort the list so that they are in chronological order based
on the year?


Calling sort() on the list should just work.


Amazing, it was that easy. :)
A very common Python idiom is "decorate/sort/undecorate", which is just
what you've done here. It's usually faster than passing a custom
comparison callback function (cf. a recent thread named "Sorting a List
of Lists", where Paddy posted a link to a benchmark).
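
As a rough sketch of the idiom, assuming some made-up citation lines and a hypothetical year_of() helper: the explicit decorate/sort/undecorate steps are spelled out first, followed by the key= argument (available since Python 2.4), which computes the sort key once per element for you.

```python
import re

year_pattern = re.compile(r"\((\d{4})\)")

def year_of(line):
    # Hypothetical helper; assumes every line contains a "(yyyy)" year.
    return year_pattern.search(line).group(1)

# Made-up sample data for illustration.
citations = [
    "Doe, Jane. (2001). B title.",
    "Smith, John. (1995). A title.",
    "Brown, Ann. (1998). C title.",
]

# Decorate: pair each line with its sort key.
decorated = [(year_of(line), line) for line in citations]
# Sort: tuples compare on the year first, then on the line itself.
decorated.sort()
# Undecorate: drop the key and keep the original line.
dsu_sorted = [line for _, line in decorated]

# Same ordering here, using key= instead of building the tuples by hand.
key_sorted = sorted(citations, key=year_of)
```

One difference worth keeping in mind: with the explicit tuples, ties on the year fall back to comparing the lines, whereas key= leaves tied lines in their original relative order.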
Feb 1 '07 #9
John Salerno wrote:
John Salerno wrote:
>Bruno Desthuilliers wrote:
>>John Salerno wrote:

Hi everyone. If I have a list of tuples, and each tuple is in the form:

(year, text) as in ('1995', 'This is a citation.')

How can I sort the list so that they are in chronological order
based on the year?
Calling sort() on the list should just work.


Amazing, it was that easy. :)


One more thing. What if I want them in reverse chronological order? I
tried reverse() but that seemed to put them in reverse alphabetical
order based on the second element of the tuple (not the year).
Really ?
>>> lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
...          ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'), ('1996', 'aaa')]
>>> lines.sort()
>>> lines
[('1995', 'aaa'), ('1995', 'bbb'), ('1995', 'ccc'), ('1996', 'aaa'),
 ('1996', 'ccc'), ('1997', 'aaa'), ('1997', 'bbb')]
>>> lines.reverse()
>>> lines
[('1997', 'bbb'), ('1997', 'aaa'), ('1996', 'ccc'), ('1996', 'aaa'),
 ('1995', 'ccc'), ('1995', 'bbb'), ('1995', 'aaa')]
>>>
As you see, the list is being sorted on *both* items - year first, then
sentence. And then of course reversed, since we asked for it !-)

If you want to prevent this from happening and don't mind creating a
copy of the list, you can use the sorted() function with the key and
reverse arguments and operator.itemgetter:
>>> lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
...          ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'), ('1996', 'aaa')]
>>> from operator import itemgetter
>>> sorted(lines, key=itemgetter(0), reverse=True)
[('1997', 'bbb'), ('1997', 'aaa'), ('1996', 'ccc'), ('1996', 'aaa'),
 ('1995', 'aaa'), ('1995', 'bbb'), ('1995', 'ccc')]

HTH.
Feb 1 '07 #10
Bruno Desthuilliers wrote:
If you want to prevent this from happening and don't mind creating a
copy of the list, you can use the sorted() function with the key and
reverse arguments and operator.itemgetter:
>>> lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
...          ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'), ('1996', 'aaa')]
>>> from operator import itemgetter
>>> sorted(lines, key=itemgetter(0), reverse=True)
[('1997', 'bbb'), ('1997', 'aaa'), ('1996', 'ccc'), ('1996', 'aaa'),
 ('1995', 'aaa'), ('1995', 'bbb'), ('1995', 'ccc')]
You don't need to use sorted() -- sort() also takes the key= and
reverse= arguments::
>>> lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
...          ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'),
...          ('1996', 'aaa')]
>>> from operator import itemgetter
>>> lines.sort(key=itemgetter(0), reverse=True)
>>> lines
[('1997', 'bbb'), ('1997', 'aaa'), ('1996', 'ccc'), ('1996', 'aaa'),
 ('1995', 'aaa'), ('1995', 'bbb'), ('1995', 'ccc')]

STeVe
Feb 1 '07 #11
Steven Bethard wrote:
Bruno Desthuilliers wrote:
>If you want to prevent this from happening and don't mind creating a
copy of the list, you can use the sorted() function with the key and
reverse arguments and operator.itemgetter:
(snip)
>
You don't need to use sorted() -- sort() also takes the key= and
reverse= arguments::
Yeps - thanks for the reminder.
Feb 1 '07 #12
Bruno Desthuilliers wrote:
>One more thing. What if I want them in reverse chronological order? I
tried reverse() but that seemed to put them in reverse alphabetical
order based on the second element of the tuple (not the year).

Really ?
>>> lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
...          ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'), ('1996', 'aaa')]
>>> lines.sort()
>>> lines
[('1995', 'aaa'), ('1995', 'bbb'), ('1995', 'ccc'), ('1996', 'aaa'),
 ('1996', 'ccc'), ('1997', 'aaa'), ('1997', 'bbb')]
>>> lines.reverse()
>>> lines
[('1997', 'bbb'), ('1997', 'aaa'), ('1996', 'ccc'), ('1996', 'aaa'),
 ('1995', 'ccc'), ('1995', 'bbb'), ('1995', 'aaa')]
>>>
Oh I didn't sort then reverse, I just replaced sort with reverse. Maybe
that's why!
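
A small sketch of why that happens, using a made-up three-item list: reverse() only flips whatever order the list is already in, so on unsorted data the result is not chronological in either direction.

```python
# Made-up, deliberately unsorted sample data.
refs = [("1996", "ccc"), ("1995", "aaa"), ("1997", "bbb")]

# reverse() alone flips the existing order; nothing gets sorted.
only_reversed = list(refs)
only_reversed.reverse()
# [('1997', 'bbb'), ('1995', 'aaa'), ('1996', 'ccc')]

# Sorting in descending order gives the newest year first.
sorted_descending = sorted(refs, reverse=True)
# [('1997', 'bbb'), ('1996', 'ccc'), ('1995', 'aaa')]
```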
Feb 1 '07 #13
John Salerno wrote:
(snip)
Oh I didn't sort then reverse, I just replaced sort with reverse. Maybe
that's why!
Hmmm... Probably, yes...

!-)
Feb 1 '07 #14
John Salerno <jo******@NOSPAMgmail.com> writes:
Ah, so simply using sort() [on a list of tuples] will default to the
first element of each tuple?
More precisely, list.sort will ask the elements of the list to compare
themselves. Those elements are tuples; two tuples will compare based
on comparison of their corresponding elements.

--
\ "The cost of a thing is the amount of what I call life which is |
`\ required to be exchanged for it, immediately or in the long |
_o__) run." -- Henry David Thoreau |
Ben Finney

Feb 1 '07 #15
On Thu, 01 Feb 2007 14:52:03 -0500, John Salerno wrote:
Bruno Desthuilliers wrote:
>You don't tell how these lines are formatted, but it's possible that you
don't even need a regexp here. But wrt/ sorting, the list of tuples with
the sort key as first element is one of the best solutions.

Ah, so simply using sort() will default to the first element of each tuple?
No. It isn't that sort() knows about tuples. sort() knows how to sort a list
by asking the list items to compare themselves, whatever the items are.
Tuples compare themselves by looking at the first element (if any), and
in the event of a tie going on to the second element, then the third, etc.
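
A few comparisons illustrate this; the values are invented for the example. The last one is a side note: the years in this thread are strings, which order chronologically only as long as they all have the same number of digits.

```python
# The first elements differ, so the second elements are never consulted.
first_differs = ("1995", "zzz") < ("1996", "aaa")   # True

# The first elements tie, so the second elements decide.
tie_on_year = ("1995", "aaa") < ("1995", "bbb")     # True

# A tuple that runs out of elements first compares as smaller.
shorter_first = ("1995",) < ("1995", "aaa")         # True

# Strings compare character by character, so '9' > '1' decides here.
string_years = "995" < "1995"                       # False
numeric_years = int("995") < int("1995")            # True
```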

--
Steven D'Aprano

Feb 2 '07 #16
Steven Bethard <st************@gmail.com> wrote:
You don't need to use sorted() -- sort() also takes the key= and
reverse= arguments::
>>> lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
...          ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'),
...          ('1996', 'aaa')]
>>> from operator import itemgetter
>>> lines.sort(key=itemgetter(0), reverse=True)
>>> lines
[('1997', 'bbb'), ('1997', 'aaa'), ('1996', 'ccc'), ('1996', 'aaa'),
 ('1995', 'aaa'), ('1995', 'bbb'), ('1995', 'ccc')]
I suspect you want another line in there to give the OP what they actually
want: sort the list alphabetically first and then reverse sort on the year.

The important thing to note is that the reverse flag on the sort method
doesn't reverse elements which compare equal. This makes it possible to
sort on multiple keys comparatively easily.
>>> lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
...          ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'),
...          ('1996', 'aaa')]
>>> from operator import itemgetter
>>> lines.sort(key=itemgetter(1))
>>> lines.sort(key=itemgetter(0), reverse=True)
>>> lines
[('1997', 'aaa'), ('1997', 'bbb'), ('1996', 'aaa'), ('1996', 'ccc'),
 ('1995', 'aaa'), ('1995', 'bbb'), ('1995', 'ccc')]
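
For comparison, a sketch of an alternative that is not from the thread: when the descending key is numeric, the same ordering can be had in a single pass with a composite key that negates the year. This assumes every year string converts cleanly with int().

```python
from operator import itemgetter

lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
         ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'),
         ('1996', 'aaa')]

# Two stable passes: secondary key first, then the primary key reversed.
two_pass = sorted(lines, key=itemgetter(1))
two_pass.sort(key=itemgetter(0), reverse=True)

# One pass: negate the numeric year so larger years sort first,
# while the text stays in ascending order within each year.
one_pass = sorted(lines, key=lambda item: (-int(item[0]), item[1]))
```

The two-pass version works for any key type because it relies only on sort stability; the negation trick is limited to keys that can be turned into numbers.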
Feb 2 '07 #17
Bruno Desthuilliers wrote:
John Salerno wrote:
(snip)
>Oh I didn't sort then reverse, I just replaced sort with reverse.
Maybe that's why!

Hmmm... Probably, yes...

!-)
lol, this is what a couple months away from python does to me!
Feb 2 '07 #18
