Sorting a list

Hi everyone. If I have a list of tuples, and each tuple is in the form:

(year, text) as in ('1995', 'This is a citation.')

How can I sort the list so that they are in chronological order based on
the year? Is there a better way to do this than making a list of tuples?

(So far I have a text file and on each line is a citation. I use an RE
to search for the year, then put this year and the entire citation in a
tuple, and add this tuple to a list. Perhaps this step can be changed to
be more efficient when I then need to sort them by date and write a new
file with the citations in order.)

Thanks.
Feb 1 '07 #1
John Salerno wrote:
> Hi everyone. If I have a list of tuples, and each tuple is in the form:
>
> (year, text) as in ('1995', 'This is a citation.')
>
> How can I sort the list so that they are in chronological order

L.sort()

--
"kad imaš 7 godina glup si ko kurac, sve je predobro: autići i bageri u
kvartu.. to je život"
Drito Konj
Feb 1 '07 #2
John Salerno wrote:
> Hi everyone. If I have a list of tuples, and each tuple is in the form:
>
> (year, text) as in ('1995', 'This is a citation.')
>
> How can I sort the list so that they are in chronological order based on
> the year?

Calling sort() on the list should just work.

> Is there a better way to do this than making a list of tuples?

Depends...

> (So far I have a text file and on each line is a citation. I use an RE
> to search for the year, then put this year and the entire citation in a
> tuple, and add this tuple to a list.

You don't say how these lines are formatted, but it's possible that you
don't even need a regexp here. But with regard to sorting, the list of
tuples with the sort key as first element is one of the best solutions.
Feb 1 '07 #3
Bruno Desthuilliers wrote:
> You don't say how these lines are formatted, but it's possible that you
> don't even need a regexp here. But with regard to sorting, the list of
> tuples with the sort key as first element is one of the best solutions.

Ah, so simply using sort() will default to the first element of each tuple?

The citations are like this:

lastname, firstname. (year). title. other stuff.
Feb 1 '07 #4
Bruno Desthuilliers wrote:
> John Salerno wrote:
>> Hi everyone. If I have a list of tuples, and each tuple is in the form:
>>
>> (year, text) as in ('1995', 'This is a citation.')
>>
>> How can I sort the list so that they are in chronological order based
>> on the year?
>
> Calling sort() on the list should just work.

Amazing, it was that easy. :)
Feb 1 '07 #5
John Salerno wrote:
> Bruno Desthuilliers wrote:
>> John Salerno wrote:
>>> Hi everyone. If I have a list of tuples, and each tuple is in the form:
>>>
>>> (year, text) as in ('1995', 'This is a citation.')
>>>
>>> How can I sort the list so that they are in chronological order based
>>> on the year?
>>
>> Calling sort() on the list should just work.
>
> Amazing, it was that easy. :)

Here's what I did:

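import re

file = open('newrefs.txt')
text = file.readlines()
file.close()

newfile = open('sortedrefs.txt', 'w')
refs = []

# Match a four-digit year in parentheses, e.g. "(1995)".
pattern = re.compile(r'\(\d{4}\)')

for line in text:
    # Assumes every line contains a year; search() returns None otherwise.
    year = pattern.search(line).group()
    refs.append((year, line))

# Tuples compare on the year first, then on the citation text.
refs.sort()

for ref in refs:
    newfile.write(ref[1])

newfile.close()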
Feb 1 '07 #6
John Salerno wrote:
> Bruno Desthuilliers wrote:
>> John Salerno wrote:
>>> Hi everyone. If I have a list of tuples, and each tuple is in the form:
>>>
>>> (year, text) as in ('1995', 'This is a citation.')
>>>
>>> How can I sort the list so that they are in chronological order based
>>> on the year?
>>
>> Calling sort() on the list should just work.
>
> Amazing, it was that easy. :)

One more thing. What if I want them in reverse chronological order? I
tried reverse() but that seemed to put them in reverse alphabetical
order based on the second element of the tuple (not the year).
Feb 1 '07 #7
John Salerno wrote:
> Bruno Desthuilliers wrote:
>> You don't say how these lines are formatted, but it's possible that
>> you don't even need a regexp here. But with regard to sorting, the list
>> of tuples with the sort key as first element is one of the best solutions.
>
> Ah, so simply using sort() will default to the first element of each tuple?

Yes. Then on the second value if the first compares equal, etc...

> The citations are like this:
>
> lastname, firstname. (year). title. other stuff.

Then you theoretically don't even need regexps:

>>> line = "lastname, firstname. (year). title. other stuff."
>>> line.split('.')[1].strip().strip('()')
'year'

But since you may have a dot in the "lastname, firstname" part, I'd
still go for a regexp here just to make sure.
Feb 1 '07 #8
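To illustrate the point in the post above about dots in the name part, here is a minimal sketch comparing the split-based extraction with a regexp. The sample citations and the helper names `year_by_split` and `year_by_regex` are made up for illustration; they are not from the thread.

```python
import re

# Hypothetical pattern: a four-digit year inside parentheses.
YEAR_RE = re.compile(r'\((\d{4})\)')

def year_by_split(line):
    """Take the second dot-separated field and strip the parentheses."""
    return line.split('.')[1].strip().strip('()')

def year_by_regex(line):
    """Return the first parenthesised four-digit year, or None if absent."""
    match = YEAR_RE.search(line)
    return match.group(1) if match else None

simple = "Salerno, John. (1995). A title. Other stuff."
dotted = "Salerno, J. R. (1995). A title. Other stuff."

print(year_by_split(simple))  # '1995'
print(year_by_split(dotted))  # 'R': the initials add extra dots
print(year_by_regex(dotted))  # '1995'
```

The regexp version also returns None for a line without a year, which lets the caller decide what to do with such lines instead of crashing.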
John Salerno wrote:
> Bruno Desthuilliers wrote:
>> John Salerno wrote:
>>> Hi everyone. If I have a list of tuples, and each tuple is in the form:
>>>
>>> (year, text) as in ('1995', 'This is a citation.')
>>>
>>> How can I sort the list so that they are in chronological order based
>>> on the year?
>>
>> Calling sort() on the list should just work.
>
> Amazing, it was that easy. :)

A very common Python idiom is "decorate/sort/undecorate", which is just
what you've done here. It's usually faster than passing a custom
comparison callback function (cf. a recent thread named "Sorting a List
of Lists", where Paddy posted a link to a benchmark).
Feb 1 '07 #9
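As a rough sketch of the idiom named in the post above, the following puts decorate/sort/undecorate next to the key= argument that list.sort() and sorted() accept. The sample citations and the helper `extract_year` are assumptions made for the example, not code from the thread.

```python
import re

YEAR_RE = re.compile(r'\((\d{4})\)')  # hypothetical pattern for "(1995)"

def extract_year(line):
    """Return the four-digit year; assumes every line contains one."""
    return YEAR_RE.search(line).group(1)

citations = [
    "Brown, A. (1997). Title B.",
    "Adams, C. (1995). Title A.",
    "Clark, B. (1996). Title C.",
]

# Decorate: pair each line with its sort key.
decorated = [(extract_year(line), line) for line in citations]
# Sort: tuples compare on the year first.
decorated.sort()
# Undecorate: drop the key again.
dsu_result = [line for year, line in decorated]

# The same ordering with a key function, no tuples needed.
key_result = sorted(citations, key=extract_year)

print(dsu_result == key_result)  # True for this data
```

The two approaches can differ when years tie: the decorated tuples fall back to comparing the citation text, while key= leaves tied lines in their original order.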
John Salerno wrote:
> John Salerno wrote:
>> Bruno Desthuilliers wrote:
>>> John Salerno wrote:
>>>> Hi everyone. If I have a list of tuples, and each tuple is in the form:
>>>>
>>>> (year, text) as in ('1995', 'This is a citation.')
>>>>
>>>> How can I sort the list so that they are in chronological order
>>>> based on the year?
>>>
>>> Calling sort() on the list should just work.
>>
>> Amazing, it was that easy. :)
>
> One more thing. What if I want them in reverse chronological order? I
> tried reverse() but that seemed to put them in reverse alphabetical
> order based on the second element of the tuple (not the year).

Really?

>>> lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
...          ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'), ('1996', 'aaa')]
>>> lines.sort()
>>> lines
[('1995', 'aaa'), ('1995', 'bbb'), ('1995', 'ccc'), ('1996', 'aaa'),
('1996', 'ccc'), ('1997', 'aaa'), ('1997', 'bbb')]
>>> lines.reverse()
>>> lines
[('1997', 'bbb'), ('1997', 'aaa'), ('1996', 'ccc'), ('1996', 'aaa'),
('1995', 'ccc'), ('1995', 'bbb'), ('1995', 'aaa')]

As you see, the list is being sorted on *both* items - year first, then
sentence. And then of course reversed, since we asked for it !-)

If you want to prevent this from happening and don't mind creating a
copy of the list, you can use the sorted() function with the key and
reverse arguments and operator.itemgetter:
>>> lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
...          ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'), ('1996', 'aaa')]
>>> from operator import itemgetter
>>> sorted(lines, key=itemgetter(0), reverse=True)
[('1997', 'bbb'), ('1997', 'aaa'), ('1996', 'ccc'), ('1996', 'aaa'),
('1995', 'aaa'), ('1995', 'bbb'), ('1995', 'ccc')]

HTH.
Feb 1 '07 #10
Bruno Desthuilliers wrote:
> If you want to prevent this from happening and don't mind creating a
> copy of the list, you can use the sorted() function with the key and
> reverse arguments and operator.itemgetter:
>
> >>> lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
> ...          ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'), ('1996', 'aaa')]
> >>> from operator import itemgetter
> >>> sorted(lines, key=itemgetter(0), reverse=True)
> [('1997', 'bbb'), ('1997', 'aaa'), ('1996', 'ccc'), ('1996', 'aaa'),
> ('1995', 'aaa'), ('1995', 'bbb'), ('1995', 'ccc')]

You don't need to use sorted() -- sort() also takes the key= and
reverse= arguments::
>>> lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
...          ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'),
...          ('1996', 'aaa')]
>>> from operator import itemgetter
>>> lines.sort(key=itemgetter(0), reverse=True)
>>> lines
[('1997', 'bbb'), ('1997', 'aaa'), ('1996', 'ccc'), ('1996', 'aaa'),
('1995', 'aaa'), ('1995', 'bbb'), ('1995', 'ccc')]

STeVe
Feb 1 '07 #11
Steven Bethard wrote:
> Bruno Desthuilliers wrote:
>> If you want to prevent this from happening and don't mind creating a
>> copy of the list, you can use the sorted() function with the key and
>> reverse arguments and operator.itemgetter:

(snip)

> You don't need to use sorted() -- sort() also takes the key= and
> reverse= arguments::

Yeps - thanks for the reminder.
Feb 1 '07 #12
Bruno Desthuilliers wrote:
>> One more thing. What if I want them in reverse chronological order? I
>> tried reverse() but that seemed to put them in reverse alphabetical
>> order based on the second element of the tuple (not the year).
>
> Really?
>
> >>> lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
> ...          ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'), ('1996', 'aaa')]
> >>> lines.sort()
> >>> lines
> [('1995', 'aaa'), ('1995', 'bbb'), ('1995', 'ccc'), ('1996', 'aaa'),
> ('1996', 'ccc'), ('1997', 'aaa'), ('1997', 'bbb')]
> >>> lines.reverse()
> >>> lines
> [('1997', 'bbb'), ('1997', 'aaa'), ('1996', 'ccc'), ('1996', 'aaa'),
> ('1995', 'ccc'), ('1995', 'bbb'), ('1995', 'aaa')]

Oh I didn't sort then reverse, I just replaced sort with reverse. Maybe
that's why!
Feb 1 '07 #13
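The mix-up described in the post above can be reproduced with a short sketch. The three-item list is made up for the example; the point is only that reverse() flips whatever order the list already has, while sort(reverse=True) actually compares the items.

```python
refs = [('1996', 'ccc'), ('1995', 'aaa'), ('1997', 'bbb')]

# reverse() alone: no comparison happens, the current order is flipped.
only_reversed = list(refs)
only_reversed.reverse()
print(only_reversed)
# [('1997', 'bbb'), ('1995', 'aaa'), ('1996', 'ccc')]

# sort(reverse=True): items are compared and ordered newest first.
sorted_desc = list(refs)
sorted_desc.sort(reverse=True)
print(sorted_desc)
# [('1997', 'bbb'), ('1996', 'ccc'), ('1995', 'aaa')]
```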
John Salerno wrote:
(snip)
> Oh I didn't sort then reverse, I just replaced sort with reverse. Maybe
> that's why!

Hmmm... Probably, yes...

!-)
Feb 1 '07 #14
John Salerno <jo******@NOSPAMgmail.com> writes:
> Ah, so simply using sort() [on a list of tuples] will default to the
> first element of each tuple?

More precisely, list.sort will ask the elements of the list to compare
themselves. Those elements are tuples; two tuples will compare based
on comparison of their corresponding elements.

--
\ "The cost of a thing is the amount of what I call life which is |
`\ required to be exchanged for it, immediately or in the long |
_o__) run." -- Henry David Thoreau |
Ben Finney

Feb 1 '07 #15
On Thu, 01 Feb 2007 14:52:03 -0500, John Salerno wrote:
> Bruno Desthuilliers wrote:
>> You don't say how these lines are formatted, but it's possible that you
>> don't even need a regexp here. But with regard to sorting, the list of
>> tuples with the sort key as first element is one of the best solutions.
>
> Ah, so simply using sort() will default to the first element of each tuple?

No. It isn't that sort() knows about tuples. sort() knows how to sort a list
by asking the list items to compare themselves, whatever the items are.
Tuples compare themselves by looking at the first element (if any), and
in the event of a tie going on to the second element, then the third, etc.

--
Steven D'Aprano

Feb 2 '07 #16
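The element-by-element comparison described in the post above can be checked directly. This is a small sketch with made-up tuples; the last two lines add a caveat that is not raised in the thread: years stored as strings only sort chronologically when they all have the same number of digits.

```python
a = ('1995', 'zzz')
b = ('1996', 'aaa')
c = ('1995', 'aaa')

# The first elements differ, so they decide the result.
print(a < b)  # True

# The first elements tie, so the second elements are compared.
print(c < a)  # True

# sort() and sorted() simply rely on these comparisons.
print(sorted([a, b, c]))
# [('1995', 'aaa'), ('1995', 'zzz'), ('1996', 'aaa')]

# Strings compare character by character, integers by value.
print('999' < '1995')  # False
print(999 < 1995)      # True
```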
Steven Bethard <st************@gmail.com> wrote:
> You don't need to use sorted() -- sort() also takes the key= and
> reverse= arguments::
>
> >>> lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
> ...          ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'),
> ...          ('1996', 'aaa')]
> >>> from operator import itemgetter
> >>> lines.sort(key=itemgetter(0), reverse=True)
> >>> lines
> [('1997', 'bbb'), ('1997', 'aaa'), ('1996', 'ccc'), ('1996', 'aaa'),
> ('1995', 'aaa'), ('1995', 'bbb'), ('1995', 'ccc')]

I suspect you want another line in there to give the OP what they actually
want: sort the list alphabetically first and then reverse sort on the year.

The important thing to note is that the reverse flag on the sort method
doesn't reverse elements which compare equal. This makes it possible to
sort on multiple keys comparatively easily.
>>> lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
...          ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'),
...          ('1996', 'aaa')]
>>> from operator import itemgetter
>>> lines.sort(key=itemgetter(1))
>>> lines.sort(key=itemgetter(0), reverse=True)
>>> lines
[('1997', 'aaa'), ('1997', 'bbb'), ('1996', 'aaa'), ('1996', 'ccc'),
('1995', 'aaa'), ('1995', 'bbb'), ('1995', 'ccc')]
Feb 2 '07 #17
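For comparison with the two-pass sort in the post above, here is a sketch of a different technique that gives the same order in one pass: a composite key with the year negated. It assumes every year is a string of digits that int() can convert; the names `by_two_passes` and `by_one_pass` are made up for the example.

```python
from operator import itemgetter

lines = [('1995', 'aaa'), ('1997', 'bbb'), ('1995', 'bbb'),
         ('1997', 'aaa'), ('1995', 'ccc'), ('1996', 'ccc'),
         ('1996', 'aaa')]

# Two stable passes, as in the thread: secondary key first, primary key last.
by_two_passes = list(lines)
by_two_passes.sort(key=itemgetter(1))
by_two_passes.sort(key=itemgetter(0), reverse=True)

# One pass: negate the numeric year so that newer years sort first,
# and let the text break ties in ascending order.
by_one_pass = sorted(lines, key=lambda ref: (-int(ref[0]), ref[1]))

print(by_two_passes == by_one_pass)  # True
```

The two-pass version works for any key type, which is why it relies on the sort being stable; the negation trick only applies to keys that can be turned into numbers.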
Bruno Desthuilliers wrote:
> John Salerno wrote:
> (snip)
>> Oh I didn't sort then reverse, I just replaced sort with reverse.
>> Maybe that's why!
>
> Hmmm... Probably, yes...
>
> !-)

lol, this is what a couple months away from python does to me!
Feb 2 '07 #18
