Bytes IT Community

Python - Sort files based on timestamp encoded in the filename

I have a list of file names, and I want to sort it by the timestamp encoded in each file name.

Note: In the file name Hello_Hi_2015-02-20T084521_1424543480.tar.gz, the field 2015-02-20T084521 has the form "year-month-dayTHHMMSS". This is the value I want to sort on.
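
To make the format in the note concrete, here is a minimal sketch that pulls the timestamp field out of one name and parses it with `datetime.strptime`. The helper name `parse_stamp` is hypothetical, and the sketch assumes every name has the timestamp as its third underscore-separated field.

```python
from datetime import datetime

def parse_stamp(filename):
    """Parse the timestamp held in the third underscore-separated field."""
    stamp = filename.split("_")[2]  # e.g. "2015-02-20T084521"
    return datetime.strptime(stamp, "%Y-%m-%dT%H%M%S")

print(parse_stamp("Hello_Hi_2015-02-20T084521_1424543480.tar.gz"))
# 2015-02-20 08:45:21
```

Because the field is fixed-width and zero-padded, comparing the raw strings gives the same order as comparing the parsed values, so parsing is optional for sorting alone.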

Input list:

file_list = ['Hello_Hi_2015-02-20T084521_1424543480.tar.gz',
'Hello_Hi_2015-02-20T095845_1424543481.tar.gz',
'Hello_Hi_2015-02-20T095926_1424543481.tar.gz',
'Hello_Hi_2015-02-20T100025_1424543482.tar.gz',
'Hello_Hi_2015-02-20T111631_1424543483.tar.gz',
'Hello_Hi_2015-02-20T111718_1424543483.tar.gz',
'Hello_Hi_2015-02-20T112502_1424543483.tar.gz',
'Hello_Hi_2015-02-20T112633_1424543484.tar.gz',
'Hello_Hi_2015-02-20T113427_1424543484.tar.gz',
'Hello_Hi_2015-02-20T113456_1424543484.tar.gz',
'Hello_Hi_2015-02-20T113608_1424543484.tar.gz',
'Hello_Hi_2015-02-20T113659_1424543485.tar.gz',
'Hello_Hi_2015-02-20T113809_1424543485.tar.gz',
'Hello_Hi_2015-02-20T113901_1424543485.tar.gz',
'Hello_Hi_2015-02-20T113955_1424543485.tar.gz',
'Hello_Hi_2015-03-20T114122_1424543485.tar.gz',
'Hello_Hi_2015-02-20T114532_1424543486.tar.gz',
'Hello_Hi_2015-02-20T120045_1424543487.tar.gz',
'Hello_Hi_2015-02-20T120146_1424543487.tar.gz',
'Hello_WR_2015-02-20T084709_1424543480.tar.gz',
'Hello_WR_2015-02-20T113016_1424543486.tar.gz']

Output should be:

file_list = ['Hello_Hi_2015-02-20T084521_1424543480.tar.gz',
'Hello_WR_2015-02-20T084709_1424543480.tar.gz',
'Hello_Hi_2015-02-20T095845_1424543481.tar.gz',
'Hello_Hi_2015-02-20T095926_1424543481.tar.gz',
'Hello_Hi_2015-02-20T100025_1424543482.tar.gz',
'Hello_Hi_2015-02-20T111631_1424543483.tar.gz',
'Hello_Hi_2015-02-20T111718_1424543483.tar.gz',
'Hello_Hi_2015-02-20T112502_1424543483.tar.gz',
'Hello_Hi_2015-02-20T112633_1424543484.tar.gz',
'Hello_WR_2015-02-20T113016_1424543486.tar.gz',
'Hello_Hi_2015-02-20T113427_1424543484.tar.gz',
'Hello_Hi_2015-02-20T113456_1424543484.tar.gz',
'Hello_Hi_2015-02-20T113608_1424543484.tar.gz',
'Hello_Hi_2015-02-20T113659_1424543485.tar.gz',
'Hello_Hi_2015-02-20T113809_1424543485.tar.gz',
'Hello_Hi_2015-02-20T113901_1424543485.tar.gz',
'Hello_Hi_2015-02-20T113955_1424543485.tar.gz',
'Hello_Hi_2015-02-20T114532_1424543486.tar.gz',
'Hello_Hi_2015-02-20T120045_1424543487.tar.gz',
'Hello_Hi_2015-02-20T120146_1424543487.tar.gz',
'Hello_Hi_2015-03-20T114122_1424543485.tar.gz']
Below is the code I have tried.

    import glob
    import os

    def sort(dir):
        os.chdir(dir)
        file_list = glob.glob('Hello_*')
        # Orders by file modification time, not by the timestamp in the name.
        file_list.sort(key=os.path.getmtime)
        print("\n".join(file_list))
        return 0

Thanks in advance!!
Aug 7 '15 #1

✓ answered by bvdet

2 Replies


bvdet
Here is a regex solution:

    import re

    # Python 2 code: relies on cmp() and the print statement.
    # The group captures the third underscore-separated field (the timestamp).
    pattern = re.compile(r"^\D+?_\D+?_(.+?)_")

    def sort_on_TS(a, b):
        return cmp(pattern.match(a).group(1), pattern.match(b).group(1))

    for item in sorted(file_list, sort_on_TS):
        print item

This version uses the string method split:

    def sort_on_TS(a, b):
        return cmp(a.split("_")[2], b.split("_")[2])

It takes more lines to write, but you could also use the string method index:

    def sort_on_TS(a, b):
        # Find the second and third underscores in each name.
        idx1 = a.index("_", a.index("_")+1)
        idx2 = a.index("_", idx1+1)
        idx11 = b.index("_", b.index("_")+1)
        idx22 = b.index("_", idx11+1)
        # Slice between the underscores so only the timestamp is compared.
        return cmp(a[idx1+1:idx2], b[idx11+1:idx22])
Aug 7 '15 #2
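
The comparator style in the answer depends on the `cmp()` builtin and on passing a comparison function to `sorted`, both of which were removed in Python 3. As a rough sketch of how the same idea might look there, the block below shows a key function and, for comparison, the comparator wrapped with `functools.cmp_to_key`. The names `timestamp_key`, `sort_on_ts`, and `sample` are illustrative and not from the thread; the sample is a small subset of the list in the question.

```python
from functools import cmp_to_key

def timestamp_key(name):
    # Third underscore-separated field, e.g. "2015-02-20T084521".
    return name.split("_")[2]

def sort_on_ts(a, b):
    # Comparator in the style of the answer; (x > y) - (x < y)
    # stands in for the removed cmp() builtin.
    x, y = timestamp_key(a), timestamp_key(b)
    return (x > y) - (x < y)

sample = [
    "Hello_Hi_2015-02-20T113955_1424543485.tar.gz",
    "Hello_Hi_2015-03-20T114122_1424543485.tar.gz",
    "Hello_WR_2015-02-20T084709_1424543480.tar.gz",
    "Hello_Hi_2015-02-20T114532_1424543486.tar.gz",
]

by_key = sorted(sample, key=timestamp_key)
by_cmp = sorted(sample, key=cmp_to_key(sort_on_ts))

for item in by_key:
    print(item)
```

Both calls should produce the same order. The key function is usually the simpler choice, since each name is split once instead of once per comparison.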

@bvdet: Excellent solution!!! Thank you very much!!!
Aug 8 '15 #3
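
For completeness, here is one way the directory-based function from the question might be adapted to order by the name instead of the modification time. This is only a sketch: the function name `sort_by_name_timestamp` and its `pattern` parameter are hypothetical, and it assumes every matching file has the timestamp as the third underscore-separated field of its base name (a name with fewer fields would raise `IndexError`).

```python
import glob
import os

def sort_by_name_timestamp(directory, pattern="Hello_*"):
    """Return matching paths ordered by the timestamp in the file name."""
    paths = glob.glob(os.path.join(directory, pattern))
    # Key on the base name so underscores in the directory path
    # do not shift the field positions.
    return sorted(paths, key=lambda p: os.path.basename(p).split("_")[2])
```

Calling `sort_by_name_timestamp(".")` would return the matching paths in the current directory. Building the path with `os.path.join` avoids the `os.chdir` call in the original attempt, so the caller's working directory is left alone.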
