Matching Directory Names and Grouping Them

J
Hello Group-

I have limited programming experience, but I'm looking for a generic
way to search through a root directory for subdirectories with similar
names, organize and group them by matching their subdirectory path, and
then output their full paths into a text file. For example, the
contents of the output text file may look like this:

<root>\Input1\2001\01\
<root>\Input2\2001\01\
<root>\Input3\2001\01\

<root>\Input1\2002\03\
<root>\Input2\2002\03\
<root>\Input3\2002\03\

<root>\Input2\2005\05\
<root>\Input3\2005\05\

<root>\Input1\2005\12\
<root>\Input3\2005\12\

I tried working with python regular expressions, but so far haven't
found code that can do the trick. Any help would be greatly
appreciated. Thanks!
J.

Jan 11 '07 #1
4 Replies


Define "similar".

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Blog of Note: http://holdenweb.blogspot.com

Jan 11 '07 #2

J
Steve-

Thanks for the reply. I think what I'm trying to say by similar is
pattern matching. Essentially, walking through a directory tree
starting at a specified root folder, and returning a list of all
folders that match a pattern, in this case, a folder name containing
a four digit number representing year and a subdirectory name
containing a two digit number representing a month. The matches are
grouped together and written into a text file. I hope this helps.

Kind Regards,
J

Jan 11 '07 #3

From your example, if you want to group every path that has the same
last 9 characters, a simple solution could be something like:

groups = {}
for path in paths:
    group = groups.setdefault(path[-9:], [])
    group.append(path)

I didn't actually test it, there might be syntax errors.
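
For anyone wanting to try the idea end to end, here is a minimal sketch. It assumes the paths come from os.walk and end with a separator, so that the last 9 characters are the "\2001\01\" part as in the example. The names collect_dirs, group_by_suffix and the parameter n are made up for illustration.

```python
import os

def collect_dirs(root):
    """Return every subdirectory path under root, each with a trailing separator."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            found.append(os.path.join(dirpath, name) + os.sep)
    return found

def group_by_suffix(paths, n=9):
    """Group paths that share the same last n characters."""
    groups = {}
    for path in paths:
        # setdefault returns the existing list for this suffix,
        # or stores and returns a new empty one.
        groups.setdefault(path[-n:], []).append(path)
    return groups
```

Calling group_by_suffix(collect_dirs(root)) gives a dict mapping each suffix to its list of full paths. Note that this groups every directory by its tail, not only the year/month ones.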

Jan 12 '07 #4

On 2007-01-11, J <wi***********@gmail.com> wrote:
> Steve-
>
> Thanks for the reply. I think what I'm trying to say by similar
> is pattern matching. Essentially, walking through a directory
> tree starting at a specified root folder, and returning a list
> of all folders that match a pattern, in this case, a folder
> name containing a four digit number representing year and a
> subdirectory name containing a two digit number representing a
> month. The matches are grouped together and written into a text
> file. I hope this helps.

Here's a solution using itertools.groupby, just because this is
the first programming problem I've seen that seemed to call for
it. Hooray!

from itertools import groupby

def print_by_date(dirs):
    r""" Group a directory list according to date codes.
    >>> data = [
    ...     "<root>/Input2/2002/03/",
    ...     "<root>/Input1/2001/01/",
    ...     "<root>/Input3/2005/05/",
    ...     "<root>/Input3/2001/01/",
    ...     "<root>/Input1/2002/03/",
    ...     "<root>/Input3/2005/12/",
    ...     "<root>/Input2/2001/01/",
    ...     "<root>/Input3/2002/03/",
    ...     "<root>/Input2/2005/05/",
    ...     "<root>/Input1/2005/12/"]
    >>> print_by_date(data)
    <root>/Input1/2001/01/
    <root>/Input2/2001/01/
    <root>/Input3/2001/01/
    <BLANKLINE>
    <root>/Input1/2002/03/
    <root>/Input2/2002/03/
    <root>/Input3/2002/03/
    <BLANKLINE>
    <root>/Input2/2005/05/
    <root>/Input3/2005/05/
    <BLANKLINE>
    <root>/Input1/2005/12/
    <root>/Input3/2005/12/
    <BLANKLINE>

    """
    def date_key(path):
        # The trailing "YYYY/MM/" date code is the last 8 characters.
        return path[-8:]
    # groupby only merges adjacent items, so sort by the same key first.
    groups = [list(g) for _, g in groupby(sorted(dirs, key=date_key), date_key)]
    for g in groups:
        # One path per line, followed by a blank line after each group.
        print('\n'.join(sorted(g)) + '\n')

if __name__ == "__main__":
    import doctest
    doctest.testmod()

I really wanted nested join calls for the output, to suppress
that trailing blank line, but I kept getting confused and
couldn't sort it out.
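
One way the nested joins might look, as a sketch only: the inner join puts one path per line within a group, and the outer join puts a single blank line between groups, so nothing trails after the last one. The name format_by_date and the key_len parameter are made up here, and the key assumes the date code is the last 8 characters ("YYYY/MM/").

```python
from itertools import groupby

def format_by_date(dirs, key_len=8):
    """Return the grouped listing as one string with no trailing blank line."""
    def date_key(path):
        # Assumption: the path ends in "YYYY/MM/", i.e. 8 characters.
        return path[-key_len:]
    ordered = sorted(dirs, key=date_key)
    return '\n\n'.join(
        '\n'.join(sorted(group))          # one path per line within a group
        for _, group in groupby(ordered, date_key)
    )
```

Because it returns a string instead of printing, the result can be written straight to the text file the original question asked for, for example with open(name, "w").write(format_by_date(dirs)).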

It would be better to use the os.path module, but I couldn't find
a function in there that lets me pull out path tails.
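
As far as I can tell there is no single os.path call for this; os.path.split only peels off the last component. A possible workaround, sketched below, is to normalize the path and split it on the separator. The helper name path_tail and its depth parameter are hypothetical.

```python
import os

def path_tail(path, depth=2):
    """Return the last `depth` components of path, joined with os.sep."""
    # normpath drops any trailing separator and converts "/" to the
    # platform separator where the two differ.
    parts = os.path.normpath(path).split(os.sep)
    return os.sep.join(parts[-depth:])
```

Used as the grouping key in place of a fixed-length slice, this would not depend on the year and month having a particular number of characters.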

I didn't filter out stuff that didn't match the date path
convention you used.
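
Filtering could be done with a regular expression along these lines. This is a sketch, assuming the convention is a four-digit directory followed by a two-digit directory at the very end of the path, with either kind of slash; the names DATE_TAIL, date_code and only_dated are invented for the example.

```python
import re

# Separator, four digits, separator, two digits, optional trailing separator.
DATE_TAIL = re.compile(r'[\\/](\d{4})[\\/](\d{2})[\\/]?$')

def date_code(path):
    """Return (year, month) as strings if path ends in a date code, else None."""
    m = DATE_TAIL.search(path)
    if m is None:
        return None
    return m.group(1), m.group(2)

def only_dated(paths):
    """Keep only the paths that follow the year/month convention."""
    return [p for p in paths if date_code(p) is not None]
```

The (year, month) tuple returned by date_code would also work as a sort and groupby key. It does not check that the month is in the range 01 to 12.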

--
Neil Cerutti
Jan 12 '07 #5
