473,326 Members | 2,680 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,326 software developers and data experts.

Matching Directory Names and Grouping Them

J
Hello Group-

I have limited programming experience, but I'm looking for a generic
way to search through a root directory for subdirectories with similar
names, organize and group them by matching their subdirectory path, and
then output their full paths into a text file. For example, the
contents of the output text file may look like this:

<root>\Input1\2001\01\
<root>\Input2\2001\01\
<root>\Input3\2001\01\

<root>\Input1\2002\03\
<root>\Input2\2002\03\
<root>\Input3\2002\03\

<root>\Input2\2005\05\
<root>\Input3\2005\05\

<root>\Input1\2005\12\
<root>\Input3\2005\12\

I tried working with python regular expressions, but so far haven't
found code that can do the trick. Any help would be greatly
appreciated. Thanks!
J.

Jan 11 '07 #1
4 1214
J wrote:
Hello Group-

I have limited programming experience, but I'm looking for a generic
way to search through a root directory for subdirectories with similar
names, organize and group them by matching their subdirectory path, and
then output their full paths into a text file. For example, the
contents of the output text file may look like this:

<root>\Input1\2001\01\
<root>\Input2\2001\01\
<root>\Input3\2001\01\

<root>\Input1\2002\03\
<root>\Input2\2002\03\
<root>\Input3\2002\03\

<root>\Input2\2005\05\
<root>\Input3\2005\05\

<root>\Input1\2005\12\
<root>\Input3\2005\12\

I tried working with python regular expressions, but so far haven't
found code that can do the trick. Any help would be greatly
appreciated. Thanks!
Define "similar".

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Blog of Note: http://holdenweb.blogspot.com

Jan 11 '07 #2
J
Steve-

Thanks for the reply. I think what I'm trying to say by similar is
pattern matching. Essentially, walking through a directory tree
starting at a specified root folder, and returning a list of all
folders that matches a pattern, in this case, a folder name containing
a four digit number representing year and a subdirectory name
containing a two digit number representing a month. The matches are
grouped together and written into a text file. I hope this helps.

Kind Regards,
J

Steve Holden wrote:
J wrote:
Hello Group-

I have limited programming experience, but I'm looking for a generic
way to search through a root directory for subdirectories with similar
names, organize and group them by matching their subdirectory path, and
then output their full paths into a text file. For example, the
contents of the output text file may look like this:

<root>\Input1\2001\01\
<root>\Input2\2001\01\
<root>\Input3\2001\01\

<root>\Input1\2002\03\
<root>\Input2\2002\03\
<root>\Input3\2002\03\

<root>\Input2\2005\05\
<root>\Input3\2005\05\

<root>\Input1\2005\12\
<root>\Input3\2005\12\

I tried working with python regular expressions, but so far haven't
found code that can do the trick. Any help would be greatly
appreciated. Thanks!
Define "similar".

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Blog of Note: http://holdenweb.blogspot.com
Jan 11 '07 #3
>From your example, if you want to group every path that has the same
last 9 characters, a simple solution could be something like:

groups = {}
for path in paths:
group = groups.setdefault(path[-9:],[])
group.append(path)

I didn't actually test it, there ight be syntax errors.

J wrote:
Steve-

Thanks for the reply. I think what I'm trying to say by similar is
pattern matching. Essentially, walking through a directory tree
starting at a specified root folder, and returning a list of all
folders that matches a pattern, in this case, a folder name containing
a four digit number representing year and a subdirectory name
containing a two digit number representing a month. The matches are
grouped together and written into a text file. I hope this helps.

Kind Regards,
J

Steve Holden wrote:
J wrote:
Hello Group-
>
I have limited programming experience, but I'm looking for a generic
way to search through a root directory for subdirectories with similar
names, organize and group them by matching their subdirectory path, and
then output their full paths into a text file. For example, the
contents of the output text file may look like this:
>
<root>\Input1\2001\01\
<root>\Input2\2001\01\
<root>\Input3\2001\01\
>
<root>\Input1\2002\03\
<root>\Input2\2002\03\
<root>\Input3\2002\03\
>
<root>\Input2\2005\05\
<root>\Input3\2005\05\
>
<root>\Input1\2005\12\
<root>\Input3\2005\12\
>
I tried working with python regular expressions, but so far haven't
found code that can do the trick. Any help would be greatly
appreciated. Thanks!
>
Define "similar".

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Blog of Note: http://holdenweb.blogspot.com
Jan 12 '07 #4
On 2007-01-11, J <wi***********@gmail.comwrote:
Steve-

Thanks for the reply. I think what I'm trying to say by similar
is pattern matching. Essentially, walking through a directory
tree starting at a specified root folder, and returning a list
of all folders that matches a pattern, in this case, a folder
name containing a four digit number representing year and a
subdirectory name containing a two digit number representing a
month. The matches are grouped together and written into a text
file. I hope this helps.
Here's a solution using itertools.groupby, just because this is
the first programming problem I've seen that seemed to call for
it. Hooray!

from itertools import groupby

def print_by_date(dirs):
r""" Group a directory list according to date codes.
>>data = [
... "<root>/Input2/2002/03/",
... "<root>/Input1/2001/01/",
... "<root>/Input3/2005/05/",
... "<root>/Input3/2001/01/",
... "<root>/Input1/2002/03/",
... "<root>/Input3/2005/12/",
... "<root>/Input2/2001/01/",
... "<root>/Input3/2002/03/",
... "<root>/Input2/2005/05/",
... "<root>/Input1/2005/12/"]
>>print_by_date(data)
<root>/Input1/2001/01/
<root>/Input2/2001/01/
<root>/Input3/2001/01/
<BLANKLINE>
<root>/Input1/2002/03/
<root>/Input2/2002/03/
<root>/Input3/2002/03/
<BLANKLINE>
<root>/Input2/2005/05/
<root>/Input3/2005/05/
<BLANKLINE>
<root>/Input1/2005/12/
<root>/Input3/2005/12/
<BLANKLINE>

"""
def date_key(path):
return path[-7:]
groups =[list(g) for _,g in groupby(sorted(dirs, key=date_key), date_key)]
for g in groups:
print '\n'.join(path for path in sorted(g))
print

if __name__ == "__main__":
import doctest
doctest.testmod()

I really wanted nested join calls for the output, to suppress
that trailing blank line, but I kept getting confused and
couldn't sort it out.

It would better to use the os.path module, but I couldn't find
the function in there lets me pull out path tails.

I didn't filter out stuff that didn't match the date path
convention you used.

--
Neil Cerutti
Jan 12 '07 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

7
by: lawrence | last post by:
2 Questions: 1.) Can anyone think of a way to speed up this function? It is terribly slow. I plan to reduce the number of directories to 3, which I guess will speed it up in the end. 2.) This...
2
by: Jim Dabell | last post by:
I'm in the middle of writing a small app for Linux that needs to create directories that take their names from untrusted data. If possible, I'd like to preserve special characters rather than...
0
by: Carl | last post by:
I want to create a generic xslt that would take xml input like: ########################################################################## <?xml version="1.0" encoding="utf-8" ?>...
2
by: NeverLift | last post by:
I can't believe I'm the first to encounter this, but I've not found any similar posting. Been developing pages for a few months, getting up to speed in javascript tricks, inline incremental...
2
by: Sheila | last post by:
I am hoping someone here has experience with this. I am using Access 97, but if someone has done this in another version I'd appreciate input as well I need to populate a table with subdirectory...
1
by: Trint Smith | last post by:
I can get file names with the following code... Dim di As New DirectoryInfo("c:\OrderData\Database") Dim fiArr As FileInfo() = di.GetFiles() Dim f As FileInfo For Each f In fiArr Dim...
2
by: Steven J. Reed | last post by:
I have a Web Service that needs to find a Virtual Directory name, determine the actual path of that directory, then read / write files to that directory. How can I find a virtual directory then...
22
by: rtilley | last post by:
# Spaces are present before and after the XXX filename = ' XXX ' new_filename = filename.strip() if new_filename != filename: print filename Macs allow these spaces in file and folder...
5
by: Martin Carpella | last post by:
Hi! I know, that Path-names should be shorter than MAX_PATH (250), but the Win32-SDK allows specifiying longer names using the "\\?\" prefix. As I have an application that accesses files...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
0
by: ryjfgjl | last post by:
ExcelToDatabase: batch import excel into database automatically...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: Vimpel783 | last post by:
Hello! Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.