This script is not recursive... in order to make it recursive, I have to
call it several times (my kludge... hey, it works). I thought os.walk's
sole purpose was to recursively walk a directory structure, no? Also, it
generates the below error during the os.renames section, but the odd
thing is that it actually renames the files before saying it can't find
them. Any ideas are welcomed. If I'm doing something *really* wrong
here, just let me know.
#-------------- ERROR Message ----------------------#
  File "/home/rbt/fix-names-1.1.py", line 29, in ?
    clean_names(setpath)
  File "/home/rbt/fix-names-1.1.py", line 27, in clean_names
    os.renames(oldpath, newpath)
  File "/usr/local/lib/python2.3/os.py", line 196, in renames
    rename(old, new)
OSError: [Errno 2] No such file or directory
#------------- Code -------------------------#
setpath = raw_input("Path to the Directory: ")
bad = re.compile(r'[*?<>/\|\\]')
for root, dirs, files in os.walk(setpath):
    for dname in dirs:
        badchars = bad.findall(dname)
        for badchar in badchars:
            newdname = dname.replace(badchar,'-')
            if newdname != dname:
                newpath = os.path.join(root, newdname)
                oldpath = os.path.join(root, dname)
                os.renames(oldpath, newpath)
Your code is trying to recurse into the list of directories in 'dirs',
but you are renaming these directories before it can get to them. For
example, if dirs = ['baddir?*', 'gooddir', 'okdir'], you rename
'baddir?*' to 'baddir--' and then os.walk tries to enter 'baddir?*' and
cannot find it. You're better off building a list of paths to rename,
and then renaming them outside of the os.walk scope, or doing something
like...
dirs.remove(dname)
dirs.append(newdname)
...in your 'if' block.
Peace,
Joe
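The skipped-directory effect described above can be reproduced on a throwaway tree. This is only a sketch, written for Python 3 rather than the thread's Python 2.3; the helper name visited_dirs and the directory names are made up for illustration. It relies on os.walk ignoring listing errors when onerror is None, which is why the walk quietly stops descending instead of raising.

```python
import os
import tempfile

def visited_dirs(top, rename_map):
    """Walk top-down and rename directories listed in rename_map
    WITHOUT updating the dirs list; return every root that was visited."""
    seen = []
    for root, dirs, files in os.walk(top):
        seen.append(os.path.relpath(root, top))
        for dname in dirs:
            if dname in rename_map:
                os.rename(os.path.join(root, dname),
                          os.path.join(root, rename_map[dname]))
    return seen

with tempfile.TemporaryDirectory() as top:
    # Hypothetical tree: top/bad%20dir/child
    os.makedirs(os.path.join(top, "bad%20dir", "child"))
    seen = visited_dirs(top, {"bad%20dir": "bad-dir"})
    # The rename itself succeeded and the child is still there on disk...
    renamed_ok = os.path.isdir(os.path.join(top, "bad-dir", "child"))

# ...but os.walk still looked for the old name, so only the top was visited.
print(seen, renamed_ok)
```

Because the walk never enters the renamed directory, anything below it is left for a later run, which matches the "I have to call it several times" symptom.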
So, which is better... rename in the os.walk scope or not? The code
below works sometimes; at other times it produces this error:
ValueError: list.remove(x): x is not in list
setpath = raw_input("Path to the Directory: ")
def clean_names(setpath):
    bad = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')
    for root, dirs, files in os.walk(setpath):
        for dname in dirs:
            badchars = bad.findall(dname)
            for badchar in badchars:
                newdname = dname.replace(badchar,'-')
                if newdname != dname:
                    dirs.remove(dname)
                    dirs.append(newdname)
                    newpath = os.path.join(root, newdname)
                    oldpath = os.path.join(root, dname)
                    os.renames(oldpath, newpath)
That's strange. It shouldn't be happening. Stick some print statements
in there and see what's going on:
setpath = raw_input("Path to the Directory: ")
def clean_names(setpath):
    bad = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')
    for root, dirs, files in os.walk(setpath):
        for dname in dirs:
            badchars = bad.findall(dname)
            for badchar in badchars:
                newdname = dname.replace(badchar,'-')
                if newdname != dname:
                    try:
                        dirs.remove(dname)
                    except ValueError:
                        print "%s not in %s" % (dname, dirs)
                    else:
                        dirs.append(newdname)
                        newpath = os.path.join(root, newdname)
                        oldpath = os.path.join(root, dname)
                        os.renames(oldpath, newpath)
Note that I'm assuming it's the dirs.remove(dname) call that's
triggering the ValueError, since there aren't any invocations of
list.remove() anywhere else in your sample code. But I could be wrong;
you should look at the complete exception trace, which will include the
line number at which the exception was thrown.
--
Robin Munn rm***@pobox.com
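One way the posted loop can raise this ValueError is a name that contains more than one match: the inner loop runs once per match, so dirs.remove(dname) is called a second time for a name that was already removed. The sketch below replays only the list bookkeeping on a plain list, with no filesystem calls. It is written for Python 3, and the function name simulate and the sample names are invented for the illustration.

```python
import re

bad = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')

def simulate(dirs):
    """Replay the remove/append bookkeeping from the posted loop.
    Returns the names for which list.remove() raised ValueError."""
    failures = []
    for dname in dirs:
        for badchar in bad.findall(dname):
            newdname = dname.replace(badchar, '-')
            if newdname != dname:
                try:
                    dirs.remove(dname)      # second match: already gone
                except ValueError:
                    failures.append(dname)
                else:
                    dirs.append(newdname)
    return failures

print(simulate(["one?"]))    # a single match is removed once
print(simulate(["two*?"]))   # two matches trigger a second remove()
```

A name with exactly one match goes through cleanly, which would explain why the script only fails "sometimes".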
Thanks for the tip. That code shows all of the dirs that Python is
complaining about not in the list... trouble is, they *are* in the list.
Go figure. I'd like to try doing the rename outside the scope of
os.walk, but I don't understand how to do this. When I break out of
os.walk and try the rename at a parallel level, Python complains that
variables such as "oldpath" and "newpath" are undefined.
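Renaming outside the os.walk loop usually means remembering each (oldpath, newpath) pair in a list during the walk and acting on that list afterwards, so the names are no longer just loop-local variables. The following is a minimal sketch of that idea under a few assumptions: it targets Python 3, it swaps the per-character replace for a single regex substitution, and it renames the deepest paths first so that recorded parent paths stay valid. The name pending and the sample tree are hypothetical.

```python
import os
import re
import tempfile

BAD = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')

def clean_names(top):
    # Pass 1: walk the untouched tree and only record what needs renaming.
    pending = []
    for root, dirs, files in os.walk(top):
        for dname in dirs:
            newdname = BAD.sub('-', dname)
            if newdname != dname:
                pending.append((os.path.join(root, dname),
                                os.path.join(root, newdname)))
    # Pass 2: rename the deepest paths first, so the parent components
    # recorded in pass 1 still exist when each child is renamed.
    pending.sort(key=lambda pair: pair[0].count(os.sep), reverse=True)
    for oldpath, newpath in pending:
        os.rename(oldpath, newpath)
    return pending

with tempfile.TemporaryDirectory() as top:
    # Hypothetical tree: top/a%20b/c%20d
    os.makedirs(os.path.join(top, "a%20b", "c%20d"))
    renamed = clean_names(top)
    result_exists = os.path.isdir(os.path.join(top, "a-b", "c-d"))

print(len(renamed), result_exists)
```

Since nothing is renamed until the walk has finished, the dirs list never goes stale and a single call covers the whole tree.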
Try iterating from bottom to top.
See "help(os.walk)":
walk(top, topdown=True, onerror=None)
...
If optional arg 'topdown' is true or not specified, the triple for a
directory is generated before the triples for any of its subdirectories
(directories are generated top down). If topdown is false, the triple
for a directory is generated after the triples for all of its
subdirectories (directories are generated bottom up).
...
Wait, I just realized that you're changing the list *while* you're
iterating over it. That's a bad idea. See the warning at the bottom of
this page in the language reference: http://www.python.org/doc/current/ref/for.html
Instead of modifying the list while you're looping over it, use the
topdown argument to os.walk to build the tree from the bottom up instead
of from the top down. That way you won't have to futz with the dirnames
list at all:
def clean_names(rootpath):
    bad = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')
    for root, dirs, files in os.walk(rootpath, topdown=False):
        for dname in dirs:
            newdname = re.sub(bad, '-', dname)
            if newdname != dname:
                newpath = os.path.join(root, newdname)
                oldpath = os.path.join(root, dname)
                os.renames(oldpath, newpath)
Notice also the use of re.sub to do all the character substitutions at
once. Your code as written would have failed on a filename like "foo*?",
since it always renamed from the original filename: it would have first
done os.renames("foo*?", "foo-?") followed by os.renames("foo*?",
"foo*-") and the second would have raised an OSError.
--
Robin Munn rm***@pobox.com
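The difference between the two substitution styles can be checked without touching the filesystem. This small sketch (Python 3; the helper names per_match_targets and single_target are made up here) lists the rename targets the per-match loop would compute from the unchanged original name, next to the single target produced by one regex substitution.

```python
import re

bad = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')

def per_match_targets(name):
    """Targets the per-match loop computes, one per match,
    each derived from the unchanged original name."""
    return [name.replace(m, '-') for m in bad.findall(name)]

def single_target(name):
    """One target with every match replaced in a single pass."""
    return bad.sub('-', name)

print(per_match_targets("foo*?"))  # ['foo-?', 'foo*-']
print(single_target("foo*?"))      # foo--
```

With the per-match approach the second rename still uses the original name as its source, and that source no longer exists once the first rename has run.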
Here's a way to modify the list while iterating over it. Too lazy to
generate the sample directory tree, so I suggest that the OP test it :-)
<untested>
def clean_names(rootpath):
    bad = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')
    for root, dirs, files in os.walk(rootpath):
        for index, dname in enumerate(dirs):
            newdname = bad.sub('-', dname)
            if newdname != dname:
                newpath = os.path.join(root, newdname)
                oldpath = os.path.join(root, dname)
                try:
                    os.rename(oldpath, newpath)
                except OSError:
                    print >> sys.stderr, "cannot rename %r to %r" % (oldpath, newpath)
                else:
                    dirs[index] = newdname # inform os.walk() about the new name
</untested>
Peter
This works great! No errors... and it gets dirs that are 8 levels deep
(that's as far down as I've tested). Thanks for the tip! The re.sub
seems to be much faster than the string find/replace approach as well...
I need to read up more on the documentation of os.walk and re in
general. Thanks again!!!
Could we discuss more about the topdown feature in os.walk? My script is
working fine now, I have no trouble at all with it. I just want to
better understand os.walk in Python 2.3. This is how I understand it as
of today, someone please correct me if I'm wrong:
topdown=False would build a list of filesystem (fs) objects from the
bottom up. The objects at the beginning of the list would be the end-most
objects (the leaf nodes) of the fs. When you make changes to that list,
the changes would be from leaf node to os.walk's root instead of root to
leaf node, correct? For example, if I had this dir structure:
dir_a
    file_a
    dir_b
        file_b
My list would look like this:
file_b
dir_b
file_a
dir_a
And, if I made changes to the list and committed those changes to the fs
then there would be no problems because of the order in which the
changes are made. Is this a proper way to describe topdown=False in
os.walk? Or in other words, our list would be static (one change would
not impact another), where if topdown=True our list would be dynamic
(one change could impact another).
Thanks for the help!!!
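For the question about ordering, it may help to print what os.walk actually yields for the dir_a example. It yields one (root, dirs, files) triple per directory rather than a single flat list of objects, and topdown only changes the order of those triples. The sketch below is written for Python 3 and builds the example tree in a temporary directory; the helper name walk_order is an assumption of this illustration.

```python
import os
import tempfile

def walk_order(top, topdown):
    """Return (relative root, dirs, files) triples in the order os.walk yields them."""
    return [(os.path.relpath(root, top), sorted(dirs), sorted(files))
            for root, dirs, files in os.walk(top, topdown=topdown)]

with tempfile.TemporaryDirectory() as base:
    # Recreate the example: dir_a holds file_a and dir_b; dir_b holds file_b.
    os.makedirs(os.path.join(base, "dir_a", "dir_b"))
    open(os.path.join(base, "dir_a", "file_a"), "w").close()
    open(os.path.join(base, "dir_a", "dir_b", "file_b"), "w").close()
    top = os.path.join(base, "dir_a")
    bottom_up = walk_order(top, topdown=False)
    top_down = walk_order(top, topdown=True)

for triple in bottom_up:
    print(triple)
# ('dir_b', [], ['file_b'])
# ('.', ['dir_b'], ['file_a'])
```

With topdown=False the triple for dir_b arrives before the triple for dir_a, so by the time dir_b is renamed from its parent's triple, everything beneath it has already been handled. In that mode, editing the dirs list has no effect on which directories are visited; with topdown=True the list is consulted after each triple is yielded, which is why it has to be kept in sync there.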