os.walk help

This script is not recursive... in order to make it recursive, I have to
call it several times (my kludge... hey, it works). I thought os.walk's
sole purpose was to recursively walk a directory structure, no? Also, it
generates the below error during the os.renames section, but the odd
thing is that it actually renames the files before saying it can't find
them. Any ideas are welcomed. If I'm doing something *really* wrong
here, just let me know.

#-------------- ERROR Message ----------------------#

  File "/home/rbt/fix-names-1.1.py", line 29, in ?
    clean_names(setpath)
  File "/home/rbt/fix-names-1.1.py", line 27, in clean_names
    os.renames(oldpath, newpath)
  File "/usr/local/lib/python2.3/os.py", line 196, in renames
    rename(old, new)
OSError: [Errno 2] No such file or directory

#------------- Code -------------------------#

setpath = raw_input("Path to the Directory: ")
bad = re.compile(r'[*?<>/\|\\]')
for root, dirs, files in os.walk(setpath):
    for dname in dirs:
        badchars = bad.findall(dname)
        for badchar in badchars:
            newdname = dname.replace(badchar,'-')
            if newdname != dname:
                newpath = os.path.join(root, newdname)
                oldpath = os.path.join(root, dname)
                os.renames(oldpath, newpath)

Jul 18 '05 #1

Your code is trying to recurse into the list of directories in 'dirs',
but you are renaming these directories before it can get to them. For
example, if dirs = ['baddir?*', 'gooddir', 'okdir'], you rename
'baddir?*' to 'baddir--' and then os.walk tries to enter 'baddir?*' and
cannot find it. You're better off building a list of paths to rename,
and then renaming them outside of the os.walk scope, or doing something
like...

    dirs.remove(dname)
    dirs.append(newdname)

...in your 'if' block.

Peace,
Joe
Jul 18 '05 #2

So, which is better... rename in the os.walk scope or not? The code
below works sometimes; at other times it produces this error:

ValueError: list.remove(x): x is not in list

setpath = raw_input("Path to the Directory: ")
def clean_names(setpath):
    bad = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')
    for root, dirs, files in os.walk(setpath):
        for dname in dirs:
            badchars = bad.findall(dname)
            for badchar in badchars:
                newdname = dname.replace(badchar,'-')
                if newdname != dname:
                    dirs.remove(dname)
                    dirs.append(newdname)
                    newpath = os.path.join(root, newdname)
                    oldpath = os.path.join(root, dname)
                    os.renames(oldpath, newpath)

Jul 18 '05 #3

That's strange. It shouldn't be happening. Stick some print statements
in there and see what's going on:

setpath = raw_input("Path to the Directory: ")
def clean_names(setpath):
    bad = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')
    for root, dirs, files in os.walk(setpath):
        for dname in dirs:
            badchars = bad.findall(dname)
            for badchar in badchars:
                newdname = dname.replace(badchar,'-')
                if newdname != dname:
                    try:
                        dirs.remove(dname)
                    except ValueError:
                        print "%s not in %s" % (dname, dirs)
                    else:
                        dirs.append(newdname)
                        newpath = os.path.join(root, newdname)
                        oldpath = os.path.join(root, dname)
                        os.renames(oldpath, newpath)


Note that I'm assuming it's the dirs.remove(dname) call that's
triggering the ValueError, since there aren't any invocations of
list.remove() anywhere else in your sample code. But I could be wrong;
you should look at the complete exception trace, which will include the
line number at which the exception was thrown.
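
For what it's worth, one way the ValueError can come out of the code as posted: the remove/append pair sits inside the "for badchar in badchars" loop, so a name with two bad characters gets removed on the first pass and is already gone on the second. The sketch below is Python 3 and uses plain lists only; `dirs`, `dname` and `badchars` are hypothetical stand-ins for what os.walk and bad.findall would hand back, and it also shows how to capture the full trace with the traceback module.

```python
import traceback

# Hypothetical stand-ins: one 'dirs' list as os.walk would yield it,
# and the matches bad.findall(dname) would return for this name.
dirs = ["foo*?", "gooddir"]
dname = "foo*?"
badchars = ["*", "?"]

trace = ""
try:
    for badchar in badchars:
        newdname = dname.replace(badchar, "-")
        if newdname != dname:
            dirs.remove(dname)     # second pass: dname is no longer in dirs
            dirs.append(newdname)
except ValueError:
    # format_exc() returns the complete trace, including the line number
    trace = traceback.format_exc()
    print(trace)
```

If this is what is happening, it would also fit "works sometimes": names with a single bad character only go through the inner loop once.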

--
Robin Munn
rm***@pobox.com
Jul 18 '05 #4
Thanks for the tip. That code shows all of the dirs that Python is
complaining about not being in the list... trouble is, they *are* in the
list. Go figure. I'd like to try doing the rename outside the scope of
os.walk, but I don't understand how to do this. When I break out of
os.walk and try the rename at a parallel level, Python complains that
variables such as "oldpath" and "newpath" are undefined.
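
A rough sketch of the "rename outside os.walk" idea, assuming Python 3 and the regex from the posts above. The function names `collect_renames` and `apply_renames` are made up for illustration. The point is that the walk only records (oldpath, newpath) pairs in a list, and the list is what carries the paths out of the loop, so nothing is undefined afterwards. The pairs are applied in reverse because a top-down walk records parents before their children, and a child's old path stops being valid once its parent has been renamed.

```python
import os
import re
import tempfile

BAD = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')

def collect_renames(rootpath):
    """Walk the tree without changing it; return (oldpath, newpath) pairs."""
    pending = []
    for root, dirs, files in os.walk(rootpath):
        for dname in dirs:
            newdname = BAD.sub('-', dname)
            if newdname != dname:
                pending.append((os.path.join(root, dname),
                                os.path.join(root, newdname)))
    return pending

def apply_renames(pending):
    """Rename the deepest paths first so parent paths are still valid."""
    for oldpath, newpath in reversed(pending):
        os.rename(oldpath, newpath)

# Small demo in a throwaway directory ('%20' is legal on most filesystems).
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "a%20b", "c%20d"))
apply_renames(collect_renames(demo_root))
renamed_ok = os.path.isdir(os.path.join(demo_root, "a-b", "c-d"))
print(renamed_ok)
```

This sketch does not handle name collisions (two directories that clean up to the same name); that would need its own check.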

Jul 18 '05 #5
Try iterating from bottom to top.

See "help(os.walk)":

    walk(top, topdown=True, onerror=None)

    ...

    If optional arg 'topdown' is true or not specified, the triple for a
    directory is generated before the triples for any of its subdirectories
    (directories are generated top down). If topdown is false, the triple
    for a directory is generated after the triples for all of its
    subdirectories (directories are generated bottom up).

    ...

Jul 18 '05 #6
Wait, I just realized that you're changing the list *while* you're
iterating over it. That's a bad idea. See the warning at the bottom of
this page in the language reference:

http://www.python.org/doc/current/ref/for.html
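
To make the warning concrete, here is a small Python 3 sketch with a plain list (no filesystem involved; the names are hypothetical). Removing an item shifts everything after it one slot to the left while the loop's internal position still moves forward, so the item that slides into the vacated slot is never visited.

```python
names = ["a?", "b?", "c"]
visited = []

for name in names:
    visited.append(name)
    if "?" in name:
        names.remove(name)                    # shifts later items left by one
        names.append(name.replace("?", "-"))  # and grows the list at the end

print(visited)  # "b?" never shows up here
print(names)    # ...yet it is still in the list, uncleaned
```

With os.walk the same thing would mean some directories are silently skipped by the cleanup loop.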

Instead of modifying the list while you're looping over it, use the
topdown argument to os.walk to build the tree from the bottom up instead
of from the top down. That way you won't have to futz with the dirnames
list at all:

import os
import re

def clean_names(rootpath):
    bad = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')
    # Bottom-up: a directory is listed only after its subdirectories were visited.
    for root, dirs, files in os.walk(rootpath, topdown=False):
        for dname in dirs:
            newdname = re.sub(bad, '-', dname)
            if newdname != dname:
                newpath = os.path.join(root, newdname)
                oldpath = os.path.join(root, dname)
                os.renames(oldpath, newpath)

Notice also the use of re.sub to do all the character substitutions at
once. Your code as written would have failed on a filename like "foo*?",
since it always renamed from the original filename: it would have first
done os.renames("foo*?", "foo-?") followed by os.renames("foo*?",
"foo*-") and the second would have raised an OSError.
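
A quick way to see the difference without touching the filesystem (Python 3, same pattern as above; `per_char` and `one_pass` are just illustrative names): the per-character loop produces one partially cleaned name per bad character, each derived from the original name, while a single sub() call produces one fully cleaned name.

```python
import re

bad = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')
dname = "foo*?"

# What the find/replace loop computes: one candidate per bad character,
# each starting again from the original name.
per_char = [dname.replace(c, "-") for c in bad.findall(dname)]

# What re.sub computes: every match replaced in a single pass.
one_pass = bad.sub("-", dname)

print(per_char)
print(one_pass)
```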

--
Robin Munn
rm***@pobox.com
Jul 18 '05 #7
Here's a way to modify the list while iterating over it. Too lazy to
generate the sample directory tree, so I suggest that the OP test it :-)

<untested>
import os
import re
import sys

def clean_names(rootpath):
    bad = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')
    for root, dirs, files in os.walk(rootpath):
        for index, dname in enumerate(dirs):
            newdname = bad.sub('-', dname)
            if newdname != dname:
                newpath = os.path.join(root, newdname)
                oldpath = os.path.join(root, dname)
                try:
                    os.rename(oldpath, newpath)
                except OSError:
                    print >> sys.stderr, "cannot rename %r to %r" % (oldpath, newpath)
                else:
                    dirs[index] = newdname  # inform os.walk() about the new name
</untested>

Peter
Jul 18 '05 #8
This works great! No errors... and it gets dirs that are 8 levels deep
(that's as far down as I've tested). Thanks for the tip! The re.sub
seems to be much faster than the string find/replace approach as well...
I need to read up more on the documentation of os.walk and re in
general. Thanks again!!!
Jul 18 '05 #9
Could we discuss the topdown feature in os.walk some more? My script is
working fine now, I have no trouble at all with it. I just want to
better understand os.walk in Python 2.3. This is how I understand it as
of today. Someone please correct me if I'm wrong:

topdown=False would build a list of filesystem (fs) objects from the
bottom up. The objects at the beginning of the list would be the end-most
objects (the leaf nodes) of the fs. When you make changes to that list,
the changes would be from leaf node to os.walk's root instead of root to
leaf node, correct? For example, if I had this dir structure:

dir_a
file_a
dir_b
file_b

My list would look like this:

file_b
dir_b
file_a
dir_a

And, if I made changes to the list and committed those changes to the fs
then there would be no problems because of the order in which the
changes are made. Is this a proper way to describe topdown=False in
os.walk? Or in other words, our list would be static (one change would
not impact another), whereas if topdown=True our list would be dynamic
(one change could impact another).
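
One way to check this is to build the example tree in a scratch directory and look at what os.walk actually yields. The sketch below is Python 3 and assumes dir_b sits inside dir_a (the post doesn't say so explicitly). Note that os.walk does not produce one flat list of files and directories: it yields one (root, dirs, files) triple per directory, and topdown only changes the order in which those triples arrive. `walk_order` is a hypothetical helper name.

```python
import os
import tempfile

# Build the assumed layout: dir_a/file_a and dir_a/dir_b/file_b
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, "dir_a", "dir_b"))
for parts in (("dir_a", "file_a"), ("dir_a", "dir_b", "file_b")):
    open(os.path.join(base, *parts), "w").close()

def walk_order(topdown):
    """Return the yielded triples, with root made relative to base."""
    start = os.path.join(base, "dir_a")
    return [(os.path.relpath(root, base), dirs, files)
            for root, dirs, files in os.walk(start, topdown=topdown)]

bottom_up = walk_order(False)
top_down = walk_order(True)

for triple in bottom_up:
    print(triple)
```

With topdown=False the triple for dir_b arrives before the triple for dir_a, so by the time dir_b is listed in dir_a's dirs, everything below it has already been visited and renaming it cannot disturb the rest of the walk. With topdown=True the dirs list is consulted after the loop body runs to decide where to descend next, which is why renaming on disk without updating that list sends os.walk looking for a name that no longer exists.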

Thanks for the help!!!


Jul 18 '05 #10
