using functions and file renaming problem

A few questions about the following code. How would I "wrap" this in a
function, and do I need to?

Also, how can I make the code smart enough to realize that when a file
has 2 or more bad characters in it, that the code needs to run until all
bad characters are gone? For example, if a file has the name
"<bad*mac\file" the program has to run 3 times to get all three bad
chars out of the file name.

The passes look like this:

1. <bad*mac\file becomes -bad*mac\file
2. -bad*mac\file becomes -bad-mac\file
3. -bad-mac\file becomes -bad-mac-file

I think the problem is that once the program finds a bad char in a
filename it replaces it with a dash, which creates a new filename that
wasn't present when the program first ran, thus it has to run again to
see the new filename.

import os, re, string
bad = re.compile(r'%2f|%25|[*?<>/\|\\]') #search for these.
print " "
setpath = raw_input("Path to the dir that you would like to clean: ")
print " "
print "--- Remove Bad Characters From Filenames ---"
print " "
for root, dirs, files in os.walk(setpath):
    for file in files:
        badchars = bad.findall(file)
        newfile = ''
        for badchar in badchars:
            newfile = file.replace(badchar,'-') #replace bad chars.
        if newfile:
            newpath = os.path.join(root,newfile)
            oldpath = os.path.join(root,file)
            os.rename(oldpath,newpath)
            print oldpath
            print newpath
print " "
print "--- Done ---"
print " "

Jul 18 '05 #1
bo**@oz.net (Bengt Richter) wrote in message
This looks like an old post that ignores some responses you got to your original post like this.
Did some mail get lost? Or was this an accidental repost of something old? I still see
indentation misalignments, probably due to mixing tabs and spaces (bad news in python ;-)

Regards,
Bengt Richter


Sorry Bengt, I overlooked some responses to an earlier, similar
question. I have too many computers in too many places, so forgive me.
Thanks for taking the time to tell me again!!! I'll clean up the
indentation... I promise.
Jul 18 '05 #2
On Friday 18 Jul 2003 2:33 am, hokiegal99 wrote:
A few questions about the following code. How would I "wrap" this in a
function, and do I need to?

Also, how can I make the code smart enough to realize that when a file
has 2 or more bad characters in it, that the code needs to run until all
bad characters are gone?
It almost is ;-)
For example, if a file has the name
"<bad*mac\file" the program has to run 3 times to get all three bad
chars out of the file name.

The passes look like this:

1. <bad*mac\file becomes -bad*mac\file
2. -bad*mac\file becomes -bad-mac\file
3. -bad-mac\file becomes -bad-mac-file

I think the problem is that once the program finds a bad char in a
filename it replaces it with a dash, which creates a new filename that
wasn't present when the program first ran, thus it has to run again to
see the new filename.


No, the problem is that you're throwing away all but the last correction.
Read my comments below:

import os, re, string
bad = re.compile(r'%2f|%25|[*?<>/\|\\]') #search for these.
print " "
setpath = raw_input("Path to the dir that you would like to clean: ")
print " "
print "--- Remove Bad Characters From Filenames ---"
print " "
for root, dirs, files in os.walk(setpath):
    for file in files:
        badchars = bad.findall(file) # find any bad characters
        newfile = file
        for badchar in badchars: # loop through each character in badchars
            # note that if badchars is empty, this loop is not entered
            # show whats happening
            print "replacing",badchar,"in",newfile,":",
            # replace all occurrences of this badchar with '-' and remember
            # it for next iteration of loop:
            newfile = newfile.replace(badchar,'-') #replace bad chars.
            print newfile
        if badchars: # were there any bad characters in the name?
            newpath = os.path.join(root,newfile)
            oldpath = os.path.join(root,file)
            os.rename(oldpath,newpath)
            print oldpath
            print newpath
print " "
print "--- Done ---"
print " "
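
A side note, not part of the code above: the compiled pattern can do all the replacing itself in a single pass with its sub() method, so no loop over badchars is needed. A minimal sketch in Python 3 syntax (the script above is Python 2); clean_name is a made-up helper name, and the pattern is copied from the original post:

```python
import re

# Same pattern as in the original post: '%2f', '%25', or any one of * ? < > / | \
BAD = re.compile(r'%2f|%25|[*?<>/\|\\]')

def clean_name(name):
    """Return name with every match of BAD replaced by a dash."""
    return BAD.sub('-', name)

print(clean_name('<bad*mac\\file'))   # -bad-mac-file
```

Either approach ends up with the same name; the loop version just makes each step visible.
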
wrt wrapping it up in a function, here's a starter for you... "fill in the
blanks":
-------8<-----------
def cleanup(setpath):
    # compile regex for finding bad characters

    # walk directory tree...

        # find any bad characters

        # loop through each character in badchars

            # replace all occurrences of this badchar with '-' and remember

        # were there any bad characters in the name?

-------8<-----------

To call this you could do (on linux - mac paths are different):

-------8<-----------
cleanup("/home/andy/Python")
-------8<-----------
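
For comparison, here is one way the blanks might be filled in. Treat it as a sketch rather than the intended answer: it is written in Python 3 syntax, and returning the list of renamed paths is an assumption added here so the result can be checked.

```python
import os
import re

# pattern copied from the original post
BAD = re.compile(r'%2f|%25|[*?<>/\|\\]')

def cleanup(setpath):
    """Rename files under setpath whose names contain bad characters.

    Returns a list of (oldpath, newpath) pairs, one per renamed file.
    """
    renamed = []
    # walk directory tree...
    for root, dirs, files in os.walk(setpath):
        for name in files:
            # find any bad characters
            badchars = BAD.findall(name)
            newname = name
            # loop through each character in badchars
            for badchar in badchars:
                # replace all occurrences of this badchar with '-' and remember
                newname = newname.replace(badchar, '-')
            # were there any bad characters in the name?
            if badchars:
                oldpath = os.path.join(root, name)
                newpath = os.path.join(root, newname)
                os.rename(oldpath, newpath)
                renamed.append((oldpath, newpath))
    return renamed
```

Note that this does not check whether the new name already exists in the directory, so a real script would probably want to guard against that before renaming.
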

hope that helps
-andyj
Jul 18 '05 #3
On Friday 18 Jul 2003 11:16 pm, hokiegal99 wrote:
Thanks again for the help Andy! One last question: What is the advantage
of placing code in a function? I don't see how having this bit of code
in a function improves it any. Could someone explain this?

Thanks!

8<--- (old quotes)

The 'benefit' of functions is only really reaped when you have a specific need
for them! You don't *have* to use them if you don't *need* to (but they can
still improve the readability of your code).

Consider the following contrived example:

----------8<------------
# somewhere in the dark recesses of a large project...
. . .
for filename in os.listdir(cfg.userdir):
    newname = filename
    for ch in cfg.badchars:
        # strings are immutable: replace() returns a new string
        newname = newname.replace(ch,"-")
    if newname != filename:
        os.rename(os.path.join(cfg.userdir,filename),
                  os.path.join(cfg.userdir,newname))
. . .
. . .
# in another dark corner...

. . .
for filename in os.listdir(cfg.tempdir):
    newname = filename
    for ch in cfg.badchars:
        newname = newname.replace(ch,"-")
    if newname != filename:
        os.rename(os.path.join(cfg.tempdir,filename),
                  os.path.join(cfg.tempdir,newname))
. . .
# somewhere else...

. . .
for filename in os.listdir(cfg.extradir):
    newname = filename
    for ch in cfg.badchars:
        newname = newname.replace(ch,"-")
    if newname != filename:
        os.rename(os.path.join(cfg.extradir,filename),
                  os.path.join(cfg.extradir,newname))
. . .
----------8<------------

See the repetition? ;-)

Imagine a situation where you need to do something far more complicated over,
and over again... It's not very programmer efficient, and it makes the code
longer, too - thus costing more to write (time) and more to store (disks).

Imagine having to change the behaviour of this 'hard-coded' routine, and what
would happen if you missed one... however, if it is in a function, you only
have *one* place to change it.

When we generalise the algorithm and put it into a function we can do:

----------8<------------

. . .
. . .

# somewhere near the top of the project code...
def cleanup_filenames(dir):

    """ renames any files within dir that contain bad characters
    (ie. ones in cfg.badchars). Does not walk the directory tree.
    """

    for filename in os.listdir(dir):
        newname = filename
        for ch in cfg.badchars:
            newname = newname.replace(ch,"-")
        if newname != filename:
            os.rename(os.path.join(dir,filename),
                      os.path.join(dir,newname))

. . .
. . .

# somewhere in the dark recesses of a large project...
. . .
cleanup_filenames(cfg.userdir)
. . .
. . .
# in another dark corner...
. . .
cleanup_filenames(cfg.tempdir)
. . .
# somewhere else...
. . .
cleanup_filenames(cfg.extradir)
. . .

----------8<------------

Even in this small, contrived example, we've saved about 13 lines of code (ok,
that's notwithstanding the blank lines and the """ docstring """ at the top
of the function).

There's another twist, too. In the docstring for cleanup_filenames it says
"Does not walk the directory tree." because we didn't code it to deal with
subdirectories. But we could, without using os.walk...

Directories form a tree structure, and the easiest way to process trees is by
using /recursion/, which means functions that call themselves. An old
programmer's joke is this:

Recursion, defn. [if not understood] see Recursion.

Each time you call a function, it gets a brand new environment, called the
'local scope'. All variables inside this scope are private; they may have
the same names, but they refer to different objects. This can be really
handy...

----------8<------------

def cleanup_filenames(dir):

    """ renames any files within dir that contain bad characters
    (ie. ones in cfg.badchars). Walks the directory tree to process
    subdirectories.
    """

    for filename in os.listdir(dir):
        newname = filename
        for ch in cfg.badchars:
            newname = newname.replace(ch,"-")
        if newname != filename:
            os.rename(os.path.join(dir,filename),
                      os.path.join(dir,newname))
        # recurse if subdirectory...
        if os.path.isdir(os.path.join(dir,newname)):
            cleanup_filenames(os.path.join(dir,newname))

----------8<------------

This version *DOES* deal with subdirectories... with only two extra lines,
too! Trying to write this without recursion would be a nightmare (even in
Python).

A very important thing to note, however, is that there is a LIMIT on how
deeply a function can call itself, called the recursion limit (about 1000
nested calls by default; sys.setrecursionlimit() can change it):
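
To make the 'local scope' point concrete, here is a small sketch in Python 3 syntax. count_files is a made-up function (not part of the script above) that uses the same recursive pattern; every call gets its own total, so the nested calls never trample each other's counts. The temporary directory tree is only there so the example can run on its own.

```python
import os
import tempfile

def count_files(directory):
    """Count the regular files in directory and all of its subdirectories."""
    total = 0  # local: every call gets its own 'total'
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isdir(path):
            # the recursive call has a separate 'total' of its own
            total += count_files(path)
        else:
            total += 1
    return total

# build a tiny throwaway tree: top/a.txt and top/sub/b.txt
top = tempfile.mkdtemp()
os.mkdir(os.path.join(top, 'sub'))
for relpath in ('a.txt', os.path.join('sub', 'b.txt')):
    open(os.path.join(top, relpath), 'w').close()

print(count_files(top))   # 2
```
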

----------8<------------
n=1
def rec():
    global n    # without this, n=n+1 raises UnboundLocalError
    n=n+1
    rec()

rec()
. . .
(huge traceback list)
. . .
RuntimeError: maximum recursion depth exceeded
n
991
----------8<------------

Another very important thing about recursion is that a recursive function
should *ALWAYS* have a 'get-out-clause', a condition that stops the
recursion. Guess what happens if you don't have one ... ;-)
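
A short sketch of both cases, in Python 3 syntax (where the error is called RecursionError). countdown and runaway are made-up names for illustration: the first has a get-out clause, the second does not.

```python
import sys

def countdown(n):
    """Return how many recursive steps it takes to get from n down to 0."""
    if n <= 0:          # the get-out clause: stop recursing here
        return 0
    return 1 + countdown(n - 1)

def runaway(n):
    """No get-out clause, so this recurses until Python stops it."""
    return runaway(n + 1)

print(sys.getrecursionlimit())   # the current limit
print(countdown(50))             # 50

try:
    runaway(0)
except RecursionError:           # a subclass of RuntimeError in Python 3.5+
    print("hit the recursion limit")
```
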

Finally (at least for now), functions also provide a way to break down your
code into logical sections. Many programmers will write the higher level
functions first, delegating 'complicated bits' to further sub-functions as
they go, and worry about implementing them once they've got the overall
algorithm finished. This allows one to concentrate on the right level of
detail, rather than getting bogged down in the finer points: you just make up
names for functions that you're *going* to implement later. Sometimes, you
might make a 'stub' like:

def doofer(dooby, doo):
    pass

so that your program is /syntactically/ correct, and will run (to a certain
degree). This allows debugging to proceed before you have written
everything. You'd do this for functions which aren't *essential* to the
program, but maybe add 'special features', for example, additional
error-checking or output formatting.
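
As a rough illustration of working top-down with a stub (Python 3 syntax; all three function names are invented for this example), the program below already runs even though report has not been written yet:

```python
def find_bad_names(names):
    """Return the names that contain a character we don't want."""
    return [n for n in names if '*' in n or '?' in n]

def report(bad_names):
    """Stub: nicer output formatting can be written later."""
    pass

def main(names):
    bad_names = find_bad_names(names)
    report(bad_names)        # does nothing yet, but the program still runs
    return len(bad_names)

print(main(['ok.txt', 'what?.txt', 'a*b.txt']))   # 2
```
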

A sort of extension of the function idea is 'modules', which make functions
and other objects available to other 'client' programs. When you say:

import os

you are effectively adding all the functions and objects of the os module into
your own program, without having to re-write them. This enables programmers
to share their functions and other code as convenient 'black boxes'. Modules,
however, are a slightly more advanced topic.
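
A tiny sketch of what that looks like in practice (Python 3 syntax). After a plain import the names are reached through the module, as in os.path.join; the from ... import form pulls one name directly into your own program:

```python
import os                    # names are reached through the module: os.path.join
from os.path import join     # or pull a single name into your own namespace

print(os.path.join('home', 'andy'))
print(join('home', 'andy') == os.path.join('home', 'andy'))   # True
```
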
Hope that helps.

-andyj

Jul 18 '05 #4
My scripts aren't long and complex, so I don't really *need* to use
functions. But the idea of using them is appealing to me because it
seems the right thing to do from a design point of view. I can see how
larger, more complex programs would get out of hand if the programmer
did not use functions so they'd be absolutely necessary there. But if
they allow larger programs to have a better overall design that's more
compact and readable (like your examples showed) then one could argue
that they would do the same for smaller, simpler programs too.

Thanks for the in-depth explanation. It was very helpful. I'm going to
try using functions within my fix_files.py script.
Jul 18 '05 #5
hokiegal99 wrote:

My scripts aren't long and complex, so I don't really *need* to use
functions. But the idea of using them is appealing to me because it
seems the right thing to do from a design point of view. I can see how
larger, more complex programs would get out of hand if the programmer
did not use functions so they'd be absolutely necessary there. But if
they allow larger programs to have a better overall design that's more
compact and readable (like your examples showed) then one could argue
that they would do the same for smaller, simpler programs too.


Uh oh! You're being poisoned with the meme of "right design".

Don't use functions because they "seem the right thing to do", use
them because *they reduce repetition*, or simplify code by *making
it more readable*.

If you aren't reducing repetition with them you are losing the
primary benefit. If you also aren't making your code more readable,
don't use functions.

Whatever you do, don't use them just because somebody has convinced
you "they're the right thing".

-Peter
Jul 18 '05 #6
