making a valid file name...

Hi, I'm writing a Python script that creates directories from user
input.
Sometimes the user inputs characters that aren't valid characters for a
file or directory name.
Here are the characters that I consider to be valid characters...

    valid = ':./,^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ '

If I have a string called fname, I want to go through each character in
the filename and if it is not a valid character, then I want to replace
it with a space.

This is what I have:

    def fixfilename(fname):
        valid = ':.\,^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ '
        for i in range(len(fname)):
            if valid.find(fname[i]) < 0:
                fname[i] = ' '
        return fname

Anyone think of a simpler solution?
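Note: as posted, the loop fails at the assignment, because Python strings are immutable and `fname[i] = ' '` raises a TypeError. A minimal sketch of a working variant of the same loop, assuming Python 3 and the '/' form of the whitelist; it edits a list of characters and joins them at the end:

```python
VALID = ':./,^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ '

def fixfilename(fname):
    """Return fname with every character outside VALID replaced by a space."""
    chars = list(fname)               # str is immutable, so edit a list copy
    for i, c in enumerate(chars):
        if c not in VALID:
            chars[i] = ' '
    return ''.join(chars)

print(fixfilename('my*file?.txt'))
```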

Oct 17 '06 #1
I would suggest something like string.maketrans
http://docs.python.org/lib/node41.html. I don't remember exactly how
it works, but I think it's something like

    >>> import string
    >>> invalid_chars = "abc"
    >>> replace_chars = "123"
    >>> char_map = string.maketrans(invalid_chars, replace_chars)
    >>> filename = "abc123.txt"
    >>> filename.translate(char_map)
    '123123.txt'
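On Python 3 the same idea can be written with the static method `str.maketrans`. A small sketch under that assumption; the second half maps a blacklist to spaces, and the name `bad` and its contents are invented for the illustration:

```python
# Python 3: str.maketrans builds the table that str.translate consumes.
invalid_chars = "abc"
replace_chars = "123"   # must be the same length as invalid_chars
char_map = str.maketrans(invalid_chars, replace_chars)

filename = "abc123.txt"
translated = filename.translate(char_map)
print(translated)

# Hypothetical blacklist: map each listed character to a space.
bad = '<>"|?*'
to_spaces = str.maketrans(bad, " " * len(bad))
cleaned = "what?*.txt".translate(to_spaces)
print(repr(cleaned))
```

Note that translate works from a list of characters to change, so it suits a blacklist more naturally than the whitelist in the question.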

--
Jerry

Oct 17 '06 #2

SpreadTooThin wrote:
> Anyone think of a simpler solution?

If you want to strip 'em:

    >>> valid = ':./,^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ '
    >>> filename = '!"£!£$"$££$%$£%$£lasfjalsfjdlasfjasfd()()()somethingelse.dat'
    >>> stripped = ''.join(c for c in filename if c in valid)
    >>> stripped
    'lasfjalsfjdlasfjasfdsomethingelse.dat'

If you want to replace them with something, be careful of the regex
string being built (i.e. a space character).

    >>> import re
    >>> re.sub(r'[^%s]' % valid, ' ', filename)
    ' lasfjalsfjdlasfjasfd somethingelse.dat'

Jon.

Oct 17 '06 #3
> Sometimes the user inputs characters that aren't valid
> characters for a file or directory name. Here are the
> characters that I consider to be valid characters...
>
> valid =
> ':./,^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ '

Just a caveat, as colons and slashes can give grief on various
operating systems... combined with periods, it may be possible to
cause trouble too...

> Anyone think of a simpler solution?

I don't know if it's simpler, but you can use

    >>> fname = "this is a test & it ain't expen$ive.py"
    >>> ''.join(c in valid and c or ' ' for c in fname)
    'this is a test   it ain t expen ive.py'

It does use the "it's almost a ternary operator, but not quite"
method concurrently being discussed/lambasted in another thread.
Treat accordingly, with all that may entail. Should be good in
this case though.
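On Python 2.5 and later the same replacement can be written with a conditional expression, which sidesteps the and/or pitfall (the idiom picks the wrong branch when the selected value is falsy). A minimal sketch, assuming Python 3; keeping the whitelist in a frozenset is an optional choice, and `VALID_SET` and `replace_invalid` are names made up for this example:

```python
VALID = ':./,^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ '
VALID_SET = frozenset(VALID)   # hypothetical name; hash-based membership test

def replace_invalid(fname, fill=' '):
    """Swap every character that is not whitelisted for `fill`."""
    return ''.join(c if c in VALID_SET else fill for c in fname)

cleaned = replace_invalid("this is a test & it ain't expen$ive.py")
print(repr(cleaned))
```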

If you're doing it on a time-critical basis, it might help to
make "valid" a set, which should have O(1) membership testing,
rather than using the "in" test with a string. I don't know how
well the find() method of a string performs in relationship to
"in" testing of a set. Test and see, if it's important.

-tkc

Oct 17 '06 #4
Hi,

On 10/17/2006 06:22:45 PM, SpreadTooThin wrote:
> valid =
> ':./,^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ '

Without the OS platform being specified, these are not all the characters
that may occur in a filename: '[]{}-=", etc. And '/' is NOT valid
on a Unix platform. It should be easy to scan the filename and
check every character against the 'valid-string'.
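A rough sketch of that per-character check, assuming Python 3. The whitelist below is only an illustration for a single path component: it drops '/' and ':' from the original string and adds '-' and '_'. The names `COMPONENT_VALID` and `offending_chars` are invented here:

```python
COMPONENT_VALID = frozenset(
    '.,^-_0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ '
)

def offending_chars(name):
    """Return the characters of name that are not whitelisted, in order of first appearance."""
    seen = []
    for c in name:
        if c not in COMPONENT_VALID and c not in seen:
            seen.append(c)
    return seen

print(offending_chars('reports/2006:final?.txt'))
```

Reporting the offending characters, rather than silently replacing them, lets the caller decide whether to reject the name or clean it.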

HTH, cu l8r, Edgar.
Oct 17 '06 #5
On 2006-10-17, Tim Chase <py*********@tim.thechases.com> wrote:
> If you're doing it on a time-critical basis, it might help to
> make "valid" a set, which should have O(1) membership testing,
> rather than using the "in" test with a string. I don't know
> how well the find() method of a string performs in relationship
> to "in" testing of a set. Test and see, if it's important.

The find method of (8-bit) strings is really, really fast. My
guess is that set can't beat it. I tried to beat it recently with
a binary search function. Even after applying psyco, find was
still faster (though I could beat the bisect functions by a
little bit by replacing a divide with a shift).

--
Neil Cerutti
This is not a book to be put down lightly. It should be thrown
with great force. --Dorothy Parker
Oct 17 '06 #6
>> If you're doing it on a time-critical basis, it might help to
>> make "valid" a set, which should have O(1) membership testing,
>> rather than using the "in" test with a string. I don't know
>> how well the find() method of a string performs in relationship
>> to "in" testing of a set. Test and see, if it's important.
>
> The find method of (8-bit) strings is really, really fast. My
> guess is that set can't beat it. I tried to beat it recently with
> a binary search function. Even after applying psyco find was
> still faster (though I could beat the bisect functions by a
> little bit by replacing a divide with a shift).

In "theory" (you know...that little town in west Texas where
everything goes right), a set-membership test should be O(1). A
binary search function would be O(log N). A linear search of a
string for a member should be O(N).

In practice, however, for such small strings as the given
whitelist, the underlying find() operation likely doesn't put a
blip on the radar. If your whitelist were some huge document
that you were searching repeatedly, it could have worse
performance. Additionally, the find() in the underlying C code
is likely about as bare-metal as it gets, whereas the set
membership aspect of things may go through some more convoluted
setup/teardown/hashing and spend a lot more time further from the
processor's op-codes.

And I know that a number of folks have done some hefty
optimization of Python's string-handling abilities. There's
likely a tradeoff point where it's better to use one over the
other depending on the size of the whitelist. YMMV

-tkc



Oct 17 '06 #7
On 2006-10-17, Edgar Matzinger <ed***@edgar-matzinger.nl> wrote:
> Hi,
>
> On 10/17/2006 06:22:45 PM, SpreadTooThin wrote:
>> valid =
>> ':./,^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ '
>
> not specifying the OS platform, these are not all the
> characters that may occur in a filename: '[]{}-=", etc. And '/'
> is NOT valid. On a unix platform. And it should be easy to
> scan the filename and check every character against the
> 'valid-string'.

In the interactive fiction world where I come from, a portable
filename is only 8 chars long and matches the regex
[A-Z][A-Z0-9]*, i.e., capital letters and numbers, with no
extension. That way it'll work on old DOS machines and on
Risc-OS. Wait... is there Python for Risc-OS?
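That rule is small enough to check directly. A sketch assuming Python 3, and reading "only 8 chars long" as "at most 8 characters"; `PORTABLE` and `is_portable` are names made up for this example:

```python
import re

# One capital letter, then up to seven more capitals or digits (8 characters max).
PORTABLE = re.compile(r"[A-Z][A-Z0-9]{0,7}")

def is_portable(name):
    """True if the whole name fits the 8-character, no-extension rule."""
    return PORTABLE.fullmatch(name) is not None

for candidate in ("ZORK1", "zork", "ADVENTURE", "GAME.DAT"):
    print(candidate, is_portable(candidate))
```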
--
Neil Cerutti
Oct 17 '06 #8
Matthew Warren wrote:
> >>> import re
> >>> badfilename = ' £"%^"£^"£$^ihgeroighroeig3645^£$^"knovin98u4#346#1461461'
> >>> valid = ':./,^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ '
> >>> goodfilename = re.sub('[^'+valid+']', ' ', badfilename)

To create arbitrary character sets, it's usually best to run the character string through
re.escape() before passing it to the RE engine.
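A minimal sketch of that advice, assuming Python 3. The whitelist here deliberately includes a backslash, '-' and ']', any of which would change the meaning of an unescaped character class; `make_sanitizer` is a name invented for this example:

```python
import re

def make_sanitizer(valid, replacement=" "):
    """Return a function that replaces every character not in `valid`."""
    # re.escape neutralises characters such as ']', '\\', '^' and '-'
    # that would otherwise alter the character class.
    pattern = re.compile("[^" + re.escape(valid) + "]")

    def sanitize(name):
        return pattern.sub(replacement, name)

    return sanitize

valid = r":./,^\-]0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ "
sanitize = make_sanitizer(valid)
result = sanitize("my[file]-v2^final\\copy?.txt")
print(result)
```

Compiling the pattern once and reusing it keeps the escaping in a single place when many names are cleaned.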

</F>

Oct 18 '06 #9
Tim Chase:
> In practice, however, for such small strings as the given
> whitelist, the underlying find() operation likely doesn't put a
> blip on the radar. If your whitelist were some huge document
> that you were searching repeatedly, it could have worse
> performance. Additionally, the find() in the underlying C code
> is likely about as bare-metal as it gets, whereas the set
> membership aspect of things may go through some more convoluted
> setup/teardown/hashing and spend a lot more time further from the
> processor's op-codes.

With this specific test (half good, half bad), on Py2.5, on my PC, sets
start to be faster than the string search when the string "good" is
about 5-6 chars long (this means sets are quite fast, I presume).

    from random import choice, seed
    from time import clock

    def main(choice=choice):
        seed(1)
        n = 100000

        for good in ("ab", "abc", "abcdef", "abcdefgh",
                     "abcdefghijklmnopqrstuvwxyz"):
            poss = good + good.upper()
            data = [choice(poss) for _ in xrange(n)] * 10
            print "len(good) = ", len(good)

            t = clock()
            for c in data:
                c in good
            print round(clock()-t, 2)

            t = clock()
            sgood = set(good)
            for c in data:
                c in sgood
            print round(clock()-t, 2), "\n"

    main()
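The script above targets Python 2 (print statement, xrange, and time.clock, which was removed in Python 3.8). A rough Python 3 port of the same comparison, using timeit instead; the function name `compare` and the smaller sample size are choices made for this sketch, and timings will differ by machine and interpreter version, so no results are shown:

```python
import random
import timeit

def compare(good, n=10_000, repeats=5):
    """Time `c in good` for a str versus a set built from the same characters."""
    rng = random.Random(1)          # fixed seed, as in the original test
    poss = good + good.upper()      # half the samples hit, half miss
    data = [rng.choice(poss) for _ in range(n)]
    as_set = set(good)

    def scan(container):
        for c in data:
            c in container          # membership test only; result discarded

    t_str = min(timeit.repeat(lambda: scan(good), number=1, repeat=repeats))
    t_set = min(timeit.repeat(lambda: scan(as_set), number=1, repeat=repeats))
    return t_str, t_set

for good in ("ab", "abc", "abcdef", "abcdefgh", "abcdefghijklmnopqrstuvwxyz"):
    t_str, t_set = compare(good)
    print(f"len(good) = {len(good):2d}  str: {t_str:.4f}s  set: {t_set:.4f}s")
```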
Bye,
bearophile

Oct 18 '06 #10
