fileinput module dangerous?

I just started working with the fileinput module, and reading the bit
about inplace modification, it seems very dangerous:

From the doc:
---
Optional in-place filtering: if the keyword argument inplace=1 is
passed to input() or to the FileInput constructor, the file is moved
to a backup file and standard output is directed to the input file

**(if a file of the same name as the backup file already exists, it
will be replaced silently)**.

This makes it possible to write a filter that rewrites its input file
in place. If the keyword argument backup='.<some extension>' is also
given, it specifies the extension for the backup file, and the backup
file remains around; by default, the extension is '.bak' and it is
deleted when the output file is closed. In-place filtering is disabled
when standard input is read.
---[Emphasis mine]

This seems like very dangerous behavior, as .bak is a very common and
useful extension for old versions of files users would like to keep
around, and this module obliterates files one would not expect it to
touch. This behavior is also very un-Pythonic. The Zen says "Errors
should never pass silently." There seems to be no way to get useful
behavior either, as the backup keyword just specifies a backup
extension, which you can never guarantee will not be in use, and it
leaves the file around in the directory as well.
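
The effect described above can be reproduced in a scratch directory. This is a minimal sketch, assuming a CPython 3 interpreter where fileinput still behaves as the quoted documentation says; the file names and the helper demo_bak_clobber are made up for illustration.

```python
import fileinput
import os
import tempfile


def demo_bak_clobber():
    """Run an in-place filter next to a pre-existing .bak file.

    Hypothetical helper; returns (new contents of data.txt,
    whether data.txt.bak still exists afterwards).
    """
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "data.txt")
        old_backup = path + ".bak"

        with open(path, "w") as f:
            f.write("one\ntwo\n")
        # A file the user wanted to keep, unrelated to this run.
        with open(old_backup, "w") as f:
            f.write("precious old version\n")

        # inplace=True redirects print() into data.txt; no backup
        # extension is given, so the default '.bak' is used.
        with fileinput.input(files=[path], inplace=True) as lines:
            for line in lines:
                print(line.upper(), end="")

        with open(path) as f:
            contents = f.read()
        return contents, os.path.exists(old_backup)


if __name__ == "__main__":
    contents, backup_survived = demo_bak_clobber()
    print(repr(contents), "old .bak survived:", backup_survived)
```

If the module works as documented, the unrelated data.txt.bak should be gone after the run, with no warning along the way.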

It seems like this module should use tmpfile in the os module for its
temp file needs.
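
One way to get that kind of safety in client code is sketched below. It is not the fileinput implementation: it swaps in the tempfile module (os.tmpfile does not exist in Python 3), and rewrite_in_place is a hypothetical helper name chosen for this example.

```python
import os
import shutil
import tempfile


def rewrite_in_place(path, transform):
    """Hypothetical helper: rewrite `path` line by line via a uniquely
    named temporary file, so no fixed backup name can be clobbered."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp picks a name that does not exist yet and creates the file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        # newline="" keeps the original line endings untouched.
        with os.fdopen(fd, "w", newline="") as out, \
                open(path, newline="") as src:
            for line in src:
                out.write(transform(line))
        shutil.copymode(path, tmp_path)  # keep the original permissions
        os.replace(tmp_path, path)       # swap the new content into place
    except BaseException:
        # On failure, leave the original alone and remove the temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A call such as rewrite_in_place("notes.txt", str.upper) rewrites the file without creating or deleting anything named notes.txt.bak. The temporary file lives in the same directory so that the final os.replace stays on one filesystem.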

Is there a rational reason it's this way, or should I start working on
a patch?

--
Chris Connett
Jul 18 '05 #1
On 12 Oct 2004 21:57:40 -0700, Chris Connett <ch**********@gmail.com> wrote:
> I just started working with the fileinput module, and reading the bit
> about inplace modification, it seems very dangerous:
>
> From the doc:
> ---
> Optional in-place filtering: if the keyword argument inplace=1 is
> passed to input() or to the FileInput constructor, the file is moved
> to a backup file and standard output is directed to the input file
>
> **(if a file of the same name as the backup file already exists, it
> will be replaced silently)**. ....
> This seems like very dangerous behavior, as .bak is a very common and
> useful extension for old versions of files users would like to keep
> around, and this module obliterates files one would not expect it to
> touch. This behavior is also very un-Pythonic.

(The end user probably doesn't care if this is Pythonic or not when it
clobbers her files ...)

I had no idea it worked like this -- thanks for pointing it out!
It was such an obvious import of the Perl idiom:
perl -pi -e 's/a/b/' file1 file2 ...
perl -pi.bak -e 's/a/b/' file1 file2 ...
which clobbers existing .bak files -- /if you tell it to, not otherwise/.
Anything else is insane.
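
For comparison, a rough Python counterpart of the two Perl one-liners above could look like this. It is a sketch under assumptions: substitute_in_place is a made-up helper, and it does plain substring replacement of the first match per line rather than a real regex substitution.

```python
import fileinput


def substitute_in_place(paths, old, new, backup=None):
    """Hypothetical helper mirroring `perl -pi[.ext] -e 's/old/new/'`.

    If `backup` is given (for example ".orig"), the original file is
    kept under that extension; otherwise fileinput's default applies.
    """
    with fileinput.FileInput(files=paths, inplace=True,
                             backup=backup or "") as lines:
        for line in lines:
            # stdout is redirected into the file currently being rewritten;
            # count=1 matches s/old/new/ without the /g flag.
            print(line.replace(old, new, 1), end="")
```

The difference discussed in this thread is in the backup=None case: Perl keeps no backup at all, while fileinput still routes the data through a temporary file named with '.bak'.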

(Perl will happily overwrite write-protected files, at least under Unix, but
that's a different story...)
> Is there a rational reason it's this way, or should I start working on
> a patch?


I can't see how there could be a reason. I'd welcome a patch.

/Jorgen

--
// Jorgen Grahn <jgrahn@ Ph'nglui mglw'nafh Cthulhu
\X/ algonet.se> R'lyeh wgah'nagl fhtagn!
Jul 18 '05 #2
Jorgen Grahn wrote:
> [...]
> I can't see how there could be a reason. I'd welcome a patch.

OK, I've gone ahead and made a fix. It makes use of os.tmpfile if it is
available, but implements the same (now safe) behavior if it is
unavailable through other means.

I also searched through old postings about this module, and went over
various complaints. I found that Unicode awareness was missing and
wanted, and since there is no other effective way to mimic it in client
code, I added encoding support. All my changes *should* be backwards
compatible, which brings me to my next question.

As someone who has never posted a patch, what are the next steps from here?
Is there a standard test suite that I need to run? What about
documentation updates?

This patch should be considered alpha quality, especially the encoding
support, though if the module is used without specifying an encoding, it
should work exactly as before. Please beat on it, though do note that I
haven't run any standard test suites if they exist, so bugs may not
necessarily be obscure ones. I will continue to run more tests myself
in the coming days.

Note:
- Due to the buffering nature of fileinput, universal newlines only
works when using an encoding whose StreamReader's readlines() method
performs universal newlining.
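
The wrapping step described above (files opened as raw byte streams, then wrapped in the codec's reader and writer) can be sketched outside fileinput. This is only an illustration under stated assumptions: wrap_streams and recode_upper are made-up names, and codecs.getreader/getwriter are used here instead of unpacking the result of codecs.lookup().

```python
import codecs
import io


def wrap_streams(raw_in, raw_out, encoding):
    """Wrap binary streams so reads decode and writes encode."""
    reader = codecs.getreader(encoding)(raw_in)
    writer = codecs.getwriter(encoding)(raw_out)
    return reader, writer


def recode_upper(data, encoding):
    """Hypothetical filter: upper-case encoded text, line by line."""
    raw_in, raw_out = io.BytesIO(data), io.BytesIO()
    reader, writer = wrap_streams(raw_in, raw_out, encoding)
    for line in reader:            # StreamReader yields decoded lines
        writer.write(line.upper())  # StreamWriter encodes before writing
    return raw_out.getvalue()
```

The filter logic only ever sees text, while the underlying files stay byte-oriented, which is why the patch switches the file modes to binary when an encoding is given.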

--
Chris Connett

190d189
< self._encoding = encoding
200a200
> self._encoding = encoding
211a212,218
> # need parameterized file modes, since the files backing
> # encodings should be raw bytestreams
> self._readmode = 'r'
> self._writemode = 'w'
> if self._encoding:
> self._readmode += 'b'
> self._writemode += 'b'
248c255
< self._savestdout = 0
---
> self._savestdout = None
253c260
< self._output = 0
---
> self._output = None
290c297
< self._backupfilename = 0
---
> self._backupfilename = None
297,307c304,305
< self._backupfilename = (
< self._filename + (self._backup or os.extsep+"bak"))
< try: os.unlink(self._backupfilename)
< except os.error: pass
< # The next few lines may raise IOError
< os.rename(self._filename, self._backupfilename)
< self._file = open(self._backupfilename, "r")
< try:
< perm = os.fstat(self._file.fileno()).st_mode
< except OSError:
< self._output = open(self._filename, "w")
---
> if self._backup:
> self._backupfilename = self._filename + self._backup
309,312d306
< fd = os.open(self._filename,
< os.O_CREAT | os.O_WRONLY | os.O_TRUNC,
< perm)
< self._output = os.fdopen(fd, "w")
314,316c308,331
< if hasattr(os, 'chmod'):
< os.chmod(self._filename, perm)
< except OSError:
---
> self._file = os.tmpfile()
> # copy our input into the tmpfile to keep the same
> # backup/write relationship
> self._file.write(open(self._filename,'rb').read())
> self._file.seek(0)
> except NameError: # tmpfile not available
> import random
> self._backupfilename = self._filename
> while os.path.exists( self._backupfilename ):
> self._backupfilename += (
> '%x' % random.randint( 0, 16))
> # The next few lines may raise IOError
> if not self._file: # _file may already be a tmpfile
> os.unlink(self._backupfilename)
> # self._backupfilename will only exist if the
> # user explicitly specified an extension to
> # use, i.e., it's their fault if we clobber
> # something now
> os.rename(self._filename, self._backupfilename)
> self._file = open(self._backupfilename, self._readmode)
> try:
> perm = os.stat(self._backupfilename).st_mode
> os.chmod(self._filename,perm)
> except (OSError, NameError):
317a333,334
> self._output = open(self._filename, self._writemode)
322c339,347
< self._file = open(self._filename, "r")
---
> self._file = open(self._filename, self._readmode)
> # now wrap all our new file objects if the user specified
> # an encoding
> if self._encoding:
> (x, x, reader, writer) = codecs.lookup(self._encoding)
> self._file = reader(self._file)
> if self._output:
> self._output = writer(self._output)
> sys.stdout = self._output


Jul 18 '05 #3
OK, I found the SF page for all that stuff and took care of everything.

Patch #1048075:
https://sourceforge.net/tracker/inde...70&atid=305470
Jul 18 '05 #4
