avoiding file corruption

Hi,

Trying to open a file for writing that is already open for writing
should result in an exception.

It's all too easy to accidentally open a shelve for writing twice and
this can lead to hard to track down database corruption errors.

Amir

Aug 27 '06 #1
27 Aug 2006 00:44:33 -0700, Amir Michail <am******@gmail.com>:
> Hi,
>
> Trying to open a file for writing that is already open for writing
> should result in an exception.
>
> It's all too easy to accidentally open a shelve for writing twice and
> this can lead to hard to track down database corruption errors.
>
> Amir
>
> --
> http://mail.python.org/mailman/listinfo/python-list

Even if it could be strange, the OS usually allows you to open a file
twice; it's up to the programmer to ensure the consistency of the
operations.

PAolo

--
if you have a minute to spend please visit my photography site:
http://mypic.co.nr
Aug 27 '06 #2

Paolo Pantaleo wrote:
> 27 Aug 2006 00:44:33 -0700, Amir Michail <am******@gmail.com>:
> > Hi,
> >
> > Trying to open a file for writing that is already open for writing
> > should result in an exception.
> >
> > It's all too easy to accidentally open a shelve for writing twice and
> > this can lead to hard to track down database corruption errors.
> >
> > Amir
> >
> > --
> > http://mail.python.org/mailman/listinfo/python-list
>
> Even if it could be strange, the OS usually allows you to open a file
> twice; it's up to the programmer to ensure the consistency of the
> operations.
>
> PAolo

But if this is usually a serious bug, shouldn't an exception be raised?

Amir

> --
> if you have a minute to spend please visit my photography site:
> http://mypic.co.nr
Aug 27 '06 #3
Amir Michail wrote:
> Paolo Pantaleo wrote:
> > 27 Aug 2006 00:44:33 -0700, Amir Michail <am******@gmail.com>:
> > > Hi,
> > >
> > > Trying to open a file for writing that is already open for writing
> > > should result in an exception.
> > >
> > > It's all too easy to accidentally open a shelve for writing twice and
> > > this can lead to hard to track down database corruption errors.
> > >
> > > Amir
> > >
> > > --
> > > http://mail.python.org/mailman/listinfo/python-list
> >
> > Even if it could be strange, the OS usually allows you to open a file
> > twice; it's up to the programmer to ensure the consistency of the
> > operations.
> >
> > PAolo
>
> But if this is usually a serious bug, shouldn't an exception be raised?

Executing "rm -rf /" via subprocess is usually also a bad idea. So? No
language can prevent you from making such a mistake. And there is no way to
know if a file is opened twice - it might be that you open the same file
twice via e.g. a network share. No way to know that it is the same file.

Diez
Aug 27 '06 #4

Amir Michail wrote:
> Hi,
>
> Trying to open a file for writing that is already open for writing
> should result in an exception.
>
> It's all too easy to accidentally open a shelve for writing twice and
> this can lead to hard to track down database corruption errors.
>
> Amir

I've never done this in anger so feel free to mock (a little :-).

I'd have a fixed field at the beginning of the file that can hold the
hostname, process number, and access time of a writing process, together
with a sentinel value that means "no process has access to the file".

A program would:
1. Wait a random time.
2. Open the file for update.
3. Read the locking data.
4. If it is already being used by another process then goto 1.
5. Write the process's locking data and time into the lock field.
6. Modify the file's other fields.
7. Write the sentinel value to the locking field.
8. Close and flush the file to disk.

I have left what to do if a process has locked the file for too long as
a simple exercise for you ;-).

- Paddy.

Aug 27 '06 #5
Diez B. Roggisch wrote:
> Amir Michail wrote:
> > Paolo Pantaleo wrote:
> > > 27 Aug 2006 00:44:33 -0700, Amir Michail <am******@gmail.com>:
> > > > Hi,
> > > >
> > > > Trying to open a file for writing that is already open for writing
> > > > should result in an exception.
> > > >
> > > > It's all too easy to accidentally open a shelve for writing twice and
> > > > this can lead to hard to track down database corruption errors.
> > > >
> > > > Amir
> > > >
> > > > --
> > > > http://mail.python.org/mailman/listinfo/python-list
> > >
> > > Even if it could be strange, the OS usually allows you to open a file
> > > twice; it's up to the programmer to ensure the consistency of the
> > > operations.
> > >
> > > PAolo
> >
> > But if this is usually a serious bug, shouldn't an exception be raised?
>
> Executing "rm -rf /" via subprocess is usually also a bad idea. So? No
> language can prevent you from making such a mistake. And there is no way to
> know if a file is opened twice - it might be that you open the same file
> twice via e.g. a network share. No way to know that it is the same file.
>
> Diez

The scenario I have in mind is something like this:

def f():
    db=shelve.open('test.db', 'c')
    # do some stuff with db
    g()
    db.close()

def g():
    db=shelve.open('test.db', 'c')
    # do some stuff with db
    db.close()

I think it would be easy for python to check for this problem in
scenarios like this.

Amir

Aug 27 '06 #6
Amir Michail wrote:
> Diez B. Roggisch wrote:
> > Amir Michail wrote:
> > > Paolo Pantaleo wrote:
> > > > 27 Aug 2006 00:44:33 -0700, Amir Michail <am******@gmail.com>:
> > > > > Hi,
> > > > >
> > > > > Trying to open a file for writing that is already open for writing
> > > > > should result in an exception.
> > > > >
> > > > > It's all too easy to accidentally open a shelve for writing twice and
> > > > > this can lead to hard to track down database corruption errors.
> > > > >
> > > > > Amir
> > > > >
> > > > > --
> > > > > http://mail.python.org/mailman/listinfo/python-list
> > > >
> > > > Even if it could be strange, the OS usually allows you to open a file
> > > > twice; it's up to the programmer to ensure the consistency of the
> > > > operations.
> > > >
> > > > PAolo
> > >
> > > But if this is usually a serious bug, shouldn't an exception be raised?
> >
> > Executing "rm -rf /" via subprocess is usually also a bad idea. So? No
> > language can prevent you from making such a mistake. And there is no way to
> > know if a file is opened twice - it might be that you open the same file
> > twice via e.g. a network share. No way to know that it is the same file.
> >
> > Diez
>
> The scenario I have in mind is something like this:
>
> def f():
>     db=shelve.open('test.db', 'c')
>     # do some stuff with db
>     g()
>     db.close()
>
> def g():
>     db=shelve.open('test.db', 'c')
>     # do some stuff with db
>     db.close()
>
> I think it would be easy for python to check for this problem in
> scenarios like this.

You are requesting a general solution for a very particular problem. As
I pointed out, that solution is unlikely to work reliably - if it is
feasible at all.

If you really have problems as the above, use a custom wrapper for
shelve that prevents _you_ from making that mistake.

Diez
Aug 27 '06 #7
Amir Michail wrote:
> Trying to open a file for writing that is already open for writing
> should result in an exception.
>
> It's all too easy to accidentally open a shelve for writing twice and
> this can lead to hard to track down database corruption errors.

The right solution is file locking. Unfortunately, the Python
standard distribution doesn't have a portable file lock, but you
can do it on Unix and Win NT or better. See:

http://mail.python.org/pipermail/pyt...ry/002957.html

and/or

http://aspn.activestate.com/ASPN/Coo...n/Recipe/65203.
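
For the Unix half, a minimal sketch of what an "open or raise" helper could look like with the fcntl module in present-day Python 3. The names open_locked and FileAlreadyLocked are hypothetical, and this is not the code from either link above. The lock is advisory: it only stops other code that also asks for the lock, and the Windows side is not shown.

```python
import fcntl
import os


class FileAlreadyLocked(Exception):
    """Raised when someone else already holds the exclusive lock."""


def open_locked(path):
    """Open path read/write and take a non-blocking exclusive flock on it.

    Unix only. Closing the returned file object releases the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise FileAlreadyLocked(path)
    return os.fdopen(fd, "r+b")
```

A second open_locked() call on the same path raises FileAlreadyLocked until the first file object is closed, which is roughly the behaviour the original post asked for, but opted into by the application rather than imposed by open().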
--
--Bryan
Aug 27 '06 #8
On 2006-08-27, Amir Michail <am******@gmail.com> wrote:
> Trying to open a file for writing that is already open for writing
> should result in an exception.

MS Windows seems to do something similar, and it pisses me off
no end. Trying to open a file and read it while somebody else
has it open for writing causes an exception. If I want to open
a file and read it while it's being written to, that's my
business.

Likewise, if I want to have a file open for writing twice,
that's my business as well. I certainly don't want to be
hobbled to prevent me from wandering off in the wrong direction.

> It's all too easy to accidentally open a shelve for writing
> twice and this can lead to hard to track down database
> corruption errors.

It's all too easy to delete the wrong element from a list. It's
all too easy to re-bind the wrong object to a name. Should
lists be immutable and names be permanently bound?

--
Grant Edwards, grante at visi.com
Yow! I'm in a twist contest!! I'm in a bathtub! It's on Mars!! I'm in tip-top condition!
Aug 27 '06 #9
Grant Edwards wrote:
> On 2006-08-27, Amir Michail <am******@gmail.com> wrote:
> > Trying to open a file for writing that is already open for writing
> > should result in an exception.
>
> MS Windows seems to do something similar, and it pisses me off
> no end. Trying to open a file and read it while somebody else
> has it open for writing causes an exception. If I want to open
> a file and read it while it's being written to, that's my
> business.
>
> Likewise, if I want to have a file open for writing twice,
> that's my business as well. I certainly don't want to be
> hobbled to prevent me from wandering off in the wrong direction.
>
> > It's all too easy to accidentally open a shelve for writing
> > twice and this can lead to hard to track down database
> > corruption errors.
>
> It's all too easy to delete the wrong element from a list. It's
> all too easy to re-bind the wrong object to a name. Should
> lists be immutable and names be permanently bound?

How often do you need to open a file multiple times for writing?

As a high-level language, Python should prevent people from corrupting
data as much as possible.

Amir

> --
> Grant Edwards, grante at visi.com
> Yow! I'm in a twist contest!! I'm in a bathtub! It's on Mars!! I'm in tip-top condition!
Aug 27 '06 #10
Amir Michail wrote:
> Hi,
>
> Trying to open a file for writing that is already open for writing
> should result in an exception.

Look at the fcntl module; I use it in a class to control access from within my processes.
I don't think this functionality should be inherent to Python, though.
Keep in mind that only my processes open the shelve db, so your mileage may vary.
The get and set methods are just for convenience.
This works under Linux; I don't know about Windows.

#!/usr/bin/env python

import fcntl, shelve, time, bsddb
from os.path import exists

class fLocked:

    def __init__(self, fname):
        if exists(fname):
            #verify it is not corrupt
            bsddb.db.DB().verify(fname)
        self.fname = fname
        self.have_lock = False
        self.db = shelve.open(self.fname)
        self.fileno = self.db.dict.db.fd()

    def __del__(self):
        try: self.db.close()
        except: pass

    def aquire_lock(self, timeout = 5):
        if self.have_lock: return True
        started = time.time()
        while not self.have_lock and (time.time() - started < timeout):
            try:
                fcntl.flock(self.fileno, fcntl.LOCK_EX | fcntl.LOCK_NB)
                self.have_lock = True
            except IOError:
                # wait for it to become available
                time.sleep(.5)
        return self.have_lock

    def release_lock(self):
        if self.have_lock:
            fcntl.flock(self.fileno, fcntl.LOCK_UN)
            self.have_lock = False
        return not self.have_lock

    def get(self, key, default = {}):
        if self.aquire_lock():
            record = self.db.get(key, default)
            self.release_lock()
        else:
            raise IOError, "Unable to lock %s" % self.fname
        return record

    def set(self, key, value):
        if self.aquire_lock():
            self.db[key] = value
            self.release_lock()
        else:
            raise IOError, "Unable to lock %s" % self.fname

if __name__ == '__main__':
    fname = 'test.db'
    dbs = []
    for i in range(2): dbs.append(fLocked(fname))
    print dbs[0].aquire_lock()
    print dbs[1].aquire_lock(1) #should fail getting flock
    dbs[0].release_lock()
    print dbs[1].aquire_lock() #should be able to get lock
--Tim

Aug 27 '06 #11
Paddy wrote:
> I've never done this in anger so feel free to mock (a little :-).
>
> I'd have a fixed field at the beginning of the file that can hold the
> hostname, process number, and access time of a writing process, together
> with a sentinel value that means "no process has access to the file".
>
> A program would:
> 1. Wait a random time.
> 2. Open the file for update.
> 3. Read the locking data.
> 4. If it is already being used by another process then goto 1.
> 5. Write the process's locking data and time into the lock field.
> 6. Modify the file's other fields.
> 7. Write the sentinel value to the locking field.
> 8. Close and flush the file to disk.

That doesn't really work; you still have a race condition.

Locking the file is the good solution, but operating systems
vary in how it works. Other reasonable solutions are to re-name
the file, work with the renamed version, then change it back
after closing; and to use "lock files", which Wikipedia explains
near the bottom of the "File locking" article.
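
The lock-file idea can be sketched as follows in present-day Python 3. This is a minimal illustration under stated assumptions, not a hardened implementation: the helper name lock_file, the ".lock" suffix, and the timeout and poll values are all hypothetical. The point is that os.open with O_CREAT | O_EXCL checks for the lock and claims it in one step, so there is no gap between "read the locking data" and "write the locking data" for a second process to slip into.

```python
import contextlib
import os
import time


@contextlib.contextmanager
def lock_file(path, timeout=5.0, poll=0.1):
    """Hold '<path>.lock' for the duration of the with-block."""
    lock_path = path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Fails with FileExistsError if the lock file is already there.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError("could not acquire %s" % lock_path)
            time.sleep(poll)
    try:
        # Record the owner's pid to help a human diagnose a stale lock.
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))
        yield
    finally:
        os.unlink(lock_path)
```

The cost of this approach is the stale lock: if the holder crashes before the unlink, the lock file stays behind and someone has to decide when it is safe to remove it.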
--
--Bryan
Aug 27 '06 #12
Grant Edwards wrote:
> Amir Michail wrote:
> > Trying to open a file for writing that is already open for writing
> > should result in an exception.
>
> MS Windows seems to do something similar, and it pisses me off
> no end. Trying to open a file and read it while somebody else
> has it open for writing causes an exception. If I want to open
> a file and read it while it's being written to, that's my
> business.

Windows is actually much more sophisticated. It does allow shared
write access; see the FILE_SHARE_WRITE option for Win32's CreateFile.
You can also lock specific byte ranges in a file.
--
--Bryan
Aug 27 '06 #13
On 2006-08-27, Amir Michail <am******@gmail.com> wrote:
> How often do you need to open a file multiple times for writing?

Not very often, but I don't think it should be illegal. That's
probably a result of being a 25 year user of Unix where it's
assumed that the user knows what he's doing.

> As a high-level language, Python should prevent people from
> corrupting data as much as possible.

For somebody with a Unix background it seems overly restrictive.

--
Grant Edwards, grante at visi.com
Yow! Youth of today! Join me in a mass rally for traditional mental attitudes!
Aug 27 '06 #14
Dennis Lee Bieber wrote:
> On Sun, 27 Aug 2006 14:41:05 -0000, Grant Edwards <gr****@visi.com>
> declaimed the following in comp.lang.python:
>
> > MS Windows seems to do something similar, and it pisses me off
> > no end. Trying to open a file and read it while somebody else
> > has it open for writing causes an exception. If I want to open
> > a file and read it while it's being written to, that's my
> > business.
>
> Though strangely, Windows seems to permit one to make a COPY of that
> open file, and then open that with another application...

Yes, so long as the file hasn't been opened so as to deny reading you can
open it for reading, but you do have to specify the sharing mode. Microsoft
too follow the rule that "Explicit is better than implicit."
Aug 27 '06 #15
On Sun, 2006-08-27 at 07:51 -0700, Amir Michail wrote:
> How often do you need to open a file multiple times for writing?

How often do you write code that you don't understand well enough to
fix? This issue is clearly a problem within *your* application.

I'm curious how you could possibly think this could be solved in any
case. What if you accidentally open two instances of the application?
How would Python know? You are asking Python to perform an OS-level
operation (and a questionable one at that).

My suggestion is that you use a real database if you need concurrent
access. If you don't need concurrent access then fix your application.
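
As one illustration of the "real database" route, here is a minimal sketch of a shelve-like key/value store on top of the standard library's sqlite3 module (present-day Python 3). The class name SqliteShelf and the table layout are hypothetical, and this is only one of many possible choices. SQLite does its own file locking, so two connections to the same file take turns writing rather than trampling each other.

```python
import pickle
import sqlite3


class SqliteShelf:
    """Tiny dict-like store; values are pickled, as shelve does."""

    def __init__(self, path, timeout=5.0):
        # timeout: how long a connection waits if the database is locked.
        self._conn = sqlite3.connect(path, timeout=timeout)
        with self._conn:
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS shelf "
                "(key TEXT PRIMARY KEY, value BLOB)"
            )

    def __setitem__(self, key, value):
        # 'with conn' commits on success and rolls back on error.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO shelf (key, value) VALUES (?, ?)",
                (key, pickle.dumps(value)),
            )

    def __getitem__(self, key):
        rows = self._conn.execute(
            "SELECT value FROM shelf WHERE key = ?", (key,)
        ).fetchall()
        if not rows:
            raise KeyError(key)
        return pickle.loads(rows[0][0])

    def close(self):
        self._conn.close()
```

Opening the same file twice, as in the f()/g() scenario, then means two connections whose writes are serialized by the database instead of two handles writing to one file independently.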

> As a high-level language, Python should prevent people from corrupting
> data as much as possible.

"Data" is application-specific. Python has no idea how you intend to
use your data and therefore should not (even if it could) try to protect
you.

Regards,
Cliff

Aug 28 '06 #16
