Funky file contents when os.rename or os.remove are interrupted

I've got a case where I'm seeing text files that are either all null
characters, or are trailed with nulls due to interrupted file access
resulting from an electrical power interruption on the WinXP pc.
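For anyone wanting to check a directory for the same symptom, here is a minimal sketch of a detector. It is not from the thread: the function name `null_damage` and its return labels are made up for illustration, and it only looks for NUL bytes (it says nothing about why they are there).

```python
import os

def null_damage(path, chunk_size=64 * 1024):
    """Classify a file as 'empty', 'all_nulls', 'trailing_nulls', or 'ok'.

    Hypothetical helper: reads the file in chunks and tracks how many
    consecutive NUL bytes sit at the very end of the file.
    """
    if os.path.getsize(path) == 0:
        return "empty"
    saw_data = False
    trailing = 0  # consecutive NUL bytes seen at the end so far
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            stripped = chunk.rstrip(b"\x00")
            if stripped:
                # Real data in this chunk: restart the trailing count.
                saw_data = True
                trailing = len(chunk) - len(stripped)
            else:
                # Chunk is NULs only: extend the current run.
                trailing += len(chunk)
    if not saw_data:
        return "all_nulls"
    return "trailing_nulls" if trailing else "ok"
```

Running it over the affected files after each power-cycle test would give a quick count of how often the damage shows up.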

In tracking it down, it seems that what is being interrupted is either
os.remove(), or os.rename(). Has anyone seen this behaviour, or have
any clue what is going on?

On first pass I would think that both of those calls are single step
operations (removing/changing an entry in the FAT, or FAT-like thing,
on the HDD) and wouldn't result in an intermediate, null-populated,
step, but the evidence seems to indicate I'm wrong...

Any insight from someone with knowledge of the internal operations of
os.remove and/or os.rename would be greatly appreciated, although I
expect the crux may be at the os level and not in python.

Russ

Oct 10 '06 #1
Russell Warren enlightened us with:
> On first pass I would think that both of those calls are single step
> operations (removing/changing an entry in the FAT, or FAT-like
> thing, on the HDD) and wouldn't result in an intermediate,
> null-populated, step, but the evidence seems to indicate I'm
> wrong...

They require multiple blocks to be written to disc, so if you're not
using a journaling filesystem, bad things can happen.

> Any insight from someone with knowledge of the internal operations
> of os.remove and/or os.rename would be greatly appreciated, although
> I expect the crux may be at the os level and not in python.

You're right about that.

Sybren
--
Sybren Stüvel
Stüvel IT - http://www.stuvel.eu/
Oct 10 '06 #2
hg
Russell Warren wrote:
> I've got a case where I'm seeing text files that are either all null
> characters, or are trailed with nulls due to interrupted file access
> resulting from an electrical power interruption on the WinXP pc.
>
> In tracking it down, it seems that what is being interrupted is either
> os.remove(), or os.rename(). Has anyone seen this behaviour, or have
> any clue what is going on?
>
> On first pass I would think that both of those calls are single step
> operations (removing/changing an entry in the FAT, or FAT-like thing,
> on the HDD) and wouldn't result in an intermediate, null-populated,
> step, but the evidence seems to indicate I'm wrong...
>
> Any insight from someone with knowledge of the internal operations of
> os.remove and/or os.rename would be greatly appreciated, although I
> expect the crux may be at the os level and not in python.
>
> Russ

Taking a quick look at the code, it looks like MoveFileW (Windows API)
is eventually being called by posixmodule.c.

My gut feeling is that you are correct and are facing a Windows issue
rather than a Python one (sigh) ... you might want to test your problem
on an NTFS file system and see if the problems are similar.

Regards,

hg

Oct 10 '06 #3
Russell Warren wrote:
> Any insight from someone with knowledge of the internal operations of
> os.remove and/or os.rename would be greatly appreciated, although I
> expect the crux may be at the os level and not in python.

Just to confirm what others have said: Python has nothing to do with
that. It calls the relevant Win32 API rather directly.

Then, Windows has nothing to do with it, either. It calls the routines
of the file system driver rather directly.

It's the FAT file system that may suffer from metadata corruption in
case of power loss. If you lose power on a disk that has a FAT file
system on it, you need to run chkdsk before using the file system
again, and you *still* may see corruption. As others have said:
use NTFS if you want a reasonable chance of getting in a clean state
in case of a power loss.

Regards,
Martin
Oct 11 '06 #4
Thanks, guys... this has all been very useful information.

The machine this is happening on is already running NTFS.

The good news is that we just discovered/remembered that there is a
write-caching option (in Device Manager -> HDD -> Properties ->
Policies tab) available in XP. The note right beside the
write-cache-enable checkbox says:

"This setting enables write caching to improve disk performance, but a
power outage or equipment failure might result in data loss or
corruption."

Well waddya know... write-caching was enabled on the machine. It is
now disabled and we'll be power-cycle testing to see if it happens
again.

Regarding the comment on journaling file systems, I looked into it and
it looks like NTFS actually does do journaling to some extent, and some
effort was expended to make NTFS less susceptible to the exact problem
I'm experiencing. I'm currently hopeful that the corrupted files we've
seen are entirely due to the mistake of having write-caching enabled
(the default).

> Then, Windows has nothing to do with it, either. It calls the routines
> of the file system driver rather directly.

It looks like that is not entirely true... this write-caching appears
to sit above the file system itself. In any case, it is certainly not
a Python issue!
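
On the application side, a common way to narrow the window in which cached writes can be lost is to write the new contents to a temporary file, force them to disk, and only then rename over the target. The following is a minimal sketch under assumptions, not something anyone in this thread posted: `write_durably` is a hypothetical name, and `os.replace` needs Python 3.3 or later. `os.fsync` only asks the OS to flush its cache; whether a drive's own write cache honours that depends on the hardware and its settings.

```python
import os
import tempfile

def write_durably(path, data):
    """Write bytes to path via a temp file, flushing before the rename.

    Hypothetical helper: the temp file lives in the same directory so the
    final rename stays on one volume.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()               # push Python's buffer to the OS
            os.fsync(f.fileno())    # ask the OS to push its cache to the device
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this pattern, a power cut should leave either the old file or the new one behind rather than a half-written target, provided the flush actually reaches the disk.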

One last non-python question... a few things I read seemed to vaguely
indicate that the journaling feature of NTFS is an extension/option.
Wording could also indicate a simple feature, though. Are there
options you can set on your file system (aside from block size and
partition)?! I've certainly never heard of that, but want to be sure.
I definitely need this system to be as crash-proof as possible.

Thanks again,
Russ

Oct 11 '06 #5
Russell Warren wrote:
> One last non-python question... a few things I read seemed to vaguely
> indicate that the journaling feature of NTFS is an extension/option.

http://www.microsoft.com/whdc/system...reinstall.mspx

NTFS is a journaling file system. NTFS writes a log of changes being
made, which offers significant benefit in cases where a system loses
power, experiences an unexpected reset, or crashes.

http://en.wikipedia.org/wiki/NTFS

A file system journal is used in order to guarantee the integrity of
the file system itself (but not of each individual file). Systems
using NTFS are known to have improved reliability compared to FAT file
systems.
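
Since the journal protects the file system structure but not the contents of each file, an application that needs to know whether a file survived intact has to check that itself. A minimal sketch of one way to do so, assuming a made-up on-disk layout (a SHA-256 hex digest on the first line, payload after it); `pack` and `unpack` are hypothetical names, not part of any library:

```python
import hashlib

def pack(payload):
    """Prefix payload bytes with a line holding their SHA-256 hex digest."""
    digest = hashlib.sha256(payload).hexdigest().encode("ascii")
    return digest + b"\n" + payload

def unpack(blob):
    """Return the payload, or None if the stored digest does not match."""
    header, sep, payload = blob.partition(b"\n")
    if not sep:
        # No digest line at all, e.g. a file that is nothing but NULs.
        return None
    expected = hashlib.sha256(payload).hexdigest().encode("ascii")
    return payload if header == expected else None
```

A file that comes back all nulls, or with its tail replaced by nulls, fails the check and can be restored from a backup copy instead of being trusted.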

</F>

Oct 11 '06 #6
