Funky file contents when os.rename or os.remove are interrupted

I've got a case where I'm seeing text files that are either all null
characters, or are trailed with nulls due to interrupted file access
resulting from an electrical power interruption on the WinXP PC.

In tracking it down, it seems that what is being interrupted is either
os.remove(), or os.rename(). Has anyone seen this behaviour, or have
any clue what is going on?

On first pass I would think that both of those calls are single step
operations (removing/changing an entry in the FAT, or FAT-like thing,
on the HDD) and wouldn't result in an intermediate, null-populated,
step, but the evidence seems to indicate I'm wrong...

Any insight from someone with knowledge of the internal operations of
os.remove and/or os.rename would be greatly appreciated, although I
expect the crux may be at the OS level and not in Python.

Russ

Oct 10 '06 #1
Russell Warren enlightened us with:
> On first pass I would think that both of those calls are single step
> operations (removing/changing an entry in the FAT, or FAT-like
> thing, on the HDD) and wouldn't result in an intermediate,
> null-populated, step, but the evidence seems to indicate I'm
> wrong...

They require multiple blocks to be written to disc, so if you're not
using a journaling filesystem, bad things can happen.

> Any insight from someone with knowledge of the internal operations
> of os.remove and/or os.rename would be greatly appreciated, although
> I expect the crux may be at the OS level and not in Python.

You're right about that.

Sybren
--
Sybren Stüvel
Stüvel IT - http://www.stuvel.eu/
Oct 10 '06 #2
hg
Russell Warren wrote:
> I've got a case where I'm seeing text files that are either all null
> characters, or are trailed with nulls due to interrupted file access
> resulting from an electrical power interruption on the WinXP PC.
>
> In tracking it down, it seems that what is being interrupted is either
> os.remove(), or os.rename(). Has anyone seen this behaviour, or have
> any clue what is going on?
>
> On first pass I would think that both of those calls are single step
> operations (removing/changing an entry in the FAT, or FAT-like thing,
> on the HDD) and wouldn't result in an intermediate, null-populated,
> step, but the evidence seems to indicate I'm wrong...
>
> Any insight from someone with knowledge of the internal operations of
> os.remove and/or os.rename would be greatly appreciated, although I
> expect the crux may be at the OS level and not in Python.
>
> Russ

Taking a quick look at the code, it looks like MoveFileW (Windows API)
is eventually being called by posixmodule.c.

My gut feeling is that you are correct and are facing a Windows issue
rather than a Python one (sigh) ... you might want to test your problem
on an NTFS file system and see if the problems are similar.

Regards,

hg

Oct 10 '06 #3
Russell Warren wrote:
> Any insight from someone with knowledge of the internal operations of
> os.remove and/or os.rename would be greatly appreciated, although I
> expect the crux may be at the OS level and not in Python.

Just to confirm what others have said: Python has nothing to do with
that. It calls the relevant Win32 API rather directly.

Then, Windows has nothing to do with it, either. It calls the routines
of the file system driver rather directly.

It's the FAT file system that may suffer from metadata corruption in
case of power loss. If you lose power on a disk that has a FAT file
system on it, you need to run chkdsk before using the file system
again, and you *still* may see corruption. As others have said:
use NTFS if you want a reasonable chance of getting in a clean state
in case of a power loss.

Regards,
Martin
Oct 11 '06 #4
Thanks, guys... this has all been very useful information.

The machine this is happening on is already running NTFS.

The good news is that we just discovered/remembered that there is a
write-caching option (in Device Manager -> HDD -> Properties ->
Policies tab) available in XP. The note right beside the
write-cache-enable checkbox says:

"This setting enables write caching to improve disk performance, but a
power outage or equipment failure might result in data loss or
corruption."

Well waddya know... write-caching was enabled on the machine. It is
now disabled and we'll be power-cycle testing to see if it happens
again.

Regarding the comment on journaling file systems, I looked into it and
it looks like NTFS actually does do journaling to some extent, and some
effort was expended to make NTFS less susceptible to the exact problem
I'm experiencing. I'm currently hopeful that the corrupted files we've
seen are entirely due to the mistake of having write-caching enabled
(the default).

> Then, Windows has nothing to do with it, either. It calls the routines
> of the file system driver rather directly.

It looks like that is not entirely true... this write-caching appears
to sit above the file system itself. In any case, it is certainly not
a Python issue!

One last non-Python question... a few things I read seemed to vaguely
indicate that the journaling feature of NTFS is an extension/option.
Wording could also indicate a simple feature, though. Are there
options you can set on your file system (aside from block size and
partition)?! I've certainly never heard of that, but want to be sure.
I definitely need this system to be as crash-proof as possible.

Thanks again,
Russ

Oct 11 '06 #5
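
For the power-cycle testing mentioned above, it may help to scan the
files afterwards for the two symptoms described in the first post (all
nulls, or real content followed by a run of nulls). A minimal sketch,
assuming the files are small enough to read into memory; the function
name and the result labels are made up for illustration and are not
from the thread:

```python
def classify_null_damage(path):
    """Return 'all_nulls', 'trailing_nulls', or 'ok' for the file at path."""
    with open(path, "rb") as f:
        data = f.read()
    if not data:
        return "ok"  # an empty file has no null padding to report
    stripped = data.rstrip(b"\x00")  # drop any run of NUL bytes at the end
    if not stripped:
        return "all_nulls"  # nothing but NUL bytes
    if len(stripped) < len(data):
        return "trailing_nulls"  # real content followed by NUL bytes
    return "ok"
```

Running this over the data directory after each power cycle gives a
quick pass/fail signal, though it cannot tell legitimate trailing nulls
in a binary file from damage, so it only makes sense for text files.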
Russell Warren wrote:
> One last non-Python question... a few things I read seemed to vaguely
> indicate that the journaling feature of NTFS is an extension/option.

http://www.microsoft.com/whdc/system...reinstall.mspx

NTFS is a journaling file system. NTFS writes a log of changes being
made, which offers significant benefit in cases where a system loses
power, experiences an unexpected reset, or crashes.

http://en.wikipedia.org/wiki/NTFS

A file system journal is used in order to guarantee the integrity of
the file system itself (but not of each individual file). Systems
using NTFS are known to have improved reliability compared to FAT file
systems.

</F>

Oct 11 '06 #6
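
Since the journal protects the file system structure but not the
contents of each file, an application that rewrites files can reduce
its exposure by flushing the new data to disk before the rename makes
it visible. A minimal sketch of that write-then-rename pattern,
assuming a modern Python 3 (os.replace was added in 3.3, long after
this thread); durable_replace is a hypothetical helper name. It is not
a guarantee: with the drive's write cache enabled, the hardware may
still acknowledge a flush before the data is actually on the platter.

```python
import os
import tempfile


def durable_replace(path, data):
    """Write bytes to a temp file, flush it to disk, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the OS
            os.fsync(tmp.fileno())  # ask the OS to push its cache to the device
        os.replace(tmp_path, path)  # swap the new file in over the old name
    except BaseException:
        # Remove the temp file if anything failed before the rename completed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Usage would look like durable_replace("settings.txt", b"key=value\n").
The point of the ordering is that the old file is only replaced after
the new contents have been handed to the device, so an interruption
should leave either the old file or the complete new one rather than a
renamed file whose data blocks were never written.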
