file corruption on windows - possible bug

I've written a piece of code that iterates through a list of items and
determines the filename to write some piece of data to based on
something in the item itself. Here is a small example piece of code to
show the type of thing I'm doing::

#################################
file_dict = {}

a_list = [("a", "a%s" % i) for i in range(2500)]
b_list = [("b", "b%s" % i) for i in range(2500)]
c_list = [("c", "c%s" % i) for i in range(2500)]
d_list = [("d", "d%s" % i) for i in range(2500)]
joined_list = a_list + b_list + c_list + d_list

for key, value in joined_list:
    outfile = file_dict.setdefault(key, open("%s.txt" % key, "w"))
    outfile.write("%s\n" % value)

for f in file_dict.values():
    f.close()
#################################

Problem is, when I run this on Windows, I get 14,520 null ("\x00")
characters at the front of the file and each file is 16,390 bytes long.
When I run this script on Linux, each file is 13,890 bytes and contains
no "\x00" characters. This piece of code::

#################################
import cStringIO

file_dict = {}

a_list = [("a", "a%s" % i) for i in range(2500)]
b_list = [("b", "b%s" % i) for i in range(2500)]
c_list = [("c", "c%s" % i) for i in range(2500)]
d_list = [("d", "d%s" % i) for i in range(2500)]
joined_list = a_list + b_list + c_list + d_list

for key, value in joined_list:
    #outfile = file_dict.setdefault(key, open("%s.txt" % key, "w"))
    outfile = file_dict.setdefault(key, cStringIO.StringIO())
    outfile.write("%s\n" % value)

for key, io_string in file_dict.items():
    outfile = open("%s.txt" % key, "w")
    io_string.seek(0)
    outfile.write(io_string.read())
    outfile.close()
#################################

results in files containing 16,390 bytes and no "\x00" characters on
Windows and 13,890 bytes on Linux and no "\x00" characters (file size
difference on Windows and Linux is due to line ending). I'm still doing
a setdefault on the dictionary to create an object if the key doesn't
exist, but I'm using a cStringIO object rather than a file object. So,
I'm treating this just like it was a file and writing it out later.
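
For anyone reading this on Python 3, where cStringIO no longer exists, the same buffer-then-write workaround can be sketched roughly as below. This is only an illustration, not the code from the post: it uses io.StringIO, creates each buffer only when its key is first seen, and writes into a temporary directory (out_dir is a made-up location for the sketch).

```python
import io
import os
import tempfile

joined_list = [(k, "%s%s" % (k, i)) for k in "abcd" for i in range(2500)]

buffers = {}
for key, value in joined_list:
    # Create the in-memory buffer only when the key is first seen.
    if key not in buffers:
        buffers[key] = io.StringIO()
    buffers[key].write("%s\n" % value)

out_dir = tempfile.mkdtemp()  # hypothetical output location for this sketch
for key, buf in buffers.items():
    # Each file is opened exactly once, after all data has been collected.
    with open(os.path.join(out_dir, "%s.txt" % key), "w") as outfile:
        outfile.write(buf.getvalue())
```

The point of the sketch is that every output file is opened a single time, which is the main difference from the first script.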

Does anyone have any idea as to why this is writing over 14,000 "\x00"
characters to my file to start off with where printable characters
should go and then writing the remainder of the file correctly?
Jeremy Jones
Jul 19 '05 #1
Jeremy Jones wrote:
Here is a small example piece of code to
show the type of thing I'm doing::

#################################
file_dict = {}

a_list = [("a", "a%s" % i) for i in range(2500)]
b_list = [("b", "b%s" % i) for i in range(2500)]
c_list = [("c", "c%s" % i) for i in range(2500)]
d_list = [("d", "d%s" % i) for i in range(2500)]
joined_list = a_list + b_list + c_list + d_list

for key, value in joined_list:
    outfile = file_dict.setdefault(key, open("%s.txt" % key, "w"))
    outfile.write("%s\n" % value)

for f in file_dict.values():
    f.close()
#################################

Problem is, when I run this on Windows, I get 14,520 null ("\x00")
characters at the front of the file and each file is 16,390 bytes long.


Your call to setdefault is opening the file for writing every time it is
called, but using only the first handle to write to the file. I presume you
get a nasty interaction between the file handle you are using to write and
the other file handles which open the file in a destructive ("w") mode.
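
The repeated evaluation can be seen without touching the filesystem. The following is a minimal sketch rather than code from the thread: fake_open is a hypothetical stand-in for open() that only records how often it is called.

```python
# fake_open is a hypothetical stand-in for open(); it only records each call.
calls = []

def fake_open(name):
    calls.append(name)
    return object()

file_dict = {}
keys = ["a"] * 3 + ["b"] * 2

for key in keys:
    # The second argument is evaluated before setdefault runs,
    # so fake_open is called on every iteration, not just on a miss.
    file_dict.setdefault(key, fake_open("%s.txt" % key))

print(len(calls))      # 5: one "open" per iteration
print(len(file_dict))  # 2: only the first handle per key is kept
```

With the real open() in "w" mode, each of those extra calls would truncate a file that the stored handle is still writing to.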

The fix is simply to only open each file once instead of 2500 times. e.g.
(untested code)

for key, value in joined_list:
    if key in file_dict:
        outfile = file_dict[key]
    else:
        outfile = file_dict[key] = open("%s.txt" % key, "w")
    outfile.write("%s\n" % value)
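
A self-contained version of the same idea, written for Python 3, might look like the sketch below. The function name write_grouped and the use of a temporary directory are assumptions made for the example; they are not part of the original script.

```python
import os
import tempfile

def write_grouped(pairs, directory):
    """Write each value to <directory>/<key>.txt, opening each file once."""
    file_dict = {}
    try:
        for key, value in pairs:
            # open() runs only when the key has not been seen before.
            if key not in file_dict:
                path = os.path.join(directory, "%s.txt" % key)
                file_dict[key] = open(path, "w")
            file_dict[key].write("%s\n" % value)
    finally:
        # Close every handle even if a write fails part-way through.
        for f in file_dict.values():
            f.close()

joined_list = [(k, "%s%s" % (k, i)) for k in "abcd" for i in range(2500)]
out_dir = tempfile.mkdtemp()  # hypothetical output location for this sketch
write_grouped(joined_list, out_dir)

# Read one file back in text mode and check it for null characters.
with open(os.path.join(out_dir, "a.txt")) as f:
    data = f.read()
print(len(data), "\x00" in data)
```

Reading the file back in text mode normalizes line endings, so the character count should come out the same on Windows and Linux.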
Jul 19 '05 #2
On Mon, 09 May 2005 10:54:22 -0400, Jeremy Jones <za******@bellsouth.net> wrote:
I've written a piece of code that iterates through a list of items and
determines the filename to write some piece of data to based on
something in the item itself. Here is a small example piece of code to
show the type of thing I'm doing::

#################################
file_dict = {}

a_list = [("a", "a%s" % i) for i in range(2500)]
b_list = [("b", "b%s" % i) for i in range(2500)]
c_list = [("c", "c%s" % i) for i in range(2500)]
d_list = [("d", "d%s" % i) for i in range(2500)]
joined_list = a_list + b_list + c_list + d_list

for key, value in joined_list:
    outfile = file_dict.setdefault(key, open("%s.txt" % key, "w"))

You are opening files multiply, since the open is a default value expression that is
always evaluated. Try replacing the above line with the following two lines:

    try: outfile = file_dict[key]
    except KeyError: outfile = file_dict[key] = open("%s.txt" % key, 'w')

    outfile.write("%s\n" % value)

for f in file_dict.values():
    f.close()
#################################

Problem is, when I run this on Windows, I get 14,520 null ("\x00")
characters at the front of the file and each file is 16,390 bytes long.
When I run this script on Linux, each file is 13,890 bytes and contains
no "\x00" characters. This piece of code::

I don't want to think about the _exact_ explanation, but try the above (untested ;-)
and see if the symptoms change ;-)

Regards,
Bengt Richter
Jul 19 '05 #3
