how exactly do binary files work in python?

In C#, writing to a binary file wrote the actual data types into the
file (integers, etc.). Is this not how Python binary files work? I tried
to write integers into a file, but the write method only takes a string
argument anyway.

Is there a way to actually store integers in a file, so that they can be
read and used (added, compared, etc.) as integers?
Mar 13 '06 #1
John Salerno wrote:
> In C#, writing to a binary file wrote the actual data types into the
> file (integers, etc.).

This was inherently nonportable.

> Is this not how Python binary files work? I tried
> to write integers into a file, but the write method only takes a string
> argument anyway.
>
> Is there a way to actually store integers in a file, so that they can be
> read and used (added, compared, etc.) as integers?


You can use the struct module for converting fundamental types to a
portable string representation for writing to binary files. Since
you're dealing with a high-level language, you can also just use the
pickle module for a more general form of serialization and persistence.
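
A minimal sketch of both suggestions, assuming Python 3 (where struct.pack returns a bytes object rather than the byte string of 2006-era Python). The file name, the sample values, and the "<3i" format are illustrative choices, not anything from the thread:

```python
import os
import pickle
import struct
import tempfile

values = [1, -2, 300000]

# "<3i" means: little-endian, three 4-byte signed integers.
# Fixing the byte order and size is what makes the result portable.
packed = struct.pack("<3i", *values)

# Hypothetical scratch file, opened in binary mode for writing.
path = os.path.join(tempfile.mkdtemp(), "ints.bin")
with open(path, "wb") as f:
    f.write(packed)

# Read the raw bytes back and convert them to Python integers again.
with open(path, "rb") as f:
    raw = f.read()

restored = list(struct.unpack("<3i", raw))
total = sum(restored)  # they behave as ordinary integers again

# pickle serializes arbitrary Python objects, in a Python-specific format.
pickled = pickle.dumps(values)
unpickled = pickle.loads(pickled)
```

struct suits a fixed layout that other programs must read; pickle is more convenient when only Python will ever read the data back.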

--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
My reputation grows with every failure.
-- George Bernard Shaw
Mar 13 '06 #2
Erik Max Francis wrote:
> You can use the struct module for converting fundamental types to a
> portable string representation for writing to binary files.


But if it's a string, why not just use a text file? What does a binary
file do that a text file doesn't, aside from not converting the end of
line characters?
Mar 13 '06 #3
On 2006-03-13, John Salerno <jo******@NOSPAMgmail.com> wrote:
>> You can use the struct module for converting fundamental types
>> to a portable string representation for writing to binary
>> files.
>
> But if it's a string, why not just use a text file?

Because string != text.

In Python a "string" is just an arbitrary-length chunk of bytes.

> What does a binary file do that a text file doesn't, aside
> from not converting the end of line characters?


Nothing. It's the end-of-line conversion that can break binary
data.
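
A small sketch of that breakage, assuming Python 3, where the "chunk of bytes" role is played by the bytes type. Passing newline="\r\n" to open() imitates Windows text-mode output on any platform; the file name and the latin-1 round trip are illustrative assumptions:

```python
import os
import struct
import tempfile

# The integer 10 packs to four bytes, the first of which is 0x0A,
# the same byte value as the '\n' character.
payload = struct.pack("<i", 10)

path = os.path.join(tempfile.mkdtemp(), "broken.bin")

# newline="\r\n" forces '\n' -> '\r\n' translation on write.
# latin-1 maps each byte value 0-255 to exactly one character,
# so the newline translation is the only change made to the data.
with open(path, "w", newline="\r\n", encoding="latin-1") as f:
    f.write(payload.decode("latin-1"))

# Look at what actually reached the disk.
with open(path, "rb") as f:
    on_disk = f.read()

# on_disk now holds five bytes, so it no longer unpacks as one 4-byte int.
corrupted = on_disk != payload
```

The extra 0x0D byte shifts everything after it, which is why binary data should go through a file opened with "wb".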

--
Grant Edwards
gr****@visi.com

Mar 13 '06 #4
Grant Edwards <gr****@visi.com> wrote:
...
>> What does a binary file do that a text file doesn't, aside
>> from not converting the end of line characters?
>
> Nothing. It's the end-of-line conversion that can break binary
> data.

I believe that a "control-Z" (chr(26)) in a file that's being read as
text, on Windows, is also taken as an end-of-file indication.

Alex
Mar 13 '06 #5
On Sun, 12 Mar 2006 22:01:46 -0500, John Salerno wrote:
> Erik Max Francis wrote:
>> You can use the struct module for converting fundamental types to a
>> portable string representation for writing to binary files.
>
> But if it's a string, why not just use a text file? What does a binary
> file do that a text file doesn't, aside from not converting the end of
> line characters?


Nothing. It is all bytes under the hood.

People generally consider a file to be "text" if it only includes bytes 32
through 126, plus a few control characters like 9 (tab) and 10 (newline).
Other applications don't care what bytes are included. Python is (mostly)
like that: you can deal with any bytes you collect from any file.

Other than this informal difference between text and binary, the major
difference comes about when you read lines from a text file. Each
operating system has a line separator: Unix/BSD/Linux systems use newline
(char 10), classic Macintosh used carriage return (char 13) and
DOS/Windows uses a two-byte carriage return + newline.
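
A quick sketch of the three conventions, assuming Python 3. The sample data is made up; bytes.splitlines() and os.linesep are standard library features:

```python
import os

# The same two lines, terminated in each of the three styles.
samples = {
    "unix": b"spam\neggs\n",
    "classic_mac": b"spam\reggs\r",
    "dos_windows": b"spam\r\neggs\r\n",
}

# bytes.splitlines() recognizes LF, CR and CR+LF as line boundaries.
split = {name: data.splitlines() for name, data in samples.items()}

# Numeric codes of the two characters involved.
codes = {"LF": ord("\n"), "CR": ord("\r")}

# os.linesep is the separator of the platform running this script.
local_separator = os.linesep
```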

When writing lines to a file, Python does not automatically append the
line marker, so you need to do so yourself. But some other languages do --
I believe C++ is one of those languages. So C++ needs to know whether you
are writing in text mode so it can append that end-of-line marker, or
binary mode so it doesn't. Since Python doesn't modify the line you write
to the file, it doesn't care whether you are writing in text or binary
mode; it is all the same.

Operating systems such as Unix and Linux don't distinguish between binary
and text mode; the results are the same. I'm told that Windows does
distinguish between the two, although I couldn't tell you how they
differ.

--
Steven.

Mar 13 '06 #6
On 2006-03-13, Alex Martelli <al*****@yahoo.com> wrote:
> Grant Edwards <gr****@visi.com> wrote:
> ...
>>> What does a binary file do that a text file doesn't, aside
>>> from not converting the end of line characters?
>>
>> Nothing. It's the end-of-line conversion that can break binary
>> data.
>
> I believe that a "control-Z" (chr(26)) in a file that's being read as
> text, on Windows, is also taken as an end-of-file indication.


Ah yes. IIRC, that's left over from CP/M, where the filesystem
didn't keep a file length for files other than a block count.
It was up to the application(s) to keep track of where in that
last block the "real" data ended.

--
Grant Edwards (grante at visi.com)
Yow! I wonder if I ought to tell them about my PREVIOUS LIFE as a
COMPLETE STRANGER?
Mar 13 '06 #7
Steven D'Aprano wrote:

[Generally fine stuff, I am elaborating rather than disagreeing.]

> On Sun, 12 Mar 2006 22:01:46 -0500, John Salerno wrote:
>> Erik Max Francis wrote:
>>> You can use the struct module for converting fundamental types to a
>>> portable string representation for writing to binary files.
>>
>> But if it's a string, why not just use a text file? What does a binary
>> file do that a text file doesn't, aside from not converting the end of
>> line characters?
>
> Nothing. It is all bytes under the hood.

Modeling a file as "a continuous undifferentiated string of bytes under
the hood" is a Unix-ism. There were (and are) other models.

> When writing lines to a file, Python does not automatically append the
> line marker, so you need to do so yourself.

This is, indeed, the behavior with "write," but not with "print".
A "print" statement ending w/o a comma will tack an end-of-line onto its
output.
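
A short sketch of the same distinction, assuming Python 3, where print is a function and end="" takes the place of the old trailing comma. An in-memory io.StringIO buffer stands in for a real file:

```python
import io

buf = io.StringIO()

buf.write("written with write")   # write() adds nothing to the text
buf.write("\n")                   # so the newline is added by hand

# print() appends "\n" unless told otherwise.
print("written with print", file=buf)

# end="" suppresses the automatic newline.
print("no newline here", file=buf, end="")

result = buf.getvalue()
```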

> But some other languages do -- I believe C++ is one of those languages.
> So C++ needs to know whether you are writing in text mode so it can
> append that end-of-line marker, or binary mode so it doesn't.

Actually C++ (and C) convert any ('\12' == '\n' == LF) character to
the local file system's "line terminator" character on output to a
text-mode file.

> Since Python doesn't modify the line you write to the file, it doesn't
> care whether you are writing in text or binary mode, it is all the same.

Well, actually CPython uses C I/O, so it does convert the '\n' chars
just as C does.

> Operating systems such as Unix and Linux don't distinguish between binary
> and text mode; the results are the same. I'm told that Windows does
> distinguish between the two, although I couldn't tell you how they
> differ.


The way Windows differs from Unix:
If the actual file data is built as:

    f = open('dead_parrot', 'wb')
    f.write('dead\r\nparrot')
    f.close()
    g = open('ex_parrot', 'w')
    g.write('Dead\nParrot')
    g.close()

then on Windows, reading in text mode gives:

    ft = open('dead_parrot', 'r')
    ft.read(6)    # returns 'dead\np'
    gt = open('ex_parrot', 'r')
    gt.read(6)    # returns 'Dead\nP'

and reading in binary mode gives:

    fb = open('dead_parrot', 'rb')
    fb.read(6)    # returns 'dead\r\n'
    gb = open('ex_parrot', 'rb')
    gb.read(6)    # returns 'Dead\r\n'

In case you didn't follow the above too precisely, both files
(dead_parrot and ex_parrot) have exactly the same bytes as contents,
apart from the capitalization.
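
The same experiment can be sketched so that it runs on any platform, assuming Python 3. There the newline argument of open() controls the translation: newline="\r\n" imitates Windows text-mode output, and the default on reading converts '\r\n' to '\n'. The temporary directory is an illustrative choice:

```python
import os
import tempfile

folder = tempfile.mkdtemp()
dead = os.path.join(folder, "dead_parrot")
ex = os.path.join(folder, "ex_parrot")

# Binary mode: the bytes are written exactly as given.
with open(dead, "wb") as f:
    f.write(b"dead\r\nparrot")

# Text mode with Windows-style translation: '\n' becomes '\r\n' on disk.
with open(ex, "w", newline="\r\n", encoding="ascii") as g:
    g.write("Dead\nParrot")

# Text-mode reads (default newline handling) turn '\r\n' back into '\n'.
with open(dead, "r", encoding="ascii") as ft:
    text_dead = ft.read(6)
with open(ex, "r", encoding="ascii") as gt:
    text_ex = gt.read(6)

# Binary-mode reads show what is really stored in each file.
with open(dead, "rb") as fb:
    raw_dead = fb.read(6)
with open(ex, "rb") as gb:
    raw_ex = gb.read(6)
```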

This, by the way, is one of the few places Windows did it "by the
standard" and Unix "made up their own standard." The Unix decision
was, essentially: "there are too many ways to get in trouble with
both CR and LF determining line ending: what do you do for LF-CR pairs,
what does a LF by itself mean w/o a CR, .... Let's just treat LF
as a single-character line separator." Note how funny this is given how
you type: you type <a> <b> <c> <Enter> for a line, but <Enter> sends
a CR ('\r' == '\15' == ASCII 13), which the I/O system somewhere
magically transforms into a LF ('\n' == '\12' == ASCII 10).

The C standard (which evolved with Unix) does these translations
"for you" (or "to you" depending on your mood) because it was meant
to be compatible with _many_ file systems, including those which did
not explicitly represent ends-of-lines (text files on such systems
are sequences of lines, and there is a maximum length to each line).
By the way, before you think such systems are foolish, think about
how nice it might sometimes be to get to line 20972 of a file without
reading through the entire front of the file.
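
A rough sketch of why fixed-length lines make that possible, assuming Python 3. The 16-byte record size, the file name, and the read_line helper are all hypothetical, chosen only for illustration:

```python
import os
import tempfile

RECORD_SIZE = 16  # hypothetical fixed line length, in bytes

path = os.path.join(tempfile.mkdtemp(), "records.dat")

# Store 1000 lines, each padded with spaces to exactly RECORD_SIZE bytes.
# No end-of-line character is stored at all.
with open(path, "wb") as f:
    for n in range(1000):
        f.write(("line %d" % n).encode("ascii").ljust(RECORD_SIZE))


def read_line(path, index):
    """Return line `index` (0-based) by seeking straight to its offset."""
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE)
        return f.read(RECORD_SIZE).rstrip().decode("ascii")


# Jump directly to one line without reading the 972 lines before it.
line_972 = read_line(path, 972)
file_size = os.path.getsize(path)
```

The price is wasted padding and a hard upper limit on line length, which is the trade-off the record-oriented systems made.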

--Scott David Daniels
sc***********@acm.org
Mar 13 '06 #8
