Question about reading a big binary file and write it into several text (ascii) files

Hi,

I am learning Python and am pretty new to it, and I hope you guys can give me
a quick start.

I have a roughly 1 GB binary file from a flat-panel x-ray detector. I
know that at the beginning there is a 128-byte header and that the rest of the
file is integers in 2-byte format.

What I want to do is save the binary data into several smaller files
in integer format, with each smaller file having a size of 2*1024*768
bytes.

I know I can do something like
f=open("xray.seq", 'rb')
header=f.read(128)
file1=f.read(2*1024*768)
file2=f.read(2*1024*768)
......
f.close()


But I don't know how to save the files in integer format (converting from
binary to ASCII files) or how to do this in an elegant and snappy way.
Please reply when you guys get a chance.
Thanks,
Warm regards,
Albert

Jul 18 '05 #1
Albert Tu wrote:
I am learning Python and am pretty new to it, and I hope you guys can give me
a quick start.

I have a roughly 1 GB binary file from a flat-panel x-ray detector. I
know that at the beginning there is a 128-byte header and that the rest of the
file is integers in 2-byte format.

What I want to do is save the binary data into several smaller files
in integer format, with each smaller file having a size of 2*1024*768
bytes.

I know I can do something like
f=open("xray.seq", 'rb')
header=f.read(128)
file1=f.read(2*1024*768)
file2=f.read(2*1024*768)
......
(using a loop might help)
f.close()


But I don't know how to save the files in integer format (converting from
binary to ASCII files) or how to do this in an elegant and snappy way.


I think you have to define "integer format" a bit better. A text file with
integer values, written out in decimal?

If so, take a look at the array module:

http://docs.python.org/lib/module-array.html

Here's an (untested) example; tweak as necessary:

from array import array

linesize = 1024
# filedata is one chunk read from the file, e.g. f.read(2*1024*768)
data = array("h", filedata)
for i in range(0, len(data), linesize):
    # convert to list of decimal integers
    values = map(str, data[i:i+linesize])
    print(" ".join(values))

Tools like PIL and NumPy may also come in handy, but I suspect they're
overkill in this case.
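
As a rough sketch of how the array approach above might be extended to the whole file (not tested against real detector data): the loop below skips the header, reads one frame at a time, and writes each frame to its own text file as decimal integers. The function name `split_to_text`, the `frameNNNN.txt` naming, and the choice of one 1024-value image row per text line are assumptions made for illustration; only the 128-byte header and the 2*1024*768-byte frame size come from the original post. It is written for Python 3, and `swap=True` is there in case the file's byte order differs from the machine's.

```python
from array import array

HEADER_SIZE = 128              # bytes, from the original post
FRAME_BYTES = 2 * 1024 * 768   # one frame of 2-byte integers
LINESIZE = 1024                # values per text line (assumed: one image row)


def split_to_text(path, prefix="frame", header_size=HEADER_SIZE,
                  frame_bytes=FRAME_BYTES, linesize=LINESIZE,
                  typecode="h", swap=False):
    """Write each complete frame in `path` to <prefix>NNNN.txt as decimal text.

    Returns the number of frames written. An incomplete trailing frame
    is ignored.
    """
    count = 0
    with open(path, "rb") as f:
        f.seek(header_size)                  # skip the header
        while True:
            chunk = f.read(frame_bytes)
            if len(chunk) < frame_bytes:
                break                        # end of file or broken tail
            data = array(typecode, chunk)
            if swap:
                data.byteswap()              # file and machine byte order differ
            with open("%s%04d.txt" % (prefix, count), "w") as out:
                for i in range(0, len(data), linesize):
                    row = data[i:i + linesize]
                    out.write(" ".join(map(str, row)) + "\n")
            count += 1
    return count
```

Usage would be something like `split_to_text("xray.seq")`. Only one frame is held in memory at a time, so the 1 GB input is never loaded whole.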

</F>

Jul 18 '05 #2
On 24 Jan 2005 12:44:32 -0800, "Albert Tu" <sj*******@gmail.com> wrote:
Hi,

I am learning Python and am pretty new to it, and I hope you guys can give me
a quick start.

I have a roughly 1 GB binary file from a flat-panel x-ray detector. I
know that at the beginning there is a 128-byte header and that the rest of the
file is integers in 2-byte format.

It looks like 16-bit pixels in 1024*768 images, I assume.

What I want to do is save the binary data into several smaller files
in integer format, with each smaller file having a size of 2*1024*768
bytes.

You could do that, but why duplicate so much data that you may never look at?
E.g., why not a class that provides a view of your big file in terms of an image index
and returns an efficient in-memory array? For example (untested):

import array

def getimage(n, f, offset=128):
    # seek to the start of image n, skipping the header
    f.seek(offset + n*2*1024*768)
    # 'H' is for unsigned 2-byte integers (check endianness for swap need!)
    return array.array('H', f.read(2*1024*768))

Then usage would be
imfile = open('big_file.bin', 'rb')
imarray = getimage(23, imfile)
And you could get the pixel at x,y by
pixel = imarray[x+y*1024] # or maybe x*768+y etc.

Or you could make getimage a method of a class that you initialize with
the file and which could maintain an LRU cache of images
with a particular disk directory as backup, etc., and would provide
images wrapped with nice methods to support whatever you are doing with the images.
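
A minimal sketch of that class idea, assuming Python 3, might look like the following. The names `FrameFile`, `pixel`, and `cache_size` are made up for illustration, the row-major pixel layout (`x + y*width`) is an assumption that would need checking against the detector's documentation, and the disk-directory backup is left out. The LRU cache is a plain `OrderedDict` that evicts the least recently used frame once it holds more than `cache_size` entries.

```python
from array import array
from collections import OrderedDict


class FrameFile:
    """View of a big image file as a sequence of frames (hypothetical helper)."""

    def __init__(self, path, offset=128, width=1024, height=768,
                 typecode="H", swap=False, cache_size=8):
        self._f = open(path, "rb")
        self.offset = offset
        self.width = width
        self.height = height
        self.typecode = typecode
        self.swap = swap                     # True if file byte order differs
        self.cache_size = cache_size
        self.frame_bytes = width * height * array(typecode).itemsize
        self._cache = OrderedDict()          # frame index -> array, oldest first

    def getimage(self, n):
        """Return frame n as an array, reading from disk only on a cache miss."""
        if n in self._cache:
            self._cache.move_to_end(n)       # mark as most recently used
            return self._cache[n]
        self._f.seek(self.offset + n * self.frame_bytes)
        raw = self._f.read(self.frame_bytes)
        if len(raw) != self.frame_bytes:
            raise IndexError("frame %d is out of range" % n)
        im = array(self.typecode, raw)
        if self.swap:
            im.byteswap()
        self._cache[n] = im
        if len(self._cache) > self.cache_size:
            self._cache.popitem(last=False)  # evict least recently used frame
        return im

    def pixel(self, n, x, y):
        """Pixel (x, y) of frame n, assuming row-major storage."""
        return self.getimage(n)[x + y * self.width]

    def close(self):
        self._f.close()
```

Usage would be along the lines of `frames = FrameFile('big_file.bin')` followed by `frames.pixel(23, x, y)`; repeated access to the same few frames then avoids rereading the file.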


I know I can do something like
f=open("xray.seq", 'rb')
header=f.read(128)
file1=f.read(2*1024*768)
file2=f.read(2*1024*768)
......
f.close()

But I don't know how to save the files in integer format (converting from
binary to ASCII files) or how to do this in an elegant and snappy way.

Best is probably to leave the original format alone, e.g., (untested and needs try/except)
this should split the big file into individual image files named file0.ximg .. filen.ximg

f = open('xray.seq', 'rb')
header = f.read(128)
nfile = 0
while 1:
    im = f.read(2*1024*768)
    if not im: break
    if len(im) != 2*1024*768:
        print('broken tail of %s bytes' % len(im))
        break
    fw = open('file%s.ximg' % nfile, 'wb')
    fw.write(im)
    fw.close()
    nfile += 1
f.close()

then you could use getimage above with offset passed as 0 and image number 0, e.g.,

im23 = getimage(0, open('file23.ximg','rb'), 0) # img 0, offset 0

But then you might wonder about all those separate files, unless you want to
put them on a series of CDs where they wouldn't all fit on one. Whatever ;-)

You will probably lose in both speed and space if you try to make some kind
of ascii disk files. You aren't thinking XML are you??!! For this, definitely ick ;-)
What you want to do will depend on the big picture, which is not apparent yet ;-)
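
To put a rough number on the space cost, here is a small sketch (Python 3, helper names made up for illustration) that compares the size of one frame stored as 2-byte binary values with the same frame written as space-separated decimal text. It assumes unsigned 16-bit pixels and one separator character per value.

```python
def text_bytes(values):
    """Bytes needed for `values` as space-separated decimals on one line."""
    return len(" ".join(map(str, values))) + 1   # +1 for the trailing newline


def binary_bytes(values, itemsize=2):
    """Bytes needed for the same values as fixed-width integers."""
    return len(values) * itemsize


pixels = 1024 * 768
dark = [0] * pixels            # every pixel is a single digit
saturated = [65535] * pixels   # every pixel needs five digits

print(binary_bytes(dark))      # 1572864
print(text_bytes(dark))        # 1572864
print(text_bytes(saturated))   # 4718592
```

So under these assumptions a text frame is never smaller than the binary one and can be up to three times larger, before counting the time spent converting each value to and from decimal.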
Please reply when you guys get a chance.
Thanks,


Sorry to give nothing but untested suggestions, but I have to go, and I
will be mostly offline for a while.

Regards,
Bengt Richter
Jul 18 '05 #3
