
Binary file output using python

P: n/a
Hi,
Is there a way in python to output binary files? I need python to
write out a stream of 5 million floating point numbers, separated by
some separator, but it seems that all python supports natively is string
information output, which is extremely space inefficient.

I'd tried using the pickle module, but it crashed whenever I tried using
it due to the large amount of data involved.

Thanks for your help!
Apr 17 '07 #1
7 Replies


P: n/a
On Apr 17, 12:41 pm, Chi Yin Cheung <c-che...@northwestern.edu> wrote:
> Hi,
> Is there a way in python to output binary files? I need python to
> write out a stream of 5 million floating point numbers, separated by
> some separator, but it seems that all python supports natively is string
> information output, which is extremely space inefficient.
>
> I'd tried using the pickle module, but it crashed whenever I tried using
> it due to the large amount of data involved.
>
> Thanks for your help!

You can create a binary file by doing something like this:

f = open(r'filename', 'wb')
f.write('1,2,3,4,5,6')
f.close()

See also: http://www.devshed.com/c/a/Python/Fi...ent-in-Python/

Have fun!

Mike

Apr 17 '07 #2

P: n/a
Chi Yin Cheung wrote:
> Hi,
> Is there a way in python to output binary files? I need python to
> write out a stream of 5 million floating point numbers, separated by
> some separator, but it seems that all python supports natively is string
> information output, which is extremely space inefficient.

I recommend using PyTables for this sort of thing. It also allows you to
choose from several compression algorithms. I'm using it to store files
with 22000 x (2000, 12) datasets, or 528 million Float64s.
--
Michael Hoffman
Apr 17 '07 #3

P: n/a
Michael Hoffman wrote:
> Chi Yin Cheung wrote:
>> Hi,
>> Is there a way in python to output binary files? I need python to
>> write out a stream of 5 million floating point numbers, separated by
>> some separator, but it seems that all python supports natively is
>> string information output, which is extremely space inefficient.
>
> I recommend using PyTables for this sort of thing. It also allows you to
> choose from several compression algorithms. I'm using it to store files
> with 22000 x (2000, 12) datasets, or 528 million Float64s.

Addendum: it should also deal with endianness issues I wouldn't want to
handle myself, so your code will also be portable.
--
Michael Hoffman
Apr 17 '07 #4
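
To illustrate the endianness point in the post above: the same float has two possible byte layouts, so a file written on one machine can be misread on another unless the byte order is fixed. A minimal sketch using the standard struct module in Python 3 (an assumption, the thread itself uses Python 2); the value 1.5 is just a sample:

```python
import struct

x = 1.5  # sample value, chosen only for illustration

# "<" forces little-endian, ">" forces big-endian; "d" is an 8-byte double.
little = struct.pack("<d", x)
big = struct.pack(">d", x)

# Same eight bytes, opposite order.
print(little.hex())
print(big.hex())

# Reading with the byte order that was used for writing recovers the value.
roundtrip = struct.unpack("<d", little)[0]

# Reading with the wrong byte order silently yields a different number.
misread = struct.unpack(">d", little)[0]
```

Writing an explicit "<" or ">" in every format string is one way to keep such files portable without a library handling it for you.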

P: n/a
On Tue, 17 Apr 2007 11:07:38 -0700, kyosohma wrote:
> On Apr 17, 12:41 pm, Chi Yin Cheung <c-che...@northwestern.edu> wrote:
>> Hi,
>> Is there a way in python to output binary files? I need python to
>> write out a stream of 5 million floating point numbers, separated by
>> some separator, but it seems that all python supports natively is
>> string information output, which is extremely space inefficient.

I don't understand. To me it seems like there is no space difference:

[thomas@localhost ~]$ python
Python 2.4.4 (#1, Oct 23 2006, 13:58:00)
[GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> f = open("test2", "w")
>>> f.write(str(range(10**7)))
>>> f.close()
>>> f = open("test", "wb")
>>> f.write(str(range(10**7)))
>>> f.close()
[thomas@localhost ~]$ ls -l test test2
-rw-rw-r-- 1 thomas thomas 88888890 17 apr 22:28 test
-rw-rw-r-- 1 thomas thomas 88888890 17 apr 22:27 test2
[thomas@localhost ~]$
Apr 17 '07 #5
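
A note on the comparison above: both files contain the same str() text, so the file mode cannot change their size. The saving the original poster is after comes from changing how each number is encoded, not from the "b" flag. A minimal sketch in Python 3 (an assumption, the session above is Python 2.4), with made-up sample values:

```python
import struct

# Sample data, invented for illustration only.
values = [float(i) / 3 for i in range(1000)]

# Text form: repr() of each float plus a one-character separator.
as_text = ",".join(repr(v) for v in values).encode("ascii")

# Binary form: each value as an 8-byte IEEE 754 double, no separator needed
# because every record has the same width.
as_binary = struct.pack("<%dd" % len(values), *values)

print(len(as_text), len(as_binary))
```

The binary form is always exactly 8 bytes per value; how large the text form gets depends on how many digits each number needs.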

P: n/a
On Apr 17, 10:30 pm, Thomas Dybdahl Ahle <lob...@gmail.com> wrote:
> On Tue, 17 Apr 2007 11:07:38 -0700, kyosohma wrote:
>> On Apr 17, 12:41 pm, Chi Yin Cheung <c-che...@northwestern.edu> wrote:
>>> Hi,
>>> Is there a way in python to output binary files? I need python to
>>> write out a stream of 5 million floating point numbers, separated by
>>> some separator, but it seems that all python supports natively is
>>> string information output, which is extremely space inefficient.
>
> I don't understand. To me it seems like there is no space difference:
>
> [thomas@localhost ~]$ python
> Python 2.4.4 (#1, Oct 23 2006, 13:58:00)
> [GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> f = open("test2", "w")
> >>> f.write(str(range(10**7)))
> >>> f.close()
> >>> f = open("test", "wb")
> >>> f.write(str(range(10**7)))
> >>> f.close()
>
> [thomas@localhost ~]$ ls -l test test2
> -rw-rw-r-- 1 thomas thomas 88888890 17 apr 22:28 test
> -rw-rw-r-- 1 thomas thomas 88888890 17 apr 22:27 test2
> [thomas@localhost ~]$

That's OK, but he might also take a look at the 'struct' module, which
can solve the "stream of 5 million floating point numbers, separated by
some separator" part of the issue (if binary format is needed). From
the python docs...

>>> from struct import *
>>> pack('hhl', 1, 2, 3)
'\x00\x01\x00\x02\x00\x00\x00\x03'
>>> unpack('hhl', '\x00\x01\x00\x02\x00\x00\x00\x03')
(1, 2, 3)
>>> calcsize('hhl')
8
Apr 17 '07 #6
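
The docs excerpt above packs integers; for the original question the same module can pack floats with the "d" format code. A rough sketch in Python 3 (an assumption, the thread predates it). The helper names write_doubles and read_doubles, the chunk size, and the file name are all hypothetical:

```python
import os
import struct
import tempfile

def write_doubles(path, values, chunk=65536):
    """Write an iterable of floats as little-endian 8-byte doubles."""
    with open(path, "wb") as f:
        buf = []
        for v in values:
            buf.append(v)
            if len(buf) == chunk:
                # Pack a whole chunk in one call to limit per-value overhead.
                f.write(struct.pack("<%dd" % len(buf), *buf))
                buf = []
        if buf:
            # Flush whatever is left over after the last full chunk.
            f.write(struct.pack("<%dd" % len(buf), *buf))

def read_doubles(path):
    """Read back every double in the file as a list of floats."""
    with open(path, "rb") as f:
        data = f.read()
    count = len(data) // 8
    return list(struct.unpack("<%dd" % count, data))

# Usage with invented sample data and a temporary file.
values = [i * 0.25 for i in range(100000)]
path = os.path.join(tempfile.mkdtemp(), "floats.bin")
write_doubles(path, values)
size = os.path.getsize(path)
restored = read_doubles(path)
```

No separator is written: every record is 8 bytes wide, so the n-th value starts at byte offset 8 * n.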

P: n/a
Chi Yin Cheung wrote:
> Is there a way in python to output binary files? I need python to
> write out a stream of 5 million floating point numbers, separated by
> some separator, but it seems that all python supports natively is string
> information output, which is extremely space inefficient.
>
> I'd tried using the pickle module, but it crashed whenever I tried using
> it due to the large amount of data involved.

A minimalistic alternative is array.tofile()/fromfile(), but pickle should
handle a list, say, of 5 million floating point numbers just fine. What
exactly are you doing to provoke a crash, and what does it look like?
Please give minimal code and the traceback.

Peter
Apr 18 '07 #7
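
For reference, the array.tofile()/fromfile() route mentioned above might look like the following in Python 3 (an assumption, the thread is from the Python 2 era). The sample data and file name are made up:

```python
import array
import os
import tempfile

# Type code "d" stores C doubles in one compact, contiguous buffer.
values = array.array("d", (i * 0.5 for i in range(100000)))

path = os.path.join(tempfile.mkdtemp(), "floats.bin")
with open(path, "wb") as f:
    values.tofile(f)  # raw machine values, no header and no separators

restored = array.array("d")
with open(path, "rb") as f:
    # fromfile() needs the item count; here it is known from the writer.
    restored.fromfile(f, len(values))

size = os.path.getsize(path)
```

The file holds the values in the writing machine's native byte order, so this is the minimal option rather than the portable one.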

P: n/a
Peter Otten <__*******@web.de> wrote:
> Chi Yin Cheung wrote:
>> Is there a way in python to output binary files? I need python to
>> write out a stream of 5 million floating point numbers, separated by
>> some separator, but it seems that all python supports natively is string
>> information output, which is extremely space inefficient.
>>
>> I'd tried using the pickle module, but it crashed whenever I tried using
>> it due to the large amount of data involved.
>
> A minimalistic alternative is array.tofile()/fromfile(), but pickle should
> handle a list, say, of 5 million floating point numbers just fine. What
> exactly are you doing to provoke a crash, and what does it look like?
> Please give minimal code and the traceback.

cPickle worked fine when I tried it...
>>> L=map(float, range(5000000))
>>> import cPickle
>>> out=file("z", "wb")
>>> cPickle.dump(L, out, -1)
>>> out.close()
>>> inp=file("z", "rb")
>>> K=cPickle.load(inp)
>>> inp.close()
>>> import os
>>> os.system("ls -l z")
-rw-r--r-- 1 ncw ncw 45010006 Apr 19 18:43 z
0
>>>
This indicates that each float took 9 bytes to store, which is 1 byte more
than a 64-bit float would normally take.

The pickle dump / load each took about 2 seconds.

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Apr 19 '07 #8
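
The 9-bytes-per-float figure above can be checked in memory with the pickle module in Python 3 (an assumption, the session above uses Python 2's cPickle). As far as I know, the extra byte is the one-byte opcode that binary pickle protocols put in front of each 8-byte float. Protocol 2 is chosen here only to stay close to what was available in 2007; the list size is arbitrary:

```python
import pickle

# Smaller than the 5 million in the thread, to keep the check quick.
values = [float(i) for i in range(100000)]

# Serialize to bytes in memory instead of writing a file.
blob = pickle.dumps(values, protocol=2)
restored = pickle.loads(blob)

# Average cost per float, including the small fixed list overhead.
bytes_per_float = len(blob) / len(values)
print(bytes_per_float)
```

So a pickle of plain floats costs only slightly more than a raw 8-byte dump, and in exchange the file records its own structure.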
