
Binary file Pt 1 - Only reading some

I'm trying to create a program to read a certain binary format. I have
the format's spec which goes something like:

First 6 bytes: String
Next 4 bytes: 3 digit number and a blank byte
---
Next byte: Height (Number up to 255)
Next byte: Width (Number up to 255)
Next byte: Number 0 - 5
Every 2 bytes after that: Supposedly a number 0000 - 0899?

Anyway, I'm able to do the first 2 objects fine:

a = info.read(6)
b = info.read(4)

Printing both gives me what I mentioned above, a string and a 3 digit
number with a space. However, as I continue, things get trickier.

c = info.read(1)
d = info.read(1)

Printing c and d in this case gives me a block in the SPE output, or
if I run in a DOS prompt, 2 funny symbols. How do I get an integer out
of this? I'll probably need help once I get to the "every 2 byte"
section, but that'll be a separate post.
Feb 5 '08 #1


On Feb 5, 01:51, Mastastealth <mastastea...@gmail.com> wrote:
> c = info.read(1)
> d = info.read(1)
>
> Printing c and d in this case gives me a block in the SPE output, or
> if I run in a DOS prompt, 2 funny symbols. How do I get an integer out
> of this?

Use the struct module: http://docs.python.org/lib/module-struct.html

import struct
data = info.read(15)
str1, str2, blank, height, width, num2, num3 = \
    struct.unpack("6s3s1cBBBh", data)

Consider this a "first attempt". Open issues: is the data little-endian
or big-endian? Does the 0-5 mean 0x00-0x05 or "0"-"5"? Are the last
numbers 2-byte binary integers, or might 0000-0899 indicate BCD?
But building the right format is surely faster and easier than parsing
the data by hand.
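
A minimal sketch of the same idea in Python 3, run against made-up bytes so it can be tried without the real file. It assumes the data is little-endian with no padding between fields (the "<" prefix); the sample header and the names `magic`, `version`, `kind`, and `first` are hypothetical and not taken from the actual format.

```python
import struct

# Hypothetical 15-byte header: 6-byte string, 3 ASCII digits, 1 blank byte,
# height, width, a small number, and the first 2-byte value.
sample = b"TILMAP" + b"102" + b" " + bytes([20, 30, 3]) + struct.pack("<h", 899)

# "<" selects little-endian, standard sizes, and no alignment padding.
# This is an assumption about the file, not something the spec confirms.
HEADER = struct.Struct("<6s3scBBBh")

magic, version, blank, height, width, kind, first = HEADER.unpack(sample)
print(magic, version, height, width, kind, first)
```

With a real file, `sample` would be replaced by `info.read(HEADER.size)`, with the file opened in binary mode.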

--
Gabriel Genellina
Feb 5 '08 #2

On Feb 5, 1:17 am, Gabriel Genellina <gagsl-...@yahoo.com.ar> wrote:
> Using the struct module http://docs.python.org/lib/module-struct.html
>
> import struct
> data = info.read(15)
> str1, str2, blank, height, width, num2, num3 =
> struct.unpack("6s3s1cBBBh", data)
>
> Consider this like a "first attempt", open issues: is the data
> little-endian or big-endian? does the 0-5 mean 0x00-0x05 or "0"-"5"?
> the last numbers are 2-byte binary integers, or 0000-0899 might
> indicate BCD? But building the right format is surely faster and
> easier than parsing the data by hand.

Ah ok, thanks! That worked, though the line "str1, str2, blank,
height, width, num2, num3 =" spit out a syntax error. However, I do
see that it creates a tuple with all the values in a readable format
for me. Also, I needed to change info.read(15) to 16. More questions:

What is this value for? "6s3s1cBBBh" and why is my unpack limited to a
length of "16"?

Unfortunately it seems my understanding of binary is way too basic for
what I'm dealing with. Can you point me to a simple guide
explaining most of it? As far as I know this is just a bunch of 1's
and 0's, right? Each byte has 8 digits, which are somehow
converted to a number or letter. I don't know what most of that stuff in
the struct page means. -_-

As for your questions, I suppose it would be "little-endian" as the
format is on PC (and the Python docs say: "Intel and DEC processors
are little-endian"). 0-5 means a single digit "0" through "5". Lastly,
I'm not building the format, it's already made (a format for tiled
maps in a game). My program is just reading it.
Feb 5 '08 #3

On Feb 5, 8:50 am, Mastastealth <mastastea...@gmail.com> wrote:
> What is this value for? "6s3s1cBBBh" and why is my unpack limited to a
> length of "16"?
>
> Unfortunately it seems my understanding of binary is way too basic for
> what I'm dealing with. Can you point me to a simple guide
> explaining most of it? As far as I know this is just a bunch of 1's
> and 0's, right? Each byte has 8 digits, which are somehow
> converted to a number or letter. I don't know what most of that stuff
> in the struct page means. -_-

Ah never mind, after closer inspection it was something on my side of
the code that ruined the tuple unpacking. Also, I now understand (a
little) what the string meant.

6s = A string of 6 characters
3s = String with 3 characters
1c = For the blank character
B's = An unsigned char, any number I guess?
h = short integer. Whatever that is.

Either way, just answering myself so you don't have to explain it. :D
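
For the two codes left open in that list, a short Python 3 sketch: "B" is one unsigned byte (0 to 255), and "h" is a signed 2-byte integer (-32768 to 32767), with "H" as its unsigned counterpart. The byte values below are made up, and little-endian order ("<") is an assumption.

```python
import struct

# "B": a single unsigned byte, so any whole number from 0 to 255.
(height,) = struct.unpack("B", b"\xff")

# "h": a signed 2-byte ("short") integer; "H" is the unsigned version.
# 0x0383 is 899, stored low byte first because of the "<" prefix.
(signed,) = struct.unpack("<h", b"\x83\x03")
(unsigned,) = struct.unpack("<H", b"\x83\x03")

# The two codes only disagree once the top bit is set.
(wrapped,) = struct.unpack("<h", b"\xff\xff")
(large,) = struct.unpack("<H", b"\xff\xff")

print(height, signed, unsigned, wrapped, large)  # 255 899 899 -1 65535
```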
Feb 5 '08 #4

On Tue, 05 Feb 2008 11:50:25 -0200, Mastastealth <ma**********@gmail.com>
wrote:
> What is this value for? "6s3s1cBBBh" and why is my unpack limited to a
> length of "16"?

That's the format describing how to decode your bytes: first, a string of
six characters; then a string of 3 characters followed by a single
character, three individual bytes, and a short integer.
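
On the "16" part of the question: the fields add up to 6 + 3 + 1 + 1 + 1 + 1 + 2 = 15 bytes. A likely explanation for the extra byte (an inference, not something confirmed in this thread) is that a format string without a byte-order prefix uses the platform's native alignment, which on common machines inserts one padding byte so that the short starts at an even offset. `struct.calcsize` shows the difference:

```python
import struct

# No prefix: native sizes and alignment, which may insert padding bytes.
native_size = struct.calcsize("6s3s1cBBBh")

# "<" prefix: little-endian, standard sizes, no padding at all.
packed_size = struct.calcsize("<6s3s1cBBBh")

print(native_size, packed_size)  # typically 16 and 15
```

If the file itself has no padding byte before the 2-byte number, the "<" form is probably the one that matches the data.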

> Unfortunately it seems my understanding of binary is way too basic for
> what I'm dealing with. Can you point me to a simple guide
> explaining most of it? As far as I know this is just a bunch of 1's
> and 0's, right? Each byte has 8 digits, which are somehow
> converted to a number or letter. I don't know what most of that stuff
> in the struct page means. -_-

I'm afraid I don't know any simple guide to struct. Perhaps people think
that if you manage binary data with struct, you must already be a Python
guru or something...
You could try "construct" instead: "Construct is a python library for
parsing and building of data structures (binary or textual). It is based
on the concept of defining data structures in a declarative manner, rather
than procedural code. [...] It's the first library that makes parsing fun,
instead of the usual headache it is today."
http://pyconstruct.wikispaces.com

There are some recipes in the Python Cookbook too:
http://aspn.activestate.com/ASPN/Cookbook/Python

--
Gabriel Genellina

Feb 5 '08 #5
