
# Problem with unpack hex to decimal

Hello,

I was looking at this:
http://docs.python.org/lib/module-struct.html
and tried the following
>>> import struct
>>> struct.calcsize('h')
2
>>> struct.calcsize('b')
1
>>> struct.calcsize('bh')
4

I would have expected
>>> struct.calcsize('bh')
3

what am I missing ?

Jake.

Jul 19 '05 #1
"se*******@gmail.com" wrote:
I was looking at this:
http://docs.python.org/lib/module-struct.html
and tried the following
>>> import struct
>>> struct.calcsize('h')
2
>>> struct.calcsize('b')
1
>>> struct.calcsize('bh')
4

I would have expected
>>> struct.calcsize('bh')
3

what am I missing ?

the sentence

"By default, C numbers are represented in the machine's native format
and byte order, and properly aligned by skipping pad bytes if necessary
(according to the rules used by the C compiler)."

and the text and table following that sentence.

</F>

Jul 19 '05 #2

<se*******@gmail.com> wrote in message
Hello,

I was looking at this:
http://docs.python.org/lib/module-struct.html
and tried the following
>>> import struct
>>> struct.calcsize('h')
2
>>> struct.calcsize('b')
1
>>> struct.calcsize('bh')
4

I would have expected
>>> struct.calcsize('bh')
3

what am I missing ?

Not sure, however I also find the following confusing:
>>> struct.calcsize('hb')
3
>>> struct.calcsize('hb') == struct.calcsize('bh')
False

I could understand aligning to multiples of 4, but why is 'hb' different
from 'bh'?
Jul 19 '05 #3
... I also find the following confusing:
>>> struct.calcsize('hb')
3
>>> struct.calcsize('hb') == struct.calcsize('bh')
False

I could understand aligning to multiples of 4, but why is 'hb' different
from 'bh'?

It looks like padding is not added at the end of the struct, only
between fields, when alignment requires it. In 'bh' the short needs a
pad byte, but the trailing one doesn't need padding.
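
The point about the trailing field can be checked directly. The sketch below relies on a documented feature of native mode: ending a format with a type code and a repeat count of zero (here `0h`) aligns the end of the struct to that type's boundary. The variable names are made up for illustration, and the exact numbers depend on the platform.

```python
import struct

# Native mode ('@'): pad bytes go between fields, not after the last one.
plain = struct.calcsize("@hb")

# A zero-count 'h' at the end asks for the end of the struct to be
# aligned as if another short followed, which adds trailing padding.
padded = struct.calcsize("@hb0h")

print("hb   ->", plain)
print("hb0h ->", padded)
```

If `padded` comes out larger than `plain`, the difference is the trailing padding that the struct module leaves out by default.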

-Peter
Jul 19 '05 #4
On Sun, 17 Apr 2005 20:47:20 +0100, "Jonathan Brady"
<no****@denbridgedigital.com> wrote:

<se*******@gmail.com> wrote in message
Hello,

I was looking at this:
http://docs.python.org/lib/module-struct.html
and tried the following
>>> import struct
>>> struct.calcsize('h')
2
>>> struct.calcsize('b')
1
>>> struct.calcsize('bh')
4

I would have expected
>>> struct.calcsize('bh')
3

what am I missing ?

A note for the original poster: "unpack hex to decimal" (the subject
of your post) is a misnomer; hexadecimal and decimal are just ways of
representing the *same* number.

Let's take an example of a two-byte piece of data. Suppose the first
byte has all bits set (== 1) and the second byte has all bits clear
(== 0). The first byte's value is hexadecimal FF or decimal 255,
whether or not you unpack it, if you are interpreting it as an
unsigned number ('B' format). Signed ('b' format) gives you
hexadecimal -1 and decimal -1. The second byte's value is 0
hexadecimal and 0 decimal however you interpret it.

Suppose you want to interpret the two bytes as together representing a
16-bit signed number (the 'h' format). If the byte order is
little-endian, the result is hex FF and decimal 255; otherwise it's
hex -100 and decimal -256.
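
A small sketch of the two-byte example above, for anyone who wants to try it. The variable names are invented here, and the `<` and `>` prefixes are used to pin the byte order instead of relying on whatever the platform does.

```python
import struct

data = b"\xff\x00"  # first byte all bits set, second byte all bits clear

# First byte on its own: unsigned ('B') versus signed ('b').
unsigned_first = struct.unpack("B", data[:1])[0]
signed_first = struct.unpack("b", data[:1])[0]

# Both bytes together as one 16-bit signed number, explicit byte order.
little = struct.unpack("<h", data)[0]
big = struct.unpack(">h", data)[0]

# Decimal and hex are two spellings of the same unpacked integer.
for label, value in [("B", unsigned_first), ("b", signed_first),
                     ("<h", little), (">h", big)]:
    print(label, value, hex(value))
```

Nothing is converted between hex and decimal at any point: `unpack` returns an int, and `hex()` or `print` merely choose how to display it.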

Not sure, however I also find the following confusing:
>>> struct.calcsize('hb')
3
>>> struct.calcsize('hb') == struct.calcsize('bh')
False

I could understand aligning to multiples of 4,

Given we know nothing about the OP's platform or your platform, "4" is
no more understandable than any other number.

but why is 'hb' different
from 'bh'?

Likely explanation: the C compiler aligns n-byte items on an n-byte
boundary. Thus in 'hb', the h is at offset 0, and the b can start OK
at offset 2, for a total size of 3. With 'bh', the b is at offset 0,
but the h can't (under the compiler's rules) start at 1, it must start
at 2, for a total size of 4.
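
The rule described above can be written out as a few lines of Python. This is a sketch of the "n-byte items on an n-byte boundary" rule only, not the struct module's actual implementation; the `SIZES` table and the `layout` helper are made up for illustration and assume a 1-byte 'b' and a 2-byte 'h'.

```python
import struct

# Assumed sizes for this sketch: 'b' is 1 byte, 'h' is 2 bytes.
SIZES = {"b": 1, "h": 2}

def layout(fmt):
    """Return (offsets, total_size), aligning each n-byte item on an n-byte boundary."""
    offsets = []
    pos = 0
    for code in fmt:
        size = SIZES[code]
        pos += -pos % size  # skip pad bytes up to the next multiple of size
        offsets.append(pos)
        pos += size
    return offsets, pos  # nothing is added after the last item

for fmt in ("hb", "bh"):
    # struct.calcsize is printed alongside for comparison on your platform.
    print(fmt, layout(fmt), "struct.calcsize:", struct.calcsize(fmt))
```

Under those assumptions `layout('hb')` gives offsets 0 and 2 with a total of 3, and `layout('bh')` gives offsets 0 and 2 with a total of 4, matching the sizes reported earlier in the thread.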

Typically, you would use "native" byte ordering and alignment (the
default) only where you are accessing data in a C struct that is in
code that is compiled on your platform [1]. When you are picking apart
a file that has been written elsewhere, you will typically need to
read the documentation for the file format and/or use trial & error to
determine which prefix (@, <, >) you should use. If I had to guess for
you, I'd go for "<".

[1] Care may need to be taken if the struct is defined in source
compiled by a compiler *other* than the one used to compile your
Python executable -- there's a slight chance you might need to fiddle
with the "foreign" compiler's alignment options to make it suit.
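
To see what the prefixes change, something along these lines may help. With `=`, `<` or `>` the struct module uses standard sizes and no alignment, so 'bh' takes 3 bytes; only native mode (`@`, the default) inserts pad bytes, and its output depends on the platform, so no result is claimed for it here.

```python
import struct

# Same two values, packed under each byte-order/alignment prefix.
for prefix in ("@", "=", "<", ">"):
    fmt = prefix + "bh"
    print(repr(fmt), struct.calcsize(fmt), struct.pack(fmt, 1, 2))

# Standard modes are the same everywhere: no pad byte, fixed byte order.
little = struct.pack("<bh", 1, 2)
big = struct.pack(">bh", 1, 2)
```

Here `little` is `b'\x01\x02\x00'` and `big` is `b'\x01\x00\x02'`, which is why a file written elsewhere should be read with the prefix its format specifies.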

HTH,

John

Jul 19 '05 #5
<se*******@gmail.com> wrote in message
Hello,

I was looking at this:
http://docs.python.org/lib/module-struct.html
and tried the following

>>> import struct
>>> struct.calcsize('h')
2
>>> struct.calcsize('b')
1
>>> struct.calcsize('bh')
4

I would have expected

>>> struct.calcsize('bh')
3

what am I missing ?

Not sure, however I also find the following confusing:
>>> struct.calcsize('hb')
3
>>> struct.calcsize('hb') == struct.calcsize('bh')
False

I could understand aligning to multiples of 4, but why is 'hb' different
from 'bh'?

Evidently, shorts need to be aligned at an even address on your
platform. Consider the following layout, where `b' represents the signed
char, `h' represents the bytes occupied by the short and `X' represents
unused bytes (due to alignment).

'bh', a signed char followed by a short would look like:

bXhh -- or four bytes, but 'hb', a short followed by a signed char would be:

hhb (as `char' and its siblings have no alignment requirements)
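
The `X` in the layout above has a direct counterpart in the struct module: the `x` format code, which stands for one pad byte. As a sketch, the `bXhh` layout can be spelled out by hand in a mode that does no automatic alignment (little-endian `<` is an arbitrary choice made here so the bytes are the same on every machine).

```python
import struct

# '<' turns off automatic alignment, so the pad byte is written explicitly.
with_pad = struct.pack("<bxh", 1, 2)  # b, X, h, h
no_pad = struct.pack("<hb", 2, 1)     # h, h, b

print(len(with_pad), with_pad)
print(len(no_pad), no_pad)
```

The first result is four bytes with a zero pad byte in second position (`b'\x01\x00\x02\x00'`); the second is three bytes (`b'\x02\x00\x01'`).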

HTH,
--ag

--
Artie Gold -- Austin, Texas
http://it-matters.blogspot.com (new post 12/5)
http://www.cafepress.com/goldsays
Jul 19 '05 #6
Thank you all very much. It looked like I was not the only one
confused.

Jake.

Jul 19 '05 #7
