Struggling with struct.unpack() and "p" format specifier

Hope someone can help.
I am trying to read data from a binary file and then unpack the
data into python variables. Some of the data is stored like this;

xbuffer: '\x00\x00\xb9\x02\x13EXCLUDE_CREDIT_CARD'
# the above was printed using repr(xbuffer).
# Note that int(0x13) = 19, which is exactly the length of the visible
# text.

In the code I have the following statement;
x = st.unpack('>xxBBp', xbuffer)

This throws out the following error;

x = st.unpack('>xxBBp', xbuffer)
error: unpack str size does not match format

As I read the documentation, the "p" format string seems to address
this situation, where the number of bytes of the string to read is the
first byte of the stored value, but I keep getting this error.

Am I missing something ?
Can the "p" format character be used to unpack this type of data ?

As I mentioned, I can parse the string and read it with multiple
statements, I am just looking for a more efficient solution.

Thanks.
Jul 18 '05 #1
[Geoffrey <ge********@hotmail.com>]
I am trying to read data from a binary file and then unpack the
data into python variables. Some of the data is stored like this;

xbuffer: '\x00\x00\xb9\x02\x13EXCLUDE_CREDIT_CARD'
# the above was printed using repr(xbuffer).
# Note that int(0x13) = 19, which is exactly the length of the visible
# text.

In the code I have the following statement;
x = st.unpack('>xxBBp', xbuffer)

This throws out the following error;

x = st.unpack('>xxBBp', xbuffer)
error: unpack str size does not match format

As I read the documentation, the "p" format string seems to
address this situation, where the number of bytes of the string to
read is the first byte of the stored value, but I keep getting this
error.


....

Well, the docs mean it when they say:

Note that for unpack(), the "p" format character consumes count
bytes

You don't have an explicit count in front of your "p" code, so count
defaults to 1, so only one byte of xbuffer will get consumed.

This works, telling struct that this particular p field consumes 20
bytes (including the string-length byte):
>>> struct.unpack('>xxBB20p', xbuffer)
(185, 2, 'EXCLUDE_CREDIT_CARD')

Or, a bit more generally, assuming your p field is always at the end,
and is preceded by 4 bytes:
>>> struct.unpack('>xxBB%dp' % (len(xbuffer) - 4), xbuffer)
(185, 2, 'EXCLUDE_CREDIT_CARD')

Note that there's no direct support for any kind of variable-width
data in struct. The number of bytes involved has to be deducible from
the format string alone.
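
To see that the size comes from the format string alone, struct.calcsize can be checked directly. This is only a minimal sketch, written for Python 3, where the buffer is a bytes object and the string comes back as bytes (the thread itself uses Python 2 str values):

```python
import struct

xbuffer = b'\x00\x00\xb9\x02\x13EXCLUDE_CREDIT_CARD'

# A bare "p" means "1p": the field is exactly one byte wide.
bare_size = struct.calcsize('>xxBBp')      # 2 pad + 2 unsigned bytes + 1
sized_size = struct.calcsize('>xxBB20p')   # 2 pad + 2 unsigned bytes + 20

# Only a format whose size equals len(xbuffer) can unpack the buffer.
fields = struct.unpack('>xxBB20p', xbuffer)

print(bare_size, sized_size, len(xbuffer))
print(fields)
```

The bare format describes 5 bytes while the buffer holds 24, which is why the original call fails before any length byte is looked at.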
Jul 18 '05 #2
Geoffrey wrote:
I am trying to read data from a binary file and then unpack the
data into python variables. Some of the data is stored like this;
....
As I read the documentation, the "p" format string seems to address
this situation, where the number of bytes of the string to read is the
first byte of the stored value, but I keep getting this error.

Am I missing something ?
Can the "p" format character be used to unpack this type of data ?


I've tried experimenting with "p" and cannot get any meaningful
results. In all cases pack() returns '\x00' while unpack()
with anything other than a one-byte string returns an exception
(unpack str size does not match format) while with a one-byte
string it always returns ('',).
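
These observations can be reproduced in a few lines. The sketch below assumes Python 3 (bytes in, bytes out) rather than the 2.3/2.4 interpreters mentioned here, and the sample inputs are made up. With no count, the "p" field is one byte wide, which leaves room for the length byte and nothing else:

```python
import struct

# With no count, "p" is "1p": one byte total, so zero bytes of payload.
packed = struct.pack('p', b'hello')        # the payload is truncated away

# Whatever the single byte says, at most count - 1 == 0 bytes come back.
unpacked = struct.unpack('p', b'\x05')

# Any other input length does not match calcsize('p') == 1.
try:
    struct.unpack('p', b'\x05hello')
    size_error = None
except struct.error as exc:
    size_error = exc

print(packed, unpacked, size_error)
```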

I would be inclined to say that the "p" format in struct (using
Python 2.4rc1 or Python 2.3.3) does not act as documented on
Windows XP SP2, at least...

I hope we've both just missed something obvious.

-Peter
Jul 18 '05 #3
Peter Hansen wrote:
I would be inclined to say that the "p" format in struct (using
Python 2.4rc1 or Python 2.3.3) does not act as documented on
Windows XP SP2, at least...

I hope we've both just missed something obvious.


Okay, we were certainly missing something, but I don't believe
I would call it obvious.

I can't deduce from the documentation the fact that the "p"
format requires a length *in front of the p in the format string*.

Furthermore, it assumes a length of 1 if one is not specified.

And there is no example that shows how to do it correctly.
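
For reference, a round trip with an explicit count might look like the following. It is a sketch for Python 3 using made-up sample strings; the count in front of "p" is the total field width, including the length byte:

```python
import struct

# "10p" is a 10-byte field: 1 length byte + up to 9 bytes of text.
short = struct.pack('10p', b'abc')                # NUL-padded to 10 bytes
long_ = struct.pack('10p', b'abcdefghijklmnop')   # text truncated to 9 bytes

short_back = struct.unpack('10p', short)
long_back = struct.unpack('10p', long_)

print(short, short_back)
print(long_, long_back)
```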

(I did Google searches and found examples, but by then I
was looking for a bug report and didn't even think to look
at the examples themselves. :-( )

Doc bug? Did anyone else find the documentation on "p"
to be clear and effective?

-Peter
Jul 18 '05 #4
Geoffrey wrote:
As I mentioned, I can parse the string and read it with multiple
statements, I am just looking for a more efficient solution.


This looks like about the best you can do, using the information
from Tim's reply:
>>> buf = '\0\0\xb9\x02\x13EXCLUDE_CREDIT_CARD'
>>> import struct
>>> x = struct.unpack('>xxBB%sp' % (ord(buf[4])+1), buf)
>>> x
(185, 2, 'EXCLUDE_CREDIT_CARD')

If you wanted to avoid hard-coding the 4, you would
be most correct to do this:

header = '>xxBB'
lenIndex = struct.calcsize(header)
x = struct.unpack('%s%dp' % (header, ord(buf[lenIndex])+1), buf)

.... though that doesn't exactly make it all that readable.
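
In Python 3 the same idea gets slightly shorter, because indexing a bytes object already yields an integer and ord() is not needed. A minimal sketch, where unpack_record is a hypothetical helper name and the record is assumed to end with the "p" field:

```python
import struct

HEADER = '>xxBB'                       # fixed-size part, from the thread
LEN_INDEX = struct.calcsize(HEADER)    # offset of the string's length byte

def unpack_record(buf):
    """Unpack the header plus one trailing length-prefixed string."""
    count = buf[LEN_INDEX] + 1         # length byte + the text it announces
    return struct.unpack('%s%dp' % (HEADER, count), buf)

buf = b'\0\0\xb9\x02\x13EXCLUDE_CREDIT_CARD'
print(unpack_record(buf))
```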

-Peter
Jul 18 '05 #5
Thanks for your response.

I guess the documentation on the p format wasn't clear to me ... or
perhaps I was just hoping too much for an easy solution!

The data is part of a record structure that is written to a file with
a few "int"s and "long"s mixed in. The pattern repeats through the
file, sometimes with up to 2500 repetitions.

Clearly I can create a subroutine to read the records and extract
the fields. I was just hoping I could use the "struct" module and
create a pattern like 'LLHpHLpppH' which would unpack the data and
automatically give me the strings without needing to first determine
their lengths, as the length is already embedded in the data.
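
Something close to that pattern can be built on top of struct in a few lines. The sketch below is a hypothetical helper (unpack_pstrings is not part of the struct module), written for Python 3. It assumes every "p" in the format is bare (no count), and it requires an explicit byte-order prefix so that no alignment padding shifts the offsets:

```python
import struct

def unpack_pstrings(fmt, buf, offset=0):
    """Unpack buf per fmt, sizing each bare 'p' from its own length byte.

    Hypothetical helper, not part of the struct module. Returns the
    unpacked values and the offset just past the record.
    """
    if fmt[:1] not in ('<', '>', '=', '!'):
        # Native mode ('@' or no prefix) pads for alignment, which would
        # make the per-segment offsets below unreliable.
        raise ValueError("format must start with '<', '>', '=' or '!'")
    order, body = fmt[0], fmt[1:]
    values = []
    segments = body.split('p')
    for i, segment in enumerate(segments):
        if segment:                      # fixed-size fields before the next 'p'
            part = struct.Struct(order + segment)
            values.extend(part.unpack_from(buf, offset))
            offset += part.size
        if i < len(segments) - 1:        # a 'p' followed this segment
            length = buf[offset]
            end = offset + 1 + length
            if end > len(buf):
                raise ValueError('string runs past the end of the buffer')
            values.append(bytes(buf[offset + 1:end]))
            offset = end
    return tuple(values), offset

record = b'\x00\x00\xb9\x02\x13EXCLUDE_CREDIT_CARD'
fields, next_offset = unpack_pstrings('>xxBBp', record)
print(fields, next_offset)
```

Because the helper returns the offset where the record ended, it can be called in a loop to walk through repeated records in one buffer.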

Any suggestions on how to go about proposing the ability to read
variable-length strings based on the preceding byte value to the
struct module? It seems it would be a valuable addition, helping
with code clarity, readability and saving quite a few lines of code -
well, at least for me anyway!

Thanks again.

Peter Hansen <pe***@engcorp.com> wrote in message news:<co**********@utornnr1pp.grouptelecom.net>...
Geoffrey wrote:
As I mentioned, I can parse the string and read it with multiple
statements, I am just looking for a more efficient solution.


This looks like about the best you can do, using the information
from Tim's reply:
>>> buf = '\0\0\xb9\x02\x13EXCLUDE_CREDIT_CARD'
>>> import struct
>>> x = struct.unpack('>xxBB%sp' % (ord(buf[4])+1), buf)
>>> x
(185, 2, 'EXCLUDE_CREDIT_CARD')

If you wanted to avoid hard-coding the 4, you would
be most correct to do this:

header = '>xxBB'
lenIndex = struct.calcsize(header)
x = struct.unpack('%s%dp' % (header, ord(buf[lenIndex])+1), buf)

... though that doesn't exactly make it all that readable.

-Peter

Jul 18 '05 #6
