
proposed struct module format code addition

Good day everyone,

I have produced a patch against the latest CVS to add support for two
new formatting characters in the struct module. It is currently an RFE,
which I include a link to at the end of this post. Please read the
email before you respond to it.

Generally, the struct module is for packing and unpacking binary
data. It includes support for packing and unpacking the C types
byte, char, short, long, long long, char[], *, and certain variants of
those (signed/unsigned, big/little endian, etc.).

I have proposed two new formatting characters, 'g' and 'G' (for biGint or
lonG int).

There is one primary purpose: to offer users the opportunity to specify
their own integer lengths (very useful for cryptography and for real-world
applications that involve non-standard sized integers). Current
solutions involve shifting, masking, and multiple passes over data.
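
For illustration, here is a minimal sketch of the shift-and-mask approach for a
3-byte unsigned big-endian integer, written for current Python 3. The helper
names pack_u24_be and unpack_u24_be are made up for this example and are not
part of the struct module or of the patch.

```python
import struct


def pack_u24_be(value):
    """Pack an unsigned 24-bit integer as 3 big-endian bytes (hypothetical helper)."""
    if not 0 <= value < (1 << 24):
        raise OverflowError("value does not fit in 3 unsigned bytes")
    # Split the integer into three bytes by shifting and masking.
    return struct.pack(
        ">BBB",
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    )


def unpack_u24_be(data):
    """Rebuild the integer from 3 big-endian bytes (hypothetical helper)."""
    high, middle, low = struct.unpack(">BBB", data)
    return (high << 16) | (middle << 8) | low
```

Every additional non-standard width needs its own pair of helpers like these, which is the overhead the proposed specifiers are meant to remove.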

There is a secondary purpose, and that is that future n-byte integers
(like 16-byte/128-bit integers as supported by SSE2) are already taken
care of.

It also places packing and unpacking of these larger integers in the
same module as packing and unpacking of other integers, floats, etc. This
makes documentation easy.

Functionality-wise, it merely uses the two C functions
_PyLong_FromByteArray() and _PyLong_AsByteArray(), with a few lines to
handle interfacing with the pack and unpack functions in the struct module.

An example of use is as follows:
>>> struct.pack('>3g', -1)
'\xff\xff\xff'
>>> struct.pack('>3g', 2**23-1)
'\x7f\xff\xff'
>>> struct.pack('>3g', 2**23)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: long too big to convert
>>> struct.pack('>3G', 2**23)
'\x80\x00\x00'
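
The same signed/unsigned behaviour can be approximated in current Python 3
with int.to_bytes. This is only a sketch of the semantics described above, not
the patch itself; pack_g and pack_G are hypothetical names chosen to mirror the
proposed format characters.

```python
def pack_g(value, size, byteorder="big"):
    """Signed n-byte integer, mirroring the proposed 'g' code (hypothetical helper)."""
    return value.to_bytes(size, byteorder, signed=True)


def pack_G(value, size, byteorder="big"):
    """Unsigned n-byte integer, mirroring the proposed 'G' code (hypothetical helper)."""
    return value.to_bytes(size, byteorder, signed=False)


signed_minus_one = pack_g(-1, 3)         # three 0xff bytes
signed_max = pack_g(2**23 - 1, 3)        # largest signed 3-byte value
unsigned_high_bit = pack_G(2**23, 3)     # fits only when unsigned

try:
    pack_g(2**23, 3)                     # too large for a signed 3-byte field
    overflowed = False
except OverflowError:
    overflowed = True
```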


It follows the struct module standard 'lowercase for signed, uppercase
for unsigned'.

There seem to be a few arguments against its inclusion into
structmodule.c...

The size specifier is variable, so you must know the size/magnitude
of the thing you are (un)packing before you (un)pack it.

My Response:
All use cases I have for this particular mechanism involve not using
'variable' sized structs, but fixed structs with integers of
non-standard byte-widths. Specifically, I have a project in which I use
some 3 and 5 byte unsigned integers. One of my (un)pack format
specifiers is '>H3G3G', and another is '>3G5G' (I have others, but these
are the most concise).
Certainly this does not fit the pickle/cPickle long (un)packing
use-case, but that problem involves truly variable long integer
lengths, which this specifier does not seek to solve.
Really, the proposed 'g' and 'G' format specifiers are only as
variable as the previously existing 's' format specifier.
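
To make the comparison with 's' concrete, the sketch below uses only the
existing struct module: the count in front of 's' fixes the field width in the
format string, so the overall struct size is known up front. Converting the
raw bytes to integers with int.from_bytes is an extra step added here for
illustration.

```python
import struct

# The repeat count before 's' is part of the format, so the size is fixed.
record_size = struct.calcsize(">H3s3s")

# Sample 8-byte record: a 2-byte header followed by two 3-byte fields.
record = b"\x00\x01" + b"\x00\x00\x02" + b"\x00\x01\x00"
header, raw_a, raw_b = struct.unpack(">H3s3s", record)

# A second pass is still needed to turn the byte strings into integers.
field_a = int.from_bytes(raw_a, "big")
field_b = int.from_bytes(raw_b, "big")
```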

The new specifiers are not standard C types.

My Response:
Certainly they are not standard C types, but they are flexible
enough to subsume all current integer C type specifiers. The point was
to allow a user to have the option of specifying their own integer
lengths. This supports use cases involving certain kinds of large
dataset processing (my use case, which I may discuss after we release)
and cryptography, specifically in the case of PKC...
while 1:
    blk = get_block()
    iblk = struct.unpack('>128G', blk)[0]
    uiblk = pow(iblk, power, modulus)
    write_block(struct.pack('>128G', uiblk))
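
A runnable approximation of one iteration of that loop in current Python 3,
using int.from_bytes and int.to_bytes in place of the proposed '>128G' code.
The function name transform_block and the toy numbers are assumptions for
illustration only; this is not real cryptography.

```python
def transform_block(blk, power, modulus):
    """Apply modular exponentiation to one big-endian block (hypothetical helper).

    Assumes modulus <= 256 ** len(blk), so the result fits in the same width.
    """
    iblk = int.from_bytes(blk, "big")          # stands in for unpack('>128G', blk)[0]
    uiblk = pow(iblk, power, modulus)          # three-argument pow does modular exponentiation
    return uiblk.to_bytes(len(blk), "big")     # stands in for pack('>128G', uiblk)


# Toy example with a 3-byte block instead of 128 bytes: 4 ** 3 % 1000 == 64.
result = transform_block(b"\x00\x00\x04", 3, 1000)
```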

The 'p' format specifier is also not a standard C type, and yet it
is included in struct, specifically because it is useful.

You can already do the same thing with:
    pickle.encode_long(long_int)
    pickle.decode_long(packed_long)
and some likely soon-to-be included additions to the binascii module.

My Response:
That is not the same. Nontrivial problems require multiple passes
over your data with multiple calls. A simple:
    struct.unpack('H3G3G', st)
becomes:
    pickle.decode_long(st[:2]) #or an equivalent struct call
    pickle.decode_long(st[2:5])
    pickle.decode_long(st[5:8])
This has no endian or sign options, or it requires the status-quo use of
masks and shifts to get the job done. As previously stated, one point
of the module is to reduce the amount of bit shifting and masking required.
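
As a rough sketch of what a single-call interface buys, the helper below
unpacks several fixed-width unsigned integers in one pass, with an explicit
byte order. The name unpack_uints and its signature are hypothetical; it only
imitates the effect of a format such as '>H3G3G' and is not the proposed
implementation.

```python
def unpack_uints(widths, data, byteorder="big"):
    """Unpack consecutive unsigned integers of the given byte widths (hypothetical helper)."""
    if sum(widths) != len(data):
        raise ValueError("data length does not match the requested widths")
    values = []
    offset = 0
    for width in widths:
        # Slice out one field and convert it with the requested byte order.
        values.append(int.from_bytes(data[offset:offset + width], byteorder))
        offset += width
    return tuple(values)


# Widths (2, 3, 3) correspond to the 'H', '3G', '3G' fields of '>H3G3G'.
st = bytes([0, 1, 0, 0, 2, 0, 1, 0])
fields = unpack_uints((2, 3, 3), st)
```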

We could just document a method for packing/unpacking these kinds of
things in the struct module, if this really is where people would look
for such a thing.

My Response:
I am not disputing that there are other methods of doing this; I am
saying that the struct module includes a framework and documentation
location that can include this particular modification with little
issue, which is far better than any other proposed location for
equivalent functionality.
Note that functionality equivalent to pickle.encode/decode_long is
NOT what this proposed enhancement is for.

The struct module has a steep learning curve already, and this new
format specifier doesn't help it.

My Response:
I can't see how a new format specifier would necessarily make the
learning curve any more difficult, if it was even difficult in the first
place.

Why am I even posting?
Raymond has threatened to close this RFE due to the fact that only I
have been posting to state that I would find such an addition useful.

If you believe this functionality is useful, or even if you think that I
am full of it, tell us: http://python.org/sf/1023290

- Josiah
Jul 18 '05 #1