Reading a Bitstream

Dietrich Epp

Are there any good modules for reading a bitstream? Specifically, I
have a string and I want to be able to get the next N bits as an
integer. Right now I'm using struct.unpack and bit operations, it's a
bit kludgy but it gets the right results.

Thanks in advance.

Jul 18 '05 #1

Miki Tebeka

Hello Dietrich,

> Are there any good modules for reading a bitstream? Specifically, I
> have a string and I want to be able to get the next N bits as an
> integer. Right now I'm using struct.unpack and bit operations, it's a
> bit kludgy but it gets the right results.

Have you looked at 'array' and 'xdrlib.Upnacker'?

HTH.
Miki

Jul 18 '05 #2

Dietrich Epp

On Nov 18, 2003, at 7:28 AM, Miki Tebeka wrote:

> Hello Dietrich,
>> Are there any good modules for reading a bitstream? Specifically, I
>> have a string and I want to be able to get the next N bits as an
>> integer. Right now I'm using struct.unpack and bit operations, it's a
>> bit kludgy but it gets the right results.
>
> Have you looked at 'array' and 'xdrlib.Upnacker'?

Both of those look like they're aligned to byte boundaries. Am I
mistaken?

The file I'm reading has fields ranging from 1 to 32 bits wide, and
they are packed bit-to-bit.

I guess I'll write my own module.
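
For readers following along, the "struct.unpack and bit operations" approach mentioned in the first post might look roughly like the sketch below. The thread never shows the original code, so this is only an illustration: it is written for Python 3, the helper name `get_bits` is hypothetical, and the MSB-first bit order is an assumption, since the file's bit order is not stated.

```python
import struct

def get_bits(data, bit_offset, nbits):
    """Return nbits (1..32) starting at bit_offset, counting MSB-first.

    Hypothetical helper: the name and the MSB-first order are assumptions.
    Bits past the end of data read as zero; no bounds check is done.
    """
    if not 1 <= nbits <= 32:
        raise ValueError("nbits must be between 1 and 32")
    byte_offset, shift = divmod(bit_offset, 8)
    # A field of up to 32 bits that starts mid-byte can span 5 bytes, so
    # unpack 8 bytes (zero-padded at the end of the data) as one integer.
    chunk = data[byte_offset:byte_offset + 8].ljust(8, b"\x00")
    (word,) = struct.unpack(">Q", chunk)
    # Drop the bits to the right of the field, then mask off those to its left.
    return (word >> (64 - shift - nbits)) & ((1 << nbits) - 1)

data = bytes([0b10110011, 0b01000000])
first = get_bits(data, 0, 1)   # the single leading bit
second = get_bits(data, 1, 9)  # the next nine bits, crossing a byte boundary
```

The caller has to track the running bit offset by hand, which is where the "kludgy" feeling tends to come from.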

Jul 18 '05 #3

Patrick Maupin

Dietrich Epp wrote:

> Are there any good modules for reading a bitstream? Specifically, I
> have a string and I want to be able to get the next N bits as an
> integer. Right now I'm using struct.unpack and bit operations, it's a
> bit kludgy but it gets the right results.

As Miki wrote, the array module will probably give you what
you want more easily than struct.unpack. If you need more
help, just post a few more details and I will post a code
snippet. (As to the rest of Miki's post, I'm not sure that
I really want to know what an "Upnacker" is :)

Pat

Jul 18 '05 #4

Dietrich Epp

On Nov 18, 2003, at 6:10 PM, Patrick Maupin wrote:

> Dietrich Epp wrote:
>> Are there any good modules for reading a bitstream? Specifically, I
>> have a string and I want to be able to get the next N bits as an
>> integer. Right now I'm using struct.unpack and bit operations, it's a
>> bit kludgy but it gets the right results.
>
> As Miki wrote, the array module will probably give you what
> you want more easily than struct.unpack. If you need more
> help, just post a few more details and I will post a code
> snippet. (As to the rest of Miki's post, I'm not sure that
> I really want to know what an "Upnacker" is :)

Maybe I should clarify: I need to read bit fields. They are neither
aligned to bytes nor at fixed offsets. In fact, in one part
of the file there is a list of objects which starts with a 9 bit object
type followed by fields whose length and number depend on that object
type, ranging from a dummy 1-bit field to a tuple of four fields of
length 9, 5, 8, and 8 bits.

I looked at the array module and can't find what I'm looking for.
Here's a bit of typical usage.

def readStuff(bytes):
    bits = BitStream(bytes[2:])
    isSimple = bits.Get(1)
    objType = chr(bits.Get(8))
    objType += chr(bits.Get(8))
    objType += chr(bits.Get(8))
    objType += chr(bits.Get(8))
    count = bits.Get(3)
    bits.Ignore(5)
    if not isSimple:
        objId = bits.Get(32)
    bytes = bytes[2+bits.PartialBytesRead():]
    return bytes, objType

This is basically the gamut of what I want to do. I have a string, and
create a bit stream object. I read fields from the bit stream, some
may not be present, then return an object and the string that comes
after it. The objects are aligned to bytes in this case even though
their fields aren't.

I can't figure out how to get array to do this. Array does not look at
all suited to reading a bit stream. struct.unpack *does* work right
now, with a lot of help, but I was wondering if there was an easier way.
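
One way the BitStream interface used in readStuff could be filled in is sketched below. This is a guess at the intended behaviour and not the poster's actual class: it targets Python 3, reads MSB-first (an assumption), and takes PartialBytesRead to mean "bytes touched so far, counting a partly consumed byte", which is what the slicing in readStuff seems to need.

```python
class BitStream:
    """Hypothetical MSB-first bit reader over a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, in bits from the start of data

    def Get(self, nbits):
        """Return the next nbits as an unsigned integer."""
        end = self.pos + nbits
        if end > 8 * len(self.data):
            raise EOFError("not enough bits left")
        first_byte = self.pos // 8
        last_byte = (end + 7) // 8
        # Load only the bytes the field touches as one big integer.
        window = int.from_bytes(self.data[first_byte:last_byte], "big")
        # Discard the bits after the field, then mask off the bits before it.
        value = (window >> (8 * last_byte - end)) & ((1 << nbits) - 1)
        self.pos = end
        return value

    def Ignore(self, nbits):
        """Skip nbits without decoding them."""
        if self.pos + nbits > 8 * len(self.data):
            raise EOFError("not enough bits left")
        self.pos += nbits

    def PartialBytesRead(self):
        """Number of bytes touched so far, counting a partly read byte."""
        return (self.pos + 7) // 8


# Made-up test record: 1-bit flag, four 8-bit characters, 3-bit count, 5 ignored bits.
fields = (1 << 40) | (int.from_bytes(b"ABCD", "big") << 8) | (5 << 5)  # 41 bits
payload = (fields << 7).to_bytes(6, "big")  # pad on the right to 48 bits

bits = BitStream(payload)
is_simple = bits.Get(1)
obj_type = bytes(bits.Get(8) for _ in range(4))
count = bits.Get(3)
bits.Ignore(5)
consumed = bits.PartialBytesRead()
```

Keeping a single bit position and slicing out only the bytes each field touches avoids maintaining a separate shift buffer, at the cost of re-reading a byte when fields share one.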

Jul 18 '05 #5

Bengt Richter

On Wed, 19 Nov 2003 01:47:26 -0800, Dietrich Epp <di******@zdome.net> wrote:

> On Nov 18, 2003, at 6:10 PM, Patrick Maupin wrote:
>> Dietrich Epp wrote:
>>> Are there any good modules for reading a bitstream? Specifically, I
>>> have a string and I want to be able to get the next N bits as an
>>> integer. Right now I'm using struct.unpack and bit operations, it's a
>>> bit kludgy but it gets the right results.
>>
>> As Miki wrote, the array module will probably give you what
>> you want more easily than struct.unpack. If you need more
>> help, just post a few more details and I will post a code
>> snippet. (As to the rest of Miki's post, I'm not sure that
>> I really want to know what an "Upnacker" is :)
>
> Maybe I should clarify: I need to read bit fields. They are neither
> aligned to bytes nor at fixed offsets. In fact, in one part
> of the file there is a list of objects which starts with a 9 bit object
> type followed by fields whose length and number depend on that object
> type, ranging from a dummy 1-bit field to a tuple of four fields of
> length 9, 5, 8, and 8 bits.
>
> I looked at the array module and can't find what I'm looking for.
> Here's a bit of typical usage.
>
> def readStuff(bytes):
>     bits = BitStream(bytes[2:])
>     isSimple = bits.Get(1)
>     objType = chr(bits.Get(8))
>     objType += chr(bits.Get(8))
>     objType += chr(bits.Get(8))
>     objType += chr(bits.Get(8))
>     count = bits.Get(3)
>     bits.Ignore(5)
>     if not isSimple:
>         objId = bits.Get(32)
>     bytes = bytes[2+bits.PartialBytesRead():]
>     return bytes, objType
>
> This is basically the gamut of what I want to do. I have a string, and
> create a bit stream object. I read fields from the bit stream, some
> may not be present, then return an object and the string that comes
> after it. The objects are aligned to bytes in this case even though
> their fields aren't.
>
> I can't figure out how to get array to do this. Array does not look at
> all suited to reading a bit stream. struct.unpack *does* work right
> now, with a lot of help, I was wondering if there was an easier way.

Maybe this will do something for you?
Note that this is a response to your post, and not something previously tested,
(in fact not tested beyond what you see ;-) and it will be slow if you have
huge amounts of data to process.

You pass a string to the constructor, specifying big-endian if not little-endian,
and then you use the read method to read bit fields, which may optionally have
their most significant bits interpreted as sign bits.

E.g., reading 4-bit chunks or bits, little-endian and big-endian:

>>> import sbits
>>> sb = sbits.SBits('01234567')
>>> for i in xrange(8*2): print sb.read(4),
...
0 3 1 3 2 3 3 3 4 3 5 3 6 3 7 3
>>> sb = sbits.SBits('01234567',False)
>>> for i in xrange(8*2): print sb.read(4),
...
3 0 3 1 3 2 3 3 3 4 3 5 3 6 3 7
>>> sb = sbits.SBits('\x05')
>>> for i in xrange(8): print sb.read(1),
...
1 0 1 0 0 0 0 0
>>> sb = sbits.SBits('\x05',False)
>>> for i in xrange(8): print sb.read(1),
...
0 0 0 0 0 1 0 1
>>> sb = sbits.SBits('01234567')
>>> hex(sb.read(64))
'0x3736353433323130L'
>>> sb = sbits.SBits('01234567',False)
>>> hex(sb.read(64))
'0x3031323334353637L'
>>> sb = sbits.SBits('01234567')
>>> hex(sb.read(32))
'0x33323130'
>>> hex(sb.read(32))
'0x37363534'
>>> sb = sbits.SBits('01234567',False)
>>> hex(sb.read(32))
'0x30313233'
>>> hex(sb.read(32))
'0x34353637'
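
The two bit orders in the session above can be pictured as two ways of stacking the same bytes into one integer. The sketch below is not part of the posted module; it is a Python 3 illustration that converts the whole string at once with int.from_bytes instead of buffering byte by byte, which gives the same values for this input.

```python
data = b"01234567"  # byte values 0x30 .. 0x37

# Little-endian: the first byte supplies the least significant bits, and
# fields are taken from the low end of the accumulated integer.
little = int.from_bytes(data, "little")
little_nibbles = [(little >> (4 * i)) & 0xF for i in range(16)]

# Big-endian: the first byte supplies the most significant bits, and
# fields are taken from the high end.
big = int.from_bytes(data, "big")
big_nibbles = [(big >> (60 - 4 * i)) & 0xF for i in range(16)]

print(hex(little))     # 0x3736353433323130
print(hex(big))        # 0x3031323334353637
print(little_nibbles)  # [0, 3, 1, 3, 2, 3, 3, 3, 4, 3, 5, 3, 6, 3, 7, 3]
print(big_nibbles)     # [3, 0, 3, 1, 3, 2, 3, 3, 3, 4, 3, 5, 3, 6, 3, 7]
```

Since '0' is 0x30, the little-endian reader sees the low nibble 0 before the high nibble 3, and the big-endian reader sees them the other way round.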

Sorry for the lack of doc strings ;-/
Please let me know if/when you find a bug.
====< sbits.py >=========================================
import itertools
class SBits(object):
    def __init__(self, s='', little_endian=True):
        self.le = little_endian
        self.buf = 0L
        self.bufbits=0
        self.getbyte = itertools.imap(ord, s).next
    def read(self, nb=0, signed=False):
        try:
            while self.bufbits<nb:
                if self.le:
                    self.buf |= (long(self.getbyte())<<self.bufbits) # put at top
                else:
                    self.buf = (self.buf<<8) | self.getbyte()
                self.bufbits+=8
        except StopIteration: # no more getbyte data
            raise EOFError, 'Failed to read %s bits from available %s.'%(nb, self.bufbits)
        self.bufbits -= nb
        if self.le:
            ret = self.buf & ((1L<<nb)-1)
            self.buf >>= nb
        else:
            ret = self.buf>>self.bufbits
            self.buf &= ((1L<<self.bufbits)-1)
        if signed:
            signbit = 1L<<(nb-1)
            if signbit & ret:
                ret = ret - signbit -signbit
        if -2**31 <= ret < 2**31: return int(ret)
        return ret #, nb

def test():
    sb = SBits('\x03'*(sum(xrange(37))+7))
    bits = [sb.read(wid, wid&1>0) for wid in xrange(37)]
    hexis = map(hex,bits)
    shouldbe = [
        '0x0', '0xffffffff', '0x1', '0x0', '0xc', '0x0', '0x6', '0x18',
        '0x30', '0x30', '0x18', '0xfffffe06', '0xc0', '0xc0c', '0x2060', '0x181',
        '0x303', '0xffff0303', '0x18181', '0x6060', '0xc0c0c', '0xc0c0', '0x60606', '0x181818',
        '0x303030', '0x303030', '0x181818', '0xfe060606', '0xc0c0c0', '0xc0c0c0c', '0x20606060', '0x1818181',
        '0x3030303', '-0xFCFCFCFDL', '0x181818181L', '0x60606060', '0xC0C0C0C0CL']
    for i,h in enumerate(hexis): print '%12s%s'%(h,'\n'[:i%4==3]),
    print '\n-----\nThat was%s what was expected.\n-----'%((' not','')[hexis==shouldbe],)

    sb = SBits('\xc0'*(sum(xrange(37))+7), False)
    bits = [sb.read(wid, wid&1>0) for wid in xrange(37)]
    hexis = map(hex,bits)
    shouldbe = [
        '0x0', '0xffffffff', '0x2', '0x0', '0x3', '0x0', '0x18', '0xc',
        '0xc', '0x18', '0x60', '0x303', '0x30', '0x606', '0x181', '0xffffc0c0',
        '0xc0c0', '0xffff8181', '0x20606', '0x3030', '0x30303', '0x6060', '0x181818', '0xc0c0c',
        '0xc0c0c', '0x181818', '0x606060', '0x3030303', '0x303030', '0x6060606', '0x1818181', '0xc0c0c0c0',
        '0xC0C0C0C0L', '0x81818181', '0x206060606L', '0x30303030', '0x303030303L']
    for i,h in enumerate(hexis): print '%12s%s'%(h,'\n'[:i%4==3]),
    print '\n-----\nThat was%s what was expected.\n-----'%((' not','')[hexis==shouldbe],)
if __name__ == '__main__':
    test()
=========================================================
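
The signed branch of read() (subtracting the sign bit twice) is the usual two's-complement reinterpretation of an nb-bit field. Pulled out on its own as a Python 3 sketch, with the hypothetical name `to_signed`, it might read:

```python
def to_signed(value, nbits):
    """Interpret an unsigned nbits-wide value as two's complement.

    Hypothetical standalone helper mirroring the signed branch of read().
    """
    signbit = 1 << (nbits - 1)
    if value & signbit:
        # Subtracting the sign bit twice is the same as subtracting 1 << nbits.
        value -= 2 * signbit
    return value

examples = [
    to_signed(0x1, 1),     # a lone set bit is -1
    to_signed(0x7F, 8),    # top bit clear: unchanged, 127
    to_signed(0x80, 8),    # most negative 8-bit value, -128
    to_signed(0x606, 11),  # 1542 - 2048 = -506
]
```

The last case appears to correspond to the 11-bit entry in the first expected list: -506 written as a 32-bit two's-complement pattern is 0xfffffe06, which is how older Python 2 releases rendered hex() of a negative int.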

Regards,
Bengt Richter

Jul 18 '05 #6

Dietrich Epp

On Nov 19, 2003, at 7:02 PM, Bengt Richter wrote:

[snip]

> Maybe this will do something for you?
> Note that this is a response to your post, and not something previously
> tested, (in fact not tested beyond what you see ;-) and it will be slow
> if you have huge amounts of data to process.
>
> You pass a string to the constructor, specifying big-endian if not
> little-endian, and then you use the read method to read bit fields,
> which may optionally have their most significant bits interpreted as
> sign bits.
>
> E.g., reading 4-bit chunks or bits, little-endian and big-endian:

[snip]

It looks like what I did before I started using struct.unpack. I think
I'll stick with my current code, which, although it looks like ugly C
code, works. Someday I'll probably get the itch again and make a
decent module out of it.

Jul 18 '05 #7
