Conversion of 24bit binary to int

Is there an efficient/fast way in Python to convert binary data from a file
(24-bit hex(int), big endian) to 32-bit int (little endian)? I have seen
struct.unpack, but I am unsure how to use it and what Python has to offer. Idar

The original data is stored in blocks of 512 words (1536B = 3 bytes/word).
The binary (hex) data is big endian and is laid out as:
Ch1: 1536B (3B*512)
Ch2: 1536B (3B*512)
Ch3: 1536B (3B*512)
and so on

The equivalent C++ program looks like this:
for (i = 0; i < nchn; i++)
{
    for (k = 0; k < segl; k++)
    {
        ar24[k] = 0;  // output array = 32-bit int array -> Mt24 fmt
        pdt = (unsigned char *)(&ar24[k]);
        *pdt       = *(a + 2);
        *(pdt + 1) = *(a + 1);
        *(pdt + 2) = *(a + 0);
        a += 3;
        ar24[k] -= DownloadDataOffset;
        // printf("%d\n", ar24[k]);  // this is the number in 32-bit format
    }
}

Jul 18 '05 #1
11 14115
Idar wrote:

Is there an efficient/fast way in Python to convert binary data from a file
(24-bit hex(int), big endian) to 32-bit int (little endian)? I have seen
struct.unpack, but I am unsure how to use it and what Python has to offer. Idar


I think the question is unclear. You say you've seen struct.unpack.
So what then? Don't you think struct.unpack will work? What do you
mean you are unsure how and what Python has to offer? The documentation
which is on the web site clearly explains how and what struct.unpack
has to offer...

Please clarify.

-Peter
Jul 18 '05 #2
If I'm understanding correctly, hex has nothing to do with this and the
data is really binary, so what you're looking for is probably:
>>> import struct
>>> data = '\000\001\002'
>>> temp = struct.unpack('>I', '\000' + data)  # pad to 4-byte unsigned big-endian integer format
>>> print temp  # is now a regular python integer (in a tuple)
(258L,)
>>> print repr(struct.pack('<I', *temp))  # encode in 4-byte unsigned little-endian integer format
'\x02\x01\x00\x00'

There are faster ways if you have a lot of such data (e.g. PIL would
likely have something to manipulate RGB to RGBA images), similarly, you
could use Numpy to add large numbers of rows simultaneously (all 512 if
I understand your description of the data correctly). Without knowing
what type of data is being loaded it's hard to give a better recommendation.
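
For a whole block rather than a single word, the same pad-and-unpack idea can be applied word by word. What follows is a minimal Python 3 sketch, not code from the thread: the helper names `words24_to_ints` and `ints_to_le32` are made up for illustration, and it assumes the 24-bit words are unsigned.

```python
import struct

def words24_to_ints(block):
    """Decode a bytes object of big-endian unsigned 24-bit words into ints."""
    if len(block) % 3:
        raise ValueError("block length must be a multiple of 3 bytes")
    # Prefix each 3-byte word with a zero byte so it fits the 4-byte '>I' format.
    return [struct.unpack(">I", b"\x00" + block[i:i + 3])[0]
            for i in range(0, len(block), 3)]

def ints_to_le32(values):
    """Encode ints as 4-byte unsigned little-endian words."""
    return struct.pack("<%dI" % len(values), *values)

block = b"\x00\x01\x02" + b"\xff\xff\xff"   # two sample words
values = words24_to_ints(block)
packed = ints_to_le32(values)
```

This does one `struct.unpack` call per word, so it is easy to verify but unlikely to be the fastest option for 6MB files.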

HTH,
Mike
_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/


Jul 18 '05 #3
Idar wrote:
Is there an efficient/fast way in Python to convert binary data from a file
(24-bit hex(int), big endian) to 32-bit int (little endian)? I have seen
struct.unpack, but I am unsure how to use it and what Python has to offer. Idar


As Peter mentions, you haven't _really_ given enough information
about what you need, but here is some code which will do what
I _think_ you said you want...

This code assumes that you have a string (named teststr here)
in the source format you describe. You can get a string
like this in several ways, e.g. by reading from a file object.

This code then swaps every 3 characters and inserts a null
byte between every group of three characters.

The result is in a list, which can easily be converted back
to a string by ''.join() as shown in the test printout.

I would expect that either the array module or Numpy would
work faster with _exactly_ the same technique, but I'm
not bored enough to check that out right now.

If this isn't fast enough after using array or NumPy (or
after Alex, Tim, et al. get through with it), I would
highly recommend Pyrex -- you can do exactly the same
sorts of coercions you were doing in your C++ code.
teststr = ''.join([chr(i) for i in range(128, 128 + 20*3)])

result = len(teststr) * 4 // 3 * [chr(0)]
for x in range(3):
    result[2-x::4] = teststr[x::3]

print repr(''.join(result))
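
The same three-slice trick carries over to Python 3, where file data arrives as `bytes`. A minimal sketch, assuming the input length is a multiple of 3; the function name `swap_and_pad` is hypothetical and a `bytearray` stands in for the list of characters used above:

```python
def swap_and_pad(src):
    """Reverse every 3-byte word and append a zero pad byte to each."""
    if len(src) % 3:
        raise ValueError("input length must be a multiple of 3 bytes")
    dst = bytearray(len(src) // 3 * 4)   # pre-filled with zero bytes
    for x in range(3):
        # Source byte x of every word lands at position 2-x of the output word.
        dst[2 - x::4] = src[x::3]
    return bytes(dst)

converted = swap_and_pad(b"ABCDEF")
```

Position 3 of every output word is never written, so it keeps the zero that `bytearray` initialised it with.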
Regards,
Pat
Jul 18 '05 #4


On Tue, 11 Nov 2003 10:11:05 -0500, Peter Hansen <pe***@engcorp.com> wrote:
Idar wrote:

Is there an efficient/fast way in Python to convert binary data from a file
(24-bit hex(int), big endian) to 32-bit int (little endian)? I have seen
struct.unpack, but I am unsure how to use it and what Python has to offer. Idar
I think the question is unclear. You say you've seen struct.unpack.
So what then? Don't you think struct.unpack will work? What do you
mean you are unsure how and what Python has to offer? The documentation
which is on the web site clearly explains how and what struct.unpack
has to offer...


It is due to slack reading...

The doc says "Standard size and alignment are as follows: no alignment is
required for any type (so you have to use pad bytes)..."

It was unclear (at the time of reading) in the sense that I didn't see the
above text, there was no example of how to handle odd-byte/padding
conversion, and the test program crashed!
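
A crash of that kind is what `struct` does when the buffer length does not match the format. The Python 3 sketch below is illustrative only (the variable names are invented here); it shows the failure, the pad-byte fix, and `int.from_bytes` as an alternative that needs no padding. Whether the samples should be read as signed is an assumption the thread does not settle.

```python
import struct

word = b"\x00\x01\x02"            # one big-endian 24-bit sample

try:
    struct.unpack(">I", word)     # '>I' needs exactly 4 bytes, so this raises
    error = None
except struct.error as exc:
    error = exc

padded = struct.unpack(">I", b"\x00" + word)[0]   # add the pad byte by hand
direct = int.from_bytes(word, "big")              # no padding needed

# If the samples were signed 24-bit values, sign extension would be needed:
signed = int.from_bytes(b"\xff\xff\xfe", "big", signed=True)
```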

But if you know how to convert this format (the file is about 6MB)
efficiently, please do give me a hint. The data is stored as binary in the
format:
Ch1: 1536B (512*3B)
...
Ch6: 1536B (512*3B)
Then it is repeated until the end:
Ch1: 1536B (512*3B)
...
Ch6: 1536B (512*3B)


Jul 18 '05 #5
On Tue, 11 Nov 2003 10:21:29 -0500, Mike C. Fletcher <mc******@rogers.com>
wrote:
If I'm understanding correctly, hex has nothing to do with this and the
data is really binary, so what you're looking for is probably:

Thanks for the hint!! And sorry, I meant binary!

>>> import struct
>>> data = '\000\001\002'
>>> temp = struct.unpack('>I', '\000' + data)  # pad to 4-byte unsigned big-endian integer format
>>> print temp  # is now a regular python integer (in a tuple)
(258L,)
>>> print repr(struct.pack('<I', *temp))  # encode in 4-byte unsigned little-endian integer format
'\x02\x01\x00\x00'

There are faster ways if you have a lot of such data (e.g. PIL would
likely have something to manipulate RGB to RGBA images), similarly, you
could use Numpy to add large numbers of rows simultaneously (all 512 if I
understand your description of the data correctly). Without knowing what
type of data is being loaded it's hard to give a better recommendation.


It is binary with no formatting characters to indicate the start/end of each
block (fixed size).
A file is about 6MB (and there are about 300 of them...),
Ch1: 1536B (512*3B) - the 3B are big endian (int)
...
Ch6: 1536B (512*3B)
And then it is repeated till the end:
Ch1: 1536B (512*3B)
...
Ch6: 1536B (512*3B)

ciao, idar

Jul 18 '05 #6
Thanks for the example!

The format is binary with no formatting characters to indicate the start/end of
each block (fixed size).
A file is about 6MB (and there are about 300 of them...), so

Ch1: 1536B (512*3B) - the 3B are big endian (int)
...
Ch6: 1536B (512*3B)
And then it is repeated till the end (say Y sets of Ch1 (the same for
Ch2,3,4,5,6)):
Ch1,Y: 1536B (512*3B)
...
Ch6,Y: 1536B (512*3B)

And ideally I would like to convert it to this format:
Ch1: Y*512*4B (normal int with little endian)
Ch2
Ch3
Ch4
Ch5
Ch6
And that is the end :)
Idar

Jul 18 '05 #7
Idar wrote:
Thanks for the example!

The format is binary with no formatting characters to indicate the start/end of
each block (fixed size).
A file is about 6MB (and there are about 300 of them...), so

Ch1: 1536B (512*3B) - the 3B are big endian (int)
..
Ch6: 1536B (512*3B)
And then it is repeated till the end (say Y sets of Ch1 (the same for
Ch2,3,4,5,6)):
Ch1,Y: 1536B (512*3B)
..
Ch6,Y: 1536B (512*3B)

And ideally I would like to convert it to this format:
Ch1: Y*512*4B (normal int with little endian)
Ch2
Ch3
Ch4
Ch5
Ch6
And that is the end :)


So, you don't really need to convert binary to int or anything, just
shuffle bytes around, right? Your file starts with (e.g.), using a
letter for each arbitrary binary byte:

A B C D E F G H I ...

and you want to output the bytes

C B A 0 F E D 0 I H G 0 ...

I.e, swap 3 bytes, insert a 0 byte for padding, and proceed (for all
Ch1, which is spread out in the original file -- then for all Ch2, and
so on). Each file fits comfortably in memory (about 6MB for input, becoming
about 8MB for output due to the padding). You can use two instances of
array.array('B'), with .read for input and .write for output (just
remember .read _appends_ to the array, so make a new empty one for
each file you're processing -- the _output_ array you can reuse).

It's LOTS of indexing and single-byte moving, so I doubt the Python
native performance will be great. Still, once you've implemented and
checked it out you can use psyco or pyrex to optimize it, if needed.

The primitive you need is typically "copy with swapping and padding
a block of 1536 input bytes [starting from index SI] to a block of
2048 output bytes [starting from index SO]". The 0 bytes in the
output you'll leave untouched after at first preparing the output
array with OA = array.array('B', Y*2048*6*'\0'), of course.
That's just (using predefined ranges for speed, no need to remake
them every time):

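r512 = xrange(512)

def doblock(SI, SO, IA, OA, r512=r512):
    # Copy one 512-word block, reversing each 3-byte word; pad bytes stay 0.
    ii = SI
    io = SO
    for i in r512:
        # IA[ii+2:ii-1:-1] is empty when ii == 0 (a stop of -1 means
        # "the last element"), so reverse a forward slice instead.
        OA[io:io+3] = IA[ii:ii+3][::-1]
        ii += 3
        io += 4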

so basically it only remains to compute SI and SO appropriately
and loop ditto calling this primitive (or some speeded-up version
thereof) 6*Y times for all the blocks in the various channels.
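
The SI/SO bookkeeping could look roughly like the Python 3 sketch below. It assumes the layout described in the thread (6 channels, 512 words per block, Y repetitions) and uses `bytes`/`bytearray` with strided slices rather than `array.array`; the names `block_offsets` and `deinterleave` are invented for illustration.

```python
WORDS = 512
NCHAN = 6
SRC_BLOCK = WORDS * 3    # 1536 input bytes per block
DST_BLOCK = WORDS * 4    # 2048 output bytes per block

def block_offsets(num_sets, num_channels=NCHAN):
    """Yield (SI, SO) for every block, in input-file order."""
    for s in range(num_sets):
        for ch in range(num_channels):
            si = (s * num_channels + ch) * SRC_BLOCK   # input is interleaved
            so = (ch * num_sets + s) * DST_BLOCK       # output is per channel
            yield si, so

def deinterleave(src):
    """Swap, pad, and regroup a whole file image held in memory."""
    num_sets, leftover = divmod(len(src), NCHAN * SRC_BLOCK)
    if leftover:
        raise ValueError("input is not a whole number of block groups")
    out = bytearray(num_sets * NCHAN * DST_BLOCK)
    for si, so in block_offsets(num_sets):
        for x in range(3):
            # Byte x of each input word goes to byte 2-x of the output word.
            out[so + 2 - x:so + DST_BLOCK:4] = src[si + x:si + SRC_BLOCK:3]
    return bytes(out)
```

With Y sets in the file, block number b in the input belongs to channel b % 6 and set b // 6, which is what the two nested loops enumerate.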
Alex

Jul 18 '05 #8
Alex Martelli wrote:
r512 = xrange(512)

def doblock(SI, SO, IA, OA, r512=r512):
    ii = SI
    io = SO
    for i in r512:
        # Forward slice then reverse; IA[ii+2:ii-1:-1] is empty when ii == 0.
        OA[io:io+3] = IA[ii:ii+3][::-1]
        ii += 3
        io += 4

It's my guess this would be faster using array.array
in combination with extended slicing, as per the list
example I gave in a previous message, even though I'm
still not bored enough to time it :) (The for loop
in my previous example only requires 3 iterations,
rather than 512 as in this example.)
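
One way to check that guess is to time both styles on the same buffer. The following Python 3 sketch is only a harness, with no measured numbers implied; `per_word`, `strided`, and `compare` are hypothetical names, and `bytearray` is used in place of `array.array`.

```python
import timeit

def per_word(src):
    """One slice assignment per 3-byte word (512 per block)."""
    out = bytearray(len(src) // 3 * 4)
    io = 0
    for ii in range(0, len(src), 3):
        out[io:io + 3] = src[ii:ii + 3][::-1]
        io += 4
    return bytes(out)

def strided(src):
    """Three extended-slice assignments for the whole buffer."""
    out = bytearray(len(src) // 3 * 4)
    for x in range(3):
        out[2 - x::4] = src[x::3]
    return bytes(out)

def compare(src, repeat=20):
    """Return (per_word_seconds, strided_seconds) for `repeat` runs each."""
    return (timeit.timeit(lambda: per_word(src), number=repeat),
            timeit.timeit(lambda: strided(src), number=repeat))

sample = bytes(range(256)) * 6    # 1536 bytes, i.e. one block
same = per_word(sample) == strided(sample)
```

Calling `compare(sample)` returns the two timings; the results will depend on the machine and Python version.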

Pat
Jul 18 '05 #9
Idar wrote:
Thanks for the example!

The format is binary with no formatting characters to indicate the start/end of
each block (fixed size).
A file is about 6MB (and there are about 300 of them...), so

Ch1: 1536B (512*3B) - the 3B are big endian (int)
..
Ch6: 1536B (512*3B)
And then it is repeated till the end (say Y sets of Ch1 (the same for
Ch2,3,4,5,6)):
Ch1,Y: 1536B (512*3B)
..
Ch6,Y: 1536B (512*3B)

And ideally I would like to convert it to this format:
Ch1: Y*512*4B (normal int with little endian)
Ch2
Ch3
Ch4
Ch5
Ch6
And that is the end :)
Idar


OK, now that I have a beer and a specification, here is some code
which (I think) should do what (I think) you are asking for.
On my Athlon 2200+ (marketing number) computer, with the source
file cached by the OS, it operates at around 10 source megabytes/second.

(That should be about 3 minutes plus actual file I/O operations
for the 300 6MB files you describe.)

Verifying that it actually produces the data you expect is up to you :)

Regards,
Pat
import array

def mungeio(srcfile, dstfile, numchannels=6, blocksize=512):
    """
    This function converts 24 bit RGB into 32 bit BGR0,
    and simultaneously de-interleaves video from multiple
    sources.  The parameters are:

        srcfile     -- a file object opened with 'rb'
                       (or similar object)
        dstfile     -- a file object opened with 'wb'
                       (or similar object)
        numchannels -- the number of interleaved video channels
        blocksize   -- the number of pixels per channel on
                       each interleaved block (interleave factor)

    This function reads all the data from srcfile and writes
    it to dstfile.  It is up to the caller to close both files.

    The function asserts that the amount of data to be read
    from the source file is an integral multiple of
    blocksize*numchannels*3.

    This function assumes that multiple copies of the data
    will easily fit into RAM, as the target file size is
    6MB for the source files and 8MB for the destination
    files.  If this is not a good assumption, it should
    be rearchitected to output to one file per channel,
    and then stitch the output files together at the end.
    """

    srcblocksize = blocksize * 3
    dstblocksize = blocksize * 4

    def mungeblock(src, dstarray=array.array('B', dstblocksize*[0])):
        """
        This function accepts a string representing a single
        source block, and returns a string representing a
        single destination block.
        """
        srcarray = array.array('B', src)
        for i in range(3):
            dstarray[2-i::4] = srcarray[i::3]
        return dstarray.tostring()

    channellist = [[] for i in range(numchannels)]

    while 1:
        for channel in channellist:
            data = srcfile.read(srcblocksize)
            if len(data) != srcblocksize:
                break
            channel.append(mungeblock(data))
        else:
            continue  # (with while statement)
        break  # Propagate break from 'for' out of 'while'

    # Check that input file length is valid (no leftovers),
    # and then write the result.

    assert channel is channellist[0] and not len(data)
    dstfile.write(''.join(sum(channellist, [])))

def mungefile(srcname, dstname):
    """
    Actual I/O done in a separate function so it can
    be more easily unit-tested.
    """
    srcfile = open(srcname, 'rb')
    dstfile = open(dstname, 'wb')
    mungeio(srcfile, dstfile)
    srcfile.close()
    dstfile.close()
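
The code above targets the Python of its day (`tostring`, `str` data). For reference, here is a rough Python 3 sketch of the same idea, exercised on in-memory streams the way the docstring suggests for unit testing. It is an adaptation, not the poster's code: `munge_stream` is a made-up name, it uses `bytearray` instead of `array.array`, and it raises `ValueError` where the original asserts.

```python
import io

def munge_stream(srcfile, dstfile, numchannels=6, blocksize=512):
    """Swap/pad 24-bit words and group the output by channel."""
    srcblocksize = blocksize * 3
    dstblocksize = blocksize * 4
    channels = [[] for _ in range(numchannels)]

    while True:
        # Read one block per channel; an empty first read means clean EOF.
        group = [srcfile.read(srcblocksize) for _ in range(numchannels)]
        if not group[0]:
            break
        if any(len(block) != srcblocksize for block in group):
            raise ValueError("input is not a whole number of block groups")
        for chan, block in zip(channels, group):
            out = bytearray(dstblocksize)        # zero-filled, pad bytes stay 0
            for i in range(3):
                out[2 - i::4] = block[i::3]
            chan.append(bytes(out))

    # All of channel 1 first, then all of channel 2, and so on.
    for chan in channels:
        dstfile.write(b"".join(chan))

# Tiny in-memory check: 2 channels, 2 words per block, 2 block groups.
src = io.BytesIO(b"ABCDEF" + b"abcdef" + b"GHIJKL" + b"ghijkl")
dst = io.BytesIO()
munge_stream(src, dst, numchannels=2, blocksize=2)
result = dst.getvalue()
```

Using small `numchannels` and `blocksize` values keeps the expected output short enough to check by eye before running it on the real 6MB files.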
Jul 18 '05 #10
