"Allan Bruce" <al*****@TAKEAW AYf2s.com> wrote in message
news:bm******** **@news.freedom 2surf.net...
| Has anybody coded a basic huffman compression scheme? It would be
| interesting to see how they write the info to file, since I have variable
| length bit patterns (potentially 2 -> 255 bits).
Well, it's been a long long time for me.
The approach I would take is more or less:
static unsigned char cur=0;
static int freeBits = 8; // # of available bits in *dst
void bitsWriter(unsi gned long newbits, int bitCount)
{
while(bitCount> 0) {
unsigned nbits = bitCount;
if( nbits>freeBits ) nbits = freeBits;
freeBits -= nbits;
bitCount -= nbits;
cur = (cur<<nbits)|(n ewbits>>bitCoun t);
if( freeBits==0 ) { // byte is complete
fputc((int)cur, fptr);
freeBits = 8;
}
}
}
//NB: a flush function needs to be added
Data bits are passed in 'newbits', first bit in MSB,
and last bit aligned on bit zero.
If a huffman code takes more than 32 bits, it will
need to be written using multiple calls.
The buffer 'cur' could be of type unsigned long (e.g. 32 bits)
if you are careful with endianess issues.
| What I am doing is reading
| each bit then adding it to the following function:
|
| void Bytify(int xiBit) // takes bits in and once a full byte is
recieved,
| writes to file
| {
| static unsigned char lPosition = 0x80;
| static unsigned char lByte = 0x0;
|
| if (xiBit) // if the xi bit is set
| lByte = lByte | lPosition; // then add it to the char
|
| lPosition = lPosition >> 1; // next position for next time round
|
| if (lPosition == 0x0) // if we are on the last position
| {
| fputc((int)lByt e, fptr); // write byte to file
| lPosition = 0x80; // and reset the postion and the new byte
| lByte = 0x0;
| }
| }
Seems correct.
Might be slower as the function is called for each bit.
| One last thing, if I have the length of the bit field, I am using this to
| find out the byte containing the bit, and the bit within that byte:
|
| lBitPosition = loop%8;
| lCharPosition = (loop - lBitPosition) / 8;
Ok if 'loop' is positive, but the ' - lBitPosition ' part is unnecessary.
In C, integer divide will automatically round down the value.
So: lCharPosition = loop/8;
| Where loop is going from 0 to BitPatternLengt h-1.
| Is this ok? Or is there a better way to do it?
I would store and transmit the Huffman code word by word,
for efficiency reasons:
void sendCode( //NB: assuming 32-bit longs
int bitLen,
unsigned long codeBuf[32]) // 8*32 = 256 bits
{
while(bitLen>32 ) {
bitsWriter( codeBuf, 32);
++codeBuf;
bitLen-=32;
}
bitsWriter( codeBuf, bitLen ); // last (incomplete) word
}
But your code was correct. It is better to first write your encoder
in a way that you can easily understand, and then to look into
performance improvements...
Regards,
Ivan
--
http://ivan.vecerina.com