Writing 5 bits at a time to a file

P: n/a
Hi

I have an assignment for a weird compression algorithm that reads a file,
converts each character to a 5-bit code, and writes out a compressed
version of the original file. For example, if the input file contains the
string "AMIR" and the codes are A=0, M=12, I=8, R=17, I am supposed to
write out a 3-byte file.

The problem: how do I work out the shifting, given that I can only write
bytes and not bits? I figured I would "create" a byte and write it, but the
codes will overlap.
In my example I would have something like this in binary:

00000011 00010001 00010000

where the first 5 bits are code 0, the next 5 bits are code 12, and so on,
with the last 4 bits as zero padding.
Sometimes one output byte is built from the codes of 3 different input
characters.
I have a loop in which I read one char at a time from the input file and
create an int code. Now I need an algorithm that builds a byte (a char)
that I can write to the output file:

// get one char at a time from input (curr_chr must be an int to hold EOF)
while ((curr_chr = getc(in_file)) != EOF)
{
    // get code for it
    code = getCode(curr_chr);

    // shifting code ......
    // write created byte...
    fwrite(&wbyte, 1, 1, out_file);
}
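
One way to fill in the "shifting code" step is a small bit accumulator: push each 5-bit code into an integer and emit a byte whenever at least 8 bits are pending. The sketch below is only an illustration, not something posted in this thread. The name `pack5` is hypothetical, and the most-significant-bit-first order is an assumption taken from the binary example in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): packs 5-bit codes into bytes,
   most significant bit first. out must have room for (ncodes * 5 + 7) / 8
   bytes. Returns the number of bytes written. */
size_t pack5(const unsigned char *codes, size_t ncodes, unsigned char *out)
{
    unsigned int acc = 0;  /* pending bits, newest code in the low end */
    int nbits = 0;         /* how many bits of acc are still unwritten */
    size_t nout = 0;

    assert(codes != NULL && out != NULL);
    for (size_t i = 0; i < ncodes; i++) {
        acc = (acc << 5) | (codes[i] & 31u);
        nbits += 5;
        while (nbits >= 8) {
            /* the oldest 8 pending bits form the next output byte */
            out[nout++] = (unsigned char)((acc >> (nbits - 8)) & 0xFFu);
            nbits -= 8;
        }
        acc &= (1u << nbits) - 1u;  /* drop bits already written */
    }
    if (nbits > 0)  /* flush the remainder, zero padded on the right */
        out[nout++] = (unsigned char)((acc << (8 - nbits)) & 0xFFu);
    return nout;
}
```

For the codes 0, 12, 8, 17 this yields three bytes, 0x03 0x11 0x10 (binary 00000011 00010001 00010000), which could then be written with a single `fwrite(out, 1, n, out_file)`.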
Any help is much appreciated....

Thanks
Amir

Nov 13 '05 #1
P: n/a
I may not fully understand your problem, but what else is new?

It sounds like you want to write five-byte "globs", since 40 is the least
common multiple of 8 and 5. Add pad bits as needed and some method of
identifying them, perhaps a four-byte "field count" as a prefix in the file
or an unused five-bit code.
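
To make the arithmetic and the "field count" idea concrete, here is a rough sketch. All the names (`packed_size`, `pad_bits`, `put_count`, `get_count`) are made up for illustration, and the big-endian layout of the four-byte count is an assumption; the thread does not specify one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers: names and header layout are assumptions. */

/* Bytes needed to hold ncodes 5-bit codes, rounded up to whole bytes. */
size_t packed_size(size_t ncodes)
{
    return (ncodes * 5 + 7) / 8;
}

/* Number of pad bits at the end of the packed data. */
size_t pad_bits(size_t ncodes)
{
    return packed_size(ncodes) * 8 - ncodes * 5;
}

/* Store the code count as 4 bytes, most significant byte first. */
void put_count(unsigned char hdr[4], unsigned long count)
{
    assert(count <= 0xFFFFFFFFul);  /* must fit in 32 bits */
    hdr[0] = (unsigned char)((count >> 24) & 0xFF);
    hdr[1] = (unsigned char)((count >> 16) & 0xFF);
    hdr[2] = (unsigned char)((count >> 8) & 0xFF);
    hdr[3] = (unsigned char)(count & 0xFF);
}

/* Read the count back from the 4-byte prefix. */
unsigned long get_count(const unsigned char hdr[4])
{
    return ((unsigned long)hdr[0] << 24) | ((unsigned long)hdr[1] << 16) |
           ((unsigned long)hdr[2] << 8) | (unsigned long)hdr[3];
}
```

With 8 codes the data fills exactly 5 bytes with no padding; with 4 codes (as in "AMIR") it takes 3 bytes with 4 pad bits. A decoder that reads the count first knows how many codes are real and can ignore the rest.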
Nov 13 '05 #2

P: n/a
Hi
So are you saying to create an array of 5 chars like

unsigned char glob[5];

then read in 8 characters at a time, shift them into the glob, and write
out the glob array? And if I run out of input while reading 8 chars at a
time, just pad the rest of the glob array with zeroes?

Is that what you are suggesting?

Thanks
Amir

Nov 13 '05 #3

P: n/a
Yes, something like that is what I had in mind.
Nov 13 '05 #4

P: n/a
HI

Thank you.....That seems to work just fine....:)

Amir

Nov 13 '05 #5

P: n/a
"zambak" <za****@comcast.net> wrote:
Thank you.....That seems to work just fine....:)


I'm glad you managed to get it working... it took me some
debugging to get all the shifts and masks correct.

This basically works, but the decoder doesn't know exactly where the
input ended, so you get a variable number (0 to 7) of extra 'a'
characters (code value zero) at the end of the decoded file.
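
One possible way to avoid the extra characters, assuming the number of codes is stored somewhere (for example in a count prefix, as suggested earlier in the thread), is to unpack only that many codes. This is a sketch and not part of the program below; `unpack5` is a hypothetical name, and it assumes the same most-significant-bit-first packing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: unpack up to ncodes 5-bit codes (MSB first) from
   nbytes of packed input. Returns the number of codes actually unpacked,
   which is less than ncodes only if the input runs out early. */
size_t unpack5(const unsigned char *in, size_t nbytes,
               unsigned char *codes, size_t ncodes)
{
    unsigned int acc = 0;  /* pending bits, newest byte in the low end */
    int nbits = 0;         /* how many bits of acc are still unread */
    size_t pos = 0, nout = 0;

    assert(in != NULL && codes != NULL);
    while (nout < ncodes) {
        if (nbits < 5) {
            if (pos == nbytes)
                break;  /* input exhausted */
            acc = (acc << 8) | in[pos++];
            nbits += 8;
        }
        /* the oldest 5 pending bits form the next code */
        codes[nout++] = (unsigned char)((acc >> (nbits - 5)) & 31u);
        nbits -= 5;
        acc &= (1u << nbits) - 1u;  /* drop bits already consumed */
    }
    return nout;
}
```

Given the three bytes 0x03 0x11 0x10 and a count of 4, this returns the codes 0, 12, 8, 17 and stops, so the 4 pad bits are never turned into a character.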

It loses capitalisation as there's not room in 5 bits to
store both uppercase and lowercase letters.

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>

const char *set = "abcdefghijklmnopqrstuvwxyz .,?!\n"; /* 32 chars */

/* Map a character to its 5-bit code (0..31); unencodable characters map to 0 */
unsigned char encodechar(unsigned char c)
{
    /* strchr would match the terminating '\0', so treat it as not found */
    const char *p = (c == '\0') ? NULL : strchr(set, tolower(c));
    if(p == NULL) /* Not in code set */
    {
        if(isprint(c))
            fprintf(stderr, "Error: Cannot encode character '%c'\n", (int)c);
        else
            fprintf(stderr, "Error: Cannot encode character %u\n", (unsigned int)c);
        return 0;
    }
    return (unsigned char)(p - set);
}

/* Map a 5-bit code (0..31) back to its character */
unsigned char decodechar(unsigned char c)
{
    if(c > 31)
    {
        fprintf(stderr, "Error: Got code %d\n", c);
        return 0;
    }
    return set[c];
}

/*
   Encode n characters (n <= 8)
   from inbuf (up to 8 characters)
   to outbuf (5 8-bit bytes)

   Bit packing:
   where '(' is most significant bit and
   ')' is least significant bit.
   from (---)(---)(---)(---)(---)(---)(---)(---)
   to   (------)(------)(------)(------)(------)

   Note: inbuf is overwritten with the 5-bit codes.
*/
void encodebuf(unsigned char *outbuf, unsigned char *inbuf, size_t n)
{
    memset(outbuf, 0, 5);
    for(size_t i = 0; i < n; i++) inbuf[i] = encodechar(inbuf[i]);
    if(n >= 1) outbuf[0] |= inbuf[0] << 3;
    if(n >= 2) outbuf[0] |= inbuf[1] >> 2, outbuf[1] |= (inbuf[1] & 3) << 6;
    if(n >= 3) outbuf[1] |= inbuf[2] << 1;
    if(n >= 4) outbuf[1] |= inbuf[3] >> 4, outbuf[2] |= (inbuf[3] & 15) << 4;
    if(n >= 5) outbuf[2] |= inbuf[4] >> 1, outbuf[3] |= (inbuf[4] & 1) << 7;
    if(n >= 6) outbuf[3] |= inbuf[5] << 2;
    if(n >= 7) outbuf[3] |= inbuf[6] >> 3, outbuf[4] |= (inbuf[6] & 7) << 5;
    if(n >= 8) outbuf[4] |= inbuf[7];
}

/* Decode 8 characters from inbuf (5 8-bit bytes) to 8-byte outbuf

   from (------)(------)(------)(------)(------)
   to   (---)(---)(---)(---)(---)(---)(---)(---)
*/
void decodebuf(unsigned char *outbuf, unsigned char *inbuf)
{
    memset(outbuf, 0, 8);
    outbuf[0] = decodechar( inbuf[0] >> 3);
    outbuf[1] = decodechar((inbuf[0] & 7) << 2 | inbuf[1] >> 6);
    outbuf[2] = decodechar((inbuf[1] >> 1) & 31);
    outbuf[3] = decodechar((inbuf[1] & 1) << 4 | inbuf[2] >> 4);
    outbuf[4] = decodechar((inbuf[2] & 15) << 1 | inbuf[3] >> 7);
    outbuf[5] = decodechar((inbuf[3] >> 2) & 31);
    outbuf[6] = decodechar((inbuf[3] & 3) << 3 | inbuf[4] >> 5);
    outbuf[7] = decodechar( inbuf[4] & 31);
}

/* Read up to 8 characters at a time and write one 5-byte glob per group */
void encode(FILE *fpin, FILE *fpout)
{
    unsigned char inbuf[8] = {0};
    unsigned char outbuf[5] = {0};
    size_t n;
    while((n = fread(inbuf, 1, 8, fpin)) != 0)
    {
        encodebuf(outbuf, inbuf, n);
        fwrite(outbuf, 1, 5, fpout);
    }
}

/* Read 5-byte globs and write 8 characters per glob */
void decode(FILE *fpin, FILE *fpout)
{
    unsigned char inbuf[5] = {0};
    unsigned char outbuf[8] = {0};
    size_t n;
    while((n = fread(inbuf, 1, 5, fpin)) != 0)
    {
        decodebuf(outbuf, inbuf);
        fwrite(outbuf, 1, 8, fpout);
    }
}

int main(int argc, char **argv)
{
    if(argc != 4)
    {
        fprintf(stderr, "Usage: requires three arguments:\n");
        fprintf(stderr, " encode or decode\n");
        fprintf(stderr, " input filename\n");
        fprintf(stderr, " output filename\n");
    }
    else
    {
        if(strcmp(argv[1], "encode") == 0)
        {
            FILE *fpin = fopen(argv[2], "r");
            if(fpin == NULL)
            {
                fprintf(stderr, "Error opening file %s for text reading\n", argv[2]);
                exit(EXIT_FAILURE);
            }
            FILE *fpout = fopen(argv[3], "wb");
            if(fpout == NULL)
            {
                fprintf(stderr, "Error opening file %s for binary writing\n", argv[3]);
                exit(EXIT_FAILURE);
            }
            encode(fpin, fpout);
            fclose(fpin);
            fclose(fpout);
        }
        else if(strcmp(argv[1], "decode") == 0)
        {
            FILE *fpin = fopen(argv[2], "rb");
            if(fpin == NULL)
            {
                fprintf(stderr, "Error opening file %s for binary reading\n", argv[2]);
                exit(EXIT_FAILURE);
            }
            FILE *fpout = fopen(argv[3], "w");
            if(fpout == NULL)
            {
                fprintf(stderr, "Error opening file %s for text writing\n", argv[3]);
                exit(EXIT_FAILURE);
            }
            decode(fpin, fpout);
            fclose(fpin);
            fclose(fpout);
        }
        else
        {
            fprintf(stderr, "Incorrect argument.\n");
        }
    }
    return 0;
}

--
Simon.
Nov 13 '05 #6
