Writing 5 bits at a time to a file

Hi,

I have an assignment for a somewhat odd compression algorithm that reads
characters from a file, converts each one to a 5-bit code, and then writes
out a compressed version of the original file. For example, if the input
file contains the string "AMIR" and the codes are A=0, M=12, I=8, R=17,
I am supposed to write out a 3-byte file.

The problem: how do I work out the shifting, given that I can only write out
whole bytes and not individual bits? I figured I would "create" a byte and
write it, but the codes will overlap byte boundaries.
In my example I would have something like this in binary:

00000011 00010001 00010000

where the first 5 bits are code 0, the next 5 bits are code 12, and so on,
with the last 4 bits being zero padding.
Sometimes one created byte is based on the codes of 3 different input
characters.
I have a loop in which I read one char at a time from the input file and
create an int code. Now I have to work out an algorithm that builds a byte
(a char) that I can write out to the output file:

// get one char at a time from input
while ((curr_chr = getc(in_file)) != EOF)
{
    // get code for it
    code = getCode(curr_chr);

    // shifting code ......
    // write created byte...
    fwrite(&wbyte, 1, 1, out_file);
}

Any help is much appreciated.

Thanks,
Amir
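
One way to fill in the "shifting code" step in the loop above is a small bit accumulator: push each 5-bit code into the low end of an integer and write a byte whenever at least 8 bits are pending. This is only a sketch. The names BitWriter, put_code and flush_bits are made up here, and it assumes codes are packed most significant bit first with the final byte zero padded, as in the "AMIR" example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper state: bits collected but not yet written. */
typedef struct {
    unsigned long acc;  /* pending bits, newest code in the low end */
    int nbits;          /* number of pending bits (0..7 between calls) */
} BitWriter;

/* Append one 5-bit code and write every complete byte to out. */
static void put_code(BitWriter *bw, unsigned code, FILE *out)
{
    assert(code < 32);
    bw->acc = (bw->acc << 5) | code;
    bw->nbits += 5;
    while (bw->nbits >= 8) {
        bw->nbits -= 8;
        putc((int)((bw->acc >> bw->nbits) & 0xFF), out);
    }
    bw->acc &= (1UL << bw->nbits) - 1;  /* keep only the unwritten bits */
}

/* Pad the last partial byte with zero bits and write it. */
static void flush_bits(BitWriter *bw, FILE *out)
{
    if (bw->nbits > 0) {
        putc((int)((bw->acc << (8 - bw->nbits)) & 0xFF), out);
        bw->acc = 0;
        bw->nbits = 0;
    }
}
```

In the loop, put_code(&bw, code, out_file) would replace the fwrite() call, with one flush_bits(&bw, out_file) after the loop. For the codes 0, 12, 8, 17 this writes the three bytes 0x03 0x11 0x10.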

Nov 13 '05 #1
I may not fully understand your problem, but what else is new?

It sounds like you want to write five-byte "globs", since 40 is the least
common multiple of 8 and 5: eight 5-bit codes fill exactly five bytes.
Add pad bits as needed, plus some method of identifying them,
perhaps a four-byte "field count" as a prefix in the file
or an unused five bit code.
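
The arithmetic behind the glob size can be checked with a few lines of C. This is an illustrative sketch only; the helper names (gcd, lcm, glob_count, pad_codes) are invented for it and are not part of anyone's posted code.

```c
#include <assert.h>
#include <stddef.h>

/* Greatest common divisor by Euclid's algorithm. */
static unsigned gcd(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Least common multiple of two positive integers. */
static unsigned lcm(unsigned a, unsigned b)
{
    assert(a > 0 && b > 0);
    return a / gcd(a, b) * b;
}

/* Number of 5-byte globs needed for ncodes 5-bit codes (8 codes per glob). */
static size_t glob_count(size_t ncodes)
{
    return (ncodes + 7) / 8;
}

/* Number of zero pad codes needed to fill the last glob (0..7). */
static size_t pad_codes(size_t ncodes)
{
    return (8 - ncodes % 8) % 8;
}
```

lcm(8, 5) is 40 bits, that is 8 codes in 5 bytes. A 4-character input such as "AMIR" therefore occupies one glob with 4 pad codes, which is why the pad needs to be identifiable when decoding.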
Nov 13 '05 #2
Hi,

So are you saying I should create an array of 5 chars like

unsigned char glob[5];

then read in 8 characters at a time, shift them into the glob, and then
write out the glob array? And if I run out of input while getting 8
chars at a time, just pad the rest of the glob array with zeroes?

Is that what you are suggesting?
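
A minimal sketch of that idea, assuming the codes are already in the range 0..31 and are packed most significant bit first: collect the 8 codes in a 64-bit integer and then split it into 5 bytes. The function names pack_glob and unpack_glob are hypothetical, and missing codes are padded with zero as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack up to 8 five-bit codes into a 5-byte glob; missing codes become 0. */
static void pack_glob(unsigned char glob[5], const unsigned char *codes, size_t n)
{
    uint64_t bits = 0;
    assert(n <= 8);
    for (size_t i = 0; i < 8; i++) {
        unsigned code = (i < n) ? codes[i] : 0;
        assert(code < 32);
        bits = (bits << 5) | code;      /* first code ends up in the top 5 of 40 bits */
    }
    for (int i = 0; i < 5; i++)
        glob[i] = (unsigned char)((bits >> (32 - 8 * i)) & 0xFF);
}

/* Inverse: split a 5-byte glob back into 8 five-bit codes. */
static void unpack_glob(unsigned char codes[8], const unsigned char glob[5])
{
    uint64_t bits = 0;
    for (int i = 0; i < 5; i++)
        bits = (bits << 8) | glob[i];
    for (int i = 0; i < 8; i++)
        codes[i] = (unsigned char)((bits >> (35 - 5 * i)) & 31);
}
```

Packing the four "AMIR" codes 0, 12, 8, 17 this way gives the glob 0x03 0x11 0x10 0x00 0x00. The loop form trades a little speed for not having to get eight separate shift-and-mask lines right by hand.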

Thanks
Amir

Nov 13 '05 #3
Yes, something like that is what I had in mind.
Nov 13 '05 #4
Hi,

Thank you. That seems to work just fine. :)

Amir
"osmium" <r1********@comcast.net> wrote in message
news:bp*************@ID-179017.news.uni-berlin.de...
Yes, something like that is what I had in mind.

Nov 13 '05 #5
"zambak" <za****@comcast.net> wrote:
Thank you.....That seems to work just fine....:)


I'm glad you managed to get it working... it took me some
debugging to get all the shifts and masks correct.

This basically works, but the decoder doesn't know exactly where the
original input ended, so you get a variable number (0 to 7) of extra
'a' characters (code value zero) at the end of the decoded
file.

It loses capitalisation as there's not room in 5 bits to
store both uppercase and lowercase letters.
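
The trailing 'a' characters mentioned above could be avoided with the four-byte count prefix suggested earlier in the thread. What follows is a sketch under assumptions, not part of the program below: put_count, get_count and take_from_glob are hypothetical names, and the count is stored most significant byte first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a character count as four bytes, most significant byte first. */
static void put_count(unsigned char out[4], uint32_t n)
{
    out[0] = (unsigned char)(n >> 24);
    out[1] = (unsigned char)(n >> 16);
    out[2] = (unsigned char)(n >> 8);
    out[3] = (unsigned char)n;
}

/* Read back a count written by put_count(). */
static uint32_t get_count(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Return how many of the 8 characters decoded from the next glob are
   real (not padding) and reduce the remaining count accordingly. */
static size_t take_from_glob(uint32_t *remaining)
{
    size_t n;
    assert(remaining != NULL);
    n = (*remaining < 8) ? *remaining : 8;
    *remaining -= (uint32_t)n;
    return n;
}
```

The encoder would need to know the input length before writing the prefix (or seek back and fill it in afterwards), and the decoder would write take_from_glob(&remaining) characters per glob instead of always 8.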

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>

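const char *set = "abcdefghijklmnopqrstuvwxyz .,?!\n"; /* 32 chars */

/* Map a character to its 5-bit code (0..31); unknown characters become 0 */
unsigned char encodechar(unsigned char c)
{
    /* strchr() also matches the terminating '\0' (index 32), so reject it */
    const char *p = (c == '\0') ? NULL : strchr(set, tolower(c));
    if(p == NULL) /* Not in code set */
    {
        if(isprint(c))
            fprintf(stderr, "Error: Cannot encode character '%c'\n", (int)c);
        else
            fprintf(stderr, "Error: Cannot encode character %u\n", (unsigned int)c);
        return 0;
    }
    return (unsigned char)(p - set);
}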

unsigned char decodechar(unsigned char c)
{
    if(c > 31)
    {
        fprintf(stderr, "Error: Got code %d\n", c);
        return 0;
    }
    return set[c];
}

/*
   Encode n characters (n <= 8)
   from inbuf (up to 8 characters)
   to outbuf (5 8-bit bytes)

   Bit packing:
   where '(' is most significant bit and
   ')' is least significant bit.
   from (---)(---)(---)(---)(---)(---)(---)(---)
   to   (------)(------)(------)(------)(------)
*/
void encodebuf(unsigned char *outbuf, unsigned char *inbuf, size_t n)
{
    memset(outbuf, 0, 5);
    for(size_t i = 0; i < n; i++) inbuf[i] = encodechar(inbuf[i]);
    if(n >= 1) outbuf[0] |= inbuf[0] << 3;
    if(n >= 2) outbuf[0] |= inbuf[1] >> 2, outbuf[1] |= (inbuf[1] & 3) << 6;
    if(n >= 3) outbuf[1] |= inbuf[2] << 1;
    if(n >= 4) outbuf[1] |= inbuf[3] >> 4, outbuf[2] |= (inbuf[3] & 15) << 4;
    if(n >= 5) outbuf[2] |= inbuf[4] >> 1, outbuf[3] |= (inbuf[4] & 1) << 7;
    if(n >= 6) outbuf[3] |= inbuf[5] << 2;
    if(n >= 7) outbuf[3] |= inbuf[6] >> 3, outbuf[4] |= (inbuf[6] & 7) << 5;
    if(n >= 8) outbuf[4] |= inbuf[7];
}

/* Decode 8 characters from inbuf (5 8-bit bytes) to 8-byte outbuf

   from (------)(------)(------)(------)(------)
   to   (---)(---)(---)(---)(---)(---)(---)(---)
*/
void decodebuf(unsigned char *outbuf, unsigned char *inbuf)
{
    memset(outbuf, 0, 8);
    outbuf[0] = decodechar( inbuf[0] >> 3);
    outbuf[1] = decodechar((inbuf[0] & 7) << 2 | inbuf[1] >> 6);
    outbuf[2] = decodechar((inbuf[1] >> 1) & 31);
    outbuf[3] = decodechar((inbuf[1] & 1) << 4 | inbuf[2] >> 4);
    outbuf[4] = decodechar((inbuf[2] & 15) << 1 | inbuf[3] >> 7);
    outbuf[5] = decodechar((inbuf[3] >> 2) & 31);
    outbuf[6] = decodechar((inbuf[3] & 3) << 3 | inbuf[4] >> 5);
    outbuf[7] = decodechar( inbuf[4] & 31);
}

void encode(FILE *fpin, FILE *fpout)
{
    unsigned char inbuf[8] = {0};
    unsigned char outbuf[5] = {0};
    size_t n;
    while((n = fread(inbuf, 1, 8, fpin)) != 0)
    {
        encodebuf(outbuf, inbuf, n);
        fwrite(outbuf, 1, 5, fpout);
    }
}

void decode(FILE *fpin, FILE *fpout)
{
    unsigned char inbuf[5] = {0};
    unsigned char outbuf[8] = {0};
    size_t n;
    while((n = fread(inbuf, 1, 5, fpin)) != 0)
    {
        /* A truncated final block would otherwise reuse stale bytes */
        if(n < 5) memset(inbuf + n, 0, 5 - n);
        decodebuf(outbuf, inbuf);
        fwrite(outbuf, 1, 8, fpout);
    }
}

int main(int argc, char **argv)
{
    if(argc != 4)
    {
        fprintf(stderr, "Usage: requires three arguments:\n");
        fprintf(stderr, " encode or decode\n");
        fprintf(stderr, " input filename\n");
        fprintf(stderr, " output filename\n");
    }
    else
    {
        if(strcmp(argv[1], "encode") == 0)
        {
            FILE *fpin = fopen(argv[2], "r");
            if(fpin == NULL)
            {
                fprintf(stderr, "Error opening file %s for text reading\n", argv[2]);
                exit(EXIT_FAILURE);
            }
            FILE *fpout = fopen(argv[3], "wb");
            if(fpout == NULL)
            {
                fprintf(stderr, "Error opening file %s for binary writing\n", argv[3]);
                exit(EXIT_FAILURE);
            }
            encode(fpin, fpout);
            fclose(fpin);
            fclose(fpout);
        }
        else if(strcmp(argv[1], "decode") == 0)
        {
            FILE *fpin = fopen(argv[2], "rb");
            if(fpin == NULL)
            {
                fprintf(stderr, "Error opening file %s for binary reading\n", argv[2]);
                exit(EXIT_FAILURE);
            }
            FILE *fpout = fopen(argv[3], "w");
            if(fpout == NULL)
            {
                fprintf(stderr, "Error opening file %s for text writing\n", argv[3]);
                exit(EXIT_FAILURE);
            }
            decode(fpin, fpout);
            fclose(fpin);
            fclose(fpout);
        }
        else
        {
            fprintf(stderr, "Incorrect argument.\n");
        }
    }
    return 0;
}

--
Simon.
Nov 13 '05 #6
