Is there a standard way to write an 8 bit byte?

Hi,

I'm writing a simple image file exporter as part of a school project,
and I would like to write completely platform-independent code if at all
possible. The problem I've run into is that apparently there's no
guarantee that a particular implementation will have a data type that is
exactly 8 bits wide; char is only guaranteed to be at least 8 bits.
Correct me if I'm wrong, but I assume that if CHAR_BIT happens to be,
say, 9, writing a char into a file in binary mode would result in 9 bits
being written.

At first it seemed that stdint.h and its uint8_t were the answer, but
then I noticed that uint8_t is only guaranteed to exist if the platform
has a data type that is exactly 8 bits wide, which put me back in square
one.

Is there a way around this, or will I just have to use uint8_t and
accept that my code might not work on every conceivable platform that
has an ANSI C compiler?
Kristian
Nov 14 '05 #1
In article <ZW***************@reader1.news.jippii.net>,
Kristian Nybo <kr******@begone.spam.iki.fi> wrote:
I'm writing a simple image file exporter as part of a school project,
and I would like to write completely platform-independent code if at all
possible.
PBM -- Portable BitMap (monochrome)
PNG -- Portable Network Graphics (colour, lossless compression)
The problem I've run into is that apparently there's no
guarantee that a particular implementation will have a data type that is
exactly 8 bits wide; char is only guaranteed to be at least 8 bits.
Correct me if I'm wrong, but I assume that if CHAR_BIT happens to be,
say, 9, writing a char into a file in binary mode would result in 9 bits
being written.


Yes -- but what happens when you take that file to an 8-bit machine
is undefined. Very few filesystems (none I -know- of) store files
as bitstreams: they are byte or word streams. Ethernet is bitstream,
but most well-known file-transfer protocols involve 8-bit characters.
--
Ceci, ce n'est pas une idée.
Nov 14 '05 #2

"Kristian Nybo" <kr******@begone.spam.iki.fi> wrote
I'm writing a simple image file exporter as part of a school project, and
I would like to write completely platform-independent code if at all
possible. The problem I've run into is that apparently there's no
guarantee that a particular implementation will have a data type that is
exactly 8 bits wide; char is only guaranteed to be at least 8 bits.
Correct me if I'm wrong, but I assume that if CHAR_BIT happens to be, say,
9, writing a char into a file in binary mode would result in 9 bits being
written.

Just use fgetc and fputc to read and write integers between 0 and 255.

There is no cast-iron guarantee in the C standard that this will produce the
file you want if, say, chars are 9 bits wide. However, in practice there will
always be an easy way to convert such files so that they can be read by
8-bit machines (maybe by running them through a converter program).
Computers exist in the real world and have to communicate with each other.
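
A minimal sketch of the fputc suggestion, assuming the stream was opened in binary mode ("wb"). The function name write_octets is hypothetical, and the mask only makes a difference on a platform where CHAR_BIT is greater than 8:

```c
#include <stdio.h>

/* Write n values to a binary stream, one fputc call per value.
   Returns 0 on success, -1 on a write error. */
int write_octets(FILE *fp, const unsigned char *data, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        /* Keep only the low 8 bits, so nothing above 255 is ever written
           even if unsigned char is wider than 8 bits on this platform. */
        if (fputc(data[i] & 0xFF, fp) == EOF)
            return -1;
    }
    return 0;
}
```

What the platform does with each written character is still up to the implementation, as noted above; the sketch only guarantees that every value handed to fputc fits in an octet.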
Nov 14 '05 #3
Walter Roberson wrote:
I'm writing a simple image file exporter as part of a school project,
and I would like to write completely platform-independent code if at all
possible.

PBM -- Portable BitMap (monochrome)
PNG -- Portable Network Graphics (colour, lossless compression)


I'm actually using PNG.
The problem I've run into is that apparently there's no
guarantee that a particular implementation will have a data type that is
exactly 8 bits wide; char is only guaranteed to be at least 8 bits.
Correct me if I'm wrong, but I assume that if CHAR_BIT happens to be,
say, 9, writing a char into a file in binary mode would result in 9 bits
being written.

Yes -- but what happens when you take that file to an 8-bit machine
is undefined.


That's exactly what I'm worried about.
Very few filesystems (none I -know- of) store files
as bitstreams: they are byte or word streams. Ethernet is bitstream,
but most well-known file-transfer protocols involve 8-bit characters.


Am I correct in deducing that if the file system stores files as byte
streams, it doesn't matter how many bits wide char is as long as one
doesn't try to write bytes with values that don't fit in an 8-bit byte?

Could there, as far as the standard is concerned, be a platform that
uses 9-bit chars and stores files as bitstreams? :)
Kristian
Nov 14 '05 #4
Kristian Nybo wrote:
I'm writing a simple image file exporter as part of a school project,
and I would like to write completely platform-independent code if at all
possible.
A laudable aim, and people here will be happy to help with anything that
can be done in standard C.
The problem I've run into is that apparently there's no
guarantee that a particular implementation will have a data type that is
exactly 8 bits wide; char is only guaranteed to be at least 8 bits.
Correct.
Correct me if I'm wrong, but I assume that if CHAR_BIT happens to be,
say, 9, writing a char into a file in binary mode would result in 9 bits
being written.
In the simple case, yes. However, if the file is on a file system with 8
bit bytes it might have to do something very strange. Think about what
could happen if the file was on a network share from a Windows box.
At first it seemed that stdint.h and its uint8_t were the answer, but
then I noticed that uint8_t is only guaranteed to exist if the platform
has a data type that is exactly 8 bits wide, which put me back in square
one.
Correct.
Is there a way around this, or will I just have to use uint8_t and
accept that my code might not work on every conceivable platform that
has an ANSI C compiler?


You should try to isolate, as much as possible, the code that depends on
having a type exactly 8 bits wide, and use uint8_t for that, possibly even
putting it in a separate file.

Where you need at least 8 bits, but do not actually care how many bits,
you could use uint_least8_t which will be 8 bits on systems that support
it, but larger where not supported.
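
As a rough illustration of the "at least 8 bits" case, assuming a C99 compiler with stdint.h: the helper names to_octet and octet_add are hypothetical. Because uint_least8_t may be wider than 8 bits, the code masks explicitly instead of relying on the type to wrap:

```c
#include <stdint.h>

/* Reduce a value to the range 0..255 and store it in a uint_least8_t. */
uint_least8_t to_octet(unsigned int value)
{
    return (uint_least8_t)(value & 0xFFu);
}

/* Add two octets with 8-bit wraparound, whatever the real width of the type. */
uint_least8_t octet_add(uint_least8_t a, uint_least8_t b)
{
    return (uint_least8_t)((a + b) & 0xFFu);
}
```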

You should also understand that stdint.h was added by the C99 standard,
which is not supported by most compilers. However, it is easy enough to
implement what you need of it on any given system, so I would still use
its types and define them yourself for any target that does not have stdint.h.
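
One possible shape for such a fallback, as a sketch only: the names octet_t and OCTET_IS_EXACT are hypothetical project-local names, chosen so they cannot clash with a real stdint.h. It relies on the fact that unsigned char always has at least 8 bits:

```c
#include <limits.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdint.h>
typedef uint_least8_t octet_t;   /* C99 or later: use the standard type */
#else
typedef unsigned char octet_t;   /* C89: unsigned char has at least 8 bits */
#endif

/* 1 if unsigned char is exactly 8 bits wide on this target, 0 otherwise. */
#define OCTET_IS_EXACT (CHAR_BIT == 8)
```

Code that truly needs an exactly 8-bit type can then test OCTET_IS_EXACT in one place instead of assuming it everywhere.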
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.
Nov 14 '05 #5
Malcolm wrote:
"Kristian Nybo" <kr******@begone.spam.iki.fi> wrote
I'm writing a simple image file exporter as part of a school project, and
I would like to write completely platform-independent code if at all
possible. The problem I've run into is that apparently there's no
guarantee that a particular implementation will have a data type that is
exactly 8 bits wide; char is only guaranteed to be at least 8 bits.
Correct me if I'm wrong, but I assume that if CHAR_BIT happens to be, say,
9, writing a char into a file in binary mode would result in 9 bits being
written.


Just use fgetc and fputc to read and write integers between 0 and 255.

There is no cast-iron guarantee in the C standard that this will produce the
file you want if, say, chars are 9 bits wide. However, in practice there will
always be an easy way to convert such files so that they can be read by
8-bit machines (maybe by running them through a converter program).
Computers exist in the real world and have to communicate with each other.


Ah, I see. Although there may exist a platform where that code doesn't
result in a file composed of eight bit bytes, all the other files native
to such a platform would share this incompatibility with 8-bit machines,
so the platform would either be completely isolated, in which case
there's no problem, or it would have to have a way of converting its
files into files that *are* composed of eight bit bytes. Great.
Kristian
Nov 14 '05 #6
Kristian Nybo wrote on 02/05/05 :
I'm writing a simple image file exporter as part of a school project, and I
would like to write completely platform-independent code if at all possible.


A data stream is made of octets. The C view of an octet is unsigned char,
because its internal representation is 8 bits or more. It's a C issue, not a
stream issue.

IOW, don't worry. As long as you use a sequence of bytes (unsigned
char), you are portable. Just write a document that specifies how the
data are encoded and it's fine.
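
A sketch of what "specify the encoding" can look like in code, assuming the documented format stores a 32-bit unsigned integer as four octets, most significant first (the order PNG uses for its multi-byte integers). The function names are hypothetical. Because the value is taken apart with shifts and masks, the result does not depend on the host's byte order or on the width of unsigned long:

```c
#include <stdio.h>

/* Write the low 32 bits of v as four octets, most significant first.
   Returns 0 on success, -1 on a write error. */
int write_u32_be(FILE *fp, unsigned long v)
{
    int shift;

    for (shift = 24; shift >= 0; shift -= 8) {
        if (fputc((int)((v >> shift) & 0xFFul), fp) == EOF)
            return -1;
    }
    return 0;
}

/* Read four octets, most significant first, into *out.
   Returns 0 on success, -1 on end of file or a read error. */
int read_u32_be(FILE *fp, unsigned long *out)
{
    unsigned long v = 0;
    int i, c;

    for (i = 0; i < 4; i++) {
        if ((c = fgetc(fp)) == EOF)
            return -1;
        v = (v << 8) | ((unsigned long)c & 0xFFul);
    }
    *out = v;
    return 0;
}
```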

--
Emmanuel
The C-FAQ: http://www.eskimo.com/~scs/C-faq/faq.html
The C-library: http://www.dinkumware.com/refxc.html

"C is a sharp tool"

Nov 14 '05 #7
In article <mn***********************@YOURBRAnoos.fr>,
Emmanuel Delahaye <em***@YOURBRAnoos.fr> wrote:
A data stream is made of octets. The C-vision is unsigned char because
the internal representation is 8 bit or more. It's a C issue, not a
stream issue.


Unless you are using a very specific meaning of 'data stream' that is
not apparent from the context (C's definition of stream), then
you are incorrect when you say that a data stream is made of octets.

C89 defines text streams and binary streams, and defines both
of them in terms of characters (section 4.9.2), *not* in terms of
octets.

I have used systems with 9 bit bytes. An "int" written on those
systems was 4 bytes wide, for a total of 36 bits. If, as you say,
streams are made of octets, then what is the standard C mechanism
for dealing with the extra 4 bits that would be left over after
reading 4 octets? Those last 4 bits in the file aren't enough to fill
an octet, and they aren't enough to fill a "char" (9 bits) on that
system either. Please point us to the section of the C standard that
tells us whether those extra bits will be left or right justified
in the char when you read that last octet.
--
I was very young in those days, but I was also rather dim.
-- Christopher Priest
Nov 14 '05 #8

"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:d5**********@canopus.cc.umanitoba.ca...
In article <mn***********************@YOURBRAnoos.fr>,
Emmanuel Delahaye <em***@YOURBRAnoos.fr> wrote:
A data stream is made of octets. The C-vision is unsigned char because
the internal representation is 8 bit or more. It's a C issue, not a
stream issue.


Unless you are using a very specific meaning of 'data stream' that is
not apparent from the context (C's definition of stream), then
you are incorrect when you say that a data stream is made of octets.

C89 defines text streams and binary streams, and defines both
of them in terms of characters (section 4.9.2), *not* in terms of
octets.

I have used systems with 9 bit bytes. An "int" written on those
systems was 4 bytes wide, for a total of 36 bits. If, as you say,
streams are made of octets, then what is the standard C mechanism
for dealing with the extra 4 bits that would be left over after
reading 4 octets? Those last 4 bits in the file aren't enough to fill
an octet, and they aren't enough to fill a "char" (9 bits) on that
system either. Please point us to the section of the C standard that
tells us whether those extra bits will be left or right justified
in the char when you read that last octet.
--
I was very young in those days, but I was also rather dim.
-- Christopher Priest


I have written OS software for some of the most obscure of processors and
actual systems. Could you substantiate this?
Nov 14 '05 #9
In article <11*************@corp.supernews.com>,
BGreene <ba****@highstream.net> wrote:
"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:d5**********@canopus.cc.umanitoba.ca...
I have used systems with 9 bit bytes. An "int" written on those
systems was 4 bytes wide, for a total of 36 bits.

I have written OS software for some of the most obscure of processors and
actual systems. Could you substantiate this?


It is not clear which part you were asking me to substantiate.

If it was the part about streams being in terms of characters,
then the substantiation is C89 section 4.9.2.

If it was the part about systems with 9 bit bytes, then there
were several. The particular ones I used were the
Xerox 9 and the Honeywell L66. There was also the Xerox 5, Xerox 7,
Honeywell L6, PDP-6, and PDP-10, and machines by DEC and IBM
(you may also see references to machines by Sigma: Sigma was a model
line of SDS, which was bought out by Xerox.)

Reference:

http://www.linique.com/dlm/tenex/hbook.html

--
"I want to make sure [a user] can't get through ... an online
experience without hitting a Microsoft ad"
-- Steve Ballmer [Microsoft Chief Executive]
Nov 14 '05 #10
Walter Roberson wrote:
In article <11*************@corp.supernews.com>,
BGreene <ba****@highstream.net> wrote:

"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:d5**********@canopus.cc.umanitoba.ca...


I have used systems with 9 bit bytes. An "int" written on those
systems was 4 bytes wide, for a total of 36 bits.


I have written OS software for some of the most obscure of processors and
actual systems. Could you substantiate this?

It is not clear which part you were asking me to substantiate.

If it was the part about streams being in terms of characters,
then the substantiation is C89 section 4.9.2.

If it was the part about systems with 9 bit bytes, then there
were several. The particular ones I used were the
Xerox 9 and the Honeywell L66. There was also the Xerox 5, Xerox 7,
Honeywell L6, PDP-6, and PDP-10, and machines by DEC and IBM
(you may also see references to machines by Sigma: Sigma was a model
line of SDS, which was bought out by Xerox.)


The SDS Sigma 7 I used in the Xerox-era was called a Xerox Sigma 7. On
the GE645 (GE635 modified for Multics) we used 9-bit bytes also, and
that was probably the case for GE635s.

On the PDP-6, PDP-10, and PDP-20, where a byte was addressed through a byte
pointer incorporating the byte size (from 1 to 36), the most common
sizes were six 6-bit bytes/word and five 7-bit bytes/word + 1 bit
available for other things, although four 8-bit bytes/word + 4 flag bits
and four 9-bit bytes/word were often seen. When transferring from tapes
written for other machines to the PDP-6 or PDP-10, I often used 4-bit
byte pointers for the incoming stream or 8-bit byte pointers when it was
known that I would not run into half-byte problems with 7-bit or 9-bit
byte pointers on the outgoing stream.
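
The word-packing arithmetic above can be checked with a few lines of C. This is only an illustration with hypothetical function names, and it assumes byte_bits is nonzero:

```c
/* How many bytes of byte_bits bits fit in one word of word_bits bits. */
unsigned bytes_per_word(unsigned word_bits, unsigned byte_bits)
{
    return word_bits / byte_bits;
}

/* Bits of the word left unused after packing whole bytes into it. */
unsigned spare_bits(unsigned word_bits, unsigned byte_bits)
{
    return word_bits % byte_bits;
}
```

For a 36-bit word this gives six 6-bit bytes with no spare bits, five 7-bit bytes with 1 spare bit, four 8-bit bytes with 4 spare bits, and four 9-bit bytes with no spare bits. The 4 spare bits in the 8-bit case are the same leftover bits raised earlier in the thread for a 36-bit int read as octets.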
Nov 14 '05 #11
