Bytes IT Community

end-of-file problem

P: n/a
Hello,

I read a simple bmp-file with this loop:

while ( !feof(fp) ) {
printf("%x\n", fgetc(fp));
}
fclose(fp);

Everything seems to be correct, but at the end of the file I get a weird
"ffffffff" output. Why does this happen? I would have expected the loop to
finish before printf can print any more data. Or when is the file
pointer advanced in this case?
Thanks,
Markus
Aug 20 '06 #1
29 Replies


P: n/a
boa
* Markus Pitha wrote, On 20.08.2006 15:01:
Hello,

I read a simple bmp-file with this loop:

while ( !feof(fp) ) {
printf("%x\n", fgetc(fp));
}
fclose(fp);

Everything seems to be correct,
It isn't; see the FAQ for details: http://c-faq.com/stdio/feof.html
but at the end of the file, I get a weird
"ffffffff" output. How does this come? Actually the loop must have been
finished before printf can print some data, or when will the file
pointer increased in this case?

ffffffff happens to be the same as -1 on most machines. EOF also happens
to be -1 on most machines. So you're printing the value of EOF, returned
from fgetc(). On the next iteration, feof() returns true and your loop
stops.
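
The extra iteration can be reproduced with a small sketch. The helper names below are made up for illustration, and the temporary file from tmpfile() stands in for the bmp-file. On a typical machine with a 32-bit int and EOF defined as -1, printing that last value with "%x" gives ffffffff.

```c
#include <stdio.h>

/* Runs a feof()-controlled loop over fp and returns how many times the
   loop body executed. *last_value receives the result of the final fgetc(). */
int count_feof_loop_iterations(FILE *fp, int *last_value)
{
    int iterations = 0;

    while (!feof(fp)) {
        *last_value = fgetc(fp);  /* the final call fails and returns EOF */
        iterations++;
    }
    return iterations;
}

/* Builds a temporary file holding three bytes and runs the loop over it.
   Returns the iteration count, or -1 if no temporary file could be created. */
int demo_extra_iteration(int *last_value)
{
    FILE *fp = tmpfile();
    int iterations;

    if (fp == NULL)
        return -1;
    fputs("abc", fp);
    rewind(fp);
    iterations = count_feof_loop_iterations(fp, last_value);
    fclose(fp);
    return iterations;
}
```

For a three-byte file the body runs four times: the end-of-file indicator is only set once a read has actually failed, so feof() is still false after the third byte has been read.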

boa

Thanks,
Markus
Aug 20 '06 #2

P: n/a
Hello,

thanks for the link, but I couldn't get it to work the way the example
does. I got warnings about comparing pointers with NULL, and the program
then misbehaved, for example with endless loops.
Now I decided to use the following construct:

fseek(fp, 0, SEEK_END);
filesize = ftell(fp);
rewind(fp);

while (ftell(fp) < filesize) {
printf("%x\n", fgetc(fp));
}
fclose(fp);
ffffffff happens to be the same as -1 on most machines. EOF also happens
to be -1 on most machines. So you're printing the value of EOF, returned
from fgetc(). On the next iteration, feof() returns true and your loop
stops.
Thanks, that sounds plausible to me.
Aug 20 '06 #3

P: n/a
boa
* Markus Pitha wrote, On 20.08.2006 16:36:
Hello,

thanks for the link but I couldn't handle it as in this example. I
got warnings of comparing pointers with NULL and therefore a bad program
behaviour like endless loops.
Now I decided to use the following construct:

fseek(fp, 0, SEEK_END);
filesize = ftell(fp);
rewind(fp);

while (ftell(fp) < filesize) {
printf("%x\n", fgetc(fp));
}
fclose(fp);

Please don't.

How about something like this instead?
int c;

while( (c = fgetc(fp)) != EOF)
printf("%x\n", c);
Boa

[snip]
Aug 20 '06 #4

P: n/a
boa wrote:
while( (c = fgetc(fp)) != EOF)
printf("%x\n", c);
I thought this is only allowed with text files because of the fact that
EOF is defined as -1 and binary files could contain "-1"?
Aug 20 '06 #5

P: n/a
Markus Pitha wrote:
boa wrote:
while( (c = fgetc(fp)) != EOF)
printf("%x\n", c);

I thought this is only allowed with text files because of the fact that
EOF is defined as -1 and binary files could contain "-1"?
No, it is allowed with all files. EOF will be defined in a
way that the value cannot appear in a file on your platform.

Aug 20 '06 #6

P: n/a
Markus Pitha wrote:
boa wrote:
> while( (c = fgetc(fp)) != EOF)
printf("%x\n", c);

I thought this is only allowed with text files because of the fact that
EOF is defined as -1 and binary files could contain "-1"?
fgetc() always returns either an unsigned char converted to int, or EOF. As
long as c is wide enough to hold both EOF and 0 ... UCHAR_MAX, for example
with c declared as int as it was in the message you replied to, there's no
problem. With c declared as char, there would be a problem.
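
A minimal sketch of that point, using made-up helper names and a temporary file that contains the single byte 0xFF. The conversion to signed char in the second function is implementation-defined; the result of -1 is an assumption that holds on typical two's complement machines where EOF is also -1.

```c
#include <stdio.h>

/* Writes the single byte 0xFF to a temporary file, reads it back with
   fgetc() and stores the result in *out. Returns 0 on success, 1 if the
   temporary file could not be created. */
int read_ff_byte(int *out)
{
    FILE *fp = tmpfile();

    if (fp == NULL)
        return 1;
    fputc(0xFF, fp);
    rewind(fp);
    *out = fgetc(fp);   /* an unsigned char converted to int, so 255 here */
    fclose(fp);
    return 0;
}

/* Returns nonzero if the byte 0xFF, once squeezed into a signed char,
   compares equal to EOF. That is the problem with declaring c as char. */
int ff_in_signed_char_equals_eof(void)
{
    signed char narrow = (signed char)0xFF;  /* typically -1 */

    return narrow == EOF;
}
```

With c declared as int, the data byte 0xFF arrives as 255 and the loop keeps going. With a signed char it typically compares equal to EOF and the loop stops early.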
Aug 20 '06 #7

P: n/a
Markus Pitha wrote:
I read a simple bmp-file with this loop:

while ( !feof(fp) ) {
printf("%x\n", fgetc(fp));
}
fclose(fp);

Everything seems to be correct, but at the end of the file, I get
a weird "ffffffff" output. How does this come? Actually the loop
must have been finished before printf can print some data, or when
will the file pointer increased in this case?
Nothing weird about it. It is simple misuse of the feof function.

int ch;

while (EOF != (ch = getc(fp))) {
printf("%x\n", fgetc(fp));
}
if (!feof(fp)) puts("Hardware error occurred");

feof does not look forward, as in better languages. It
distinguishes between i/o errors and reaching EOF when an input
statement fails. To distinguish between EOF and all normal chars
it is necessary to receive those chars into an int, rather than a
char.

getc is normally preferable to fgetc when the file argument can be
evaluated more than once.
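
A hedged sketch of the whole pattern as one function. The name dump_hex and the return convention are assumptions made for this example, not something from the posts above.

```c
#include <stdio.h>

/* Prints every byte of fp in hex to out, one value per line.
   Returns the number of bytes printed, or -1 if the loop ended because
   of a read error rather than end-of-file. */
long dump_hex(FILE *fp, FILE *out)
{
    long count = 0;
    int ch;                 /* int, so EOF stays distinct from 0..UCHAR_MAX */

    while ((ch = getc(fp)) != EOF) {
        fprintf(out, "%x\n", (unsigned)ch);
        count++;
    }
    if (ferror(fp))         /* getc() failed for a reason other than end-of-file */
        return -1;
    return count;           /* at this point feof(fp) is true */
}
```

The loop is driven by the return value of getc(); feof() and ferror() are only consulted afterwards, to find out why the read failed.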

--
Chuck F (cb********@yahoo.com) (cb********@maineline.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE maineline address!
Aug 20 '06 #8

P: n/a
Harald van Dijk wrote:
fgetc() always returns either an unsigned char converted to int, or EOF. As
long as c is wide enough to hold both EOF and 0 ... UCHAR_MAX, for example
with c declared as int as it was in the message you replied to, there's no
problem. With c declared as char, there would be a problem.
I understand, thanks to everybody.
Markus
Aug 20 '06 #9

P: n/a
CBFalconer wrote:
Markus Pitha wrote:
>I read a simple bmp-file with this loop:

while ( !feof(fp) ) {
printf("%x\n", fgetc(fp));
}
fclose(fp);

Everything seems to be correct, but at the end of the file, I get
a weird "ffffffff" output. How does this come? Actually the loop
must have been finished before printf can print some data, or when
will the file pointer increased in this case?

Nothing weird about it. It is simple misuse of the feof function.

int ch;

while (EOF != (ch = getc(fp))) {
printf("%x\n", fgetc(fp));
Did you really mean to read another character and print it, thereby
printing every other character? I think you meant:
printf("%x\n", ch);
}
if (!feof(fp)) puts("Hardware error occurred");

feof does not look forward, as in better languages. It
distinguishes between i/o errors and reaching EOF when an input
statement fails. To distinguish between EOF and all normal chars
it is necessary to receive those chars into an int, rather than a
char.
Agreed.
getc is normally preferable to fgetc when the file argument can be
evaluated more than once.
I would say that it is extremely rare to need fgetc rather than getc. I
understand the code constructs which would require it, e.g.
fgetc(fparr[i++]), I just can't think of anywhere where they would have
been useful to me. Has anyone here actually done something where they
had to use fgetc rather than getc?
--
Flash Gordon
Still sigless on this computer.
Aug 20 '06 #10

P: n/a
Markus Pitha <ng******@pithax.net> writes:
boa wrote:
> while( (c = fgetc(fp)) != EOF)
printf("%x\n", c);

I thought this is only allowed with text files because of the fact that
EOF is defined as -1 and binary files could contain "-1"?
comp.lang.c FAQ, <http://www.c-faq.com/>, question 12.1.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Aug 20 '06 #11

P: n/a

Markus Pitha wrote:
Hello,

thanks for the link but I couldn't handle it as in this example. I
got warnings of comparing pointers with NULL and therefore a bad program
behaviour like endless loops.
Now I decided to use the following construct:

fseek(fp, 0, SEEK_END);
filesize = ftell(fp);
rewind(fp);
This is slightly OT, but this particular technique is
really annoying. By doing this, you make it impossible
to run your program on anything but a regular file.
If you try to execute the code on a fifo or a stream
the seek will fail. You almost never need to know
the size of the file before you start processing. If
you do need to know it, you should find its size
using whatever mechanism is provided by your
platform (eg fstat()). Granted, if you do need to
know the file size, then it probably doesn't make
sense to run the program on a fifo or a stream,
but it's a really good idea to expect your program
to be modified at a later time to be used on an
input stream. For this example, (basically an
implementation of xxd), it is very easy to see someone
wanting to run it on a fifo and watching the output
as it progresses.
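
As a rough sketch of the alternative, the function below processes a stream chunk by chunk and never asks for its size up front, so it should behave the same on a regular file, a pipe or a fifo. The name dump_stream, the buffer size and the return convention are assumptions made for this example.

```c
#include <stdio.h>

/* Hex-dumps a stream in fixed-size chunks without seeking.
   Returns 0 on success and stores the total byte count in *total;
   returns -1 if reading stopped because of an error. */
int dump_stream(FILE *in, FILE *out, unsigned long *total)
{
    unsigned char buf[4096];
    size_t got, i;

    *total = 0;
    while ((got = fread(buf, 1, sizeof buf, in)) > 0) {
        for (i = 0; i < got; i++)
            fprintf(out, "%x\n", (unsigned)buf[i]);
        *total += got;      /* the size is known only once the stream ends */
    }
    return ferror(in) ? -1 : 0;
}
```

Called as dump_stream(stdin, stdout, &n), this can sit at the end of a pipeline, which the fseek()/ftell() version cannot.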

--
Bill Pursell

Aug 20 '06 #12

P: n/a
On Sun, 20 Aug 2006 15:36:38 +0000, in comp.lang.c , Markus Pitha
<ng******@pithax.net> wrote:
>Hello,

thanks for the link but I couldn't handle it as in this example. I
got warnings of comparing pointers with NULL and therefore a bad program
behaviour like endless loops.
Then why not ask about that problem, and maybe someone can help fix
it?

I get this all the time with users, asking the wrong question...

I can't login, can you reset my password to 'potato' please
your existing password isn't expired, did you try that?
I can't use it any more.
why not, its still valid?
I can't type it.
why not?
its got an 'e' in it
pardon, whys that a problem?
I poured coffee in my keyboard and the 'e' is broken, so I need a new
password....

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Aug 20 '06 #13

P: n/a
On Sun, 20 Aug 2006 15:57:38 +0000, in comp.lang.c , Markus Pitha
<ng******@pithax.net> wrote:
>boa wrote:
> while( (c = fgetc(fp)) != EOF)
printf("%x\n", c);

I thought this is only allowed with text files because of the fact that
EOF is defined as -1 and binary files could contain "-1"?
NO. NO. NO. Do not confuse the EOF condition with the EOF character!

EOF is NOT a value read from the file, it's a condition returned by the
library to indicate there's no more data. It works for any sort of
file (as long as it has an end).
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Aug 20 '06 #14

P: n/a
Mark McIntyre <ma**********@spamcop.net> writes:
On Sun, 20 Aug 2006 15:57:38 +0000, in comp.lang.c , Markus Pitha
<ng******@pithax.net> wrote:
>>boa wrote:
>> while( (c = fgetc(fp)) != EOF)
printf("%x\n", c);

I thought this is only allowed with text files because of the fact that
EOF is defined as -1 and binary files could contain "-1"?

NO. NO. NO. Do not confuse the EOF condition with the EOF character!
Better:

Do not confuse the end-of-file condition with the value of the EOF
macro.
EOF is NOT a value read from the file, it's a condition returned by the
library to indicate there's no more data. It works for any sort of
file (as long as it has an end).
End-of-file is a condition. EOF is a macro that expands to a value of
type int; this value is returned by certain library functions to
indicate that there's no more data, and it's distinct from any valid
character value (since EOF is negative, and a character read from a
file is interpreted as unsigned char and then converted to int).
getchar(), for example, returns the EOF value to indicate an
end-of-file (or error) condition.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Aug 20 '06 #15

P: n/a
On Sun, 20 Aug 2006 21:51:38 GMT, in comp.lang.c , Keith Thompson
<ks***@mib.org> wrote:
>Mark McIntyre <ma**********@spamcop.net> writes:
>NO. NO. NO. Do not confuse the EOF condition with the EOF character!

Better:

Do not confuse the end-of-file condition with the value of the EOF
macro.
That's not what I meant - I was referring to the character EOF in the
ASCII set.

Mind you, it'd be less confusing had the committee not decided to
call both the macro and the function EOF, case notwithstanding, and
share that with a quite different ASCII code.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Aug 21 '06 #16

P: n/a
Mark McIntyre <ma**********@spamcop.net> writes:
On Sun, 20 Aug 2006 21:51:38 GMT, in comp.lang.c , Keith Thompson
<ks***@mib.org> wrote:
>>Mark McIntyre <ma**********@spamcop.net> writes:
>>NO. NO. NO. Do not confuse the EOF condition with the EOF character!

Better:

Do not confuse the end-of-file condition with the value of the EOF
macro.

That's not what I meant - I was referring to the character EOF in the
ASCII set.
Ok, that's what confused me.

<OT>
There is no ASCII character called "EOF". The ASCII control
characters are:

NUL SOH STX ETX EOT ENQ ACK BEL
BS HT LF VT FF CR SO SI
DLE DC1 DC2 DC3 DC4 NAK SYN ETB
CAN EM SUB ESC FS GS RS US

DEL
</OT>
Mind you, it'd be less confusing had the committee not decided to
call both the macro and the function EOF, case notwithstanding, and
share that with a quite different ASCII code.
Um, the function is called feof(), not EOF.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Aug 21 '06 #17

P: n/a
Mark McIntyre wrote:
On Sun, 20 Aug 2006 21:51:38 GMT, in comp.lang.c , Keith Thompson
<ks***@mib.org> wrote:
>Mark McIntyre <ma**********@spamcop.net> writes:
>>NO. NO. NO. Do not confuse the EOF condition with the EOF character!
Better:

Do not confuse the end-of-file condition with the value of the EOF
macro.

That's not what I meant - I was referring to the character EOF in the
ASCII set.
[ snip ]

There is no EOF character in the ASCII set. Never has been.

The EOF character was invented, as far as I know, by Gary Kildall at
Digital Research for CP/M 80. The CPM filesystem's directory entry
defined a file's length as a number of 128-byte unit records.

That was all well and good for binary and executables, but what about
ASCII text files? If you would concatenate two text files you need to
know where the first one ends, to the byte, not the unit record.

CPM implemented the EOF character (^Z or Ctrl-Z or 0x1a) to address the
problem. The EOF character was a signal to the BIOS that the previous
character was the LAST character in this text stream.
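
To make the convention concrete, here is a small sketch of how a program could find the end of the text in a buffer made of whole 128-byte records. The function name and the CPM_EOF macro are invented for this example; only the Ctrl-Z value 0x1a comes from the description above.

```c
#include <stddef.h>
#include <string.h>

#define CPM_EOF 0x1A   /* Ctrl-Z, the end-of-text marker described above */

/* Returns the number of text bytes in buf: everything before the first
   Ctrl-Z, or the full length if the buffer contains no Ctrl-Z at all. */
size_t cpm_text_length(const unsigned char *buf, size_t len)
{
    const unsigned char *mark = memchr(buf, CPM_EOF, len);

    return mark ? (size_t)(mark - buf) : len;
}
```

A 128-byte record that holds "hello\r\n" followed by Ctrl-Z padding yields 7, which is the byte-exact length the directory entry alone could not express.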

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Aug 22 '06 #18

P: n/a
Joe Wright <jo********@comcast.net> writes:
[...]
There is no EOF character in the ASCII set. Never has been.

The EOF character was invented, as far as I know, by Gary Kildall at
Digital Research for CP/M 80. The CPM filesystem's directory entry
defined a file's length as a number of 128-byte unit records.

That was all well and good for binary and executables, but what about
ASCII text files? If you would concatenate two text files you need to
know where the first one ends, to the byte, not the unit record.

CPM implemented the EOF character (^Z or Ctrl-Z or 0x1a) to address
the problem. The EOF character was a signal to the BIOS that the
previous character was the LAST character in this text stream.
<OT>ASCII refers to that character as SUB (substitute).</OT>

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Aug 22 '06 #19

P: n/a
Keith Thompson wrote:
Joe Wright <jo********@comcast.net> writes:
[...]
>There is no EOF character in the ASCII set. Never has been.

The EOF character was invented, as far as I know, by Gary Kildall at
Digital Research for CP/M 80. The CPM filesystem's directory entry
defined a file's length as a number of 128-byte unit records.

That was all well and good for binary and executables, but what about
ASCII text files? If you would concatenate two text files you need to
know where the first one ends, to the byte, not the unit record.

CPM implemented the EOF character (^Z or Ctrl-Z or 0x1a) to address
the problem. The EOF character was a signal to the BIOS that the
previous character was the LAST character in this text stream.

<OT>ASCII refers to that character as SUB (substitute).</OT>
You are correct. Thanks for sharing. At my house..

| 0 NUL| 1 SOH| 2 STX| 3 ETX| 4 EOT| 5 ENQ| 6 ACK| 7 BEL|
| 8 BS | 9 HT | 10 LF | 11 VT | 12 FF | 13 CR | 14 SO | 15 SI |
| 16 DLE| 17 DC1| 18 DC2| 19 DC3| 20 DC4| 21 NAK| 22 SYN| 23 ETB|
| 24 CAN| 25 EM | 26 SUB| 27 ESC| 28 FS | 29 GS | 30 RS | 31 US |

...but even knowing the names of the things, I don't remember what SUB
and several others are supposed to do. Do you?

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Aug 22 '06 #20

P: n/a
Joe Wright <jo********@comcast.net> writes:
Keith Thompson wrote:
>Joe Wright <jo********@comcast.net> writes:
[...]
>>CPM implemented the EOF character (^Z or Ctrl-Z or 0x1a) to address
the problem. The EOF character was a signal to the BIOS that the
previous character was the LAST character in this text stream.
<OT>ASCII refers to that character as SUB (substitute).</OT>
You are correct. Thanks for sharing. At my house..

| 0 NUL| 1 SOH| 2 STX| 3 ETX| 4 EOT| 5 ENQ| 6 ACK| 7 BEL|
| 8 BS | 9 HT | 10 LF | 11 VT | 12 FF | 13 CR | 14 SO | 15 SI |
| 16 DLE| 17 DC1| 18 DC2| 19 DC3| 20 DC4| 21 NAK| 22 SYN| 23 ETB|
| 24 CAN| 25 EM | 26 SUB| 27 ESC| 28 FS | 29 GS | 30 RS | 31 US |

..but even knowing the names of the things, I don't remember what SUB
and several others are supposed to do. Do you?
Nope. That's why I Googled "ascii" and found
<http://www.lookuptables.com/>.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Aug 23 '06 #21

P: n/a

Keith Thompson wrote:
Joe Wright <jo********@comcast.net> writes:
Keith Thompson wrote:
Joe Wright <jo********@comcast.net> writes:
[...]
>CPM implemented the EOF character (^Z or Ctrl-Z or 0x1a) to address
the problem. The EOF character was a signal to the BIOS that the
previous character was the LAST character in this text stream.
<OT>ASCII refers to that character as SUB (substitute).</OT>
You are correct. Thanks for sharing. At my house..

| 0 NUL| 1 SOH| 2 STX| 3 ETX| 4 EOT| 5 ENQ| 6 ACK| 7 BEL|
| 8 BS | 9 HT | 10 LF | 11 VT | 12 FF | 13 CR | 14 SO | 15 SI |
| 16 DLE| 17 DC1| 18 DC2| 19 DC3| 20 DC4| 21 NAK| 22 SYN| 23 ETB|
| 24 CAN| 25 EM | 26 SUB| 27 ESC| 28 FS | 29 GS | 30 RS | 31 US |

..but even knowing the names of the things, I don't remember what SUB
and several others are supposed to do. Do you?

Nope. That's why I Googled "ascii" and found
<http://www.lookuptables.com/>.
The ASCII characters significant to this discussion would be
ETX, EOT, ETB, and EM, which all signal the end of something. EM and
EOT would be the likeliest candidates for an "eof character", if one
existed (EM is "End of Medium", which means that we ran out of data - a
classic "end-of-file" condition, while EOT is "End of Transmission",
which may also mean out of data).

OTOH, CP/M (and its clone PCDOS) used the SUB ("Substitute") character
to flag the end of textual data in a file. SUBstitute is supposed to
signal the existence of a character byte that cannot be expressed in
ASCII. Rather than try to express an invalid character,
ASCII-compatible devices are supposed to substitute the SUB character
for the invalid one.

--
Lew Pitcher

Aug 23 '06 #22

P: n/a
Joe Wright wrote:
| 0 NUL| 1 SOH| 2 STX| 3 ETX| 4 EOT| 5 ENQ| 6 ACK| 7 BEL|
| 8 BS | 9 HT | 10 LF | 11 VT | 12 FF | 13 CR | 14 SO | 15 SI |
| 16 DLE| 17 DC1| 18 DC2| 19 DC3| 20 DC4| 21 NAK| 22 SYN| 23 ETB|
| 24 CAN| 25 EM | 26 SUB| 27 ESC| 28 FS | 29 GS | 30 RS | 31 US |

..but even knowing the names of the things, I don't remember what SUB
and several others are supposed to do. Do you?
SUB was a tty code to print in place of a code that was outside the
character set of the device, or for invalid codes.

Some of the codes are for (paper) tape control. The control and format
stuff was used on ttys that weren't even necessarily part of computer
systems.

Aug 23 '06 #23

P: n/a
jmcgill wrote:
Joe Wright wrote:
>| 0 NUL| 1 SOH| 2 STX| 3 ETX| 4 EOT| 5 ENQ| 6 ACK| 7 BEL|
| 8 BS | 9 HT | 10 LF | 11 VT | 12 FF | 13 CR | 14 SO | 15 SI |
| 16 DLE| 17 DC1| 18 DC2| 19 DC3| 20 DC4| 21 NAK| 22 SYN| 23 ETB|
| 24 CAN| 25 EM | 26 SUB| 27 ESC| 28 FS | 29 GS | 30 RS | 31 US |

..but even knowing the names of the things, I don't remember what
SUB and several others are supposed to do. Do you?

SUB was a tty code to print in place of a code that was outside
the character set of the device, or for invalid codes.

Some of the codes are for (paper) tape control. The control and
format stuff was used on ttys that weren't even necessarily part
of computer systems.
The ones I remember include:

NUL null character
SOH start of heading
EOT end of transmission
ENQ Enquiry. also known as WRU for who are you.
started the automatic reply sequence from a TTY
ACK acknowledge
BEL bell
BS back space
HT horizontal tab
LF line feed
CR carruage return
DLE data link escape
DC1 device control 1, or tape reader on
DC2 device control 2, or tape punch on
DC3 device control 3, or tape reader off
DC4 device control 4, or tape punch off
NAK Negative acknowledge
SYN Synchronize
CAN Cancel
EM End mode
ESC escaoe
FS field separator
GS group separator
RS record separator
US unit separator

and DEL DELETE (255) all bits on (all holes punched)

--
Chuck F (cb********@yahoo.com) (cb********@maineline.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE maineline address!
Aug 23 '06 #24

P: n/a
CBFalconer <cb********@yahoo.com> wrote:
jmcgill wrote:
Joe Wright wrote:
| 0 NUL| 1 SOH| 2 STX| 3 ETX| 4 EOT| 5 ENQ| 6 ACK| 7 BEL|
| 8 BS | 9 HT | 10 LF | 11 VT | 12 FF | 13 CR | 14 SO | 15 SI |
| 16 DLE| 17 DC1| 18 DC2| 19 DC3| 20 DC4| 21 NAK| 22 SYN| 23 ETB|
| 24 CAN| 25 EM | 26 SUB| 27 ESC| 28 FS | 29 GS | 30 RS | 31 US |

..but even knowing the names of the things, I don't remember what
SUB and several others are supposed to do. Do you?
SUB was a tty code to print in place of a code that was outside
the character set of the device, or for invalid codes.

Some of the codes are for (paper) tape control. The control and
format stuff was used on ttys that weren't even necessarily part
of computer systems.

The ones I remember include:
and DEL DELETE (255) all bits on (all holes punched)
*Bzzzt* 127, please. ASCII is a 7-bit code.

Richard
Aug 23 '06 #25

P: n/a
Richard Bos wrote:
CBFalconer <cb********@yahoo.com> wrote:
>jmcgill wrote:
>>Joe Wright wrote:

| 0 NUL| 1 SOH| 2 STX| 3 ETX| 4 EOT| 5 ENQ| 6 ACK| 7 BEL|
| 8 BS | 9 HT | 10 LF | 11 VT | 12 FF | 13 CR | 14 SO | 15 SI |
| 16 DLE| 17 DC1| 18 DC2| 19 DC3| 20 DC4| 21 NAK| 22 SYN| 23 ETB|
| 24 CAN| 25 EM | 26 SUB| 27 ESC| 28 FS | 29 GS | 30 RS | 31 US |

..but even knowing the names of the things, I don't remember what
SUB and several others are supposed to do. Do you?
SUB was a tty code to print in place of a code that was outside
the character set of the device, or for invalid codes.

Some of the codes are for (paper) tape control. The control and
format stuff was used on ttys that weren't even necessarily part
of computer systems.
The ones I remember include:
>and DEL DELETE (255) all bits on (all holes punched)

*Bzzzt* 127, please. ASCII is a 7-bit code.
Chuck's tape punch has even parity. :=)

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Aug 23 '06 #26

P: n/a
On Tue, 22 Aug 2006 18:24:54 -0400, in comp.lang.c , Joe Wright
<jo********@comcast.net> wrote:
>There is no EOF character in the ASCII set. Never has been.
I'll leave you guys to argue about this, since it's off-topic here. My
printed ASCII table, taken from the manual for an IBM-PCXT, has EOF at
position 26 as far as I recall. So blame IBM...

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Aug 23 '06 #27

P: n/a
In article <74******************************@comcast.com>,
Joe Wright <jo********@comcast.net> wrote:
>>and DEL DELETE (255) all bits on (all holes punched)

*Bzzzt* 127, please. ASCII is a 7-bit code.
>Chuck's tape punch has even parity. :=)
(Obviously completely off-topic)

Presumably this is a good reason for using even parity for encodings
with an even number of bits, and odd for an odd number of bits: so
that a deleted character - all the holes punched - has correct parity.
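
The arithmetic can be checked with a short sketch. The function name is made up for this example, and it assumes the parity bit is carried as bit 7 on top of a 7-bit code.

```c
/* Returns the 7-bit code c7 with an even-parity bit added as bit 7,
   so that the total number of 1 bits in the result is even. */
unsigned char add_even_parity(unsigned char c7)
{
    unsigned char bits = c7 & 0x7F;
    unsigned char ones = 0;
    int i;

    for (i = 0; i < 7; i++)
        ones += (bits >> i) & 1;        /* count the 1 bits of the code */
    return (ones & 1) ? (unsigned char)(bits | 0x80) : bits;
}
```

DEL is 0x7F, seven 1 bits, so even parity sets the eighth bit and the result is 0xFF: all eight holes punched and still a valid code.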

-- Richard
Aug 23 '06 #28

P: n/a
In article <1q********************************@4ax.com>,
Mark McIntyre <ma**********@spamcop.net> wrote:
>>There is no EOF character in the ASCII set. Never has been.
>I'll leave you guys to argue about this, since it's off-topic here. My
printed ASCII table, taken from the manual for an IBM-PCXT, has EOF at
position 26 as far as I recall. So blame IBM...
As far as I know the real ASCII standard is not available without
charge, but the control codes are specified by ISO/IEC 6429:1992 and
according to the Unicode code chart

http://www.unicode.org/charts/PDF/U0000.pdf

character 26 (control-Z) is called "SUB" (substitute) and was
presumably intended to be used as a substitute for any character not
available in ASCII. It is of course used as EOF on many systems.

-- Richard
Aug 23 '06 #29

P: n/a
On Wed, 23 Aug 2006 03:25:58 -0400, CBFalconer <cb********@yahoo.com>
wrote:
<snip ASCII control characters>
The ones I remember include:

NUL null character
SOH start of heading
Originally/also SOA, start of address.

STX start of text
ETX end of text
These three, plus ETB below, were generally used only for
block-oriented synchronous protocols like BSC 'bisync' (and not very
often for that, since it generally used EBCDIC instead) but also for
some Telex/wire/cable formats (especially machine-switched ones).

EOT, ENQ, ACK, and NAK were also used mostly with such protocols, but
sometimes just by themselves.
EOT end of transmission
ENQ Enquiry. also known as WRU for who are you.
started the automatic reply sequence from a TTY
ACK acknowledge
BEL bell
BS back space
HT horizontal tab
LF line feed
VT vertical tab
FF form feed
CR carruage return
Nit: carriage

SO shift out (to special/alternate glyphs, such as Greek, APL, etc.)
SI shift in (to normal glyphs)
DLE data link escape
DC1 device control 1, or tape reader on
DC2 device control 2, or tape punch on
DC3 device control 3, or tape reader off
DC4 device control 4, or tape punch off
DC1/DC3 = (bitpaired) ctrl+Q/S were also labelled on the TTY keyboard
as and hence often referred to as X-ON and X-OFF. And they acquired
another use for flow control, still valid after paper tape has gone to
the great beyond, by swapping the sequence: instead of 'start reader'
and then 'stop reader', ^S is used for 'stop sending, I'm full or
busy' and ^Q for 'you may start sending again, I'm ready'.
NAK Negative acknowledge
SYN Synchronize
ETB end of text block, see above
CAN Cancel
EM End mode
End Medium (or media?)

SUB substitute for error or suspect (as per snipped prior post)
ESC escaoe
Nit: escape
FS field separator
_file_ separator
GS group separator
RS record separator
US unit separator
Note that FS GS RS US are consecutive codes descending in the
traditional file organization hierarchy, and the next sequential
codepoint is 0x20 SP space, the usual (text) word separator.
and DEL DELETE (255) all bits on (all holes punched)
127 0x7F in ASCII which is only 7 bits, but 255 0xFF in common (and
important) embeddings like even parity or mark 'parity'.
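
The ^S/^Q flow-control convention described earlier in this post can be sketched as a tiny state machine on the sending side. The struct and function names are hypothetical; only the two code values, DC1 (0x11) and DC3 (0x13), come from the ASCII table.

```c
#define XON  0x11   /* DC1, Ctrl-Q: "you may start sending again" */
#define XOFF 0x13   /* DC3, Ctrl-S: "stop sending, I'm full or busy" */

struct sender {
    int paused;     /* nonzero while the other side has asked us to wait */
};

/* Updates the sender's state for one byte received from the other side.
   Returns 1 if the byte was a flow-control code and was consumed, else 0. */
int sender_handle_byte(struct sender *s, unsigned char byte)
{
    if (byte == XOFF) {
        s->paused = 1;
        return 1;
    }
    if (byte == XON) {
        s->paused = 0;
        return 1;
    }
    return 0;       /* ordinary data, flow state unchanged */
}
```

A sender would check s->paused before transmitting each byte and hold its output until an XON arrives.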

- David.Thompson1 at worldnet.att.net
Sep 4 '06 #30
