fgets() equivalent?

J de Boyne Pollard

HS> fgets() is standard in C file I/O.
HS>
HS> The only issue you need to pay attention to is RAW
HS> (binary) vs COOKED mode. It relates to the EOL (end
HS> of line) definitions of MS-DOS (<CR><LF>) vs
HS> Unix (<LF>). Depending on your application that
HS> may or may not pertain.

TR> This is a handy definition, but it is NOT CORRECT. [...]
TR> The raw vs cooked distinction in Unix is VERY different
TR> from the binary vs text distinction in MS-DOS. [...]

Actually, the binary/text dichotomy comes from the C language. The
operating systems themselves have and make no such distinction. (To
the operating systems themselves, files are just octet streams. There
are no lines, no newline sequences, and no EOF marker characters.) It
is simply the case that C language implementations targeting PC/MS/DR-DOS
use the either-CR+LF-or-LF newline convention for text files
(although they are not required to do so), and C language
implementations targeting Unices and Linux use the LF newline
convention for text files (and are required to do so by the POSIX
standard, which defines additional restrictions on C implementations).
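A minimal sketch of what this means for code that calls fgets(), assuming only standard C. The helper names chomp_eol and read_line are hypothetical, not library functions. The caller opens the stream (with "rb" if the raw octets are wanted), and the helper strips whichever newline sequence arrived, so the same code copes with LF and CR+LF files:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Remove a trailing LF, or a trailing CR+LF pair, from s.
   Returns the new length. Hypothetical helper, not part of the C library. */
size_t chomp_eol(char *s)
{
    size_t n;
    assert(s != NULL);
    n = strlen(s);
    if (n > 0 && s[n - 1] == '\n') {
        s[--n] = '\0';
        if (n > 0 && s[n - 1] == '\r')
            s[--n] = '\0';
    }
    return n;
}

/* Read one line from fp into buf (capacity cap) and strip its newline.
   Returns buf, or NULL at end of file or on error, like fgets(). */
char *read_line(FILE *fp, char *buf, int cap)
{
    if (fgets(buf, cap, fp) == NULL)
        return NULL;
    chomp_eol(buf);
    return buf;
}
```

On a stream opened in text mode by a DOS or Windows implementation the CR has normally been removed already, so the second test inside chomp_eol simply never fires.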

Nov 30 '07 #1


santosh

J de Boyne Pollard wrote:

HS> fgets() is standard in C file I/O.
HS>
HS> The only issue you need to pay attention to is RAW
HS> (binary) vs COOKED mode. It relates to the EOL (end
HS> of line) definitions of MS-DOS (<CR><LF>) vs
HS> Unix (<LF>). Depending on your application that
HS> may or may not pertain.

TR> This is a handy definition, but it is NOT CORRECT. [...]
TR> The raw vs cooked distinction in Unix is VERY different
TR> from the binary vs text distinction in MS-DOS. [...]

Actually, the binary/text dichotomy comes from the C language. The
operating systems themselves have and make no such distinction. (To
the operating systems themselves, files are just octet streams. There
are no lines, no newline sequences, and no EOF marker characters.)

Not the case with all operating systems. Many systems like CP/M and some
mainframes have a record oriented file system, where the file is
represented as a sequence of records. CP/M also had a end-of-file
marker. Also non 8-bit byte systems may not view files as an octet
stream.

<snip>

Nov 30 '07 #2

J de Boyne Pollard

TR> This is a handy definition, but it is NOT CORRECT. [...]
TR> The raw vs cooked distinction in Unix is VERY different
TR> from the binary vs text distinction in MS-DOS. [...]

JdeBP> Actually, the binary/text dichotomy comes from the C
JdeBP> language. The operating systems themselves have
JdeBP> and make no such distinction. (To the operating
JdeBP> systems themselves, files are just octet streams.
JdeBP> There are no lines, no newline sequences, and no
JdeBP> EOF marker characters.)

s> Not the case with all operating systems. [...]

M. Roberts wasn't talking about all operating systems. The operating
systems that xe was talking about xe mentioned by name.

Nov 30 '07 #3

Tim Roberts

J de Boyne Pollard <j.**************@tesco.net> wrote:

>
Actually, the binary/text dichotomy comes from the C language. The
operating systems themselves have and make no such distinction. (To
the operating systems themselves, files are just octet streams. There
are no lines, no newline sequences, and no EOF marker characters.)

I'm sorry, but you are incorrect. Apparently, you never got burned trying
to use the "copy" command without "/b" in the early versions of MS-DOS on a
file that happened to contain an embedded Ctrl-Z (the text-mode "end of
file" character). It, in turn, inherited that behavior from CP/M.

The C run-time library had to ADD the text/binary distinction because CP/M
and MS-DOS embedded it in their file system mechanisms. That concept was
certainly not part of the C run-time before implementations were built for
those operating systems.
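A minimal sketch of the Ctrl-Z behaviour described above, assuming nothing beyond standard C. The name text_mode_length is hypothetical; this is not code from COPY or from any run-time library, only an illustration of why a file with an embedded 0x1A appears truncated to a reader that honours the marker:

```c
#include <assert.h>
#include <stddef.h>

#define CTRL_Z 0x1A  /* ASCII SUB, used as the text-mode end-of-file marker */

/* Return how many bytes of buf a Ctrl-Z-aware text reader would deliver:
   everything before the first 0x1A, or all n bytes if there is none. */
size_t text_mode_length(const unsigned char *buf, size_t n)
{
    size_t i;
    assert(buf != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (buf[i] == CTRL_Z)
            return i;
    return n;
}
```

A binary-mode reader ignores the marker and delivers all n bytes.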
--
Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.

Dec 1 '07 #4

Keith Thompson

Tim Roberts <ti**@probo.com> writes:

J de Boyne Pollard <j.**************@tesco.net> wrote:
>>Actually, the binary/text dichotomy comes from the C language. The
operating systems themselves have and make no such distinction. (To
the operating systems themselves, files are just octet streams. There
are no lines, no newline sequences, and no EOF marker characters.)

I'm sorry, but you are incorrect. Apparently, you never got burned trying
to use the "copy" command without "/b" in the early versions of MS-DOS on a
file that happened to contain an embedded Ctrl-Z (the text-mode "end of
file" character). It, in turn, inherited that behavior from CP/M.

The C run-time library had to ADD the text/binary distinction because CP/M
and MS-DOS embedded it in their file system mechanisms. That concept was
certainly not part of the C run-time before implementations were built for
those operating systems.

Are you sure that CP/M and MS-DOS were the specific reasons for this
C feature? There are certainly other operating systems (including
VMS) that distinguish between text files and binary files.

--
Keith Thompson (The_Other_Keith) <ks***@mib.org>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Dec 1 '07 #5

Gary Chanson

"Keith Thompson" <ks***@mib.org> wrote in message
news:87************@kvetch.smov.org...

Tim Roberts <ti**@probo.com> writes:
>J de Boyne Pollard <j.**************@tesco.net> wrote:
>>>Actually, the binary/text dichotomy comes from the C language. The
operating systems themselves have and make no such distinction. (To
the operating systems themselves, files are just octet streams. There
are no lines, no newline sequences, and no EOF marker characters.)

I'm sorry, but you are incorrect. Apparently, you never got burned
trying
to use the "copy" command without "/b" in the early versions of MS-DOS on
a
file that happened to contain an embedded Ctrl-Z (the text-mode "end of
file" character). It, in turn, inherited that behavior from CP/M.

The C run-time library had to ADD the text/binary distinction because
CP/M
and MS-DOS embedded it in their file system mechanisms. That concept was
certainly not part of the C run-time before implementations were built
for
those operating systems.

Are you sure that CP/M and MS-DOS were the specific reasons for this
C feature? There are certainly other operating systems (including
VMS) that distinguish between text files and binary files.

My understanding is that it was originally imported into CP/M from
Unix.

--

- Gary Chanson (Windows SDK MVP)
- Abolish Public Schools

Dec 2 '07 #6

Keith Thompson

"Gary Chanson" <gc******@No.Spam.mvps.org> writes:

"Keith Thompson" <ks***@mib.org> wrote in message
news:87************@kvetch.smov.org...

[...

>>The C run-time library had to ADD the text/binary distinction
because CP/M and MS-DOS embedded it in their file system
mechanisms. That concept was certainly not part of the C run-time
before implementations were built for those operating systems.

Are you sure that CP/M and MS-DOS were the specific reasons for this
C feature? There are certainly other operating systems (including
VMS) that distinguish between text files and binary files.

My understanding is that it was originally imported into CP/M from
Unix.

That doesn't make sense. CP/M (or at least a C implementation under
CP/M) has to distinguish between text and binary files, because it
uses a two-character CR-LF sequence to mark the end of a line. Unix
uses a single LF character, and thus doesn't need to distinguish
between text and binary.
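A minimal sketch of the translation a text-mode read has to perform on a CR-LF system, assuming standard C only. The name crlf_to_lf is hypothetical; real run-time libraries do this inside their stream buffering and differ in details such as how a lone CR is treated:

```c
#include <assert.h>
#include <stddef.h>

/* Collapse every CR LF pair in buf[0..n) to a single LF, in place.
   A CR that is not followed by LF is kept. Returns the new length. */
size_t crlf_to_lf(char *buf, size_t n)
{
    size_t r, w = 0;
    assert(buf != NULL || n == 0);
    for (r = 0; r < n; r++) {
        if (buf[r] == '\r' && r + 1 < n && buf[r + 1] == '\n')
            continue;           /* drop the CR; the LF is copied next */
        buf[w++] = buf[r];
    }
    return w;
}
```

On a system whose stored newline is already a single LF this function is the identity, which is the point being made: there is nothing for a text mode to do.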

--
Keith Thompson (The_Other_Keith) <ks***@mib.org>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Dec 3 '07 #7

David Craig

How about that even Unix needs to generate a CR/LF pair when a 'Newline -
0x0A' is encountered in output to a tty type device. Unix is old and
works/worked with teletype terminals where a CR returns the carriage to
column one and the LF causes the paper to feed up one line. Some even
required multiple CR characters because they were so slow and would lose
characters that followed too quickly when a major movement of the carriage
was required.

"Keith Thompson" <ks***@mib.org> wrote in message
news:87************@kvetch.smov.org...

"Gary Chanson" <gc******@No.Spam.mvps.org> writes:
>"Keith Thompson" <ks***@mib.org> wrote in message
news:87************@kvetch.smov.org...
[...

>>>The C run-time library had to ADD the text/binary distinction
because CP/M and MS-DOS embedded it in their file system
mechanisms. That concept was certainly not part of the C run-time
before implementations were built for those operating systems.

Are you sure that CP/M and MS-DOS were the specific reasons for this
C feature? There are certainly other operating systems (including
VMS) that distinguish between text files and binary files.

My understanding is that it was originally imported into CP/M from
Unix.

That doesn't make sense. CP/M (or at least a C implementation under
CP/M) has to distinguish between text and binary files, because it
uses a two-character CR-LF sequence to mark the end of a line. Unix
uses a single LF character, and thus doesn't need to distinguish
between text and binary.

--
Keith Thompson (The_Other_Keith) <ks***@mib.org>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Dec 3 '07 #8

J. J. Farrell

Gary Chanson wrote:

"Keith Thompson" <ks***@mib.org> wrote in message
news:87************@kvetch.smov.org...
>Tim Roberts <ti**@probo.com> writes:
>>J de Boyne Pollard <j.**************@tesco.net> wrote:
Actually, the binary/text dichotomy comes from the C language. The
operating systems themselves have and make no such distinction. (To
the operating systems themselves, files are just octet streams. There
are no lines, no newline sequences, and no EOF marker characters.)
I'm sorry, but you are incorrect. Apparently, you never got burned
trying
to use the "copy" command without "/b" in the early versions of MS-DOS on
a
file that happened to contain an embedded Ctrl-Z (the text-mode "end of
file" character). It, in turn, inherited that behavior from CP/M.

The C run-time library had to ADD the text/binary distinction because
CP/M
and MS-DOS embedded it in their file system mechanisms. That concept was
certainly not part of the C run-time before implementations were built
for
those operating systems.
Are you sure that CP/M and MS-DOS were the specific reasons for this
C feature? There are certainly other operating systems (including
VMS) that distinguish between text files and binary files.

My understanding is that it was originally imported into CP/M from
Unix.

Your understanding is incorrect. One of the key concepts of UNIX was
that files were just files. There was no distinction between different
types of file, and no "special data" in the file to indicate
end-of-file. I don't know if UNIX originated this concept, but it was
relatively novel at the time and UNIX did much to popularize it. The
distinction between binary and text files in the Standard I/O library
was added when C was ported to other OSes.

Dec 3 '07 #9

Dik T. Winter

In article <OF**************@TK2MSFTNGP03.phx.gbl> "David Craig" <dr*****@nowhere.us> writes:

"Keith Thompson" <ks***@mib.org> wrote in message
news:87************@kvetch.smov.org...
That doesn't make sense. CP/M (or at least a C implementation under
CP/M) has to distinguish between text and binary files, because it
uses a two-character CR-LF sequence to mark the end of a line. Unix
uses a single LF character, and thus doesn't need to distinguish
between text and binary.

How about that even Unix needs to generate a CR/LF pair when a 'Newline -
0x0A' is encountered in output to a tty type device.

How about that there is a difference between how files are stored on disk
and what happens if said file is displayed on a tty type device? The
conversion is done by the tty driver, just as a MacOS tty driver would
convert a CR to the combined CR/LF. Normally such *tty drivers* would
expect that it is a text file that will be displayed. With respect to
the C programming environment there is no difference between text files
and binary files.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/

Dec 3 '07 #10

Dik T. Winter

In article <fi***********@energise.enta.net> "J. J. Farrell" <jj*@bcs.org.uk> writes:
....

Your understanding is incorrect. One of the key concepts of UNIX was
that files were just files. There was no distinction between different
types of file, and no "special data" in the file to indicate
end-of-file. I don't know if UNIX originated this concept, but it was
relatively novel at the time and UNIX did much to popularize it. The
distinction between binary and text files in the Standard I/O library
was added when C was ported to other OSes.

The concept was much older. On all the older systems I have worked with,
end-of-file was no special data in the file, but merely metadata held by
the system in the information about the file. I think that CP/M was the
first system that made that metadata part of the file. On the other hand,
the distinction between text and binary files has been present in many
file systems, but at a quite different level. And the only level where
they were different was whether to interpret a particular sequence of
bytes as end-of-line. Never whether something should be interpreted as
end-of-file.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/

Dec 3 '07 #11

CBFalconer

David Craig wrote: *** and top-posted - fixed ***

"Keith Thompson" <ks***@mib.org> wrote in message
>"Gary Chanson" <gc******@No.Spam.mvps.org> writes:
>>"Keith Thompson" <ks***@mib.org> wrote in message

[...

>>>>The C run-time library had to ADD the text/binary distinction
because CP/M and MS-DOS embedded it in their file system
mechanisms. That concept was certainly not part of the C
run-time before implementations were built for those operating
systems.

Are you sure that CP/M and MS-DOS were the specific reasons
for this C feature? There are certainly other operating
systems (including VMS) that distinguish between text files
and binary files.

My understanding is that it was originally imported into
CP/M from Unix.

That doesn't make sense. CP/M (or at least a C implementation
under CP/M) has to distinguish between text and binary files,
because it uses a two-character CR-LF sequence to mark the end
of a line. Unix uses a single LF character, and thus doesn't
need to distinguish between text and binary.

How about that even Unix needs to generate a CR/LF pair when a
'Newline - 0x0A' is encountered in output to a tty type device.
Unix is old and works/worked with teletype terminals where a CR
returns the carriage to column one and the LF causes the paper
to feed up one line. Some even required multiple CR characters
because they were so slow and would lose characters that
followed too quickly when a major movement of the carriage was
required.

This was usually handled by having the terminal driver emit "CR,
LF, DC3" to prompt for a new line. At line end, the echoing
machinery would emit "DC1, CR". I think I have the sequence
right. At any rate, there was enough idle time for the carriage to
recover, and the sequences would also stop/start the tape reader,
if present and loaded. When the input line was half duplex those
sequences would also prompt the sending device to unload another
line.

Please do not top-post. Your answer belongs after (or intermixed
with) the quoted material to which you reply, after snipping all
irrelevant material. I fixed this one. See the following links:

--
<http://www.catb.org/~esr/faqs/smart-questions.html>
<http://www.caliburn.nl/topposting.html>
<http://www.netmeister.org/news/learn2quote.html>
<http://cfaj.freeshell.org/google/> (taming google)
<http://members.fortunecity.com/nnqweb/> (newusers)

--
Posted via a free Usenet account from http://www.teranews.com

Dec 3 '07 #12

CBFalconer

"Dik T. Winter" wrote:

"J. J. Farrell" <jj*@bcs.org.uk> writes:
...
>Your understanding is incorrect. One of the key concepts of UNIX
was that files were just files. There was no distinction between
different types of file, and no "special data" in the file to
indicate end-of-file. I don't know if UNIX originated this
concept, but it was relatively novel at the time and UNIX did
much to popularize it. The distinction between binary and text
files in the Standard I/O library was added when C was ported to
other OSes.

The concept was much older. On all the older systems I have
worked with, end-of-file was no special data in the file, but
merely metadata held by the system in the information about the
file. I think that CP/M was the first system that made that
metadata part of the file. On the other hand, the distinction
between text and binary files has been present in many file
systems, but at a quite different level. And the only level
where they were different was whether to interpret a particular
sequence of bytes as end-of-line. Never whether something
should be interpreted as end-of-file.

No, EOF has always meant "we hit the end of recorded data". The
CP/M solution was because the file length was recorded in terms of
128 byte records, and these did not match the structure of text
files. Therefore CP/M added an EOF character to the text stored.
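A minimal sketch of the arithmetic behind that convention, assuming standard C. The names cpm_records and cpm_pad are hypothetical, and the choice to add no marker when the text ends exactly on a record boundary is an assumption of this sketch. The directory only counts whole 128-byte records, so the unused tail of the last record is filled with Ctrl-Z and a reader stops at the first one:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CPM_RECORD 128   /* file length is recorded in units of this size */
#define CTRL_Z     0x1A  /* ASCII SUB */

/* Number of 128-byte records needed to hold len bytes. */
size_t cpm_records(size_t len)
{
    return (len + CPM_RECORD - 1) / CPM_RECORD;
}

/* Fill buf from offset len up to the next record boundary with Ctrl-Z.
   cap is the capacity of buf. Returns the padded size, or (size_t)-1
   if the padded text would not fit. */
size_t cpm_pad(unsigned char *buf, size_t len, size_t cap)
{
    size_t padded = cpm_records(len) * CPM_RECORD;
    assert(buf != NULL);
    if (padded > cap)
        return (size_t)-1;
    memset(buf + len, CTRL_Z, padded - len);
    return padded;
}
```

A 5-byte text therefore occupies one record, with bytes 5 through 127 set to 0x1A.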

Similarly CP/M didn't do any LF --> CR/LF --> LF translation while
writing and reading, but just wrote the CR/LF sequence. Less code
that way :-). DOS just copied it, because of laziness and because
the primary market.

--
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>
Try the download section.

Dec 3 '07 #13

Pops

Dik T. Winter wrote:

>
How about that even Unix needs to generate a CR/LF pair when a 'Newline -
0x0A' is encountered in output to a tty type device.

How about that there is a difference between how files are stored on disk
and what happens if said file is displayed on a tty type device? The
conversion is done by the tty driver. As a MacOS tty driver would
convert a CR to the combined CR/LF. Normally such *tty drivers* would
expect that it is a text file that will be displayed. With respect to
the C programming environment there is no difference between text files
and binary files.

+1, while C itself is device independent, the type of device itself
means something as you elegantly pointed out in regards to the device
driver in question.

The original poster, I presume migrating or posting from Unix, wanted
the equivalent behavior of fgets().

My basic point in my reply was that he needs to deal with Cooked vs Raw
concepts, especially in Windows, and especially if his application has
to interface with devices or files that come from various places.

When a Windows or MS-DOS application opens a device using the standard C
I/O functions with a mode string containing "t" (or, by default, with no
mode letter at all), the C/C++ RTL (run time library) will read/write in
cooked mode. It's all clearly there in the MS C/C++ RTL source code
provided in every distribution.

Now if the application needs to interface with the outside world to get
input, then it MAY need to be built for, or switched at run time to,
non-cooked I/O.

You know how many times you see people posting a simple C fetch program
using the standard device I/O, claiming it's 100% portable, and then
Windows developers run into cooked standard I/O problems? Quite a few times.

In general, for Windows, all you need is to add a few lines to make the
standard I/O devices raw.

_setmode( _fileno( stdin ), _O_BINARY );
_setmode( _fileno( stdout ), _O_BINARY );

Here is an example:

/* fetch.c -- fetch via HTTP and dump the entire session to stdout
   posted by some unix wienie claiming portability.

   - ported to windows to illustrate need to change the stdout
     default _O_TEXT cooked mode to _O_BINARY raw mode.
*/

#ifdef _WIN32

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <winsock.h>
#include <fcntl.h>
#include <io.h>

#pragma comment(lib,"wsock32.lib")
/* map the Unix descriptor calls onto their Winsock equivalents */
#define close(a) closesocket(a)
#define read(a,b,c) recv(a,b,c,0)
#define write(a,b,c) send(a,b,c,0)

#else
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <signal.h>
#endif

int main(int argc, char **argv)
{
    int pfd; /* fd from socket */
    int len;
    char *hostP, *fileP;
    char buf[1024];
    struct hostent *hP; /* for host */
    struct sockaddr_in sin;
#ifdef _WIN32
    WSADATA wd;
    if (WSAStartup(MAKEWORD(1, 1), &wd) != 0) {
        exit(1);
    }
    /* stop the RTL translating newlines (and Ctrl-Z) on stdin/stdout */
    _setmode( _fileno( stdin ), _O_BINARY );
    _setmode( _fileno( stdout ), _O_BINARY );
#endif

    if ( argc != 3 ) {
        fprintf( stderr, "Usage: %s host file\n", argv[0] );
        exit( 1 );
    }

    hostP = argv[1];
    fileP = argv[2];

    /* refuse names that would overflow buf once wrapped in the request */
    if ( strlen( fileP ) > sizeof(buf) - 32 ) {
        fprintf( stderr, "File name too long\n" );
        exit( 1 );
    }

    hP = gethostbyname( hostP );
    if ( hP == NULL ) {
        fprintf( stderr, "Unknown host \"%s\"\n", hostP );
        exit( 1 );
    }

    pfd = socket( AF_INET, SOCK_STREAM, 0 );
    if ( pfd < 0 ) {
        perror( "socket" );
        exit( 1 );
    }
    memset( &sin, 0, sizeof(sin) );
    sin.sin_family = hP->h_addrtype;
    memcpy( (char *)&sin.sin_addr, hP->h_addr, hP->h_length );
    sin.sin_port = htons( 80 );
    if ( connect( pfd, (struct sockaddr *)&sin, sizeof(sin) ) < 0 ) {
        perror( "connect" );
        close( pfd );
        exit( 1 );
    }

    /* HTTP lines end in CR LF whatever the local newline convention is */
    sprintf( buf, "GET %s HTTP/1.0\r\n\r\n", fileP );
    write( pfd, buf, strlen(buf));

    /* copy the response to stdout byte for byte */
    while ( ( len = read( pfd, buf, sizeof(buf)) ) > 0 )
        fwrite( buf, 1, len, stdout );

    close( pfd );
    fflush( stdout );
    exit( 0 );
}
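The two _setmode() calls can be wrapped so that the same source builds on both platforms without sprinkling #ifdef through the program. A minimal sketch: set_binary_mode is a hypothetical helper name, while _setmode, _fileno and _O_BINARY are the Microsoft RTL names already used above.

```c
#include <stdio.h>
#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

/* Put an already-open stream into untranslated (binary) mode.
   Returns 0 on success, -1 on failure. */
int set_binary_mode(FILE *fp)
{
#ifdef _WIN32
    return _setmode(_fileno(fp), _O_BINARY) == -1 ? -1 : 0;
#else
    (void)fp;   /* POSIX streams never translate newlines */
    return 0;
#endif
}
```

Calling set_binary_mode(stdin) and set_binary_mode(stdout) once at the top of main(), before any I/O on those streams, would then replace the #ifdef'd pair.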
--
HLS

Dec 3 '07 #14

J de Boyne Pollard

JdeBP> Actually, the binary/text dichotomy comes from the C
JdeBP> language. The operating systems themselves have
JdeBP> and make no such distinction. (To the operating
JdeBP> systems themselves, files are just octet streams.
JdeBP> There are no lines, no newline sequences, and no EOF
JdeBP> marker characters.)

TR> I'm sorry, but you are incorrect.

False. Your understanding of the operation of the COPY command is
wrong, and you have an erroneous idea of where the behaviour that you
observed actually originates.

TR> Apparently, you never got burned trying to use the
TR> "copy" command without "/b" in the early versions of
TR> MS-DOS on a file that happened to contain an embedded
TR> Ctrl-Z (the text-mode "end of file" character).

I encountered that behaviour. I encountered the silly behaviour of
the COPY command that caused it to fail to copy zero-length files,
too. However, that behaviour doesn't mean what you think it to mean.

TR> It, in turn, inherited that behavior from CP/M.

No, it didn't. PIP has no equivalent option.

TR> The C run-time library had to ADD the text/binary
TR> distinction because CP/M and MS-DOS embedded
TR> it in their file system mechanisms.

False. And this is where your error lies. The behaviour of the COPY
command _is embedded in that command itself_. It has to comprise code
for processing in "binary mode" and in "text mode". (You can see that
code in the FreeDOS COMMAND at
<URL:https://freedos.svn.sourceforge.net/svnroot/freedos/freecom/trunk/cmd/copy.c>,
for example. This, in its turn, uses the stream mode flags of the C
language's standard library, which is where all of the code to make a
distinction between "text" and "binary" streams actually resides.)
The operating system _makes no such distinction_. I suggest actually
taking a look at the PC/MS/DR-DOS system API. There is no text/binary
distinction embedded in the filesystem mechanism. Files are, as I
said, just octet streams.

Dec 3 '07 #15

Kaz Kylheku

On Dec 2, 5:29 pm, "David Craig" <driv...@nowhere.us> wrote:

How about that even Unix needs to generate a CR/LF pair when a 'Newline -
0x0A' is encountered in output to a tty type device.

Fortunately, Thompson was intelligent enough to realize that the
control characters for printing devices should not determine the
representation of text files. The conversion is tucked away into the
kernel, and can be turned on and off.

The people who designed CR-LF into the various Internet protocols
really dropped the ball. There was an opportunity to fix this
braindamage in HTTP, but alas.

Unix is old and
works/worked with teletype terminals where a CR returns the carriage to
column one and the LF causes the paper to feed up one line.

That's, like, because CR actually stands for carriage return, and LF
for line feed, which is enshrined in the USASCII code. :)

It's wrong for a character display or printing device to give any
other meanings to these standardized codes.

The VT100 terminal, which is widely emulated today, also works this
way, and so Unix systems in general nearly always have the ONLCR flag
turned on when communicating with their own character consoles or
terminal emulators like xterm, etc.

Dec 3 '07 #16

Ernie Wright

J de Boyne Pollard wrote:

TR> The C run-time library had to ADD the text/binary
TR> distinction because CP/M and MS-DOS embedded
TR> it in their file system mechanisms.

False. And this is where your error lies. The behaviour of the COPY
command _is embedded in that command itself_. It has to comprise code
for processing in "binary mode" and in "text mode". [...]
The operating system _makes no such distinction_. I suggest actually
taking a look at the PC/MS/DR-DOS system API. There is no text/binary
distinction embedded in the filesystem mechanism. Files are, as I
said, just octet streams.

But *devices* are not. MS-DOS character-mode devices do distinguish
between text and binary streams. Devices include AUX, PRN and CON.
Since these can be a source or destination for the COPY command, COPY
must also respect the distinction, and so must any other interface that
treats devices as if they were files.

Including C streams. C's stdin, stdout, stderr streams are typically
mapped to the MS-DOS CON device.

MS-DOS Int 21h functions 4400h and 4401h get and set device status. Bit
5 of DX determines whether the device is functioning in text or binary
mode.

I don't think CP/M makes this distinction, but I don't know. I think
the *convention* of terminating text files with Ctrl-Z arose because
CP/M couldn't store the exact byte size of the file. Its file size
granularity was 128 bytes.

- Ernie http://home.comcast.net/~erniew

Dec 3 '07 #17

David Thompson

On Mon, 03 Dec 2007 01:10:37 -0500, CBFalconer <cb********@yahoo.com>
wrote:

David Craig wrote: *** and top-posted - fixed ***

How about that even Unix needs to generate a CR/LF pair when a
'Newline - 0x0A' is encountered in output to a tty type device.

This was usually handled by having the terminal driver emit "CR,
LF, DC3" to prompt for a new line. At line end, the echoing
machinery would emit "DC1, CR". I think I have the sequence
right. At any rate, there was enough idle time for the carriage to
recover, and the sequences would also stop/start the tape reader,
if present and loaded. When the input line was half duplex those
sequences would also prompt the sending device to unload another
line.

I don't think 'usually' is right; I saw this only on a few
'half-duplex' systems like IBM VM/370, and never on any Unix.
(Although there has been such a great profusion of Unices I can't
speak for all.) Where it was used you have it backwards: DC1=0x11 to
prompt/enable, DC3=0x13 to 'hold it there, buddy'.

- formerly david.thompson1 || achar(64) || worldnet.att.net

Dec 17 '07 #18

David Thompson

On Mon, 3 Dec 2007 07:01:53 -0800 (PST), Kaz Kylheku
<kk******@gmail.com> wrote:

On Dec 2, 5:29 pm, "David Craig" <driv...@nowhere.us> wrote:
How about that even Unix needs to generate a CR/LF pair when a 'Newline -
0x0A' is encountered in output to a tty type device.

Fortunately, Thompson was intelligent enough to realize that the
control characters for printing devices should not determine the
representation of text files. The conversion is tucked away into the
kernel, and can be turned on and off.

Or at least in the tty drivers which are in the system but (nowadays)
not necessarily in the kernel.

The people who designed CR-LF into the various Internet protocols
really dropped the ball. There was an opportunity to fix this
braindamage in HTTP, but alas.

I disagree strongly. For TELNET it was definitely right to use the
(only) meanings then standardized. For other protocols once you've
decided to go textlike (e.g. SMTP/NNTP) same principle -- and
personally I think that textlike, with its implications for reading
traces and monitors and debugging, was a significant factor in the
success of IP-stack. (But less so than free and implement-first.)

For FTP, where you do need to support arbitrary data, they did define
a character-independent 'structured' mode. Which is hardly ever used
nowadays, though when dinosaur (systems) roamed the earth it was.

Mediatype text/html in HTTP does allow any of CR LF or CRLF. Of course
non-PRE whitespace is collapsed in HTML anyway (like in C!).

<snip rest>
- formerly david.thompson1 || achar(64) || worldnet.att.net

Dec 17 '07 #19

David Thompson

(clc re-added, I don't have time to follow a new hierarchy)
On Fri, 30 Nov 2007 19:15:52 +0530, santosh <sa*********@gmail.com>
wrote:

J de Boyne Pollard wrote:

<snip>

Actually, the binary/text dichotomy comes from the C language. The
operating systems themselves have and make no such distinction. (To
the operating systems themselves, files are just octet streams. There
are no lines, no newline sequences, and no EOF marker characters.)

Not the case with all operating systems. Many systems like CP/M and some
mainframes have a record oriented file system, where the file is
represented as a sequence of records.

Most mainframes yes, and historically some minis, but not CP/M.
Unless you mean blocks -- CP/M had fixed-size blocks, normally 128B,
unrelated to the records (if any) stored in them.

CP/M also had a end-of-file marker.

For text, right. Also RT-11, which similarly had 512B blocks.

Also non 8-bit byte systems may not view files as an octet stream.

True but I don't see the relevance.

- formerly david.thompson1 || achar(64) || worldnet.att.net

Dec 17 '07 #20

CBFalconer

David Thompson wrote:

CBFalconer <cb********@yahoo.com> wrote:
>David Craig wrote: *** and top-posted - fixed ***

>>How about that even Unix needs to generate a CR/LF pair when a
'Newline - 0x0A' is encountered in output to a tty type device.

>This was usually handled by having the terminal driver emit "CR,
LF, DC3" to prompt for a new line. At line end, the echoing
machinery would emit "DC1, CR". I think I have the sequence
right. At any rate, there was enough idle time for the carriage to
recover, and the sequences would also stop/start the tape reader,
if present and loaded. When the input line was half duplex those
sequences would also prompt the sending device to unload another
line.

I don't think 'usually' is right; I saw this only on a few
'half-duplex' systems like IBM VM/370, and never on any Unix.
(Although there has been such a great profusion of Unices I can't
speak for all.) Where it was used you have it backwards: DC1=0x11 to
prompt/enable, DC3=0x13 to 'hold it there, buddy'.

You're right - I have DC1 and DC3 interchanged. I forget exactly
what systems I saw this on, but I am certain about the HP3000. I
used it myself on micros talking to teletypes.

--
Merry Christmas, Happy Hanukah, Happy New Year
Joyeux Noel, Bonne Annee.
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>


Dec 17 '07 #21

Ben Voigt [C++ MVP]

Mediatype text/html in HTTP does allow any of CR LF or CRLF. Of course

non-PRE whitespace is collapsed in HTML anyway (like in C!).

That's the payload, not the request envelope (which is the only thing
defined by HTTP). HTTP request and response must use CR+LF as line
terminator. The format of the payload is application-specific, HTML being
as you say whitespace agnostic, but still only one of many different
possible content types carried by HTTP.
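A minimal sketch of what that means for C code that builds the request itself, assuming standard C99. The name build_get_request is hypothetical. The CR LF pairs are written out explicitly rather than relying on any newline translation:

```c
#include <assert.h>
#include <stdio.h>

/* Write "GET <path> HTTP/1.0" followed by the blank line that ends the
   header block into out. Every line ends in CR LF regardless of the
   host's own newline convention. Returns the request length, or -1 if
   it does not fit in cap bytes. */
int build_get_request(char *out, size_t cap, const char *path)
{
    int n;
    assert(out != NULL && path != NULL);
    n = snprintf(out, cap, "GET %s HTTP/1.0\r\n\r\n", path);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

The result should go out through an untranslated channel such as a socket or a binary-mode stream. Pushed through a text-mode stream on a CR-LF system, each "\r\n" would be written as CR CR LF.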

Dec 17 '07 #22
