Read a line under MS/Unix/Mac

Someone recently asked about reading lines. I had this code written
some time ago (part of a BASIC-style interpreter based on the one in
H. Schildt's The Art of C) to read a file whose lines end in any format:
Microsoft-style CR/LF pair, Unix-style NL, or Mac-style CR. It also
allows for an EOF that does not follow a newline. I thought this would
make text-file sharing a bit easier.

Here it is:
/* Load a file, normalizing newlines to *nix standard (just NL). */
int load_file(FILE *fp, char *buf, int max_size)
{
    int i = 0;
    char c;

    do {
        c = getc(fp);           /* read the file into memory */
        i++;                    /* keep track of size of file */
        if (c == '\r') {        /* read a CR */
            c = getc(fp);       /* read another character */
            if (c != '\n') {    /* whoops, not an NL (Mac style) */
                *buf++ = '\n';  /* correct, store NL */
                i++;            /* and update size */
            }   /* otherwise, c now holds the NL from the CR/NL pair */
        }   /* c now holds character to put; NL, (CR/)LF, or (new) char */
        *buf++ = c;
    } while ( !feof(fp) && i < max_size );
    /* Null terminate the file, check for NL (LF) at end. */
    if (buf[-1] != '\n')        /* if file didn't end in new line */
        *buf++ = '\n', i++;     /* tack it on */
    *buf = '\0';                /* put null past file */
    fclose(fp);
    return i;                   /* size of file loaded */
}

This allows the file to use a mix of different EOLs. Is that a bad
idea?

-- Marty (I still consider myself a newbie)

Nov 5 '07 #1
On Tuesday 06 Nov 2007 1:14 am ma******@gmail.com < ma******@gmail.com>
wrote in article
<11**********************@57g2000hsv.googlegroups.com>:
Someone recently asked about reading lines. I had this code written
some time ago (part of a BASIC-style interpreter based on H. Shildts
in Art of C) to read a file with the lines ended in any format:
Microsoft-style CR/LF pair, Unix-style NL, or Mac-style CR. It also
allows for EOF that does not follow a blank line. I thought this would
make text-file sharing a bit easier.
I believe the C Standard library is required to present all text streams
as being composed of zero or more lines, each line being terminated by
a newline character. The actual end-of-line marker of the file is
abstracted away.
Here it is:
/* Load a file, normalizing newlines to *nix standard (just NL). */
int load_file(FILE *fp, char *buf, int max_size)
{
int i = 0;
char c;

do {
c = getc(fp); /* read the file into memory */
Here is your first problem. getc signals end-of-file or error by
returning EOF, an int value. So you should always assign the return
value of getc to an int and convert it to a char only after making sure
that it is indeed a valid character.
i++; /* keep track of size of file*/
if (c == '\r') { /* read a CR */
c = getc(fp); /* read another character */
if (c != '\n') { /* whoops, not an NL (Mac style) */
*buf++ = '\n'; /* correct, store NL */
i++; /* and update size */
} /* otherwise, c now holds the NL from the CR/NL pair */
} /* c now holds character to put; NL, (CR/)LF, or (new) char
*/
*buf++ = c;
} while ( !feof(fp) && i < max_size );
/* Null terminate the file, check for NL (LF) at end. */
if (buf[-1] != '\n') /* if file didn't end in new line */
*buf++ = '\n', i++; /* tack it on */
You like obfuscation, don't you? I'd write the two operations above as
separate statements to avoid error.
*buf = '\0'; /* put null past file */
fclose(fp);
return i; /* size of file loaded */
}

This allows the file to use a mix of different EOLs. Is that a bad
idea?
It's taken care of for text files by the Standard library. You only need
to worry when operating on binary files.
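
A minimal sketch of the getc point made earlier in this reply: the return value is kept in an int so that EOF can be told apart from every valid character, and it is narrowed to char only after that test. The helper name read_all and its parameters are hypothetical and do not come from the original post.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: copy at most cap-1 characters from fp into buf and
   null-terminate the result. Returns the number of characters stored. */
size_t read_all(FILE *fp, char *buf, size_t cap)
{
    size_t n = 0;
    int c;                      /* int, not char: it must be able to hold EOF */

    if (cap == 0)
        return 0;
    while (n + 1 < cap && (c = getc(fp)) != EOF)
        buf[n++] = (char)c;     /* narrow to char only after the EOF test */
    buf[n] = '\0';
    return n;
}
```

With a plain char, a data byte of 0xFF can compare equal to EOF on systems where char is signed, so the loop would stop early; where char is unsigned, the comparison with EOF never succeeds at all.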

Nov 5 '07 #2
On 05 Nov 2007, santosh <sa*********@gmail.com> wrote:
On Tuesday 06 Nov 2007 1:14 am ma******@gmail.com <
ma******@gmail.com> wrote in article
<11**********************@57g2000hsv.googlegroups.com>:
[snip code]
>This allows the file to use a mix of different EOLs. Is that a
bad idea?

It's taken care of for text files by the Standard library. You
only need to worry when operating on binary files.
Not true. On many (most? all?) Unix systems, when opening a DOS EOL
file (0x0D 0x0A line endings) using "r", not "rb", the 0x0D
characters are *not* removed from the stream when reading. On DOS
systems, they are removed since DOS recognizes the 2 character
sequence as meaningful. Unix systems don't recognize the sequence as
meaningful so they leave the 0x0Ds.

Dave

--
D.a.v.i.d T.i.k.t.i.n
t.i.k.t.i.n [at] a.d.v.a.n.c.e.d.r.e.l.a.y [dot] c.o.m
Nov 5 '07 #3
santosh wrote, On 05/11/07 19:58:
On Tuesday 06 Nov 2007 1:14 am ma******@gmail.com < ma******@gmail.com>
wrote in article
<11**********************@57g2000hsv.googlegroups.com>:
>Someone recently asked about reading lines. I had this code written
some time ago (part of a BASIC-style interpreter based on H. Shildts
in Art of C) to read a file with the lines ended in any format:
Microsoft-style CR/LF pair, Unix-style NL, or Mac-style CR. It also
allows for EOF that does not follow a blank line. I thought this would
make text-file sharing a bit easier.

I believe the C Standard library is required to present all text streams
as being composed of zero or more lines, each line being terminated by
a newline character. The actual end-of-line marker of the file is
abstracted away.
Only if it is what the implementation considers to be a text stream.
Open an old style Mac text file on a Unix machine and the Unix machine
will not see any new lines.

<snip>
>This allows the file to use a mix of different EOLs. Is that a bad
idea?

It's taken care of for text files by the Standard library. You only need
to worry when operating on binary files.
Sometimes you should leave it to the implementation, but sometimes you
have to cope with "text files" from a foreign system that have not been
translated, and then you have to deal with it yourself.
--
Flash Gordon
Nov 5 '07 #4
On Nov 5, 11:44 am, mazwo...@gmail.com wrote:
Someone recently asked about reading lines. I had this code written
some time ago (part of a BASIC-style interpreter based on H. Shildts
Aha! Here's your problem
^^^^^^^^^^

http://www.lysator.liu.se/c/schildt.html
http://ma.rtij.nl/acllc-c++.FAQ.html#q6.4
in Art of C) to read a file with the lines ended in any format:
Microsoft-style CR/LF pair, Unix-style NL, or Mac-style CR. It also
allows for EOF that does not follow a blank line. I thought this would
make text-file sharing a bit easier.

Here it is:
/* Load a file, normalizing newlines to *nix standard (just NL). */
int load_file(FILE *fp, char *buf, int max_size)
{
int i = 0;
char c;
c should definitely be int and not char.
>
do {
c = getc(fp); /* read the file into memory */
what happens if getc() returned EOF right here? You have no check.
i++; /* keep track of size of file*/
if (c == '\r') { /* read a CR */
c = getc(fp); /* read another character */
if (c != '\n') { /* whoops, not an NL (Mac style) */
*buf++ = '\n'; /* correct, store NL */
i++; /* and update size */
} /* otherwise, c now holds the NL from the CR/NL pair */
} /* c now holds character to put; NL, (CR/)LF, or (new) char
*/
*buf++ = c;
} while ( !feof(fp) && i < max_size );
/* Null terminate the file, check for NL (LF) at end. */
if (buf[-1] != '\n') /* if file didn't end in new line */
*buf++ = '\n', i++; /* tack it on */
*buf = '\0'; /* put null past file */
fclose(fp);
return i; /* size of file loaded */

}

This allows the file to use a mix of different EOLs. Is that a bad
idea?
Don't forget that Macs use '\r', Unix uses '\n' and Windows systems
use "\r\n".

There are programs called dos2unix and unix2dos that come with source
code and accomplish this (there are several variants as I recall).

I guess that the source of an FTP program is probably a lot better,
because it may handle Unix, Windows, OpenVMS, IBM mainframe, and Mac,
which are all different.

Nov 5 '07 #5
David Tiktin wrote On 11/05/07 15:54,:
On 05 Nov 2007, santosh <sa*********@gmail.com> wrote:

>>On Tuesday 06 Nov 2007 1:14 am ma******@gmail.com <
ma******@gmail.com> wrote in article
<11**********************@57g2000hsv.googlegroups.com>:


[snip code]

>>>This allows the file to use a mix of different EOLs. Is that a
bad idea?

It's taken care of for text files by the Standard library. You
only need to worry when operating on binary files.


Not true. On many (most? all?) Unix systems, when opening a DOS EOL
file (0x0D 0x0A line endings) using "r", not "rb", the 0x0D
characters are *not* removed from the stream when reading. On DOS
systems, they are removed since DOS recognizes the 2 character
sequence as meaningful. Unix systems don't recognize the sequence as
meaningful so they leave the 0x0Ds.
What this really means is that the transfer between
the two systems was done incorrectly. You don't have a
"DOS text file," you have a "damaged text file."

When a text file is correctly formed according to the
local conventions (whatever they are, and there are odder
things out there than DOS!), the C library "sees" line
endings as single newline characters. If you need to deal
with damaged files, your problems run deeper than just
ignoring the occasional '\r'.

--
Er*********@sun.com
Nov 5 '07 #6
On 05 Nov 2007, Eric Sosman <Er*********@sun.com> wrote:
David Tiktin wrote On 11/05/07 15:54,:
>On 05 Nov 2007, santosh <sa*********@gmail.com> wrote:

>>>On Tuesday 06 Nov 2007 1:14 am ma******@gmail.com <
ma******@gmail.com> wrote in article
<11**********************@57g2000hsv.googlegroups.com>:


[snip code]

>>>>This allows the file to use a mix of different EOLs. Is that a
bad idea?

It's taken care of for text files by the Standard library. You
only need to worry when operating on binary files.


Not true. On many (most? all?) Unix systems, when opening a DOS
EOL file (0x0D 0x0A line endings) using "r", not "rb", the 0x0D
characters are *not* removed from the stream when reading. On
DOS systems, they are removed since DOS recognizes the 2
character sequence as meaningful. Unix systems don't recognize
the sequence as meaningful so they leave the 0x0Ds.

What this really means is that the transfer between
the two systems was done incorrectly. You don't have a
"DOS text file," you have a "damaged text file."
OK, but I'm not sure what point you're trying to make. Yes, FTP has
text mode transfers for just this reason. But the "damaged text
file" case (as you call it) was exactly the case the posted code was
supposed to deal with. And it's still not true that the Standard C
Library will give you any help with this, right?
When a text file is correctly formed according to the
local conventions (whatever they are, and there are odder
things out there than DOS!), the C library "sees" line
endings as single newline characters. If you need to deal
with damaged files, your problems run deeper than just
ignoring the occasional '\r'.
I store C source files on a Linux server which are mapped to both
Windows systems (via Samba) and other Unix machines (via NFS) so I
can build on different systems from a common code base. I'm not
clear what the "local conventions" are in a case like that! Very
occasionally I have to take into account that the line endings of a
text file I'm reading may not match those expected by the tools I'm
using by checking for an "extra" 0x0D at the end of a line. But
usually, just saving everything with Unix EOL works fine.

Dave

--
D.a.v.i.d T.i.k.t.i.n
t.i.k.t.i.n [at] a.d.v.a.n.c.e.d.r.e.l.a.y [dot] c.o.m
Nov 5 '07 #7
santosh <sa*********@gmail.com> writes:
On Tuesday 06 Nov 2007 1:14 am ma******@gmail.com < ma******@gmail.com>
wrote in article
<11**********************@57g2000hsv.googlegroups.com>:
>Someone recently asked about reading lines. I had this code written
some time ago (part of a BASIC-style interpreter based on H. Shildts
in Art of C) to read a file with the lines ended in any format:
Microsoft-style CR/LF pair, Unix-style NL, or Mac-style CR. It also
allows for EOF that does not follow a blank line. I thought this would
make text-file sharing a bit easier.

I believe the C Standard library is required to present all text streams
as being composed of zero or more lines, each line being terminated by
a newline character. The actual end-of-line marker of the file is
abstracted away.
Yes, if the input file is a text file.

[...]
It's taken care of for text files by the Standard library. You only need
to worry when operating on binary files.
Or when operating on text files copied from a different operating
system, which is a fairly common problem.

If possible, it's usually better to translate the file when copying it
from one system to another, but that's not always possible. Or
rather, it's probably always possible, but it's not always done.

If you assume that the input is in one of those three formats, you can
open it in binary mode and scan for line terminators.
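
One way the "open in binary mode and scan" idea might look, as a sketch only. It assumes the whole file has already been read into memory with a binary-mode stream and that the data is ASCII-based, so CR is 0x0D and LF is 0x0A. The name normalize_eol is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: rewrite CR, LF, and CR LF line endings found in
   src[0..len) as a single '\n' in dst. The output never grows, so dst
   needs room for at most len bytes. Returns the number of bytes written. */
size_t normalize_eol(const char *src, size_t len, char *dst)
{
    const char CR = 0x0D, LF = 0x0A;    /* byte values as stored in the file */
    size_t i = 0, out = 0;

    while (i < len) {
        char c = src[i++];
        if (c == CR) {
            if (i < len && src[i] == LF)
                i++;                    /* CR LF pair: consume the LF too */
            c = '\n';
        } else if (c == LF) {
            c = '\n';
        }
        dst[out++] = c;
    }
    return out;
}
```

Because each output byte is written no later than the input byte it came from is read, passing the same pointer for src and dst should also work for an in-place conversion.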

<OT>Note that modern versions of MacOS use Unix-style text files.</OT>

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Nov 5 '07 #8
David Tiktin wrote:
>
.... snip ...
>
Not true. On many (most? all?) Unix systems, when opening a DOS
EOL file (0x0D 0x0A line endings) using "r", not "rb", the 0x0D
characters are *not* removed from the stream when reading. On DOS
systems, they are removed since DOS recognizes the 2 character
sequence as meaningful. Unix systems don't recognize the sequence
as meaningful so they leave the 0x0Ds.
If the file is a text file, simply use the appropriate command to
copy it over to the Unix system. And don't use that command to
copy binary files. All done.

That leaves such things as i/o devices to worry about.

--
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>
Try the download section.
--
Posted via a free Usenet account from http://www.teranews.com

Nov 6 '07 #9
ma******@gmail.com wrote:
Someone recently asked about reading lines. I had this code written
some time ago (part of a BASIC-style interpreter based on H. Shildts
in Art of C) to read a file with the lines ended in any format:
Microsoft-style CR/LF pair, Unix-style NL, or Mac-style CR. It also
allows for EOF that does not follow a blank line. I thought this would
make text-file sharing a bit easier.

Here it is:
/* Load a file, normalizing newlines to *nix standard (just NL). */
int load_file(FILE *fp, char *buf, int max_size)
{
int i = 0;
char c;
as others have already noted, 'c' should be an int
>
do {
c = getc(fp); /* read the file into memory */
you need to check for EOF (again, I believe someone already mentioned this)
i++; /* keep track of size of file*/
if (c == '\r') { /* read a CR */
c = getc(fp); /* read another character */
would need to check EOF here as well.
if (c != '\n') { /* whoops, not an NL (Mac style) */
*buf++ = '\n'; /* correct, store NL */
this could potentially cause an overflow problem because you will append
a second character to 'buf' before your next i < max_size check.
i++; /* and update size */
} /* otherwise, c now holds the NL from the CR/NL pair */
} /* c now holds character to put; NL, (CR/)LF, or (new) char
*/
*buf++ = c;
} while ( !feof(fp) && i < max_size );
/* Null terminate the file, check for NL (LF) at end. */
if (buf[-1] != '\n') /* if file didn't end in new line */
*buf++ = '\n', i++; /* tack it on */
you neglected to make sure your buffer had enough room to add the '\n'
*buf = '\0'; /* put null past file */
fclose(fp);
return i; /* size of file loaded */
}

This allows the file to use a mix of different EOLs. Is that a bad
idea?

-- Marty (I still consider myself a newbie)
As a suggestion, you might find it easier to use the fgets() function.
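
The points raised in this reply (an int for c, EOF checks after each getc, no second store ahead of the bounds test, and room for the final '\n' and '\0') could be addressed along the following lines. This is a sketch, not the original author's code: treating max_size as the full size of buf, using ungetc for the one-character look-ahead, and leaving fclose to the caller are all assumptions.

```c
#include <stdio.h>

/* Sketch of a bounds-checked load_file. Assumes fp was opened in binary
   mode ("rb") and that max_size is the full size of buf in bytes.
   Returns the number of characters stored, not counting the '\0',
   or -1 if max_size leaves no room for a terminator. */
int load_file(FILE *fp, char *buf, int max_size)
{
    int i = 0;
    int c;                              /* int so EOF is distinguishable */

    if (max_size < 1)
        return -1;
    /* One byte stays reserved for '\0'; each pass stores exactly one char. */
    while (i < max_size - 1 && (c = getc(fp)) != EOF) {
        if (c == '\r') {                /* CR: Mac style, or start of CR LF */
            int next = getc(fp);
            if (next != '\n' && next != EOF)
                ungetc(next, fp);       /* lone CR: put the look-ahead back */
            c = '\n';
        }
        buf[i++] = (char)c;
    }
    /* Supply a final newline if the data lacks one and there is room. */
    if (i > 0 && buf[i - 1] != '\n' && i < max_size - 1)
        buf[i++] = '\n';
    buf[i] = '\0';
    return i;
}
```

If the buffer fills before end of file, the text is simply truncated and no newline is appended; a caller that cares would need a separate way to detect that case.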

Jeff
Nov 6 '07 #10
user923005 <dc*****@connx.com> writes:
[...]
>in Art of C) to read a file with the lines ended in any format:
Microsoft-style CR/LF pair, Unix-style NL, or Mac-style CR.
[...]
Don't forget that Macs use '\r', Unix uses '\n' and Windows systems
use "\r\n".
I don't think he's forgotten that.

But note that the '\n' encoding in C refers to a "new-line" character,
which isn't necessarily an ASCII LF character. On an old Mac system,
I would guess that '\n' is actually the ASCII CR character (either
that, or stdio translates CR to LF on input).

If you're dealing only with native text files, you don't have to worry
about this; read the file in text mode, and each line will appear to
be terminated with '\n', whatever value '\n' happens to have.

If you need to deal with non-native text files, the best approach is
probably to open the file in binary mode and provide code to handle
all possible non-native formats. For reading DOS/Windows text files
on a Unix-like system, you can probably get away with reading in text
mode and deleting trailing '\r' characters, but that's not a general
solution.
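
A sketch of the "read in text mode and delete the trailing '\r'" shortcut, for lines obtained with fgets on a Unix-like system. The helper name chomp_crlf is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: remove a trailing '\n' and then a trailing '\r',
   if present, from a line that fgets() stored in s.
   Returns the new length of the string. */
size_t chomp_crlf(char *s)
{
    size_t n = strlen(s);

    if (n > 0 && s[n - 1] == '\n')
        s[--n] = '\0';
    if (n > 0 && s[n - 1] == '\r')
        s[--n] = '\0';
    return n;
}
```

It would typically be called right after each successful fgets(line, sizeof line, fp). As noted, this is not general: a file that uses CR alone contains no '\n' on such a system, so fgets would return it as one long line (or as buffer-sized pieces).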

[...]

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Nov 6 '07 #11
CBFalconer <cb********@yahoo.com> writes:
David Tiktin wrote:
>>
... snip ...
>>
Not true. On many (most? all?) Unix systems, when opening a DOS
EOL file (0x0D 0x0A line endings) using "r", not "rb", the 0x0D
characters are *not* removed from the stream when reading. On DOS
systems, they are removed since DOS recognizes the 2 character
sequence as meaningful. Unix systems don't recognize the sequence
as meaningful so they leave the 0x0Ds.

If the file is a text file, simply use the appropriate command to
copy it over to the Unix system. And don't use that command to
copy binary files. All done.
Alas, it's not that simple. If you copy individual text files from
one system to another, you can usually invoke the copying command in a
mode that causes it to do the proper translations. But files are often
copied as parts of archives (*.tar.gz, *.tar.bz2, *.zip, etc.).

There's still usually a reasonably good way to do the translations,
but not always. And the system on which I'm typing this has numerous
text files in two different formats (I use Cygwin under Windows XP).

The problem can't always be avoided.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Nov 6 '07 #12
On Nov 5, 4:25 pm, user923005 <dcor...@connx.com> wrote:
On Nov 5, 11:44 am, mazwo...@gmail.com wrote:
Someone recently asked about reading lines. I had this code written
some time ago (part of a BASIC-style interpreter based on H. Shildts

Aha! Here's your problem
^^^^^^^^^^

http://www.lysator.liu.se/c/schildt.....FAQ.html#q6.4
I did say 'based': Indeed I had to change many things, including
redundancies.
in Art of C) to read a file with the lines ended in any format:
Microsoft-style CR/LF pair, Unix-style NL, or Mac-style CR. It also
allows for EOF that does not follow a blank line. I thought this would
make text-file sharing a bit easier.
Here it is:
/* Load a file, normalizing newlines to *nix standard (just NL). */
int load_file(FILE *fp, char *buf, int max_size)
{
int i = 0;
char c;

c should definitely be int and not char.
Point noted. You're right.
do {
c = getc(fp); /* read the file into memory */

what happens if getc() returned EOF right here? You have no check.
I check for EOF later on, it was easier to ignore it for now. If EOF
is returned, then the low byte is stored, then rewritten with a '\0',
so nothing 'bad' happens.
i++; /* keep track of size of file*/
if (c == '\r') { /* read a CR */
c = getc(fp); /* read another character */
if (c != '\n') { /* whoops, not an NL (Mac style) */
*buf++ = '\n'; /* correct, store NL */
i++; /* and update size */
} /* otherwise, c now holds the NL from the CR/NL pair */
} /* c now holds character to put; NL, (CR/)LF, or (new) char
*/
*buf++ = c;
} while ( !feof(fp) && i < max_size );
/* Null terminate the file, check for NL (LF) at end. */
if (buf[-1] != '\n') /* if file didn't end in new line */
*buf++ = '\n', i++; /* tack it on */
*buf = '\0'; /* put null past file */
fclose(fp);
return i; /* size of file loaded */
}
This allows the file to use a mix of different EOLs. Is that a bad
idea?

Don't forget that Macs use '\r', Unix uses '\n' and Windows systems
use "\r\n".
That's kinda the 'whole point' of this function: to read a file with
EOLs marked in any of those three ways, convert them _internally_ to
Unix-style (because I think that's nicer; I could have used '\0256'
had I wanted), and then run the interpreter on the internal buffer.
There are programs called dos2unix and unix2dos that come with source
code and accomplish this (there are several variants as I recall).
My neighbor has a Mac, my brother has Linux, and my dad has Windows. I
want to copy BASIC programs from each of them, using a floppy disk. I
also don't want to have to modify the original files. (This scenario
is not for real, but I'm sure people have similar situations at
times.)
The program is, as mentioned, a BASIC interpreter. As such, it heavily
uses find_eol(). Any parser that uses find_eol() - C++ or C99 //
comments - and at runtime can have DOS/Mac/Unix files should be able
to make use of this function.
>
I guess that the source of an FTP program is probably a lot better,
because it may handle Unix, Windows, OpenVMS, IBM Mainfram, and Mac,
which are all different.
I am not aware of any EOL markers in use (using ASCII), besides "\xD",
"\xA", and "\xD\xA". Are there any others? Please let me know.

Thanks, and regards.

-- Marty (In pursuit of undomesticated aquatic avians).

Nov 8 '07 #13
On Nov 6, 2:33 am, Keith Thompson <ks...@mib.org> wrote:
CBFalconer <cbfalco...@yahoo.com> writes:
Not true. On many (most? all?) Unix systems, when opening a DOS
EOL file (0x0D 0x0A line endings) using "r", not "rb", the 0x0D
characters are *not* removed from the stream when reading. On DOS
systems, they are removed since DOS recognizes the 2 character
sequence as meaningful. Unix systems don't recognize the sequence
as meaningful so they leave the 0x0Ds.
If the file is a text file, simply use the appropriate command to
copy it over to the Unix system. And don't use that command to
copy binary files. All done.

Alas, it's not that simple. If you copy individual text files from
one system to another, you can usually invoke the copying command in a
mode that causes it to do the proper translations. But file are often
copied as parts of archives (*.tar.gz, *.tar.bz2, *.zip, etc.).

There's still usually a reasonably good way to do the translations,
but not always. And the system on which I'm typing this has numerous
text files in two different formats (I use Cygwin under Windows XP).

The problem can't always be avoided.
Thanks for the support. Your point is one of the things that got me
going. I wrote many files using Textpad (http://www.textpad.com), and
hated running vi on Cygwin and getting '^M' at the end of every line.

Nov 8 '07 #14
On Nov 7, 4:30 pm, Amandil <mazwo...@gmail.com> wrote:
[snip]
I am not aware of any EOL markers in use (using ASCII), besides "\xD",
"\xA", and "\xD\xA". Are there any others? Please let me know.
Yes, there are others, notably EBCDIC and OpenVMS (under some formats
-- other times for sequential files it can be 0xA). See:
http://en.wikipedia.org/wiki/Newline

Nov 8 '07 #15
ma******@gmail.com writes:
Someone recently asked about reading lines. I had this code written
some time ago (part of a BASIC-style interpreter based on H. Shildts
in Art of C)
Just one thing that I think has not been commented on yet...
int i = 0;
char c;
do {
c = getc(fp); /* read the file into memory */
i++; /* keep track of size of file*/
if (c == '\r') { /* read a CR */
c = getc(fp); /* read another character */
if (c != '\n') { /* whoops, not an NL (Mac style) */
*buf++ = '\n'; /* correct, store NL */
i++; /* and update size */
} /* otherwise, c now holds the NL from the CR/NL pair */
} /* c now holds character to put; NL, (CR/)LF, or (new) char
*/
*buf++ = c;
} while ( !feof(fp) && i < max_size );
This loop can end because i == max_size. In that case, buf has been
incremented max_size times (nice and easy to reason about -- there is
an 'i++' for every 'buf++' but, personally, I'd put them closer
together). It now points just past the end of the buffer (if the
max_size parameter is indeed the size of the buffer).
/* Null terminate the file, check for NL (LF) at end. */
if (buf[-1] != '\n') /* if file didn't end in new line */
*buf++ = '\n', i++; /* tack it on */
*buf = '\0'; /* put null past file */
*buf is beyond the buffer so this is undefined behaviour. If the last
thing in the buffer is not '\n' the error occurs on the line before.
To be safe, the caller must pass a value in max_size that is at least
two smaller than the buffer size.
fclose(fp);
return i; /* size of file loaded */
}

This allows the file to use a mix of different EOLs. Is that a bad
idea?
In simple cases you can do this but consider:

a\rb\r\nc\nd

is this: (a) Mac:
a
b
\nc\nd

(b) Windows:

a\rb
c\nd

or (c) Unix:

a\rb\r
c
d
?
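
The ambiguity in this example can be made concrete by counting line terminators under each single convention. A sketch, assuming ASCII byte values; the names eol_counts and count_eols are hypothetical.

```c
#include <stddef.h>

/* Number of line terminators found under each single convention. */
struct eol_counts { size_t cr_only, crlf_only, lf_only; };

/* Hypothetical helper: scan s[0..len) once and count terminators as a
   Mac (CR), Windows (CR LF), or Unix (LF) reader would see them. */
struct eol_counts count_eols(const char *s, size_t len)
{
    struct eol_counts n = {0, 0, 0};
    size_t i;

    for (i = 0; i < len; i++) {
        if (s[i] == 0x0D) {
            n.cr_only++;                            /* Mac: every CR ends a line */
            if (i + 1 < len && s[i + 1] == 0x0A)
                n.crlf_only++;                      /* Windows: only CR LF pairs */
        } else if (s[i] == 0x0A) {
            n.lf_only++;                            /* Unix: every LF ends a line */
        }
    }
    return n;
}
```

For the eight bytes a\rb\r\nc\nd this gives two terminators under the Mac reading, one under the Windows reading, and two under the Unix reading, which matches the three layouts listed in the post. A loader that accepts all three at once has to pick one interpretation, and it cannot tell from the bytes alone which one the writer intended.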

--
Ben.
Nov 8 '07 #16
"Keith Thompson" <ks***@mib.org> wrote in message news:
ln************@nuthaus.mib.org...
user923005 <dc*****@connx.com> writes:
[...]
>>in Art of C) to read a file with the lines ended in any format:
Microsoft-style CR/LF pair, Unix-style NL, or Mac-style CR.
[...]
>Don't forget that Macs use '\r', Unix uses '\n' and Windows systems
use "\r\n".

I don't think he's forgotten that.

But note that the '\n' encoding in C refers to a "new-line" character,
which isn't necessarily an ASCII LF character. On an old Mac system,
I would guess that '\n' is actually the ASCII CR character (either
that, or stdio translates CR to LF on input).
So reading an untranslated MS/DOS file on a Mac would result in bogus
interpretation of 0x0D/0x0A pairs: does the Mac C runtime translate those
into a single \n? Do Unix native text files get read as a single long
unterminated line? This would happen in both text and binary mode. Is
there even a difference between text and binary mode on older Macs?

--
Chqrlie.
Nov 12 '07 #17
