
reading from a text file

P: n/a
I'm trying to read from an input text file and print it out. I can do
this by reading each character, but I want to implement it in a more
efficient way. So I thought my program should read one line at a time
and print it out. How can I do this? I wrote the code below but it's
not correct since the fscanf reads one word (terminating in whitespace
or newline) at a time, instead of reading the whole line.

#include <stdio.h>
void main(void)
{
    char str[120];

    FILE *fp = fopen("test.txt", "r");

    while (!feof(fp))
    {
        fscanf(fp, "%s\n", str);
        printf("%s", str);
    }
}

So I get an output like "Thisisatest.Itisnotworking.", when the input
file contains:
This is a test.
It is not working.

Any suggestion is appreciated. Thanks.

Nov 15 '05 #1
40 Replies


P: n/a
googler wrote:
I'm trying to read from an input text file and print it out. I can do
this by reading each character, but I want to implement it in a more
efficient way. So I thought my program should read one line at a time
and print it out. How can I do this? I wrote the code below but it's
not correct since the fscanf reads one word (terminating in whitespace
or newline) at a time, instead of reading the whole line.
<snip> Any suggestion is appreciated. Thanks.


fgets.

S.
Nov 15 '05 #2

P: n/a
Skarmander wrote:
googler wrote:
I'm trying to read from an input text file and print it out. I can do
this by reading each character, but I want to implement it in a more
efficient way. So I thought my program should read one line at a time
and print it out. How can I do this? I wrote the code below but it's
not correct since the fscanf reads one word (terminating in whitespace
or newline) at a time, instead of reading the whole line.

<snip>
Any suggestion is appreciated. Thanks.


fgets.

S.


Sorry, I should have used fgets.. my bad.

One more question. My code looks like:
while (!feof(fp))
{
    fgets(str, 120, fp);
    printf("%s", str);
}

This prints the last line twice. I don't understand why. When it prints
the last line for the first time, it should have known that the end of
the file has been reached, so the next condition check for the while
loop should have failed. Why is it still entering the while loop and
printing the last line again?

Thanks.

Nov 15 '05 #3

P: n/a
Don't use feof() to control the loop. fgets() returns NULL when it
reaches the end of the file (or hits a read error), so you can test
its return value directly.

For example:
while (fgets(str, 120, fp))
{
    printf("%s", str);
}
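
The same loop as a small reusable function might look like the sketch below. The name copy_lines and the LINE_MAX_LEN macro are hypothetical; the buffer size of 120 is taken from the question.

```c
#include <stdio.h>

#define LINE_MAX_LEN 120  /* assumed size, same as str[120] in the question */

/* Copy `in` to `out` line by line. Returns the number of successful
   fgets() calls. copy_lines is a hypothetical helper name. */
long copy_lines(FILE *in, FILE *out)
{
    char str[LINE_MAX_LEN];
    long n = 0;

    /* fgets() returns NULL at end of file or on error, so a stale
       buffer is never printed a second time. */
    while (fgets(str, sizeof str, in) != NULL) {
        fputs(str, out);
        n++;
    }
    return n;
}
```

Calling copy_lines(fp, stdout) on the opened file prints it as is, spaces and newlines included.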

cheers

Nov 15 '05 #4

P: n/a
Reading a line at a time and reading a character at a time do not
differ much in efficiency, because the file reading functions already
buffer the file: they read a block (much more than you asked for) at
one time and keep it in memory. Use whichever one fits your need. If
you just want to print the file on the screen, the two would not
differ much, though fgets would be a little better. And over a lot of
processing, many "a little"s add up.
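
For comparison, a character-at-a-time copy is sketched below. It relies on the stdio buffering described above, so each getc() usually just picks the next byte out of a block already in memory. The name copy_chars is hypothetical.

```c
#include <stdio.h>

/* Copy `in` to `out` one character at a time and return the number of
   characters copied. copy_chars is a hypothetical helper name. */
long copy_chars(FILE *in, FILE *out)
{
    int c;      /* int, not char, so EOF can be told apart from real data */
    long n = 0;

    while ((c = getc(in)) != EOF) {
        putc(c, out);
        n++;
    }
    return n;
}
```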

Nov 15 '05 #5

P: n/a
Actually, I should have explained this: on the pass that printed the
last line a second time, fgets returned NULL and left the contents of
"str" unchanged, so printf printed the previous line again.

A workaround that keeps the feof would be as follows:
while ( !feof(fp) )
    ....
    if( fgets(...) )
        printf....
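
To see the extra pass directly, one can count how often the body of the questioned loop runs. This is only a sketch: feof_loop_passes is a hypothetical name, and the ferror() check is an added safety net so the loop cannot spin forever on a read error. For a two-line file whose last line ends in a newline it makes three passes, not two.

```c
#include <stdio.h>

/* Count how many times the body of a `while (!feof(fp))` loop runs.
   The loop shape is kept on purpose to show its behaviour; it is not
   a recommended way to read a file. */
long feof_loop_passes(FILE *fp)
{
    char str[120];
    long passes = 0;

    while (!feof(fp)) {
        fgets(str, sizeof str, fp);  /* return value ignored, as in the question */
        passes++;
        if (ferror(fp))              /* added guard: feof() never becomes true on error */
            break;
    }
    return passes;
}
```

If the last line has no trailing newline, fgets() runs into end of file while reading it, so the indicator is already set and the extra pass disappears.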

Nov 15 '05 #6

P: n/a
googler said:
I'm trying to read from an input text file and print it out. I can do
this by reading each character, but I want to implement it in a more
efficient way. So I thought my program should read one line at a time
and print it out. How can I do this? I wrote the code below but it's
not correct since the fscanf reads one word (terminating in whitespace
or newline) at a time, instead of reading the whole line.
Others have already answered your question, but nobody appears to have
pointed out yet that...

#include <stdio.h>
void main(void)


....in C, main returns int, not void. This is a common error, and those who
commit it often find it hard to believe that it's wrong. Nevertheless, no C
compiler is required to accept a main function that returns void unless it
specifically documents that acceptance - which few, if any, do.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Nov 15 '05 #7

P: n/a
In article <11**********************@g14g2000cwa.googlegroups .com>
googler <pi********@yahoo.com> wrote:
One more question. [The following code fragment, slightly edited for space]
while (!feof(fp)) {
fgets(str, 120, fp);
printf("%s", str);
}
... prints the last line twice. I don't understand why. When it prints
the last line for the first time, it should have known that the end of
the file has been reached,


What makes you believe that? Remember that fgets() is (loosely)
defined as:

int c, i = 0;
while ((c = getc(fp)) != EOF) {
    line[i++] = c;
    if (c == '\n')
        break;
}
line[i] = '\0';

(of course fgets() avoids overrunning your buffer, so it is a little
more complicated than that, but assume for the moment that no input
lines will be overlong).
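
A version with the buffer limit added could look like the sketch below. It is an illustration only, not the library's actual implementation; bounded_fgets is a hypothetical name, and rejecting sizes below 2 is a simplification made here.

```c
#include <stdio.h>

/* Read at most size-1 characters into `line`, stopping after a newline.
   Returns `line`, or NULL if nothing could be read.
   Illustration only; bounded_fgets is a hypothetical name. */
char *bounded_fgets(char *line, int size, FILE *fp)
{
    int c;
    int i = 0;

    if (size < 2)
        return NULL;             /* simplification: need room for data plus '\0' */
    while (i < size - 1 && (c = getc(fp)) != EOF) {
        line[i++] = (char)c;
        if (c == '\n')
            break;               /* the newline is stored, then reading stops */
    }
    if (i == 0 || ferror(fp))
        return NULL;             /* end of file before any character, or an error */
    line[i] = '\0';
    return line;
}
```

Characters that did not fit are not lost; they stay in the stream for the next call.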

Suppose the stdio "FILE *fp" is connected to a human being (at the
keyboard, or a terminal, or behind a network connection via telnet
or ssh or whatever), who has some way of signalling "end of file"
while still remaining connected (on many OSes this is done by
entering ^Z or ^D or @EOF or some similar character or string as
the only input on a line). The human types:

abc[enter]

so the first fgets() reads "abc\n". The fgets() call returns.
What should feof(fp) be?

The human *might* be *about* to press ^D or ^Z or type @EOF or
whatever it is that will signal EOF. Should feof(fp) wait until
he does so? What should it do if, instead, he types "def" and
presses ENTER?

You are effectively expecting the feof() predicate to predict the
future. There is no way for it to do that. It *could*, of course,
try to read input from the file -- in effect, waiting for the human
to signal EOF or enter "def\n". But it does not do that. Predicting
the future is too difficult. C is a simple language. It is much
easier to "predict" the past ... so that is what feof() does!
Instead of telling you "a future attempt to read is not going to
work because EOF is coming up", it tells you "a previous attempt
to read that failed, failed because EOF came up."

Suppose, now, that instead of a human, the stdio FILE *fp is
connected to a file on a floppy disk (or CD-ROM or DVD or whatever).
Suppose further that the floppy has been corrupted (someone used
a magnet to hold it up on the fridge, or scratched the CD-ROM, or
whatever). Your program/OS knows that the file should be 271483
bytes long, but partway in, the media turns out to be unreadable.
The fgetc() function -- or its getc() equivalent -- will return
EOF, indicating that it is unable to continue reading.

What should feof(fp) be? The file size is known (271483 bytes)
but you have at this point successfully read only 65536 bytes.
Should feof(fp) return nonzero (true)? You have not reached the
end of the file!

As before, feof() does not try to predict the future; instead, it
"predicts" the past. It tells you whether the getc() that returned
EOF did so because of end-of-file. In this case, it is *not* the
end of the file -- so feof(fp) is 0 (i.e., false). The other
predicate, ferror(fp), will be nonzero (i.e., true). It is
"predicting" the past, and telling you that the getc() failed due
to error. (Of course, the ability to distinguish between "normal
end of file" and "error reading data" is O/S and sometimes filesystem
or device specific, but it is fairly common.)

Because feof() only tells you about *previous* failures, and --
worse -- only tells you about EOF and not about errors, any loop
of the form:

while (!feof(fp))

is virtually *guaranteed* to be wrong. If you ever see this in
C code, be very suspicious.
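
The usual alternative is to let the read call end the loop and to ask feof() or ferror() only afterwards, when a read has already failed. A minimal sketch, with drain_lines as a hypothetical name:

```c
#include <stdio.h>

/* Read `fp` to the end with fgets(). Stores the number of successful
   fgets() calls in *chunks. Returns 0 if reading stopped at end of
   file, -1 if it stopped because of a read error. */
int drain_lines(FILE *fp, long *chunks)
{
    char buf[120];

    *chunks = 0;
    while (fgets(buf, sizeof buf, fp) != NULL)
        (*chunks)++;

    /* Only now, after a read has failed, do the two predicates
       tell us something: why did that read fail? */
    if (ferror(fp))
        return -1;
    return 0;   /* feof(fp) is nonzero here */
}
```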

As for why the last line prints twice, well, that one is a FAQ. :-)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 15 '05 #8

P: n/a
googler wrote:

I'm trying to read from an input text file and print it out. I can do
this by reading each character, but I want to implement it in a more
efficient way. So I thought my program should read one line at a time
and print it out. How can I do this? I wrote the code below but it's
not correct since the fscanf reads one word (terminating in whitespace
or newline) at a time, instead of reading the whole line.

#include <stdio.h>
void main(void)
{
char str[120];

FILE *fp = fopen("test.txt", "r");

while (!feof(fp))
{
fscanf(fp, "%s\n", str);
printf("%s", str);
}
}

So I get an output like "Thisisatest.Itisnotworking.", when the input
file contains:
This is a test.
It is not working.

Any suggestion is appreciated. Thanks.


/* BEGIN new.c */

#include <stdio.h>
#include <stdlib.h>

#define SOURCE "test.txt"
#define LINE_LEN 119
#define str(s) # s
#define xstr(s) str(s)

int main(void)
{
    int rc;
    FILE *fd;
    char line[LINE_LEN + 1];

    fd = fopen(SOURCE, "r");
    if (fd == NULL) {
        fprintf(stderr,
                "\nfopen() problem with \"%s\"\n", SOURCE);
        exit(EXIT_FAILURE);
    }
    do {
        rc = fscanf(fd, "%" xstr(LINE_LEN) "[^\n]%*[^\n]", line);
        if (!feof(fd)) {
            getc(fd);
        }
        if (rc == 0) {
            *line = '\0';
            ++rc;
        }
        printf("%s\n", line);
    } while (rc == 1);
    fclose(fd);
    return 0;
}

/* END new.c */
--
pete
Nov 15 '05 #9

P: n/a
pete wrote:

googler wrote:

I'm trying to read from an input text file and print it out.
I can do
this by reading each character, but I want to implement it in a more
efficient way. So I thought my program should read one line at a
time
and print it out. How can I do this? I wrote the code below but it's
not correct since the fscanf reads one word
(terminating in whitespace
or newline) at a time, instead of reading the whole line.

#include <stdio.h>
void main(void)
{
char str[120];

FILE *fp = fopen("test.txt", "r");

while (!feof(fp))
{
fscanf(fp, "%s\n", str);
printf("%s", str);
}
}

So I get an output like "Thisisatest.Itisnotworking.",
when the input
file contains:
This is a test.
It is not working.

Any suggestion is appreciated. Thanks.


/* BEGIN new.c */


new.c outputs the last line double.
I'm working on it.

--
pete
Nov 15 '05 #10

P: n/a
pete wrote:
do {
rc = fscanf(fd, "%" xstr(LINE_LEN) "[^\n]%*[^\n]", line);
if (!feof(fd)) {
getc(fd);
}
if (rc == 0) {
*line = '\0';
++rc;
}
printf("%s\n", line);
} while (rc == 1);


/*
** The following shows all of the different values that rc can have
** and also fixes the double output of the last line.
*/

do {
    rc = fscanf(fd, "%" xstr(LINE_LEN) "[^\n]%*[^\n]", line);
    if (!feof(fd)) {
        getc(fd);
    }
    if (rc == 0) {
        *line = '\0';
    }
    if (rc != EOF) {
        printf("%s\n", line);
    }
} while (rc == 1 || rc == 0);
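
The format string in that fscanf() call is assembled by the preprocessor. The sketch below isolates the trick: xstr() expands LINE_LEN to 119 before str() turns it into a string literal, whereas str() alone would stringify the macro name itself. The functions line_format and unexpanded are hypothetical helpers added only to make the result visible.

```c
#include <stdio.h>

#define LINE_LEN 119
#define str(s) # s
#define xstr(s) str(s)

/* Returns the format string used by the fscanf() loop: read up to
   LINE_LEN non-newline characters, then skip the rest of the line. */
const char *line_format(void)
{
    return "%" xstr(LINE_LEN) "[^\n]%*[^\n]";
}

/* str() without the extra level does not expand its argument. */
const char *unexpanded(void)
{
    return str(LINE_LEN);
}
```

Keeping the field width tied to LINE_LEN means the buffer size and the scan limit cannot drift apart when the macro is changed.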

--
pete
Nov 15 '05 #11

P: n/a
On Sat, 29 Oct 2005 17:51:20 GMT, pete wrote:
googler wrote:

#include <stdio.h>
void main(void)
{
char str[120];

FILE *fp = fopen("test.txt", "r");

while (!feof(fp))
{
fscanf(fp, "%s\n", str);
printf("%s", str);
}
}

So I get an output like "Thisisatest.Itisnotworking.", when the input
file contains:
This is a test.
It is not working.

Any suggestion is appreciated. Thanks.


/* BEGIN new.c */

#include <stdio.h>
#include <stdlib.h>

#define SOURCE "test.txt"
#define LINE_LEN 119
#define str(s) # s
#define xstr(s) str(s)

int main(void)
{
int rc;
FILE *fd;
char line[LINE_LEN + 1];

fd = fopen(SOURCE, "r");
if (fd == NULL) {
fprintf(stderr,
"\nfopen() problem with \"%s\"\n", SOURCE);
exit(EXIT_FAILURE);
}
do {
rc = fscanf(fd, "%" xstr(LINE_LEN) "[^\n]%*[^\n]", line);
if (!feof(fd)) {
getc(fd);
}
if (rc == 0) {
*line = '\0';
++rc;
}
printf("%s\n", line);
} while (rc == 1);
fclose(fd);
return 0;
}

/* END new.c */


This is my solution, after research. Criticism welcome.

#include <stdio.h>

#define MAX_LEN 120

void ReadFile(FILE *fp);
int ErrorMsg(char *str);

int main(void)
{
    FILE *fp;
    char filename[]= "test.txt";

    if ((fp = fopen(filename, "r")) == NULL){
        ErrorMsg(filename);
    } else {
        ReadFile(fp);
        fclose(fp);
    }
    return 0;
}

void ReadFile(FILE *fp)
{
    char buff[MAX_LEN];

    while (fgets(buff, MAX_LEN, fp)) {
        printf("%s", buff);
    }
}

int ErrorMsg(char *str)
{
    printf("Cannot open %s.\n", str);
    return;
}
Nov 15 '05 #12

P: n/a
bildad wrote:

On Sat, 29 Oct 2005 17:51:20 GMT, pete wrote:
googler wrote:

#include <stdio.h>
void main(void)
{
char str[120];

FILE *fp = fopen("test.txt", "r");

while (!feof(fp))
{
fscanf(fp, "%s\n", str);
printf("%s", str);
}
}

So I get an output like "Thisisatest.Itisnotworking.",
when the input file contains:
This is a test.
It is not working.

Any suggestion is appreciated. Thanks.
/* BEGIN new.c */

#include <stdio.h>
#include <stdlib.h>

#define SOURCE "test.txt"
#define LINE_LEN 119
#define str(s) # s
#define xstr(s) str(s)

int main(void)
{
int rc;
FILE *fd;
char line[LINE_LEN + 1];

fd = fopen(SOURCE, "r");
if (fd == NULL) {
fprintf(stderr,
"\nfopen() problem with \"%s\"\n", SOURCE);
exit(EXIT_FAILURE);
}
do {
rc = fscanf(fd, "%" xstr(LINE_LEN) "[^\n]%*[^\n]", line);
if (!feof(fd)) {
getc(fd);
}
if (rc == 0) {
*line = '\0';
++rc;
}
printf("%s\n", line);
} while (rc == 1);
fclose(fd);
return 0;
}

/* END new.c */


This is my solution, after research. Criticism welcome.

#include <stdio.h>

#define MAX_LEN 120


If the lines are longer than LINE_LEN,
then the characters after LINE_LEN and before the newline,
are discarded.
What happens if the lines are longer than MAX_LEN?
void ReadFile(FILE *fp);
int ErrorMsg(char *str);

int main(void)
{
FILE *fp;
char filename[]= "test.txt";

if ((fp = fopen(filename, "r")) == NULL){
ErrorMsg(filename);
} else {
ReadFile(fp);
fclose(fp);
}
return 0;
}

void ReadFile(FILE *fp)
{
char buff[MAX_LEN];

while (fgets(buff, MAX_LEN, fp)) {
printf("%s", buff);
}
}

int ErrorMsg(char *str)
{
printf("Cannot open %s.\n", str);
return;
return 0; /* maybe? */

}


--
pete
Nov 15 '05 #13

P: n/a
bildad <bi****@wi.rr.com> writes:
[...]
This is my solution, after research. Criticism welcome.
Not bad, but I do have a few comments.
#include <stdio.h>

#define MAX_LEN 120
Obviously this is arbitrary (as it must be). If you haven't already,
you should think about what happens if the input file contains lines
longer than MAX_LEN characters. Since you're using fgets(), the
answer is that it works anyway, but you should understand why. Read
the documentation for fgets() and work through what happens if an
input line is very long.
void ReadFile(FILE *fp);
int ErrorMsg(char *str);

int main(void)
{
FILE *fp;
char filename[]= "test.txt";
In a real program, you'd want to be able to specify the name of the
file, probably on the command line.
if ((fp = fopen(filename, "r")) == NULL){
ErrorMsg(filename);
The ErrorMsg() function claims to return an int (but see below), but
you discard the result.
} else {
ReadFile(fp);
fclose(fp);
}
return 0;
This would be a good opportunity to indicate to the environment
whether you were able to open the file, using "return EXIT_SUCCESS" if
you were successful, "return EXIT_FAILURE" if you weren't (or,
equivalently, "exit(EXIT_SUCCESS)" or "exit(EXIT_FAILURE)"). Note
that the exit() function and the EXIT_SUCCESS and EXIT_FAILURE macros
are defined in <stdlib.h>. (exit(1) is often used in Unix to denote
failure, but it isn't portable.)
}

void ReadFile(FILE *fp)
{
char buff[MAX_LEN];

while (fgets(buff, MAX_LEN, fp)) {
Some people (including me) would prefer an explicit comparison against
NULL. What you've written is fine, though, and any C programmer
should be able to read both forms easily.
printf("%s", buff);
Since you're not doing any formatting of the output, "fputs(buff, stdout);"
would be simpler; it's also more closely parallel with fgets(). This
is a matter of style, though. It's often easier to use printf()
consistently than to remember the details of puts() vs. fputs() (as
well as fputc(), putc(), and putchar()).
}
}
As a matter of style, it might make more sense to pass the file name
as an argument to ReadFile(), and make fp local to it rather than to
main(). The fopen() call would then be inside ReadFile(). Also,
CopyFile() (or copy_file()) might be a better name, since it doesn't
just read the file.
int ErrorMsg(char *str)
{
printf("Cannot open %s.\n", str);
return;
}


Unless you're going to add more to this, I'm not sure it needs to be a
function. You might as well just replace the call to ErrorMsg() with
the printf() call. Also, it's traditional to write error messages to
stderr:
fprintf(stderr, "Cannot open %s\n", str);
Note that I've dropped the '.' in the error message; it could look
like it's part of the file name.

The ErrorMsg() function is declared to return an int, but you don't
return a value. In fact, this is illegal (at least in C99). You
should declare the function to return void, not int.

The return statement isn't even necessary. A return with no value is
equivalent to falling off the end of the function, which you're about
to do anyway. (Were you expecting the return to terminate the main
program? It doesn't; it just returns control to the point after the
call.)
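
Putting these comments together, one possible shape for the program's core is sketched below. It is only one way to apply the suggestions; copy_file is the name floated above, and taking the output stream as a second parameter is an assumption made here so that the function is easy to try out.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_LEN 120

/* Copy the named file to `out`. Returns EXIT_SUCCESS, or EXIT_FAILURE
   if the file cannot be opened. The `out` parameter is an assumption
   of this sketch. */
int copy_file(const char *filename, FILE *out)
{
    char buff[MAX_LEN];
    FILE *fp = fopen(filename, "r");

    if (fp == NULL) {
        /* error messages go to stderr; no '.' after the file name */
        fprintf(stderr, "Cannot open %s\n", filename);
        return EXIT_FAILURE;
    }
    while (fgets(buff, sizeof buff, fp) != NULL)
        fputs(buff, out);
    fclose(fp);
    return EXIT_SUCCESS;
}
```

A main() that takes the file name from the command line can then check argc and simply return copy_file(argv[1], stdout), which hands the success or failure status to the environment.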

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #14

P: n/a
pete <pf*****@mindspring.com> writes:
bildad wrote:

[...]
#define MAX_LEN 120


If the lines are longer than LINE_LEN,
then the characters after LINE_LEN and before the newline,
are discarded.
What happens if the lines are longer than MAX_LEN?


You mean MAX_LEN, not LINE_LEN, right?

In this case, no, they're not discarded, they're just left on the
input stream. See my other response in this thread.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #15

P: n/a
On Sat, 29 Oct 2005 22:06:16 GMT, Keith Thompson wrote:
bildad <bi****@wi.rr.com> writes:
[...]
This is my solution, after research. Criticism welcome.


Not bad, but I do have a few comments.
#include <stdio.h>

#define MAX_LEN 120


Obviously this is arbitrary (as it must be). If you haven't already,
you should think about what happens if the input file contains lines
longer than MAX_LEN characters. Since you're using fgets(), the
answer is that it works anyway, but you should understand why. Read
the documentation for fgets() and work through what happens if an
input line is very long.

K&R2, p.164, 7.7, par.1:

char *fgets(char *line, int maxline, FILE *fp)

"at most maxline-1 characters will be read."

I changed MAX_LEN to test this but it still seemed to work fine. The only
documentation I have is K&R2 and King's C Programming. Am I looking in the
wrong place? I googled "fgets()" and "c programming fgets()" but didn't
find anything relevant (at least to me).
Nov 15 '05 #16

P: n/a
bildad <bi****@wi.rr.com> writes:
On Sat, 29 Oct 2005 22:06:16 GMT, Keith Thompson wrote:
bildad <bi****@wi.rr.com> writes:
[...]
This is my solution, after research. Criticism welcome.


Not bad, but I do have a few comments.
#include <stdio.h>

#define MAX_LEN 120


Obviously this is arbitrary (as it must be). If you haven't already,
you should think about what happens if the input file contains lines
longer than MAX_LEN characters. Since you're using fgets(), the
answer is that it works anyway, but you should understand why. Read
the documentation for fgets() and work through what happens if an
input line is very long.

K&R2, p.164, 7.7, par.1:

char *fgets(char *line, int maxline, FILE *fp)

"at most maxline-1 characters will be read."

I changed MAX_LEN to test this but it still seemed to work fine. The only
documentation I have is K&R2 and King's C Programming. Am I looking in the
wrong place. I googled "fgets()" and "c programming fgets()" but didn't
find anything relevant (at least to me).


Right. Suppose an input line is 300 characters long. Your call to
fgets() will read 119 characters; the resulting buffer will contain a
valid string terminated by a '\0' character, but it won't contain a
newline. Your call to fputs() or printf() will print this partial
line.

Think about what happens when you call fgets() again. You still have
the rest of the line waiting to be read, and the next fgets() gets the
next 119 characters of the line, which you then print.

On the *next* call to fgets(), you read the remainder of the long
input line, including the newline, and you then print it. You've read
and printed the entire line, but you've done it in 3 chunks.

If all you're doing with each result from fgets() is printing it, it
doesn't matter that it might take several calls to fgets() to read the
whole line. If you're doing more processing than that (as you
typically would in a real-world program), it could become a problem.
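
The chunking can be observed by counting fgets() calls. In the sketch below, count_fgets_chunks is a hypothetical helper and the buffer is allocated with malloc() only so that its size can be varied. With a buffer of 120, a single 300-character line followed by a newline takes three calls.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read `fp` to the end using a buffer of `bufsize` bytes and return how
   many fgets() calls succeeded, or -1 if the size is unusable or the
   allocation fails. */
long count_fgets_chunks(FILE *fp, int bufsize)
{
    char *buf;
    long chunks = 0;

    if (bufsize < 2)
        return -1;
    buf = malloc((size_t)bufsize);
    if (buf == NULL)
        return -1;
    /* each call delivers at most bufsize-1 characters of the line */
    while (fgets(buf, bufsize, fp) != NULL)
        chunks++;
    free(buf);
    return chunks;
}
```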

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #17

P: n/a
>On Sat, 29 Oct 2005 22:06:16 GMT, Keith Thompson wrote:
... Since you're using fgets(), the answer is that it works anyway,
but you should understand why. Read the documentation for fgets()
and work through what happens if an input line is very long.

In article <1e******************************@40tude.net>,
bildad <bi****@wi.rr.com> wrote:
K&R2, p.164, 7.7, par.1:

char *fgets(char *line, int maxline, FILE *fp)

"at most maxline-1 characters will be read."
The obvious question, then, is: "what happens to characters that
are not read?"

(What do you think *should* happen to them?)
I changed MAX_LEN to test this but it still seemed to work fine.


Indeed; as Keith Thompson noted, "it works anyway".

Suppose MAX_LEN were (say) 3, so that fgets() could read at most
two characters at a time. Suppose further that file being read
consisted entirely of the single line of:

The quick brown fox jumps over the lazy dog.

The first fgets() call would read at most two characters (3 - 1),
stopping if it encounters EOF, and also stopping if it encounters
a newline. The first two characters are 'T' and 'h', so fgets()
will set your array to contain the sequence {'T', 'h', '\0'} (which
is a valid C string) and return a non-NULL value.

What happens to the characters that are not yet read?

The next fgets() call will read at most two more characters. What
will they be?

What happens when fgets() has read 'g' and '.', so that only one
character, '\n', remains in the input file? What will fgets()
read and what will it put in your array? What will happen on the
next fgets() call?
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 15 '05 #18

P: n/a
On 30 Oct 2005 02:07:06 GMT, Chris Torek wrote:
On Sat, 29 Oct 2005 22:06:16 GMT, Keith Thompson wrote:
... Since you're using fgets(), the answer is that it works anyway,
but you should understand why. Read the documentation for fgets()
and work through what happens if an input line is very long.

In article <1e******************************@40tude.net>,
bildad <bi****@wi.rr.com> wrote:
K&R2, p.164, 7.7, par.1:

char *fgets(char *line, int maxline, FILE *fp)

"at most maxline-1 characters will be read."


The obvious question, then, is: "what happens to characters that
are not read?"


They're read in succeeding calls to fgets()?
(What do you think *should* happen to them?)
I changed MAX_LEN to test this but it still seemed to work fine.
Indeed; as Keith Thompson noted, "it works anyway".

Suppose MAX_LEN were (say) 3, so that fgets() could read at most
two characters at a time. Suppose further that file being read
consisted entirely of the single line of:

The quick brown fox jumps over the lazy dog.

The first fgets() call would read at most two characters (3 - 1),
stopping if it encounters EOF, and also stopping if it encounters
a newline. The first two characters are 'T' and 'h', so fgets()
will set your array to contain the sequence {'T', 'h', '\0'} (which
is a valid C string) and return a non-NULL value.

What happens to the characters that are not yet read?

The next fgets() call will read at most two more characters. What
will they be?

What happens when fgets() has read 'g' and '.', so that only one
character, '\n', remains in the input file? What will fgets()
read and what will it put in your array?


newline
What will happen on the
next fgets() call?


EOF ?
Nov 15 '05 #19

P: n/a
>On 30 Oct 2005 02:07:06 GMT, Chris Torek wrote:
[given a buffer of size 3, so that fgets() reads at most two
characters at a time, and an input line that contains an even
number of characters followed by a newline followed by EOF...]
The obvious question, then, is: "what happens to characters that
are not read?"
In article <10*****************************@40tude.net>,
bildad <bi****@wi.rr.com> wrote:
They're read in succeeding calls to fgets()?
Indeed!
What happens when fgets() has read 'g' and '.', so that only one
character, '\n', remains in the input file? What will fgets()
read and what will it put in your array?
newline
Correct -- which is of course just one character; the array will
be modified to hold {'\n', '\0'} in elements 0 and 1 respectively,
with element 2 unchanged. (I am not actually sure the standard
*requires* element 2 to be unchanged, but in practice, it is.)
What will happen on the
next fgets() call?

EOF ?


Indeed.

Thus, the loop in:

/* where fp is some valid input file, of course, with
text as described above */
char buf[3];
while (fgets(buf, sizeof buf, fp) != NULL)
printf("%s", buf); /* or: fputs(buf, stdout); */

will print out two characters at a time until it prints the
final newline (one character) and then terminates (because
fgets() will return NULL, having encountered EOF).

In this case, this is just what you want. If you were actually
trying to interpret whole input lines -- as is often the case when
reading input from a human being who is typing commands -- it is
probably not what you want, as the loop might look more like:

while (fgets(buf, sizeof buf, fp) != NULL) {
... code to interpret a command ...
}

and you probably do not want to interpret "he", then "ll", then
"o\n" as three separate commands. In this case you would (a) need
a bigger buffer, and (b) need to double-check to see whether the
human managed to type in an overly long input line despite the
bigger buffer.
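
One common way to make that check is to look for the newline in what fgets() returned. The sketch below is one possible policy, not the only one: read_command and the line_status values are hypothetical names, and an overlong line is truncated with the rest of it discarded.

```c
#include <stdio.h>
#include <string.h>

enum line_status { LINE_OK, LINE_TOO_LONG, LINE_EOF };

/* Read one line into buf without its newline.
   LINE_OK:       the whole line fit.
   LINE_TOO_LONG: buf holds the first size-1 characters, rest discarded.
   LINE_EOF:      nothing left to read (or a read error). */
enum line_status read_command(char *buf, int size, FILE *fp)
{
    char *nl;
    int c;

    if (fgets(buf, size, fp) == NULL)
        return LINE_EOF;
    nl = strchr(buf, '\n');
    if (nl != NULL) {
        *nl = '\0';              /* complete line: strip the newline */
        return LINE_OK;
    }
    /* No newline in the buffer: check what follows in the stream. */
    c = getc(fp);
    if (c == '\n' || c == EOF)
        return LINE_OK;          /* text fit exactly, or last line lacks a newline */
    while ((c = getc(fp)) != '\n' && c != EOF)
        ;                        /* discard the rest of the overlong line */
    return LINE_TOO_LONG;
}
```

With this, "hello" typed into a 4-byte buffer is reported once as LINE_TOO_LONG instead of being interpreted as several separate commands.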
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 15 '05 #20

P: n/a
Chris Torek <no****@torek.net> writes:
[...]
In this case, this is just what you want. If you were actually
trying to interpret whole input lines -- as is often the case when
reading input from a human being who is typing commands -- it is
probably not what you want, as the loop might look more like:

while (fgets(buf, sizeof buf, fp) != NULL) {
... code to interpret a command ...
}

and you probably do not want to interpret "he", then "ll", then
"o\n" as three separate commands. In this case you would (a) need
a bigger buffer, and (b) need to double-check to see whether the
human managed to type in an overly long input line despite the
bigger buffer.


And (c) decide what the program should do if the human types in an
overly long input line. There are numerous possibilities: silently
discard the extra characters, print an error message and abort, print
an error message and continue, build up a longer string containing all
the input (probably using realloc()). In a small toy program, you can
get away with ignoring the issue. In the real world, you had better
decide how to handle it, and write and test the code to do it.

BTW, there are a number of implementations floating around of
functions that read an input line of arbitrary length into a
dynamically allocated buffer.
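
A minimal sketch of such a function, built on fgets() and realloc(), is given below. It is not any particular published implementation; read_whole_line is a hypothetical name, the starting capacity of 16 is arbitrary, and because it uses strlen() it assumes the input contains no zero bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one line of any length. Returns a malloc'd string (including the
   newline if there was one) that the caller must free(), or NULL at end
   of file, on a read error, or if memory runs out. */
char *read_whole_line(FILE *fp)
{
    size_t cap = 16;             /* arbitrary starting capacity */
    size_t len = 0;
    char *line = malloc(cap);
    char *bigger;

    if (line == NULL)
        return NULL;
    while (fgets(line + len, (int)(cap - len), fp) != NULL) {
        len += strlen(line + len);
        if (len > 0 && line[len - 1] == '\n')
            return line;         /* got the whole line */
        /* No newline yet: double the buffer and keep reading. */
        bigger = realloc(line, cap * 2);
        if (bigger == NULL) {
            free(line);
            return NULL;
        }
        line = bigger;
        cap *= 2;
    }
    if (len == 0 || ferror(fp)) {
        free(line);
        return NULL;             /* nothing read, or a read error */
    }
    return line;                 /* last line, with no trailing newline */
}
```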

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #21

P: n/a
On 2005-10-30, Keith Thompson <ks***@mib.org> wrote:
bildad <bi****@wi.rr.com> writes:
On Sat, 29 Oct 2005 22:06:16 GMT, Keith Thompson wrote:
bildad <bi****@wi.rr.com> writes:
[...]
This is my solution, after research. Criticism welcome.

Not bad, but I do have a few comments.

#include <stdio.h>

#define MAX_LEN 120

Obviously this is arbitrary (as it must be). If you haven't already,
you should think about what happens if the input file contains lines
longer than MAX_LEN characters. Since you're using fgets(), the
answer is that it works anyway, but you should understand why. Read
the documentation for fgets() and work through what happens if an
input line is very long.

K&R2, p.164, 7.7, par.1:

char *fgets(char *line, int maxline, FILE *fp)

"at most maxline-1 characters will be read."

I changed MAX_LEN to test this but it still seemed to work fine. The only
documentation I have is K&R2 and King's C Programming. Am I looking in the
wrong place. I googled "fgets()" and "c programming fgets()" but didn't
find anything relevant (at least to me).


Right. Suppose an input line is 300 characters long. Your call to
fgets() will read 119 characters; the resulting buffer will contain a
valid string terminated by a '\0' character, but it won't contain a
newline. Your call to fputs() or printf() will print this partial
line.


Speaking of fgets(), i've never liked that it doesn't cope with
embedded nulls. The old gets() function from unix v7 did:

I have converted this code to ANSI to attempt to recover some sense
of on-topicness - it was originally in k&r c and used aspects of a
pre-stdio i/o library. the original can be found at
http://minnie.tuhs.org/UnixTree/V6/u...ib/gets.c.html
and the applicable license at
http://www.tuhs.org/Archive/Caldera-license.pdf

int gets (char *s) {
char *p;
extern FILE *stdin;
p=s;
while ((*s = getc(stdin)) != '\n' && *s != '\0')
/* ^^^^^^^^^^^^^*/
s++;
if (*p == '\0') return (0);
*s = '\0';
return (p);
}

Why is it that feature was removed, anyway?
Nov 15 '05 #22

P: n/a
On Sun, 30 Oct 2005 11:12:51 +0000 (UTC), in comp.lang.c , Jordan Abel
<jm****@purdue.edu> wrote:

Speaking of fgets(), i've never liked that it doesn't cope with
embedded nulls.
Well, it is designed to read in a string which by definition can't
contain a null, so... Data with embedded nulls isn't text and should
probably be fread instead.
The old gets() function from unix v7 did:
Unless I'm missing something, the code you posted won't read an
embedded null either. It stops as soon as it encounters one, and
returns a string consisting of every character up to the null.
int gets (char *s) {
char *p;
extern FILE *stdin;
For what it's worth, this requires FILE to be defined, so you must
include the appropriate header. Same applies to getc below (7.1.4 p2).
p=s;
while ((*s = getc(stdin)) != '\n' && *s != '\0')
assignment of int to char - possible loss of data - what if getc
returned EOF?
/* ^^^^^^^^^^^^^*/
s++;
if (*p == '\0') return (0);
*s = '\0';
return (p);
This is undefined behaviour since int is not compatible with char*.

You may also want to consider that p is out of scope once gets()
returns, and therefore may point to junk.
Why is it that feature was removed, anyway?


Perhaps because by definition, a string can't contain a null. :-)
--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>

----== Posted via Newsfeeds.Com - Unlimited-Uncensored-Secure Usenet News==----
http://www.newsfeeds.com The #1 Newsgroup Service in the World! 120,000+ Newsgroups
----= East and West-Coast Server Farms - Total Privacy via Encryption =----
Nov 15 '05 #23

P: n/a
On 2005-10-30, Mark McIntyre <ma**********@spamcop.net> wrote:
On Sun, 30 Oct 2005 11:12:51 +0000 (UTC), in comp.lang.c , Jordan Abel
<jm****@purdue.edu> wrote:

Speaking of fgets(), i've never liked that it doesn't cope with
embedded nulls.
Well, it is designed to read in a string which by definition can't
contain a null, so... Data with embedded nulls isn't text and should
probably be fread instead.


But suppose a parity error on the terminal causes a zero byte to be
read from the keyboard? This isn't a perfect solution, but it's
better than continuing to read past the null and the remaining data
being lost.
The old gets() function from unix v7 did:


Unless I'm missing something, the code you posted won't read an
embedded null either. It stops as soon as it encounters one, and
returns a string consisting of every character up to the null.


as opposed to modern fgets, which keeps on going and data beyond the
null to the end of the line [or the count] is lost.
For what it's worth, this requires FILE to be defined, so you must
include the appropriate header. Same applies to getc below (7.1.4 p2).
meh - it was an int originally. and cgetc was implicitly declared.
assignment of int to char - possible loss of data - what if getc
returned EOF?
Not my code. Probably Dennis Ritchie's. or Ken Thompson's. Looking
at the other source, I suspect EOF hadn't been invented yet and 0
doubled for the purpose.
/* ^^^^^^^^^^^^^*/
s++;
if (*p == '\0') return (0);
*s = '\0';
return (p);


This is undefined behaviour since int is not compatible with char*.


it's compatible on a pdp-11.
You may also want to consider that p is out of scope once gets()
returns, and therefore may point to junk.


You are incorrect. it's assigned from a parameter.
Why is it that feature was removed, anyway?


Perhaps because by definition, a string can't contain a null. :-)


Exactly. This code did not return a string containing a null.
fgets() on modern systems attempts to.
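
The point about data after an embedded null being lost can be illustrated with a minimal sketch. The helper name visible_length is hypothetical, and the sketch assumes the stream really does deliver a zero byte in the middle of a line (for example, a file opened in binary mode). fgets() stores the whole line, but anything that treats the buffer as a string stops at the first '\0'.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the thread): reads one line with fgets()
   and returns the length that strlen() reports for it, or -1 if fgets()
   returned NULL (end of file or read error). */
long visible_length(FILE *fp, char *buf, int size)
{
    if (fgets(buf, size, fp) == NULL)
        return -1;
    /* strlen() stops at the first '\0', whether fgets() stored it as the
       terminator or read it from the stream as ordinary data. */
    return (long)strlen(buf);
}
```

For a line holding the bytes 'a', 'b', '\0', 'c', 'd', '\n', the whole line is consumed from the stream, yet the reported length is 2; the bytes after the null are in the buffer but cannot be found through string functions.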
Nov 15 '05 #24

P: n/a
On Sun, 30 Oct 2005 17:11:43 +0000 (UTC), in comp.lang.c , Jordan Abel
<jm****@purdue.edu> wrote:
On 2005-10-30, Mark McIntyre <ma**********@spamcop.net> wrote:
On Sun, 30 Oct 2005 11:12:51 +0000 (UTC), in comp.lang.c , Jordan Abel
<jm****@purdue.edu> wrote:

Speaking of fgets(), i've never liked that it doesn't cope with
embedded nulls.
Well, it is designed to read in a string which by definition can't
contain a null, so... Data with embedded nulls isn't text and should
probably be fread instead.


But suppose a parity error on the terminal causes a zero byte to be
read from the keyboard?


Suppose a passing asteroid causes a massive magnetic spike and
generates spurious data. When was the last time you experienced
either?
The old gets() function from unix v7 did:


Unless I'm missing something, the code you posted won't read an
embedded null either. It stops as soon as it encounters one, and
returns a string consisting of every character up to the null.


as opposed to modern fgets, which keeps on going and data beyond the
null to the end of the line [or the count] is lost.


I'm sorry, I thought your argument was that you /wanted/ fgets to
read nulls.
This is undefined behaviour since int is not compatible with char*.


it's compatible on a pdp-11.


So what?
You may also want to consider that p is out of scope once gets()
returns, and therefore may point to junk.

You are incorrect. it's assigned from a parameter.


My mistake.
Exactly. This code did not return a string containing a null.
fgets() on modern systems attempts to.


if the file you're reading from was opened in text mode, it can't
strictly contain nulls. If it was opened in binary mode, you're using
the wrong function.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>

Nov 15 '05 #25

P: n/a
Keith Thompson wrote:
bildad <bi****@wi.rr.com> writes:
On Sat, 29 Oct 2005 22:06:16 GMT, Keith Thompson wrote:
bildad <bi****@wi.rr.com> writes:
[...]
This is my solution, after research. Criticism welcome.
<snip>
Right. Suppose an input line is 300 characters long. Your call to
fgets() will read 119 characters; the resulting buffer will contain a
valid string terminated by a '\0' character, but it won't contain a
newline. Your call to fputs() or printf() will print this partial
line.

Think about what happens when you call fgets() again. You still have
the rest of the line waiting to be read, and the next fgets() gets the
next 119 characters of the line, which you then print.

On the *next* call to fgets(), you read the remainder of the long
input line, including the newline, and you then print it. You've read
and printed the entire line, but you've done it in 3 chunks.

If all you're doing with each result from fgets() is printing it, it
doesn't matter that it might take several calls to fgets() to read the
whole line. If you're doing more processing than that (as you
typically would in a real-world program), it could become a problem.
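
The chunking described above can be sketched as follows. The function names is_complete_line and count_chunks are hypothetical, chosen only for illustration; the idea is simply that a result from fgets() is a complete line only if it ends in '\n'.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if buf ends in a newline, i.e. fgets() reached the end of
   the input line; returns 0 if the line was cut short. */
int is_complete_line(const char *buf)
{
    size_t len = strlen(buf);
    return len > 0 && buf[len - 1] == '\n';
}

/* Counts how many fgets() calls it takes to consume one input line
   using a buffer of bufsize bytes. Returns 0 at end of file. */
int count_chunks(FILE *fp, char *buf, int bufsize)
{
    int chunks = 0;

    while (fgets(buf, bufsize, fp) != NULL) {
        chunks++;
        if (is_complete_line(buf))
            break;              /* the rest of the line has arrived */
    }
    return chunks;
}
```

With a 120-byte buffer and a 300-character line followed by a newline, count_chunks() reports 3: two chunks of 119 characters and a final chunk with the remaining 62 characters plus the newline.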

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.


hi
call fflush(stdin) after the first fgets() call.
Regards,
Frodo Baggins

Nov 15 '05 #26

P: n/a
In article <11**********************@g47g2000cwa.googlegroups.com>,
Frodo Baggins <fr*********@gmail.com> wrote:
call fflush(stdin) after the first fgets() call.


Check out http://www.comeaucomputing.com/techtalk/#fflushstdin
--
Greg Comeau / Celebrating 20 years of Comeauity!
Comeau C/C++ ONLINE ==> http://www.comeaucomputing.com/tryitout
World Class Compilers: Breathtaking C++, Amazing C99, Fabulous C90.
Comeau C/C++ with Dinkumware's Libraries... Have you tried it?
Nov 15 '05 #27

P: n/a
On 2005-10-30, Mark McIntyre <ma**********@spamcop.net> wrote:
it's compatible on a pdp-11.


So what?


So it's not my code anyway, and it's pre-ansi :P
Exactly. This code did not return a string containing a null.
fgets() on modern systems attempts to.


if the file you're reading from was opened in text mode, it can't
strictly contain nulls. If it was opened in binary mode, you're
using the wrong function.


The standard forbids text files to contain nulls? Or it allows them
to fail to contain them? I believe the latter is true but not the
former.
Nov 15 '05 #28

P: n/a
On 2005-10-30, Frodo Baggins <fr*********@gmail.com> wrote:
hi
call fflush(stdin) after the first fgets() call.
Regards,
Frodo Baggins


That is incorrect. There is a small possibility it may work if
reading from a terminal, and an even smaller possibility it may work
if reading from a pipe. This should not be mistaken for it being
defined behavior or good programming practice. And in any case it
will almost certainly not work if reading from a file.
Nov 15 '05 #29

P: n/a
On Sun, 30 Oct 2005 17:43:07 +0000 (UTC), in comp.lang.c , Jordan Abel
<jm****@purdue.edu> wrote:
On 2005-10-30, Mark McIntyre <ma**********@spamcop.net> wrote:
it's compatible on a pdp-11.
So what?


So it's not my code anyway, and it's pre-ansi :P


I was assuming that when you said that you'd converted it to ISO C,
you had actually done that....
The standard forbids text files to contain nulls? Or it allows them
to fail to contain them? I believe the latter is true but not the
former.


It probably does neither. If it contains nulls, it's definitionally not
a text file since null isn't a printable character.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>

Nov 15 '05 #30

P: n/a
"Frodo Baggins" <fr*********@gmail.com> writes:
[...]
hi
call fflush(stdin) after the first fgets() call.
Regards,
Frodo Baggins


And what exactly do you expect that to accomplish?

fflush() is not defined for input streams. See question 12.26 in the
C FAQ.

The text version of the C FAQ, available at
<ftp://ftp.eskimo.com/u/s/scs/C-faq/faq.gz>, is more up to date than
the HTML version and goes into more detail on this:

] 12.26a: How can I flush pending input so that a user's typeahead isn't
] read at the next prompt? Will fflush(stdin) work?
]
] A: fflush() is defined only for output streams. Since its
] definition of "flush" is to complete the writing of buffered
] characters (not to discard them), discarding unread input would
] not be an analogous meaning for fflush on input streams.
] See also question 12.26b.
]
] References: ISO Sec. 7.9.5.2; H&S Sec. 15.2.
]
] 12.26b: If fflush() won't work, what can I use to flush input?
]
] A: It depends on what you're trying to do. If you're trying to get
] rid of an unread newline or other unexpected input after calling
] scanf() (see questions 12.18a-12.19), you really need to rewrite
] or replace the call to scanf() (see question 12.20).
] Alternatively, you can consume the rest of a partially-read line
] with a simple code fragment like
]
] while((c = getchar()) != '\n' && c != EOF)
] /* discard */ ;
]
] (You may also be able to use the curses flushinp() function.)
]
] There is no standard way to discard unread characters from a
] stdio input stream, nor would such a way necessarily be
] sufficient, since unread characters can also accumulate in
] other, OS-level input buffers. If you're trying to actively
] discard typed-ahead input (perhaps in anticipation of issuing a
] critical prompt), you'll have to use a system-specific
] technique; see questions 19.1 and 19.2.
]
] References: ISO Sec. 7.9.5.2; H&S Sec. 15.2.
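
The fragment quoted from the FAQ can be wrapped in a small function that works on any input stream. This is only a sketch; the name discard_rest_of_line and the choice to return a count are assumptions made for illustration.

```c
#include <stdio.h>

/* Reads and discards characters up to and including the next newline.
   Returns the number of characters discarded (the newline counts),
   or 0 if the stream was already at end of file. */
long discard_rest_of_line(FILE *fp)
{
    int c;          /* int, not char, so EOF stays distinguishable */
    long n = 0;

    while ((c = getc(fp)) != EOF) {
        n++;
        if (c == '\n')
            break;
    }
    return n;
}
```

Called after an fgets() that returned a partial line, this throws away the remainder so that the next fgets() starts at the beginning of the following line.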

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #31

P: n/a
On Sun, 30 Oct 2005 08:38:17 GMT, Keith Thompson wrote:
Chris Torek <no****@torek.net> writes:
[...]
In this case, this is just what you want. If you were actually
trying to interpret whole input lines -- as is often the case when
reading input from a human being who is typing commands -- it is
probably not what you want, as the loop might look more like:

while (fgets(buf, sizeof buf, fp) != NULL) {
... code to interpret a command ...
}

and you probably do not want to interpret "he", then "ll", then
"o\n" as three separate commands. In this case you would (a) need
a bigger buffer, and (b) need to double-check to see whether the
human managed to type in an overly long input line despite the
bigger buffer.
And (c) decide what the program should do if the human types in an
overly long input line. There are numerous possibilities: silently
discard the extra characters, print an error message and abort, print
an error message and continue, build up a longer string containing all
the input (probably using realloc()). In a small toy program, you can
get away with ignoring the issue. In the real world, you had better
decide how to handle it, and write and test the code to do it.


This is where I am right now. I can't say it's bullet-proof but I think it
handles two situations safely. I'm working on malloc and realloc with
little success at this point. Thanks for the suggestions. I'm trying to
implement them.

void CopyFile(FILE *fp)
{
char buff[MAX_LEN]; /* 120 */
//char *p;

//p = malloc(fgets(buff, MAX_LEN + 1, fp));

if (fgets(buff, MAX_LEN, fp)) {
fputs(buff, stdout);
exit(EXIT_SUCCESS);
} else {
fputs("Error: Program Aborting", stdout);
exit(EXIT_FAILURE);
}
}

BTW, there are a number of implementations floating around of
functions that read an input line of arbitrary length into a
dynamically allocated buffer.


Thank you. I'll search for them.
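
For reference, one possible shape for such a function is sketched below. It is not taken from any of the implementations mentioned; the name read_line, the starting capacity of 128, and the doubling strategy are all assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line of any length from fp.
   Returns a malloc'd string without the trailing newline, or NULL at
   end of file (nothing read) or if memory runs out. The caller frees
   the result. A read error part-way through returns what was read. */
char *read_line(FILE *fp)
{
    size_t cap = 128, len = 0;
    char *line = malloc(cap);

    if (line == NULL)
        return NULL;

    while (fgets(line + len, (int)(cap - len), fp) != NULL) {
        len += strlen(line + len);
        if (len > 0 && line[len - 1] == '\n') {
            line[len - 1] = '\0';       /* complete line: strip newline */
            return line;
        }
        if (len + 1 == cap) {           /* buffer full: double it */
            char *tmp = realloc(line, cap * 2);
            if (tmp == NULL) {
                free(line);
                return NULL;
            }
            line = tmp;
            cap *= 2;
        }
    }

    if (len == 0) {                     /* nothing left to read */
        free(line);
        return NULL;
    }
    return line;                        /* last line had no newline */
}
```

Assigning the result of realloc() to a temporary first matters: if realloc() fails it returns NULL and leaves the old block allocated, so overwriting line directly would leak it.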
Nov 15 '05 #32

P: n/a
bildad <bi****@wi.rr.com> writes:
On Sun, 30 Oct 2005 08:38:17 GMT, Keith Thompson wrote:

[...]
And (c) decide what the program should do if the human types in an
overly long input line. There are numerous possibilities: silently
discard the extra characters, print an error message and abort, print
an error message and continue, build up a longer string containing all
the input (probably using realloc()). In a small toy program, you can
get away with ignoring the issue. In the real world, you had better
decide how to handle it, and write and test the code to do it.


This is where I am right now. I can't say it's bullet-proof but I think it
handles two situations safely. I'm working on malloc and realloc with
little success at this point. Thanks for the suggestions. I'm trying to
implement them.

void CopyFile(FILE *fp)
{
char buff[MAX_LEN]; /* 120 */
//char *p;

//p = malloc(fgets(buff, MAX_LEN + 1, fp));

if (fgets(buff, MAX_LEN, fp)) {
fputs(buff, stdout);
exit(EXIT_SUCCESS);
} else {
fputs("Error: Program Aborting", stdout);
exit(EXIT_FAILURE);
}
}


Ok, this *definitely* isn't what you want. You attempt to read and
write the first line of the file (or the first 119 characters if the
line is longer than that) -- and then you abort the program, whether
it was successful or not.

Calling exit() from within your function probably isn't a good idea.
Calling exit() from within your function if it succeeds definitely
isn't a good idea. If the function is intended for general use, you
might want to return a result indicating whether it was successful,
and leave it to the caller to decide how to deal with errors. One
common convention is to return 0 for success, non-0 for any error
(which allows you to enhance the function to specify different kinds
of errors).
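
A minimal sketch of that convention, assuming the goal is to copy the whole file rather than just its first line. The second parameter and the particular error codes are assumptions added for illustration; they are not part of the posted function.

```c
#include <stdio.h>

#define MAX_LEN 120

/* Copies everything from in to out, one fgets() chunk at a time.
   Returns 0 on success, 1 on a read error, 2 on a write error.
   It never calls exit(); the caller decides what to do on failure. */
int CopyFile(FILE *in, FILE *out)
{
    char buff[MAX_LEN];

    while (fgets(buff, sizeof buff, in) != NULL) {
        if (fputs(buff, out) == EOF)
            return 2;
    }
    if (ferror(in))     /* fgets() returned NULL because of an error */
        return 1;
    return 0;           /* fgets() returned NULL at end of file */
}
```

A caller might write `if (CopyFile(fp, stdout) != 0)` and then print its own message to stderr, which keeps the error policy out of the copying code.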

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #33

P: n/a
In article <ou********************************@4ax.com> Mark McIntyre <ma**********@spamcop.net> writes:
On Sun, 30 Oct 2005 17:43:07 +0000 (UTC), in comp.lang.c , Jordan Abel
<jm****@purdue.edu> wrote:

....
The standard forbids text files to contain nulls? Or it allows them
to fail to contain them? I believe the latter is true but not the
former.


It probably does neither. If it contains nulls, it's definitionally not
a text file since null isn't a printable character.


It is still a text file since text files can contain non-printable
characters. But reading a file that contains NUL characters with
fgets is not really a good idea. fgetc will give you everything you
need.
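
As a small illustration of reading character by character, the sketch below counts the NUL characters in a stream. The function name count_nul_bytes is hypothetical.

```c
#include <stdio.h>

/* Counts the NUL characters in fp by reading with fgetc().
   Because fgetc() returns an int, a zero byte (value 0) is never
   confused with end of file (EOF, a negative value). */
long count_nul_bytes(FILE *fp)
{
    int c;
    long count = 0;

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\0')
            count++;
    }
    return count;
}
```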
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Nov 15 '05 #34

P: n/a
Mark McIntyre wrote:
If it contains nulls, it's definitionally not
a text file since null isn't a printable character.


Your reasoning is invalid.
Text files may contain more than printable characters.

isprint('\n') == 0

new-line characters aren't out of place in a text file.
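
The same point in code form, as a rough sketch: a check for "ordinary text" characters cannot rely on isprint() alone. The helper name is_plain_text_char and the decision to accept exactly printing characters, tab, and new-line are assumptions for illustration; the behaviour described is that of the default "C" locale.

```c
#include <ctype.h>

/* Hypothetical helper: nonzero if c is a printing character, a
   horizontal tab, or a new-line. isprint() alone rejects the last two. */
int is_plain_text_char(int c)
{
    /* The cast keeps the argument in the range isprint() requires. */
    return isprint((unsigned char)c) || c == '\t' || c == '\n';
}
```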

--
pete
Nov 15 '05 #35

P: n/a
On Sun, 30 Oct 2005 17:43:07 +0000 (UTC), Jordan Abel
<jm****@purdue.edu> wrote:
<snip>
The standard forbids text files to contain nulls? Or it allows them
to fail to contain them? I believe the latter is true but not the
former.


Basically the latter. What's actually in a file is out of scope of the
standard, and on some (rare) systems can in fact differ substantially
from what the program sees. What is in scope is that if you write out
data on a text stream to a (text) file and read it back (implicitly
without anything else changing the file) you are guaranteed to get the
same data (and thus the file must contain or represent it somehow) if:
you use only printing characters, HT and NL (which excludes among
others null); you don't have trailing spaces on a line; the
implementation may require that the last line be terminated by NL, and
may limit line length to no less than 254 characters including NL.
(And, as always, you don't exceed any resource limits, e.g. it's
permitted and reasonable to have a limit on file size.)

It is undefined by omission what happens if you violate these
restrictions; as for all UB the implementation may choose to make it
work, and probably will if the (OS) filesystem can easily do so.
- David.Thompson1 at worldnet.att.net
Nov 15 '05 #36

P: n/a
Dave Thompson wrote:
On Sun, 30 Oct 2005 17:43:07 +0000 (UTC), Jordan Abel
<jm****@purdue.edu> wrote:
<snip>
The standard forbids text files to contain nulls? Or it allows them
to fail to contain them? I believe the latter is true but not the
former.

Basically the latter. What's actually in a file is out of scope of the
standard, and on some (rare) systems can in fact differ substantially
from what the program sees. What is in scope is that if you write out
data on a text stream to a (text) file and read it back (implicitly
without anything else changing the file) you are guaranteed to get the
same data (and thus the file must contain or represent it somehow) if:
you use only printing characters, HT and NL (which excludes among
others null); you don't have trailing spaces on a line; the
implementation may require that the last line be terminated by NL, and
may limit line length to no less than 254 characters including NL.
(And, as always, you don't exceed any resource limits, e.g. it's
permitted and reasonable to have a limit on file size.)

It is undefined by omission what happens if you violate these
restrictions; as for all UB the implementation may choose to make it
work, and probably will if the (OS) filesystem can easily do so.
- David.Thompson1 at worldnet.att.net


I think it's a little simpler. Let's stick to Unix for illustration. All
files in Unix are binary. The bytes written on the file are exactly the
ones in memory. There is no difference in the representation. The
Standard allows "rb", "rt" and "wb", "wt" modes for fopen() but they have
no difference with "r" and "w" in Unix. They have effect on files written
by or for other operating systems (Windows, Apple, other), not Unix.

In the old days Unix and C were meant for each other and used ASCII as
the tie that binds. A text file for Unix is one that consists of ASCII
characters. ASCII characters (bytes) have values 0..127 inclusive.

A text file consists of lines of characters. A line consists of 0 or
more characters ending in a new line (NL) character.

Whether the last line must end in NL is implementation defined. The C
Standard doesn't care and neither do I if I am reading. I always write
the NL at the end of the last line of my text files. The char NL has
ASCII value 10.

Note that char '\0' (NUL) is valid in a text file. It has no special
meaning. Anyone writing NUL to a text file should be shot.

The C 'string' is a memory thing, an array of char terminated with 0 or
'\0'. There are no strings in a text file, even if there is a NUL.

Yeah, simpler. Sorry for that.
--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Nov 15 '05 #37

P: n/a
Joe Wright <jw*****@comcast.net> wrote:
Dave Thompson wrote:
It is undefined by omission what happens if you violate these
restrictions; as for all UB the implementation may choose to make it
work, and probably will if the (OS) filesystem can easily do so.
- David.Thompson1 at worldnet.att.net


I think it's a little simpler. Let's stick to Unix for illustration. All
files in Unix are binary. The bytes written on the file are exactly the
ones in memory. There is no difference in the representation. The
Standard allows "rb", "rt" and "wb", "wt" modes for fopen() but they have
no difference with "r" and "w" in Unix.


Only "rb" and "wb"; text is the default, and not specified. "w+", "r+",
"rb+", "r+b", "wb+", "w+b", "a", "ab", "a+", "ab+", "a+b" and "0"[1] are
also allowed.
Of course, you could argue that the Standard "allows" the use of "rt",
since it means undefined behaviour, and undefined behaviour does not
mandate a crash. By that norm, fflush()ing input files or writing
through null pointers is also "allowed", so I don't think it counts.

Richard

[1] Ok, so maybe not that one. And there are no "w" or "r" blood groups.
Nov 15 '05 #38

P: n/a
Richard Heathfield wrote:
googler said:
I'm trying to read from an input text file and print it out. I can do
this by reading each character, but I want to implement it in a more
efficient way. So I thought my program should read one line at a time
and print it out. How can I do this? I wrote the code below but it's
not correct since the fscanf reads one word (terminating in whitespace
or newline) at a time, instead of reading the whole line.


Others have already answered your question, but nobody appears to have
pointed out yet that...

#include <stdio.h>
void main(void)


...in C, main returns int, not void. This is a common error, and those who
commit it often find it hard to believe that it's wrong. Nevertheless, no C
compiler is required to accept a main function that returns void unless it
specifically documents that acceptance - which few, if any, do.


MCC18 (Microchip C Compiler for the 18XXX family) actually requires
main to be defined as void main (void). And it makes sense, because
there is no OS to which you could return a value or from which to take
parameters. Many compilers for microcontrollers also have this
requirement, or simply ignore the return statement from main for given
reasons.

Nov 15 '05 #39

P: n/a
Antonio Contreras said:
MCC18 (Microchip C Compiler for the 18XXX family) actually requires
main to be defined as void main (void). And it makes sense, because
there is no OS to which you could return a value or from which to take
parameters. Many compilers for microcontrollers also have this
requirement, or simply ignore the return statement from main for given
reasons.


I should have excluded freestanding implementations from my rather sweeping
statement.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Nov 15 '05 #40

P: n/a
"googler" <pi********@yahoo.com> writes:
One more question. My code looks like:
while (!feof(fp))
{
fgets(str, 120, fp);
printf("%s", str);
}

This prints the last line twice. I don't understand why. When
it prints the last line for the first time, it should have
known that the end of the file has been reached, so the next
condition check for the while loop should have failed. Why is
it still entering the while loop and printing the last line
again?


1. When fgets() reads the last line of text, you go ahead and
print it.

2. Then you loop back. EOF hasn't been triggered yet for the
input file.

3. You call fgets() again. The return value of fgets() is NULL
and EOF is signalled on the input stream, but you ignore the
return code of fgets() and go ahead to print whatever happens
to be in str[] -- which is the last line of the input file
read in step (1).

Rewriting the loop to something like:

while (!feof(fp) && fgets(str, 120, fp) != NULL)
printf("%s", str);

makes sure that printf() will not be called on str[] when fgets()
returns NULL (i.e. because there is no more data, since you just
reached EOF).

In fact, feof() is a bit redundant here, since fgets() will
return NULL whenever it can't read more data and EOF is just
*one* of those conditions.
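
The difference between the two loops can be made visible by counting iterations. This is only a sketch: the function names are hypothetical, and the count for the first loop assumes an error-free file whose last line ends in a newline.

```c
#include <stdio.h>

/* The loop from the question: feof() is tested before each read. */
int iterations_feof_loop(FILE *fp)
{
    char str[120];
    int n = 0;

    while (!feof(fp)) {
        char *got = fgets(str, sizeof str, fp);
        (void)got;  /* the original ignores this result: that is the bug */
        n++;        /* the body runs even when fgets() read nothing */
    }
    return n;
}

/* The usual idiom: the result of fgets() controls the loop. */
int iterations_fgets_loop(FILE *fp)
{
    char str[120];
    int n = 0;

    while (fgets(str, sizeof str, fp) != NULL)
        n++;
    return n;
}
```

For the two-line test file from the original question, the first loop runs its body three times and the second runs it twice, which is why the last line was printed twice. After the second loop ends, feof(fp) and ferror(fp) tell you whether it stopped at end of file or because of a read error.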

- Giorgos

Nov 27 '05 #41
