Reading a line from a file

Colin JN Breame

Hi,

Fairly new to C. What is the best way to read a line (\n terminated) from
a file? I've looked at fscanf but was not sure which format specifier to
use. (%s perhaps).

Thanks
Colin

Nov 13 '05 #1


Joona I Palaste

Colin JN Breame <c.**********@durham.ac.uk> scribbled the following:

Hi, Fairly new to C. What is the best way to read a line (\n terminated) from
a file? Ive looked at fscanf but was not sure which format specifier to
use. (%s perhaps).

If you know how long the line will be, then fgets() is much better for
the job than fscanf().

--
/-- Joona Palaste (pa*****@cc.helsinki.fi) ------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"War! Huh! Good God, y'all! What is it good for? We asked Mayor Quimby."
- Kent Brockman

Nov 13 '05 #2

Colin JN Breame

On Sun, 07 Dec 2003 18:44:51 +0000, Joona I Palaste wrote:

Colin JN Breame <c.**********@durham.ac.uk> scribbled the following:
Hi,

Fairly new to C. What is the best way to read a line (\n terminated)
from a file? Ive looked at fscanf but was not sure which format
specifier to use. (%s perhaps).

If you know how long the line will be, then fgets() is much better for the
job than fscanf().

The line is variable length.

Nov 13 '05 #3

Kevin Goodsell

Colin JN Breame wrote:

On Sun, 07 Dec 2003 18:44:51 +0000, Joona I Palaste wrote:

Colin JN Breame <c.**********@durham.ac.uk> scribbled the following:
Hi,

Fairly new to C. What is the best way to read a line (\n terminated)
from a file? Ive looked at fscanf but was not sure which format
specifier to use. (%s perhaps).

If you know how long the line will be, then fgets() is much better for the
job than fscanf().

The line is variable length.

Arrays in C are not variable length. (That is, they won't automatically
change size. There is a C99 feature called Variable Length Arrays, but
they never actually change length either.) Therefore, you have a few
options:

1) Create an array that's "large enough" for your longest line.
2) Create a dynamic array and read as much of the line as you can into
that array. If that's the whole line, you are done. If it's not the
whole line, realloc the array to make it larger and read more. Rinse and
repeat until the entire line is read (or you run out of memory,
whichever comes first).

Personally, I'd recommend using 2, and wrapping the functionality up in
a separate function:

char *line = my_getline(some_file);

/* do stuff with line here */

free(line);
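
A minimal sketch of what option 2 might look like. This is not Kevin's code: the name my_getline comes from the post above, but the body, the starting capacity of 64, and the doubling policy are assumptions made for illustration. It reads with fgets() and grows the buffer with realloc() whenever the buffer fills up before a newline is seen.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length from fp.
 * Returns a malloc'd string with the trailing newline removed,
 * or NULL at end of input or if memory runs out.
 * The caller is responsible for calling free() on the result. */
char *my_getline(FILE *fp)
{
    size_t cap = 64;            /* assumed starting capacity */
    size_t len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);

        /* A stored newline means the whole line has been read. */
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';
            return buf;
        }

        /* Buffer full without a newline: double it and read more. */
        if (len + 1 == cap) {
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }

    /* fgets() returned NULL: nothing read at all means end of input;
     * otherwise this is a final line with no trailing newline. */
    if (len == 0) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

An empty line comes back as an empty string rather than NULL, so the caller can tell a blank line apart from end of input.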

-Kevin
--
My email address is valid, but changes periodically.
To contact me please use the address from a recent posting.

Nov 13 '05 #4

Morris Dovey

Kevin Goodsell wrote:

Colin JN Breame wrote:
On Sun, 07 Dec 2003 18:44:51 +0000, Joona I Palaste wrote:
Colin JN Breame <c.**********@durham.ac.uk> scribbled the following:

Fairly new to C. What is the best way to read a line (\n
terminated) from a file? Ive looked at fscanf but was
not sure which format specifier to use. (%s perhaps).

If you know how long the line will be, then fgets() is
much better for the job than fscanf().

The line is variable length.

Arrays in C are not variable length. (That is, they won't
automatically change size. There is a C99 feature called
Variable Length Arrays, but they never actually change length
either.) Therefore, you have a few options:

1) Create an array that's "large enough" for your longest line.
2) Create a dynamic array and read as much of the line as you
can into that array. If that's the whole line, you are done.
If it's not the whole line, realloc the array to make it
larger and read more. Rinse and repeat until the entire line
is read (or you run out of memory, whichever comes first).

Personally, I'd recommend using 2, and wrapping the
functionality up in a separate function:

char *line = my_getline(some_file);

/* do stuff with line here */

free(line);

Colin...

You might try a Google groups comp.lang.c search. Richard
Heathfield and Chuck Falconer have both provided URLs to their
routines; and I have another at http://www.iedu.com/mrd/c/getsm.c
--
Morris Dovey
West Des Moines, Iowa USA
C links at http://www.iedu.com/c
Read my lips: The apple doesn't fall far from the tree.

Nov 13 '05 #5

Richard Heathfield

Colin JN Breame wrote:

On Sun, 07 Dec 2003 18:44:51 +0000, Joona I Palaste wrote:
Colin JN Breame <c.**********@durham.ac.uk> scribbled the following:
Hi,

Fairly new to C. What is the best way to read a line (\n terminated)
from a file? Ive looked at fscanf but was not sure which format
specifier to use. (%s perhaps).

If you know how long the line will be, then fgets() is much better for
the job than fscanf().

The line is variable length.

I discuss this problem at some length(!) on my Web site:

http://users.powernet.co.uk/eton/c/fgetdata.html

where I discuss various standard library functions for capturing string
data, and then present functions for getting: (a) an entire word, and (b)
an entire line, from a stream, irrespective of length (up to obvious limits
of memory, of course).

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #6

Richard Heathfield

Morris Dovey wrote:

You might try a Google groups comp.lang.c search. Richard
Heathfield and Chuck Falconer have both provided URLs to their
routines; and I have another at http://www.iedu.com/mrd/c/getsm.c

I have taken the liberty of adding (to the page on my site dealing with this
issue) a link to your URL, with the intent of providing lots of choice to
the discerning input-issues-aware programmer.

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #7

CBFalconer

Morris Dovey wrote:

Kevin Goodsell wrote:
Colin JN Breame wrote:
On Sun, 07 Dec 2003 18:44:51 +0000, Joona I Palaste wrote:
<c.**********@durham.ac.uk> scribbled the following:

> Fairly new to C. What is the best way to read a line (\n
> terminated) from a file? Ive looked at fscanf but was
> not sure which format specifier to use. (%s perhaps).

If you know how long the line will be, then fgets() is
much better for the job than fscanf().

The line is variable length.

Arrays in C are not variable length. (That is, they won't
automatically change size. There is a C99 feature called
Variable Length Arrays, but they never actually change length
either.) Therefore, you have a few options:

1) Create an array that's "large enough" for your longest line.
2) Create a dynamic array and read as much of the line as you
can into that array. If that's the whole line, you are done.
If it's not the whole line, realloc the array to make it
larger and read more. Rinse and repeat until the entire line
is read (or you run out of memory, whichever comes first).

Personally, I'd recommend using 2, and wrapping the
functionality up in a separate function:

char *line = my_getline(some_file);

/* do stuff with line here */

free(line);

You might try a Google groups comp.lang.c search. Richard
Heathfield and Chuck Falconer have both provided URLs to their
routines; and I have another at http://www.iedu.com/mrd/c/getsm.c

<http://cbfalconer.home.att.net/download/> which will also
provide a URL for Richards version.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 13 '05 #8

Paul Hsieh

In article <m9******************@news.uswest.net>, mr*****@iedu.com says...

Kevin Goodsell wrote:
Colin JN Breame wrote:
On Sun, 07 Dec 2003 18:44:51 +0000, Joona I Palaste wrote:

Colin JN Breame <c.**********@durham.ac.uk> scribbled the following:

> Fairly new to C. What is the best way to read a line (\n
> terminated) from a file? Ive looked at fscanf but was
> not sure which format specifier to use. (%s perhaps).

If you know how long the line will be, then fgets() is
much better for the job than fscanf().

The line is variable length.

So you need a fully dynamic solution ...

Colin...

You might try a Google groups comp.lang.c search. Richard
Heathfield and Chuck Falconer have both provided URLs to their
routines; and I have another at http://www.iedu.com/mrd/c/getsm.c

This getsm.c, *in practice*, will overflow the stack of just about any fixed
stack implementation of C for large enough input. So it's just a buffer
overflow of a different kind. Both Richard Heathfield's and Chuck Falconer's
solutions are O(n^2) in performance, where n is the length of the input --
i.e., for large enough input, your machine will simply slow to a crawl (not to
mention the fact that they will shred the heap of any substandard heap
implementation) and actually be unable to retain the input in reasonable time.

A somewhat more general solution that doesn't suffer either of these problems
can be found here:

http://www.pobox.com/~qed/userInput.html

The example modes of usage are O(n) with respect to the input, you can
*optionally* set an upper bound for the target buffer, or with your own
customization you can process the input incrementally without storing the whole
input buffer if that makes sense in your application. It comes with examples
of the most common cases. For example:

char * s;
getstralloc (&s);
if (s) {
printf ("<%s>\n", s);
free (s);
}

will perform what you would have hoped gets() did without the built-in
unavoidable undefined behaviour.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 13 '05 #9

CBFalconer

Paul Hsieh wrote:

Morris Dovey wrote:
.... snip ...
You might try a Google groups comp.lang.c search. Richard
Heathfield and Chuck Falconer have both provided URLs to their
routines; and I have another at http://www.iedu.com/mrd/c/getsm.c

This getsm.c, *in practice*, will overflow the stack of just about
any fixed stack implementation of C for large enough input. So
it's just a buffer overflow of a different kind. Both Richard
Heathfield's and Chuck Falconer's solutions are O(n^2) in
performance, where n is the length of the input -- i.e., for large
enough input, your machine will simply slow to a crawl (not to
mention the fact that they will shred the heap of any substandard
heap implementation) and actually be unable to retain the input in
reasonable time.

No, they aren't O(n^2); they are O(n) on any system with a good
realloc policy. They are also arranged to provide best
performance on the most likely input. Extremely long lines are not
the norm for interactive input or any text - the typer tends to
get tired. At any rate the system is almost certainly going to be
I/O limited, not processor limited.
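
The disagreement above turns on how the buffer grows. As a rough illustration only (this models neither Richard's nor Chuck's actual routines), the two hypothetical functions below count how many bytes would be copied while growing a buffer up to n bytes, under the pessimistic assumption that every realloc moves the whole block. Growing by a fixed step gives a total that rises with n^2; doubling keeps the total below n.

```c
#include <stddef.h>

/* Illustrative model, not a measurement: assumes every growth step
 * copies the entire current buffer to a new location. */

/* Bytes copied when capacity grows by a fixed step until it reaches n. */
unsigned long long copied_fixed(size_t n, size_t step)
{
    unsigned long long total = 0;
    size_t cap = step;

    while (cap < n) {
        total += cap;   /* whole buffer copied on this growth */
        cap += step;
    }
    return total;
}

/* Bytes copied when capacity doubles until it reaches n. */
unsigned long long copied_doubling(size_t n, size_t start)
{
    unsigned long long total = 0;
    size_t cap = start;

    while (cap < n) {
        total += cap;   /* whole buffer copied on this growth */
        cap *= 2;
    }
    return total;
}
```

For n = 1024 and a step or start of 128, the model gives 3584 bytes copied for fixed steps against 896 for doubling. A realloc that can extend the block in place copies far less than this, which is presumably what "a good realloc policy" refers to.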

I don't know about the getsm version.

Please take more care with your attributions. I corrected them
above.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 13 '05 #10

Morris Dovey

Paul Hsieh wrote:

This getsm.c, *in practice*, will overflow the stack of just
about any fixed stack implementation of C for large enough
input. So it's just a buffer overflow of a different kind.

Absolutely correct. It blows up fairly reliably on my test
platform not long after you type the 700,000th character in the
line; but stack size, type, and even existence are, of course,
not C issues.

These essays may stimulate Colin to think in new ways (for him)
and to produce a solution that improves on all of the examples.
This is also, of course, not a C issue - but (IMO) one of the
better possible outcomes.

--
Morris Dovey
West Des Moines, Iowa USA
C links at http://www.iedu.com/c
Read my lips: The apple doesn't fall far from the tree.

Nov 13 '05 #11

Richard Heathfield

Paul Hsieh wrote:

<snip>

Both Richard Heathfield's and Chuck
Falconer's solutions are O(n^2) in performance, where n is the length of
the input -- i.e., for large enough input, your machine will simply slow to
a crawl (not to mention the fact that they will shred the heap of any
substandard heap implementation) and actually be unable to retain the
input in reasonable time.

That was news to me, so I catted all my /usr/include stuff, recursively,
giving me a corpus of about 16 MB. I then wrote a little getc(stdin) loop
to hash the whole lot, and it crunched through the bytes at about 4.8
MB/sec on my machine. Then I wrote a loop using fgetline (the solution you
claim is O(n^2), above). I expected it to be a little slower, since it uses
dynamic memory allocation to allow line-by-line reading (which is the whole
point of the function, obviously). It turned out to crank through at about
3.8 MB/sec, which doesn't square with your claim of O(n^2) as far as I can
see.

A somewhat more general solution that doesn't suffer either of these
problems can be found here:

http://www.pobox.com/~qed/userInput.html

Ah, yes. I tried that, having first inserted the necessary headers and a
little test driver.

I got three rather serious compilation errors (one for a missing declaration
of n, and a couple where you meant . but used -> instead). After I'd fixed
those, I wrote a test program, which segfaulted (in your code, not mine) on
the first iteration.

My routine may not be the fastest line-reader in the world, but at least it
works. If we remove the requirement that the code must work correctly, I
can write a version that will take no memory and run in zero time.

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #12

nrk

Richard Heathfield wrote:

<snip>

My routine may not be the fastest line-reader in the world, but at least
it works. If we remove the requirement that the code must work correctly,
I can write a version that will take no memory and run in zero time.

LOL!!! That's a great quote that I am framing for posterity when faced with
similar situations (of course, due attributions will be made :-)

-nrk.

Nov 13 '05 #13

Kevin Goodsell

nrk wrote:

Richard Heathfield wrote:

<snip>
My routine may not be the fastest line-reader in the world, but at least
it works. If we remove the requirement that the code must work correctly,
I can write a version that will take no memory and run in zero time.

LOL!!! That's a great quote that I am framing for posterity when faced with
similar situations (of course, due attributions will be made :-)

I think Richard is copying me. ;-) Several months ago in comp.lang.c++ I
said:

"Once the requirement for correctness is removed, the speed of the
program becomes irrelevant, because it can be arbitrarily fast or
slow."

-Kevin
--
My email address is valid, but changes periodically.
To contact me please use the address from a recent posting.

Nov 13 '05 #14

Richard Heathfield

Kevin Goodsell wrote:

nrk wrote:
Richard Heathfield wrote:

<snip>
My routine may not be the fastest line-reader in the world, but at least
it works. If we remove the requirement that the code must work correctly,
I can write a version that will take no memory and run in zero time.

LOL!!! That's a great quote that I am framing for posterity when faced
with similar situations (of course, due attributions will be made :-)

I think Richard is copying me. ;-)

No, I don't think so. But the more I think about it, the more I think I
copied /somebody/. After posting that earlier this evening, I spent about
half an hour ploughing through some old programming books, trying to
discover whether I'd inadvertently plagiarised someone or other. No joy.
But if I got it from anywhere, it was probably from either McConnell or
Maguire.

Several months ago in comp.lang.c++ I
said:

"Once the requirement for correctness is removed, the speed of the
program becomes irrelevant, because it can be arbitrarily fast or
slow."

And what would I be doing reading clc++? :-)

--
Richard Heathfield : bi****@eton.powernet.co.uk
(Actually, I do, sometimes.)
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #15

Richard Heathfield

Kevin Goodsell wrote:

nrk wrote:
Richard Heathfield wrote:

<snip>
My routine may not be the fastest line-reader in the world, but at least
it works. If we remove the requirement that the code must work correctly,
I can write a version that will take no memory and run in zero time.

LOL!!! That's a great quote that I am framing for posterity when faced
with similar situations (of course, due attributions will be made :-)

I think Richard is copying me. ;-)

Nailed it.

"...your program /doesn't work/. If mine doesn't have to work, I can make it
run instantly and take up no memory." Steve McConnell, "Code Complete",
p682, quoting Gerald Weinberg relating an incident happening to someone
else entirely.

So it's a complete re-phrasing of a half-remembered reference to a
second-hand anecdote. This is probably some kind of plagiaristic record.

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #16

Kevin Goodsell

Richard Heathfield wrote:

And what would I be doing reading clc++? :-)

Looking for things to plagiarize, of course. :P

OK, probably not. But aren't you also a C++ programmer? Or have I
confused you with someone else?

-Kevin
--
My email address is valid, but changes periodically.
To contact me please use the address from a recent posting.

Nov 13 '05 #17

Kevin Goodsell

Richard Heathfield wrote:

Kevin Goodsell wrote:

I think Richard is copying me. ;-)

Nailed it.

"...your program /doesn't work/. If mine doesn't have to work, I can make it
run instantly and take up no memory." Steve McConnell, "Code Complete",
p682, quoting Gerald Weinberg relating an incident happening to someone
else entirely.

So it's a complete re-phrasing of a half-remembered reference to a
second-hand anecdote. This is probably some kind of plagiaristic record.

But I had never even read the reference. Does that mean I beat your
record? ;-)

(I do plan to read that book, though. I have a copy, but it's barely
been opened. I think I read the first 10 pages when I first got it, then
switched to something else as I have a tendency to do.)

-Kevin
--
My email address is valid, but changes periodically.
To contact me please use the address from a recent posting.

Nov 14 '05 #18

Richard Heathfield

Kevin Goodsell wrote:

Richard Heathfield wrote:

And what would I be doing reading clc++? :-)

Looking for things to plagiarize, of course. :P

OK, probably not. But aren't you also a C++ programmer?

Only when pressed by hunger. :-)

Yes, I do program in C++ on occasion, mainly when I need a GUI in a hurry.
But I don't particularly /like/ the language, so I generally don't read
clc++ particularly often.

Or have I
confused you with someone else?

Oh, undoubtedly, my dear chap. Undoubtedly. But not for the reason you
think.
--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 14 '05 #19

John Bode

Colin JN Breame <c.**********@durham.ac.uk> wrote in message news:<pa***************************@dur.ac.uk>...

Hi,

Fairly new to C. What is the best way to read a line (\n terminated) from
a file? Ive looked at fscanf but was not sure which format specifier to
use. (%s perhaps).

Thanks
Colin

If you're going to use fscanf() to read '\n'-terminated lines from a
file and store the whole line to a single buffer, use the %[
conversion specifier:

char buff[101];
FILE *infile;
...
fscanf (infile, "%100[^\n]%*[^\n]%*c", buff);

The conversion specifier "%100[^\n]" means "read characters until we
see EOF, a newline character ('\n'), or until we've read 100
characters, and assign them to buff." The conversion specifier
"%*[^\n]" means "read characters until we see EOF or a newline and
throw them away." This conversion specifier is there in case the
input line is longer than our expected maximum line length, and gives
us a way to remove those extra characters from the input buffer. The
"%*c" conversion specifier means "read the next character (which
should be the newline) and throw it away." This removes the newline
character from the input buffer. You never want to use the "%["
conversion specifier without specifying a maximum field width;
fscanf() has no way to tell how big your target buffer is unless you
explicitly tell it, so if your input buffer is sized for 100
characters and the input line is 132 characters and you haven't
specified a maximum field width, fscanf() will attempt to write those
extra 32 characters to memory outside your buffer, which will cause a
crash (if you're lucky) or otherwise weird behavior (if you're not).

Alternately, you can use fgets() to read an input line into a buffer.
Like the %[ conversion specifier above, you specify a maximum buffer
length:

char buff[101];
FILE *infile;
...
fgets (buff, sizeof buff, infile);

Like the %[ conversion specifier above, fgets() will read until it
sees either an EOF, a newline, or until we've read 100 characters, and
stores them to buff. Unlike the conversion specifier used above, the
newline character is stored as part of the buffer. Also, unlike
fscanf(), there's no provision to automatically consume and discard
any characters beyond the expected input line length; you'll have to
call fgets() (or other input routine) repeatedly to clear out the
input buffer. Note that fflush() should *not* be used to clear the
input buffer; you must use an actual input routine.
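
A minimal sketch of the fgets() approach with the leftover characters cleaned up, as described above. The function name read_line_fixed and its return convention are assumptions for illustration, not part of the standard library or of the original post.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buff (at most size - 1
 * characters), strip the newline, and discard whatever does not fit.
 * Returns 1 if the whole line fit, 0 if it was truncated,
 * EOF at end of input or on a read error. */
int read_line_fixed(char *buff, size_t size, FILE *infile)
{
    size_t len;
    int ch;

    if (fgets(buff, (int)size, infile) == NULL)
        return EOF;

    len = strlen(buff);
    if (len > 0 && buff[len - 1] == '\n') {
        buff[len - 1] = '\0';   /* drop the stored newline */
        return 1;
    }

    /* No newline was stored. Either the line ended exactly at the
     * buffer limit or at end of file (it still fit), or there is
     * more of the line waiting in the stream. */
    ch = getc(infile);
    if (ch == '\n' || ch == EOF)
        return 1;

    /* Discard the rest of the overlong line with real input calls,
     * never with fflush(). */
    while ((ch = getc(infile)) != '\n' && ch != EOF)
        ;
    return 0;
}
```

Called as read_line_fixed(buff, sizeof buff, infile), the next call always starts at the beginning of the next line, whether or not the previous line was too long.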

Nov 14 '05 #20

Dan Pop

In <43**************************@posting.google.com > jo*******@my-deja.com (John Bode) writes:

Colin JN Breame <c.**********@durham.ac.uk> wrote in message news:<pa***************************@dur.ac.uk>...
Hi,

Fairly new to C. What is the best way to read a line (\n terminated) from
a file? Ive looked at fscanf but was not sure which format specifier to
use. (%s perhaps).

Thanks
Colin

If you're going to use fscanf() to read '\n'-terminated lines from a
file and store the whole line to a single buffer, use the %[
conversion specifier:

char buff[101];
FILE *infile;
...
fscanf (infile, "%100[^\n]%*[^\n]%*c", buff);

The conversion specifier "%100[^\n]" means "read characters until we
see EOF, a newline character ('\n'), or until we've read 100
characters, and assign them to buff." The conversion specifier
"%*[^\n]" means "read characters until we see EOF or a newline and
throw them away." This conversion specifier is there in case the
input line is longer than our expected maximum line length, and gives
us a way to remove those extra characters from the input buffer. The
"%*c" conversion specifier means "read the next character (which
should be the newline) and throw it away."

What happens if the user types 100 or less characters followed by a
newline?

What happens if the user simply types the newline character?

What happens if the user presses the eof key instead?

Your example is mishandling all these cases (well, the last one can still
be detected with an feof() call, but it's rather ugly).

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 14 '05 #21

Colin JN Breame

On Tue, 09 Dec 2003 08:14:49 -0800, John Bode wrote:

Colin JN Breame <c.**********@durham.ac.uk> wrote in message
news:<pa***************************@dur.ac.uk>...
Hi,

Fairly new to C. What is the best way to read a line (\n terminated)
from a file? Ive looked at fscanf but was not sure which format
specifier to use. (%s perhaps).

Thanks
Colin

If you're going to use fscanf() to read '\n'-terminated lines from a file
and store the whole line to a single buffer, use the %[ conversion
specifier:

char buff[101];
FILE *infile;
...
fscanf (infile, "%100[^\n]%*[^\n]%*c", buff);

The conversion specifier "%100[^\n]" means "read characters until we see
EOF, a newline character ('\n'), or until we've read 100 characters, and
assign them to buff." The conversion specifier "%*[^\n]" means "read
characters until we see EOF or a newline and throw them away." This
conversion specifier is there in case the input line is longer than our
expected maximum line length, and gives us a way to remove those extra
characters from the input buffer. The "%*c" conversion specifier means
"read the next character (which should be the newline) and throw it away."
This removes the newline character from the input buffer. You never want
to use the "%[" conversion specifier without specifying a maximum field
width; fscanf() has no way to tell how big your target buffer is unless
you explicitly tell it, so if your input buffer is sized for 100
characters and the input line is 132 characters and you haven't specified
a maximum field width, fscanf() will attempt to write those extra 32
characters to memory outside your buffer, which will cause a crash (if
you're lucky) or otherwise weird behavior (if you're not).

Alternately, you can use fgets() to read an input line into a buffer. Like
the %[ conversion specifier above, you specify a maximum buffer length:

char buff[101];
FILE *infile;
...
fgets (buff, sizeof buff, infile);

Like the %[ conversion specifier above, fgets() will read until it sees
either an EOF, a newline, or until we've read 100 characters, and stores
them to buff. Unlike the conversion specifier used above, the newline
character is stored as part of the buffer. Also, unlike fscanf(), there's
no provision to automatically consume and discard any characters beyond
the expected input line length; you'll have to call fgets() (or other
input routine) repeatedly to clear out the input buffer. Note that
fflush() should *not* be used to clear the input buffer; you must use an
actual input routine.

Looks interesting, thanks all for the suggestions. Unfortunately, I'm
now being forced to use C++.

Thanks again!

Nov 14 '05 #22

John Bode

Da*****@cern.ch (Dan Pop) wrote in message news:<br**********@sunnews.cern.ch>...

In <43**************************@posting.google.com > jo*******@my-deja.com (John Bode) writes:
Colin JN Breame <c.**********@durham.ac.uk> wrote in message news:<pa***************************@dur.ac.uk>...
Hi,

Fairly new to C. What is the best way to read a line (\n terminated) from
a file? Ive looked at fscanf but was not sure which format specifier to
use. (%s perhaps).

Thanks
Colin

If you're going to use fscanf() to read '\n'-terminated lines from a
file and store the whole line to a single buffer, use the %[
conversion specifier:

char buff[101];
FILE *infile;
...
fscanf (infile, "%100[^\n]%*[^\n]%*c", buff);

The conversion specifier "%100[^\n]" means "read characters until we
see EOF, a newline character ('\n'), or until we've read 100
characters, and assign them to buff." The conversion specifier
"%*[^\n]" means "read characters until we see EOF or a newline and
throw them away." This conversion specifier is there in case the
input line is longer than our expected maximum line length, and gives
us a way to remove those extra characters from the input buffer. The
"%*c" conversion specifier means "read the next character (which
should be the newline) and throw it away."

What happens if the user types 100 or less characters followed by a
newline?

What happens if the user simply types the newline character?

What happens if the user presses the eof key instead?

Your example is mishandling all these cases (well, the last one can still
be detected with an feof() call, but it's rather ugly).

Dan

See, this is why I prefer fgets() for interactive input; getting the
conversion specifiers for fscanf() just right is apparently beyond my
abilities.

:-p~~~

For the benefit of the OP, could you show how to handle those cases
properly, since I borked it?

Nov 14 '05 #23

nrk

John Bode wrote:

Da*****@cern.ch (Dan Pop) wrote in message
news:<br**********@sunnews.cern.ch>...
In <43**************************@posting.google.com >
jo*******@my-deja.com (John Bode) writes:
>Colin JN Breame <c.**********@durham.ac.uk> wrote in message
>news:<pa***************************@dur.ac.uk>. ..
>> Hi,
>>
>> Fairly new to C. What is the best way to read a line (\n terminated)
>> from
>> a file? Ive looked at fscanf but was not sure which format specifier
>> to
>> use. (%s perhaps).
>>
>> Thanks
>> Colin
>
>If you're going to use fscanf() to read '\n'-terminated lines from a
>file and store the whole line to a single buffer, use the %[
>conversion specifier:
>
> char buff[101];
> FILE *infile;
> ...
> fscanf (infile, "%100[^\n]%*[^\n]%*c", buff);
>
>The conversion specifier "%100[^\n]" means "read characters until we
>see EOF, a newline character ('\n'), or until we've read 100
>characters, and assign them to buff." The conversion specifier
>"%*[^\n]" means "read characters until we see EOF or a newline and
>throw them away." This conversion specifier is there in case the
>input line is longer than our expected maximum line length, and gives
>us a way to remove those extra characters from the input buffer. The
>"%*c" conversion specifier means "read the next character (which
>should be the newline) and throw it away."

What happens if the user types 100 or less characters followed by a
newline?

While the line is correctly read into buff, the trailing newline remains in
the input stream. This is because scanning fails at the %*[^\n] specifier
(there are no non-newline characters left for it to match), so fscanf stops
at that point and never reaches %*c. The only solution I can think of is
breaking the call into two separate ones, one to get the first 100
characters of the line and any possible trailing non-newline junk, and the
next to get rid of newlines.

What happens if the user simply types the newline character?

This is more insidious. buff is left unassigned in this case, as the very
first conversion specifier (%100[^\n]) fails. The way to handle this is to
look at the return value of fscanf. If it's 0, that means none of the items
were assigned to, so handle it appropriately.

What happens if the user presses the eof key instead?

Again, check the return value of fscanf. If it's EOF, then it signifies end
of input or error in input. Handle appropriately.

Your example is mishandling all these cases (well, the last one can still
be detected with an feof() call, but it's rather ugly).

Dan

See, this is why I prefer fgets() for interactive input; getting the
conversion specifiers for fscanf() just right is apparently beyond my
abilities.

:-p~~~

For the benefit of the OP, could you show how to handle those cases
properly, since I borked it?

My attempt:
int rc;

rc = fscanf(input, "%100[^\n]%*[^\n]", buff);

if ( rc == EOF ) {
/* end of input/error in input handling */
}
else if ( rc < 1 ) {
/* buff was not assigned to... */
/* note that this is simply a newline by itself
in all likelihood, you can just scan the newline
out and continue as normal */
}

/* scan out newline + following empty lines */
rc = fscanf(input, "%*[\n]");
if ( rc == EOF ) {
/* end of input/error in input handling */
}
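
For comparison, a self-contained variant of the same idea, offered as a sketch rather than as nrk's code. The name read_line_scanf is made up for illustration, and it differs from the attempt above in one respect: it consumes exactly one newline with getc() instead of "%*[\n]", so that empty lines are reported to the caller rather than skipped.

```c
#include <stdio.h>

/* Hypothetical wrapper around the fscanf() approach.
 * buff must have room for 101 characters.
 * Returns 1 if text was stored, 0 for an empty line,
 * EOF at end of input or on a read error. */
int read_line_scanf(FILE *input, char *buff)
{
    int rc;

    buff[0] = '\0';     /* defined contents even if nothing matches */

    rc = fscanf(input, "%100[^\n]", buff);
    if (rc == EOF)
        return EOF;

    /* Throw away anything beyond 100 characters. The result is
     * ignored on purpose: failing to match here is the normal case. */
    (void)fscanf(input, "%*[^\n]");

    /* Consume the newline that ends this line, if there is one. */
    (void)getc(input);

    return rc;
}
```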

-nrk.

Nov 14 '05 #24

someone else

"Joona I Palaste" <pa*****@cc.helsinki.fi> wrote in message
news:bq**********@oravannahka.helsinki.fi...

Colin JN Breame <c.**********@durham.ac.uk> scribbled the following:
Hi,
Fairly new to C. What is the best way to read a line (\n terminated) from a file? Ive looked at fscanf but was not sure which format specifier to
use. (%s perhaps).

%s will scan up to the first space character

try fscaf(your_file_pointer,"%[^\n]%*c",your_target)
this will scan everything up to \n into your_target, and then scan and
discard the \n character itself
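
A hedged sketch of that format in use, with a field width of 100 added so the scan cannot overrun the buffer (the width and the name scan_line are assumptions, not part of the suggestion). The return value matters: 1 means a line was stored, EOF means end of input, and 0 means the line was empty, in which case nothing is stored and the newline is still sitting in the stream.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper around the format suggested above, with a field
   width added so that at most 100 characters are stored in target
   (which must hold 101 chars). Returns whatever fscanf returns. */
int scan_line(FILE *fp, char *target)
{
    assert(fp != NULL && target != NULL);
    return fscanf(fp, "%100[^\n]%*c", target);
}
```

On an empty line the caller has to remove the newline itself (for instance with getc), otherwise every later call fails the same way. On a line longer than 100 characters the %*c discards the 101st character rather than the newline.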
If you know how long the line will be, then fgets() is much better for
the job than fscanf().

--
/-- Joona Palaste (pa*****@cc.helsinki.fi) ------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"War! Huh! Good God, y'all! What is it good for? We asked Mayor Quimby."
- Kent Brockman

Nov 14 '05 #25

Dan Pop

In <_t***************@nwrddc03.gnilink.net> nrk <ra*********@deadbeef.verizon.net> writes:

John Bode wrote:
Da*****@cern.ch (Dan Pop) wrote in message
news:<br**********@sunnews.cern.ch>...
In <43**************************@posting.google.com >
jo*******@my-deja.com (John Bode) writes:

>Colin JN Breame <c.**********@durham.ac.uk> wrote in message
>news:<pa***************************@dur.ac.uk>...
>> Hi,
>>
>> Fairly new to C. What is the best way to read a line (\n terminated) from
>> a file? Ive looked at fscanf but was not sure which format specifier to
>> use. (%s perhaps).
>>
>> Thanks
>> Colin
>
>If you're going to use fscanf() to read '\n'-terminated lines from a
>file and store the whole line to a single buffer, use the %[
>conversion specifier:
>
> char buff[101];
> FILE *infile;
> ...
> fscanf (infile, "%100[^\n]%*[^\n]%*c", buff);
>
>The conversion specifier "%100[^\n]" means "read characters until we
>see EOF, a newline character ('\n'), or until we've read 100
>characters, and assign them to buff." The conversion specifier
>"%*[^\n]" means "read characters until we see EOF or a newline and
>throw them away." This conversion specifier is there in case the
>input line is longer than our expected maximum line length, and gives
>us a way to remove those extra characters from the input buffer. The
>"%*c" conversion specifier means "read the next character (which
>should be the newline) and throw it away."

What happens if the user types 100 or less characters followed by a
newline?

While the line is correctly read into buff, the trailing newline remains in
the input stream. This is because there is a failure in scanning for the
%*[^\n] specifier and scanning will stop at that point. The only solution
I can think of is breaking the call into two separate ones, one to get the
first 100 of the line and any possible trailing non-newline junk, and the
next to get rid of newlines.
Right.

What happens if the user simply types the newline character?
This is more insidious. buff is uninitialized in this case, as there is a
failure for the very first conversion specifier (%100[^\n]). The way to
handle this is to look at the return code of fscanf. If it's 0, that means
none of the items were assigned to, so handle it appropriately.

Right, but he was discarding the return value of fscanf, which is
extremely important.

What happens if the user presses the eof key instead?

Again, check the return code of fscanf. If it's EOF, then it signifies end
of input or error in input. Handle appropriately.

Same remark as above.
My attempt:
int rc;

rc = fscanf(input, "%100[^\n]%*[^\n]", buff);

if ( rc == EOF ) {
/* end of input/error in input handling */
}
else if ( rc < 1 ) {
/* buff was not assigned to... */
/* note that this is simply a newline by itself
in all likelihood, you can just scan the newline
out and continue as normal */
}

/* scan out newline + following empty lines */

It's usually better to leave the following empty lines alone. They may
be significant as such.

rc = fscanf(input, "%*[\n]");
if ( rc == EOF ) {
/* end of input/error in input handling */
}

Things can be done much simpler:

char buff[100 + 1] = "";

int rc = fscanf(input, "%100[^\n]%*[^\n]", buff);
if (!feof(input)) getc(input);

Now, if rc != EOF, buff contains valid user input, which may be an
empty string if the user simply pressed the newline key.
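
Wrapped up as a function, the same idea might look like the sketch below. The name read_line is hypothetical, and buff is reset on every call, which stands in for the = "" initializer when the buffer is reused; buff is assumed to hold 101 chars.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper, not from the thread. buff must hold 101 chars.
   Returns fscanf's result: EOF at end of input (or on a read error),
   otherwise 0 or 1 with a valid, possibly empty, string in buff. */
int read_line(FILE *input, char *buff)
{
    int rc;

    assert(input != NULL && buff != NULL);
    buff[0] = '\0';   /* same effect as the = "" initializer, per call */

    /* keep up to 100 chars, discard whatever else is on the line */
    rc = fscanf(input, "%100[^\n]%*[^\n]", buff);
    if (!feof(input))
        getc(input);  /* consume the newline that ended the line */
    return rc;
}
```

A caller would typically loop with while (read_line(input, buff) != EOF) and treat an empty buff as an empty line.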

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 14 '05 #26

nrk

Dan Pop wrote:

<snip>

It's usually better to leave the following empty lines alone. They may
be significant as such.

Agreed. However, if all that needs to be done is maintain a line number
count, I think the fscanf can be modified to add a %n specifier to achieve
the same result.
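
A rough sketch of the %n idea, assuming it is used in the second call that scans out the newlines; skip_newlines is a made-up name. %n stores the number of characters consumed so far by that fscanf call, which here is the number of newlines skipped.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: skips a run of newlines and returns how many
   were skipped (0 if the next character is not a newline or input
   has ended). */
int skip_newlines(FILE *input)
{
    int n = 0;   /* stays 0 if %n is never reached */

    assert(input != NULL);
    /* %n is only reached if %*[\n] matched at least one newline */
    if (fscanf(input, "%*[\n]%n", &n) == EOF)
        n = 0;   /* input ended before any newline was seen */
    return n;
}
```

Called right after a line has been read, the first newline counted is the one that ended that line, so the number of empty lines skipped is the result minus one.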

rc = fscanf(input, "%*[\n]");
if ( rc == EOF ) {
/* end of input/error in input handling */
}

Things can be done much simpler:

char buff[100 + 1] = "";

int rc = fscanf(input, "%100[^\n]%*[^\n]", buff);
if (!feof(input)) getc(input);

Now, if rc != EOF, buff contains valid user input, which may be an
empty string if the user simply pressed the newline key.

Is there any specific reason you've used feof instead of directly testing rc
against EOF above? Wouldn't rc != EOF give you the same effect?

-nrk.
Dan

Nov 14 '05 #27

Dan Pop

In <6f****************@nwrddc03.gnilink.net> nrk <ra*********@deadbeef.verizon.net> writes:

Dan Pop wrote:

<snip>

It's usually better to leave the following empty lines alone. They may
be significant as such.

Agreed. However, if all that needs to be done is maintain a line number
count, I think the fscanf can be modified to add a %n specifier to achieve
the same result.

When reading input from the terminal, an empty line often means the user's
acceptance of a default value. Your approach merely complicates
things in such a case.

Things can be done much simpler:

char buff[100 + 1] = "";

int rc = fscanf(input, "%100[^\n]%*[^\n]", buff);
if (!feof(input)) getc(input);

Now, if rc != EOF, buff contains valid user input, which may be an
empty string if the user simply pressed the newline key.

Is there any specific reason you've used feof instead of directly testing rc
against EOF above? Wouldn't rc != EOF give you the same effect?

What happens if the eof condition occurs after fscanf has already read 20
characters? rc will be 1, but there is no point in calling getc(),
especially on certain implementations with non-sticky eof where getc()
will effectively attempt to get more input.
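
To see that case for yourself, a small sketch along these lines should do. The helper name scan_and_check_eof and its hit_eof parameter are invented for the illustration; buff is assumed to hold 101 chars.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: runs the scan from the post above and reports,
   through *hit_eof, whether the end-of-file indicator got set.
   buff must hold 101 chars. Returns fscanf's result. */
int scan_and_check_eof(FILE *input, char *buff, int *hit_eof)
{
    int rc;

    assert(input != NULL && buff != NULL && hit_eof != NULL);
    buff[0] = '\0';
    rc = fscanf(input, "%100[^\n]%*[^\n]", buff);
    *hit_eof = feof(input) != 0;   /* 1 if the scan ran into end of file */
    return rc;
}
```

When the last line of the input has no trailing newline, rc is 1 and *hit_eof is 1 at the same time, so testing rc against EOF alone would still call getc() after the input has ended.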

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 14 '05 #28

nrk

Dan Pop wrote:

In <6f****************@nwrddc03.gnilink.net> nrk
<ra*********@deadbeef.verizon.net> writes:
Dan Pop wrote:

<snip>

It's usually better to leave the following empty lines alone. They may
be significant as such.

Agreed. However, if all that needs to be done is maintain a line number
count, I think the fscanf can be modified to add a %n specifier to achieve
the same result.

When reading input from the terminal, an empty line often means the user's
acceptance of a default value. Your approach merely complicates
things in such a case.
Things can be done much simpler:

char buff[100 + 1] = "";

int rc = fscanf(input, "%100[^\n]%*[^\n]", buff);
if (!feof(input)) getc(input);

Now, if rc != EOF, buff contains valid user input, which may be an
empty string if the user simply pressed the newline key.

Is there any specific reason you've used feof instead of directly testing
rc
against EOF above? Wouldn't rc != EOF give you the same effect?

What happens if the eof condition occurs after fscanf has already read 20
characters? rc will be 1, but there is no point in calling getc(),
especially on certain implementations with non-sticky eof where getc()
will effectively attempt to get more input.

Dan

Thank you. I had missed considering both interactive input, and the case of
end of file being reached with rc == 1.

-nrk.

Nov 14 '05 #29
