parsing config file

If I am parsing a config file that uses '#' for comments and the
config file itself is 1640 bytes, and the format is VARIABLE=VALUE, is
it recommended to use

a) fgetc (parse a character at a time)
b) fgets (read in blocks of whatever size)
c) fread (get the size of the file and fread the entire thing into
memory)

and when would it be appropriate to use either a, b, or c?
nethlek
Nov 13 '05 #1
9 Replies


Mantorok Redgormor <ne*****@tokyo.com> scribbled the following:
If I am parsing a config file that uses '#' for comments and the
config file itself is 1640 bytes, and the format is VARIABLE=VALUE, is
it recommended to use a) fgetc (parse a character at a time)
b) fgets (read in blocks of whatever size)
c) fread (get the size of the file and fread the entire thing into
memory) and when would it be appropriate to use either a, b, or c?


If the config file's format is such that each VARIABLE=VALUE is on a
separate line, I definitely recommend b) fgets. Otherwise you're best
off with c) fread, but the problem is, you'll have to parse the
delimiters out yourself.

--
/-- Joona Palaste (pa*****@cc.helsinki.fi) ---------------------------\
| Kingpriest of "The Flying Lemon Tree" G++ FR FW+ M- #108 D+ ADA N+++|
| http://www.helsinki.fi/~palaste W++ B OP+ |
\----------------------------------------- Finland rules! ------------/
"A bicycle cannot stand up by itself because it's two-tyred."
- Sky Text
Nov 13 '05 #2
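
A minimal sketch of option c) as described in #2, assuming a text file with no embedded NUL bytes. It reads the stream in growing chunks instead of asking for the file size first, then splits the buffer at newlines by hand. The names `slurp` and `count_settings` are hypothetical helpers made up for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read everything remaining on fp into one NUL-terminated heap buffer.
   Returns NULL on allocation or read failure; the caller frees the result. */
char *slurp(FILE *fp)
{
    size_t cap = 1024, len = 0, n;
    char *buf = malloc(cap + 1);

    if (buf == NULL)
        return NULL;
    while ((n = fread(buf + len, 1, cap - len, fp)) > 0) {
        len += n;
        if (len == cap) {                   /* buffer full: double it */
            char *tmp = realloc(buf, cap * 2 + 1);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}

/* Split buf in place at each newline and count the lines that look like
   VARIABLE=VALUE, ignoring blank lines and '#' comment lines. */
size_t count_settings(char *buf)
{
    size_t count = 0;
    char *line = buf;

    while (line != NULL && *line != '\0') {
        char *next = strchr(line, '\n');
        if (next != NULL)
            *next++ = '\0';                 /* terminate this line */
        if (*line != '#' && strchr(line, '=') != NULL)
            count++;
        line = next;
    }
    return count;
}
```

Even this small version has to manage its own allocation and find the line ends itself, which is the extra work the reply warns about.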

Joona I Palaste <pa*****@cc.helsinki.fi> spoke thus:
If the config file's format is such that each VARIABLE=VALUE is on a
separate line, I definitely recommend b) fgets. Otherwise you're best
off with c) fread, but the problem is, you'll have to parse the
delimiters out yourself.


Why not

fscanf( fp, "%[^#=]=%s", &variable, &value );  /* fp: the open config file */

?

--
Christopher Benson-Manica | Jumonji giri, for honour.
ataru(at)cyberspace.org |
Nov 13 '05 #3

On Tue, 16 Sep 2003 16:50:56 UTC, ne*****@tokyo.com (Mantorok
Redgormor) wrote:
If I am parsing a config file that uses '#' for comments and the
config file itself is 1640 bytes, and the format is VARIABLE=VALUE, is
it recommended to use

a) fgetc (parse a character at a time)
Maybe a good choice, because you handle each char directly as it comes
in. No need to mess with buffer sizes for a whole line.
b) fgets (read in blocks of whatever size)
Maybe a good choice, because you can then handle the line however you
like.
Maybe a bad choice, because the buffer you give fgets may turn out to
be too small.
c) fread (get the size of the file and fread the entire thing into
memory)
Maybe a good choice when you know the whole size of the file. Still,
it costs more memory than is strictly required.

and when would it be appropriate to use either a, b, or c?


Do you like handling undersized input buffers? Then use b).
Do you have quick access to the size of the file? Then use c).
Do you dislike handling dynamic input buffers just to get a complete
line because it is longer than you thought it would be? Is your memory
limited (your program may not be the only one running on the
machine)? Or is it not easy to determine the size of the file so that
you can allocate a buffer big enough to read it in at once?
If your answer to any of these questions is yes, then a) is your
choice.

--
Tschau/Bye
Herbert

eComStation 1.1 German beta is available
Nov 13 '05 #4
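
A rough sketch of option a) as argued for in #4, assuming names of at most 32 and values of at most 128 characters. Both limits and the name `next_setting` are made up for illustration. It walks the stream with getc and a small state machine, so no line buffer is needed; characters beyond the limits are simply dropped.

```c
#include <stdio.h>

enum { NAME_MAX_LEN = 32, VALUE_MAX_LEN = 128 };    /* assumed limits */

/* Read the next VARIABLE=VALUE pair from fp one character at a time.
   Returns 1 when a pair was stored, 0 at end of file.
   Comment lines and lines without '=' are skipped; characters that do
   not fit in name or value are dropped. */
int next_setting(FILE *fp, char name[NAME_MAX_LEN + 1],
                 char value[VALUE_MAX_LEN + 1])
{
    enum { LINE_START, IN_NAME, IN_VALUE, SKIP } state = LINE_START;
    size_t n = 0, v = 0;
    int c;

    while ((c = getc(fp)) != EOF) {
        if (c == '\n') {
            if (state == IN_VALUE)
                break;                      /* a complete pair */
            state = LINE_START;             /* discard this line */
            n = 0;
            continue;
        }
        switch (state) {
        case LINE_START:
            if (c == '#') {
                state = SKIP;
                break;
            }
            state = IN_NAME;
            /* fall through */
        case IN_NAME:
            if (c == '=')
                state = IN_VALUE;
            else if (n < NAME_MAX_LEN)
                name[n++] = (char)c;
            break;
        case IN_VALUE:
            if (v < VALUE_MAX_LEN)
                value[v++] = (char)c;
            break;
        case SKIP:
            break;
        }
    }
    if (state != IN_VALUE)
        return 0;
    name[n] = '\0';
    value[v] = '\0';
    return 1;
}
```

A caller would loop with `while (next_setting(fp, name, value))`. The name and value still need storage, but a line of any length can pass through without a resize.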

The Real OS/2 Guy wrote:

On Tue, 16 Sep 2003 16:50:56 UTC, ne*****@tokyo.com (Mantorok
Redgormor) wrote:
If I am parsing a config file that uses '#' for comments and the
config file itself is 1640 bytes, and the format is VARIABLE=VALUE, is
it recommended to use

a) fgetc (parse a character at a time)


Maybe a good choice, because you handle each char directly as it comes
in. No need to mess with buffer sizes for a whole line.
b) fgets (read in blocks of whatever size)


Maybe a good choice, because you can then handle the line however you
like.
Maybe a bad choice, because the buffer you give fgets may turn out to
be too small.
c) fread (get the size of the file and fread the entire thing into
memory)


Maybe a good choice when you know the whole size of the file. Still,
it costs more memory than is strictly required.

and when would it be appropriate to use either a, b, or c?


Do you like handling undersized input buffers? Then use b).
Do you have quick access to the size of the file? Then use c).
Do you dislike handling dynamic input buffers just to get a complete
line because it is longer than you thought it would be? Is your memory
limited (your program may not be the only one running on the
machine)? Or is it not easy to determine the size of the file so that
you can allocate a buffer big enough to read it in at once?
If your answer to any of these questions is yes, then a) is your
choice.

Herbert, I disagree. Choice b) is the only choice. Choice a) is too ugly
for a mother to love. Choice c), fread() a text file and then parse it,
uses lots of memory and complicates things more than necessary.

The configuration file, as described, defines variables in 'key=value'
format, one per line, and it is fgets() that reads a file a line at a
time. It is trivial to detect comment lines beginning with '#' or ';'
or whatever and skip them.

Everyone please note that in order to read any file correctly, you must
know how it was written, i.e. its format. There are 'rules' for writing
.cfg or .ini (or other) files which you must know exactly before you can
read them successfully.
--
Joe Wright mailto:jo********@earthlink.net
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Nov 13 '05 #5
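
A minimal sketch of the fgets route recommended in #5, assuming one VARIABLE=VALUE per line and '#' only at the start of a comment line. The helper names `parse_line` and `print_settings` and the 256-byte line limit are assumptions for illustration, not part of the thread.

```c
#include <stdio.h>
#include <string.h>

enum { LINE_MAX_LEN = 256 };    /* assumed limit; longer lines arrive in pieces */

/* Split one line of the form VARIABLE=VALUE in place.
   Returns 1 and sets *name and *value on success; returns 0 for blank
   lines, '#' comment lines, and lines without '='. */
int parse_line(char *line, char **name, char **value)
{
    char *eq;

    line[strcspn(line, "\r\n")] = '\0';     /* drop the line ending */
    if (line[0] == '\0' || line[0] == '#')
        return 0;
    eq = strchr(line, '=');
    if (eq == NULL)
        return 0;
    *eq = '\0';                             /* terminate the name */
    *name = line;
    *value = eq + 1;
    return 1;
}

/* Read fp line by line, print every setting found, and return how many
   there were. */
int print_settings(FILE *fp)
{
    char line[LINE_MAX_LEN];
    char *name, *value;
    int count = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        if (parse_line(line, &name, &value)) {
            printf("%s -> %s\n", name, value);
            count++;
        }
    }
    return count;
}
```

The undersized-buffer concern from #4 still applies: a line longer than the buffer is returned by fgets in several pieces, so a stricter version would check whether each piece ends in a newline.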

On Tue, 16 Sep 2003 17:26:12 +0000 (UTC), Christopher Benson-Manica
<at***@nospam.cyberspace.org> wrote in comp.lang.c:
Joona I Palaste <pa*****@cc.helsinki.fi> spoke thus:
If the config file's format is such that each VARIABLE=VALUE is on a
separate line, I definitely recommend b) fgets. Otherwise you're best
off with c) fread, but the problem is, you'll have to parse the
delimiters out yourself.


Why not

fscanf( fp, "%[^#=]=%s", &variable, &value );  /* fp: the open config file */

?


Because any *scanf with "%s" lacking a size specifier is just another
name for gets(), a nasty buffer overrun just waiting to happen.

Thus are worms born...

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
Nov 13 '05 #6
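
To illustrate the point in #6, here is a small sketch in which every string conversion carries a width, so neither buffer can be overrun. It assumes fields of at most 20 characters and uses sscanf on a line already in memory; `parse_bounded` and `FIELD_LEN` are hypothetical names.

```c
#include <stdio.h>

/* Each buffer holds 20 characters plus the terminating NUL. The value
   must stay in step with the %20 widths in the format string below. */
enum { FIELD_LEN = 20 };

/* Parse "NAME=VALUE" from text using bounded conversions only.
   Returns 1 when both fields were converted, 0 otherwise. */
int parse_bounded(const char *text, char name[FIELD_LEN + 1],
                  char value[FIELD_LEN + 1])
{
    return sscanf(text, "%20[^#=]=%20s", name, value) == 2;
}
```

With the widths in place, an over-long value is cut off at 20 characters instead of running past the end of the array, and an over-long name makes the literal '=' fail to match, so the call reports failure.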


Joe Wright <jo********@earthlink.net> wrote in message
news:3F*********@earthlink.net...
The Real OS/2 Guy wrote:

On Tue, 16 Sep 2003 16:50:56 UTC, ne*****@tokyo.com (Mantorok
Redgormor) wrote:
If I am parsing a config file that uses '#' for comments and the
config file itself is 1640 bytes, and the format is VARIABLE=VALUE, is
it recommended to use

a) fgetc (parse a character at a time)


Maybe a good choice, because you handle each char directly as it comes
in. No need to mess with buffer sizes for a whole line.
b) fgets (read in blocks of whatever size)


Maybe a good choice, because you can then handle the line however you
like.
Maybe a bad choice, because the buffer you give fgets may turn out to
be too small.
c) fread (get the size of the file and fread the entire thing into
memory)


Maybe a good choice when you know the whole size of the file. Still,
it costs more memory than is strictly required.

and when would it be appropriate to use either a, b, or c?


Do you like handling undersized input buffers? Then use b).
Do you have quick access to the size of the file? Then use c).
Do you dislike handling dynamic input buffers just to get a complete
line because it is longer than you thought it would be? Is your memory
limited (your program may not be the only one running on the
machine)? Or is it not easy to determine the size of the file so that
you can allocate a buffer big enough to read it in at once?
If your answer to any of these questions is yes, then a) is your
choice.

Herbert, I disagree. Choice b) is the only choice. Choice a) is too ugly
for a mother to love. Choice c), fread() a text file and then parse it,
uses lots of memory and complicates things more than necessary.

The configuration file, as described, defines variables in 'key=value'
format, one per line, and it is fgets() that reads a file a line at a
time. It is trivial to detect comment lines beginning with '#' or ';'
or whatever and skip them.

Everyone please note that in order to read any file correctly, you must
know how it was written, i.e. its format. There are 'rules' for writing
.cfg or .ini (or other) files which you must know exactly before you can
read them successfully.


And robust code that reads them should be able to handle
corrupt or incorrectly formatted data (e.g. by assuming
'defaults', or giving an error message, terminating, etc.)

-Mike

Nov 13 '05 #7
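
One way to act on the advice in #7, sketched for a numeric setting: validate the text and fall back to a default with a message when it is missing or malformed. The function name `setting_as_int` and the choice to warn on stderr rather than terminate are assumptions for this example.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Convert a setting's text to an int. On NULL, empty, non-numeric, or
   out-of-range input, report the problem on stderr and return fallback. */
int setting_as_int(const char *name, const char *text, int fallback)
{
    char *end;
    long n;

    if (text == NULL || *text == '\0') {
        fprintf(stderr, "%s: missing, using default %d\n", name, fallback);
        return fallback;
    }
    errno = 0;
    n = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE
        || n < INT_MIN || n > INT_MAX) {
        fprintf(stderr, "%s: bad value \"%s\", using default %d\n",
                name, text, fallback);
        return fallback;
    }
    return (int)n;
}
```

For example, `setting_as_int("PORT", "80a0", 80)` rejects the trailing garbage and returns 80.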

Jack Klein <ja*******@spamcop.net> spoke thus:
Because any *scanf with "%s" lacking a size specifier is just another
name for gets(), a nasty buffer overrun just waiting to happen.


Well, considering the OP was just parsing a config file, the chances for an
exploit shouldn't be too high, eh? Thanks, though, for I had forgotten about
that... Although something like

fscanf( fp, "%20[^=]=%20s", &s1, &s2 );  /* fp: the open config file */

would fail if it weren't given exactly 20 characters for the first string,
right...?

--
Christopher Benson-Manica | Jumonji giri, for honour.
ataru(at)cyberspace.org |
Nov 13 '05 #8

Jack Klein <ja*******@spamcop.net> spoke thus:
Because any *scanf with "%s" lacking a size specifier is just another
name for gets(), a nasty buffer overrun just waiting to happen.


Well, considering the OP was just parsing a config file, the chances for an
exploit shouldn't be too high, eh? Thanks, though, for I had forgotten about
that... Although something like

fscanf( fp, "%20[^#=]=%20s", &s1, &s2 );  /* fp: the open config file */

would fail if it weren't given exactly 20 characters for the first string,
right...? Maybe I should just be quiet now...

--
Christopher Benson-Manica | Jumonji giri, for honour.
ataru(at)cyberspace.org |
Nov 13 '05 #9

On Wed, 17 Sep 2003 16:30:22 +0000 (UTC), Christopher Benson-Manica
<at***@nospam.cyberspace.org> wrote:
Jack Klein <ja*******@spamcop.net> spoke thus:
Because any *scanf with "%s" lacking a size specifier is just another
name for gets(), a nasty buffer overrun just waiting to happen.


Well, considering the OP was just parsing a config file, the chances for an
exploit shouldn't be too high, eh? Thanks, though, for I had forgotten about
that... Although something like

fscanf( fp, "%20[^#=]=%20s", &s1, &s2 );  /* fp: the open config file */

would fail if it weren't given exactly 20 characters for the first string,
right...? Maybe I should just be quiet now...


Not right; no comment on whether you should. A width specifier on any
*scanf conversion is an upper limit, although %Nc will always read to
the upper limit or end-of-input/error.

Also, the %20s on the right side won't allow whitespace in the value,
which I would want to allow; %20[^\n] will. Either of those will
normally leave the newline in the input stream, which is probably OK
if you handle # lines with getc or similar rather than with another
(prior?) fscanf. Either will also leave any text exceeding the limit
unread, and %20s will leave any text following whitespace; adding
%*[^\n] would reduce the number of different cases you have to handle.

- David.Thompson1 at worldnet.att.net
Nov 13 '05 #10
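
The suggestions in #10 can be combined roughly as follows. This is a sketch under the same 20-character limit used in the posts: %20[^\n] keeps whitespace in the value, and %*[^\n] followed by a getc throws away whatever is left on the line. The helper name `read_pair` is an assumption.

```c
#include <stdio.h>

/* Read one NAME=VALUE line from fp.
   Returns 1 on success, 0 if the line does not match (comment, blank,
   no '=', or empty value), and EOF at end of input.
   The rest of the line, including the newline, is always consumed. */
int read_pair(FILE *fp, char name[21], char value[21])
{
    /* The widths are upper limits: shorter fields convert normally. */
    int got = fscanf(fp, "%20[^#=\n]=%20[^\n]", name, value);

    fscanf(fp, "%*[^\n]");      /* discard anything left on the line */
    getc(fp);                   /* and the newline itself */

    if (got == EOF)
        return EOF;
    return got == 2;
}
```

Because every call ends just past a newline, comment lines, over-long values, and ordinary lines all leave the stream in the same state, which is the reduction in cases the post describes.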

This discussion thread is closed
