parsing config file

If I am parsing a config file that uses '#' for comments and the
config file itself is 1640 bytes, and the format is VARIABLE=VALUE, is
it recommended to use

a) fgetc (parse a character at a time)
b) fgets (read in blocks of whatever size)
c) fread (get the size of the file and fread the entire thing into
memory)

and when would it be appropriate to use either a, b, or c?
nethlek
Nov 13 '05 #1
Mantorok Redgormor <ne*****@tokyo.com> scribbled the following:
If I am parsing a config file that uses '#' for comments and the
config file itself is 1640 bytes, and the format is VARIABLE=VALUE, is
it recommended to use a) fgetc (parse a character at a time)
b) fgets (read in blocks of whatever size)
c) fread (get the size of the file and fread the entire thing into
memory) and when would it be appropriate to use either a, b, or c?


If the config file's format is such that each VARIABLE=VALUE is on a
separate line, I definitely recommend b) fgets. Otherwise you're best off
with c) fread, but the problem is that you'll have to parse the
delimiters out yourself.
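
For reference, option c) can be sketched roughly as follows. This is only a sketch under assumptions: slurp_stream is a made-up helper name, and it grows a buffer while reading rather than asking for the file size first, because standard C has no fully portable way to get the size of a text file. Splitting the result into lines (the "parse the delimiters out yourself" part) is left to the caller.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read everything remaining on fp into a malloc'd, NUL-terminated buffer.
   Returns NULL on allocation or read failure; the caller frees the result.
   slurp_stream is a hypothetical helper name, not a standard function. */
char *slurp_stream(FILE *fp, size_t *len_out)
{
    size_t cap = 1024, len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Always leave one byte free for the terminating NUL. */
        size_t n = fread(buf + len, 1, cap - len - 1, fp);

        len += n;
        if (n == 0)
            break;                  /* end of file or read error */
        if (len == cap - 1) {       /* buffer full: double it */
            char *tmp = realloc(buf, cap * 2);

            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    if (len_out != NULL)
        *len_out = len;
    return buf;
}
```

For a 1640-byte file the loop runs only a couple of times; the doubling matters only if the file turns out to be larger than expected.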

--
/-- Joona Palaste (pa*****@cc.helsinki.fi) ---------------------------\
| Kingpriest of "The Flying Lemon Tree" G++ FR FW+ M- #108 D+ ADA N+++|
| http://www.helsinki.fi/~palaste W++ B OP+ |
\----------------------------------------- Finland rules! ------------/
"A bicycle cannot stand up by itself because it's two-tyred."
- Sky Text
Nov 13 '05 #2
Joona I Palaste <pa*****@cc.helsinki.fi> spoke thus:
If the config file's format is such that each VARIABLE=VALUE is on a
separate line, I definitely recommend b) fgets. Otherwise you're best off
with c) fread, but the problem is that you'll have to parse the
delimiters out yourself.


Why not

fscanf( fp, "%[^#=]=%s", variable, value );  /* fp: the open config file */

?

--
Christopher Benson-Manica | Jumonji giri, for honour.
ataru(at)cyberspace.org |
Nov 13 '05 #3
On Tue, 16 Sep 2003 16:50:56 UTC, ne*****@tokyo.com (Mantorok
Redgormor) wrote:
If I am parsing a config file that uses '#' for comments and the
config file itself is 1640 bytes, and the format is VARIABLE=VALUE, is
it recommended to use

a) fgetc (parse a character at a time)
Maybe a good choice, because you handle each character directly as it
comes in. No need to mess with buffer sizes for a whole line.
b) fgets (read in blocks of whatever size)
Maybe a good choice, because you can then handle the line however you
like.
Maybe a bad choice, because the buffer you give fgets may be too
small.
c) fread (get the size of the file and fread the entire thing into
memory)
Maybe a good choice when you know the size of the whole file. In any
case it costs more memory than is strictly required.

and when would it be appropriate to use either a, b, or c?


Do you like handling undersized input buffers? Then use b).
Do you have quick access to the size of the file? Then use c).
Would you rather not manage dynamic input buffers just to get a
complete line because it is longer than you thought it would be? And is
your memory limited in size (your program may not be the only one
running on the machine)?
Or is it not so easy to determine the size of the file in a way that
lets you allocate a buffer big enough to read it in at once?
If your answer to one of the questions above is yes, then a) is your
choice.
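
A character-at-a-time reader along the lines of option a) might look like the sketch below. The function name next_pair and its interface are assumptions made up for illustration. It does not trim whitespace, and it treats everything after a '#' as a comment, which may or may not match the real file format.

```c
#include <stdio.h>

/* Read the next KEY=VALUE pair from fp, one character at a time.
   Returns 1 when a pair was stored, 0 at end of input.
   Text after '#' is skipped, lines without '=' are ignored, and
   overlong keys or values are truncated instead of overflowing.
   ksz and vsz must each be at least 1. */
int next_pair(FILE *fp, char *key, size_t ksz, char *val, size_t vsz)
{
    size_t klen = 0, vlen = 0;
    int have_eq = 0, in_comment = 0;

    for (;;) {
        int c = fgetc(fp);

        if (c == '\n' || c == EOF) {
            if (have_eq && klen > 0) {      /* a complete pair */
                key[klen] = '\0';
                val[vlen] = '\0';
                return 1;
            }
            if (c == EOF)
                return 0;
            klen = vlen = 0;                /* start a fresh line */
            have_eq = in_comment = 0;
        } else if (in_comment) {
            continue;                       /* ignore comment text */
        } else if (c == '#') {
            in_comment = 1;
        } else if (!have_eq) {
            if (c == '=')
                have_eq = 1;
            else if (klen + 1 < ksz)
                key[klen++] = (char)c;
        } else if (vlen + 1 < vsz) {
            val[vlen++] = (char)c;
        }
    }
}
```

The only buffers involved are the caller's key and value arrays, so no line buffer has to be sized in advance.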

--
Tschau/Bye
Herbert

eComStation 1.1 German beta is available
Nov 13 '05 #4
Herbert, I disagree. Choice b) is the only choice. Choice a) is too ugly
for a mother to love. Choice c), fread() a text file and then parse it,
uses lots of memory and complicates things more than necessary.

The configuration file, as described, defines variables in 'key=value'
format, a line at a time. It is fgets() that reads a file 'a line at a
time'. It is trivial to detect comment lines beginning with '#' or ';'
or whatever and skip them.

Everyone please note that in order to read any file correctly, you must
know how it was written, i.e. its format. There are 'rules' for writing
.cfg or .ini (or other) files which you must know exactly before you can
read them successfully.
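
A minimal sketch of the fgets() approach, assuming one KEY=VALUE per line and comments that start with '#' in the first column. The names parse_line and dump_config and the 256-byte line buffer are illustrative choices, not anything from the original question. A line longer than the buffer arrives in pieces, and this sketch does not rejoin them.

```c
#include <stdio.h>
#include <string.h>

/* Split one line of the form KEY=VALUE in place.
   Returns 1 and sets *key and *value on success; returns 0 for blank
   lines, comment lines starting with '#', and lines without a key
   followed by '='. */
int parse_line(char *line, char **key, char **value)
{
    char *eq;

    line[strcspn(line, "\r\n")] = '\0';     /* strip the line terminator */
    if (line[0] == '\0' || line[0] == '#')
        return 0;

    eq = strchr(line, '=');
    if (eq == NULL || eq == line)
        return 0;

    *eq = '\0';                             /* cut the line at '=' */
    *key = line;
    *value = eq + 1;
    return 1;
}

/* Print every pair found on fp, reading a line at a time with fgets. */
void dump_config(FILE *fp)
{
    char line[256];
    char *key, *value;

    while (fgets(line, sizeof line, fp) != NULL)
        if (parse_line(line, &key, &value))
            printf("%s -> %s\n", key, value);
}
```

Keeping the splitting in its own function means it can be exercised on plain strings without touching a file.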
--
Joe Wright mailto:jo********@earthlink.net
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Nov 13 '05 #5
On Tue, 16 Sep 2003 17:26:12 +0000 (UTC), Christopher Benson-Manica
<at***@nospam.cyberspace.org> wrote in comp.lang.c:
Joona I Palaste <pa*****@cc.helsinki.fi> spoke thus:
If the config file's format is such that each VARIABLE=VALUE is on a
separate line, I definitely recommend b) fgets. Otherwise you're best off
with c) fread, but the problem is that you'll have to parse the
delimiters out yourself.


Why not

fscanf( fp, "%[^#=]=%s", variable, value );  /* fp: the open config file */

?


Because any *scanf with "%s" lacking a size specifier is just another
name for gets(), a nasty buffer overrun just waiting to happen.

Thus are worms born...
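
To illustrate the point about size specifiers, here is a sketch of a bounded version using sscanf on a line that has already been read. The name scan_pair and the buffer sizes 32 and 64 are assumptions chosen for the example. The field widths are one less than the buffer sizes because the scanf family stores a terminating NUL after the converted text.

```c
#include <stdio.h>

/* Parse "KEY=VALUE" from line into fixed buffers using field widths.
   Widths 31 and 63 leave room for the terminating NUL in buffers of
   32 and 64 bytes. Returns 1 when both fields were converted,
   0 otherwise (comment line, missing '=', or key longer than 31). */
int scan_pair(const char *line, char key[32], char value[64])
{
    return sscanf(line, "%31[^#=]=%63s", key, value) == 2;
}
```

A key longer than 31 characters makes the literal '=' fail to match, so the call reports failure instead of writing past the end of the buffer.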

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
Nov 13 '05 #6

Joe Wright <jo********@earthlink.net> wrote in message
news:3F*********@earthlink.net...
Everyone please note that in order to read any file correctly, you must
know how it was written, i.e. its format. There are 'rules' for writing
.cfg or .ini (or other) files which you must know exactly before you can
read them successfully.


And robust code that reads them should be able to handle
corrupt or incorrectly formatted data (e.g. by assuming
'defaults', or giving an error message, terminating, etc.)
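
One way to apply the "assume defaults" idea to a numeric setting is sketched below. The helper name int_or_default is hypothetical. It falls back whenever the text is missing, not entirely numeric, or out of range for int; whether to fall back silently or report an error is a policy decision this sketch does not make.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Convert text to an int, returning fallback when text is NULL, empty,
   not entirely numeric, or out of range for int.
   int_or_default is a hypothetical helper name. */
int int_or_default(const char *text, int fallback)
{
    char *end;
    long n;

    if (text == NULL || *text == '\0')
        return fallback;

    errno = 0;
    n = strtol(text, &end, 10);
    if (errno == ERANGE || end == text || *end != '\0'
        || n < INT_MIN || n > INT_MAX)
        return fallback;

    return (int)n;
}
```

strtol is used instead of atoi because atoi gives no way to tell "0" apart from garbage.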

-Mike

Nov 13 '05 #7
Jack Klein <ja*******@spamcop.net> spoke thus:
Because any *scanf with "%s" lacking a size specifier is just another
name for gets(), a nasty buffer overrun just waiting to happen.


Well, considering the OP was just parsing a config file, the chances for an
exploit shouldn't be too high, eh? Thanks, though, for I had forgotten about
that... Although something like

fscanf( fp, "%20[^=]=%20s", s1, s2 );  /* fp: the open config file */

would fail if it weren't given exactly 20 characters for the first string,
right...?

--
Christopher Benson-Manica | Jumonji giri, for honour.
ataru(at)cyberspace.org |
Nov 13 '05 #8
Jack Klein <ja*******@spamcop.net> spoke thus:
Because any *scanf with "%s" lacking a size specifier is just another
name for gets(), a nasty buffer overrun just waiting to happen.


Well, considering the OP was just parsing a config file, the chances for an
exploit shouldn't be too high, eh? Thanks, though, for I had forgotten about
that... Although something like

fscanf( fp, "%20[^#=]=%20s", s1, s2 );  /* fp: the open config file */

would fail if it weren't given exactly 20 characters for the first string,
right...? Maybe I should just be quiet now...

--
Christopher Benson-Manica | Jumonji giri, for honour.
ataru(at)cyberspace.org |
Nov 13 '05 #9
On Wed, 17 Sep 2003 16:30:22 +0000 (UTC), Christopher Benson-Manica
<at***@nospam.cyberspace.org> wrote:
Jack Klein <ja*******@spamcop.net> spoke thus:
Because any *scanf with "%s" lacking a size specifier is just another
name for gets(), a nasty buffer overrun just waiting to happen.


Well, considering the OP was just parsing a config file, the chances for an
exploit shouldn't be too high, eh? Thanks, though, for I had forgotten about
that... Although something like

fscanf( fp, "%20[^#=]=%20s", s1, s2 );  /* fp: the open config file */

would fail if it weren't given exactly 20 characters for the first string,
right...? Maybe I should just be quiet now...


Not right; no comment on whether you should. A width specifier on any
*scanf conversion is an upper limit, although %Nc will always read up to
that limit unless it hits end-of-input or an error first.

Also, the %20s on the right side won't allow whitespace in the value,
which I would want to allow; %20[^\n] will. Either of those will
normally leave the newline in the input stream, which is probably OK
if you want to handle # lines with a getc or similar rather than
another (prior?) fscanf. Either will also leave any text exceeding the
limit, and %20s will leave any text following whitespace; adding
%*[^\n] would reduce the number of different cases you have to handle.
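
Putting those suggestions together might look something like the sketch below. The function name read_pair, the 21-byte buffers, and the leading space in the format (used to skip the newline left over from the previous line) are assumptions for illustration. A line with an empty value, such as "KEY=", is reported as skipped.

```c
#include <stdio.h>

/* Read one KEY=VALUE line from fp using width-limited conversions.
   Returns 1 when a pair was stored, 0 when the line was skipped
   (comment, missing '=', or empty value), and EOF at end of input.
   Values may contain spaces; anything beyond 20 characters is dropped. */
int read_pair(FILE *fp, char key[21], char value[21])
{
    /* The leading space skips whitespace, including a leftover newline. */
    int n = fscanf(fp, " %20[^#=\n]=%20[^\n]", key, value);

    /* Discard whatever is left on the line: comment text or excess value.
       The newline itself stays in the stream for the next call to skip. */
    fscanf(fp, "%*[^\n]");

    if (n == EOF)
        return EOF;
    return n == 2;
}
```

Called in a loop until it returns EOF, this accepts keys shorter than 20 characters without complaint, since the width is only an upper limit.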

- David.Thompson1 at worldnet.att.net
Nov 13 '05 #10
