On Sep 4, 5:41 pm, Gennaro Prota <gennaro/pr...@yahoo.com> wrote:
> James Kanze wrote:
> [...]
> > More generally, however, I tend to use regular expressions in
> > such cases. If the line matches "^[[:space:]]*$", ignore it.
> > With a good implementation of regular expressions (which uses a
> > DFA if the expression contains no extensions), this can be just
> > as fast as the above, if not faster.

> I see that you mention execution speed here and in other posts
> of this thread. Since you aren't in the Premature-Optimization
> "school of thought", I re-read the original post, and it says
> "quickest way". I think that wasn't meant as "the way which
> executes fastest", though; I get it as: "how do I avoid
> spending time implementing this?".
I suspect that that's wishful thinking on your part. That's
what it should mean, but most of the time, most programmers do
still use "quickest" to refer to execution time. Since the
issue of execution time was raised, I felt it necessary to
address it. The regular expression solution is by far the
simplest, and its execution time is NOT necessarily too bad.
Of course, the regular expression class I use here is my own,
not that of Boost. The two are significantly different, being
designed from the start with different goals in mind. For most
general use, Boost's regular expression is better than mine, but
mine has an advantage in this particular case: it supports
the or'ing of multiple regular expressions, each with a different
return value. So you can write something like:
    enum { emptyLine, sectionHeader, attrValuePair } ;

    static RegularExpression const re =
          RegularExpression( "[[:space:]]*$", emptyLine )
        | RegularExpression( "\\[.*\\][[:space:]]*$", sectionHeader )
        | RegularExpression( ".*=.*", attrValuePair ) ;

    std::string line ;
    while ( std::getline( source, line ) ) {
        switch ( re.match( line.begin(), line.end() ).acceptCode ) {
        case emptyLine :
            break ;

        case sectionHeader :
            //  ...
            break ;

        case attrValuePair :
            //  ...
            break ;

        default :
            //  process syntax error...
            break ;
        }
    }
Of course, for the empty line, I'd probably use:
"[[:space:]]*(#.*)?$", to allow comments.
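For readers without such a class to hand, roughly the same classification can be sketched with std::regex (assuming a C++11 compiler). This is only an approximation of the idea above: it tries three separate expressions in turn rather than one merged automaton, so it says nothing about the speed of the or'ed version. The names LineKind, syntaxError and classifyLine are made up for the illustration, and giving the earlier pattern priority when several match is an assumption.

```cpp
#include <regex>
#include <string>

// Hypothetical result codes; syntaxError stands in for the default case.
enum LineKind { emptyLine, sectionHeader, attrValuePair, syntaxError };

// Classify one line (as delivered by std::getline, without the '\n').
// std::regex_match succeeds only if the pattern accounts for the whole
// line, so the trailing $ of the original patterns is not needed.
inline LineKind classifyLine(std::string const& line)
{
    // Built once, on first call; default (ECMAScript) syntax.
    static std::regex const empty("[[:space:]]*(#.*)?");
    static std::regex const header("\\[.*\\][[:space:]]*");
    static std::regex const attr(".*=.*");

    // Assumption: the first pattern that matches wins.
    if (std::regex_match(line, empty))  return emptyLine;
    if (std::regex_match(line, header)) return sectionHeader;
    if (std::regex_match(line, attr))   return attrValuePair;
    return syntaxError;
}
```

In the loop above, the switch would then be on classifyLine( line ) instead of on re.match( ... ).acceptCode.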
And a small warning: the version of RegularExpression doesn't
support the $ at the end to require a complete match, so you'd
have to add special code to handle this. I've recently reworked
the class considerably, however, for various reasons, and my
current version does have an option to require matching the
complete string, instead of just the start. It also supports
dumping the regular expression as a StaticRegularExpression, a
POD with static initialization that you then compile and link
into your program. (Not that the time to initialize the regular
expression would be an issue here, but I have some that are
complicated enough that parsing and initializing the expression
takes several minutes.)
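The difference between matching the complete string and matching just the start can be illustrated with std::regex, which offers both. This is a sketch of the distinction only, not of how RegularExpression implements its option; matchesWhole and matchesPrefix are hypothetical helper names.

```cpp
#include <regex>
#include <string>

// True only if the pattern accounts for the entire string.
inline bool matchesWhole(std::string const& s, std::regex const& re)
{
    return std::regex_match(s, re);
}

// True if the pattern matches some prefix of the string: the
// match_continuous flag only allows a match that begins at the
// first character, and anything after the match is ignored.
inline bool matchesPrefix(std::string const& s, std::regex const& re)
{
    return std::regex_search(s, re,
                             std::regex_constants::match_continuous);
}
```

With the section header pattern "\\[.*\\]", a line such as "[a] junk" is accepted by the prefix match but rejected by the whole-string match, which is why a class that only matches the start needs extra code to check for trailing garbage.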
--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34