On 7 Oct 2007 at 22:55, Richard Heathfield wrote:
Antoninus Twink said:
>The function below is from Richard Heathfield's fgetline program.
It appears to be from my emgen utility, in fact.
>For some reason, it makes three passes through the string (a strlen(), a
strcpy(), then another pass to change dots) when two would clearly be
sufficient. This could lead to unnecessarily bad performance on very
long strings.
If it becomes a problem, I'll fix it. So far, it has not been a problem.
But what's frustrating is that it's an inefficiency that's completely
gratuitous! We all know that micro-optimization is bad, but this is a
micro-anti-optimization, which is surely worse!
The most natural way to look at this is "copy the characters from one
string to another, replacing . by _ when we see it". This has the
benefit of being a 1-pass algorithm. Instead, you split this up: "first
copy one string to another; then go back to the beginning and swap . for
_". This turns a simple single operation into two, at the same time
introducing an extra pass through the string! It's not as if there's so
much fiendish complexity here that there's any benefit in breaking it up
into two separate operations.
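
The "copy and replace as you go" idea described above can be written without any dense expressions. The following is only a sketch: the name dot_to_underscore_1pass is made up here so that it does not clash with the function under discussion, and the strlen() call needed to size the buffer still counts as one pass over the input.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical name, chosen for this sketch only. */
char *dot_to_underscore_1pass(const char *s)
{
    /* strlen() is one pass over s; the loop below is the second. */
    char *t = malloc(strlen(s) + 1);

    if (t != NULL) {
        char *u = t;

        /* Copy each character, translating '.' to '_' on the way.
           The terminating '\0' is copied too, then the loop stops. */
        do {
            *u++ = (*s == '.') ? '_' : *s;
        } while (*s++ != '\0');
    }
    return t;   /* NULL if allocation failed; caller must free() */
}
```

A caller would use it like the original: pass a string such as "a.b.c", check the result for NULL, and free() it when done.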
>
>It is also written in a hard-to-read and clunky style.
A matter of opinion. Which bit did you find hard to read?
The function is a completely trivial one, yet I can't see it all at once
in my editor without scrolling! Whitespace can help readability, but
excessive whitespace can reduce it, and at the same time give too much
weight to things that aren't important.
>
>char *dot_to_underscore(const char *s)
{
  char *t = malloc(strlen(s) + 1);
  if(t != NULL)
  {
    char *u;
    strcpy(t, s);
    u = t;
    while(*u)
    {
      if(*u == '.')
      {
        *u = '_';
      }
      ++u;
    }
  }
  return t;
}
Proposed solution:
char *dot_to_underscore(const char *s)
{
  char *t, *u;
  if(t=u=malloc(strlen(s)+1))
    while(*u++=(*s=='.' ? s++, '_' : *s++));
  return t;
}
It is not obvious to me that this code correctly replaces the code I wrote.
If you believe that it doesn't correctly replace the code you wrote, it
would be easy to demonstrate that by pointing out a specific input s for
which it gives a different result, or an error (syntax error or
undefined behavior or whatever).
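
One way to look for such an input is to run both versions side by side and compare the results. The sketch below is an illustration under assumptions: the names dot_to_underscore_orig, dot_to_underscore_proposed and same_result are invented here, and the proposed version has explicit comparisons added around its assignments purely to keep compiler warnings quiet. Agreement on a handful of inputs does not prove equivalence; it can only turn up a counterexample.

```c
#include <stdlib.h>
#include <string.h>

/* Three-pass version from the quoted post (renamed for this sketch). */
char *dot_to_underscore_orig(const char *s)
{
    char *t = malloc(strlen(s) + 1);
    if (t != NULL) {
        char *u;
        strcpy(t, s);
        u = t;
        while (*u) {
            if (*u == '.') {
                *u = '_';
            }
            ++u;
        }
    }
    return t;
}

/* Proposed version (renamed). Same expression as posted, with explicit
   != tests added so compilers do not warn about assignment used as a
   condition. */
char *dot_to_underscore_proposed(const char *s)
{
    char *t, *u;
    if ((t = u = malloc(strlen(s) + 1)) != NULL) {
        while ((*u++ = (*s == '.' ? s++, '_' : *s++)) != '\0') {
            /* empty body: the copy happens in the loop condition */
        }
    }
    return t;
}

/* Hypothetical helper: returns 1 if both versions produce the same
   string for s, 0 otherwise. An allocation failure in either version
   is reported as a mismatch. */
int same_result(const char *s)
{
    char *a = dot_to_underscore_orig(s);
    char *b = dot_to_underscore_proposed(s);
    int same = (a != NULL && b != NULL && strcmp(a, b) == 0);

    free(a);   /* free(NULL) is a no-op */
    free(b);
    return same;
}
```

Calling same_result() on edge cases such as the empty string, a lone ".", or a string with consecutive dots is a reasonable starting point.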