Bug/Gross InEfficiency in HeathField's fgetline program - Page 2

Antoninus Twink

The function below is from Richard HeathField's fgetline program. For
some reason, it makes three passes through the string (a strlen(), a
strcpy() then another pass to change dots) when two would clearly be
sufficient. This could lead to unnecessarily bad performance on very
long strings. It is also written in a hard-to-read and clunky style.

char *dot_to_underscore(const char *s)
{
  char *t = malloc(strlen(s) + 1);
  if(t != NULL)
  {
    char *u;
    strcpy(t, s);
    u = t;
    while(*u)
    {
      if(*u == '.')
      {
        *u = '_';
      }
      ++u;
    }
  }
  return t;
}

Proposed solution:

char *dot_to_underscore(const char *s)
{
  char *t, *u;
  if(t=u=malloc(strlen(s)+1))
    while(*u++=(*s=='.' ? s++, '_' : *s++));
  return t;
}
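
A minimal sketch for checking whether the terse version is a drop-in replacement. Both functions are copied from the listings above under the assumed names `dot_to_underscore_rh` and `dot_to_underscore_at` (with an added `assert` on the argument), and the helper `same_result` is hypothetical. Agreement on a handful of inputs is evidence, not proof.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Original version: strlen(), strcpy(), then a pass to change dots. */
char *dot_to_underscore_rh(const char *s)
{
    char *t;
    assert(s != NULL);               /* added precondition check */
    t = malloc(strlen(s) + 1);
    if (t != NULL) {
        char *u;
        strcpy(t, s);
        for (u = t; *u != '\0'; ++u) {
            if (*u == '.') {
                *u = '_';
            }
        }
    }
    return t;
}

/* Proposed version: strlen(), then one combined copy-and-replace pass. */
char *dot_to_underscore_at(const char *s)
{
    char *t, *u;
    assert(s != NULL);               /* added precondition check */
    if ((t = u = malloc(strlen(s) + 1)) != NULL)
        while ((*u++ = (*s == '.' ? s++, '_' : *s++)) != '\0')
            ;
    return t;
}

/* Hypothetical helper: 1 if both versions agree on s, 0 if they differ,
   -1 if either allocation failed. */
int same_result(const char *s)
{
    char *a = dot_to_underscore_rh(s);
    char *b = dot_to_underscore_at(s);
    int r = (a == NULL || b == NULL) ? -1 : (strcmp(a, b) == 0);
    free(a);
    free(b);
    return r;
}
```

Calling `same_result` on the empty string, on a string of only dots, and on a string with no dots covers the obvious edge cases.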

Oct 7 '07

Richard

Keith Thompson <ks***@mib.org> writes:

Richard <rg****@gmail.com> writes:
>Richard Heathfield <rj*@see.sig.invalid> writes:
>>Philip Potter said:
Antoninus Twink wrote:
On 7 Oct 2007 at 22:55, Richard Heathfield wrote:
>It is not obvious to me that this code correctly replaces the code I
>wrote.
>
If you believe that it doesn't correctly replace the code you wrote

He hasn't said that this is what he believes.

I always knew you could read, Philip. :-) Some other people, I'm not so
sure about.

It was perfectly obvious what you meant. Word games don't even begin to
cover it.

Yes, it's perfectly obvious what he meant. It's also perfectly
obvious that if he had meant that he believes that it doesn't
correctly replace the code he wrote, he could and would have said so.
He didn't.

You don't have to defend Heathfield. His fairly unique style makes it
very clear, as did his followup, exactly what he meant.

He didn't even bother to read the code as he made clear in a later post.

Oct 8 '07 #51

Richard

"Joachim Schmitz" <no*********@schmitz-digital.de> writes:

>>
So, which, as a C programmer, do you prefer?
I liked your version for its elegance and Richard's for its
simplicity.

Richard's was more complex. It used the same reference/dereference with
additional library calls.

Antonius' version is too terse, good for teaching C maybe, but not good for
production code.

Bye, Jojo

Oct 8 '07 #52

Antoninus Twink

On 8 Oct 2007 at 18:24, santosh wrote:

>And who on earth has said anything about breaking your nose? It sounds
to me like you're suffering from some form of paranoia.

The email address of the poster who wrote this threatening message:
<http://groups.google.com/group/alt.comp.lang.learn.c-c++/msg/82c5c4b2e59984e0?dmode=source>

And this message:
<http://groups.google.com/group/comp....8?dmode=source
&utoken=0qjyGSsAAAAoZoYpti9uhwqsYFEYUqETYYZ-k6tkIFJmSBA4M7hURISd8ZKZA3rNsODuE9WiCO4>

begins identically.

Your messages in this thread share with the second message linked above,
identical "User-Agent", "NNTP-Posting-Host" headers as well as the
timezone.

A is related to B.
B is related to C.

So

A is related to C.

Except that both connections are, to say the least, tenuous. I mean,
people with slightly similar email addresses *must* be the same! And
after all, there's a *unique* slrn user who posts through aioe!

Put the two together and you get something completely far-fetched. Do
you honestly believe that I, Paulcr, or anyone else would disappear for
5 years then suddenly reappear, still nursing a grudge against Mr
Heathfield?

On the subject of names, I've been criticized for writing HeathField as
two words (hardly unreasonable, given that Heath and Field are... well,
both words), but I notice that Joachim and John Bode have both mistaken
my pseudonym for Antonius in this thread. What's sauce for the goose...

But really, it's just amazing how ready people are to shy away from
actual technical discussion and resort to mud-slinging instead.

Oct 8 '07 #53

Richard

Antoninus Twink <no****@nospam.com> writes:

On 8 Oct 2007 at 18:24, santosh wrote:

>>And who on earth has said anything about breaking your nose? It sounds
to me like you're suffering from some form of paranoia.

The email address of the poster who wrote this threatening message:
<http://groups.google.com/group/alt.comp.lang.learn.c-c++/msg/82c5c4b2e59984e0?dmode=source>

And this message:
<http://groups.google.com/group/comp....8?dmode=source
&utoken=0qjyGSsAAAAoZoYpti9uhwqsYFEYUqETYYZ-k6tkIFJmSBA4M7hURISd8ZKZA3rNsODuE9WiCO4>

begins identically.

Your messages in this thread share with the second message linked above,
identical "User-Agent", "NNTP-Posting-Host" headers as well as the
timezone.

A is related to B.
B is related to C.

So

A is related to C.

Except that both connections are, to say the least, tenuous. I mean,
people with slightly similar email addresses *must* be the same! And
after all, there's a *unique* slrn user who posts through aioe!

Put the two together and you get something completely far-fetched. Do
you honestly believe that I, Paulcr, or anyone else would disappear for
5 years then suddenly reappear, still nursing a grudge against Mr
Heathfield?

On the subject of names, I've been criticized for writing HeathField as
two words (hardly unreasonable, given that Heath and Field are... well,
both words), but I notice that Joachim and John Bode have both mistaken
my pseudonym for Antonius in this thread. What's sauce for the goose...

But really, it's just amazing how ready people are to shy away from
actual technical discussion and resort to mud-slinging instead.

I am astonished that people claiming to be professional programmers
could be in any way "confused" or "unsure" about your concise
replacement for Heathfield's version.

,----
| while(*u++=(*s=='.' ? '_' : *s))
| s++;
`----

I simply cannot see the complication or the need to "analyse" this. Yes,
if we were training in lesson 2 we might expand it out a little, but
only to the extent of replacing the ?: usage with an if-then-else.
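
A sketch of that "lesson 2" expansion, with the ?: written out as if/else. The function name `dot_to_underscore_expanded` and the added `assert` are assumptions for illustration; the loop is meant to behave like the two-line version quoted above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

char *dot_to_underscore_expanded(const char *s)
{
    char *t, *u;

    assert(s != NULL);            /* added precondition check */
    t = u = malloc(strlen(s) + 1);
    if (t == NULL) {
        return NULL;
    }
    for (;;) {
        if (*s == '.') {
            *u = '_';
        } else {
            *u = *s;              /* also copies the terminating '\0' */
        }
        if (*u == '\0') {
            break;                /* terminator written, copy complete */
        }
        ++u;
        ++s;
    }
    return t;
}
```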

Oct 8 '07 #54

John Bode

On Oct 8, 2:54 pm, Richard <rgr...@gmail.com> wrote:

John Bode <john_b...@my-deja.com> writes:
On Oct 8, 1:49 pm, Richard <rgr...@gmail.com> wrote:
"Joachim Schmitz" <nospam.j...@schmitz-digital.de> writes:
"Richard" <rgr...@gmail.com> wrote in message
news:dj************@news.individual.net...

[snip]

and inefficient.

Facts not in evidence. As others have pointed out, strcpy() may be
implemented in such a way that is faster than copying individual
characters in a loop. And elsethread RH has pointed out that this
code gets called *once*, that he has benchmarked it for obnoxiously
long inputs (tens of megabytes), and that the performance is perfectly
acceptable (< 1 us according to his profiler).

Regardless, unnecessary call. Sorry.

FWIW, after four runs on a 200 MB input string, RH's code is on
average 4% faster than AT's code.

Oct 8 '07 #55

Keith Thompson

Richard <rg****@gmail.com> writes:

Keith Thompson <ks***@mib.org> writes:
>Richard <rg****@gmail.com> writes:
>>Richard Heathfield <rj*@see.sig.invalid> writes:
Philip Potter said:
Antoninus Twink wrote:
>On 7 Oct 2007 at 22:55, Richard Heathfield wrote:
>>It is not obvious to me that this code correctly replaces the code I
>>wrote.
>>
>If you believe that it doesn't correctly replace the code you wrote
>
He hasn't said that this is what he believes.

I always knew you could read, Philip. :-) Some other people, I'm not so
sure about.

It was perfectly obvious what you meant. Word games don't even begin to
cover it.

Yes, it's perfectly obvious what he meant. It's also perfectly
obvious that if he had meant that he believes that it doesn't
correctly replace the code he wrote, he could and would have said so.
He didn't.

You don't have to defend Heathfield. His fairly unique style makes it
very clear, as did his followup, exactly what he meant.

I think I've misunderstood what *you* meant. I've been assuming that
you thought RH was claiming that the code doesn't work, and that your
"Word games don't even begin to cover it" remark was aimed at RH.
Looking back at this subthread, I see that this may have been a
misjudgement on my part.

In my defense, your defending Richard Heathfield is so out of
character for you that it didn't occur to me that you were doing so.
And I still don't know what "word games" is supposed to refer to.

He didn't even bother to read the code as he made clear in a later post.

Well, he didn't read it in enough depth to determine whether it works
(I don't recall his exact words). And that's why it's not obvious to
him that ... well, see above.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Oct 8 '07 #56

Mark McIntyre

On Sun, 07 Oct 2007 23:15:21 -0400, in comp.lang.c , CBFalconer
<cb********@yahoo.com> wrote:

>Mark McIntyre wrote:
>Antoninus Twink <no****@nospam.com> wrote:

... snip ...
>>
>> char *t, *u;
if(t=u=malloc(strlen(s)+1))

hilarious !

What is hilarious? It should detect the failure of malloc quite
reliably. Of course the lack of blanks is rather foul.

in a post complaining about lack of clarity...

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Oct 8 '07 #57

santosh

John Bode wrote:

On Oct 8, 2:54 pm, Richard <rgr...@gmail.com> wrote:
>John Bode <john_b...@my-deja.com> writes:
On Oct 8, 1:49 pm, Richard <rgr...@gmail.com> wrote:
"Joachim Schmitz" <nospam.j...@schmitz-digital.de> writes:
"Richard" <rgr...@gmail.com> wrote in message
news:dj************@news.individual.net...

[snip]

>and inefficient.

Facts not in evidence. As others have pointed out, strcpy() may be
implemented in such a way that is faster than copying individual
characters in a loop. And elsethread RH has pointed out that this
code gets called *once*, that he has benchmarked it for obnoxiously
long inputs (tens of megabytes), and that the performance is
perfectly acceptable (< 1 us according to his profiler).

Regardless, unnecessary call. Sorry.

FWIW, after four runs on a 200 MB input string, RH's code is on
average 4% faster than AT's code.

Interesting! This could be a case of simpler constructions being more
easily understood by the optimiser.

Oct 8 '07 #58

Mark McIntyre

On Mon, 08 Oct 2007 17:27:15 +0000, in comp.lang.c , Richard
Heathfield <rj*@see.sig.invalid> wrote:

>Well, I agree, but you appear to have misunderstood it nonetheless. I said
*precisely* what I meant to say

You are the Cheshire Cat and I claim my five pounds.
gd&r
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Oct 8 '07 #59

Mark McIntyre

On Mon, 8 Oct 2007 20:12:27 +0200 (CEST), in comp.lang.c , Antoninus
Twink <no****@nospam.com> wrote:

>On 8 Oct 2007 at 17:38, Richard Heathfield wrote:
>Not least the need to compress functionality into the fewest possible
source characters at the expense of readability and maintainability. I
recognise that a terse style is considered "cool" by some, but in my
experience it is merely expensive.

Wading through 20 lines of code

"Wading"?

>that do a 4-line job in a roundabout way can also be expensive.

This is why we invented computers - to do tricky things like reading
20 lines of code....
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Oct 8 '07 #60

Richard Heathfield

John Bode said:

<snip>

FWIW, after four runs on a 200 MB input string, RH's code is on
average 4% faster than AT's code.

I couldn't possibly comment. :-)

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Oct 8 '07 #61

Richard Heathfield

Keith Thompson said:

Richard <rg****@gmail.com> writes:

<snip>

>He didn't even bother to read the code as he made clear in a later post.

Well, he didn't read it in enough depth to determine whether it works

Quite so. I'm not sure why Richard Riley finds this hard to understand.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Oct 8 '07 #62

Old Wolf

On Oct 9, 9:37 am, Richard <rgr...@gmail.com> wrote:

I am astonished that people claiming to be professional programmers
could be in any way "confused" or "unsure" about your concise
replacement for Heathfield's version.

,----
| while(*u++=(*s=='.' ? '_' : *s))
| s++;
`----

That is NOT the replacement code suggested by
"Antoninus Twink". Perhaps you should get one of
those threaded newsreaders you keep whingeing
that everybody else should have, so that you can
check your facts before posting.

For the actual code, see:
http://groups.google.co.nz/group/com...e171fc9691cd7d

Oct 8 '07 #63

Antoninus Twink

On 8 Oct 2007 at 22:14, Old Wolf wrote:

On Oct 9, 9:37 am, Richard <rgr...@gmail.com> wrote:
>I am astonished that people claiming to be professional programmers
could be in any way "confused" or "unsure" about your concise
replacement for Heathfield's version.

,----
| while(*u++=(*s=='.' ? '_' : *s))
| s++;
`----

That is NOT the replacement code suggested by
"Antoninus Twink". Perhaps you should get one of
those threaded newsreaders you keep whingeing
that everybody else should have, so that you can
check your facts before posting.

You are splitting hares. It is functionally equivalent - its advantage
is a slightly shorter total length, but in exchange the loop doesn't
have an empty body.

And I find it amusing that you object to me posting under a Usenet
handle, when your own posts are from "Old Wolf".
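
A sketch for testing the "functionally equivalent" claim directly. The two loops are lifted into functions that write into a caller-supplied buffer of at least strlen(s) + 1 bytes; the names `replace_dots_original` and `replace_dots_variant`, and the added `assert` calls, are assumptions for illustration.

```c
#include <assert.h>
#include <string.h>

/* Loop as originally proposed: empty body, s++ in both arms of ?:. */
void replace_dots_original(char *u, const char *s)
{
    assert(u != NULL && s != NULL);
    while ((*u++ = (*s == '.' ? s++, '_' : *s++)) != '\0')
        ;
}

/* Loop as posted in #54: a single s++ in the loop body. */
void replace_dots_variant(char *u, const char *s)
{
    assert(u != NULL && s != NULL);
    while ((*u++ = (*s == '.' ? '_' : *s)) != '\0')
        s++;
}
```

Running both on the same input and comparing the buffers with strcmp() shows whether they agree for that input.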

Oct 8 '07 #64

Tor Rustad

John Bode wrote:

[...]

FWIW, after four runs on a 200 MB input string, RH's code is on
average 4% faster than AT's code.

ROFL!

However, the optimizer can be playing tricks with you here. Anyway, RH
code was 96% more readable and maintainable, so OP was Trolling.

--
Tor <torust [at] online [dot] no>

Oct 8 '07 #65

CBFalconer

Thad Smith wrote:

Antoninus Twink wrote:
>Richard Heathfield wrote:
>>Antoninus Twink said:

>The function is a completely trivial one, yet I can't see it all
at once in my editor without scrolling! Whitespace can help
readability, but excessive whitespace can reduce it, and at the
same time give too much weight to things that aren't important.

>>>char *dot_to_underscore(const char *s)
{
char *t = malloc(strlen(s) + 1);
if(t != NULL)
{
char *u;
strcpy(t, s);
u = t;
while(*u)
{
if(*u == '.')
{
*u = '_';
}
++u;
}
}
return
t;
}

Proposed solution:

char *dot_to_underscore(const char *s)
{
char *t, *u;
if(t=u=malloc(strlen(s)+1))
while(*u++=(*s=='.' ? s++, '_' : *s++));
return t;
}
It is not obvious to me that this code correctly replaces the
code I wrote.

If you believe that it doesn't correctly replace the code you
wrote, it would be easy to demonstrate that by pointing out a
specific input s for which it gives a different result, or an
error (syntax error or undefined behavior or whatever).

What happens when malloc returns a null pointer?

While I am basically in favor of the 'tauter code' group, my
rewrite would have been different. For one thing, I don't like the
? coding. My solution:

char *dot_to_underscore(const char *s) {
   char *t, *u;

   if (!(t = u = malloc(strlen(s) + 1))) {
      while (*u = *s++) {
         if (*u == '.') *u = '_';
         ++u;
      }
   }
   return t;
}

However, I dislike taking multiple scans of the same strings, so I
would probably have arranged for the routine to return strlen, and
a negative length if malloc fails. This decision depends heavily
on the use to which the routine is put.
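
One way to read that suggestion, as a sketch only: the routine hands the copy back through an out-parameter and returns the length, or a negative value if malloc fails. The name `dot_to_underscore_len`, the `long` return type and the out-parameter are all assumptions; the post does not specify an interface, and lengths above LONG_MAX are not handled here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

long dot_to_underscore_len(const char *s, char **out)
{
    size_t n, i;
    char *t;

    assert(s != NULL && out != NULL);
    n = strlen(s);
    t = malloc(n + 1);
    *out = t;                      /* NULL on failure */
    if (t == NULL) {
        return -1;                 /* negative length signals failure */
    }
    for (i = 0; i <= n; ++i) {     /* i == n copies the '\0' */
        t[i] = (s[i] == '.') ? '_' : s[i];
    }
    return (long)n;
}
```

The caller then has the length without another strlen() over the result.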

All this is a non-factor, and basically tests individual styles and
preferences. While worthy of a discussion, it is not worth an
argument.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Oct 8 '07 #66

santosh

CBFalconer wrote:

Thad Smith wrote:
>Antoninus Twink wrote:
>>Richard Heathfield wrote:
Antoninus Twink said:

>>The function is a completely trivial one, yet I can't see it all
at once in my editor without scrolling! Whitespace can help
readability, but excessive whitespace can reduce it, and at the
same time give too much weight to things that aren't important.

char *dot_to_underscore(const char *s)
{
char *t = malloc(strlen(s) + 1);
if(t != NULL)
{
char *u;
strcpy(t, s);
u = t;
while(*u)
{
if(*u == '.')
{
*u = '_';
}
++u;
}
}
return
t;
}
>
Proposed solution:
>
char *dot_to_underscore(const char *s)
{
char *t, *u;
if(t=u=malloc(strlen(s)+1))
while(*u++=(*s=='.' ? s++, '_' : *s++));
return t;
}
It is not obvious to me that this code correctly replaces the
code I wrote.

If you believe that it doesn't correctly replace the code you
wrote, it would be easy to demonstrate that by pointing out a
specific input s for which it gives a different result, or an
error (syntax error or undefined behavior or whatever).

What happens when malloc returns a null pointer?

While I am basically in favor of the 'tauter code' group, my
rewrite would have been different. For one thing, I don't like the
? coding. My solution:

char *dot_to_underscore(const char *s) {
char *t, *u;

if (!(t = u = malloc(strlen(s) + 1))) {

You mean you commence copy when malloc fails?

while (*u = *s++) {
if (*u == '.') *u = '_';
++u;
}
}
return t;
}

However, I dislike taking multiple scans of the same strings, so I
would probably have arranged for the routine to return strlen, and
a negative length if malloc fails. This decision depends heavily
on the use to which the routine is put.

All this is a non-factor, and basically tests individual styles and
preferences. While worthy of a discussion, it is not worth an
argument.

Oct 8 '07 #67

Tor Rustad

santosh wrote:

John Bode wrote:

[...]

>FWIW, after four runs on a 200 MB input string, RH's code is on
average 4% faster than AT's code.

Interesting! This could be a case of simpler constructions being more
easily understood by the optimiser.

I don't think so. My guess would be, either the measurement is
misleading, or strcpy() helps alignment and executes some code in parallel.

--
Tor <torust [at] online [dot] no>

C-FAQ: http://c-faq.com/

Oct 8 '07 #68

Tor Rustad

CBFalconer wrote:

Thad Smith wrote:

[...]

>What happens when malloc returns a null pointer?

While I am basically in favor of the 'tauter code' group, my
rewrite would have been different. For one thing, I don't like the
? coding. My solution:

char *dot_to_underscore(const char *s) {
char *t, *u;

if (!(t = u = malloc(strlen(s) + 1))) {

That should run very fast! :-)

--
Tor <torust [at] online [dot] no>

C-FAQ: http://c-faq.com/

Oct 8 '07 #69

santosh

Tor Rustad wrote:

santosh wrote:
>John Bode wrote:

[...]

>>FWIW, after four runs on a 200 MB input string, RH's code is on
average 4% faster than AT's code.

Interesting! This could be a case of simpler constructions being more
easily understood by the optimiser.

I don't think so. My guess would be, either the measurement is
misleading, or strcpy() help alignment and execute some code in
parallel.

Well I did a small comparison test between the two versions on a string
of length 209,715,200 bytes, with the '.' character just before the
terminating null. I used clock to time the functions. For four runs for
each version here are the averages:

RH's version = 1.480000s
AT's version = 1.212500s

[system is Pentium Dual Core 1.6 GHz with 1 Gb RAM]
So for this system at least, AT's version is significantly faster.

If anyone wants, I'll post the driver code.
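
The driver does not appear in this part of the thread. A minimal sketch of what such a timing harness could look like, assuming the setup described above: a long string with a single '.' just before the terminating null, timed with clock(). The names `make_input` and `time_one_call` are hypothetical, the function under test is RH's version as posted at the top of the thread, and none of this is santosh's actual code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Function under test: RH's version as posted. */
char *dot_to_underscore(const char *s)
{
    char *t = malloc(strlen(s) + 1);
    if (t != NULL) {
        char *u;
        strcpy(t, s);
        for (u = t; *u != '\0'; ++u) {
            if (*u == '.') {
                *u = '_';
            }
        }
    }
    return t;
}

/* Builds a string of len characters: 'a' repeated, with '.' last. */
char *make_input(size_t len)
{
    char *s;
    assert(len > 0);
    s = malloc(len + 1);
    if (s != NULL) {
        memset(s, 'a', len);
        s[len - 1] = '.';
        s[len] = '\0';
    }
    return s;
}

/* Times one call of fn on input; returns CPU seconds, or -1.0 if fn
   returned NULL. The result is freed before returning. */
double time_one_call(char *(*fn)(const char *), const char *input)
{
    clock_t start, end;
    char *result;

    assert(fn != NULL && input != NULL);
    start = clock();
    result = fn(input);
    end = clock();
    if (result == NULL) {
        return -1.0;
    }
    free(result);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A caller would build the input once with `make_input(209715200)`, call `time_one_call` several times per candidate function, and average. With optimization enabled a compiler may remove work whose result is never used, so a real measurement would also need to consume the result.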

Oct 8 '07 #70

Keith Thompson

Willem <wi****@stack.nl> writes:
[...]

By the way, the perl equivalent would be:
(ret = str) =~ s/\./_/g;
But that's off-topic.

Indeed, which is why I won't mention that the *correct* perl
equivalent is:

($ret = $str) =~ s/\./_/g;

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Oct 8 '07 #71

Richard Tobin

In article <fe**********@aioe.org>, santosh <sa*********@gmail.com> wrote:

>Well I did a small comparison test between the two versions on a string
of length 209,715,200 bytes, with the '.' character just before the
terminating null. I used clock to time the functions. For four runs for
each version here are the averages:

RH's version = 1.480000s
AT's version = 1.212500s

If you can be bothered, what happens if you replace the strcpy() in
RH's code with a memcpy() (the length being already known)?

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Oct 8 '07 #72

Richard Heathfield

santosh said:

<snip>

RH's version = 1.480000s
AT's version = 1.212500s

[system is Pentium Dual Core 1.6 GHz with 1 Gb RAM]
So for this system at least, AT's version is significantly faster.

No, it isn't significantly faster. The function is called *once* by the
program of which it is a part. On my system, it takes less than a
microsecond to run. Let's round up and call it one microsecond. According
to your timings, AT's version saves (1.48 - 1.2125)/1.48 = approximately
0.180743 of the time. This equates to 181 nanoseconds per program run.

To save as much as a single second, you'd have to run the program well over
five *million* times. If it only took me one minute to verify the change,
update the source, recompile, and re-test, I'd still have to run the code
three hundred million times just to break even. I type at 40wpm, so I can
probably manage to add one error message identifier and meaningful message
text in about five seconds. 300,000,000 * 5 = 1,500,000,000 seconds. So
breaking even would take 47 years (assuming I did nothing else for the
next 47 years, such as eating and sleeping and actual programming and
watching LOTR and so on).

So in fact it would be counter-productive to adopt the change, even if I
thought it were an improvement, which I don't.
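
The arithmetic above can be reproduced with a short sketch. The function names `fraction_saved` and `runs_to_save` are hypothetical; the inputs are the figures quoted in this post.

```c
#include <assert.h>

/* Fraction of the running time saved, from the two measured averages. */
double fraction_saved(double slow_seconds, double fast_seconds)
{
    assert(slow_seconds > 0.0);
    return (slow_seconds - fast_seconds) / slow_seconds;
}

/* Number of runs needed before the per-run saving adds up to target. */
double runs_to_save(double target_seconds, double saved_per_run_seconds)
{
    assert(saved_per_run_seconds > 0.0);
    return target_seconds / saved_per_run_seconds;
}
```

`fraction_saved(1.48, 1.2125)` is about 0.180743, which on a one-microsecond run is about 181 nanoseconds. `runs_to_save(1.0, 181e-9)` is roughly 5.5 million runs and `runs_to_save(60.0, 181e-9)` roughly 330 million, matching the "well over five million" and "three hundred million" figures. At 365.25 days per year, 1,500,000,000 seconds is about 47.5 years.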

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Oct 8 '07 #73

santosh

Richard Tobin wrote:

In article <fe**********@aioe.org>, santosh <sa*********@gmail.com>
wrote:

>>Well I did a small comparison test between the two version on a string
of length 209,715,200 bytes, with the '.' character just before the
terminating null. I used clock to time the functions. For four runs
for each version here are the averages:

RH's version = 1.480000s
AT's version = 1.212500s

If you can be bothered, what happens if you replace the strcpy() in
RH's code with a memcpy() (the length being already known)?

The average time drops to 1.320000s. Not much.
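
The modified code was not posted. A sketch of the variant presumably being timed, assuming it is RH's function with the strcpy() swapped for a memcpy() of the already-computed size; the name `dot_to_underscore_memcpy` and the added `assert` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

char *dot_to_underscore_memcpy(const char *s)
{
    size_t size;
    char *t;

    assert(s != NULL);             /* added precondition check */
    size = strlen(s) + 1;          /* includes the terminating '\0' */
    t = malloc(size);
    if (t != NULL) {
        char *u;
        memcpy(t, s, size);        /* length known, no test for '\0' per byte */
        for (u = t; *u != '\0'; ++u) {
            if (*u == '.') {
                *u = '_';
            }
        }
    }
    return t;
}
```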

Oct 8 '07 #74

santosh

Richard Heathfield wrote:

santosh said:

<snip>

>RH's version = 1.480000s
AT's version = 1.212500s

[system is Pentium Dual Core 1.6 GHz with 1 Gb RAM]
So for this system at least, AT's version is significantly faster.

No, it isn't significantly faster. The function is called *once* by
the program of which it is a part. On my system, it takes less than a
microsecond to run. Let's round up and call it one microsecond.
According to your timings, AT's version saves (1.48 - 1.2125)/1.48 =
approximately 0.180743 of the time. This equates to 181 nanoseconds
per program run.

To save as much as a single second, you'd have to run the program well
over five *million* times. If it only took me one minute to verify the
change, update the source, recompile, and re-test, I'd still have to
run the code three hundred million times just to break even. I type at
40wpm, so I can probably manage to add one error message identifier
and meaningful message text in about five seconds. 300,000,000 * 5 =
1,500,000,000 seconds. So breaking even would take 47 years (assuming
I did nothing else for the next 47 years, such as eating and sleeping
and actual programming and watching LOTR and so on).

So in fact it would be counter-productive to adopt the change, even if
I thought it were an improvement, which I don't.

You are right. I got carried away by the literal numbers and failed to
put them in perspective with the real world.

Oct 8 '07 #75

Tor Rustad

santosh wrote:

Tor Rustad wrote:

>santosh wrote:
>>John Bode wrote:
[...]

>>>FWIW, after four runs on a 200 MB input string, RH's code is on
average 4% faster than AT's code.
Interesting! This could be a case of simpler constructions being more
easily understood by the optimiser.
I don't think so. My guess would be, either the measurement is
misleading, or strcpy() help alignment and execute some code in
parallel.

Well I did a small comparison test between the two version on a string
of length 209,715,200 bytes, with the '.' character just before the
terminating null. I used clock to time the functions. For four runs for
each version here are the averages:

RH's version = 1.480000s
AT's version = 1.212500s

Yeah, that's more like it.

[system is Pentium Dual Core 1.6 GHz with 1 Gb RAM]
So for this system at least, AT's version is significantly faster.

I have not seen the fgetline() code, but if the function in question is
called only *once*, then OP "optimized" the *outer* loop, a professional
would go for the *inner* loop.

Your measurement showed that what OP posted was nonsense: an insignificant
micro-optimization.

--
Tor <torust [at] online [dot] no>

C-FAQ: http://c-faq.com/

Oct 9 '07 #76

CBFalconer

Antoninus Twink wrote:

>

.... snip ...

>
You are splitting hares. It is functionally equivalent - its
advantage is a slightly shorter total length, but in exchange the
loop doesn't have an empty body.

Splitting hares will not reduce the poor beast's length, only its
width. The emptiness of the body depends (at least in part) on how
much of the interior spills after the split. :-)

You might want to consider 'splitting hairs'.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Oct 9 '07 #77

John Bode

On Oct 8, 5:37 pm, Tor Rustad <tor_rus...@hotmail.com> wrote:

John Bode wrote:

[...]

FWIW, after four runs on a 200 MB input string, RH's code is on
average 4% faster than AT's code.

ROFL!

However, the optimizer can be playing tricks with you here.

Oh, no doubt. But that kind of reinforces the idea that writing code
in a clear, straightforward way (that is, clear to anyone who *isn't*
an expert in C) is better than trying to save screen real estate.

I'm rewriting the test harness and running it on my home box (Gentoo);
the box I ran it on earlier was RHEL 3 running in a VMware session on
an XP Pro box that's running eleventy billion server processes. I'm
also going to try it with various string lengths and optimization
options.

Anyway, RH
code was 96% more readable and maintainable, so OP was Trolling.

Again, no doubt.

--
Tor <torust [at] online [dot] no>

Oct 9 '07 #78

Richard

John Bode <jo*******@my-deja.com> writes:

On Oct 8, 5:37 pm, Tor Rustad <tor_rus...@hotmail.com> wrote:
>John Bode wrote:

[...]

FWIW, after four runs on a 200 MB input string, RH's code is on
average 4% faster than AT's code.

ROFL!

However, the optimizer can be playing tricks with you here.

Oh, no doubt. But that kind of reinforces the idea that writing code
in a clear, straightforward way (that is, clear to anyone who *isn't*
an expert in C) is better than trying to save screen real estate.

I'm rewriting the test harness and running it on my home box (Gentoo);
the box I ran it on earlier was RHEL 3 running in a VMware session on
an XP Pro box that's running eleventy billion server processes. I'm
also going to try it with various string lengths and optimization
options.

>Anyway, RH
code was 96% more readable and maintainable, so OP was Trolling.

Again, no doubt.

If you prefer RH's code I am astonished. But it takes all sorts, I
suppose. It's not "bad code" by any stretch of the imagination.

As was pointed out earlier, that kind of speed increase/decrease is
generally immaterial. However, I know for a fact that the time saved in
debugging and reading it is worth its weight in gold. If you cannot
understand such a bog-standard 2 lines, then there are issues in your
understanding of C.

while(*u++=(*s=='.' ? '_' : *s))
s++;

It couldn't be easier.

Oct 9 '07 #79

Old Wolf

On Oct 9, 11:27 am, Antoninus Twink <nos...@nospam.com> wrote:

On 8 Oct 2007 at 22:14, Old Wolf wrote:
On Oct 9, 9:37 am, Richard <rgr...@gmail.com> wrote:
,----
| while(*u++=(*s=='.' ? '_' : *s))
| s++;
`----

That is NOT the replacement code suggested by
"Antoninus Twink".

You are splitting hares. It is functionally equivalent - its advantage
is a slightly shorter total length, but in exchange the loop doesn't
have an empty body.

The difference is more than you realise; here we are
talking about code readability and maintainability and
the modified version has two major improvements to your
original:
* The s++ only occurs once
* It has a loop body

And I find it amusing that you object to me posting under a Usenet
handle, when your own posts are from "Old Wolf".

I'm not objecting to you posting under a handle. Also,
I find it amusing that you think it improves code if
you can make a loop body empty.

Oct 9 '07 #80

CBFalconer

santosh wrote:

CBFalconer wrote:

.... snip ...

>
>While I am basically in favor of the 'tauter code' group, my
rewrite would have been different. For one thing, I don't like
the ? coding. My solution:

char *dot_to_underscore(const char *s) {
char *t, *u;

if (!(t = u = malloc(strlen(s) + 1))) {

You mean you commence copy when malloc fails?

> while (*u = *s++) {
if (*u == '.') *u = '_';
++u;
}
}
return t;
}

However, I dislike taking multiple scans of the same strings, so I
would probably have arranged for the routine to return strlen, and
a negative length if malloc fails. This decision depends heavily
on the use to which the routine is put.

All this is a non-factor, and basically tests individual styles and
preferences. While worthy of a discussion, it is not worth an
argument.

Yup. Slight overexuberance with the bang.
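
For reference, a sketch of the routine with the stray `!` removed, so that the copy runs only when malloc succeeds. The explicit comparisons against NULL and '\0' and the `assert` are additions for clarity; the structure is otherwise as quoted above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

char *dot_to_underscore(const char *s) {
    char *t, *u;

    assert(s != NULL);   /* added precondition check */
    if ((t = u = malloc(strlen(s) + 1)) != NULL) {   /* copy only on success */
        while ((*u = *s++) != '\0') {
            if (*u == '.') *u = '_';
            ++u;
        }
    }
    return t;
}
```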

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Oct 9 '07 #81

¬a\\/b

On Tue, 09 Oct 2007 02:39:06 +0200, Richard wrote:

>John Bode <jo*******@my-deja.com> writes:

>On Oct 8, 5:37 pm, Tor Rustad <tor_rus...@hotmail.com> wrote:
>>John Bode wrote:

[...]

FWIW, after four runs on a 200 MB input string, RH's code is on
average 4% faster than AT's code.

ROFL!

However, the optimizer can be playing tricks with you here.

Oh, no doubt. But that kind of reinforces the idea that writing code
in a clear, straightforward way (that is, clear to anyone who *isn't*
an expert in C) is better than trying to save screen real estate.

I'm rewriting the test harness and running it on my home box (Gentoo);
the box I ran it on earlier was RHEL 3 running in a VMware session on
an XP Pro box that's running eleventy billion server processes. I'm
also going to try it with various string lengths and optimization
options.

>>Anyway, RH
code was 96% more readable and maintainable, so OP was Trolling.

Again, no doubt.

If you prefer RH's code I am astonished. But it takes all sorts, I
suppose. It's not "bad code" by any stretch of the imagination.

As was pointed out earlier, that kind of speed increase/decrease is
generally immaterial. However, I know for a fact that the time saved in
debugging and reading it is worth its weight in gold. If you cannot
understand such a bog-standard 2 lines, then there are issues in your
understanding of C.

while(*u++=(*s=='.' ? '_' : *s))
s++;

>It couldn't be easier.

this is easier
goto l2;
l1: ++s; ++u;
l2: *u=(*s=='.'? '_': *s);
if(*s) goto l1;
in 2 lines:
goto l2;
l1: ++s; ++u; l2: *u=(*s=='.'? '_': *s); if(*s) goto l1;

in one line with #==goto

#l2; l1: ++s; ++u; l2: *u=(*s=='.'? '_': *s); if(*s)#l1;
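For anyone who wants to run the goto loop above, here is a sketch that wraps it in a complete function. The name `dot_to_underscore_goto` and the malloc handling around the loop are assumptions; only the three labelled lines come from the post.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper around the goto loop from the post. */
char *dot_to_underscore_goto(const char *s)
{
    char *t = malloc(strlen(s) + 1);
    char *u = t;

    if (t == NULL)
        return NULL;

    goto l2;
l1: ++s; ++u;
l2: *u = (*s == '.' ? '_' : *s);   /* copy one char, replacing dots */
    if (*s) goto l1;               /* stop once the '\0' is copied */

    return t;
}
```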

Oct 9 '07 #82

Philip Potter

Richard wrote:

Why would you post that?

Off Topic for a start and NOTHING to do with C.

The poster was quite correct in his improvements of RHs code. Regardless
of who or what he is.

Really? I see neither bug nor gross inefficiency in RH's original, which
was his main claim. His code posted is not grossly more efficient - it's
only a constant factor faster, not even a different big-O efficiency class.

--
Philip Potter pgp <at> doc.ic.ac.uk

Oct 9 '07 #83

Keith Thompson

"¬a\/b" <al@f.g> writes:

On Tue, 09 Oct 2007 02:39:06 +0200, Richard wrote:

[...]

>while(*u++=(*s=='.' ? '_' : *s))
s++;

>>It couldn't be easier.

this is easier
goto l2;
l1: ++s; ++u;
l2: *u=(*s=='.'? '_': *s);
if(*s) goto l1;
in 2 lines:
goto l2;
l1: ++s; ++u; l2: *u=(*s=='.'? '_': *s); if(*s) goto l1;

in one line with #==goto

#l2; l1: ++s; ++u; l2: *u=(*s=='.'? '_': *s); if(*s)#l1;

Stripped of whitespace, compressed with gzip, and base64 encoded:

H4sIAIUzC0cAA1POMbJWyDG0UtDWLrYGEqVAnpGVglaprYZWsa2tnr1CPJBXrGmt
kJkGFNFUzjG05gIAN60ghDUAAAA=

Ahh, much better.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Oct 9 '07 #84

¬a\\/b

On Mon, 8 Oct 2007 00:16:07 +0200 (CEST), Antoninus Twink
wrote:

>The function below is from Richard HeathField's fgetline program. For
some reason, it makes three passes through the string (a strlen(), a
strcpy() then another pass to change dots) when two would clearly be
sufficient. This could lead to unnecessarily bad performance on very
long strings. It is also written in a hard-to-read and clunky style.

char *dot_to_underscore(const char *s)
{
char *t = malloc(strlen(s) + 1);
if(t != NULL)
{
char *u;
strcpy(t, s);
u = t;
while(*u)
{
if(*u == '.')
{
*u = '_';
}
++u;
}
}
return
t;
}

this is a big waste of vertical spaces

>Proposed solution:

char *dot_to_underscore(const char *s)
{
char *t, *u;
if(t=u=malloc(strlen(s)+1))
while(*u++=(*s=='.' ? s++, '_' : *s++));
return t;
}

Oct 9 '07 #85

¬a\\/b

On Tue, 09 Oct 2007 09:31:52 +0200, ¬a\/b wrote:

>On Tue, 09 Oct 2007 02:39:06 +0200, Richard wrote:

>>John Bode <jo*******@my-deja.com> writes:

>>On Oct 8, 5:37 pm, Tor Rustad <tor_rus...@hotmail.com> wrote:
John Bode wrote:

[...]

FWIW, after four runs on a 200 MB input string, RH's code is on
average 4% faster than AT's code.

ROFL!

However, the optimizer can be playing tricks with you here.

Oh, no doubt. But that kind of reinforces the idea that writing code
in a clear, straightforward way (that is, clear to anyone who *isn't*
an expert in C) is better than trying to save screen real estate.

I'm rewriting the test harness and running it on my home box (Gentoo);
the box I ran it on earlier was RHEL 3 running in a VMware session on
an XP Pro box that's running eleventy billion server processes. I'm
also going to try it with various string lengths and optimization
options.

Anyway, RH
code was 96% more readable and maintainable, so OP was Trolling.
Again, no doubt.

If you prefer RH's code I am astonished. But it takes all sorts, I
suppose. It's not "bad code" by any stretch of the imagination.

As was pointed out earlier, that kind of speed increase/decrease is
generally immaterial. However, I know for a fact that the time saved in
debugging and reading it is worth its weight in gold. If you cannot
understand such a bog-standard 2 lines then there are issues in your
understanding of C.

while(*u++=(*s=='.' ? '_' : *s))
s++;

>>It couldn't be easier.

this is easier
goto l2;
l1: ++s; ++u;
l2: *u=(*s=='.'? '_': *s);
if(*s) goto l1;
in 2 lines:
goto l2;
l1: ++s; ++u; l2: *u=(*s=='.'? '_': *s); if(*s) goto l1;

in one line with #==goto

#l2; l1: ++s; ++u; l2: *u=(*s=='.'? '_': *s); if(*s)#l1;

#.2
.0: B*i='_'; #.1;
.1: ++i,j; .2: al=*j; al=='.'#.0; *i=al; al#.1

Oct 9 '07 #86

¬a\\/b

On Tue, 09 Oct 2007 10:06:04 +0200, ¬a\/b wrote:

>#.2
.0: B*i='_'; #.1;
.1: ++i,j; .2: al=*j; al=='.'#.0; *i=al; al#.1

there is one jmp too many

#.2
.0: B*i='_'
.1: ++i,j; .2: al=*j; al=='.'#.0; *i=al; al#.1

that should be something like
jmp short .2
.0: mov byte [esi], '_'
.1: inc esi
inc edi
.2: mov al, [edi]
cmp al, '.'
je .0
mov [esi], al
cmp al, 0
jne .1

Oct 9 '07 #87

Antoninus Twink

On 8 Oct 2007 at 23:55, Tor Rustad wrote:

I have not seen the fgetline() code, but if the function in question is
called only *once*, then OP "optimized" the *outer* loop, a professional
would go for the *inner* loop.

Your measurement showed that OP's posted nonsense is an insignificant
micro-optimization.

The point was that Mr Heathfield's code was a micro-anti-optimization.
If making the code harder to read in exchange for a small increase in
speed is bad, how much worse is it to make the code harder to read by
implementing a convoluted two-pass algorithm that's more complex and
slower than the natural, idiomatic one?

Or, if someone makes the small decisions badly, does that give us any
faith that they'll do better on the big decisions?

Oct 9 '07 #88

santosh

¬a\/b wrote:

On Tue, 09 Oct 2007 10:06:04 +0200, ¬a\/b wrote:

>>#.2
.0: B*i='_'; #.1;
.1: ++i,j; .2: al=*j; al=='.'#.0; *i=al; al#.1

there is jmp more

#.2
.0: B*i='_'
.1: ++i,j; .2: al=*j; al=='.'#.0; *i=al; al#.1

Your programs would be more readable if you decided to use either one of
l33t or brainf?uk.

To get you started:
<http://esoteric.voxelperfect.net/wiki/Brainf?ck>
<http://www.geocities.com/electrodruiduk/l33t.htm>

Oct 9 '07 #89

Richard Heathfield

Antoninus Twink said:

On 8 Oct 2007 at 23:55, Tor Rustad wrote:
>I have not seen the fgetline() code, but if the function in question is
called only *once*, then OP "optimized" the *outer* loop, a professional
would go for the *inner* loop.

Your measurement showed that OP's posted nonsense is an insignificant
micro-optimization.

The point was that Mr Heathfield's code was a micro-anti-optimization.

No, Mr Heathfield's code was written in about a minute, and worked
perfectly first time, and is easy for even a very inexperienced C
programmer to understand. What's more, it is called once per program
invocation, and does its job in less than a microsecond. You can call it
what you like, but I call it a win.

If making the code harder to read in exchange for a small increase in
speed is bad,

Whether it is bad depends on the relative merits of performance and
clarity. I have already shown how I would have to use the program 24/7 for
almost half a century before your suggested change could save me so much
as a nanosecond (once the time cost of making that change is factored in),
and quite frankly I have better things to do with my life. So in this
case, the performance increase is meaningless, whereas the loss of clarity
is significant.

how much worse is it to make the code harder to read by
implementing a convoluted two-pass algorithm that's more complex and
slower than the natural, idiomatic one?

We have already demonstrated why "slower" is unimportant. If you seriously
think the original code is hard to read, then words fail me. It's
terribly, terribly simple C. It's astoundingly easy for most C people. I
simply cannot understand why you would find it difficult or complex.

Or, if someone makes the small decisions badly, does that give us any
faith that they'll do better on the big decisions?

We have different criteria. You appear to judge code by how "tight" it is.
I prefer to judge it by how readable and maintainable it is.

It is certainly true that I could have merged the copy and the replace
operation, but it is also true that doing so makes practically no
difference to the performance of the code. What you have called "gross
inefficiency" manages to outperform your code on at least one platform (as
reported elsethread), and doesn't perform significantly worse on others.

What's more, my code is easily understood and easily maintained by even
very inexperienced C programmers. For me, that's a key goal, because the
code I write is very often used and perhaps modified in environments over
which I have no control. So I like to keep things simple. If someone
discovers a way that they can shave 180 nanoseconds (that's 0.00000018
seconds!) off a function that is called once per program invocation, I'm
pleased for them, but that doesn't mean I have to incorporate their
changes into my code.
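For reference, merging the copy and the replace operation can also be written without the dense one-liner. This is only a sketch under that assumption, not code from either poster, and the name `dot_to_underscore_merged` is made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical merged version: one strlen() pass to size the buffer,
   then a single pass that copies and replaces at the same time. */
char *dot_to_underscore_merged(const char *s)
{
    char *t = malloc(strlen(s) + 1);

    if (t != NULL) {
        char *u = t;

        for (; *s != '\0'; ++s) {
            *u++ = (*s == '.') ? '_' : *s;
        }
        *u = '\0';   /* terminate the copy explicitly */
    }
    return t;
}
```

It saves one pass over the string compared with the strcpy() version; whether that shows up in measurements is a separate question.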

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Oct 9 '07 #90

John Bode

On Oct 7, 5:16 pm, Antoninus Twink <nos...@nospam.com> wrote:

The function below is from Richard HeathField's fgetline program. For
some reason, it makes three passes through the string (a strlen(), a
strcpy() then another pass to change dots) when two would clearly be
sufficient. This could lead to unnecessarily bad performance on very
long strings.

It could, except that on my system (Gentoo Linux, 2.6.22 kernel, gcc
4.1.2), it doesn't. In fact, it leads to overall *better* performance
on very long strings. I wrote a test harness that generates random
strings and calls both Richard's and Antoninus' versions of
dot_to_underscore, as well as a "control" version (basically a copy of
Richard's, except that it copies characters to the target buffer
manually instead of using strcpy()).

I ran three rounds of tests. Each round worked on a 2 MB string
(2097152 bytes), and for each run of the program, each version of the
function was called 100 times (basically so I could get some usable
times). For the first round, I compiled the code without any
optimization. For the second round, I compiled with the -O1 flag.
For the third round, I compiled with the -O2 flag. I used gprof to
get the time spent in each function.

For each round, I ran the program 10 times, logging the times from
each, and computed the average.

So, calling each function 100 times on a 2 MB string, I got the
following results (average time over 10 runs):

Function                RH       AT       JB
--------                --       --       --
No optimization         2.72 s   3.49 s   4.01 s
Level 1 optimization    1.33 s   1.93 s   1.74 s
Level 2 optimization    0.60 s   1.11 s   1.13 s

Hmmmm. Two things stand out. One, on my system, calling strcpy() is
*definitely* faster than copying each individual character in a loop
(as that is the only difference between RH and JB). Secondly, writing
code in a straightforward, "clunky" style seems to make it easier to
optimize.

Of course, these results are valid *only* for my particular system.
I'd be interested to see the results from different hardware/OS/
compiler combinations.
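Anyone wanting to try this on their own system could start from something like the sketch below. It is not the harness used for the numbers above: that one was profiled with gprof, whereas this sketch swaps in clock() from <time.h>. The helper names `make_test_string` and `time_calls` are hypothetical. The two functions under test are the versions quoted in this thread, renamed so they can coexist.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

typedef char *(*convert_fn)(const char *);

/* RH's version, as quoted in the thread. */
char *dot_to_underscore_rh(const char *s)
{
    char *t = malloc(strlen(s) + 1);
    if (t != NULL) {
        char *u;
        strcpy(t, s);
        u = t;
        while (*u) {
            if (*u == '.') {
                *u = '_';
            }
            ++u;
        }
    }
    return t;
}

/* AT's version, as quoted in the thread (extra parentheses only). */
char *dot_to_underscore_at(const char *s)
{
    char *t, *u;
    if ((t = u = malloc(strlen(s) + 1)))
        while ((*u++ = (*s == '.' ? s++, '_' : *s++)))
            ;
    return t;
}

/* Hypothetical helper: pseudo-random string of len chars with some dots. */
char *make_test_string(size_t len, unsigned seed)
{
    static const char alphabet[] = "abcdefghij.";
    char *s = malloc(len + 1);
    size_t i;

    if (s == NULL)
        return NULL;
    srand(seed);
    for (i = 0; i < len; ++i)
        s[i] = alphabet[(size_t)rand() % (sizeof alphabet - 1)];
    s[len] = '\0';
    return s;
}

/* Hypothetical helper: CPU seconds for reps calls of fn, or -1.0 if an
   allocation fails. clock() is coarse, so use plenty of reps. */
double time_calls(convert_fn fn, const char *s, int reps)
{
    clock_t start = clock();
    int i;

    for (i = 0; i < reps; ++i) {
        char *r = fn(s);
        if (r == NULL)
            return -1.0;
        free(r);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

To mirror the setup described above, build a 2097152-byte string with make_test_string() and call time_calls() with reps set to 100 for each function, once per optimization level. Results will vary with compiler, library, and hardware.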

It is also written in a hard-to-read and clunky style.

char *dot_to_underscore(const char *s)
{
char *t = malloc(strlen(s) + 1);
if(t != NULL)
{
char *u;
strcpy(t, s);
u = t;
while(*u)
{
if(*u == '.')
{
*u = '_';
}
++u;
}
}
return
t;

}

Proposed solution:

char *dot_to_underscore(const char *s)
{
char *t, *u;
if(t=u=malloc(strlen(s)+1))
while(*u++=(*s=='.' ? s++, '_' : *s++));
return t;

}

It's time to re-learn those two important lessons:

1. Terse code does not necessarily equate to faster code
2. The only way to know what version of code will be more efficient
is to profile all versions under consideration.

More bad code is written in the name of "efficiency" than anything
else.

Oct 9 '07 #91

Antoninus Twink

On 9 Oct 2007 at 9:34, Richard Heathfield wrote:

Antoninus Twink said:
>If making the code harder to read in exchange for a small increase in
speed is bad,

Whether it is bad depends on the relative merits of performance and
clarity. I have already shown how I would have to use the program 24/7 for
almost half a century before your suggested change could save me so much
as a nanosecond (once the time cost of making that change is factored in),
and quite frankly I have better things to do with my life. So in this
case, the performance increase is meaningless, whereas the loss of clarity
is significant.

But exactly the opposite is true - clarity is lost in *your* version, by
taking something simple and making a meal of it. The average programmer
looking at your code is going to do a double-take, wonder why on earth
that strcpy() is there when you immediately make another pass through
the string, and have to convince himself that no, there really isn't any
funny business going on.

Compare that to two lines of idiomatic code.

>how much worse is it to make the code harder to read by
implementing a convoluted two-pass algorithm that's more complex and
slower than the natural, idiomatic one?

We have already demonstrated why "slower" is unimportant. If you seriously
think the original code is hard to read, then words fail me. It's
terribly, terribly simple C. It's astoundingly easy for most C people. I
simply cannot understand why you would find it difficult or complex.

Yes, it's terribly, terribly simple. Almost babyish even. I did not mean
that your *code* is complex, but that the *algorithm* is complex. And
you will say, "But it's a simple algorithm!" Right, it isn't complex by
any objective standard of complexity, but it's *more complex than it
needs to be* - why swap a simple single-pass algorithm for a 2-pass
algorithm?

And if you make simple things over-complicated, we might not
unreasonably suspect that you might make complicated things into a
complete mess.

Oct 9 '07 #92

Mark McIntyre

On Tue, 9 Oct 2007 14:20:41 +0200 (CEST), in comp.lang.c , Antoninus
Twink <no****@nospam.com> wrote:

>On 9 Oct 2007 at 9:34, Richard Heathfield wrote:
>Antoninus Twink said:
>>If making the code harder to read in exchange for a small increase in
speed is bad,

Whether it is bad depends on the relative merits of performance and
clarity. I have already shown how I would have to use the program 24/7 for
almost half a century before your suggested change could save me so much
as a nanosecond (once the time cost of making that change is factored in),
and quite frankly I have better things to do with my life. So in this
case, the performance increase is meaningless, whereas the loss of clarity
is significant.

But exactly the opposite is true - clarity is lost in *your* version,

You're wrong, but I don't expect you to believe me since you're
apparently a troll.

>And if you make simple things over-complicated, we might not
unreasonably suspect that you might make complicated things into a
complete mess.

Only if he changes his name to Antoninus Twink.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Oct 9 '07 #93

Joachim Schmitz

"¬a\/b" <al@f.g> wrote in message
news:4j********************************@4ax.com...

On Mon, 8 Oct 2007 00:16:07 +0200 (CEST), Antoninus Twink
wrote:

>>The function below is from Richard HeathField's fgetline program. For
some reason, it makes three passes through the string (a strlen(), a
strcpy() then another pass to change dots) when two would clearly be
sufficient. This could lead to unnecessarily bad performance on very
long strings. It is also written in a hard-to-read and clunky style.

char *dot_to_underscore(const char *s)
{
char *t = malloc(strlen(s) + 1);
if(t != NULL)
{
char *u;
strcpy(t, s);
u = t;
while(*u)
{
if(*u == '.')
{
*u = '_';
}
++u;
}
}
return
t;
}

this is a big waste of vertical spaces

Scratch the 'big' and I agree. Saving 6 lines:

char *dot_to_underscore(const char *s)
{
char *t = malloc(strlen(s) + 1);
if(t != NULL){
char *u;
strcpy(t, s);
u = t;
while(*u){
if(*u == '.') *u = '_';
++u;
}
}
return t;
}
Bye, Jojo

Oct 9 '07 #94

Chris Dollin

Joachim Schmitz wrote:

>this is a big waste of vertical spaces
Scratch the 'big' and I agree. Saving 6 lines:

char *dot_to_underscore(const char *s)
{
char *t = malloc(strlen(s) + 1);
if(t != NULL){
char *u;
strcpy(t, s);
u = t;
while(*u){
if(*u == '.') *u = '_';
++u;
}
}
return t;
}

Save another line by declaring `char *u = t;`, which also makes it
utterly clear that `u` is initialised to a sensible value.

--
Chris "would be even non-twinkly terser myself" Dollin

Hewlett-Packard Limited registered office: Cain Road, Bracknell,
registered no: 690597 England Berks RG12 1HN

Oct 9 '07 #95

Richard Heathfield

Chris Dollin said:

Joachim Schmitz wrote:

>>this is a big waste of vertical spaces
Scratch the 'big' and I agree. Saving 6 lines:

char *dot_to_underscore(const char *s)
{
char *t = malloc(strlen(s) + 1);
if(t != NULL){
char *u;
strcpy(t, s);
u = t;
while(*u){
if(*u == '.') *u = '_';
++u;
}
}
return t;
}

Save another line by declaring `char *u = t;`, which also makes it
utterly clear that `u` is initialised to a sensible value.

<hell temperature="frozen over">
Or even char *u = strcpy(t, s); saving /two/ lines. :-)
</hell>
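Spelled out in full, that suggestion would look roughly like the sketch below. It relies on the standard guarantee that strcpy() returns its destination pointer, so `u` starts at the beginning of the fresh copy. The surrounding function is the compacted version quoted above.

```c
#include <stdlib.h>
#include <string.h>

char *dot_to_underscore(const char *s)
{
    char *t = malloc(strlen(s) + 1);
    if (t != NULL) {
        /* strcpy() returns t, so u is initialised at its declaration. */
        char *u = strcpy(t, s);
        while (*u) {
            if (*u == '.') *u = '_';
            ++u;
        }
    }
    return t;
}
```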

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Oct 9 '07 #96

Chris Dollin

Richard Heathfield wrote:

Chris Dollin said:

>Joachim Schmitz wrote:

>>>this is a big waste of vertical spaces
Scratch the 'big' and I agree. Saving 6 lines:

char *dot_to_underscore(const char *s)
{
char *t = malloc(strlen(s) + 1);
if(t != NULL){
char *u;
strcpy(t, s);
u = t;
while(*u){
if(*u == '.') *u = '_';
++u;
}
}
return t;
}

Save another line by declaring `char *u = t;`, which also makes it
utterly clear that `u` is initialised to a sensible value.

<hell temperature="frozen over">
Or even char *u = strcpy(t, s); saving /two/ lines. :-)
</hell>

I was just about to post that adjustment, you mind-reader you.

--
Chris "broadcasting" Dollin

Hewlett-Packard Limited registered office: Cain Road, Bracknell,
registered no: 690597 England Berks RG12 1HN

Oct 9 '07 #97

Richard

"Joachim Schmitz" <no************@hp.com> writes:

"¬a\/b" <al@f.g> wrote in message
news:4j********************************@4ax.com...
>On Mon, 8 Oct 2007 00:16:07 +0200 (CEST), Antoninus Twink
wrote:

>>>The function below is from Richard HeathField's fgetline program. For
some reason, it makes three passes through the string (a strlen(), a
strcpy() then another pass to change dots) when two would clearly be
sufficient. This could lead to unnecessarily bad performance on very
long strings. It is also written in a hard-to-read and clunky style.

char *dot_to_underscore(const char *s)
{
char *t = malloc(strlen(s) + 1);
if(t != NULL)
{
char *u;
strcpy(t, s);
u = t;
while(*u)
{
if(*u == '.')
{
*u = '_';
}
++u;
}
}
return
t;
}

this is a big waste of vertical spaces
Scratch the 'big' and I agree. Saving 6 lines:

char *dot_to_underscore(const char *s)
{
char *t = malloc(strlen(s) + 1);
if(t != NULL){
char *u;
strcpy(t, s);
u = t;
while(*u){
if(*u == '.') *u = '_';

Failed.

You can't easily see in a debugger whether the assignment is done.

++u;
}
}
return t;
}
Bye, Jojo

Oct 9 '07 #98

Richard

Mark McIntyre <ma**********@spamcop.net> writes:

On Tue, 9 Oct 2007 14:20:41 +0200 (CEST), in comp.lang.c , Antoninus
Twink <no****@nospam.com> wrote:

>>On 9 Oct 2007 at 9:34, Richard Heathfield wrote:
>>Antoninus Twink said:
If making the code harder to read in exchange for a small increase in
speed is bad,

Whether it is bad depends on the relative merits of performance and
clarity. I have already shown how I would have to use the program 24/7 for
almost half a century before your suggested change could save me so much
as a nanosecond (once the time cost of making that change is factored in),
and quite frankly I have better things to do with my life. So in this
case, the performance increase is meaningless, whereas the loss of clarity
is significant.

But exactly the opposite is true - clarity is lost in *your* version,

You're wrong, but I don't expect you to believe me since you're
apparently a troll.

He most certainly is not.

The shorter more concise version would have got through any code review
long before the long winded one on any project I have worked on.

>
>>And if you make simple things over-complicated, we might not
unreasonably suspect that you might make complicated things into a
complete mess.

Only if he changes his name to Antoninus Twink.

This is getting ridiculous.

Oct 9 '07 #99

Richard Bos

Ian Collins <ia******@hotmail.com> wrote:

Antoninus Twink wrote:
On 7 Oct 2007 at 22:55, Richard Heathfield wrote:
Antoninus Twink said:
It is also written in a hard-to-read and clunky style.
A matter of opinion. Which bit did you find hard to read?
The function is a completely trivial one, yet I can't see it all at once
in my editor without scrolling! Whitespace can help readability, but
excessive whitespace can reduce it, and at the same time give too much
weight to things that aren't important.
You must have a very small screen.

I don't think it's his _screen_ that's very small.

Richard

Oct 9 '07 #100