about STREQ

Peng Jian

#define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)

I'm a beginner learning C. I can't see what *(a) == *(b) is for.
If either a or b is a null pointer, this may cause a crash.
And if neither is a null pointer, strcmp((a), (b)) == 0 alone can
do the job, so *(a) == *(b) seems unnecessary.

Nov 14 '05 #1

Leor Zolman

On 16 May 2004 18:08:40 -0700, pe*********@hotmail.com (Peng Jian) wrote:

#define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)

I'm a beginner learning C. I can't see what *(a) == *(b) is for.
If either a or b is a null pointer, this may cause crash.
And if neither is a null pointer, only using strcmp((a), (b)) == 0 can
do the job, so *(a) == *(b) seems unnecessary.

In the case where it is sufficiently uncommon for the strings being
compared to begin with the same character, this approach might get you
better performance than a simple strcmp.

Note that strcmp isn't permitted to take null pointers either, so you'd
have to test for that in either case. However, it is fairly
straightforward to programmatically constrain pointer-to-char variables to
ensure they remain valid before being used in such contexts.
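
One way to constrain the pointers before they reach the comparison is a small wrapper function. This is only a sketch: the name streq_nullsafe is hypothetical, and treating two null pointers as equal (and a null pointer as unequal to every string) is an assumption made for illustration, not something the post prescribes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the strings are equal, 0 otherwise.
 * Assumption: two null pointers compare equal, and a null pointer never
 * equals a real string. Neither *a nor strcmp() is reached with NULL. */
int streq_nullsafe(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return *a == *b && strcmp(a, b) == 0;
}
```

Because this is a function rather than a macro, each argument is evaluated exactly once.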
-leor
--
Leor Zolman --- BD Software --- www.bdsoft.com
On-Site Training in C/C++, Java, Perl and Unix
C++ users: download BD Software's free STL Error Message Decryptor at:
www.bdsoft.com/tools/stlfilt.html

Nov 14 '05 #2

Régis Troadec

"Leor Zolman" <le**@bdsoft.com> wrote in message
news:ha********************************@4ax.com...

Hi,

On 16 May 2004 18:08:40 -0700, pe*********@hotmail.com (Peng Jian) wrote:
#define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)

I'm a beginner learning C. I can't see what *(a) == *(b) is for.
If either a or b is a null pointer, this may cause crash.
And if neither is a null pointer, only using strcmp((a), (b)) == 0 can
do the job, so *(a) == *(b) seems unnecessary.

In the case where it is sufficiently uncommon for the strings being
compared to begin with the same character, this approach might get you
better performance than a simple strcmp.

On the other hand, it's rather funny if the strings being compared are
*very long* and only differ at, say, *(a+1) and *(b+1).
I think that's probably not the case, and that this macro is used in a context
where strings being compared differ very often in their first character;
otherwise, it's completely useless :)

Regis

Nov 14 '05 #3

Leor Zolman

On Mon, 17 May 2004 03:49:00 +0200, "Régis Troadec" <re**@wanadoo.fr>
wrote:

"Leor Zolman" <le**@bdsoft.com> wrote in message
news:ha********************************@4ax.com...

Hi,
On 16 May 2004 18:08:40 -0700, pe*********@hotmail.com (Peng Jian) wrote:
>#define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)
>
>I'm a beginner learning C. I can't see what *(a) == *(b) is for.
>If either a or b is a null pointer, this may cause crash.
>And if neither is a null pointer, only using strcmp((a), (b)) == 0 can
>do the job, so *(a) == *(b) seems unnecessary.

In the case where it is sufficiently uncommon for the strings being
compared to begin with the same character, this approach might get you
better performance than a simple strcmp.

On the other hand, it's very funny if strings being compared are *very long*
and only differ at, say, *(a+1) and *(b+1),
I think it might not be the case and that this macro is used in a context
where strings being compared differ very often in their first character,
otherwise, it's completely useless :)

That's what I said: "uncommon to begin with the same character" == "differ
very often in their first character." To paraphrase, the more often they
differ in their first character, the more you gain by the "extra" test of
just that first character. If they /never/ or hardly ever differ in their
first character, the technique is worse than "completely useless"--it would
be a pessimization.

If it is blinding string comparison performance you're after, though,
there's no substitute for tuning the algorithm to the nature of the strings
involved. The more you "know" about the nature of the strings, the more
ammunition you'll have to think of clever ways to code something that runs
faster (on average) than strcmp.

But as usual, beware of premature optimization.
-leor

--
Leor Zolman --- BD Software --- www.bdsoft.com
On-Site Training in C/C++, Java, Perl and Unix
C++ users: download BD Software's free STL Error Message Decryptor at:
www.bdsoft.com/tools/stlfilt.html

Nov 14 '05 #4

Thomas Matthews

Peng Jian wrote:

#define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)

I'm a beginner learning C. I can't see what *(a) == *(b) is for.
If either a or b is a null pointer, this may cause crash.
And if neither is a null pointer, only using strcmp((a), (b)) == 0 can
do the job, so *(a) == *(b) seems unnecessary.

Supposedly, the macro saves the cost of a function call if the
first letters differ. The logical AND (&&) operator has a
short circuit that says if the first expression is false, the
other is not evaluated.
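
The short-circuit behaviour can be made visible with a little instrumentation. In the sketch below, counting_strcmp and strcmp_calls are hypothetical names added for the demonstration, and the macro is a variant of the original that calls the counting wrapper instead of strcmp directly.

```c
#include <string.h>

/* Hypothetical instrumentation: how many times strcmp was really reached. */
int strcmp_calls = 0;

int counting_strcmp(const char *a, const char *b)
{
    strcmp_calls++;
    return strcmp(a, b);
}

/* Same shape as the original macro, but routed through the counter.
 * If *(a) != *(b), the right-hand side of && is never evaluated. */
#define STREQ(a, b) (*(a) == *(b) && counting_strcmp((a), (b)) == 0)
```

Comparing "apple" with "banana" leaves strcmp_calls untouched, while comparing "apple" with "avocado" increments it, since only then does the first-character test pass.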

On many systems, the execution time saved by this expression
is negligible compared to the actual speed and observed speed
of a program. It is called premature optimization.

I would treat readability as a more important criterion
and remove the macro. Just replace the macro with the
call to strcmp. After all, you would think that the strcmp
function would perform an optimal search or people wouldn't
use it.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book

Nov 14 '05 #5

Michael Wojcik

In article <PG*****************@newssvr32.news.prodigy.com>, Thomas Matthews <Th****************************@sbcglobal.net> writes:

Peng Jian wrote:
#define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)
I would treat readability as a more important criterion
and remove the macro. Just replace the macro with the
call to strcmp.

This macro has different semantics than strcmp(). Most importantly, it
inverts the sense of the comparison. It can't just be replaced with a
call to strcmp, if you want the program logic preserved.
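
The difference in sense is easy to see side by side. The two helper functions below are hypothetical and exist only to contrast the conventions: the macro yields 1 for equal strings, whereas strcmp itself yields 0.

```c
#include <string.h>

#define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)

/* Hypothetical helpers contrasting the two conventions. */
int equal_via_macro(const char *a, const char *b)
{
    return STREQ(a, b);          /* 1 when equal, 0 otherwise */
}

int equal_via_strcmp(const char *a, const char *b)
{
    return strcmp(a, b) == 0;    /* strcmp alone returns 0 when equal */
}
```

So an existing `if (STREQ(x, y))` would have to become `if (strcmp(x, y) == 0)`, not `if (strcmp(x, y))`, to keep the program logic unchanged.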

This is another fine example of why change for the sake of change is a
bad idea. I wouldn't recommend using a macro like this one, as it's
both obscuring and dangerous (the arguments are evaluated twice). But
changing it where it's used in existing code seems like a very poor
idea, since the maintainer might well make a mistake (such as simply
changing it to a call to strcmp).

It *might* arguably be a good idea to change the macro definition to

#define STREQ(a, b) (strcmp((a), (b)) == 0)

and get rid of the probably-pointless check of the initial characters,
to avoid any future maintenance introducing an argument with side
effects into a use of STREQ. (That's assuming that there aren't any
current uses of it where an argument with side effects needs to be
evaluated twice. I certainly hope that's the case.)
After all, you would think that the strcmp
function would perform an optimal search or people wouldn't
use it.

Most people use the standard library functions regardless of their
quality. I've seen plenty of little card games that use insufficient
rand implementations to shuffle the deck. But I'll agree that it's
poor practice to worry about the performance of the standard library
until you know it's important for your application. Few programs
need to worry about the performance of strcmp.
--
Michael Wojcik mi************@microfocus.com

I would never understand our engineer. But is there anything in this world
that *isn't* made out of words? -- Tawada Yoko (trans. Margaret Mitsutani)

Nov 14 '05 #6

Eric Sosman

Michael Wojcik wrote:

Most people use the standard library functions regardless of their
quality. I've seen plenty of little card games that use insufficient
rand implementations to shuffle the deck. But I'll agree that it's
poor practice to worry about the performance of the standard library
until you know it's important for your application. Few programs
need to worry about the performance of strcmp.

In two and a half decades of using C, I've encountered
exactly *one* machine on which strcmp's speed was a concern.
The machine has long since gone the way of the dinosaur. So
has the company that built it. So, too, has the company that
bought what was left of the first company. It's nice to
imagine that all this bloodshed was the result of poor QoI,
but there may have been one or two other factors ...

Just use strcmp().

--
Er*********@sun.com

Nov 14 '05 #7

kal

"Régis Troadec" <re**@wanadoo.fr> wrote in message news:<c8**********@news-reader2.wanadoo.fr>...

I think it might not be the case and that this macro is used in a context
where strings being compared differ very often in their first character,
otherwise, it's completely useless :)

In purely random cases where 7-bit ascii is used, the probability that
the first characters of two strings will be the same is 1/128.

Even in alpha strings the probability is 1/26 or even only 1/52.

Analysis of the English language may raise this probability somewhat
but, IMO, not above 1/10.

So the code is likely to be efficient in all except a few rare cases.

To say that one should eschew such obtuse coding methods is, on the
other hand, an entirely different matter.

Nov 14 '05 #8

Mabden

"kal" <k_*****@yahoo.com> wrote in message
news:a5**************************@posting.google.com...

"Régis Troadec" <re**@wanadoo.fr> wrote in message news:<c8**********@news-reader2.wanadoo.fr>...
I think it might not be the case and that this macro is used in a context where strings being compared differ very often in their first character,
otherwise, it's completely useless :)

<snip> Even in alpha strings the probability is 1/26 or even only 1/52.
<snip>
So the code is likely to be efficient in all except a few rare cases.

To say that one should eschew such obtuse coding methods is, on the
other hand, an entirely different matter.

Macro: #define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)

Assuming alphanumeric (1/62) comparing "this", with "that2", or "the other
thing3" would be doubling the comparisons since strcmp() is going to do it
as well. Is this a savings over the cost of a function call? Also, is it an
improvement worth all the programmer time talking about it?! You would have
to have a lot of very different strings being compared.

At least make the macro: #define STREQ(a, b) (*(a) == *(b) && strcmp((a+1),
(b+1)) == 0)

;-)

--
Mabden

Nov 14 '05 #9

Chris Torek

In article <news:VD****************@newssvr27.news.prodigy.com>
Mabden <ma****@sbcglobal.net> writes:

At least make the macro: #define STREQ(a, b) (*(a) == *(b) && strcmp((a+1),
(b+1)) == 0)

;-)

Ignoring the line-wrap issue :-) the problem is that this version
of the macro misbehaves:

    a = "hello";
    if (somecond())
        a = "";
    ...
    if (STREQ(a, ""))

What can we say about the values now passed to strcmp(), if strcmp()
is called? (Note that strcmp() is called only when *a == *b, and *b
is '\0'. Hence strcmp() is called only if *a == '\0', and the question
is about a+1 and ""+1.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Nov 14 '05 #10

Arthur J. O'Dwyer

On Wed, 19 May 2004, Mabden wrote:

> "kal" <k_*****@yahoo.com> wrote in message
>
>> Even in alpha strings the probability is 1/26 or even only 1/52. <snip>
>> So the code is likely to be efficient in all except a few rare cases.
>
> Macro: #define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)
>
> Assuming alphanumeric (1/62) comparing "this", with "that2", or "the other

s/62/26/
You've got things backwards in a couple of places below, too. Must be
one of those days. :)

> thing3" would be doubling the comparisons since strcmp() is going to do it
> as well.

In 1/26 of the cases, yes. In 25/26 of the cases, no, strcmp will
never get called, because the initial characters will differ. Thus
we are trading the cost of (26 comparisons and one call to strcmp) for
the cost of (26 calls to strcmp). It's likely that this is a good
trade, I think, although as the cost of a function call gets cheaper,
it becomes less and less of a good trade.

> At least make the macro: #define STREQ(a, b) (*(a) == *(b) && strcmp((a+1),
> (b+1)) == 0)

Unfortunately, this code is broken. What happens during STREQ("","")?
[Rhetorical question. Answer: the code breaks.]
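
The trade between cheap character comparisons and strcmp calls can be counted rather than argued. The following is a minimal sketch; find_with_precheck and its parameters are hypothetical names, and the function simply reports how many times strcmp was reached while searching a word list with the first-character pre-check in place.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: searches words[0..n-1] for key using the
 * first-character pre-check. Returns the index of the match or -1,
 * and stores in *calls the number of times strcmp was actually called. */
int find_with_precheck(const char *key, const char *const *words,
                       size_t n, size_t *calls)
{
    size_t i;

    *calls = 0;
    for (i = 0; i < n; i++) {
        if (*key != *words[i])
            continue;            /* cheap rejection, no function call */
        ++*calls;
        if (strcmp(key, words[i]) == 0)
            return (int)i;
    }
    return -1;
}
```

Without the pre-check, strcmp would be called once for every word examined; with it, only the words sharing the key's first character cost a call.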

-Arthur

Nov 14 '05 #11

Sam Dennis

Mabden wrote:

Macro: #define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)

At least make the macro: #define STREQ(a, b) (*(a) == *(b) && strcmp((a+1),
(b+1)) == 0)

UB if both arguments are empty strings. If one must perform such a
de-optimisation, this seems preferable:

#define STREQ(a, b) (*(a) == *(b) && strcmp((a) + !!*(a), (b) + !!*(a)))

--
++acr@,ka"

Nov 14 '05 #12

Mabden

"Sam Dennis" <sa*@malfunction.screaming.net> wrote in message
news:sl****************@ID-227112.user.uni-berlin.de...

Mabden wrote:
Macro: #define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)

At least make the macro: #define STREQ(a, b) (*(a) == *(b) && strcmp((a+1), (b+1)) == 0)

UB if both arguments are empty strings. If one must perform such a
de-optimisation, this seems preferable:

#define STREQ(a, b) (*(a) == *(b) && strcmp((a) + !!*(a), (b) + !!*(a)))

Not familiar with "UB". I'll assume it means, "You're wrong!"

If the strings are both empty, they are both \0 and the final part of the
macro won't be tried.

If they are random pointers (i.e. uninitialized) then I guess it would be
?UB?, you bet.
Maybe someone could tell me what this means in English:

strcmp((a) + !!*(a), (b) + !!*(a))

My try: compare the pointer to a added to not not the value of a with the
pointer to b added to the not not value of a again?

I don't get it.

--
Mabden

Nov 14 '05 #13

Leor Zolman

On Wed, 19 May 2004 22:56:22 GMT, "Mabden" <ma****@sbcglobal.net> wrote:

"Sam Dennis" <sa*@malfunction.screaming.net> wrote in message
news:sl****************@ID-227112.user.uni-berlin.de...
Mabden wrote:
> Macro: #define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)
>
> At least make the macro: #define STREQ(a, b) (*(a) == *(b) && strcmp((a+1), (b+1)) == 0)
UB if both arguments are empty strings. If one must perform such a
de-optimisation, this seems preferable:

#define STREQ(a, b) (*(a) == *(b) && strcmp((a) + !!*(a), (b) + !!*(a)))

Not familiar with "UB". I'll assume it means, "You're wrong!"

Sort of...it means "undefined behavior". You really want to avoid doing
anything that invokes UB, because once you've done that, "all bets are
off". You've given the compiler carte blanche to generate code to do just
about anything at all, and not be out of compliance with the language
standard. UB may manifest as "doing the expected thing", which is actually
your worst nightmare...it means you'll end up shipping code that seems to
work for you but segfaults for your customer (or your customer's client,
or...)

> If the strings are both empty, they are both \0 and the final part of the
> macro won't be tried.

If both pointers point to the same value, whether that's a NUL or not, then
the expression *(a) == *(b) is true, and it will go on to evaluate the
strcmp call.

> If they are random pointers (i.e. uninitialized) then I guess it would be
> ?UB?, you bet.

No, you get UB when you've advanced the pointers (formerly pointers to NUL
bytes) past the NUL, in the case where the NUL is actually the end of
whatever memory was allocated (which would typically be the case for
dynamically allocated strings). As soon as you make a pointer invalid, it
may as well just be a "random pointer" and you've got UB...before you even
attempt to dereference it. That was actually the subject of a Moby Thread
around here of late.

> Maybe someone could tell me what this means in English:
>
> strcmp((a) + !!*(a), (b) + !!*(a))

It's a rather cute trick. !!x produces 0 for any non-zero x, and 1 when x
is zero. Thus, the value of the first expression above (I'll dispense with
the macro-motivated parens for now) is either a (if *a is NUL) or a + 1 (if
*a is not NUL).
-leor

> My try: compare the pointer to a added to not not the value of a with the
> pointer to b added to the not not value of a again?
>
> I don't get it.

--
Leor Zolman --- BD Software --- www.bdsoft.com
On-Site Training in C/C++, Java, Perl and Unix
C++ users: download BD Software's free STL Error Message Decryptor at:
www.bdsoft.com/tools/stlfilt.html

Nov 14 '05 #14

Leor Zolman

On Wed, 19 May 2004 20:11:28 -0400, Leor Zolman <le**@bdsoft.com> wrote:

>> strcmp((a) + !!*(a), (b) + !!*(a))
> It's a rather cute trick. !!x produces 0 for any non-zero x, and 1 when x
> is zero.

I got it backwards. !!x produces 0 for x equal to 0, and 1 for non-zero x.
The part below I got right ;-)
-leor

> Thus, the value of the first expression above (I'll dispense with
> the macro-motivated parens for now) is either a (if *a is NUL) or a + 1 (if
> *a is not NUL).
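
The corrected rule for !! can be checked directly. The two functions below are hypothetical helpers written for this illustration; they isolate the `a + !!*a` part of the expression.

```c
/* Hypothetical helper: how far the expression (a) + !!*(a) advances a.
 * !!x is 0 when x is zero and 1 for any non-zero x. */
int skip_amount(const char *a)
{
    return !!*a;
}

/* Same idea, returning the advanced pointer: a for an empty string,
 * a + 1 otherwise, so the pointer never moves past the terminator. */
const char *skip_first(const char *a)
{
    return a + !!*a;
}
```

For an empty string the pointer stays put, which is exactly what keeps strcmp from reading past the terminating NUL.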

Nov 14 '05 #15

CBFalconer

Mabden wrote:

.... snip ...
Maybe someone could tell me what this means in English:

strcmp((a) + !!*(a), (b) + !!*(a))

if (a) points to a non-empty string (i.e. *(a) != '\0') then
compare the string pointed to by (a + 1) against that pointed to
by (b). Else compare the strings (a) and (b).

The key is that the operator !! converts a non-zero value to 1,
and a zero value is left alone.

It would seem to make more sense if the operator was plain !

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 14 '05 #16

Allin Cottrell

CBFalconer wrote:

Mabden wrote:

... snip ...
Maybe someone could tell me what this means in English:

strcmp((a) + !!*(a), (b) + !!*(a))

if (a) points to a non-empty string (i.e. *(a) != '\0') then
compare the string pointed to by (a + 1) against that pointed to
by (b). Else compare the strings (a) and (b).

I think you forgot to advance b in case *a != '\0'. In non-
obfuscated C the original translates to (wrapping it in a
function for completeness):

int weird_strcmp (const char *a, const char *b)
{
    if (*a != '\0') {
        a++;
        b++;
    }
    return strcmp(a, b);
}

Allin Cottrell

Nov 14 '05 #17

Sam Dennis

CBFalconer wrote:

if (a) points to a non-empty string (i.e. *(a) != '\0') then
compare the string pointed to by (a + 1) against that pointed to
by (b)
.... + 1
It would seem to make more sense if [the check was inverted]

So you think that reading beyond the end of both strings makes sense?
For the example I gave (a = "", b = "")?

--
++acr@,ka"

Nov 14 '05 #18

Sam Dennis

Sam Dennis wrote:

#define STREQ(a, b) (*(a) == *(b) && strcmp((a) + !!*(a), (b) + !!*(a)))

Oops: should be !strcmp( ... ), of course.
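
With that correction applied, the macro reads as below. This is just the posted macro with the negation added and a line continuation for width; it still evaluates its arguments more than once, so it is only safe with side-effect-free arguments.

```c
#include <string.h>

/* The macro from the earlier post with the missing '!' applied.
 * When the first characters match, both pointers advance by 1 unless
 * that character is the terminator, in which case neither advances. */
#define STREQ(a, b) \
    (*(a) == *(b) && !strcmp((a) + !!*(a), (b) + !!*(a)))
```

Using !!*(a) for both offsets is fine here because strcmp is reached only when *(a) == *(b), so both first characters are either zero or non-zero together.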

--
++acr@,ka"

Nov 14 '05 #19

J. J. Farrell

"Mabden" <ma****@sbcglobal.net> wrote in message news:<qC******************@newssvr29.news.prodigy.com>...

"Sam Dennis" <sa*@malfunction.screaming.net> wrote in message
news:sl****************@ID-227112.user.uni-berlin.de...
Mabden wrote:
Macro: #define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)

At least make the macro: #define STREQ(a, b) (*(a) == *(b) &&
strcmp((a+1), (b+1)) == 0)
UB if both arguments are empty strings. If one must perform such a
de-optimisation, this seems preferable:

#define STREQ(a, b) (*(a) == *(b) && strcmp((a) + !!*(a), (b) + !!*(a)))

Not familiar with "UB". I'll assume it means, "You're wrong!"

It means Undefined Behaviour.
If the strings are both empty, they are both \0 and the final part of the
macro won't be tried.

If that were the case then STREQ("Hello", "Harry") would give 1.
strcmp() is only called if the first characters of the string are
equal, as in STREQ("", "").

Nov 14 '05 #20

Leor Zolman

On Thu, 20 May 2004 02:30:32 +0000 (UTC), Sam Dennis
<sa*@malfunction.screaming.net> wrote:

Sam Dennis wrote:
#define STREQ(a, b) (*(a) == *(b) && strcmp((a) + !!*(a), (b) + !!*(a)))

Oops: should be !strcmp( ... ), of course.

That really underscores what I think is the most important point out of
this thread (which has already been articulated by Thomas Matthews): the
price for this chicanery is just too high, even if it results in a modicum
of performance increase (and it won't necessarily). I was going to say that
earlier, and then got so wrapped up in minutiae that I forgot to.

The farthest extent to which I've ever gone with this sort of thing is to
suggest to students that if the "reverse" sense of the return value from
strcmp is too disconcerting, wrap strcmp in a simple (non-"optimized")
functional version of the wrapper:

int streq(const char *s1, const char *s2)
{
    return !strcmp(s1, s2);
}

and be done with it.
-leor
--
Leor Zolman --- BD Software --- www.bdsoft.com
On-Site Training in C/C++, Java, Perl and Unix
C++ users: download BD Software's free STL Error Message Decryptor at:
www.bdsoft.com/tools/stlfilt.html

Nov 14 '05 #21

kal

Thomas Matthews <Th****************************@sbcglobal.net> wrote in message news:<PG*****************@newssvr32.news.prodigy.com>...

On many systems, the execution time saved by this expression
is negligible compared to the actual speed and observed speed
of a program. It is called premature optimization.

This is true today, but may not have been when the code
referred to by the OP (Original Poster) was written.

Even now there are instances where optimizations, even small
ones, are essential.

Since the code is implemented as a macro and the define is
presumably included in a header file, its impact on readability
is much less than it would otherwise be. I would pause a moment
to tip my hat to him who went before me and wrote that code.

<OT>
Try bubble sort (whose code is far more readable than that
of binary sort) on an array of, say, 1 million entries with
today's FAST computers.
</OT>

Nov 14 '05 #22

Mabden

"Leor Zolman" <le**@bdsoft.com> wrote in message
news:nq********************************@4ax.com...

On Thu, 20 May 2004 02:30:32 +0000 (UTC), Sam Dennis
<sa*@malfunction.screaming.net> wrote:
Sam Dennis wrote:
#define STREQ(a, b) (*(a) == *(b) && strcmp((a) + !!*(a), (b) +
!!*(a)))
Oops: should be !strcmp( ... ), of course.
That really underscores what I think is the most important point out of
this thread (which has already been articulated by Thomas Matthews): the
price for this chicanery is just too high, even if it results in a modicum
of performance increase (and it won't necessarily). I was going to say
that earlier, and then got so wrapped up in minutiae that I forgot to.

Agreed. This stuff is bad enough for those of us who consider ourselves
experts (notice how I weasel my way into that group), now have a newbie try
to make sense of it...

I mean, if you're worried about optimizations, why are you calling strcmp()
at all...
Mabden wrote:
At least make the macro:
#define STREQ(a, b) (*(a) == *(b) && strcmp((a+1), (b+1)) == 0)

Plus, it would only really help if you are reasonably certain that the two
strings will *usually* differ in the first character, so it would have to be
a specific data group. Hence my comment about calling the macros with the
second character (since you already know the first ones match, and if you
know the data that well - like part numbers or something - you would know
there's going to be a second char)

--
Mabden

Nov 14 '05 #23

Leor Zolman

On Thu, 20 May 2004 06:46:20 GMT, "Mabden" <ma****@sbcglobal.net> wrote:

Mabden wrote:
At least make the macro:
#define STREQ(a, b) (*(a) == *(b) && strcmp((a+1), (b+1)) == 0)

Plus, it would only really help if you are reasonably certain that the two
strings will *usually* differ in the first character, so it would have to be
a specific data group. Hence my comment about calling the macros with the
second character (since you already know the first ones match, and if you
know the data that well - like part numbers or something - you would know
there's going to be a second char)

And, in that case, you could go hog wild with something like:

#define LONG_STR_EQ(a,b) (*(a)==*(b) && (a)[1] == (b)[1] && \
(a)[2] == (b)[2] && ... && !strcmp(...)

But life's just too short.
-leor
--
Leor Zolman --- BD Software --- www.bdsoft.com
On-Site Training in C/C++, Java, Perl and Unix
C++ users: download BD Software's free STL Error Message Decryptor at:
www.bdsoft.com/tools/stlfilt.html

Nov 14 '05 #24

Dario (drinking coffee in the office…)

Leor Zolman wrote:

#define LONG_STR_EQ(a,b) (*(a)==*(b) && (a)[1] == (b)[1] && \
(a)[2] == (b)[2] && ... && !strcmp(...)

It is illegal in the following call:
LONG_STR_EQ("", "")

- Dario

Nov 14 '05 #25

Neil Cerutti

In article <a5**************************@posting.google.com>, kal wrote:

Thomas Matthews <Th****************************@sbcglobal.net> wrote in message news:<PG*****************@newssvr32.news.prodigy.com>...
On many systems, the execution time saved by this expression
is negligible compared to the actual speed and observed speed
of a program. It is called premature optimization.

This is true today, but may not have been when the code
referred to by the OP (Original Poster) was written.

Even now there are instances where optimizations, even small
ones, are essential.

Since the code is implemented as a macro and the define is
presumably included in a header file, its impact on readability
is much less than it would otherwise be. I would pause a moment
to tip my hat to him who went before me and wrote that code.

<OT>
Try bubble sort (whose code is far more readable than that
of binary sort) on an array of, say, 1 million entries with
today's FAST computers.
</OT>

Choosing an appropriate algorithm is more important than
optimization of that algorithm.

--
Neil Cerutti
"The barbarian seated himself upon a stool at the wenches side, exposing
his body, naked save for a loin cloth brandishing a long steel broad
sword..." --The Eye of Argon

Nov 14 '05 #26

Leor Zolman

On Thu, 20 May 2004 14:52:14 +0200, "Dario (drinking coffee in the office…)"
<da***@despammed.com> wrote:

Leor Zolman wrote:
#define LONG_STR_EQ(a,b) (*(a)==*(b) && (a)[1] == (b)[1] && \
(a)[2] == (b)[2] && ... && !strcmp(...)

It is illegal in the following call:
LONG_STR_EQ("", "")

- Dario

Please have some more coffee, then read the last part of the post I was
replying to and consider it in context (I notice you left out the line I
wrote just before showing that code...)

Thanks,
-leor
--
Leor Zolman --- BD Software --- www.bdsoft.com
On-Site Training in C/C++, Java, Perl and Unix
C++ users: download BD Software's free STL Error Message Decryptor at:
www.bdsoft.com/tools/stlfilt.html

Nov 14 '05 #27

Dario (drinking coffee in the office…)

Leor Zolman wrote:

On Thu, 20 May 2004 14:52:14 +0200, "Dario (drinking coffee in the office…)"
<da***@despammed.com> wrote:
Leor Zolman wrote:
#define LONG_STR_EQ(a,b) (*(a)==*(b) && (a)[1] == (b)[1] && \
(a)[2] == (b)[2] && ... && !strcmp(...)
It is illegal in the following call:
LONG_STR_EQ("", "")

- Dario

> Please have some more coffee,

Yes, I do...

> then read the last part of the post I was
> replying to and consider it in context
> (I notice you left out the line I
> wrote just before showing that code...)

Read: OK!

> Thanks,

Don't mention it.

> -leor

- Dario

Nov 14 '05 #28

Arthur J. O'Dwyer

On Wed, 19 May 2004, Leor Zolman wrote:

On Thu, 20 May 2004 02:30:32 +0000 (UTC), Sam Dennis wrote:
Sam Dennis wrote:
#define STREQ(a, b) (*(a) == *(b) && strcmp((a) + !!*(a), (b) + !!*(a)))
Oops: should be !strcmp( ... ), of course.

That really underscores what I think is the most important point out of
this thread (which has already been articulated by Thomas Matthews): the
price for this chicanery is just too high, even if it results in a modicum
of performance increase (and it won't necessarily). I was going to say that
earlier, and then got so wrapped up in minutiae that I forgot to.

What you're missing (well, I'm sure you're not really missing it,
but you're glossing over it) is that this "chicanery" is not scattered
through the OP's code, but rather stuck behind a very sensibly-named
macro in a sensible part of the program. The programmer never needs
to know how it works, any more than he needs to know how 'qsort' is
optimized to deal with special cases of *its* input. They're both
library functions, conceptually, and if you don't want to know why
the macro works, nobody's forcing you to do all those !!s in your head.
:)
The farthest extent to which I've ever gone with this sort of thing is to
suggest to students that if the "reverse" sense of the return value from
strcmp is too disconcerting, wrap strcmp in a simple (non-"optimized")
functional version of the wrapper:

int streq(const char *s1, const char *s2)
{
    return !strcmp(s1, s2);
}

and be done with it.

Except for the namespace invasion, this is decent advice. This
is why in my programs I always include two macros right up at the
top:

#define steq(x,y) (!strcmp(x,y))
#define stneq(x,y) (!steq(x,y))

If I wanted, I could change that to the OP's

#define steq(x,y) (*(x)==*(y) && !strcmp(x,y))

in theory without any loss of sleep. In practice, that would
lose me a lot of sleep, because I know that I use 'steq' heavily
to parse arguments out of 'argv', and there's always the chance
I might have written somewhere

if (steq(argv[i], "--output-file") && steq(argv[++i], "-"))
    OutputFile = stdout;

(I doubt it, though, because that would lose the filename if it
weren't "-", and that seems like a silly thing to do.)
This double-evaluation is the biggest danger of the OP's macro; the
tricky negations and "chicanery" have nothing to do with it as far
as I'm concerned.
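
The double evaluation can be observed with an argument that has a visible side effect. In this sketch, fetch and fetch_count are hypothetical names; fetch just returns its argument and counts how often it was called, standing in for something like argv[++i].

```c
#include <string.h>

/* The "optimized" form of the macro, as quoted above. */
#define steq(x,y) (*(x)==*(y) && !strcmp(x,y))

/* Hypothetical argument source with a visible side effect: every call
 * returns the same string but bumps a counter. */
int fetch_count = 0;

const char *fetch(const char *s)
{
    fetch_count++;
    return s;
}
```

Evaluating `steq(fetch("-"), "-")` calls fetch twice, once for the first-character test and once inside strcmp; when the first characters differ it is called only once. With the plain `(!strcmp(x,y))` definition it would always be called exactly once.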

-Arthur

Nov 14 '05 #29

Michael Wojcik

In article <nq********************************@4ax.com>, Leor Zolman <le**@bdsoft.com> writes:

The farthest extent to which I've ever gone with this sort of thing is to
suggest to students that if the "reverse" sense of the return value from
strcmp is too disconcerting, wrap strcmp in a simple (non-"optimized")
functional version of the wrapper:

There's also the macro which I believe Peter van der Linden gives in
_Expert C Programming_ (though I can't seem to find it there), along
the lines of:

#define CMPSTR(s1, op, s2) (strcmp(s1, s2) op 0)

which is used as in:

if (CMPSTR(word, ==, "hello"))
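
A slightly fuller sketch of how such a macro might be used follows. The classify function is hypothetical and only illustrates that any relational operator can be passed as the middle argument.

```c
#include <string.h>

#define CMPSTR(s1, op, s2) (strcmp(s1, s2) op 0)

/* Hypothetical usage: classify a word relative to "hello".
 * Returns 0 if equal, -1 if it sorts before, 1 if it sorts after. */
int classify(const char *word)
{
    if (CMPSTR(word, ==, "hello"))
        return 0;
    return CMPSTR(word, <, "hello") ? -1 : 1;
}
```

Since strcmp appears only once in the expansion, each string argument is evaluated a single time, unlike the first-character variants discussed earlier.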

--
Michael Wojcik mi************@microfocus.com

"We are facing a dire shortage of clowns," said Erickson, also known as
Jingles.

Nov 14 '05 #30

Mabden

"Michael Wojcik" <mw*****@newsguy.com> wrote in message
news:c8*********@news2.newsguy.com...

There's also the macro which I believe Peter van der Linden gives in
_Expert C Programming_ (though I can't seem to find it there), along
the lines of:

I think we are now getting into the realm of redefining the C language.

If the result of all these fancy macros is to rewrite strcmp() then I think
we need to step back and realize that all the C books have a page on
strcmp() but none have CMPSTR or STREQ or STRNEQ or LONG_STR_EQ or whatever.

What is the point of having a terse, manageable language like C and
cluttering it up with crappy macros that only save one character comparison.
Perhaps there _was_ a time when this was a viable, necessary activity. It no
longer is.

Stick to the known functions. Profile your code if it's slow. Then spend the
$150 to upgrade the damn machine, you cheap ass bastard!

--
Mabden

Nov 14 '05 #31

Leor Zolman

On Thu, 20 May 2004 10:06:57 -0400 (EDT), "Arthur J. O'Dwyer"
<aj*@nospam.andrew.cmu.edu> wrote:

On Wed, 19 May 2004, Leor Zolman wrote:

On Thu, 20 May 2004 02:30:32 +0000 (UTC), Sam Dennis wrote:
>Sam Dennis wrote:
>> #define STREQ(a, b) (*(a) == *(b) && strcmp((a) + !!*(a), (b) + !!*(a)))
>
>Oops: should be !strcmp( ... ), of course.

That really underscores what I think is the most important point out of
this thread (which has already been articulated by Thomas Matthews): the
price for this chicanery is just too high, even if it results in a modicum
of performance increase (and it won't necessarily). I was going to say that
earlier, and then got so wrapped up in minutiae that I forgot to.

What you're missing (well, I'm sure you're not really missing it,
but you're glossing over it) is that this "chicanery" is not scattered
through the OP's code, but rather stuck behind a very sensibly-named
macro in a sensible part of the program. The programmer never needs
to know how it works, any more than he needs to know how 'qsort' is
optimized to deal with special cases of *its* input. They're both
library functions, conceptually, and if you don't want to know why
the macro works, nobody's forcing you to do all those !!s in your head.
:)

Okay, once you have a debugged, correct macro (not like the "optimized"
version of STREQ we've been talking about here, since the unintended
side-effects issue relegates anything like that to the fringes), no one
would need to look at its implementation.
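
The side-effects issue can be sketched roughly as follows, assuming the STREQ definition from the start of the thread. get_word and count_evaluations are hypothetical names invented for the illustration; the point is only that the macro mentions its first argument twice.

```c
#include <string.h>

/* STREQ as posted at the start of the thread. */
#define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)

static int calls = 0;   /* hypothetical counter, for illustration only */

/* Hypothetical argument expression with a visible side effect. */
static const char *get_word(void)
{
    ++calls;
    return "hello";
}

/* Returns how many times the first macro argument was evaluated. */
int count_evaluations(const char *other)
{
    calls = 0;
    (void)STREQ(get_word(), other);
    return calls;
}
```

When the first characters differ, && short-circuits and the argument is evaluated once; when they match, it is evaluated a second time inside strcmp. An argument such as p++ would therefore advance by one or by two depending on the data.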

However, I was thinking more in terms of the cost of using implementation
techniques like this in new development. Until you can abstract it away,
you'll be paying the price to develop it. And eventually /someone/ will
probably be put in the position of having to understand it again, for one
reason or another, and then the price would only go up...

The farthest extent to which I've ever gone with this sort of thing is to
suggest to students that if the "reverse" sense of the return value from
strcmp is too disconcerting, wrap strcmp in a simple (non-"optimized")
functional version of the wrapper:

#include <string.h>

int streq(const char *s1, const char *s2)
{
    return !strcmp(s1, s2);
}

and be done with it.

Except for the namespace invasion, this is decent advice.

Can you elaborate on "namespace invasion" here? Sorry, I don't know what
you mean.
-leor


Nov 14 '05 #32

Sam Dennis

Leor Zolman wrote:

On Wed, 19 May 2004, Leor Zolman wrote:
int streq(const char *s1, const char *s2)

Can you elaborate on "namespace invasion" here?

Functions beginning with str and a lowercase letter are reserved for
future expansion of the standard library (<string.h> and <stdlib.h>,
but streq has external linkage here, so it'll be undefined behaviour
regardless.)

There are a few other such names and namespaces listed under `Future
library directions' in the Standard. (is|to)[a-z] and E[A-Z0-9] are
particularly noteworthy, along with mem[a-z], also for <string.h>.
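
One way to stay out of that reserved space, as a sketch only: pick a name in which "str" is not immediately followed by a lowercase letter. The name str_eq below is an assumption chosen for illustration, not something proposed earlier in the thread.

```c
#include <string.h>

/* str_eq is a hypothetical name: "str" is followed by an underscore,
   not a lowercase letter, so it falls outside the reserved pattern. */
int str_eq(const char *s1, const char *s2)
{
    return strcmp(s1, s2) == 0;
}
```

Like strcmp itself, this must not be passed null pointers.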

--
++acr@,ka"

Nov 14 '05 #33

Leor Zolman

On Fri, 21 May 2004 04:38:56 +0000 (UTC), Sam Dennis
<sa*@malfunction.screaming.net> wrote:

Leor Zolman wrote:
On Wed, 19 May 2004, Leor Zolman wrote:
int streq(const char *s1, const char *s2)

Can you elaborate on "namespace invasion" here?

Functions beginning with str and a lowercase letter are reserved for
future expansion of the standard library (<string.h> and <stdlib.h>,
but streq has external linkage here, so it'll be undefined behaviour
regardless.)

There are a few other such names and namespaces listed under `Future
library directions' in the Standard. (is|to)[a-z] and E[A-Z0-9] are
particularly noteworthy, along with mem[a-z], also for <string.h>.

Thanks, I just remembered about that this morning. I don't know why I have
such a mental block on that particular aspect of the Standard; perhaps it
just seems completely counter-intuitive to me for it to reserve /any/
arbitrary "ordinary-looking" sequence of initial characters. Well, at least
this time it will probably have finally sunk in...
-leor

Nov 14 '05 #34

Michael Wojcik

In article <mR*******************@newssvr29.news.prodigy.com>, "Mabden" <ma****@sbcglobal.net> writes:

"Michael Wojcik" <mw*****@newsguy.com> wrote in message
news:c8*********@news2.newsguy.com...

There's also the macro which I believe Peter van der Linden gives in
_Expert C Programming_ (though I can't seem to find it there), along
the lines of:

I think we are now getting into the realm of redefining the C language.

No, since no one has suggested adding any of these to the standard.
We're discussing using the C language, of which function-type macros
are a part.

If the result of all these fancy macros is to rewrite strcmp() then I think
we need to step back and realize that all the C books have a page on
strcmp() but none have CMPSTR or STREQ or STRNEQ or LONG_STR_EQ or whatever.

While some of the macros posted attempt to eliminate strcmp calls in
some cases, I haven't seen one that rewrote strcmp.

And some C books do discuss macros that wrap strcmp. That's what the
text you quoted from my post says, in fact.

What is the point of having a terse, manageable language like C and
cluttering it up with crappy macros that only save one character comparison.

The macro I posted had nothing to do with "sav[ing] one character
comparison". Did you read it? Is this comment in any way relevant
to my post?

And the point of macros in C is and has always been to simplify
development and maintenance of source code. Using macros for this
purpose is not trivial and there is much disagreement on how best
to do it, but that is the point. A macro aims to give a more
meaningful name to a value or a (hopefully short) segment of code;
as such, it should provide more information than what it replaces,
and thereby *increase* terseness and manageability, the goals you
claim for C.

Stick to the known functions.

I'd like to see how you'd implement a significant project in C
with only the standard library functions. No functions of your
own, nothing added by the implementation.

I have no love for the "avoid a call if the first character differs"
macro that started this thread - if a program makes sufficient
calls to strcmp that it becomes necessary to optimize some of them
away, it's almost certainly a candidate for redesign. But using
that as an argument to eliminate strcmp wrappers entirely is silly.

--
Michael Wojcik mi************@microfocus.com

Do not "test" parts, as this may compromise sensitive joinery. Those who
suffer difficulty should abandon the enterprise immediately. -- Chris Ware

Nov 14 '05 #35

Mabden

"Michael Wojcik" <mw*****@newsguy.com> wrote in message
news:c8*********@news3.newsguy.com...

I have no love for the "avoid a call if the first character differs"
macro that started this thread - if a program makes sufficient
calls to strcmp that it becomes necessary to optimize some of them
away, it's almost certainly a candidate for redesign. But using
that as an argument to eliminate strcmp wrappers entirely is silly.

Ah, good then we agree.

--
Mabden

Nov 14 '05 #36

Richard Bos

"Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> wrote:

On Wed, 19 May 2004, Mabden wrote:

"kal" <k_*****@yahoo.com> wrote in message

Even in alpha strings the probability is 1/26 or even only 1/52. So the code is likely to be efficient in all except a few rare cases.
Macro: #define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)

Assuming alphanumeric (1/62) comparing "this", with "that2", or "the other

s/62/26/

No, he said alpha_numeric_. That is, a-z plus A-Z plus 0-9 is 62 chars.

thing3" would be doubling the comparisons since strcmp() is going to do it
as well.

In 1/26 of the cases, yes. In 25/26 of the cases, no, strcmp will
never get called, because the initial characters will differ. Thus
we are trading the cost of (26 comparisons and one call to strcmp) for
the cost of (26 calls to strcmp). It's likely that this is a good
trade, I think, although as the cost of a function call gets cheaper,
it becomes less and less of a good trade.

Especially since strcmp() is simple enough to be a likely candidate for
inlining.
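
A rough way to see the trade on real data rather than on assumed probabilities is to count how often the macro actually reaches strcmp. The sketch below is an assumption-laden illustration: counted_strcmp and calls_for_search are hypothetical names, and STREQ is redefined here to go through the counting wrapper, which is not how the original macro is written.

```c
#include <stddef.h>
#include <string.h>

static size_t strcmp_calls;   /* hypothetical counter */

/* Hypothetical wrapper that records every real strcmp call. */
static int counted_strcmp(const char *a, const char *b)
{
    ++strcmp_calls;
    return strcmp(a, b);
}

/* Same shape as the thread's STREQ, but routed through the counter. */
#define STREQ(a, b) (*(a) == *(b) && counted_strcmp((a), (b)) == 0)

/* Compares key against n words and returns how many times
   strcmp was actually called. */
size_t calls_for_search(const char *key, const char *const *words, size_t n)
{
    strcmp_calls = 0;
    for (size_t i = 0; i < n; ++i)
        (void)STREQ(key, words[i]);
    return strcmp_calls;
}
```

With words that mostly share a first letter (the "this", "that2", "the other thing3" case), nearly every comparison still reaches strcmp, so the extra test buys nothing. Whether it helps elsewhere depends on the data and on whether the compiler inlines strcmp, which is something to measure rather than assume.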

Richard

Nov 14 '05 #37
