string comparison

yadurajj

Hello i am newbie trying to learn C..I need to know about string
comparisons in C, without using a library function,...recently I was
asked this in an interview..I can write a small program but I was told
that wouldn't it be wise to first get the length of the strings..if it
doesn't match then they are not the same..I agreed...then he said..but
again that would be an overhead first measuring the length...and then
doing a character by character comparison...I was confused and was
wondering if anybody has an answer to this theory. I am only trying to
undestand...

thanks

Nov 14 '05 #1

Subscribe Post Reply

5077

SM Ryan

ya******@yahoo.com wrote:
# Hello i am newbie trying to learn C..I need to know about string
# comparisons in C, without using a library function,...recently I was
# asked this in an interview..I can write a small program but I was told
# that wouldn't it be wise to first get the length of the strings..if it

The interviewer is either a jackass (not uncommon) or trying to trick
you (also not uncommon). In general the only way to measure strings
in C is to scan them from the first character to the null character.
If you measure the strings first you're going to end up doing the same
scan twice.

In ANSI C, a string comparison function can look something like
int compare(char *a,char *b) {
int cc = 0; while (cc==0 && *a && *b) cc = *a++ - *b++;
if (cc!=0) ;
else if (*a) cc = 1;
else if (*b) cc = -1;
return cc;
}
Sometimes you can compare words in machine dependent fashion and get
a four or eight time speed up.

--
SM Ryan http://www.rawbw.com/~wyrmwif/
You hate people.
But I love gatherings. Isn't it ironic.

Nov 14 '05 #2

Lew Pitcher

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

ya******@yahoo.com wrote:

Hello i am newbie trying to learn C..I need to know about string
comparisons in C, without using a library function,...recently I was
asked this in an interview..I can write a small program but I was told
that wouldn't it be wise to first get the length of the strings..if it
doesn't match then they are not the same..I agreed...then he said..but
again that would be an overhead first measuring the length...and then
doing a character by character comparison...I was confused and was
wondering if anybody has an answer to this theory. I am only trying to
undestand...

Consider this: in C, a string is just an array of char, with zero or more
significant characters, and a '\0' character which terminates the string.

The length of the string is the count of the significant characters, up to
(but not including) the '\0' character.

It is true that, if the length of one string is different from the length of
another string, then the strings are different.

However, all that is really saying is that, in a character-by-character
comparison, where one string has a '\0', the other does not.

So, yes, it would be ideal to determine the lengths of the strings, but you
can only do so through a character-by-character inspection of each string.
If you are already going to examine the strings character-by-character,
comparing each character position to '\0' (to find the end, and thus the
length, of each string), there is no reason why you should not compare each
character of one string to the correspondingly placed character of the other.

Indeed, identical strings will compare identically up to and including the
'\0' character of each string. Different strings will compare identical up to
the first character difference, which may include the '\0' of one string
comparing unequal to the corresponding character of another string.
A visual example might help

String 1 A B C \0
: : : : Identical
String 2 A B C \0

or

String 1 A B C \0
: : : x Different
String 2 A B C D \0

or

String 1 A B Q \0
: : x Different
String 2 A B C \0

or even

String 1 A B Q \0
x Different
String 2 Z Y X \0

- --
Lew Pitcher

Master Codewright & JOAT-in-training | GPG public key available on request
Registered Linux User #112576 (http://counter.li.org/)
Slackware - Because I know what I'm doing.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFCuM2nagVFX4UWr64RApU+AJ4lI1+ui+FjPPN29OMYuk iDF/QVsQCeIv9k
EhRxPiJaixr9YT9PivVRUkw=
=EpUj
-----END PGP SIGNATURE-----

Nov 14 '05 #3

Lew Pitcher

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

SM Ryan wrote:
[snip]

In ANSI C, a string comparison function can look something like
int compare(char *a,char *b) {
int cc = 0; while (cc==0 && *a && *b) cc = *a++ - *b++;
if (cc!=0) ;
else if (*a) cc = 1;
else if (*b) cc = -1;
return cc;
}

Or even

int compare_strings(char *a, char *b)
{
while (*a++ == *b++) ;
return (int)(*a-*b);
}

- --
Lew Pitcher

Master Codewright & JOAT-in-training | GPG public key available on request
Registered Linux User #112576 (http://counter.li.org/)
Slackware - Because I know what I'm doing.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFCuNMSagVFX4UWr64RAmRkAKDbXBeL3v3qA5Q0zWfjee 9pgU4LGgCg4u5y
yykp27lAPLGzgEtoXL7eTYw=
=FAF0
-----END PGP SIGNATURE-----

Nov 14 '05 #4

kaby

I suggest you may use *long* instead of *char* to get full capability
of register.
and keep one eye on snaping 32bit and 8bit.

Nov 14 '05 #5

Peter Nilsson

Lew Pitcher wrote:

SM Ryan wrote:
[snip]
In ANSI C, a string comparison function can look something like
int compare(char *a,char *b) {
int cc = 0; while (cc==0 && *a && *b) cc = *a++ - *b++;
if (cc!=0) ;
else if (*a) cc = 1;
else if (*b) cc = -1;
return cc;
}

Or even

int compare_strings(char *a, char *b)
{
while (*a++ == *b++) ;
return (int)(*a-*b);
}

I'd mark both of these as wrong. Neither replicates strcmp(). ;)

--
Peter

Nov 14 '05 #6

CBFalconer

Lew Pitcher wrote:

.... snip ...
Or even

int compare_strings(char *a, char *b)
{
while (*a++ == *b++) ;
return (int)(*a-*b);
}

There is a winged insect crawling over that :-)

--
Some informative links:
news:news.announce.newusers
http://www.geocities.com/nnqweb/
http://www.catb.org/~esr/faqs/smart-questions.html
http://www.caliburn.nl/topposting.html
http://www.netmeister.org/news/learn2quote.html

Nov 14 '05 #7

Morris Dovey

CBFalconer wrote:

Lew Pitcher wrote:

... snip ...

Or even

int compare_strings(char *a, char *b)
{
while (*a++ == *b++) ;
return (int)(*a-*b);
}

There is a winged insect crawling over that :-)

I don't think that bug has wings... ;-)

--
Morris Dovey
DeSoto Solar
DeSoto, Iowa USA
http://www.iedu.com/DeSoto

Nov 14 '05 #8

Lawrence Kirby

On Tue, 21 Jun 2005 19:03:16 -0700, yadurajj wrote:

Hello i am newbie trying to learn C..I need to know about string
comparisons in C, without using a library function,...recently I was
asked this in an interview..I can write a small program but I was told
that wouldn't it be wise to first get the length of the strings..if it
doesn't match then they are not the same.
That's a possible strategy if you are only testing for equality. But if
you also need to determine which of 2 unequal strings is greater then it
doesn't help.
.I agreed...then he said..but
again that would be an overhead first measuring the length...and then
doing a character by character comparison...I was confused and was
wondering if anybody has an answer to this theory. I am only trying to
undestand...

You have to consider whether the strings are likely to differ near the
start. If you have a 100 character string and they differed in the 2nd
character then determining the length would be a very costly operation
compared to just comparing the character sequences. And strings usually
do differ in the first couple of characters unless there is a strong
relationship between them already.

Lawrence

Nov 14 '05 #9

Netocrat

On Tue, 21 Jun 2005 22:54:11 -0700, Peter Nilsson wrote:

Lew Pitcher wrote:
SM Ryan wrote:
[snip]
> In ANSI C, a string comparison function can look something like
> int compare(char *a,char *b) {
> int cc = 0; while (cc==0 && *a && *b) cc = *a++ - *b++; if (cc!=0) ;
> else if (*a) cc = 1;
> else if (*b) cc = -1;
> return cc;
> }
> }

Or even

int compare_strings(char *a, char *b) {
while (*a++ == *b++) ;
return (int)(*a-*b);
}
}

I'd mark both of these as wrong. Neither replicates strcmp(). ;)

The bug in the second is easy to spot, and from what I can tell the first
compare() differs from strcmp() in that strcmp() treats its arguments as
unsigned char *, whereas compare() treats them as they are when passed in:
char *.

Is that what you were referring to? If so, the problem seems to be solved
by changing the first line and adding two:
int compare(char *a,char *b) {
becomes
int compare(char *_a,char *_b) {
unsigned char *a = (unsigned char *)_a;
unsigned char *b = (unsigned char *)_b;

I'm just going by my gcc implementation - I don't know whether the
standard requires strcmp to behave this way...

Nov 14 '05 #10

Lawrence Kirby

On Tue, 21 Jun 2005 22:04:32 -0700, kaby wrote:

I suggest you may use *long* instead of *char* to get full capability
of register.
and keep one eye on snaping 32bit and 8bit.

Unless the OP is already familiar with the technique I think you are
talking about this isn't likely to mean very much. I am familiar with
techniques for handling character data a word at a time, but even then
this doesn't make an awful lot of sense. :-)

I wouldn't worry about this for the purposes of an interview question,
especially when the issue is whether to determine the length of the string
first.

Lawrence

Nov 14 '05 #11

Lawrence Kirby

On Wed, 22 Jun 2005 20:48:09 +1000, Netocrat wrote:

On Tue, 21 Jun 2005 22:54:11 -0700, Peter Nilsson wrote:
Lew Pitcher wrote:
SM Ryan wrote:
[snip]
> In ANSI C, a string comparison function can look something like
> int compare(char *a,char *b) {
> int cc = 0; while (cc==0 && *a && *b) cc = *a++ - *b++; if (cc!=0) ;
> else if (*a) cc = 1;
> else if (*b) cc = -1;
These don't take into account the relative values of *a and *b so are
wrong.
> return cc;
> }
> }
Or even

int compare_strings(char *a, char *b) {
while (*a++ == *b++) ;
return (int)(*a-*b);
}
} I'd mark both of these as wrong. Neither replicates strcmp(). ;)

The bug in the second is easy to spot,

Which of the bugs in the second are you referring to? :-)

It will walk off the end of identical strings.

There are 2 instances of an off-by-one index error in the return
statement.

C doesn't guarantee that char has a smaller range than int so *a-*b could
overflow. You can usually get away with this in practice but at least a
comment is in order.

Since char values can be negative it has a curious property that a
longer string can compare less than a shorter string that is a prefix of
it (assuming a trivial fix for bug 2).
and from what I can tell the first
compare() differs from strcmp() in that strcmp() treats its arguments as
unsigned char *, whereas compare() treats them as they are when passed in:
char *.
True. And using unsigned char eliminates the "curious property".
Is that what you were referring to? If so, the problem seems to be
solved
by changing the first line and adding two:
int compare(char *a,char *b) {
becomes
int compare(char *_a,char *_b) {
unsigned char *a = (unsigned char *)_a; unsigned char *b = (unsigned
char *)_b;

I'm just going by my gcc implementation - I don't know whether the
standard requires strcmp to behave this way...

strcmp() is defined to act on unsigned char values.

Lawrence

Nov 14 '05 #12

websnarf

ya******@yahoo.com wrote:

Hello i am newbie trying to learn C..I need to know about string
comparisons in C, without using a library function,...recently I was
asked this in an interview..I can write a small program but I was told
that wouldn't it be wise to first get the length of the strings..if it
doesn't match then they are not the same..I agreed...then he said..but
again that would be an overhead first measuring the length...
It is *highly* unlikely that this comment was carried to its logical
conclusion. If he's talking about "performance", then there is a lot
more to this question than this.
[...] and then
doing a character by character comparison...I was confused and was
wondering if anybody has an answer to this theory. I am only trying to
undestand...

Ok, first of all, if the lengths are intrinsically available to you in
the first place (in most situations, when you are in control of all the
code and data formats, this is generally trivial to guarantee), then
sure, you can and perhaps should first compare the lengths, afterwhich
you want to perform the equivalent of a memcmp().

If you don't have the length, and your strings are just '\0' terminated
char * strings, then performing strlen()'s first and precomparing will
never be better in terms of performance. The reason is that the whole
cost of string comparison is the loop overhead, and memory traversal.
By calling strlen twice, you are basically (roughly) tripling the loop
overhead, and doubling the memory traversal cost.

In terms of amount of work done, the moment you have enough information
to know the two lengths are different, is the same moment you can know
that the substance of the two strings are different.

Ok, so here are some remaining questions:

1) How should one implement a standard strcmp? -- as I recall, this is
like 3 lines of code.

2) How might one implement a standard memcmp? (On the assumption that
lengths are always available to you, since it basically costs nothing
for this to be the case.) Doing this naively is easy; but trying to
take advantage of low-level hardware characteristics, such as
alignment, this can be hard if you are really intent on squeezing out
the maximum performance.

3) Is this string comparison isolated? For example, let us say that
you are performing a string comparison for the purpose of inserting
into a hash table to avoid inserting duplicates. Well in this case a
scalar "trace" of the string is precomputed anyways in the form of its
hash function mapping. So if you are store the hash values along with
each string then doing this pre-compare (like the length pre-compare,
except you should expect this to have far fewer false positives) will
improve average comparison performance.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 14 '05 #13

Lew Pitcher

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Lew Pitcher wrote:

SM Ryan wrote:
[snip]
In ANSI C, a string comparison function can look something like
int compare(char *a,char *b) {
int cc = 0; while (cc==0 && *a && *b) cc = *a++ - *b++;
if (cc!=0) ;
else if (*a) cc = 1;
else if (*b) cc = -1;
return cc;
}

Or even

int compare_strings(char *a, char *b)
{
while (*a++ == *b++) ;
return (int)(*a-*b);
}

Gak!!

That's what I get for off-the-cuff coding when I'm tired.

That /should/ have been
int compare_strings(char *a, char *b)
{
while (*b && (*a++ == *b++)) ;
return (int)(*a-*b);
}

- --

Lew Pitcher, IT Specialist, Enterprise Data Systems
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed here are my own, not my employer's)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)

iD8DBQFCuVG5agVFX4UWr64RAvPTAJwKHRRb0JTB+zxtf8iHgr AAfRUmpwCfYm8O
QFVzBgyX10UnnW68nJ6iOeU=
=JwjB
-----END PGP SIGNATURE-----

Nov 14 '05 #14

Netocrat

On Wed, 22 Jun 2005 13:28:56 +0100, Lawrence Kirby wrote:

On Wed, 22 Jun 2005 20:48:09 +1000, Netocrat wrote:
On Tue, 21 Jun 2005 22:54:11 -0700, Peter Nilsson wrote:
Lew Pitcher wrote:
SM Ryan wrote:
[snip]
> In ANSI C, a string comparison function can look something like
> int compare(char *a,char *b) {
> int cc = 0; while (cc==0 && *a && *b) cc = *a++ - *b++; if (cc!=0)
> ; else if (*a) cc = 1;
> else if (*b) cc = -1;
These don't take into account the relative values of *a and *b so are
wrong.
At the risk of sounding clueless... I'm uncertain of firstly what you are
referring to by 'these' and secondly in any event how the relative values
aren't being taken into account. Perhaps by the second you mean that *a
and *b need to be treated as unsigned char rather than char, as you
confirm is the case later in your post?

> return cc;
> }
> }
Or even

int compare_strings(char *a, char *b) {
while (*a++ == *b++) ;
return (int)(*a-*b);
}
}
I'd mark both of these as wrong. Neither replicates strcmp(). ;)

The bug in the second is easy to spot,

Which of the bugs in the second are you referring to? :-)

It will walk off the end of identical strings.

That's where I stopped looking...
There are 2 instances of an off-by-one index error in the return
statement.

C doesn't guarantee that char has a smaller range than int so *a-*b
could overflow. You can usually get away with this in practice but at
least a comment is in order.

Since char values can be negative it has a curious property that a
longer string can compare less than a shorter string that is a prefix of
it (assuming a trivial fix for bug 2).
Here's a fix then (if this were for real-life code I'd rewrite it so the
parameters were a and b rather than _a and _b but this way involves
minimal change...):

int compare_strings(char *_a, char *_b) {
unsigned char *a = (unsigned char *)_a;
unsigned char *b = (unsigned char *)_b;
while (*a && *b && *a == *b) {
a++;
b++;
}
return (int)*a-(int)*b;
}

and from what I can tell the first
compare() differs from strcmp() in that strcmp() treats its arguments
as unsigned char *, whereas compare() treats them as they are when
passed in: char *.

True. And using unsigned char eliminates the "curious property".

Truth be told that "curious property" is the means by which I discovered
the bug. For any string of printable characters the original function is
fine but I didn't doubt that Peter had a reason for quibbling with it, so
I dug deeper...

[snip]
strcmp() is defined to act on unsigned char values.

Cheers for the confirmation.

Nov 14 '05 #15

Netocrat

On Wed, 22 Jun 2005 07:55:37 -0400, Lew Pitcher wrote:

Lew Pitcher wrote:
SM Ryan wrote:
[snip]
In ANSI C, a string comparison function can look something like
int compare(char *a,char *b) {
int cc = 0; while (cc==0 && *a && *b) cc = *a++ - *b++; if (cc!=0) ;
else if (*a) cc = 1;
else if (*b) cc = -1;
return cc;
}
}
Or even

int compare_strings(char *a, char *b) {
while (*a++ == *b++) ;
return (int)(*a-*b);
}
}

Gak!!

That's what I get for off-the-cuff coding when I'm tired.

That /should/ have been

Hate to do this to you, but I think you need more sleep :P

The increments should occur within the loop, not in the test, and as has
been discussed in other posts, the comparison should be done treating the
chars as unsigned. Finally Lawrence Kirby pointed out that strictly
speaking char is not guaranteed to have smaller range than int, so the
return statement is not necessarily correct, but usually will be.
int compare_strings(char *a, char *b) {
while (*b && (*a++ == *b++)) ;
return (int)(*a-*b);
}

Perhaps:

int compare_strings(char *a, char *b) {
unsigned char *ua = (unsigned char *)a;
unsigned char *ub = (unsigned char *)b;
while (*ub && *ua == *ub) {
ua++;
ub++;
}
return (int)*ua-(int)*ub;
}

Nov 14 '05 #16

Netocrat

On Wed, 22 Jun 2005 22:30:58 +1000, Netocrat wrote:

On Wed, 22 Jun 2005 13:28:56 +0100, Lawrence Kirby wrote:
C doesn't guarantee that char has a smaller range than int so *a-*b
could overflow. You can usually get away with this in practice but at
least a comment is in order.
Here's a fix then

[snip] return (int)*a-(int)*b;

Actually that's useless as a fix per your words - instead:
return *a == *b ? 0 : *a > *b ? 1 : -1;

There are probably cheaper ways to do this knowing details about the
implementation but this is hopefully a strictly compatible general
version.

Nov 14 '05 #17

Netocrat

On Wed, 22 Jun 2005 04:58:36 -0700, websnarf wrote:

Ok, so here are some remaining questions:

1) How should one implement a standard strcmp? -- as I recall, this is
like 3 lines of code.
I doubt it can be done in 3 lines of code. For a start the arguments to
strcmp are consts so you need at least two lines of variable declarations.
There are a few different implementations - not using const parameters -
scattered through this thread.
2) How might one implement a standard memcmp? (On the assumption that
lengths are always available to you, since it basically costs nothing
for this to be the case.)
No need for assumptions, length is one of the parameters to memcmp.
Doing this naively is easy; but trying to
take advantage of low-level hardware characteristics, such as alignment,
this can be hard if you are really intent on squeezing out the maximum
performance.

If it were 'standard' surely it wouldn't have knowledge of low-level
hardware characteristics. Perhaps I misunderstand you. Anyhow on my
platform (Pentium 4 running Linux with gcc 3.3.5) you can squeeze quite a
big performance gain by knowing that the word-size of the machine is equal
to sizeof(int).

The, as you put it naive, but portable attempt:

int memcmp(const void *a, const void *b, size_t n) {
unsigned char *ua = (unsigned char *)a;
unsigned char *ub = (unsigned char *)b;
unsigned char *ua_max = ua + n;

while (ua < ua_max) {
if (*ua > *ub)
return 1;
else if (*ua < *ub)
return -1;
else {
ua++;
ub++;
}
}
return 0;
}

I compared the speed of this implementation to the glibc memcmp by running
memcmps of 50 million (less 100) pairs of different memory locations
containing random data of length 100 bytes.

It turns out that the glibc version is roughly 10-15% faster - no big deal.

So I tested your idea of using alignment, knowing that my processor likes
to operate on 4 bytes at a time, being equivalent to sizeof(int).

int memcmp_using_int(const void *a, const void *b, size_t n) {
int num_ints = n / sizeof(int);
int extra_bytes = n % sizeof(int);
unsigned int *ua = (unsigned int *)a;
unsigned int *ub = (unsigned int *)b;
unsigned int *ua_max = ua + num_ints;
unsigned char *uac;
unsigned char *ubc;
unsigned char *uac_max;

while (ua < ua_max) {
if (*ua > *ub)
return 1;
else if (*ua < *ub)
return -1;
else {
ua++;
ub++;
}
}
uac = (unsigned char *)ua;
ubc = (unsigned char *)ub;
uac_max = uac + extra_bytes;
while (uac < uac_max) {
if (*uac > *ubc)
return 1;
else if (*uac < *ubc)
return -1;
else {
uac++;
ubc++;
}
}
return 0;
}

This version is approximately 3 times faster than glibc and my original
version, which surprised me. Is there any reason to _not_ make the
assumption that I made - that we should operate on units of sizeof(int)
rather than sizeof(char)? It may not necessarily improve performance - it
certainly improves it on my setup - but could it degrade it?

Are there any other ways to work out - without knowing in advance - an
appropriate unit for your system? I happened to know that my machine's
wordsize is sizeof(int), but can you work out or approximate what it is
from standard includes/definitions?

Nov 15 '05 #18

CBFalconer

Netocrat wrote:

On Wed, 22 Jun 2005 07:55:37 -0400, Lew Pitcher wrote:
[snip]
Gak!!

That's what I get for off-the-cuff coding when I'm tired.

That /should/ have been

Hate to do this to you, but I think you need more sleep :P

The increments should occur within the loop, not in the test, and
as has been discussed in other posts, the comparison should be
done treating the chars as unsigned. Finally Lawrence Kirby
pointed out that strictly speaking char is not guaranteed to have
smaller range than int, so the return statement is not necessarily
correct, but usually will be.
int compare_strings(char *a, char *b) {
while (*b && (*a++ == *b++)) ;
return (int)(*a-*b);
}

Perhaps:

int compare_strings(char *a, char *b) {
unsigned char *ua = (unsigned char *)a;
unsigned char *ub = (unsigned char *)b;
while (*ub && *ua == *ub) {
ua++;
ub++;
}
return (int)*ua-(int)*ub;
}

Still wrong. Try:

int compare_strings(char *a, char *b)
{
unsigned char *ua = a, *ub = b;

while ((*ua == *ub) && *ua) {
ua++; ub++;
}
return (*ua > *ub) - (*ub > *ua);
}

notice the placement of the test for '\0'. No overflow. No casts.

--
Some informative links:
news:news.announce.newusers
http://www.geocities.com/nnqweb/
http://www.catb.org/~esr/faqs/smart-questions.html
http://www.caliburn.nl/topposting.html
http://www.netmeister.org/news/learn2quote.html

Nov 15 '05 #19

Netocrat

On Wed, 22 Jun 2005 16:39:31 +0000, CBFalconer wrote:

Netocrat wrote:
On Wed, 22 Jun 2005 07:55:37 -0400, Lew Pitcher wrote:
int compare_strings(char *a, char *b) {
while (*b && (*a++ == *b++)) ;
return (int)(*a-*b);
} Perhaps:

int compare_strings(char *a, char *b) {
unsigned char *ua = (unsigned char *)a; unsigned char *ub =
(unsigned char *)b; while (*ub && *ua == *ub) {
ua++;
ub++;
}
return (int)*ua-(int)*ub;
}

Still wrong. Try:

int compare_strings(char *a, char *b) {
unsigned char *ua = a, *ub = b;

while ((*ua == *ub) && *ua) {
ua++; ub++;
}
return (*ua > *ub) - (*ub > *ua);
}
}
notice
the placement of the test for '\0'.

I see no reason for this rearrangement... logically it has the same
result. Whether the loop ends because *ua != *ub or because *ua == '\0'
or because as in the case of the original *ub == '\0', the return code
does the same thing... so why does it matter?
No overflow.
Yes, I noted elsewhere that I stuffed the overflow protection up. Your
solution is elegant; mine requires less computation - ignoring the
effects of optimisation. It was (with different variable names):
return *a == *b ? 0 : *a > *b ? 1 : -1;
No casts.

I use gcc's -Wall option.

Without the casts I get:

warning: pointer targets in initialization differ in signedness

which I'd rather not see, but point taken, they're unnecessary.

Nov 15 '05 #20

CBFalconer

Netocrat wrote:

.... snip ...
So I tested your idea of using alignment, knowing that my
processor likes to operate on 4 bytes at a time, being equivalent
to sizeof(int).

int memcmp_using_int(const void *a, const void *b, size_t n) {
int num_ints = n / sizeof(int);
int extra_bytes = n % sizeof(int);
unsigned int *ua = (unsigned int *)a;
unsigned int *ub = (unsigned int *)b;
unsigned int *ua_max = ua + num_ints;
unsigned char *uac;
unsigned char *ubc;
unsigned char *uac_max;
Right here you run into trouble. What if the original a and b are
not suitably aligned for ints?. So you need a preamble to get them
so aligned, and that requires that you know what defines alignment
for ints. That last is only known to the implementor. Similarly
you need a postamble to handle the last few bytes.
while (ua < ua_max) {
if (*ua > *ub) .... snip code ...
This version is approximately 3 times faster than glibc and my
original version, which surprised me. Is there any reason to
_not_ make the assumption that I made - that we should operate on
units of sizeof(int) rather than sizeof(char)? It may not
necessarily improve performance - it certainly improves it on my
setup - but could it degrade it?

Are there any other ways to work out - without knowing in advance
- an appropriate unit for your system? I happened to know that
my machine's wordsize is sizeof(int), but can you work out or
approximate what it is from standard includes/definitions?

What you can't accurate work out is the alignment requirements.
Even if you are the implementor, and know all these things, you may
very well find that you have optimized the rare large comparison at
the cost of slowing down the more common small comparison.

Similar considerations apply to memmov, memcpy, strlen, strcmp,
strcpy, strcat etc.

If you know that the original arguments were created by malloc, you
then know that they are suitably aligned for ints, and need only
worry about the postamble. But then you have the highly probable
(and possibly undetectable on your hardware) error of passing an
argument not created by malloc.

Remember KISS.

--
Some informative links:
news:news.announce.newusers
http://www.geocities.com/nnqweb/
http://www.catb.org/~esr/faqs/smart-questions.html
http://www.caliburn.nl/topposting.html
http://www.netmeister.org/news/learn2quote.html

Nov 15 '05 #21

Netocrat

On Wed, 22 Jun 2005 18:17:07 +0000, CBFalconer wrote:

Netocrat wrote:
... snip ...

So I tested your idea of using alignment, knowing that my processor
likes to operate on 4 bytes at a time, being equivalent to sizeof(int).

int memcmp_using_int(const void *a, const void *b, size_t n) {
int num_ints = n / sizeof(int);
int extra_bytes = n % sizeof(int);
unsigned int *ua = (unsigned int *)a; unsigned int *ub =
(unsigned int *)b; unsigned int *ua_max = ua + num_ints;
unsigned char *uac;
unsigned char *ubc;
unsigned char *uac_max;

Right here you run into trouble. What if the original a and b are not
suitably aligned for ints?.

On my system, it doesn't cause an incorrect result - the tests I'm running
are passing in pointers that are incremented one memory location at a
time. Whether or not it affects performance I don't know. I'm sure many
on this group would be able to answer such a question regarding an x86 but
it's something that has never concerned me before.

However, I posted this code to find out how generally applicable it was,
and as you point out this is a general concern that my code doesn't deal
with.
So you need a preamble to get them so
aligned, and that requires that you know what defines alignment for ints.
That last is only known to the implementor.
OK, that's the sort of thing I wanted to know - you've confirmed my
suspicion that it's not possible to in the general case discover those
sorts of details through standard functions.
Similarly you need a
postamble to handle the last few bytes.
Which I had; although correct me if you don't think it works as it should
(it tested fine)...

while (ua < ua_max) {
if (*ua > *ub)

... snip code ...

This version is approximately 3 times faster than glibc and my original
version, which surprised me. Is there any reason to _not_ make the
assumption that I made - that we should operate on units of sizeof(int)
rather than sizeof(char)? It may not necessarily improve performance -
it certainly improves it on my setup - but could it degrade it?

Are there any other ways to work out - without knowing in advance - an
appropriate unit for your system? I happened to know that my machine's
wordsize is sizeof(int), but can you work out or approximate what it is
from standard includes/definitions?

What you can't accurate work out is the alignment requirements. Even if
you are the implementor, and know all these things, you may very well
find that you have optimized the rare large comparison at the cost of
slowing down the more common small comparison.

What you say makes sense, although alignment isn't an issue in my case. I
wanted to see if, as you suggest is possible, the small comparison is
degraded by my code as it seems it would be. So I re-tested and instead
of setting the length of the random memory to 100 bytes I varied this
length from 1 to 10 bytes. Anything over 3 bytes was faster; less than
that and the library was faster.

Length Speed of new version vs library
1 3 x slower
2 2 x slower
3 2 x slower
4 2.5 x faster
5 2.5 x faster
6 2 x faster
7 slightly faster
8 2.5 x faster
9 2 x faster
10 2 x faster
11 2 x faster

It may be debatable whether the implementors made the right choice -
what's the average length of data that memcmp() is used for? - I suspect a
lot more than 3 bytes - but I don't suppose it's really on topic here.
Similar considerations apply to memmov, memcpy, strlen, strcmp, strcpy,
strcat etc.

If you know that the original arguments were created by malloc, you then
know that they are suitably aligned for ints, and need only worry about
the postamble. But then you have the highly probable (and possibly
undetectable on your hardware) error of passing an argument not created
by malloc.

Remember KISS.

It's a good principle. I like to think of efficiency and speed too but
they often are synonymous with KISS.

Nov 15 '05 #22

Randy Howard

In article <pa****************************@dodo.com.au>,
ne******@dodo.com.au says...

Right here you run into trouble. What if the original a and b are not
suitably aligned for ints?.
On my system, it doesn't cause an incorrect result

That's not the point. In clc, the goal is to demonstrate portable,
correct code, not something that happens to work on your chosen platform.
The "but it works on my system" defense is quite common, so it gets a
lot of (negative) attention.
What you say makes sense, although alignment isn't an issue in my case.

It *is* an issue, you simply decided not to care. Hopefully you won't
leave presents for future maintainers of your code down the road.

--
Randy Howard (2reply remove FOOBAR)
"I don't really care about being right you know,
I just care about success." --Steve Jobs

Nov 15 '05 #23

Netocrat

On Wed, 22 Jun 2005 20:11:12 +0000, Randy Howard wrote:

In article <pa****************************@dodo.com.au>,
ne******@dodo.com.au says...
> Right here you run into trouble. What if the original a and b are not
> suitably aligned for ints?.
On my system, it doesn't cause an incorrect result

That's not the point. In clc, the goal is to demonstrate portable,
correct code, not something that happens to work on your chosen platform.
The "but it works on my system" defense is quite common, so it gets a lot
of (negative) attention.

I understand that... you have quoted the first part of my answer to the
question but you have chosen to ignore the rest which 'aligns' with the
goal of clc as far as I can tell:
However, I posted this code to find out how generally applicable it was,
and as you point out this is a general concern that my code doesn't deal
with.

What you say makes sense, although alignment isn't an issue in my case.

It *is* an issue, you simply decided not to care. Hopefully you won't
leave presents for future maintainers of your code down the road.

I do care, which is why I'm posting to this group for feedback. You
have taken what I said out of context. To make it clear: I acknowledge
that alignment is an issue in general and in clc. I wouldn't use that
code, even if I intended to use it only on my system, given that it's been
pointed out that alignment is an issue. The code that I posted is not
being maintained, it is code created solely for testing and for feedback
here.

Let me put the context back into my statement. CBFalconer wrote: '[e]ven
if you are the implementor, and know all these things..' and I responded
with '...alignment isn't an issue in my case...'; meaning that if I were
the implementor, in my case alignment wouldn't be an issue. At least
that's what I intended; obviously it can be interpreted other ways: the
way you have interpreted it is without that "if" that I intended to be
implicit.

Nov 15 '05 #24

Randy Howard

In article <pa****************************@dodo.com.au>,
ne******@dodo.com.au says...

On Wed, 22 Jun 2005 20:11:12 +0000, Randy Howard wrote:
That's not the point. In clc, the goal is to demonstrate portable,
correct code, not something that happens to work on your chosen platform.
The "but it works on my system" defense is quite common, so it gets a lot
of (negative) attention.

I understand that... you have quoted the first part of my answer to the
question but you have chosen to ignore the rest which 'aligns' with the
goal of clc as far as I can tell:

I didn't ignore it, I simply didn't disagree with the rest. I try not
to include entire articles unless absolutely necessary. I did come on a
bit strong however, for which I apologize. I really reacted to the
hundreds of posts over the years that say "but they work on my system",
and pointed it all at you, which I should not have done.

What you say makes sense, although alignment isn't an issue in my case.

It *is* an issue, you simply decided not to care. Hopefully you won't
leave presents for future maintainers of your code down the road.

To make it clear: I acknowledge
that alignment is an issue in general and in clc. I wouldn't use that
code, even if I intended to use it only on my system, given that it's been
pointed out that alignment is an issue. The code that I posted is not
being maintained, it is code created solely for testing and for feedback
here.

Fair enough.

--
Randy Howard (2reply remove FOOBAR)
"I don't really care about being right you know,
I just care about success." --Steve Jobs

Nov 15 '05 #25

Netocrat

On Wed, 22 Jun 2005 22:50:23 +0000, Randy Howard wrote:

In article <pa****************************@dodo.com.au>,
ne******@dodo.com.au says...
On Wed, 22 Jun 2005 20:11:12 +0000, Randy Howard wrote:
> That's not the point. In clc, the goal is to demonstrate portable,
> correct code, not something that happens to work on your chosen
> platform. The "but it works on my system" defense is quite common, so
> it gets a lot of (negative) attention.

I understand that... you have quoted the first part of my answer to the
question but you have chosen to ignore the rest which 'aligns' with the
goal of clc as far as I can tell:

I didn't ignore it, I simply didn't disagree with the rest. I try not to
include entire articles unless absolutely necessary. I did come on a bit
strong however, for which I apologize. I really reacted to the hundreds
of posts over the years that say "but they work on my system", and pointed
it all at you, which I should not have done.

No probs, I'm a new poster so you don't have much reference on my
perspective. It was good of you to be so gracious.

Nov 15 '05 #26

Old Wolf

Lew Pitcher wrote:

That /should/ have been
int compare_strings(char *a, char *b)
{
while (*b && (*a++ == *b++)) ;
return (int)(*a-*b);
}

What is the purpose of that cast?

Nov 15 '05 #27

Ben Pfaff

Netocrat <ne******@dodo.com.au> writes:

Actually that's useless as a fix per your words - instead:
return *a == *b ? 0 : *a > *b ? 1 : -1;

There are probably cheaper ways to do this knowing details about the
implementation but this is hopefully a strictly compatible general
version.

Here's my favorite version. I got it from someone here on clc
but don't remember whom:
return *a < *b ? -1 : *a > *b;
--
Ben Pfaff
email: bl*@cs.stanford.edu
web: http://benpfaff.org

Nov 15 '05 #28

CBFalconer

Netocrat wrote:

.... snip ...
No probs, I'm a new poster so you don't have much reference on my
perspective. It was good of you to be so gracious.

Now that's rare. Calling any of us "gracious". We usually only
get one-half of that number of letters.

--
Some informative links:
news:news.announce.newusers
http://www.geocities.com/nnqweb/
http://www.catb.org/~esr/faqs/smart-questions.html
http://www.caliburn.nl/topposting.html
http://www.netmeister.org/news/learn2quote.html

Nov 15 '05 #29

Peter Nilsson

CBFalconer wrote:

Try:
int compare_strings(char *a, char *b)
{
unsigned char *ua = a, *ub = b;
...
... No overflow. No casts.

Just a required diagnostic from the constraint violation. ;)

--
Peter

Nov 15 '05 #30

Al Bowers

Ben Pfaff wrote:

Netocrat <ne******@dodo.com.au> writes:

Actually that's useless as a fix per your words - instead:
return *a == *b ? 0 : *a > *b ? 1 : -1;

There are probably cheaper ways to do this knowing details about the
implementation but this is hopefully a strictly compatible general
version.

Here's my favorite version. I got it from someone here on clc
but don't remember whom:
return *a < *b ? -1 : *a > *b;

It's ok, but I don't believe you would need the last comparison
if you check equaity(requiring a return of 0) within the loop.

while(*a++ == *b++)
if(*a == '\0') return 0;
return *a<*b?-1:1;

--
Al Bowers
Tampa, Fl USA
mailto: xa******@myrapidsys.com (remove the x to send email)
http://www.geocities.com/abowers822/

Nov 15 '05 #31

Al Bowers

Al Bowers wrote:

Ben Pfaff wrote:
Netocrat <ne******@dodo.com.au> writes:

Actually that's useless as a fix per your words - instead:
return *a == *b ? 0 : *a > *b ? 1 : -1;

There are probably cheaper ways to do this knowing details about the
implementation but this is hopefully a strictly compatible general
version.

Here's my favorite version. I got it from someone here on clc
but don't remember whom:
return *a < *b ? -1 : *a > *b;

It's ok, but I don't believe you would need the last comparison
if you check equaity(requiring a return of 0) within the loop.

while(*a++ == *b++)
if(*a == '\0') return 0;
return *a<*b?-1:1;

That should be a for loop.
for( ; *a == *b; a++,b++)
if(*a == '\0') return 0;
return *a<*b?-1:1;

--
Al Bowers
Tampa, Fl USA
mailto: xa******@myrapidsys.com (remove the x to send email)
http://www.geocities.com/abowers822/

Nov 15 '05 #32

Rajan

Yadur,
I have written the strcmp function , let me just paste it here.
You don't need to do a seperate length comparison , you just need to
check for each of the arguments.

int strcmp(const char* a1, const char* b1)
{
while (*a1 == *b1)
{
if ((*a1 == '\0') && (*b1 == '\0'))
{
return 0;
}
a1++;
b1++;
}
return(*a1 - *b1);
}

Nov 15 '05 #33

pete

Rajan wrote:

Yadur,
I have written the strcmp function , let me just paste it here.
You don't need to do a seperate length comparison , you just need to
check for each of the arguments.

int strcmp(const char* a1, const char* b1)
{
while (*a1 == *b1)
{
if ((*a1 == '\0') && (*b1 == '\0'))
There's no point in checking both *a1 and *b1
for equality with zero,
since you already know that *a1 == *b1 at this point in code.
{
return 0;
}
a1++;
b1++;
}
return(*a1 - *b1);
}

That's wrong.
Any character compared with strcmp
has to give the same result as if compared by memcmp.

int str_cmp(const char *s1, const char *s2)
{
const unsigned char *p1 = (const unsigned char *)s1;
const unsigned char *p2 = (const unsigned char *)s2;

while (*p1 == *p2) {
if (*p1 == '\0') {
return 0;
}
++p1;
++p2;
}
return *p2 > *p1 ? -1 : 1;
}

N869
7.21.4 Comparison functions
[#1] The sign of a nonzero value returned by the comparison
functions memcmp, strcmp, and strncmp is determined by the
sign of the difference between the values of the first pair
of characters (both interpreted as unsigned char) that
differ in the objects being compared.

--
pete

Nov 15 '05 #34

Rajan

Pete,
I think you should do a "man strcmp" and check because if whenever you
compare let's say "Raj" and "Rajendra" it will always give you the
difference as ascii value of '\0' - ascii value of 'e' which comes to
-101.
It's not the same as memcmp. memcmp compares the number of bytes for
any data type unlike strcmp which is specifically for const char*.
Write a simple program using strcmp function and check the retval.

Nov 15 '05 #35

pete

Ben Pfaff wrote:

Netocrat <ne******@dodo.com.au> writes:
Actually that's useless as a fix per your words - instead:
return *a == *b ? 0 : *a > *b ? 1 : -1;

There are probably cheaper ways to do this knowing details about the
implementation but this is hopefully a strictly compatible general
version.

Here's my favorite version. I got it from someone here on clc
but don't remember whom:
return *a < *b ? -1 : *a > *b;

It wasn't me.

I usually write that as:
return *b > *a ? -1 : *a != *b;
Sometimes as:
return *b > *a ? -1 : *a > *b;

because I like to write sorting functions
in terms of only a GT(,) macro.

#define GT(A, B) (*(A) > *(B))
/* The above macro is for arithmetic types only */

static int comp(const void *arg1, const void *arg2)
{
return GT((e_type*)arg2, (e_type*)arg1)
? -1 : GT((e_type*)arg1, (e_type*)arg2);
}

The above comp function,
would be for a function with a qsort interface,
which is comparing arithmetic types.

For functions which only can sort one dimensional arrays,
the GT(,) macro can be used directly.

void s3sort(e_type *array, size_t nmemb)
{
e_type temp, *i, *j, *k, *after;

after = array + nmemb;
if (nmemb > (size_t)-1 / 3 - 1) {
nmemb = nmemb / 3 - 1;
} else {
nmemb = (3 * nmemb + 1) / 7;
}
while (nmemb != 0) {
i = array + nmemb;
do {
j = i - nmemb;
if (GT(j, i)) {
k = i;
temp = *k;
do {
*k = *j;
k = j;
if (nmemb + array > j) {
break;
}
j -= nmemb;
} while (GT(j, &temp));
*k = temp;
}
++i;
} while (i != after);
nmemb = (3 * nmemb + 1) / 7;
}
}

The GT(,) macro can be made more complicated to do other things
like to sort an array string pointers by string length.

#define GT(A, B) (lencomp((A), (B)) > 0)

int lencomp(const void *a, const void *b)
{
const size_t a_len = strlen(*(char **)a);
const size_t b_len = strlen(*(char **)b);

return b_len > a_len ? -1 : a_len != b_len;
}

--
pete

Nov 15 '05 #36

pete

Rajan wrote:

Pete,
I think you should do a "man strcmp"

I think you should realise that the C standard defines the language,
and that "man" doesn't define the language.
The contents of your man pages are only
topical on this newsgroup to the extent
that they are right if they agree with the standard,
and that they are wrong if they disagree wiht the C standard.

ISO/IEC 9899:1999(E)
7.21.4 Comparison functions
1 The sign of a nonzero value returned by the comparison
functions memcmp, strcmp, and strncmp is determined by
the sign of the difference between the values of the first
pair of characters (both interpreted as unsigned char)
that differ in the objects being compared.

--
pete

Nov 15 '05 #37

Netocrat

On Thu, 23 Jun 2005 01:50:19 +0000, CBFalconer wrote:

Netocrat wrote:

... snip ...

No probs, I'm a new poster so you don't have much reference on my
perspective. It was good of you to be so gracious.

Now that's rare. Calling any of us "gracious". We usually only get
one-half of that number of letters.

But hang on, how do any of you get any programming done if you're all
Chief Information Officers.... oh, OK, not half of _those_ letters...

Nov 15 '05 #38

Lawrence Kirby

On Thu, 23 Jun 2005 05:46:57 +1000, Netocrat wrote:

....

Similarly you need a
postamble to handle the last few bytes.
Which I had; although correct me if you don't think it works as it should
(it tested fine)...
while (ua < ua_max) {
if (*ua > *ub)

This makes big assumptions abut the representation of integers in the
platform. It won't work if they contain any padding bits or use anything
other than big endian byte order. And technically speaking C's type
aliasing rules disallow this.
... snip code ...

This version is approximately 3 times faster than glibc and my original
version, which surprised me. Is there any reason to _not_ make the
assumption that I made - that we should operate on units of sizeof(int)
rather than sizeof(char)? It may not necessarily improve performance -
it certainly improves it on my setup - but could it degrade it?

Were you comparing equal data?

....
What you say makes sense, although alignment isn't an issue in my case. I
wanted to see if, as you suggest is possible, the small comparison is
degraded by my code as it seems it would be. So I re-tested and instead
of setting the length of the random memory to 100 bytes I varied this
length from 1 to 10 bytes. Anything over 3 bytes was faster; less than
that and the library was faster.

Length Speed of new version vs library
1 3 x slower
2 2 x slower
3 2 x slower
4 2.5 x faster
5 2.5 x faster
6 2 x faster
7 slightly faster
8 2.5 x faster
9 2 x faster
10 2 x faster
11 2 x faster

It may be debatable whether the implementors made the right choice -
what's the average length of data that memcmp() is used for? - I suspect a
lot more than 3 bytes - but I don't suppose it's really on topic here.

The average length of data being compared is not important for non-equal
data, what is important is the position of the first character that
differs. For "random" data this will be the first character tested the
great majority of the time.

These are techniques available for the implementation to use for
standard library functions if the implementor chooses. They are not
normally techniques a program should use.

Lawrence

Nov 15 '05 #39

Lawrence Kirby

On Wed, 22 Jun 2005 20:34:00 -0700, Peter Nilsson wrote:

CBFalconer wrote:
Try:
int compare_strings(char *a, char *b)
{
unsigned char *ua = a, *ub = b;
...
... No overflow. No casts.

Just a required diagnostic from the constraint violation. ;)

Fix that and add const because the string aren't being modified and we
might be getting somewhere. :-)

Lawrence

Nov 15 '05 #40

Lawrence Kirby

On Wed, 22 Jun 2005 18:43:59 -0700, Ben Pfaff wrote:

Netocrat <ne******@dodo.com.au> writes:
Actually that's useless as a fix per your words - instead:
return *a == *b ? 0 : *a > *b ? 1 : -1;

There are probably cheaper ways to do this knowing details about the
implementation but this is hopefully a strictly compatible general
version.

Here's my favorite version. I got it from someone here on clc
but don't remember whom:
return *a < *b ? -1 : *a > *b;

I have vague recollections about that, but it was a long time ago,
probably in the previous millennium.

Lawrence

Nov 15 '05 #41

lawrence.jones

Ben Pfaff <bl*@cs.stanford.edu> wrote:

Here's my favorite version. I got it from someone here on clc
but don't remember whom:
return *a < *b ? -1 : *a > *b;

My favorite is the conditionless form:

return (*a > *b) - (*a < *b);

-Larry Jones

I suppose if I had two X chromosomes, I'd feel hostile too. -- Calvin

Nov 15 '05 #42

Netocrat

On Thu, 23 Jun 2005 13:52:50 +0100, Lawrence Kirby wrote:

On Thu, 23 Jun 2005 05:46:57 +1000, Netocrat wrote:

...
Similarly you need a
postamble to handle the last few bytes.
Which I had; although correct me if you don't think it works as it
should (it tested fine)...
while (ua < ua_max) {
if (*ua > *ub)
This makes big assumptions abut the representation of integers in the
platform. It won't work if they contain any padding bits or use anything
other than big endian byte order. And technically speaking C's type
aliasing rules disallow this.

Where can I verify this from documentation - not that I don't believe
you, just that I'd like to know how to find these things out without
asking on the group? I know you can buy the standard, but I'd rather not
spend money. What are my options?

Since I posted I have realised that it is actually failing - my processor
is little endian...

... snip code ...

This version is approximately 3 times faster than glibc and my
original version, which surprised me. Is there any reason to _not_
make the assumption that I made - that we should operate on units of
sizeof(int) rather than sizeof(char)? It may not necessarily improve
performance - it certainly improves it on my setup - but could it
degrade it?
Were you comparing equal data?

I intended that it was random but didn't explicitly randomise it; and I
think that the memory block malloc was returning was zeroed for these
tests, although it's impossible to go back and check... I am now
explicitly setting each byte to a random value. So for the data above,
yes, it probably was all equal.
What you say makes sense, although alignment isn't an issue in my case.
I wanted to see if, as you suggest is possible, the small comparison is
degraded by my code as it seems it would be. So I re-tested and
instead of setting the length of the random memory to 100 bytes I
varied this length from 1 to 10 bytes. Anything over 3 bytes was
faster; less than that and the library was faster.

Length Speed of new version vs library 1 3 x slower 2 2 x slower 3
2 x slower
4 2.5 x faster
5 2.5 x faster
6 2 x faster
7 slightly faster
8 2.5 x faster
9 2 x faster
10 2 x faster
11 2 x faster

It may be debatable whether the implementors made the right choice -
what's the average length of data that memcmp() is used for? - I
suspect a lot more than 3 bytes - but I don't suppose it's really on
topic here.

The average length of data being compared is not important for non-equal
data, what is important is the position of the first character that
differs. For "random" data this will be the first character tested the
great majority of the time.

Well put.

<Off-topic (lots of it)>
It so happens that the data was probably equal for the post to which you
responded. I tested again, this time with equal data except for the final
compared byte which was alternately less than and greater than. I also
re-ordered the inner while loop of my naive memcmp to be:
if (*ua == *ub) {
ua++;
ub++;
} else if (*ua < *ub)
return -1;
else
return 1;

The results that I got are different to those I posted above - those above
were done pretty roughly so I may have made an error translating from
times to relative performance. Anyhow, for all lengths less than or equal
to 10, both of my functions out-perform the library when all data bar
the last byte is the same. Here are some very rough relative times:

Length Library Naive Word-based
0 2.782063 0.824878 1.026883
1 3.839585 1.794128 1.536425
2 3.979533 1.936510 1.711663
3 4.487526 2.082990 2.091040
4 4.911784 2.249875 4.031234
5 5.433289 2.392460 2.443990
6 5.142204 3.054199 2.226923
7 5.226639 3.310050 2.500838
8 5.122240 3.394408 4.416643
9 5.741017 3.498679 2.579461
10 5.832017 4.845331 3.007469
11 6.545730 5.683157 3.090258
12 6.340615 6.557857 4.964385
13 6.434350 6.702367 3.512579
14 6.485347 6.534230 3.369173
15 6.735006 7.560720 4.631879
16 8.304644 6.969153 4.455131
17 6.899765 6.990281 3.269464
18 7.499913 7.809836 4.871980
19 7.754756 7.356656 6.513516
20 8.369894 7.477097 5.325045
....
300 61.002508 65.282563 34.535567
301 61.211380 66.803313 33.170267
302 67.950986 66.556912 33.923567
303 66.034115 67.561750 32.898632

In all of these tests the word-based function far outperforms the library
function - although as I have noted, the endian-ness causes errors.
The naive approach is very similar in time to the library version and
sometimes faster; suggesting that no tricks/optimisations are being used
in the library version.

Intel chips seem to support an op-code that compares a 32-bit value in
byte-order, so I'm going to inline that into my function with some
assembly in gcc and see if I can get both a correct result and also
maintain the performance boost that this function provides, which I
suspect is possible. If so it would be an appropriate way to optimise the
intel version of glibc's memcpy function.

</Off-topic>
These are techniques available for the implementation to use for
standard library functions if the implementor chooses. They are not
normally techniques a program should use.

That's what I wanted to know. Cheers.

Nov 15 '05 #43

Lawrence Kirby

On Sat, 25 Jun 2005 02:59:23 +1000, Netocrat wrote:

On Thu, 23 Jun 2005 13:52:50 +0100, Lawrence Kirby wrote:
On Thu, 23 Jun 2005 05:46:57 +1000, Netocrat wrote:

...
Similarly you need a
postamble to handle the last few bytes.

Which I had; although correct me if you don't think it works as it
should (it tested fine)...

> while (ua < ua_max) {
> if (*ua > *ub)
This makes big assumptions abut the representation of integers in the
platform. It won't work if they contain any padding bits or use anything
other than big endian byte order. And technically speaking C's type
aliasing rules disallow this.

Where can I verify this from documentation - not that I don't believe
you, just that I'd like to know how to find these things out without
asking on the group? I know you can buy the standard, but I'd rather not
spend money. What are my options?

You could download a draft version of the standard or buy a good book. Of
course buying a book implies spending money and isn't guaranteed to be
accurate or complete. Drafts are also not going to be 100% accurate but at
least are based on the actual text of the standard.

The aliasing rules I referred to are in C99 6.5p7:

An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:73)
- a type compatible with the effective type of the object,
- a qualified version of a type compatible with the effective type of the
object,
- a type that is the signed or unsigned type corresponding to the
effective type of the object,
- a type that is the signed or unsigned type corresponding to a
qualified version of the effective type of the object,
- an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a
subaggregate or contained union), or
- a character type.

Specifically you can access any type of object through an lvalue of
character type, but the reverse is not true - accessing a character, or an
array of characters, through an lvalue of non-character type is not one of
the permitted options. C99 introduces the concept of "effective type" as
used above. This supports the use of non-character objects in malloc'd
memory.

....
In all of these tests the word-based function far outperforms the library
function - although as I have noted, the endian-ness causes errors.

You can make the code insensitive to byte order by using "word" ops to
find the first word that differs, then using byte ops to find the first
byte in the word that differs.

Lawrence

Nov 15 '05 #44

Netocrat

On Tue, 28 Jun 2005 15:17:11 +0100, Lawrence Kirby wrote:

On Sat, 25 Jun 2005 02:59:23 +1000, Netocrat wrote:
On Thu, 23 Jun 2005 13:52:50 +0100, Lawrence Kirby wrote:
On Thu, 23 Jun 2005 05:46:57 +1000, Netocrat wrote:

...

> Similarly you need a
> postamble to handle the last few bytes.

Which I had; although correct me if you don't think it works as it
should (it tested fine)...

>> while (ua < ua_max) {
>> if (*ua > *ub)

This makes big assumptions abut the representation of integers in the
platform. It won't work if they contain any padding bits or use
anything other than big endian byte order. And technically speaking C's
type aliasing rules disallow this.
Where can I verify this from documentation - not that I don't believe
you, just that I'd like to know how to find these things out without
asking on the group? I know you can buy the standard, but I'd rather not
spend money. What are my options?

You could download a draft version of the standard or buy a good book. Of
course buying a book implies spending money and isn't guaranteed to be
accurate or complete. Drafts are also not going to be 100% accurate but at
least are based on the actual text of the standard.

I'm now using on-line drafts of both C90 and C99 as well as the ANSI
Rationale as posted in another thread by various people. So occasionally
it'll be inaccurate, but for the basics I should be fine, particularly
where the two drafts concur.
The aliasing rules I referred to are in C99 6.5p7:

An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:73) - a type compatible
with the effective type of the object, - a qualified version of a type
compatible with the effective type of the
object,
- a type that is the signed or unsigned type corresponding to the
effective type of the object,
- a type that is the signed or unsigned type corresponding to a
qualified version of the effective type of the object,
- an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a
subaggregate or contained union), or
- a character type.

Specifically you can access any type of object through an lvalue of
character type, but the reverse is not true - accessing a character, or an
array of characters, through an lvalue of non-character type is not one of
the permitted options. C99 introduces the concept of "effective type" as
used above. This supports the use of non-character objects in malloc'd
memory.

What do you mean by your last statement - how do you define non-character
objects?

In all of these tests the word-based function far outperforms the
library function - although as I have noted, the endian-ness causes
errors.

You can make the code insensitive to byte order by using "word" ops to
find the first word that differs, then using byte ops to find the first
byte in the word that differs.

It's always the simple solutions that are the most elusive. I had no need
to revert to assembly after all. Anyhow the excursion was informative.

I now have code that does effectively what you suggested except that
should it find a word that differs, it uses an assembly opcode to reverse
the byte order and test again. I haven't checked whether this is any
faster than as you have suggested simply reverting to byte ops in that
case. It may even vary from processor to processor - the array of Intel
chips supporting this opcode is vast.

Nov 15 '05 #45

Netocrat

Netocrat wrote:

[stats throughout 3 different posts demonstrating that glibc's memcpy
function on intel is much slower than a 'word-based' compare function]

It's off-topic but I want to correct the record:

I have discovered that when optimisation is turned on in my version of
gcc, memcpy performance roughly halves. This can be avoided by
accessing memcpy through a function pointer - it's not enough to write
(&memcpy) - you actually have to use a separate variable. The
statistics I provided were based on an optimised compilation without
accessing memcpy through a function pointer, so glibc's memcpy
performance was adversely affected.

Without optimisations the word-based memcmp function is only faster for
the cases where the number of initially equal bytes is 1, 3, 5 or 7.

With optimisations the word-based memcmp function is always
significantly faster than glibc's memcmp accessed directly. When
glibc's memcmp is accessed through a function pointer though, the
word-based memcmp is faster only - and only very slightly - for most of
the cases where the number of initially equal bytes is less than 20.
Thereafter glibc's memcmp is increasingly faster.

Nov 15 '05 #46

Dave Thompson

On 11 Jul 2005 03:01:13 -0700, "Netocrat" <ne******@dodo.com.au>
wrote:
<snip>

It's off-topic but I want to correct the record:

I have discovered that when optimisation is turned on in my version of
gcc, memcpy performance roughly halves. This can be avoided by
accessing memcpy through a function pointer - it's not enough to write
(&memcpy) - you actually have to use a separate variable. <snip>

Multiposted and answered in comp.lang.asm.x86 where it is ontopic --
at least in my opinion and it has been passing moderation.

The standard-C part is that an implementation (compiler) may generate
inline code instead of actually calling any standard library function,
and IME gcc-x86 does for memcmp() with -O >0 and not -fno-builtin.
Although the inlined version is AFAICT always repe cmpsb, which is the
same as in the (out-of-line) library (glibc) source Netocrat found.

- David.Thompson1 at worldnet.att.net

Nov 15 '05 #47

string comparison

Similar topics