
Quandary with the following C code (Intermediate)

Hi all,

I have a slight problem understanding the following code that I saw on
a Unix-PAM tutorial (not OT!)

The following code will compare an old string to a new one, bombing
out if 'max' similar chars is exceeded.
------8<------

static
int compare(unsigned char *old, unsigned char *new, int max)
{
unsigned char in_old[256];
int equal = 0;

(void)memset(in_old, 0, sizeof (in_old));

while (*old)
in_old[*(old++)]++;

while (*new) {
if (in_old[*new])
equal++;
new++;
}

if (equal > max)
return (1);

return (0);
}
------->8---------

I fail to see how the 2 strings are compared for character equality,
especially in how the

in_old[*(old++)]++;

line is used.
Could anyone please shed some light on this for me?

cheers

Bry

Nov 14 '05 #1
7 Replies


Hi Bry,

In article <11**********************@c13g2000cwb.googlegroups .com>,
BMarsh wrote:
> I have a slight problem understanding the following code that I saw on
> a Unix-PAM tutorial (not OT!)
>
> The following code will compare an old string to a new one, bombing
> out if 'max' similar chars is exceeded.
>
> ------8<------
>
> static
> int compare(unsigned char *old, unsigned char *new, int max)
> {
> unsigned char in_old[256];
> int equal = 0;
>
> (void)memset(in_old, 0, sizeof (in_old));
>
> while (*old)
> in_old[*(old++)]++;
>
> while (*new) {
> if (in_old[*new])
> equal++;
> new++;
> }
>
> if (equal > max)
> return (1);
>
> return (0);
> }
> ------->8---------
>
> I fail to see how the 2 strings are compared for character equality,
> especially in how the
>
> in_old[*(old++)]++;

The numerical character value of each character in the first input
string is used as an index for an array that counts the occurrences of
that character. Think about it like this: when the input string is "aab"
the first while loop does: in_old['a']++, in_old['a']++, in_old['b']++.

The second while loop checks for each character in the second input
string if it occurred in the first input string.

The first while loop could also be written as:

while (*old) {
in_old[*old]++;
old++;
}

Regards,
--
Rob van der Leek | rob(at)ricardis(dot)tudelft(dot)nl
Nov 14 '05 #2

P: n/a
Hi Rob,

Many thanks for your answer; it's cleared it up for me! I was totally
thrown off by the way the loop was written.

Thanks again,

Bryan.

Nov 14 '05 #3

"BMarsh" <b.*****@gmx.net> wrote:

> The following code will compare an old string to a new one, bombing
> out if 'max' similar chars is exceeded.

It doesn't do a compare the usual way. That is, it does something
completely different from strcmp().

(Oh, btw, if you insist on posting through Google-Broken-Beta, it would
be a good thing if you could get it not to strip all indentation. Your
code is hard to read this way.)

> static
> int compare(unsigned char *old, unsigned char *new, int max)
> {
> unsigned char in_old[256];

First of all, you need to use UCHAR_MAX here, instead of 256. If you
don't, you may try to run this code on a Unicode system some day, and be
surprised when your function scribbles all over memory when you pass it
a string with Unicode characters over 256 in it.

> int equal = 0;
>
> (void)memset(in_old, 0, sizeof (in_old));

Lose the cast. It does no good, and clutters up the code.

> while (*old)
> in_old[*(old++)]++;

This tallies the number of occurrences of each separate character value
in the first string. There's a bug in it: what happens if you pass it a
string of UCHAR_MAX + 1 'a's?
> while (*new) {
> if (in_old[*new])
> equal++;
> new++;

(See what I mean about the indentation?)

This checks each character in the second string, and if there were any
of the same character at all in the first string, counts it as "equal".

> }
>
> if (equal > max)
> return (1);
>
> return (0);

If the number of "equal" characters, that is, the number of chars in the
second string of which there was at least one in the first string,
exceeds the passed-in maximum, return 1, else 0. This could be more
easily written as

  return (equal>max);

> I fail to see how the 2 strings are compared for character equality,

So do I; they're not.

Note, in particular, the different treatment of "old" and "new".

For example, try to explain the discrepancy between

compare("abc", "dbbbe", 2)

and

compare("dbbbe", "abc", 2)

Then, when you want an exercise I can't solve, try to explain _why_
someone would write a function like that, and then call it, simply,
"compare". The logic escapes me, I'm afraid. It's reasonably clear to me
_what_ this function does, but not why.

> especially in how the
>
> in_old[*(old++)]++;

The index entry corresponding to the character at the _current_ value of
old is increased (that is, the character now under the old pointer is
tallied); and old is moved to the next character. Not necessarily in
that order, or in any order at all, but since (old++) returns the old
value of old (so to speak) no matter which order is chosen, it doesn't
matter for the result.

Richard
Nov 14 '05 #4

Richard Bos wrote:
> "BMarsh" <b.*****@gmx.net> wrote:

<snip>

>> static
>> int compare(unsigned char *old, unsigned char *new, int max)
>> {
>> unsigned char in_old[256];
>
> First of all, you need to use UCHAR_MAX here, instead of 256.

I think you mean "UCHAR_MAX + 1"

> If you
> don't, you may try to run this code on a Unicode system some day, and be
> surprised when your function scribbles all over memory when you pass it
> a string with Unicode characters over 256 in it.

I think you mean "over 255"

<snip>
Nov 14 '05 #5

infobahn <in******@btinternet.com> wrote:
> Richard Bos wrote:
>> "BMarsh" <b.*****@gmx.net> wrote:
>>> unsigned char in_old[256];
>>
>> First of all, you need to use UCHAR_MAX here, instead of 256.
>
> I think you mean "UCHAR_MAX + 1"

Do we need "UCHAR_MAX + 1L" to cover the case of UCHAR_MAX
equal to UINT_MAX, say both 0xFFFF ?

Francois Grieu
Nov 14 '05 #6

Francois Grieu <fg****@francenet.fr> wrote:
> infobahn <in******@btinternet.com> wrote:
>> Richard Bos wrote:
>>> "BMarsh" <b.*****@gmx.net> wrote:
>>>> unsigned char in_old[256];
>>>
>>> First of all, you need to use UCHAR_MAX here, instead of 256.
>>
>> I think you mean "UCHAR_MAX + 1"

Yes (and yes).

> Do we need "UCHAR_MAX + 1L" to cover the case of UCHAR_MAX
> equal to UINT_MAX, say both 0xFFFF ?

In theory, yes. In practice, systems where SCHAR_MAX == INT_MAX or
UCHAR_MAX==UINT_MAX have so many problems that I wouldn't bother to
cater for them. Anyone porting code to that kind of implementation knows
he's getting into a hornets' (or mare's <g>) nest, and should take all
necessary precautions himself.
(And why stop there? What if UCHAR_MAX==ULONG_MAX? Could happen
(probably does happen) on a 32-bit embedded processor.)

Richard
Nov 14 '05 #7

Francois Grieu wrote:
> infobahn <in******@btinternet.com> wrote:
>> Richard Bos wrote:
>>> First of all, you need to use UCHAR_MAX here, instead of 256.
>>
>> I think you mean "UCHAR_MAX + 1"
>
> Do we need "UCHAR_MAX + 1L" to cover the case of UCHAR_MAX
> equal to UINT_MAX, say both 0xFFFF ?

Good spot, although I think we'd have to lump such an implementation
in with the DS9K. :-)

Actually, this really is a problem on CSILP32 systems such as
(some) DSPs, and the L suffix doesn't help on such systems.
Nov 14 '05 #8
