468,738 Members | 1,791 Online

# quickest way to determine character sizes

Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?

Nov 14 '05 #1
8 2129
hello smith <ih***********@ihatespammers.com> scribbled the following:
Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127? Thanks in advance.

That's the most efficient algorithm there is, so that's as efficient
as you're going to get, unless your compiler vendor has supplied you
with a library function coded in pure machine code or something.
From an algorithm theory standpoint, you're asking "Can I see whether
each char is less than 127 without looking at each char to see if it's
less than 127?" What would you want? Clairvoyance?

--
/-- Joona Palaste (pa*****@cc.helsinki.fi) ------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"As we all know, the hardware for the PC is great, but the software sucks."
- Petro Tyschtschenko
Nov 14 '05 #2
hello smith wrote:

Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?

Probably not.

For the closely related question "less than 128" it
might be just a smidgen quicker to calculate the inclusive
OR of all the characters and then compare the result to
128 (this generalizes to any power-of-two limiting value).

However, I'll bet that the smidgen saved (if anything
is saved at all) will be too small to make any measurable
difference in the performance of the surrounding program.
Have you forgotten Jackson's Laws of Program Optimization?

First Law of Program Optimization:
Don't do it.

Second Law of Program Optimization (for experts only):
Don't do it yet.

I hope you'll pardon my saying so, but the wording of your
question suggests (in two ways) that the Second Law doesn't
apply to you at this stage in your development. Obey the
First Law until your expertise improves.

--
Er*********@sun.com
Nov 14 '05 #3
hello smith wrote:
Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?

Are you looking for something like:

int checkArray(unsigned char *arr, int len) {
unsigned char temp = 0;

while ( len-- ) temp |= *arr++;
return temp < 127;
}

?

However, I doubt if this has any benefit other than making your code
difficult to understand.

-nrk.

--
Remove devnull for email
Nov 14 '05 #4
nrk wrote:

hello smith wrote:
Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?

Are you looking for something like:

int checkArray(unsigned char *arr, int len) {
unsigned char temp = 0;

while ( len-- ) temp |= *arr++;
return temp < 127;
}

?

However, I doubt if this has any benefit other than making your code
difficult to understand.

It also makes the code wrong: try it on an array
containing the two values 112 and 7.

Of course, this bug could be considered a "benefit"
in that it will teach the programmer (painfully) the
truth of Knuth's dictum: "Premature optimization is the
root of all evil."

--
Er*********@sun.com
Nov 14 '05 #5
Eric Sosman wrote:
nrk wrote:

hello smith wrote:
> Hello,
> I have an unsigned char array. I want to determine if each char's
> ascii
> value is less than 127. Is there a faster way than looping through the
> characters and checking if each is less than 127?
>

Are you looking for something like:

int checkArray(unsigned char *arr, int len) {
unsigned char temp = 0;

while ( len-- ) temp |= *arr++;
return temp < 127;
}

?

However, I doubt if this has any benefit other than making your code
difficult to understand.

It also makes the code wrong: try it on an array
containing the two values 112 and 7.

Of course, this bug could be considered a "benefit"
in that it will teach the programmer (painfully) the
truth of Knuth's dictum: "Premature optimization is the
root of all evil."

Whoops!! As Dan famously puts it "I should've engaged my brain before
posting" :-) The eyes see 127, but the mind wants and pleads for 128.

To OP: Take Eric's advise and discard mine. Or, alternately, if you
believe experience is the best teacher, feel free to use that buggy code.

-nrk.
--
Remove devnull for email
Nov 14 '05 #6
In <cR******************@bgtnsc05-news.ops.worldnet.att.net> hello smith <ih***********@ihatespammers.com> writes:
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?

Yes, if the array is properly aligned and properly sized to be aliased
with an array of unsigned int (things that you can control):

if (UINT_MAX == 4294967295 && CHAR_BIT == 8 && sizeof(unsigned) == 4) {
unsigned acc = 0;
unsigned *p = (unsigned *)array, *q = p + sizeof array / sizeof *p;
while (p < q) acc |= *p++;
if ((acc & 0x80808080) == 0) allascii = 1;
else allascii = 0;
}
else {
/* check char by char */
}

One could argue that the bits inside an int could be randomly distributed,
but this is the kind of risk that you can reasonably take. However, if
you really want, you can explicitly check:

allascii = -1;
if (UINT_MAX == 4294967295 && CHAR_BIT == 8 && sizeof(unsigned) == 4) {
unsigned n = 0x80808080;
unsigned char *p = (unsigned char *)&n;
unsigned test = p[0] | p[1] | p[2] | p[3];
if (test == 0x80) {
unsigned acc = 0;
unsigned *p = (unsigned *)array, *q = p + sizeof array / sizeof *p;
while (p < q) acc |= *p++;
if ((acc & 0x80808080) == 0) allascii = 1;
else allascii = 0;
}
}
if (allascii < 0) {
/* check char by char */
}

There is another way of coding the loop, to avoid scanning the whole
array in case of an early non-ascii character:

allascii = 1;
while (p < q)
if ((*p++ & 0x80808080) != 0) {
allascii = 0;
break;
}

But now, the body of the loop may execute slower, so it's hard to say
which version to prefer, without knowing how your typical data looks
like. If most arrays contain only ascii characters, the original version
is probably better.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #7
Joona I Palaste <pa*****@cc.helsinki.fi> spoke thus:
What would you want? Clairvoyance?

I don't know, I think

int ask_the_oracle( cont char *question, ... );

if( ask_the_oracle("Do all the characters in the array that follows "
"each have an ASCII code less than 127?"
,&my_array) ) {
printf( "Yay!\n" );
}

would be quite useful.

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Nov 14 '05 #8
Joona I Palaste wrote:

hello smith <ih***********@ihatespammers.com> scribbled the following:
Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?

That's the most efficient algorithm there is, so that's as efficient
as you're going to get, unless your compiler vendor has supplied you
with a library function coded in pure machine code or something.
From an algorithm theory standpoint, you're asking "Can I see whether
each char is less than 127 without looking at each char to see if it's
less than 127?" What would you want? Clairvoyance?

If you knew certain things about the array, such as "it's always a
multiple of 4 bytes", you could, perhaps, speed things up by doing
something (probably platform-specific) like:

for each 4-byte entry "*pt" in array
{
if ( (*pt & 0x80808080) != 0 )
return something_is_greater_than_127;
}
return nothing_is_greater_than_127;

Unless, of course, the OP's "less than 127" is correct, and didn't
really mean "less than or equal to 127". In which case... "never
mind".

--

+---------+----------------------------------+-----------------------------+
| Kenneth | kenbrody at spamcop.net | "The opinions expressed |
| J. | http://www.hvcomputer.com | herein are not necessarily |
| Brody | http://www.fptech.com | those of fP Technologies." |
+---------+----------------------------------+-----------------------------+
Nov 14 '05 #9

### This discussion thread is closed

Replies have been disabled for this discussion.