quickest way to determine character sizes

Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?

Thanks in advance.

Nov 14 '05 #1
hello smith <ih***********@ihatespammers.com> scribbled the following:
Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127? Thanks in advance.


That's the most efficient algorithm there is, so that's as efficient
as you're going to get, unless your compiler vendor has supplied you
with a library function coded in pure machine code or something.
From an algorithm theory standpoint, you're asking "Can I see whether
each char is less than 127 without looking at each char to see if it's
less than 127?" What would you want? Clairvoyance?
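For reference, the plain loop being discussed might look like the sketch below. The function name all_below_127 is made up for illustration, and it takes the OP's "less than 127" literally, so a byte equal to 127 counts as a failure.

```c
#include <stddef.h>

/* Returns 1 if every byte in arr[0..len-1] is strictly less than 127,
   otherwise 0. Stops at the first offending byte. An empty array is
   treated as "all below 127". */
int all_below_127(const unsigned char *arr, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (arr[i] >= 127)
            return 0;
    }
    return 1;
}
```

Every byte has to be inspected at least once in the all-pass case, so any speedup can only come from a smaller constant factor, not from a better algorithm.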

--
/-- Joona Palaste (pa*****@cc.helsinki.fi) ------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"As we all know, the hardware for the PC is great, but the software sucks."
- Petro Tyschtschenko
Nov 14 '05 #2
hello smith wrote:

Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?


Probably not.

For the closely related question "less than 128" it
might be just a smidgen quicker to calculate the inclusive
OR of all the characters and then compare the result to
128 (this generalizes to any power-of-two limiting value).
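A minimal sketch of that inclusive-OR idea, for the "less than 128" question only. The name all_below_128 is hypothetical. It works because a value is below 128 exactly when none of its bits from bit 7 upward are set, and OR-ing values together can never clear a bit.

```c
#include <stddef.h>

/* Returns 1 if every byte in arr[0..len-1] is less than 128, otherwise 0.
   Accumulates the inclusive OR of all bytes and does one comparison at
   the end, so it always scans the whole array. */
int all_below_128(const unsigned char *arr, size_t len)
{
    unsigned char acc = 0;
    size_t i;

    for (i = 0; i < len; i++)
        acc |= arr[i];
    return acc < 128;
}
```

The same trick does not carry over to a limit such as 127 that is not a power of two.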

However, I'll bet that the smidgen saved (if anything
is saved at all) will be too small to make any measurable
difference in the performance of the surrounding program.
Have you forgotten Jackson's Laws of Program Optimization?

First Law of Program Optimization:
Don't do it.

Second Law of Program Optimization (for experts only):
Don't do it yet.

I hope you'll pardon my saying so, but the wording of your
question suggests (in two ways) that the Second Law doesn't
apply to you at this stage in your development. Obey the
First Law until your expertise improves.

--
Er*********@sun.com
Nov 14 '05 #3
nrk
hello smith wrote:
Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?

Thanks in advance.


Are you looking for something like:

int checkArray(unsigned char *arr, int len) {
    unsigned char temp = 0;

    while ( len-- ) temp |= *arr++;
    return temp < 127;
}

?

However, I doubt if this has any benefit other than making your code
difficult to understand.

-nrk.

--
Remove devnull for email
Nov 14 '05 #4
nrk wrote:

hello smith wrote:
Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?

Thanks in advance.


Are you looking for something like:

int checkArray(unsigned char *arr, int len) {
    unsigned char temp = 0;

    while ( len-- ) temp |= *arr++;
    return temp < 127;
}

?

However, I doubt if this has any benefit other than making your code
difficult to understand.


It also makes the code wrong: try it on an array
containing the two values 120 and 7 (120 | 7 == 127).
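To make the failure concrete, here is a small sketch that puts the OR-based check next to a plain loop. Both function names are made up for this illustration. The OR of several values that are each below 127 can still come out as exactly 127 (for instance 120 | 7), and then the OR-based version wrongly reports a failure.

```c
#include <stddef.h>

/* The OR-based check with the "< 127" comparison: gives a wrong answer
   whenever the OR of the bytes is 127 although no single byte is. */
int check_array_or_127(const unsigned char *arr, size_t len)
{
    unsigned char temp = 0;

    while (len--)
        temp |= *arr++;
    return temp < 127;
}

/* Reference version: compares each byte individually. */
int check_array_loop(const unsigned char *arr, size_t len)
{
    while (len--) {
        if (*arr++ >= 127)
            return 0;
    }
    return 1;
}
```

For the array {120, 7} the reference loop returns 1 while the OR-based check returns 0.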

Of course, this bug could be considered a "benefit"
in that it will teach the programmer (painfully) the
truth of Knuth's dictum: "Premature optimization is the
root of all evil."

--
Er*********@sun.com
Nov 14 '05 #5
nrk
Eric Sosman wrote:
nrk wrote:

hello smith wrote:
> Hello,
> I have an unsigned char array. I want to determine if each char's
> ascii
> value is less than 127. Is there a faster way than looping through the
> characters and checking if each is less than 127?
>
> Thanks in advance.


Are you looking for something like:

int checkArray(unsigned char *arr, int len) {
    unsigned char temp = 0;

    while ( len-- ) temp |= *arr++;
    return temp < 127;
}

?

However, I doubt if this has any benefit other than making your code
difficult to understand.


It also makes the code wrong: try it on an array
containing the two values 120 and 7 (120 | 7 == 127).

Of course, this bug could be considered a "benefit"
in that it will teach the programmer (painfully) the
truth of Knuth's dictum: "Premature optimization is the
root of all evil."


Whoops!! As Dan famously puts it "I should've engaged my brain before
posting" :-) The eyes see 127, but the mind wants and pleads for 128.

To OP: Take Eric's advice and discard mine. Or, alternatively, if you
believe experience is the best teacher, feel free to use that buggy code.

-nrk.
--
Remove devnull for email
Nov 14 '05 #6
In <cR******************@bgtnsc05-news.ops.worldnet.att.net> hello smith <ih***********@ihatespammers.com> writes:
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?


Yes, if the array is properly aligned and properly sized to be aliased
with an array of unsigned int (things that you can control):

if (UINT_MAX == 4294967295 && CHAR_BIT == 8 && sizeof(unsigned) == 4) {
    unsigned acc = 0;
    unsigned *p = (unsigned *)array, *q = p + sizeof array / sizeof *p;
    while (p < q) acc |= *p++;
    if ((acc & 0x80808080) == 0) allascii = 1;
    else allascii = 0;
}
else {
    /* check char by char */
}

One could argue that the bits inside an int could be randomly distributed,
but this is the kind of risk that you can reasonably take. However, if
you really want, you can explicitly check:

allascii = -1;
if (UINT_MAX == 4294967295 && CHAR_BIT == 8 && sizeof(unsigned) == 4) {
    unsigned n = 0x80808080;
    unsigned char *p = (unsigned char *)&n;
    unsigned test = p[0] | p[1] | p[2] | p[3];
    if (test == 0x80) {
        unsigned acc = 0;
        unsigned *p = (unsigned *)array, *q = p + sizeof array / sizeof *p;
        while (p < q) acc |= *p++;
        if ((acc & 0x80808080) == 0) allascii = 1;
        else allascii = 0;
    }
}
if (allascii < 0) {
    /* check char by char */
}

There is another way of coding the loop, to avoid scanning the whole
array in case of an early non-ascii character:

allascii = 1;
while (p < q)
    if ((*p++ & 0x80808080) != 0) {
        allascii = 0;
        break;
    }

But now, the body of the loop may execute slower, so it's hard to say
which version to prefer without knowing what your typical data looks
like. If most arrays contain only ascii characters, the original version
is probably better.
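For readers who cannot guarantee the alignment and size conditions, one possible variation on the word-at-a-time idea is sketched below. This is not the code from the post above: the function name all_ascii_wordwise is hypothetical, it assumes a C99 <stdint.h> that provides uint32_t with CHAR_BIT == 8, and it copies each group of four bytes with memcpy instead of casting the pointer. Byte order does not matter because the mask is the same in every byte position.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if no byte in buf[0..len-1] has its top bit set (that is,
   every byte is <= 127), otherwise 0. Works for any length and any
   alignment of buf. */
int all_ascii_wordwise(const unsigned char *buf, size_t len)
{
    uint32_t acc = 0;
    size_t i = 0;

    /* Whole 4-byte groups: copy into a uint32_t and OR it into acc. */
    for (; i + 4 <= len; i += 4) {
        uint32_t w;
        memcpy(&w, buf + i, sizeof w);
        acc |= w;
    }

    /* Leftover 0..3 bytes land in the low byte of acc; only bit 7 of
       each of them can affect the masked result. */
    for (; i < len; i++)
        acc |= buf[i];

    return (acc & UINT32_C(0x80808080)) == 0;
}
```

Whether this beats the byte loop depends on the compiler and the data, so it is worth measuring before adopting it.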

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #7
Joona I Palaste <pa*****@cc.helsinki.fi> spoke thus:
What would you want? Clairvoyance?


I don't know, I think

int ask_the_oracle( const char *question, ... );

if( ask_the_oracle("Do all the characters in the array that follows "
                   "each have an ASCII code less than 127?"
                   ,&my_array) ) {
    printf( "Yay!\n" );
}

would be quite useful.

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Nov 14 '05 #8
Joona I Palaste wrote:

hello smith <ih***********@ihatespammers.com> scribbled the following:
Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?

Thanks in advance.


That's the most efficient algorithm there is, so that's as efficient
as you're going to get, unless your compiler vendor has supplied you
with a library function coded in pure machine code or something.
From an algorithm theory standpoint, you're asking "Can I see whether
each char is less than 127 without looking at each char to see if it's
less than 127?" What would you want? Clairvoyance?


If you knew certain things about the array, such as "it's always a
multiple of 4 bytes", you could, perhaps, speed things up by doing
something (probably platform-specific) like:

for each 4-byte entry "*pt" in array
{
if ( (*pt & 0x80808080) != 0 )
return something_is_greater_than_127;
}
return nothing_is_greater_than_127;

Unless, of course, the OP's "less than 127" is correct, and didn't
really mean "less than or equal to 127". In which case... "never
mind".
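The pseudocode above could be turned into C along the following lines. This is only a sketch under stated assumptions: both function names are invented here, len must be a multiple of 4, and uint32_t from <stdint.h> with CHAR_BIT == 8 is assumed. The second function is an extra, not from the post: it covers the literal "less than 127" case by adding 1 to every byte lane, which sets the top bit of a lane exactly when that byte was 127 (no carry can cross lanes while every byte is at most 127, and a byte above 127 is already caught by its own top bit).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if every byte is <= 127, otherwise 0. Leaves the loop at
   the first 4-byte group that contains a byte with its top bit set.
   Precondition: len is a multiple of 4. */
int none_above_127(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i += 4) {
        uint32_t w;
        memcpy(&w, buf + i, sizeof w);
        if ((w & UINT32_C(0x80808080)) != 0)
            return 0;
    }
    return 1;
}

/* Strict variant: returns 1 if every byte is < 127, otherwise 0.
   Precondition: len is a multiple of 4. */
int none_above_126(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i += 4) {
        uint32_t w, plus1;
        memcpy(&w, buf + i, sizeof w);
        /* Add 1 to each byte lane; a lane holding 127 becomes 128. */
        plus1 = (uint32_t)(w + UINT32_C(0x01010101));
        if (((w | plus1) & UINT32_C(0x80808080)) != 0)
            return 0;
    }
    return 1;
}
```

An array whose length is not a multiple of 4 would need its last few bytes checked one at a time.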

--

+---------+----------------------------------+-----------------------------+
| Kenneth | kenbrody at spamcop.net | "The opinions expressed |
| J. | http://www.hvcomputer.com | herein are not necessarily |
| Brody | http://www.fptech.com | those of fP Technologies." |
+---------+----------------------------------+-----------------------------+
Nov 14 '05 #9
