quickest way to determine character sizes

Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?

Thanks in advance.

Nov 14 '05 #1
hello smith <ih***********@ihatespammers.com> scribbled the following:
Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127? Thanks in advance.


That's the most efficient algorithm there is, so that's as efficient
as you're going to get, unless your compiler vendor has supplied you
with a library function coded in pure machine code or something.
From an algorithm theory standpoint, you're asking "Can I see whether
each char is less than 127 without looking at each char to see if it's
less than 127?" What would you want? Clairvoyance?
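
For reference, the straightforward loop being discussed might look like the sketch below. The function name all_below_127 and the size_t length parameter are illustrative choices, not something the original poster supplied.

```c
#include <stddef.h>

/* Returns 1 if every element of arr[0..len-1] is less than 127, else 0.
   The name and signature are assumptions made for illustration. */
int all_below_127(const unsigned char *arr, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (arr[i] >= 127)
            return 0;   /* stop at the first offending byte */
    }
    return 1;           /* an empty array trivially passes */
}
```

Each byte is inspected at most once, and the scan stops early on the first byte that fails.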

--
/-- Joona Palaste (pa*****@cc.helsinki.fi) ------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"As we all know, the hardware for the PC is great, but the software sucks."
- Petro Tyschtschenko
Nov 14 '05 #2
hello smith wrote:

Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?


Probably not.

For the closely related question "less than 128" it
might be just a smidgen quicker to calculate the inclusive
OR of all the characters and then compare the result to
128 (this generalizes to any power-of-two limiting value).
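
A minimal sketch of that OR idea, for the "less than 128" variant only. The name all_below_128 is made up for this example.

```c
#include <stddef.h>

/* Returns 1 if every element is less than 128, else 0.
   OR-ing all bytes together leaves bit 7 (or a higher bit) set in the
   accumulator exactly when at least one byte has such a bit set, so a
   single comparison at the end suffices. This relies on 128 being a
   power of two; it does NOT work for a limit such as 127. */
int all_below_128(const unsigned char *arr, size_t len)
{
    unsigned acc = 0;
    size_t i;

    for (i = 0; i < len; i++)
        acc |= arr[i];
    return acc < 128;
}
```

Note that this version always scans the whole array, whereas a compare-per-character loop can stop at the first failure.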

However, I'll bet that the smidgen saved (if anything
is saved at all) will be too small to make any measurable
difference in the performance of the surrounding program.
Have you forgotten Jackson's Laws of Program Optimization?

First Law of Program Optimization:
Don't do it.

Second Law of Program Optimization (for experts only):
Don't do it yet.

I hope you'll pardon my saying so, but the wording of your
question suggests (in two ways) that the Second Law doesn't
apply to you at this stage in your development. Obey the
First Law until your expertise improves.

--
Er*********@sun.com
Nov 14 '05 #3
nrk
hello smith wrote:
Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?

Thanks in advance.


Are you looking for something like:

int checkArray(unsigned char *arr, int len) {
    unsigned char temp = 0;

    while ( len-- ) temp |= *arr++;
    return temp < 127;
}

?

However, I doubt if this has any benefit other than making your code
difficult to understand.

-nrk.

--
Remove devnull for email
Nov 14 '05 #4
nrk wrote:

hello smith wrote:
Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?

Thanks in advance.


Are you looking for something like:

int checkArray(unsigned char *arr, int len) {
    unsigned char temp = 0;

    while ( len-- ) temp |= *arr++;
    return temp < 127;
}

?

However, I doubt if this has any benefit other than making your code
difficult to understand.


It also makes the code wrong: try it on an array
containing the two values 120 and 7.
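
To make the failure concrete, here is a sketch that puts the posted OR logic next to a plain loop. The names or_check_127 and loop_check_127 are invented for this example. One input on which they disagree is the pair 120 and 7: both values are below 127, but 120 | 7 == 127, so the OR version reports failure.

```c
#include <stddef.h>

/* Same logic as the posted checkArray: OR everything, compare to 127. */
int or_check_127(const unsigned char *arr, size_t len)
{
    unsigned char temp = 0;

    while (len--)
        temp |= *arr++;
    return temp < 127;      /* wrong: the OR can reach 127 from smaller values */
}

/* Reference version: compare each byte individually. */
int loop_check_127(const unsigned char *arr, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (arr[i] >= 127)
            return 0;
    return 1;
}
```

The OR version never wrongly accepts an array, but it wrongly rejects any array whose bytes are all below 127 yet together cover all seven low bits.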

Of course, this bug could be considered a "benefit"
in that it will teach the programmer (painfully) the
truth of Knuth's dictum: "Premature optimization is the
root of all evil."

--
Er*********@sun.com
Nov 14 '05 #5
nrk
Eric Sosman wrote:
nrk wrote:

hello smith wrote:
> Hello,
> I have an unsigned char array. I want to determine if each char's
> ascii
> value is less than 127. Is there a faster way than looping through the
> characters and checking if each is less than 127?
>
> Thanks in advance.


Are you looking for something like:

int checkArray(unsigned char *arr, int len) {
    unsigned char temp = 0;

    while ( len-- ) temp |= *arr++;
    return temp < 127;
}

?

However, I doubt if this has any benefit other than making your code
difficult to understand.


It also makes the code wrong: try it on an array
containing the two values 120 and 7.

Of course, this bug could be considered a "benefit"
in that it will teach the programmer (painfully) the
truth of Knuth's dictum: "Premature optimization is the
root of all evil."


Whoops!! As Dan famously puts it "I should've engaged my brain before
posting" :-) The eyes see 127, but the mind wants and pleads for 128.

To OP: Take Eric's advice and discard mine. Or, alternatively, if you
believe experience is the best teacher, feel free to use that buggy code.

-nrk.
--
Remove devnull for email
Nov 14 '05 #6
In <cR******************@bgtnsc05-news.ops.worldnet.att.net> hello smith <ih***********@ihatespammers.com> writes:
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?


Yes, if the array is properly aligned and properly sized to be aliased
with an array of unsigned int (things that you can control):

if (UINT_MAX == 4294967295 && CHAR_BIT == 8 && sizeof(unsigned) == 4) {
    unsigned acc = 0;
    unsigned *p = (unsigned *)array, *q = p + sizeof array / sizeof *p;
    while (p < q) acc |= *p++;
    if ((acc & 0x80808080) == 0) allascii = 1;
    else allascii = 0;
}
else {
    /* check char by char */
}

One could argue that the bits inside an int could be randomly distributed,
but this is the kind of risk that you can reasonably take. However, if
you really want, you can explicitly check:

allascii = -1;
if (UINT_MAX == 4294967295 && CHAR_BIT == 8 && sizeof(unsigned) == 4) {
    unsigned n = 0x80808080;
    unsigned char *p = (unsigned char *)&n;
    unsigned test = p[0] | p[1] | p[2] | p[3];
    if (test == 0x80) {
        unsigned acc = 0;
        unsigned *p = (unsigned *)array, *q = p + sizeof array / sizeof *p;
        while (p < q) acc |= *p++;
        if ((acc & 0x80808080) == 0) allascii = 1;
        else allascii = 0;
    }
}
if (allascii < 0) {
    /* check char by char */
}

There is another way of coding the loop, to avoid scanning the whole
array in case of an early non-ascii character:

allascii = 1;
while (p < q)
    if ((*p++ & 0x80808080) != 0) {
        allascii = 0;
        break;
    }

But now, the body of the loop may execute slower, so it's hard to say
which version to prefer without knowing what your typical data looks
like. If most arrays contain only ascii characters, the original version
is probably better.
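
For readers who cannot guarantee the alignment and size conditions above, a variation is sketched below. It is not the code posted in this message: it assumes a C99 compiler with <stdint.h>, copies each 4-byte group into a uint32_t with memcpy instead of casting the pointer, and handles leftover bytes separately. The name all_ascii_words is hypothetical. Like the code above, it tests for "less than 128", not "less than 127".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if no byte in buf[0..len-1] has its top bit set, else 0. */
int all_ascii_words(const unsigned char *buf, size_t len)
{
    uint32_t acc = 0;
    unsigned char tail = 0;
    size_t i = 0;

    /* Whole 4-byte groups: memcpy sidesteps alignment and aliasing issues. */
    for (; i + 4 <= len; i += 4) {
        uint32_t w;
        memcpy(&w, buf + i, sizeof w);
        acc |= w;
    }

    /* Up to three leftover bytes, checked one at a time. */
    for (; i < len; i++)
        tail |= buf[i];
    acc |= tail;

    /* The mask is the same in every byte, so byte order does not matter. */
    return (acc & UINT32_C(0x80808080)) == 0;
}
```

Whether this beats the byte loop depends on the compiler and the data; measuring is the only way to know.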

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #7
Joona I Palaste <pa*****@cc.helsinki.fi> spoke thus:
What would you want? Clairvoyance?


I don't know, I think

int ask_the_oracle( const char *question, ... );

if( ask_the_oracle("Do all the characters in the array that follows "
"each have an ASCII code less than 127?"
,&my_array) ) {
printf( "Yay!\n" );
}

would be quite useful.

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Nov 14 '05 #8
Joona I Palaste wrote:

hello smith <ih***********@ihatespammers.com> scribbled the following:
Hello,
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?

Thanks in advance.


That's the most efficient algorithm there is, so that's as efficient
as you're going to get, unless your compiler vendor has supplied you
with a library function coded in pure machine code or something.
From an algorithm theory standpoint, you're asking "Can I see whether
each char is less than 127 without looking at each char to see if it's
less than 127?" What would you want? Clairvoyance?


If you knew certain things about the array, such as "it's always a
multiple of 4 bytes", you could, perhaps, speed things up by doing
something (probably platform-specific) like:

for each 4-byte entry "*pt" in array
{
if ( (*pt & 0x80808080) != 0 )
return something_is_greater_than_127;
}
return nothing_is_greater_than_127;

Unless, of course, the OP's "less than 127" is correct, and didn't
really mean "less than or equal to 127". In which case... "never
mind".
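
If the limit really is "less than 127", a word-at-a-time test is still possible with one extra step. The sketch below is an addition, not something proposed in this thread, and the name all_below_127_words is made up. It assumes C99 with <stdint.h>. The idea: the word itself flags bytes of 128 or more through their top bit, and adding 1 to every byte turns a byte of exactly 127 (0x7F) into 0x80, which sets the top bit as well. When no byte already has its top bit set, the per-byte additions cannot carry into a neighbouring byte, so there are no false alarms.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if every byte in buf[0..len-1] is less than 127, else 0. */
int all_below_127_words(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    for (; i + 4 <= len; i += 4) {
        uint32_t w;
        memcpy(&w, buf + i, sizeof w);
        /* w catches bytes >= 128; w + 0x01010101 catches bytes == 127. */
        if (((w | (w + UINT32_C(0x01010101))) & UINT32_C(0x80808080)) != 0)
            return 0;
    }

    /* Leftover bytes are compared directly. */
    for (; i < len; i++)
        if (buf[i] >= 127)
            return 0;
    return 1;
}
```

As with the other variants, this is only worth the loss of readability if profiling shows the check is a bottleneck.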

--

+---------+----------------------------------+-----------------------------+
| Kenneth | kenbrody at spamcop.net | "The opinions expressed |
| J. | http://www.hvcomputer.com | herein are not necessarily |
| Brody | http://www.fptech.com | those of fP Technologies." |
+---------+----------------------------------+-----------------------------+
Nov 14 '05 #9
