is there a better way to optimise this code

Sid
Hi,

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

I would greatly appreciate any feedback
Nov 14 '05 #1
Sid wrote:

I am writing an application where I look for a white pixel by
testing if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming
RGB is a pointer to unsigned char)

This statement gets executed for all the pixels in a page, so
if I can find a better way to do this, I could potentially
save a lot of cpu cycles.


Most of the time only the first portion of that test will be
executed, so the efficiency is better than it appears. The order
of the tests might make a difference.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 14 '05 #2
si***********@gmail.com (Sid) writes:
I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.


It's possible that something like

if ((RGB[0]<<16 | RGB[1]<<8 | RGB[2]) == 0xFFFFFF)

might be faster, since it avoids the conditional branches that result
from a straightforward compilation of the "&&" operator. As always,
the only way to be sure is to measure the actual performance.
Examining the generated code might also be instructive.

Some judicious casting might be called for; if unsigned int is less
than 24 bits wide, it could overflow.
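
A minimal sketch of the packed comparison with the casts added, assuming RGB points at three consecutive unsigned char values of 8 bits each. The helper name is_white_packed is made up for illustration. unsigned long is guaranteed to be at least 32 bits wide, so the 16-bit shift cannot overflow:

```c
/* Hypothetical helper: pack the three channels into one value and
 * compare once. unsigned long is at least 32 bits in standard C, so
 * the 16-bit shift is safe even where int is only 16 bits wide. */
static int is_white_packed(const unsigned char *rgb)
{
    unsigned long packed = ((unsigned long)rgb[0] << 16)
                         | ((unsigned long)rgb[1] << 8)
                         |  (unsigned long)rgb[2];

    return packed == 0xFFFFFFUL;
}
```

Whether this beats the "&&" version still has to be measured on the target compiler and data.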

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #3
I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)


Hi,

maybe

if(RGB[0] & RGB[1] & RGB[2] )

You should compile different solutions and compare assembly generated ...
JMB.
Nov 14 '05 #4
Hi,

maybe

if(RGB[0] & RGB[1] & RGB[2] )


if((RGB[0] & RGB[1] & RGB[2])==255 )

should work better
Nov 14 '05 #5
CBFalconer <cb********@yahoo.com> writes:
Sid wrote:
I am writing an application where I look for a white pixel by
testing if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming
RGB is a pointer to unsigned char)

This statement gets executed for all the pixels in a page, so
if I can find a better way to do this, I could potentially
save a lot of cpu cycles.


Most of the time only the first portion of that test will be
executed, so the efficiency is better than it appears. The order
of the tests might make a difference.


That might or might not speed things up. Conditional branches can
sometimes cause problems with instruction pipelines (though if that's
the case, the compiler should be smart enough to evaluate the
subexpressions unconditionally).

Try several methods, measure the results, and be prepared for the
tradeoffs to change with the next version of the hardware and/or
compiler.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #6
On Tue, 2004-08-24 at 21:35 -0700, Sid wrote:
Hi,

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

I would greatly appreciate any feedback


Depending on your desire for portability, you may want to look into
optimizing it with platform-specific SIMD code, such as MMX/SSE
instructions on Intel platforms.

Fredrik Tolf
Nov 14 '05 #7
si***********@gmail.com (Sid) wrote in message news:<ba**************************@posting.google. com>...
Hi,

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

I would greatly appreciate any feedback


would this help any way?

typedef union
{
unsigned a;
unsigned char val[3];
}RGB;

int main(void)
{
RGB rgb;
rgb.a = 0;

rgb.val[0] = 255;
rgb.val[1] = 255;
rgb.val[2] = 255;

if(0x00ffffff == rgb.a){
}

return 0;
}

-Paul
Nov 14 '05 #8
In <41**********************@feed0.news.be.easynet.ne t> "Jean-Michel Bechet" <jmb@_PAS_DE_PUB_bea.be> writes:

>> I am writing an application where I look for a white pixel by testing
>> if all the R,G,B values are 255 i.e. I use
>>
>> if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
>> pointer to unsigned char)
>
> maybe
>
> if(RGB[0] & RGB[1] & RGB[2] )

It's actually

if ((RGB[0] & RGB[1] & RGB[2]) == 255) ...

> You should compile different solutions and compare assembly generated ...


Which is going to tell you exactly zilch about which version is likely to
be faster, because the actual data has a strong influence on the original
code behaviour: if RGB[0] != 255 for most pixels, the original version
is likely to execute in fewer CPU cycles, because it completely ignores
RGB[1] and RGB[2].

The right thing is to ignore the assembly generated by the compiler and
to benchmark the two versions on typical data sets.
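
A rough sketch of such a benchmark, assuming packed 3-byte pixels in a buffer of representative image data. The function names are hypothetical, and clock() is only a coarse timer, so the buffer should be large (or the run repeated) to get meaningful numbers:

```c
#include <stddef.h>
#include <time.h>

/* The two candidate tests from this thread. */
static int white_and_and(const unsigned char *p)
{
    return p[0] == 255 && p[1] == 255 && p[2] == 255;
}

static int white_bitand(const unsigned char *p)
{
    return (p[0] & p[1] & p[2]) == 255;
}

/* Hypothetical harness: counts white pixels in buf with the given
 * test and stores the CPU time used, in seconds, in *seconds. */
static size_t count_white(const unsigned char *buf, size_t npixels,
                          int (*test)(const unsigned char *),
                          double *seconds)
{
    size_t i, count = 0;
    clock_t start = clock();

    for (i = 0; i < npixels; i++)
        if (test(buf + 3 * i))
            count++;

    *seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
    return count;
}
```

Calling through a function pointer adds overhead of its own and may prevent inlining, so for a serious measurement each loop is better written out with the test in place. Both versions must also report the same count, which doubles as a correctness check.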

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #9
In article <d8**************************@posting.google.com >,
Paul <bg*****@yahoo.com> wrote:
typedef union
{
unsigned a;
unsigned char val[3];
}RGB;

int main(void)
{
RGB rgb;
rgb.a = 0;

rgb.val[0] = 255;
rgb.val[1] = 255;
rgb.val[2] = 255;

if(0x00ffffff == rgb.a){


This relies on the representation of integers. On a typical
little-endian machine with integers of 32 bits it will work.
On a big-endian machine, or one with 16-bit integers, or one
with some strange implementation quirk, it won't.

Unless your code is already conditionalised for such things, forget
it.

It may well be worthwhile packing the values into a suitably-sized
integer in a more portable way (eg shifting and adding), especially
if you pass them around as a group in other contexts.

-- Richard
Nov 14 '05 #10
Keith Thompson wrote:
si***********@gmail.com (Sid) writes:
I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

It's possible that something like

if ((RGB[0]<<16 | RGB[1]<<8 | RGB[2]) == 0xFFFFFF)

might be faster, since it avoids the conditional branches that result
from a straightforward compilation of the "&&" operator. As always,
the only way to be sure is to measure the actual performance.
Examining the generated code might also be instructive.

Some judicious casting might be called for; if unsigned int is less
than 24 bits wide, it could overflow.


Sometimes the overhead of converting multiple bytes into a word
is greater than that of the multiple comparisons. The truth will be
told by the assembly listing.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.comeaucomputing.com/learn/faq/
Other sites:
http://www.josuttis.com -- C++ STL Library book

Nov 14 '05 #11
Sid wrote:
Hi,

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

I would greatly appreciate any feedback


Here are some suggestions:

1. If you are testing many pixels against a known value, you
may want to put the value into a variable, then suggest
to the compiler to use a register for that variable.

unsigned char compare_value = 255;
if ( RGB[0] == compare_value
&& RGB[1] == compare_value
&& RGB[2] == compare_value)

Although many compilers may already do that.

2. Change the order of the expressions. The one that is
most likely to be false should be first.

3. Invert (or reverse) the color scheme so that
white has a value of zero. Many processors have
special instructions for "jumping on zero".

4. Convert the function to assembly language to take
advantage of special processor instructions (or
rewrite the code to make the compiler use those
instructions).

Example: The ARM processor can conditionally
execute instructions (in 32-bit mode). This fragment
may be more efficient because it reduces the possibility
of branching.
register unsigned char result;
register unsigned char compare_value = 255;
register unsigned char rgb0, rgb1, rgb2;

rgb0 = RGB[0]; /* This pattern reflects the ARM */
rgb1 = RGB[1]; /* instruction for loading multiple */
rgb2 = RGB[2]; /* registers from memory. */

result = rgb0 == compare_value;
result = result && rgb1 == compare_value;
result = result && rgb2 == compare_value;
if (result)
{
/* ... */
}

5. Simplify your function by using Boolean Algebra.
Convert the "if" statements into the algebra.
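
Suggestions 3 and 5 can be combined. A minimal sketch, assuming the pixel data is stored inverted (each channel holds 255 minus the real value, so white is 0,0,0); the function name is hypothetical:

```c
/* Hypothetical: pixels stored inverted, so white is {0, 0, 0}.
 * OR-ing the channels gives zero only when all three are zero. */
static int is_white_inverted(const unsigned char *inv)
{
    return (inv[0] | inv[1] | inv[2]) == 0;
}
```

This replaces three comparisons and two "&&" operators with two ORs and one test against zero. Whether that is a net win depends on the cost of inverting the data in the first place.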

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.comeaucomputing.com/learn/faq/
Other sites:
http://www.josuttis.com -- C++ STL Library book

Nov 14 '05 #12
Keith Thompson wrote:
si***********@gmail.com (Sid) writes:
I am writing an application where I look for a white pixel by
testing if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming
RGB is a pointer to unsigned char)

This statement gets executed for all the pixels in a page, so
if I can find a better way to do this, I could potentially save
a lot of cpu cycles.


It's possible that something like

if ((RGB[0]<<16 | RGB[1]<<8 | RGB[2]) == 0xFFFFFF)

might be faster, since it avoids the conditional branches that
result from a straightforward compilation of the "&&" operator.
As always, the only way to be sure is to measure the actual
performance. Examining the generated code might also be
instructive.

Some judicious casting might be called for; if unsigned int is
less than 24 bits wide, it could overflow.


Another candidate for testing might be:

if (0xff == (RGB[0] & RGB[1] & RGB[2]))

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 14 '05 #13
si***********@gmail.com (Sid) wrote in message news:<ba**************************@posting.google. com>...
Hi,

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

I would greatly appreciate any feedback


You could try typecasting the pointer to an integer type big enough to
see all values. i.e., something like --

if ((*(long *)RGB & 0x00ffffff) == 0x00ffffff) {...}

for a little endian machine, or

if ((*(long *)RGB & 0xffffff00) == 0xffffff00) {...}

for a big endian machine. This assumes there is no problem
referencing one byte off of the end of the table for the last RGB
entry, that alignment issues work out (they may or may not) both in
terms of validity and performance, and that there are no other hidden
gotchas. This approach is not particularly portable -- in fact it is
not portable at all, but that does not sound like your concern.

I would strongly recommend in any case encapsulating this logic in a
macro, e.g. -

#define IsWhite(RGB) ((*(long *)(RGB) & 0x00ffffff) == 0x00ffffff)

This should aid in experimentation and make it easier to maintain over
the long haul.

The only way to optimize code in a specialized situation like this is
to try anything that has a reasonable chance of working and then test
it for correctness and benchmark it to see what works best.
Nov 14 '05 #14
> I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.


I'm going to be flamed to death for posting non-portable code to comp.lang.c,
but here's something you could try. You may need to adjust it slightly for
better performance if your target architecture doesn't use 4-byte words.

Have fun.
#include <stdint.h> /* any C99-compliant compiler should provide this */

/* define BYTE_ORDER_VAX for little endian, BYTE_ORDER_NETWORK for big endian */
#define BYTE_ORDER_VAX 1

/*
* find_white --
*
* returns the index of the first white pixel in 'data', an array of
* packed 'npixels' pixels, or -1 if we couldn't find one.
*
* a 'pixel' is a group of three bytes; it's white when all three bytes are
* set to 0xff.
*/
int
find_white(const uint8_t *data, int npixels)
{
    const uint32_t *p = (const uint32_t *)data;
    const uint8_t *q;
    int n = (3*npixels) / 12;   /* # of groups of 4 pixels (3 words) */
    int r = npixels - n*4;      /* # of remaining pixels */
    int i;

    for (i = 0; i < n; i++) {
#if BYTE_ORDER_NETWORK /* UNTESTED */
        if ((p[0] & 0xffffff00) == 0xffffff00)
            return i*4;

        if (((p[0] & 0x000000ff) == 0x000000ff) &&
            (p[1] & 0xffff0000) == 0xffff0000)
            return i*4 + 1;

        if (((p[1] & 0x0000ffff) == 0x0000ffff) &&
            (p[2] & 0xff000000) == 0xff000000)
            return i*4 + 2;

        /* fourth pixel: last three bytes of the third word, p[2] */
        if ((p[2] & 0x00ffffff) == 0x00ffffff)
            return i*4 + 3;
#elif BYTE_ORDER_VAX
        if ((p[0] & 0x00ffffff) == 0x00ffffff)
            return i*4;

        if (((p[0] & 0xff000000) == 0xff000000) &&
            (p[1] & 0x0000ffff) == 0x0000ffff)
            return i*4 + 1;

        if (((p[1] & 0xffff0000) == 0xffff0000) &&
            (p[2] & 0x000000ff) == 0x000000ff)
            return i*4 + 2;

        /* fourth pixel: last three bytes of the third word, p[2] */
        if ((p[2] & 0xffffff00) == 0xffffff00)
            return i*4 + 3;
#else
#error "define either BYTE_ORDER_NETWORK or BYTE_ORDER_VAX"
#endif
        p += 3;
    }

    q = (const uint8_t *)p;

    for (i = 0; i < r; i++) {
        if (q[0] == 0xff && q[1] == 0xff && q[2] == 0xff)
            return n*4 + i;

        q += 3;
    }

    return -1;
}
Nov 14 '05 #15
Sid wrote:
Hi,

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

I would greatly appreciate any feedback


One possibility I haven't seen mentioned yet is to
use memchr() to search the RGB array for a 255 byte. If
you find one, you then need to check whether it's in an
R position as opposed to a G or B, which you can do by
computing the distance from &RGB[0] and %'ing by three.
Then you can check the G and B values in the obvious way.
This approach requires a relatively large amount of fooling
around after each successful memchr(), so it's likely to be
advantageous only if 255's are fairly rare and if memchr()
is significantly faster than a simple loop.

As with the other suggested methods, the only way to
tell whether this will be faster or slower than what you've
already got is to measure.
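
A minimal sketch of the memchr() approach, assuming the image is one contiguous buffer of packed 3-byte pixels. The function name and the choice to return a pixel index are made up for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Returns the index of the first white pixel in data (npixels packed
 * 3-byte pixels), or -1 if there is none. Hypothetical name. */
static long find_white_memchr(const unsigned char *data, size_t npixels)
{
    const unsigned char *end = data + 3 * npixels;
    const unsigned char *p = data;

    while (p < end) {
        const unsigned char *hit = memchr(p, 255, (size_t)(end - p));
        size_t offset;

        if (hit == NULL)
            break;                        /* no 255 byte left at all */

        offset = (size_t)(hit - data);
        if (offset % 3 == 0 && hit[1] == 255 && hit[2] == 255)
            return (long)(offset / 3);    /* 255 in an R position, G and B match */

        p = hit + 1;                      /* keep searching after this byte */
    }
    return -1;
}
```

On an image that is mostly white, nearly every memchr() call returns immediately and the bookkeeping dominates, which is the case where this is expected to lose.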

--
Er*********@sun.com

Nov 14 '05 #16

On Wed, 24 Aug 2004, Sid wrote:

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)


I am absolutely amazed that no one has mentioned the obvious. You
just want to compare one block of unsigned chars to another block;
so write what you mean!

#include <string.h>

unsigned char to_find[3] = {255, 255, 255};

if (memcmp(RGB, to_find, 3) == 0) { ... }

This is the sort of code that gcc loves to optimize, and I would
be surprised if it compared unfavorably to /any/ of the other proposed
solutions, given a modern compiler.

-Arthur
Nov 14 '05 #17
"Arthur J. O'Dwyer" wrote:
On Wed, 24 Aug 2004, Sid wrote:

I am writing an application where I look for a white pixel by
testing if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming
RGB is a pointer to unsigned char)


I am absolutely amazed that no one has mentioned the obvious.
You just want to compare one block of unsigned chars to another
block; so write what you mean!

#include <string.h>

unsigned char to_find[3] = {255, 255, 255};

if (memcmp(RGB, to_find, 3) == 0) { ... }

This is the sort of code that gcc loves to optimize, and I
would be surprised if it compared unfavorably to /any/ of the
other proposed solutions, given a modern compiler.


It is not quite that easy. That will find:

{{0, 0, 255} {255, 255, 0}}

and other similar things. You have to impose the modulo 3 on the
results (which may well not be 3, depending on how things have
been declared).

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 14 '05 #18
si***********@gmail.com (Sid) wrote:

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.


If RGB is an array of unsigned char, then you could try:

if (memcmp(RGB, "\xFF\xFF\xFF", 3) == 0)

because many compilers have special optimisations for "memcmp"
and "memset" and "memcpy". You may also find a speed increase
if you bump it up to an int size, eg. make RGB[3] always be 0xFF
and compare RGB to "\xFF\xFF\xFF\xFF".

Of course what you should do is benchmark a few options and see
what's best for your system.

NB. strictly speaking you can't compare unsigned char to
signed char this way, but in practice it always works, and
is certainly better than comparing a long.
Nov 14 '05 #19
ri*****@cogsci.ed.ac.uk (Richard Tobin) wrote in message news:<cg***********@pc-news.cogsci.ed.ac.uk>...
In article <d8**************************@posting.google.com >,
Paul <bg*****@yahoo.com> wrote:
typedef union
{
unsigned a;
unsigned char val[3];
}RGB;

int main(void)
{
RGB rgb;
rgb.a = 0;

rgb.val[0] = 255;
rgb.val[1] = 255;
rgb.val[2] = 255;

if(0x00ffffff == rgb.a){


This relies on the representation of integers. On a typical
little-endian machine with integers of 32 bits it will work.
On a big-endian machine, or one with 16-bit integers, or one
with some strange implementation quirk, it won't.

Unless your code is already conditionalised for such things, forget
it.

It may well be worthwhile packing the values into a suitably-sized
integer in a more portable way (eg shifting and adding), especially
if you pass them around as a group in other contexts.

-- Richard


Of course this is platform dependent; I never said otherwise, though.
This is just to give him/her an idea. It is up to him/her to use it or
trash it, based on the platform.

-Paul.
Nov 14 '05 #20
>>> I am writing an application where I look for a white pixel by
testing if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming
RGB is a pointer to unsigned char)

Use XOR rather than a comparison. Much faster (or at least it was back
in the days of the 386 and 486... I assume it still is).

If short int is 2 bytes on your hardware (check, but it probably is) do
the first XOR on a short int at RGB[0] and the second on a char at RGB[2].

This last part might not matter depending on how smart the compiler is
but if you are going sequentially through the entire array then don't use
an array+offset but rather set a pointer to the base before you enter
the loop and increment it.

unsigned char test[1024*768*3]; /* just guessing at an image size */
unsigned char *pointer = test;
unsigned char *limit = test + sizeof test;

do {
    /* first two bytes as one short, third byte on its own */
    if (!(*(unsigned short *)pointer ^ 0xffff)
        && !(pointer[2] ^ 0xff))
        printf("white\n");
    pointer += 3;   /* advance one pixel (three bytes) */
} while (pointer < limit);

I think this should work, provided short int really is 2 bytes... check
my logic on this, as I'm more than a little rusty, to put it lightly.

Oh, and I don't know if this is still true with modern processors,
but you might see a benefit from always loading the word (16 bits) from
an even address. If you check the pointer first and then (if the
pointer is even) do short int, char, char, short int in each loop
iteration (checking two pixels per iteration), you will always load a
word from an even address rather than an odd one. If the base pointer
is odd, start with char. I don't even know if this still matters,
though. It used to be that the x86s would load a word from an odd base
in two fetches and from an even one in one. Anyone know if this still
applies to Pentium and newer processors?

After all of this if things are STILL too slow, about your only other
option would be to reduce the number of loops by repeating the same
instructions several times in the loop... ie check pixel one, then two,
then three, then four, THEN loop. that way there are 1/4 as many jumps
back to the beginning of the loop.

Nov 14 '05 #21
In article <41***************@yahoo.com>,
CBFalconer <cb********@worldnet.att.net> wrote:
"Arthur J. O'Dwyer" wrote:
On Wed, 24 Aug 2004, Sid wrote:

I am writing an application where I look for a white pixel by
testing if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming
RGB is a pointer to unsigned char)


I am absolutely amazed that no one has mentioned the obvious.
You just want to compare one block of unsigned chars to another
block; so write what you mean!

#include <string.h>

unsigned char to_find[3] = {255, 255, 255};

if (memcmp(RGB, to_find, 3) == 0) { ... }

This is the sort of code that gcc loves to optimize, and I
would be surprised if it compared unfavorably to /any/ of the
other proposed solutions, given a modern compiler.


It is not quite that easy. That will find:

{{0, 0, 255} {255, 255, 0}}

and other similar things. You have to impose the modulo 3 on the
results (which may well not be 3, depending on how things have
been declared).


I haven't had my coffee yet today, so you could be right, but your
comment looks wrong to me.

Arthur's code is comparing the array-of-three-bytes pointed to by RGB
against a pointer to three bytes that all have value 255; if this code
is run for each pixel with RGB pointing at the RGB values for that pixel,
it should do The Right Thing. It's not trying to walk through the whole
pixel array and look for any string of three consecutive 255s.
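
To illustrate the point, a minimal sketch of the per-pixel use, assuming packed 3-byte pixels; the function name is hypothetical. Because the pointer advances three bytes at a time, a run of 255s that straddles two pixels is never matched:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts white pixels by running the
 * three-byte memcmp once per pixel, stepping three bytes at a time. */
static size_t count_white_memcmp(const unsigned char *data, size_t npixels)
{
    static const unsigned char white[3] = {255, 255, 255};
    size_t i, count = 0;

    for (i = 0; i < npixels; i++)
        if (memcmp(data + 3 * i, white, 3) == 0)
            count++;

    return count;
}
```

Fed the two pixels {0, 0, 255} and {255, 255, 0}, this counts zero white pixels, since each memcmp() sees exactly one pixel.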
dave

--
Dave Vandervies dj******@csclub.uwaterloo.ca

You can start a flame war in comp.lang.c just by saying "good morning".
--Richard Heathfield in comp.lang.c
Nov 14 '05 #22
In <ba**************************@posting.google.com > si***********@gmail.com (Sid) writes:
I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.


Is there also an unused RGB[3], so that each pixel is properly aligned
for a 32-bit access? If yes, you could use a single integer comparison
for each pixel value:

typedef whatever_appropriate uint32t;

uint32t *p = (uint32t *)RGB, white = 0xffffffff;
((unsigned char *)&white)[3] = 0; /* clear the last byte of white */

if ((*p & white) == white) /* the pixel is white */ ;
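
A self-contained sketch of the same idea, assuming C99's uint32_t and four bytes per pixel (R, G, B, then one unused byte). It swaps the pointer cast for memcpy(), so it does not depend on the pixel being aligned; the function name is hypothetical:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical: each pixel occupies four bytes, R, G, B plus one
 * unused byte. The mask is built from bytes, so it is correct for
 * either byte order; memcpy sidesteps alignment and aliasing issues. */
static int is_white_rgbx(const unsigned char *pixel)
{
    static const unsigned char mask_bytes[4] = {0xff, 0xff, 0xff, 0x00};
    uint32_t mask, value;

    memcpy(&mask, mask_bytes, sizeof mask);
    memcpy(&value, pixel, sizeof value);

    return (value & mask) == mask;
}
```

Many compilers turn a fixed-size 4-byte memcpy() into a single load, but that is worth checking rather than assuming.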

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #23
> >>> if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming
RGB is a pointer to unsigned char)

> Use XOR rather than a comparison.

Use a XOR, and you'll *still* need to do a comparison - an implicit
comparison with zero.

> If short int is 2 bytes on your hardware (check, but it probably is) do
> the first XOR on a short int at RGB[0] and the second on a char at RGB[2].

<off-topic>

Trying to read a word from an address that isn't a multiple of its size
will seriously degrade performance on x86, and, on processors such as the
Sparc or the Alpha, will make your program die an horrible death,
screaming "Bus Error".

You're talking about code like

uint8_t *foo;
uint16_t bar = *(uint16_t *)foo;

If 'foo' is at an even address, everything will be fine (on most
architectures). But it doesn't need to be; and in this particular
scenario it *won't* be, in 2/3 of the cases.

Incidentally, the code I posted to this very thread suffers from
the same problem; it assumes the array passed to the function starts
at a word-aligned address. It also assumes 32-bit words. Given that,
and a decent compiler, it will do just three memory fetches for
every four pixels, as opposed to twelve fetches for char pointer
walk. But it's definitely not production quality, and maybe I should've
thought twice before posting it.

Yes, this small discussion was off-topic, since the C programming
language has no concept of addresses and words.

</off-topic>

> This last part might not matter depending on how smart the compiler is
> but if you are going sequentially through the entire array then don't use
> an array+offset but rather set a pointer to the base before you enter
> the loop and increment it.


Any compiler worth its salt will convert an index walk to a pointer
walk.
Nov 14 '05 #24
> Use a XOR, and you'll *still* need to do a comparison - an implicit
> comparison with zero.

Ahh, I see your point that it will still have a conditional jump but an
XOR followed by a conditional jump still uses fewer clock cycles than a
CMP followed by said jump. Though now that I think about it, the
compiler SHOULD be smart enough to recognize this and compile the XOR
anyway since he is comparing to 0xFF.

> Trying to read a word from an address that isn't a multiple of its
> size will seriously degrade performance on x86, and, on processors
> such as the Sparc or the Alpha, will make your program die an horrible
> death, screaming "Bus Error".

I wasn't aware of the horrible death potential, but keeping alignment
even to avoid the double fetch is why I suggested checking the pointer
offset at the outset (though he could just declare the array as short
int and the compiler would always place it at an even base) and doing
either short-char-char-short or char-short-short-char depending.

Incidentally [and all apologies for straying so far off topic here], I
wonder if this is still the case on intel for 386 and later since the
data bus is 32 bits rather than 16?

> Incidentally, the code I posted to this very thread suffers from
> the same problem; it assumes the array passed to the function starts
> at a word-aligned address. It also assumes 32-bit words. Given that,
> and a decent compiler, it will do just three memory fetches for
> every four pixels, as opposed to twelve fetches for char pointer
> walk. But it's definitely not production quality, and maybe I
> should've thought twice before posting it.

If performance in this loop is important enough that would still be a
valid option, he would have to have the program check the type size at
the outset and then test to see if it is on a little or big endian
machine and adjust accordingly... possibly even having a little endian
and big endian version of the routine and setting a function pointer if
it is called repeatedly from a loop.
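
A minimal sketch of the runtime byte-order check described above, assuming C99's uint32_t. Both function names are hypothetical, and only the detection step is shown, not the function-pointer dispatch:

```c
#include <stdint.h>

/* Hypothetical runtime check: returns 1 on a little-endian machine,
 * 0 otherwise, by looking at the first byte of a known integer. */
static int is_little_endian(void)
{
    unsigned int probe = 1;

    return *(const unsigned char *)&probe == 1;
}

/* Mask selecting the first three bytes (in memory order) of a
 * 32-bit word, chosen once at run time instead of with #if. */
static uint32_t first_three_bytes_mask(void)
{
    return is_little_endian() ? 0x00ffffffUL : 0xffffff00UL;
}
```

The result could be computed once at startup and used to pick which version of the routine a function pointer refers to.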

> Yes, this small discussion was off-topic, since the C programming
> language has no concept of addresses and words.

Oops, all apologies for the stray into assembly land :-)

Nov 14 '05 #25
