how to simplify/shorten this loop of bit shifting code?

ben
i have a bit of code that works absolutely fine as is, but seems over
complicated/long winded. is there any way to shorten/simplify it?

the code is below. description of it: it's like strcpy in that it
copies one block of data to another block of data until the block that
is being copied contains a zero/null. the difference with this code is
that it's doing 4bits at a time (all the values are 4bits) and the two
blocks of data may not be aligned the same - that is the start of the
data that's being copied could be the rightmost 4bits of a normal byte,
and the start of the data that's being written to could be the leftmost
4bits of a byte for example - so out of sync with each other as it
were. also the terminating NULL is a 4bit value.

// 'datatocopy' is a u_int8_t pointer to the start of the data to copy.

// 'tocopyto' is a u_int8_t pointer to the start of the space to
// copy to.

// 'st' is an int for adding to datatocopy to step along it (the number
// of 4bit blocks from the left (starting at 0)).

// 'st2' ditto, but for tocopyto.

while( ((*(datatocopy + ((st % 2 ? st-1 : st)/2 ))) >> (st % 2 ? 0 : 4) & 0xf) != 0 ) {
    // steps along and reads 4bits at a time

    *(tocopyto + ((st2 % 2 ? st2 - 1 : st2) / 2) ) |= 0xf << (st2 % 2 ? 0 : 4); // zero the 4 bits...

    *(tocopyto + ((st2 % 2 ? st2 - 1 : st2) / 2) ) ^= 0xf << (st2 % 2 ? 0 : 4); // ...about to be written to

    tmp = *(datatocopy + ((st % 2 ? st-1 : st)/2 )); // get the 8bit byte that includes required 4bits

    if( st % 2 ) {     // it's odd (required 4bits are on the right).
        tmp |= 0xf0;   // delete the,
        tmp ^= 0xf0;   // left 4bits.
    } else             // it's even (required 4bits on the left).
        tmp >>= 4;     // shift to the right.

    *(tocopyto + ((st2 % 2 ? st2 - 1 : st2) / 2)) |= tmp << ( st2 % 2 ? 0 : 4); // copy the 4bits

    st++, st2++; // increment 4bit block counters
}

i'm using an intermediate temporary variable half way through - is that
necessary? it just seems an awful lot to do what it's doing.

i'd just like to really consolidate it somehow - possible? easy?
pointers to how to?

any help much appreciated.

thanks, ben.
Nov 14 '05 #1
ben
i missed something obvious, but i'd still like to know if it could be
consolidated further? thanks.
// 'datatocopy' is a u_int8_t pointer to the start of the data to copy.

// 'tocopyto' is a u_int8_t pointer to the start of the space to
// copy to.

// 'st' is an int for adding to datatocopy to step along it (the number
// of 4bit blocks from the left (starting at 0)).

// 'st2' ditto, but for tocopyto.
while( (tmp = ((*(datatocopy + ((st % 2 ? st-1 : st)/2 ))) >> (st % 2 ? 0 : 4) & 0xf)) != 0 ) {
    // step along and read 4bits at a time

    *(tocopyto + ((st2 % 2 ? st2 - 1 : st2) / 2) ) |= 0xf << (st2 % 2 ? 0 : 4); // zero the 4 bits....

    *(tocopyto + ((st2 % 2 ? st2 - 1 : st2) / 2) ) ^= 0xf << (st2 % 2 ? 0 : 4); // ....about to be written to

    *(tocopyto + ((st2 % 2 ? st2 - 1 : st2) / 2)) |= tmp << ( st2 % 2 ? 0 : 4); // copy the bits

    st++, st2++;
}
Nov 14 '05 #2
In article <16******************@x.x>, ben <x@x.x> wrote:
i missed something obvious, but i'd still like to know if it could be
consolidated further? thanks.
// 'datatocopy' is a u_int8_t pointer to the start of the data to copy.

// 'tocopyto' is a u_int8_t pointer to the start of the space to
// copy to.

// 'st' is an int for adding to datatocopy to step along it (the number
// of 4bit blocks from the left (starting at 0)).

// 'st2' ditto, but for tocopyto.
while( (tmp = ((*(datatocopy + ((st % 2 ? st-1 : st)/2 ))) >> (st % 2
? 0 : 4) & 0xf)) != 0 ) { // step along and read 4bits at a time

*(tocopyto + ((st2 % 2 ? st2 - 1 : st2) / 2) ) |= 0xf << (st2
% 2 ? 0 : 4); // zero the 4 bits....
Quite unlikely that |= will set anything to zero.

*(tocopyto + ((st2 % 2 ? st2 - 1 : st2) / 2) ) ^= 0xf << (st2
% 2 ? 0 : 4); // ....about to be written to

*(tocopyto + ((st2 % 2 ? st2 - 1 : st2) / 2)) |= tmp << ( st2
% 2 ? 0 : 4); // copy the bits

st++, st2++;
}


(st2 % 2 ? st2 - 1 : st2) / 2

is exactly the same as

st2 / 2
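
For nonnegative indices this equivalence is easy to confirm with an exhaustive check over a small range. The following is only a sketch; the helper names are made up for illustration and do not appear in the original code.

```c
#include <assert.h>

/* The long index expression from the original post:
 * drop the odd bit, then halve. */
static int byte_index_long(int n)
{
    assert(n >= 0);
    return (n % 2 ? n - 1 : n) / 2;
}

/* The short form: integer division already discards the remainder. */
static int byte_index_short(int n)
{
    assert(n >= 0);
    return n / 2;
}

/* Returns 1 if both forms agree for every n in [0, limit). */
static int byte_index_forms_agree(int limit)
{
    for (int n = 0; n < limit; n++)
        if (byte_index_long(n) != byte_index_short(n))
            return 0;
    return 1;
}
```

Calling `byte_index_forms_agree(1000)` should return 1, since both forms map nybble indices 2k and 2k+1 to byte k.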

I would most likely write something like

for (;;) {
src_index = st / 2;
src_shift = st % 2 ? 0 : 4;
src_value = (src [src_index] >> src_shift) & 0x0f;
if (src_value == 0) break;

dst_index = st2 / 2;
dst_shift = st2 % 2 ? 0 : 4;
dst [dst_index] &= ~(0xf << dst_shift);
dst [dst_index] |= src_value << dst_shift;
}
Nov 14 '05 #3
ben
In article
<ch*********************************@slb-newsm1.svr.pol.co.uk>,
Christian Bau <ch***********@cbau.freeserve.co.uk> wrote:
In article <16******************@x.x>, ben <x@x.x> wrote:
while( (tmp = ((*(datatocopy + ((st % 2 ? st-1 : st)/2 ))) >> (st % 2
? 0 : 4) & 0xf)) != 0 ) { // step along and read 4bits at a time

*(tocopyto + ((st2 % 2 ? st2 - 1 : st2) / 2) ) |= 0xf << (st2
% 2 ? 0 : 4); // zero the 4 bits....


Quite unlikely that |= will set anything to zero.


of course. dear oh dear. should have made use of the ~ flip operator
having set to 1's. :/ thanks for pointing that out.
*(tocopyto + ((st2 % 2 ? st2 - 1 : st2) / 2) ) ^= 0xf << (st2
% 2 ? 0 : 4); // ....about to be written to

*(tocopyto + ((st2 % 2 ? st2 - 1 : st2) / 2)) |= tmp << ( st2
% 2 ? 0 : 4); // copy the bits

st++, st2++;
}


(st2 % 2 ? st2 - 1 : st2) / 2

is exactly the same as

st2 / 2


right, i see. so if you've got an odd number and you divide by two (in
an int) you're guaranteed that it'll round downwards? is it ok to
rely on that? i guess it must be. ok thanks.
I would most likely write something like

for (;;) {
src_index = st / 2;
src_shift = st % 2 ? 0 : 4;
src_value = (src [src_index] >> src_shift) & 0x0f;
if (src_value == 0) break;

dst_index = st2 / 2;
dst_shift = st2 % 2 ? 0 : 4;
dst [dst_index] &= ~(0xf << dst_shift);
dst [dst_index] |= src_value << dst_shift;
}


that looks great although there doesn't seem to be any incrementing of
steppers (or am i just not seeing that? - don't think so) but that
shouldn't be hard to add. i'll give it a go now. it certainly looks
cleaner and simpler than my version - great, thanks very much for your
time :)
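
With the steppers added, the suggested loop could be wrapped up roughly as follows. This is a sketch under assumptions: the function name `nibble_copy`, the `size_t` counters and the return value are illustrative choices, not part of the posted code. The layout follows the original code (even index = left 4 bits of the byte), and the terminating zero value is not written to the destination.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy 4-bit values from src (starting at nibble index st) to dst
 * (starting at nibble index st2) until a zero nibble is read.
 * Even index = left (high) 4 bits, odd index = right (low) 4 bits.
 * Returns the number of nibbles copied.
 */
static size_t nibble_copy(uint8_t *dst, size_t st2,
                          const uint8_t *src, size_t st)
{
    size_t copied = 0;

    assert(dst != NULL && src != NULL);
    for (;;) {
        unsigned src_shift = (st % 2) ? 0 : 4;
        unsigned src_value = (src[st / 2] >> src_shift) & 0x0fu;
        if (src_value == 0)
            break;

        unsigned dst_shift = (st2 % 2) ? 0 : 4;
        uint8_t *p = &dst[st2 / 2];
        /* clear the target 4 bits, then merge in the new value */
        *p = (uint8_t)((*p & ~(0x0fu << dst_shift)) | (src_value << dst_shift));

        st++;       /* the increments missing from the posted loop */
        st2++;
        copied++;
    }
    return copied;
}
```

For example, with source bytes `0x12 0x34 0x50` and `st = 1`, the values 2, 3, 4, 5 are copied and the loop stops at the trailing zero; writing them at `st2 = 0` leaves `0x23 0x45` in the first two destination bytes.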

ben.
Nov 14 '05 #4
In article <16******************@x.x> ben <x@x.x> wrote:
i have a bit of code, that works absolutely fine as is, but seems over
complicated/long winded. is there anyway to shorten/simplify it?
Certainly.

(See original for longer description and original code and variable
names -- the idea is to copy 4-bit units from C99-style 8-bit
uint8_t storage units, from source to destination, not necessarily
at the same 4-bit offset. The mask is apparently 0x0f, then 0xf0,
then 0x0f, then 0xf0, etc., based on nybble offset of 0, 1, 2, etc.,
as counted by "st" and "st2".)
// 'st' is an int for adding to datatocopy to step along it (the number
// of 4bit blocks from the left (starting at 0)).

while( ((*(datatocopy + ((st % 2 ? st-1 : st)/2 ))) >> (st % 2 ? 0 :
4) & 0xf) != 0 ) { // steps along and reads 4bits at a time


Note that if st is always nonnegative, there is no need to test
"st % 2" before dividing by 2, as 0/2 is 0, 1/2 is 0, 2/2 is 1, 3/2
is 1, and so on. Just making use of that fact would simplify the
code quite a bit.

If the loop is expected to run for relatively many 4-bit "nybbles",
though, I would compute an initial set of loop invariants, e.g.:

s: points to the uint8_t that contains the source nybble (4-bit unit)
d: points to the uint8_t that contains the destination nybble
sm: source mask, 0x0f or 0xf0 based on whether st is even or odd
dm: destination mask, 0x0f or 0xf0, based on st2 respectively

s = &datatocopy[st / 2];
d = &tocopyto[st2 / 2];
sm = st % 2 ? 0xf0 : 0x0f;
dm = st2 % 2 ? 0xf0 : 0x0f;

At this point you have a choice of "compact code" vs "fast code".
There are two possibilities: sm == dm, or sm != dm, and if you
expect the loop to run quite a lot, you might want to check. If
the two masks are the same, you can copy 8-bit units at a time
with no masking at all (since uint8_t is exactly 8 bits) until
you hit the first 8-bit byte that contains a 4-bit nybble that
is zero:

if (sm == dm) {
... take care of first upper nybble if required
(but note that this might be 0!) ...
/* then do whole bytes while possible */
if (did not already stop)
while ((*s | (*s >> 4)) & 0x0f) {
*d++ = *s++;
st += 2;
st2 += 2;
}
... then take care of last nybble if required ...
} else {
...
}

Even when the masks differ you can still read and copy quickly,
although this gets into "diminishing returns" territory, by reading
two source 8-bit-bytes, composing a destination byte, storing that,
and then reading one new source byte and reusing the "left over"
nybble in the other previously-read source byte.

Assuming you go for "compact code" instead, though, we just set up
the loop invariants as shown above, then do this:

while ((bits = *s & sm) != 0) {
/*
* We have some nonzero bits to copy, saved up in "bits".
* Move them to whichever half is required...
*/
if (sm != dm) {
if (sm & 0xf0) /* same as "if st % 2" */
bits >>= 4;
else
bits <<= 4;
}
*d = (*d & ~dm) | bits;
st++;
st2++;
#if 0
/* restore invariants -- this is one way, but not as compact */
if (sm & 0x0f) /* same as "if st%2 == 0" */
sm = 0xf0;
else {
sm = 0x0f;
s++;
}
if (dm & 0x0f)
dm = 0xf0;
else {
dm = 0x0f;
d++;
}
#else
/*
* The other way is to just recompute them from scratch,
* same as at the top of the loop. It is likely that the
* compiler will re-use that code (and/or we could compute just
* s and sm here, and compute d and dm just after the "while").
*/
s = &datatocopy[st / 2];
d = &tocopyto[st2 / 2];
sm = st % 2 ? 0xf0 : 0x0f; /* or see below */
dm = st2 % 2 ? 0xf0 : 0x0f;
#endif
}

Finally, it is worth considering just doing one "offsets do not
match" special case, because it removes a number of inside-the-loop
tests even when copying 4 bits at a time. We also get to use some
special properties: in 8 bits, ~0x0f is 0xf0, and vice versa; and
since st and st2 are (presumably) never negative, st%2 and st&1
produce the same values. (The compiler should optimize this away
for you, of course, if st and st2 have unsigned types, so it just
becomes a question of which you think is more readable.)
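
One C detail behind "in 8 bits": the operand of `~` is promoted to `int` first, so `~0x0f` on its own is not equal to `0xf0`. It only behaves that way once the result is truncated back to 8 bits, or ANDed with an 8-bit value as in the loops here. A small sketch, with helper names invented for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Flip an 8-bit nybble mask: 0x0f <-> 0xf0. The cast truncates the
 * promoted int result of ~ back to 8 bits. */
static uint8_t flip_mask(uint8_t mask)
{
    assert(mask == 0x0f || mask == 0xf0);
    return (uint8_t)~mask;
}

/* For nonnegative values, n % 2 and n & 1 pick out the same low bit. */
static int parity_forms_agree(unsigned n)
{
    return (n % 2) == (n & 1u);
}
```

If `mask` is kept in a wider type such as `int`, the extra high bits produced by `~mask` are harmless as long as the mask is only ever ANDed with a `uint8_t` value.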

s = &datatocopy[st / 2];
d = &tocopyto[st2 / 2];
mask = st & 1 ? 0xf0 : 0x0f;

if ((st & 1) == (st2 & 1)) {
/*
* Nybble masks match, so do it the "easy way" (no shifting).
* Note that we want to increment the 8-bit pointers only when
* the values in st and st2 are odd, which is also easy to do
* by an arithmetic trick (provided we add to s and d *before*
* increment st and st2).
*/
while ((bits = *s & mask) != 0) {
*d = (*d & ~mask) | bits;
s += st & 1; /* +1 if odd, +0 if even */
d += st2 & 1;
st++;
st2++;
mask = ~mask; /* 0x0f => 0xf0; 0xf0 => 0x0f */
}
} else {
/*
* Nybble masks do not match; we have to alternate shifting
* left and right by 4. We can either unroll the loop a bit,
* or just test either st or st2 inside it (as here).
*/
while ((bits = *s & mask) != 0) {
if (st & 1) {
/*
* Source is odd (upper) nybble so move down.
* We know that (source) mask == 0xf0 here so
* dest mask should be 0x0f. Might as well
* increment s (and not d) here too.
*/
bits >>= 4;
*d = (*d & ~0x0f) | bits;
s++;
} else {
/*
* Source is even (lower) nybble so move up.
* Again could use constants...
*/
bits <<= 4;
*d = (*d & ~0xf0) | bits;
d++;
}
st++;
st2++;
mask = ~mask;
}
}
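
Put together as a single function, the two-case version above might look like the sketch below. The function name, the `size_t` counters and the return value are assumptions added for illustration. Note that it keeps the ordering used in this post (even index = the 0x0f half), which is the opposite of the original poster's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Nybble copy in lower-then-upper order (even index = 0x0f half).
 * Stops at the first zero nybble, which is not written.
 * Returns the number of nybbles copied.
 */
static size_t nybble_copy_lo_first(uint8_t *dst, size_t st2,
                                   const uint8_t *src, size_t st)
{
    const uint8_t *s = &src[st / 2];
    uint8_t *d = &dst[st2 / 2];
    uint8_t mask = (st & 1) ? 0xf0 : 0x0f;
    size_t start = st;
    unsigned bits;

    assert(dst != NULL && src != NULL);
    if ((st & 1) == (st2 & 1)) {
        /* Same half in source and destination: no shifting needed. */
        while ((bits = *s & mask) != 0) {
            *d = (uint8_t)((*d & ~mask) | bits);
            s += st & 1;            /* advance only after the upper half */
            d += st2 & 1;
            st++;
            st2++;
            mask = (uint8_t)~mask;
        }
    } else {
        /* Opposite halves: shift by 4, alternating direction. */
        while ((bits = *s & mask) != 0) {
            if (st & 1) {           /* source upper -> destination lower */
                *d = (uint8_t)((*d & 0xf0) | (bits >> 4));
                s++;
            } else {                /* source lower -> destination upper */
                *d = (uint8_t)((*d & 0x0f) | (bits << 4));
                d++;
            }
            st++;
            st2++;
            mask = (uint8_t)~mask;
        }
    }
    return st - start;
}
```

With source bytes `0x21 0x43 0x05` (nybbles 1, 2, 3, 4, 5, 0 in this ordering), copying from `st = 1` to `st2 = 0` exercises the shifting branch, and copying from `st = 0` to `st2 = 2` exercises the matching branch.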

Unrolling the "do not match" loop is ugly, but just for illustration,
here is a version using a goto (the other way is to have two separate
unrolled loops!):

} else {
if (st & 1)
goto opposite_mask_st_odd;
/* unrolled loop, in which st is even and st2 is odd at top: */
for (;;) {
/* here st is even and st2 is odd */
if ((bits = *s & 0x0f) == 0)
break;
*d = (*d & ~0xf0) | (bits << 4);
st++, st2++, d++;
opposite_mask_st_odd:
/* here st is odd and st2 is even */
if ((bits = *s & 0xf0) == 0)
break;
*d = (*d & ~0x0f) | (bits >> 4);
st++, st2++, s++;
}
}

Unrolling actually eliminates the "mask" variable entirely since
we just do each half, and eliminates the even/odd test inside the
loop -- so it winds up being about as long, code-wise.

Note that one can unroll even the "mask is the same" loop, but in
that case it is not quite as clever -- it just lets you use constants
for the masks, and eliminate s++ and d++ in one half of the unrolled
loop. If this were assembly, I probably would do it; in C... well,
*maybe* if profiling showed a lot of time going here.

(Incidentally, I will add here that I spent a lot of time optimizing
assembly-language memcpy() on SPARCV9, where you use some special
registers and instructions that require 64-*byte* alignment. This
needs a lot of leading and trailing mopping-up work, but is
fundamentally the same problem.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #5
ben

btw, this took me *ages* to digest - still haven't completely - an
amazing reply.
In article <c0********@enews4.newsguy.com>, Chris Torek
<no****@torek.net> wrote:

Note that if st is always nonnegative, there is no need to test
"st % 2" before dividing by 2, as 0/2 is 0, 1/2 is 0, 2/2 is 1, 3/2
is 1, and so on. Just making use of that fact would simplify the
code quite a bit.
yes i now realise that. i thought it was not advised to rely on that
but that is obviously not the case.

If the loop is expected to run for relatively many 4-bit "nybbles",
though, I would compute an initial set of loop invariants, e.g.:

s: points to the uint8_t that contains the source nybble (4-bit unit)
d: points to the uint8_t that contains the destination nybble
sm: source mask, 0x0f or 0xf0 based on whether st is even or odd
dm: destination mask, 0x0f or 0xf0, based on st2 respectively

s = &datatocopy[st / 2];
d = &tocopyto[st2 / 2];
sm = st % 2 ? 0xf0 : 0x0f;
dm = st2 % 2 ? 0xf0 : 0x0f;

At this point you have a choice of "compact code" vs "fast code".
There are two possibilities: sm == dm, or sm != dm, and if you
expect the loop to run quite a lot, you might want to check. If
the two masks are the same, you can copy 8-bit units at a time
with no masking at all (since uint8_t is exactly 8 bits) until
you hit the first 8-bit byte that contains a 4-bit nybble that
is zero:

if (sm == dm) {
... take care of first upper nybble if required
did you mean lower there? as in the right most 4bits of the first 8bit
byte? i could be wrong.
(but note that this might be 0!) ...
/* then do whole bytes while possible */
if (did not already stop)
while ((*s | (*s >> 4)) & 0x0f) {
i really can't get my head round that line. its goal is to find out if
either nybble is zero, right? but it doesn't, does it? (again, i could
easily be wrong here but that's the way it seems to me). eg:

01000000 -*s
00000100 -*s>>4
01000100 - the OR'd answer
00001111 -0xf
00000100 -the &'d answer

so that would be a positive and the loop would get run. and the same
happens if the two nybbles are swapped round i think. it seems the only
way that'd give a negative answer to the while() is if *s was 0 ? in
which case that line might as well be while(*s)

i must have something wrong?

*d++ = *s++;
st += 2;
st2 += 2;
}
... then take care of last nybble if required ...
} else {
...
}

Even when the masks differ you can still read and copy quickly,
although this gets into "diminishing returns" territory, by reading
two source 8-bit-bytes, composing a destination byte, storing that,
and then reading one new source byte and reusing the "left over"
nybble in the other previously-read source byte.

Assuming you go for "compact code" instead, though, we just set up
the loop invariants as shown above, then do this:

while ((bits = *s & sm) != 0) {
/*
* We have some nonzero bits to copy, saved up in "bits".
* Move them to whichever half is required...
*/
if (sm != dm) {
if (sm & 0xf0) /* same as "if st % 2" */
bits >>= 4;
else
bits <<= 4;
}
*d = (*d & ~dm) | bits;
st++;
st2++;
#if 0
/* restore invariants -- this is one way, but not as compact */
if (sm & 0x0f) /* same as "if st%2 == 0" */
sm = 0xf0;
else {
sm = 0x0f;
s++;
}
if (dm & 0x0f)
dm = 0xf0;
else {
dm = 0x0f;
d++;
}
#else
/*
* The other way is to just recompute them from scratch,
* same as at the top of the loop. It is likely that the
* compiler will re-use that code (and/or we could compute just
* s and sm here, and compute d and dm just after the "while").
*/
s = &datatocopy[st / 2];
d = &tocopyto[st2 / 2];
sm = st % 2 ? 0xf0 : 0x0f; /* or see below */
dm = st2 % 2 ? 0xf0 : 0x0f;
#endif
}

Finally, it is worth considering just doing one "offsets do not
match" special case, because it removes a number of inside-the-loop
tests even when copying 4 bits at a time. We also get to use some
special properties: in 8 bits, ~0x0f is 0xf0, and vice versa; and
since st and st2 are (presumably) never negative, st%2 and st&1
produce the same values. (The compiler should optimize this away
for you, of course, if st and st2 have unsigned types, so it just
becomes a question of which you think is more readable.)

s = &datatocopy[st / 2];
d = &tocopyto[st2 / 2];
mask = st & 1 ? 0xf0 : 0x0f;

if ((st & 1) == (st2 & 1)) {
/*
* Nybble masks match, so do it the "easy way" (no shifting).
* Note that we want to increment the 8-bit pointers only when
* the values in st and st2 are odd, which is also easy to do
* by an arithmetic trick (provided we add to s and d *before*
* increment st and st2).
*/
while ((bits = *s & mask) != 0) {
*d = (*d & ~mask) | bits;
s += st & 1; /* +1 if odd, +0 if even */
d += st2 & 1;
st++;
st2++;
mask = ~mask; /* 0x0f => 0xf0; 0xf0 => 0x0f */
i like that line :)
}
} else {
/*
* Nybble masks do not match; we have to alternate shifting
* left and right by 4. We can either unroll the loop a bit,
* or just test either st or st2 inside it (as here).
*/
while ((bits = *s & mask) != 0) {
if (st & 1) {
/*
* Source is odd (upper) nybble so move down.
hmm, maybe i've got the terminology wrong - i thought upper was on the
left and lower on the right. odd is when the nybble is on the right,
right?
* We know that (source) mask == 0xf0 here so
* dest mask should be 0x0f. Might as well
* increment s (and not d) here too.
*/
bits >>= 4;
*d = (*d & ~0x0f) | bits;
s++;
} else {
/*
* Source is even (lower) nybble so move up.
* Again could use constants...
*/
bits <<= 4;
*d = (*d & ~0xf0) | bits;
d++;
}
st++;
st2++;
mask = ~mask;
}
}
yes i think that's the one i'm going to go for.

Unrolling the "do not match" loop is ugly, but just for illustration,
here is a version using a goto (the other way is to have two separate
unrolled loops!):

} else {
if (st & 1)
goto opposite_mask_st_odd;
/* unrolled loop, in which st is even and st2 is odd at top: */
for (;;) {
/* here st is even and st2 is odd */
if ((bits = *s & 0x0f) == 0)
break;
*d = (*d & ~0xf0) | (bits << 4);
st++, st2++, d++;
opposite_mask_st_odd:
/* here st is odd and st2 is even */
if ((bits = *s & 0xf0) == 0)
break;
*d = (*d & ~0x0f) | (bits >> 4);
st++, st2++, s++;
}
}

Unrolling actually eliminates the "mask" variable entirely since
we just do each half, and eliminates the even/odd test inside the
loop -- so it winds up being about as long, code-wise.

Note that one can unroll even the "mask is the same" loop, but in
that case it is not quite as clever -- it just lets you use constants
for the masks, and eliminate s++ and d++ in one half of the unrolled
loop. If this were assembly, I probably would do it; in C... well,
*maybe* if profiling showed a lot of time going here.

(Incidentally, I will add here that I spent a lot of time optimizing
assembly-language memcpy() on SPARCV9, where you use some special
registers and instructions that require 64-*byte* alignment. This
needs a lot of leading and trailing mopping-up work, but is
fundamentally the same problem.)


wow that was quite a stunning reply Chris :) thanks very much. it's
made my head spin a bit. still haven't fathomed it all, but a good
amount of it went in successfully. i'm going to go over it again
tomorrow. very interesting and educational though that's for sure.

thanks-a-lot, ben.
Nov 14 '05 #6
In article <news:17******************@x.x> ben <x@x.x> wrote:
[on division rounding down]
yes i now realise that. i thought it was not advised to rely on that
but that is obviously not the case.


For negative numbers, there are multiple possibilities in C89 --
(-1)/2 can be either 0 or -1. For positive numbers (or C99) there
is only one allowed answer (1/2 => 0). That was why I put in the
"assuming these are nonnegative".
if (sm == dm) {
... take care of first upper nybble if required


did you mean lower there? as in the right most 4bits of the first 8bit
byte? i could be wrong.


Naming can get tricky -- if bits 0,1,2,3 make up the value 0x0f and
bits 4,5,6,7 make up the value 0xf0, I would call the latter an "upper
nybble", but some documentation says the opposite (but tends to number
"bit 0" from the left so that bits 0 through 3 make up the value 0xf0!).
Since you are taking an otherwise-atomic unit -- a C99-style "uint8_t"
-- apart, you wind up imposing an ordering: do you take the left
("upper", as I called it) 0xf0 bits first, then the right ("lower")
bits, or the other way around? In my previous article I assumed it
was lower, then upper -- little-endian nybble order, as it were. If
this is wrong the code needs some adjustment.
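
To make the two orderings concrete, here is a sketch of a reader for each. The function names are hypothetical. The first matches the original poster's shifts (even index uses a shift of 4), the second matches the lower-then-upper assumption described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Upper-first: index 0 is the left (0xf0) half of byte 0. */
static unsigned get_nybble_hi_first(const uint8_t *buf, size_t i)
{
    assert(buf != NULL);
    return (buf[i / 2] >> ((i % 2) ? 0 : 4)) & 0x0fu;
}

/* Lower-first: index 0 is the right (0x0f) half of byte 0. */
static unsigned get_nybble_lo_first(const uint8_t *buf, size_t i)
{
    assert(buf != NULL);
    return (buf[i / 2] >> ((i % 2) ? 4 : 0)) & 0x0fu;
}
```

For the bytes `0xab 0xcd`, the upper-first reader yields a, b, c, d for indices 0 to 3, while the lower-first reader yields b, a, d, c. Switching a copy loop from one ordering to the other amounts to swapping the two masks (or the two shift amounts).
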
/* then do whole bytes while possible */
if (did not already stop)
while ((*s | (*s >> 4)) & 0x0f) {


i really can't get my head round that line. its goal is to find out if
either nybble is zero, right? but it doesn't, does it?


Whoops, right you are. It stops when *both* are zero.

There seems to be no really good shortcut for "either one zero":
ORing the two nybbles tests for "both zero", ANDing them fails if
the bits do not match up, and XOR is no help at all. You could
use a lookup table, I suppose.
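
A lookup-table version might be sketched like this. The table name, the init function and the `whole_byte_run` helper are all assumptions made up for illustration; the table simply records, for each of the 256 byte values, whether either half is zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint8_t has_zero_nybble[256];   /* 1 if either half of the byte is 0 */
static int table_ready;

/* Fill the table once before use. */
static void init_zero_nybble_table(void)
{
    for (unsigned v = 0; v < 256; v++)
        has_zero_nybble[v] = ((v & 0x0fu) == 0 || (v & 0xf0u) == 0);
    table_ready = 1;
}

/* Count the leading bytes that can be copied whole, that is, bytes
 * in which both nybbles are nonzero. The data must contain a zero
 * nybble somewhere, just as the copy loops assume. */
static size_t whole_byte_run(const uint8_t *s)
{
    size_t n = 0;

    assert(table_ready);
    while (!has_zero_nybble[s[n]])
        n++;
    return n;
}
```

With this test the byte `0x40` from the example correctly ends the whole-byte loop, whereas the OR-based test would have let it through.
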
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #7
"Chris Torek" <no****@torek.net> wrote in message news:c0*********@enews1.newsguy.com...
In article <news:17******************@x.x> ben <x@x.x> wrote:

[...]
/* then do whole bytes while possible */
if (did not already stop)
while ((*s | (*s >> 4)) & 0x0f) {


i really can't get my head round that line. its goal is to find out if
either nybble is zero, right? but it doesn't, does it?


Whoops, right you are. It stops when *both* are zero.

There seems to be no really good shortcut for "either one zero":
ORing the two nybbles tests for "both zero", ANDing them fails if
the bits do not match up, and XOR is no help at all. You could
use a lookup table, I suppose.


One could use the nibble analogue of Alan Mycroft's trick for finding
a zero byte in a 32-bit word: ((x - 0x01010101) & (0x80808080 & ~x))
is non-zero iff at least one of the four bytes is a zero byte. For
the nibble test, this gives us ((*s - 0x11) & (0x88 & (~ *s))).

But there is only 2-way parallelism here so I doubt it would be much
faster than the obvious (!((*s & 0x0f) && (*s >> 4))).
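
Since there are only 256 possible byte values, the two tests can be compared exhaustively. A sketch, with function names invented for illustration; the trick is written here with unsigned arithmetic so the subtraction wraps in a well-defined way, which does not change the low 8 bits of the result.

```c
#include <assert.h>
#include <stdint.h>

/* The straightforward test: true if either 4-bit half is zero. */
static int zero_nybble_obvious(uint8_t x)
{
    return !((x & 0x0f) && (x >> 4));
}

/* The nibble analogue of the zero-byte trick, in unsigned arithmetic. */
static int zero_nybble_trick(uint8_t x)
{
    unsigned u = x;
    return ((u - 0x11u) & 0x88u & ~u) != 0;
}

/* Returns 1 if the two tests agree for all 256 byte values. */
static int zero_nybble_tests_agree(void)
{
    for (unsigned v = 0; v < 256; v++)
        if (zero_nybble_obvious((uint8_t)v) != zero_nybble_trick((uint8_t)v))
            return 0;
    return 1;
}
```

Either test can replace the OR-based condition in the whole-byte loop, by continuing only while the test reports no zero nybble.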

-- Norbert
Nov 14 '05 #8
ben
In article <c0*********@enews1.newsguy.com>, Chris Torek
<no****@torek.net> wrote:
For negative numbers, there are multiple possibilities in C89 --
(-1)/2 can be either 0 or -1. For positive numbers (or C99) there
is only one allowed answer (1/2 => 0). That was why I put in the
"assuming these are nonnegative".
that's good to know.
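
The rule quoted above can be illustrated with a couple of checks. This is only a sketch with made-up helper names; the second function describes C99 and later, where signed division truncates toward zero, and says nothing about what a particular C89 compiler does.

```c
#include <assert.h>

/* Nybble index -> byte index; only meaningful for nonnegative n. */
static int byte_index(int n)
{
    assert(n >= 0);
    return n / 2;
}

/* Under C99 and later, signed division truncates toward zero, so
 * (-1) / 2 is 0 and (-1) % 2 is -1. C89 left the choice between
 * 0 and -1 for (-1) / 2 to the implementation. */
static int division_truncates_toward_zero(void)
{
    return (-1) / 2 == 0 && (-1) % 2 == -1;
}
```

Declaring the counters as an unsigned type such as `size_t` sidesteps the question entirely, since the values can then never be negative.
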
Naming can get tricky -- if bits 0,1,2,3 make up the value 0x0f and
bits 4,5,6,7 make up the value 0xf0, I would call the latter an "upper
nybble", but some documentation says the opposite (but tends to number
"bit 0" from the left so that bits 0 through 3 make up the value 0xf0!).
Since you are taking an otherwise-atomic unit -- a C99-style "uint8_t"
-- apart, you wind up imposing an ordering: do you take the left
("upper", as I called it) 0xf0 bits first, then the right ("lower")
bits, or the other way around? In my previous article I assumed it
was lower, then upper -- little-endian nybble order, as it were. If
this is wrong the code needs some adjustment.
the way you name it corresponds with how i see it - left=upper
right=lower. i must have misunderstood which actual nybble needed to be
dealt with at that moment (twice i think) which made me think there was
a naming issue.
/* then do whole bytes while possible */
if (did not already stop)
while ((*s | (*s >> 4)) & 0x0f) {


i really can't get my head round that line. its goal is to find out if
either nybble is zero, right? but it doesn't, does it?


Whoops, right you are. It stops when *both* are zero.

There seems to be no really good shortcut for "either one zero":
ORing the two nybbles tests for "both zero", ANDing them fails if
the bits do not match up, and XOR is no help at all. You could
use a lookup table, I suppose.


i see Norbert has posted a way - i'll have a go with that.

great, thanks again :)

ben.
Nov 14 '05 #9
