Vinoth wrote:

I'm working on an ARM (ARM9) system that has neither a floating-point
co-processor nor floating-point libraries, but it does support long long int
(64 bits).

Can you provide a link that discusses ways to emulate floating-point
calculations with just long int or long long int? For example, if I have the
formula X=(1-b)*Y + b*Z in the floating-point domain, I can calculate X with
just long ints (some precision may be lost in the final division, but that's OK).

Floating point:

    X=(1-b)*Y + b*Z

    /* 'b' is a floating-point variable with 4 decimal places of precision,
       in the range 0 to 1; X, Y and Z are unsigned int */

With long int:

I can emulate the above calculation as:

    X=((10000-10000*b)*Y +10000*b*Z)/10000

I need a link that discusses this and similar approaches.

Your "emulation" should work fine, if the products and
sum in the numerator don't grow too large for `long'. If
you know enough about the ranges of Y and Z to be sure this
won't happen, all is well. If not, you can use `long long'
for the intermediate results:

    X = ((10000LL - 10000LL*b) * Y + 10000LL*b * Z) / 10000LL;

There are a number of possible improvements you may want
to consider. The first is to get rid of those `10000LL*b'
computations, which is easy: instead of storing `b' itself,
store `10000 * b' in a `long' variable called `B':

    X = ((10000LL - B) * Y + (long long)B * Z) / 10000LL;
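As a rough sketch of that first improvement, the expression could be wrapped in a function like the one below. The name `blend_scaled' and the macro `BLEND_SCALE' are made up for illustration, and the code assumes that B has already been computed as 10000 * b, so that it lies between 0 and 10000.

```c
/* Hypothetical helper, not from the original post.
   B is the blend factor b pre-multiplied by BLEND_SCALE,
   so B == 0 means b = 0.0 and B == 10000 means b = 1.0. */
#define BLEND_SCALE 10000L

unsigned int blend_scaled(unsigned int y, unsigned int z, long B)
{
    /* Widen to 64 bits before multiplying so that neither product
       nor their sum can overflow a 32-bit type. */
    unsigned long long num =
        (unsigned long long)(BLEND_SCALE - B) * y +
        (unsigned long long)B * z;

    /* Integer division discards the fractional part of the result. */
    return (unsigned int)(num / BLEND_SCALE);
}
```

With B = 2500 (b = 0.25), Y = 100 and Z = 200, the numerator is 7500*100 + 2500*200 = 1250000, and the function returns 125.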

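Rearranging the expression with a little algebra can
eliminate one of the multiplications and permit a little more
of the computation to use plain `long' instead of `long long'
(which may be faster, especially if `long long' is emulated
in software). Note that `Z - Y' must be computed in a signed
type for this to work: with `unsigned int' operands the
difference wraps around when Z < Y, so convert Y and Z to
`long' first (assuming their values fit):

    X = Y + (long)((Z - Y) * (long long)B / 10000LL);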
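A minimal sketch of the rearranged form follows. The function name `blend_rearranged' is invented here, and the code assumes that Y and Z are non-negative values that fit in a signed `long', with B = 10000 * b in the range 0 to 10000.

```c
/* Hypothetical helper, not from the original post.
   Assumes 0 <= y, z <= LONG_MAX and 0 <= B <= 10000. */
long blend_rearranged(long y, long z, long B)
{
    /* The subtraction is done in plain long and may be negative. */
    long diff = z - y;

    /* Only the product needs 64 bits; the quotient fits in long again
       because its magnitude never exceeds that of diff. */
    return y + (long)((long long)diff * B / 10000LL);
}
```

Because the division now rounds the correction term rather than the whole sum, the result can differ by one from the unrearranged formula when Z < Y. For example, with Y = 3, Z = 0 and B = 5000 this version yields 2 (under C99's truncating division), where the original formula yields 1.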

If you change the scaling factor from 10000 to something
that's a power of two, you can replace the division with a
shift. 16384 (1 << 14) is pretty close to your original
10000, so assuming that `B' is now `b * 16384' you'd have

    X = Y + (long)( ((Z - Y) * (long long)B) >> 14 );

There's a potential trap here: if `Z - Y' is negative,
so the product being shifted is also negative, the result
of the right shift is implementation-defined in C. Since
you're only concerned with one implementation you could
check whether it does what you want. If it doesn't, or
if you want to be sure the code will work elsewhere, too,
you could make sure that no negative numerators appear:

    if (Z >= Y)
        X = Y + (long)( ((Z - Y) * (long long)B) >> 14 );
    else
        X = Y - (long)( ((Y - Z) * (long long)B) >> 14 );
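Put together, the power-of-two version might look something like the sketch below. The names `to_q14', `blend_q14' and `Q14_SHIFT' are invented for illustration. The first helper assumes b is available as a count of ten-thousandths (0 to 10000), since the target has no floating point, and rescales it to the 0 to 16384 range with integer arithmetic only.

```c
/* Hypothetical helpers, not from the original post. */
#define Q14_SHIFT 14   /* 1 << 14 == 16384 represents b = 1.0 */

/* Convert b given in ten-thousandths (0..10000) to the 14-bit scale
   (0..16384), rounding to nearest. The largest intermediate value is
   10000 * 16384 + 5000, which fits in a 32-bit long. */
long to_q14(long b_per_10000)
{
    return (b_per_10000 * 16384L + 5000L) / 10000L;
}

/* Blend y and z with weight B on the 14-bit scale. Both branches shift
   only non-negative values, so the result does not depend on how the
   implementation shifts negative numbers. */
unsigned int blend_q14(unsigned int y, unsigned int z, long B)
{
    if (z >= y)
        return y + (unsigned int)(((unsigned long long)(z - y) *
                                   (unsigned long long)B) >> Q14_SHIFT);
    else
        return y - (unsigned int)(((unsigned long long)(y - z) *
                                   (unsigned long long)B) >> Q14_SHIFT);
}
```

For instance, `to_q14(3750)' gives 6144 (0.375 * 16384), and `blend_q14(200, 100, 4096)' gives 175. Taking the differences as `unsigned int' inside each branch also sidesteps the wrap-around that an unguarded unsigned `Z - Y' would suffer.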

This is about as far as you can go with portable C --
which is a shame, really, because some machines are capable
of better. For example, there may be an instruction (or
instruction sequence) to multiply two 32-bit numbers and
yield a 64-bit product, but C cannot multiply two `long's to
get a `long long'. If you used 32-bit scaling instead of
the 14 bits shown above, the second term would simply be the
high-order 32 bits of the 64-bit product and the machine might
be able to extract it without shifting, but C has no portable
way to perform such dissections. It's possible that a smart
optimizing compiler might be able to exploit such capabilities
of the machine (I'd especially recommend looking into the
possibility of 32-bit scaling), but there are no guarantees.
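For what it's worth, 32-bit scaling could be written along the following lines. This is only a sketch: the name `blend_q32' is made up, the fixed-width types from <stdint.h> are assumed to be available, and whether the compiler turns the widened multiply and the shift by 32 into a single widening multiply that just picks up the high word is entirely up to the compiler. Note also that with this scaling B32 = b * 2^32 can represent 0 <= b < 1 only; b = 1.0 exactly does not fit in 32 bits.

```c
#include <stdint.h>

/* Hypothetical helper, not from the original post.
   B32 is b scaled by 2^32, so it covers 0 <= b < 1. */
uint32_t blend_q32(uint32_t y, uint32_t z, uint32_t B32)
{
    /* Converting one operand to 64 bits makes the multiplication
       produce the full 64-bit product; shifting right by 32 keeps
       only its high-order word. */
    if (z >= y)
        return y + (uint32_t)(((uint64_t)(z - y) * B32) >> 32);
    else
        return y - (uint32_t)(((uint64_t)(y - z) * B32) >> 32);
}
```

With B32 = 0x80000000 (b = 0.5), `blend_q32(100, 200, 0x80000000u)' returns 150. It may be worth inspecting the generated assembly on your target to see what the compiler actually does with it.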

What you're doing with the "emulation" is called "fixed-point
arithmetic," and the techniques can be applied in more
sophisticated form -- to get a properly-rounded answer, for
example, or to deal with numbers that have both integer and
fractional parts. A small amount of research may give you
some good ideas ...
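To give a flavour of those two refinements, here is a small sketch. Everything in it is an assumption made for illustration: the names `blend_q14_rounded', `uq16' and `uq16_mul', the 14-bit scale for B, and the choice of an unsigned format with 16 integer and 16 fraction bits. Rounding is done by adding half of the scale factor before shifting.

```c
#include <stdint.h>

/* Hypothetical helpers, not from the original post. */

/* Blend with B on a 14-bit scale (0..16384), rounding the correction
   term to the nearest integer instead of truncating it. */
unsigned int blend_q14_rounded(unsigned int y, unsigned int z, unsigned int B)
{
    const uint64_t half = (uint64_t)1 << 13;   /* 0.5 on the 14-bit scale */

    if (z >= y)
        return y + (unsigned int)(((uint64_t)(z - y) * B + half) >> 14);
    else
        return y - (unsigned int)(((uint64_t)(y - z) * B + half) >> 14);
}

/* Unsigned fixed-point number: upper 16 bits hold the integer part,
   lower 16 bits hold the fraction, so 65536 represents 1.0. */
typedef uint32_t uq16;
#define UQ16_ONE 65536u

/* Multiply two uq16 values. The 64-bit product carries 32 fraction
   bits; add half an LSB and drop 16 of them to return to uq16.
   The caller must ensure the true result is below 65536.0. */
uq16 uq16_mul(uq16 a, uq16 b)
{
    uint64_t p = (uint64_t)a * b;
    return (uq16)((p + UQ16_ONE / 2) >> 16);
}
```

For example, `blend_q14_rounded(0, 3, 8192)' returns 2 where the truncating form returns 1, and `uq16_mul(0x18000, 0x28000)' (1.5 times 2.5) returns 0x3C000, which is 3.75.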

--

Er*********@sun.com