In article <e2************ **************@posting.google.com>,

Oc**********@yahoo.com (OcelotIguana) wrote:

Thanks to everyone who has posted in response to my original message.

Perhaps I should clarify what I am asking. I have a suite of

numerical codes that I recently profiled and found that the standard

sin and cos routines are a huge percentage of my total run time (which

is many days). A colleague told me that the standard math libraries

are optimized for size, not speed, and that since I always call sin
=============== ==============

Usually they are optimised for accuracy.

and cos of the same argument, I should look for a speed optimized

sincos routine which shouldn't take much more time than either a sin

or cos take individually while maintaining the same level of accuracy.

BTW, I'm using Borland C++ Builder 6.0 on a Pentium IV in Win2k.

Have you looked at reducing the number of calls to sin and cos? For

example, consecutive values of sin (a + k * b), for k = 0, 1, 2, 3, etc.

can be calculated very easily with a single multiplication and addition.

Anything doing 3D graphics can usually be done with hardly any

trigonometric functions at all.

Do you have values that are very close together?

sin (x + eps) = sin (x) * cos (eps) + sin (eps) * cos (x)

cos (x + eps) = cos (x) * cos (eps) - sin (eps) * sin (x)

(You'd better check these.)

If eps is small enough then you can replace cos (eps) with 1 and sin (eps)

with eps, and get, to first order in eps,

sin (x + eps) ~= sin (x) + eps * cos (x)

cos (x + eps) ~= cos (x) - eps * sin (x)
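
A rough sketch of that first-order update, assuming you already hold sin (x) and cos (x). The name sincos_step_small is made up here. The dropped terms are of order eps * eps / 2 per step, and they add up over many steps, so this only pays off if that error is acceptable.

```c
/* Hypothetical helper: given *s = sin(x) and *c = cos(x), overwrite them
   with first-order approximations of sin(x + eps) and cos(x + eps).
   The error per step is on the order of eps*eps/2. */
void sincos_step_small(double *s, double *c, double eps)
{
    double s0 = *s;   /* keep the old values: both updates need them */
    double c0 = *c;

    *s = s0 + eps * c0;
    *c = c0 - eps * s0;
}
```

With eps = 1e-4 a single step is off by a few times 1e-9, which is far from full double precision.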

Grab the source of an existing implementation of sin and cos. They all

do roughly the same steps, for example for sin (x):

Step 1: Given x, find k such that abs (x - k * pi/2) <= pi/4.

Step 2: Let y = x - k * pi/2.

Step 3: Calculate one of sin(y), cos (y), -sin(y), -cos(y),

depending on the last two bits of k, using a polynomial.

By calculating sin and cos simultaneously, you know both will have the

same k and y. You also will have to calculate both sin(y) and cos(y)

using two polynomials, then just pick the right results and apply the

sign. So you win by just merging two such implementations.

If you have many calls, chances are the arguments are close together, so

many consecutive arguments will use the same value k. Try writing a

vectorised function:

void vec_sincos (double s[], double c[], double x[], size_t n);

where you will lose lots of the overhead and give the compiler a chance

of optimising.

BTW, profilers have been known to lie, especially for small function

calls. Just write a test program that does a billion calls to sin and

cos, profile it, and compare the results with stopwatch results to make

sure you are not going down the wrong path.