473,883 Members | 1,755 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Efficiency of math.h

Hi all,

Normally, I would trust that the ANSI libraries are written to be as
efficient as possible, but I have an application in which the majority of
the run time is calling the acos(...) method. So now I start to wonder how
that method, and others in the math.h library, are implemented.
Dave
Nov 14 '05
92 4132
In article <40************ **@jpl.nasa.gov > E.************* *@jpl.nasa.gov writes:
Christian Bau wrote:
If you read an article about the Dolph-Chebychev window function,
the first thing that you see is that Chebychev functions
are defined as cos (n*acos(x)) - and they are easily calculated
using a trivial recursion formula that doesn't need anything
more complicated than adding and multiplying.


Please show us the formula.


You mean:
T_0(x) = 1
T_1(x) = x
T_n(x) = 2x.T_{n-1}(x) - T_{n-2}(x) for n > 1
? (This from memory...)
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Nov 14 '05 #21
Nick Landsberg wrote:
I think the point of Christian's posting was that
he would like more information from the [Dave Rudolf]
[about] why the acos() function was called so often.
If it is something that requires high precision
and MUST be called often
(as in a signal processing application),
then it is one thing.

If it does not require high precision
or [high precision] is used gratuitously,
there might be better ways to accomplish the same thing.
As one poster pointed out, a rough [interpolation]
between values in a static array may satisfy the requirement.
Without knowing the degree of precision needed,
we just don't know.


Unfortunately,
I don't know what degree of precision is required either.
We can only hope that Dave Rudolf is still following this thread
and will reply with the answer.

But I'll tell you what you can do without knowing any more
about Dave Rudolf's precision requirements.
You can implement you proposed interpolation scheme
and benchmark it against the acos provided with your C compiler.
I predict that you will find that your interpolation scheme
is *slower* even if you need just a few decimal digits precision.
Please post your implementation so that we can repeat
your benchmark experiments ourselves.

There is a reason why math functions like acos
are included in the standard library.
There is *no* single portable implementation
that will deliver performance and accuracy everywhere.
Compiler developers hire experts to customize implementations
for each target platform and, usually, they are hard to beat.

Nov 14 '05 #22


E. Robert Tisdale wrote:
Nick Landsberg wrote:
I think the point of Christian's posting was that
he would like more information from the [Dave Rudolf]
[about] why the acos() function was called so often.
If it is something that requires high precision
and MUST be called often
(as in a signal processing application),
then it is one thing.

If it does not require high precision
or [high precision] is used gratuitously,
there might be better ways to accomplish the same thing.
As one poster pointed out, a rough [interpolation]
between values in a static array may satisfy the requirement.
Without knowing the degree of precision needed,
we just don't know.


[You misunderstood at least one part in your paraphrase
of my post. In the second paragraph I meant ...
"If it does not require high precision or *acos()* is
used gratuitiouly, ..." not, as you have interpreted it.
But, re-reading it, I can see how the sentence
could be so interpreted. I also did not put []'s
around the word "interpolation. "]

Unfortunately,
I don't know what degree of precision is required either.
We can only hope that Dave Rudolf is still following this thread
and will reply with the answer.
Then why attack various proposed algorithms? Without knowing
the problem the poster is trying to solve, why argue about
the details of the solution?

But I'll tell you what you can do without knowing any more
about Dave Rudolf's precision requirements.
You can implement you proposed interpolation scheme
and benchmark it against the acos provided with your C compiler.
I predict that you will find that your interpolation scheme
is *slower* even if you need just a few decimal digits precision.
Please post your implementation so that we can repeat
your benchmark experiments ourselves.
I did not propose the interpolation scheme, some other
poster did, I believe it was Eric Sosman. Since you are the
one questioning the suggestion, may I suggest you post your
benchmark results first to prove your contention?

There is a reason why math functions like acos
are included in the standard library.
There is *no* single portable implementation
that will deliver performance and accuracy everywhere.
Compiler developers hire experts to customize implementations
for each target platform and, usually, they are hard to beat.


All that is true, for applications which require precision.
It is not clear that this application requires that precision.
We come back to full circle regarding:
1 - what is the precision required for this application?
2 - is there a better way rather than the mathematically precise
way to get a "good enough" answer?

And this whole digression belongs in something like

comp.algorithms .philosophy

rather than a discussion of the language, IMO.

--

"It is impossible to make anything foolproof because fools are so
ingenious" - A. Bloch

Nov 14 '05 #23
> Only an idiot would calculate them using the acos () function.

Besides, that would be stupid to re-compute the window function
every time you have to use it: there is most likely some means
of computing it once, place the discret values in an array and
then use this. After all, we are dealing with discret algorithms.

Unless the number of points was gigantic.
Nov 14 '05 #24
Dik T. Winter wrote:
1. Calculating Taylor coefficients on the fly requires division,
Nonsense! Multiply the Taylor coefficients through
by the least common denominator, sum using Horner's rule
then multiply the result by the inverse of the least common denominator.
All of the coefficients can be calculated using integer multiplication.
which is on most processors much slower than multiplication.

2. The number of coefficients required in a Tschebychev approximation
is less than the number of Taylor coefficients needed.
By one or, at best, two terms.
For instance, to get a good sine in double precision on [0, pi/2],
you can use only 6 terms
of the Tschebychev expansion of (sin(x)/x - 1).
(This formula to get good relative precision near x = 0.)


Let's suppose that the Taylor series requires 8 terms
to obtain the minimum precision everywhere.
The Tschebychev approximation could be, at best, 25% faster.
But it could easily require much more time to load
the high precision Tschebychev coefficients from memory.
But don't take my word for it.
Implement both and benchmark them against each other.
The winner will be determined by your computer architecture.

Nov 14 '05 #25
Dik T. Winter wrote:
E.Robert.Tisdal e writes: The first language I used, Algol 60,
Me too. On a CDC 6600.
did not even *have* the arccosine as a function.
It had only sin, cos and arctan from this group.
Only when I started programming in Fortran did I see the
ACOS as a standard function, but have actually never needed it.
Done quite a bit of numerical programming actually.
I only have a little experience

http://csdl.computer.org/comp/procee...0890163abs.htm
Like implementing an arccosine function, but never used it ;-).


Nov 14 '05 #26
In article <40************ **@jpl.nasa.gov > E.************* *@jpl.nasa.gov writes:
Dik T. Winter wrote:
1. Calculating Taylor coefficients on the fly requires division,
Nonsense! Multiply the Taylor coefficients through
by the least common denominator, sum using Horner's rule
then multiply the result by the inverse of the least common denominator.
All of the coefficients can be calculated using integer multiplication.


Pray show the algorithm you intend with Taylor coefficients 1/(2k+1)!.
I have no idea what you mean with the above.
which is on most processors much slower than multiplication.

2. The number of coefficients required in a Tschebychev approximation
is less than the number of Taylor coefficients needed.


By one or, at best, two terms.


As typically the Tschebychev coefficients are about 2^(-n) times the
original coefficients, I wonder where you get your one or two from.
For instance, to get a good sine in double precision on [0, pi/2],
you can use only 6 terms
of the Tschebychev expansion of (sin(x)/x - 1).
(This formula to get good relative precision near x = 0.)


Let's suppose that the Taylor series requires 8 terms
to obtain the minimum precision everywhere.
The Tschebychev approximation could be, at best, 25% faster.


OK, the Tschebyshev approximation in the above case:
double dc[] = {-0.1666666666666 6662966,
0.0083333333333 3094971,
-0.0001984126983 6759707,
0.0000027557316 1030613,
-0.0000000250511 3193827,
0.0000000001591 8135277};
The algorithm reads (y = x*x):
sin = ((((((dc[5] * y + dc[4]) * y + dc[3]) * y + dc[2]) * y + dc[1]) * y
+ dc[0]) * y + 1.0) * x;
(Actually this is for x in [-pi/2,pi/2] with a relative precision of about
1 ULP, or, in tangible terms, the relative error is bounded by about 2e-16.)
But it could easily require much more time to load
the high precision Tschebychev coefficients from memory.
But don't take my word for it.
Implement both and benchmark them against each other.


Now implement your algorithm, without loading high precision coefficients
from memory or using division...
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Nov 14 '05 #27

"E. Robert Tisdale" <E.************ **@jpl.nasa.gov > wrote in message
news:40******** ******@jpl.nasa .gov...
Christian Bau wrote:
Since I have actually not used the acos() function _once_
in over twenty years of C programming,


So... you're a newbie eh?


I also have been using C for a long time (about 17 years).
I've never used *any* of the 'math.h' functions in my
production code (well, there might be one or two I've forgotten
about, but you get the idea). That's because my application
domains never needed them. So am I a 'newbie' too?

-Mike
Nov 14 '05 #28
Dik T. Winter wrote:
E.Robert.Tisdal e writes:

> Dik T. Winter wrote:
>
> > 1. Calculating Taylor coefficients on the fly requires division,

>
> Nonsense! Multiply the Taylor coefficients through
> by the least common denominator, sum using Horner's rule
> then multiply the result by the inverse of the least common denominator.
> All of the coefficients can be calculated using integer multiplication.


Pray show the algorithm you intend with Taylor coefficients 1/(2k+1)!.
I have no idea what you mean with the above.


Suppose that the highest order coefficient is 1/(2n+1)!. Compute

[( ... ((x^2 + (2n+1)!/(2n-1)!)*x^2 + (2n+1)!/(2n-3)!)*x^2 + ... + 1)*x]
*[1/(2n+1)!]

The first coefficient

(2n+1)!/(2n-1)! = (2n+1)*2n

The second coefficient

(2n+1)!/(2n-3)!) = ((2n+1)!/(2n-1)!)*(2n-1)*(2n-2)

etc.

For small n, you can compute these coefficients
in an integer register on-the-fly.
All you need to do is multiply the Horner product through by

[1/(2n+1)!]

at the end.

Nov 14 '05 #29
Dik T. Winter wrote:
OK, the Tschebyshev approximation in the above case:
double dc[] = {-0.1666666666666 6662966,
0.0083333333333 3094971,
-0.0001984126983 6759707,
0.0000027557316 1030613,
-0.0000000250511 3193827,
0.0000000001591 8135277};
The algorithm reads (y = x*x):
sin = ((((((dc[5] * y + dc[4]) * y + dc[3]) * y + dc[2]) * y + dc[1]) * y
+ dc[0]) * y + 1.0) * x;
(Actually this is for x in [-pi/2, pi/2] with a relative precision of about
1 ULP, or, in tangible terms, the relative error is bounded by about 2e-16.)


That's only about 4 or 5 decimal digits. You will probably find that
your algorithm would benefit from a "range reduction".

Anyway, Have you tried to "benchmark" your implementation
against the sin(x) function in your standard C library.
If so, please report your results
including the architecture and operating system
where you ran the benchmarks.

Nov 14 '05 #30

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
2304
by: Jussi Mononen | last post by:
Hi, I'm having problems to successfully execute the test scripts on a Compaq host ( OSF1 tr51bdev V5.1 2650 alpha ). Almost all tests end up with the following error message "PARI: *** Invalid arguments to divll. at test_eng/Testout.pm line 30. ...propagated at t/polyser.t line 9. t/polyser.....dubious
17
3639
by: cwdjrxyz | last post by:
Javascript has a very small math function list. However there is no reason that this list can not be extended greatly. Speed is not an issue, unless you nest complicated calculations several levels deep. In that case you need much more ram than a PC has to store functions calculated in loops so that you do not have to recalculate every time you cycle through the nest of loops. Using a HD for storage to extend ram is much too slow for many...
9
3836
by: travisperkins03 | last post by:
Hi, I have read somewhere that C code sometimes cannot be compiled to be as efficient as FORTRAN, eg for matrix multiplication, because a C compiler cannot make the assumptions about arrays that a FORTRAN compiler can. But I don't understand the example, not least because I don't understand FORTRAN. I also don't understand why it is more efficient in this case for a compiler to choose the order of evaluation (or whatever it is that it...
335
11977
by: extrudedaluminiu | last post by:
Hi, Is there any group in the manner of the C++ Boost group that works on the evolution of the C language? Or is there any group that performs an equivalent function? Thanks, -vs
83
3724
by: Licheng Fang | last post by:
Hi, I'm learning STL and I wrote some simple code to compare the efficiency of python and STL. //C++ #include <iostream> #include <string> #include <vector> #include <set> #include <algorithm> using namespace std;
77
3296
by: Aman JIANG | last post by:
THE GAME : Write a C function to swap the bits of a char so that its bits would become the mirror image of the char.MSBs become its LSBs etc. E.g. 11001100 binary would become 00110011 binary. ---------------------------------------
19
2938
by: vamshi | last post by:
Hi all, This is a question about the efficiency of the code. a :- int i; for( i = 0; i < 20; i++ ) printf("%d",i); b:- int i = 10;
9
3328
by: OldBirdman | last post by:
Efficiency I've never stumbled on any discussion of efficiency of various methods of coding, although I have found posts on various forums where individuals were concerned with efficiency. I'm not concerned when dealing with user typing, but I am if a procedure is called by a query. Does the VBA compiler generate "in-line" code for some apparent function calls? For example, y = Abs(x) might be compiled as y = x & mask. The string...
43
4864
by: john | last post by:
Hi, in TC++PL 3 on pages 674-675 it is mentioned: "Maybe your first idea for a two-dimensional vector was something like this: class Matrix { valarray< valarray<doublev; public: // ... };
0
9944
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, weíll explore What is ONU, What Is Router, ONU & Routerís main usage, and What is the difference between ONU and Router. Letís take a closer look ! Part I. Meaning of...
0
9796
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10420
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
7975
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Duprť who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
7134
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5804
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
6002
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
4620
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
3
3239
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.