inline power function replacement

Just want an opinion. I have an algorithm that needs to run as fast as
possible; in fact, it's already too slow. I've made as many algorithmic
changes as I can to reduce the amount of code, so now I'm turning to
micro-optimizations.

One function that gets run a bazillion times is the pow() function from
math.h. However, I've realized that probably 90% of the time the exponent
will be 0, and 99.9% of the time the exponent will lie between -3 and 3.
So I read that switches can sometimes be optimized into jump tables if
the case expressions are nearly contiguous integers. So I coded up this
little ditty:

static inline double
mult_power(double x, int y)
{
switch ((y))
{
case 0 : return 1.0;
case -3 : return 1.0 / ((x)*(x)*(x));
case -2 : return 1.0 / ((x)*(x));
case -1 : return 1.0 / (x);
case 1 : return (x);
case 2 : return (x)*(x);
case 3 : return (x)*(x)*(x);
default : return pow((x), (y));
}
}

I actually haven't tested this versus sticking the 0 in the middle (which I
will be doing shortly), but is this a sound replacement? Or is there
perhaps a more efficient solution?

Jul 27 '06 #1

ch***********@gmail.com wrote:

I actually haven't tested this versus sticking the 0 in the middle (which I
will be doing shortly), but is this a sound replacement? Or is there
perhaps a more efficient solution?
Well it depends on how smart your compiler is. It's not unimaginable
that a smart compiler would already be checking for the simple cases
and generating in-line code for this. Only way to find out is to try
it.

If your compiler is somewhat dumb, it may help to make this a macro,
so no call gets generated.

And oh, you'll get some flamitude from folks who consider any kind of
speculation about run times to have nothing to do with C.

Jul 27 '06 #2
ch***********@gmail.com wrote:
Just want an opinion. I have an algorithm that needs to run as fast as
possible; in fact, it's already too slow. I've made as many algorithmic
changes as I can to reduce the amount of code, so now I'm turning to
micro-optimizations.
I'm not sure this is the right newsgroup for such a question. Not a
lot of people in this group can fathom what exactly you are saying
here.
One function that gets run a bazillion times is the pow() function from
math.h. However, I've realized that probably 90% of the time the exponent
will be 0, and 99.9% of the time the exponent will lie between -3 and 3.
So I read that switches can sometimes be optimized into jump tables if
the case expressions are nearly contiguous integers. So I coded up this
little ditty:

static inline double
mult_power(double x, int y)
{
switch ((y))
{
case 0 : return 1.0;
case -3 : return 1.0 / ((x)*(x)*(x));
case -2 : return 1.0 / ((x)*(x));
case -1 : return 1.0 / (x);
case 1 : return (x);
case 2 : return (x)*(x);
case 3 : return (x)*(x)*(x);
default : return pow((x), (y));
}
}

I actually haven't tested this versus sticking the 0 in the middle (which I
will be doing shortly), but is this a sound replacement?
Well, there are numerical considerations (i.e., your results will not
be identical to just calling pow().) But your idea is very well
motivated. There really should be a powint(double x, int y) function
in C (that does even more.) The problem of doing this as quickly as
possible for all integers is actually an interesting one.
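For what it's worth, a sketch of such a powint() using exponentiation by
squaring might look like this (the name and the negative-exponent handling
are just illustrative, not anything standard):

static double powint(double x, int y)
{
    /* exponentiation by squaring: O(log |y|) multiplications */
    double result = 1.0;
    unsigned int n = (y < 0) ? -(unsigned int)y : (unsigned int)y;

    while (n != 0) {
        if (n & 1u)
            result *= x;
        x *= x;
        n >>= 1;
    }
    return (y < 0) ? 1.0 / result : result;
}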

Oh yeah, and the order of the cases should not matter if the compiler
has transformed this to a jump table. It only matters in the cases
where it doesn't do that, in which case you should just put the most
common cases as early as possible.
Or is there perhaps a more efficient solution?
Indeed. If y is mostly zero, then you should do this:

#define fastPowInt(x,y) ((0 == (y)) ? 1.0 : mult_power ((x), (y)))

The reason is that the single conditional on top can leverage a
modern microprocessor's branch prediction capabilities, and thus costs
the least possible overhead for the vast majority of cases. A
switch() indeed does use jump tables in many compilers, but
unless you are almost always switching to the same case, it has
roughly the same overhead as an indirect function call (calling through
a function pointer.) So the surrounding conditional, as shown above,
should improve performance further still.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jul 27 '06 #3
ch***********@gmail.com posted:
case -3 : return 1.0 / ((x)*(x)*(x));

Why the copious use of parentheses? You're not writing a macro.

return 1.0 / (x*x*x);

Or maybe:

return 1.0 / x / x / x;

Or is there perhaps a more efficient solution?

Presumably "pow" should be checking for integer exponents... ?

--

Frederick Gotham
Jul 27 '06 #4


ch***********@gmail.com wrote on 07/27/06 11:22:
Just want an opinion. I have an algorithm that needs to run as fast as
possible; in fact, it's already too slow. I've made as many algorithmic
changes as I can to reduce the amount of code, so now I'm turning to
micro-optimizations.

One function that gets run a bazillion times is the pow() function from
math.h. However, I've realized that probably 90% of the time the exponent
will be 0, and 99.9% of the time the exponent will lie between -3 and 3.
It sounds like you're approaching things correctly: You've
tried changing the algorithms, rearranging the calculations,
and other such structural changes (that's where the big wins
are), and found yourself still a little short of the required
performance. And you've made measurements that finger pow()
as one of the big time sinks, and you've counted the number
of times it's called, and you've even made an empirical study
of the exponent values.

(You HAVE, right? The "bazillion" and "90%" and "99.9%"
are actual measurements, right? Or at least they're rough
statements backed by measurements, right? If so, fine -- but
if not, go back and make the measurements before proceeding.)

Once you've got measurements that finger pow() as the
culprit and give some actual information about the exponent
distribution, your inline function seems a good way to exploit
the knowledge you've acquired about the program, knowledge
that pow() itself does not have.

Whether this will save "enough" time is something you
can only find out by measuring. You may be able to tell just
by considering your initial measurements, though: If you find
that pow() takes 20% of the running time, replacing pow() with
something that takes *zero* time can only speed up the program
by 25%. If you need a 10% speedup, it might be within reach --
but if you need to double the program's speed, this isn't
going to do the trick.
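(To spell out that arithmetic: if the whole run takes time T and pow()
accounts for 0.20*T, removing it entirely still leaves 0.80*T, and
T / (0.80*T) = 1.25 -- a 25% speedup at best. In general, eliminating a
fraction f of the runtime caps the speedup at 1/(1-f).)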
So I read that switches can sometimes be optimized into jump tables if
the case expressions are nearly contiguous integers. So I coded up this
little ditty:

static inline double
mult_power(double x, int y)
{
switch ((y))
{
case 0 : return 1.0;
case -3 : return 1.0 / ((x)*(x)*(x));
case -2 : return 1.0 / ((x)*(x));
case -1 : return 1.0 / (x);
case 1 : return (x);
case 2 : return (x)*(x);
case 3 : return (x)*(x)*(x);
default : return pow((x), (y));
}
}

I actually haven't tested this versus sticking the 0 in the middle (which I
will be doing shortly), but is this a sound replacement? Or is there
perhaps a more efficient solution?
The C language itself says nothing about the speeds of
the constructs you write in it, so questions of efficiency
must always refer to the platform or platforms. Still, it
is quite unlikely that the speed will be affected noticeably
by the order in which the case labels appear in the source
code. ((Also, the speed probably won't be affected if you
get rid of some of those needless parentheses ...))

If your "zero 90% of the time" figure is literally true,
you might (*might*) gain a milliquiver or so by testing for
it before doing the switch:

if (y == 0)
return 1.0;
switch (y) {
...
}

Differences due to this sort of thing are usually "down in
the noise," but a bazillion milliquivers is a hectotwitch
(IIRC), which might be noticeable. Keep in mind, though,
that this "optimization" might actually slow you down --
measure, measure, measure!

--
Er*********@sun.com

Jul 27 '06 #5

we******@gmail.com wrote:
ch***********@gmail.com wrote:
Just want an opinion. I have an algorithm that needs to run as fast as
possible; in fact, it's already too slow. I've made as many algorithmic
changes as I can to reduce the amount of code, so now I'm turning to
micro-optimizations.

I'm not sure this is the right newsgroup for such a question. Not a
lot of people in this group can fathom what exactly you are saying
here.
I see your point. I suppose the heart of this question is "what machine
code does this C code get translated into", which obviously is done by
the compiler (gcc 4.1.1 in my case).

Actually, this is more of an architecture-specific question, because
really the question is "what should I do to optimize this function for
AMD64 processors", heh. But then again, you did give some good insight
despite the wrong newsgroup! So thanks!

Chris
Jul 27 '06 #6
"Ancient_Hacker" <gr**@comcast.netwrites:
ch***********@gmail.com wrote:
>I actually haven't tested this versus sticking the 0 in the middle (which I
will be doing shortly), but is this a sound replacement? Or is there
perhaps a more efficient solution?

Well it depends on how smart your compiler is. It's not unimaginable
that a smart compiler would already be checking for the simple cases
and generating in-line code for this. Only way to find out is to try
it.

If your compiler is somewhat dumb, it may help to make this a macro,
so no call gets generated.
He declared the function as "static inline"; it's unlikely that making
it a macro will improve performance beyond that (assuming the compiler
really does inline it).

If this truly is a bottleneck, both measuring the code's actual
performance and looking at assembly listings could be helpful.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jul 27 '06 #7
we******@gmail.com wrote:
[...]
There really should be a powint(double x, int y) function
in C (that does even more. [...]
Wimp! Puling almsbeggar! Settle-for-half pantywaist!

C should have an exponentiation *operator*! Since the
obvious ^ has already been co-opted for another purpose and
since the traditional ** would be ambiguous due to other
overloadings, I propose the ^^ operator for exponentiation.
(Or, if folks would prefer it, the **^ operator -- best of
both worlds, don'tcha know.)

NEVER SURRENDER!
-- Dyed-in-the-wool FORTRAN II programmer (always
speaks in capital letters)

--
Eric Sosman
es*****@acm-dot-org.invalid

Jul 28 '06 #8
Chris,
Your problem is architecture- and compiler-specific. You have to become
"good friends" with your compiler and architecture to optimize this.
It seems like a good thing to do is to hint the branch predictor that
most of the time the value will be 0. So check out the
__builtin_expect() for GCC (there are a bunch of other __builtins for
constant folding and such...)
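For instance (a sketch only -- hinted_power is just an illustrative name
wrapping the mult_power from the original post, and likely/unlikely are the
usual convenience macros around the GCC builtin):

#define likely(e)   __builtin_expect(!!(e), 1)
#define unlikely(e) __builtin_expect(!!(e), 0)

static inline double hinted_power(double x, int y)
{
    if (likely(y == 0))          /* the exponent is 0 ~90% of the time */
        return 1.0;
    return mult_power(x, y);     /* fall back to the switch for everything else */
}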

If that is too much of a headache, try compiling with -fprofile-arcs,
run the code, then recompile with -fbranch-probabilities after that.
On some architectures you will see anywhere from no improvement to some
improvement. For example, between the latest AMD and Intel processors I
would expect a bigger improvement on Intel, as it has longer pipelines
(the longer the pipeline, the bigger the penalty for a wrong branch
prediction -- therefore the better the prediction, the better the chance
of a speed improvement).

Hope this helps,
Nick Vatamaniuc


Jul 28 '06 #9
Eric Sosman <es*****@acm-dot-org.invalid> writes:
we******@gmail.com wrote:
>[...]
There really should be a powint(double x, int y) function
in C (that does even more. [...]

Wimp! Puling almsbeggar! Settle-for-half pantywaist!

C should have an exponentiation *operator*! Since the
obvious ^ has already been co-opted for another purpose and
since the traditional ** would be ambiguous due to other
overloadings, I propose the ^^ operator for exponentiation.
(Or, if folks would prefer it, the **^ operator -- best of
both worlds, don'tcha know.)
If you use ^^ for exponentiation, what am I going to use for
short-circuit xor?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jul 28 '06 #10
Keith Thompson wrote:
Eric Sosman <es*****@acm-dot-org.invalid> writes:
>>we******@gmail.com wrote:
>>>[...]
There really should be a powint(double x, int y) function
>>>in C (that does even more. [...]

Wimp! Puling almsbeggar! Settle-for-half pantywaist!

C should have an exponentiation *operator*! Since the
obvious ^ has already been co-opted for another purpose and
since the traditional ** would be ambiguous due to other
overloadings, I propose the ^^ operator for exponentiation.
(Or, if folks would prefer it, the **^ operator -- best of
both worlds, don'tcha know.)

If you use ^^ for exponentiation, what am I going to use for
short-circuit xor?
Maybe we could abuse static once more...
^static
should do nicely.

Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Jul 28 '06 #11
Keith Thompson wrote:
Eric Sosman <es*****@acm-dot-org.invalid> writes:
>we******@gmail.com wrote:
>>[...]
There really should be a powint(double x, int y) function
in C (that does even more. [...]
Wimp! Puling almsbeggar! Settle-for-half pantywaist!

C should have an exponentiation *operator*! Since the
obvious ^ has already been co-opted for another purpose and
since the traditional ** would be ambiguous due to other
overloadings, I propose the ^^ operator for exponentiation.
(Or, if folks would prefer it, the **^ operator -- best of
both worlds, don'tcha know.)

If you use ^^ for exponentiation, what am I going to use for
short-circuit xor?
You could use ^|^
--
Flash Gordon, living in interesting times.
Web site - http://home.flash-gordon.me.uk/
comp.lang.c posting guidelines and intro:
http://clc-wiki.net/wiki/Intro_to_clc
Jul 28 '06 #12
ch***********@gmail.com wrote:
One function that gets run a bazillion times is the pow() function from
math.h. [...too slow ...] So I coded up this little ditty:

static inline double
mult_power(double x, int y)
{
switch ((y))
{
case 0 : return 1.0;
case -3 : return 1.0 / ((x)*(x)*(x));
case -2 : return 1.0 / ((x)*(x));
case -1 : return 1.0 / (x);
case 1 : return (x);
case 2 : return (x)*(x);
case 3 : return (x)*(x)*(x);
default : return pow((x), (y));
}
}
Someone already noted that the (x) can be just x because you are properly
using an inline function and not a macro. Bad habits die hard, eh?
I actually haven't tested this versus sticking the 0 in the middle (which I
will be doing shortly), but is this a sound replacement?
I don't see a case where this would not be sound, but I'd indeed put the
zero in the middle because it just belongs there and might make it easier
for the compiler to figure out things. Anyhow, whatever you do will
probably depend on the compiler and target platform, so if you target
multiple platforms leave yourself some space to switch implementations in
the back.
Or is there perhaps a more efficient solution?
Hard to tell. However, there is one thing I would at least try and that is
to use single-precision math. It's not so much that these are necessarily
faster (I have often seen powf() just cast and forward to pow()) but that
they put less load on the memory bus, i.e. you can fit more values into
the CPU's cache than with double. You then need to use float, the foof()
functions and the f-suffix on literals to make the whole consistent.
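A single-precision version of the helper might then look something like the
sketch below (mult_powerf is just an illustrative name; powf() is the C99
counterpart of pow(), and whether any of this actually helps is exactly the
kind of thing to measure):

#include <math.h>   /* for powf() */

static inline float
mult_powerf(float x, int y)
{
    switch (y)
    {
    case 0  : return 1.0f;
    case -3 : return 1.0f / (x*x*x);
    case -2 : return 1.0f / (x*x);
    case -1 : return 1.0f / x;
    case 1  : return x;
    case 2  : return x*x;
    case 3  : return x*x*x;
    default : return powf(x, (float)y);
    }
}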

HTH

Uli

Jul 28 '06 #13
Flash Gordon said:

<snip>
You could use ^|^
....which would not break any strictly conforming code.

Or maybe you could abandon artificial associations with existing
symbol/semantics relationships and go for :-) which, as far as I can tell,
would also not break any existing strictly conforming code, and which might
even cheer up the compiler.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Jul 28 '06 #14
Flash Gordon wrote:
Keith Thompson wrote:
>Eric Sosman <es*****@acm-dot-org.invalid> writes:
>>we******@gmail.com wrote:

[...]

There really should be a powint(double x, int y) function

in C (that does even more. [...]

Wimp! Puling almsbeggar! Settle-for-half pantywaist!

C should have an exponentiation *operator*! Since the
obvious ^ has already been co-opted for another purpose and
since the traditional ** would be ambiguous due to other
overloadings, I propose the ^^ operator for exponentiation.
(Or, if folks would prefer it, the **^ operator -- best of
both worlds, don'tcha know.)


If you use ^^ for exponentiation, what am I going to use for
short-circuit xor?


You could use ^|^
.... or even ;-)

--
Eric Sosman
es*****@acm-dot-org.invalid
Jul 28 '06 #15

Ulrich Eckhardt wrote:
ch***********@gmail.com wrote:
One function that gets run a bazillion times is the pow() function from
math.h. [...too slow ...] So I coded up this little ditty:

static inline double
mult_power(double x, int y)
{
switch ((y))
{
case 0 : return 1.0;
case -3 : return 1.0 / ((x)*(x)*(x));
case -2 : return 1.0 / ((x)*(x));
case -1 : return 1.0 / (x);
case 1 : return (x);
case 2 : return (x)*(x);
case 3 : return (x)*(x)*(x);
default : return pow((x), (y));
}
}

Someone already noted that the (x) can be just x because you are properly
using an inline function and not a macro. Bad habits die hard, eh?
I actually haven't tested this versus sticking the 0 in the middle (which I
will be doing shortly), but is this a sound replacement?

I don't see a case where this would not be sound, but I'd indeed put the
zero in the middle because it just belongs there and might make it easier
for the compiler to figure out things. Anyhow, whatever you do will
probably depend on the compiler and target platform, so if you target
multiple platforms leave yourself some space to switch implementations in
the back.
Or is there perhaps a more efficient solution?

Hard to tell. However, there is one thing I would at least try and that is
to use single-precision math. It's not so much that these are necessarily
faster (I have often seen powf() just cast and forward to pow()) but that
they put less load on the memory bus, i.e. you can fit more values into
the CPU's cache than with double. You then need to use float, the foof()
functions and the f-suffix on literals to make the whole consistent.
I actually have it in my task list to determine when and where using
float is appropriate. The problem with this power function in
particular is that it's used to evaluate binomial distributions, so the
"double x" is a probability < 1 and I'm multiplying a whole bunch (10
or so) together, so I worry about the error propagating to the point
where I have only a couple of significant digits. I have yet to do a formal
analysis on the error diff when multiplying/dividing a series of floats
together vs. doubles.

And as for all the (x)'s, I had it as a macro before ;) but sometimes
I'm dividing by the result and sometimes multiplying depending on some
condition, so I didn't want to introduce another test condition within
the macro to determine whether to use *= or /=.

i.e.
if( k 0)
pdf *= mult_power(probability, some_int);
else
pdf /= mult_power(probability, some_int);
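A quick empirical check along those lines might look like the sketch below
(the ten probabilities are made up purely for illustration; it only shows how
far the float product drifts from the double one, not a formal error bound):

#include <stdio.h>

int main(void)
{
    const double p[10] = { 0.91, 0.13, 0.47, 0.66, 0.05,
                           0.88, 0.32, 0.74, 0.29, 0.51 };
    double dprod = 1.0;
    float  fprod = 1.0f;
    int i;

    for (i = 0; i < 10; i++) {
        dprod *= p[i];           /* accumulate in double */
        fprod *= (float)p[i];    /* accumulate in single precision */
    }
    printf("double product: %.17g\n", dprod);
    printf("float  product: %.9g\n", (double)fprod);
    printf("relative diff : %g\n", (dprod - fprod) / dprod);
    return 0;
}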

Jul 28 '06 #16

Nick Vatamaniuc wrote:
Chris,
Your problem is architecture- and compiler-specific. You have to become
"good friends" with your compiler and architecture to optimize this.
It seems like a good thing to do is to hint the branch predictor that
most of the time the value will be 0. So check out the
__builtin_expect() for GCC (there are a bunch of other __builtins for
constant folding and such...)
Wow, never got that far in the GCC manual, but very interesting. I think
the __builtin_constant_p() function should be useful too, because I have a
bunch of constants that are read from a config file during an init
function called before processing. I'm not sure if the compiler can
determine that, once read in, they are never altered and therefore treat
them as constants, but it's worth investigating! Thanks.

Chris
Jul 28 '06 #17


ch***********@gmail.com wrote on 07/28/06 09:51:
[...]
I actually have it in my task list to determine when and where using
float is appropriate. The problem with this power function in
particular is that it's used to evaluate binomial distributions, so the
"double x" is a probability < 1 and I'm multiplying a whole bunch (10
or so) together, so I worry about the error propagating to the point
where I have only a couple of significant digits. I have yet to do a formal
analysis on the error diff when multiplying/dividing a series of floats
together vs. doubles.
<topicality level=marginal>

Roughly speaking, floating-point multiplication and
division don't increase the relative error very much. It's
addition and subtraction that are usually the troublemakers.

</topicality>

You say you're evaluating binomial distributions: What
exactly do you mean by that? There may be better ways to
proceed than by adding up the terms individually.
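(One common alternative, sketched below under the assumption that the terms
really are binomial probabilities with 0 < p < 1: compute each term in log
space with lgamma() from C99's <math.h> and exponentiate once at the end,
which sidesteps long chains of multiplications of small factors. Whether it
fits the actual computation here is another question.)

#include <math.h>

/* P(X = k) for X ~ Binomial(n, p), evaluated in log space.
 * Assumes 0 <= k <= n and 0 < p < 1; just a sketch, not a drop-in. */
static double binom_pmf(int n, int k, double p)
{
    double log_choose = lgamma(n + 1.0) - lgamma(k + 1.0) - lgamma(n - k + 1.0);
    return exp(log_choose + k * log(p) + (n - k) * log1p(-p));
}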

--
Er*********@sun.com

Jul 28 '06 #18
In article <11*********************@h48g2000cwc.googlegroups.com>,
ch***********@gmail.com <ch***********@gmail.com> wrote:
>I have yet to do a formal
analysis on the error diff when multiplying/dividing a series of floats
together vs. doubles.
float and double present a great many bands within which the error is
absolute, but the absolute error for any one band is approximately
relative to the magnitude of the values in that band. That makes for a
messy calculation.

Taking the slightly simpler case where the error is always relative,
f[i] +/- f[i]/c (e.g., the error is 1/10000 times the original value),
and simplifying even further by assuming f[i] - f[i]/c > 0, then
max(f[i]-f[i]/c, f[i]+f[i]/c) = f[i]+f[i]/c
min(f[i]-f[i]/c, f[i]+f[i]/c) = f[i]-f[i]/c
and then the maximum product is product(f[i]+f[i]/c, i=1..n)
and the minimum product is product(f[i]-f[i]/c, i=1..n)

Subtracting the second from the first and simplifying, the result is

product(f[i], i=1..n) * ((c+1)^n - (c-1)^n) / c^n

If we say that c = 10^7 and take 20 terms, that spread is about 4e-6
times the real product; for c = 10^15 and 20 terms, it is about 4e-14
times the real product.
--
There are some ideas so wrong that only a very intelligent person
could believe in them. -- George Orwell
Jul 28 '06 #19
On Fri, 28 Jul 2006, ch***********@gmail.com wrote:
[optimizations snipped]

The problem with this power function in
particular is that it's used to evaluate binomial distributions, so the
Have you investigated the normal approximation ``algorithm''
before attempting your micro-optimizations? I am still
unconvinced by all the micro-optimization stuff posted in
this thread thus far.

Tak-Shing
Jul 28 '06 #20
Eric Sosman <es*****@acm-dot-org.invalid> wrote:
>
C should have an exponentiation *operator*!
It has one: pow(). There's nothing stopping the compiler from recognizing
it and generating inline code, and some compilers do just that. Some
even recognize the case where the second argument is an integer and
generate more efficient code.

-Larry Jones

Hello, I'm wondering if you sell kegs of dynamite. -- Calvin
Jul 28 '06 #21
Nick Vatamaniuc wrote:
Chris,

Please don't top-post. Your replies belong following or interspersed
with properly trimmed quotes.


Brian
Jul 28 '06 #22
la************@ugs.com writes:
Eric Sosman <es*****@acm-dot-org.invalid> wrote:
>>
C should have an exponentiation *operator*!

It has one: pow(). There's nothing stopping the compiler from recognizing
it and generating inline code, and some compilers do just that. Some
even recognize the case where the second argument is an integer and
generate more efficient code.
Sure, but they're not required to do so.

Efficiency aside, if C had an exponentiation operator overloaded for
either integer or floating-point exponents, you could be sure that
2**3 would yield 8 -- but pow(2, 3), which is really pow(2.0, 3.0),
could yield 7.99999999375.
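A small illustration of that pitfall (whether pow() really comes back
inexact for these arguments depends on the implementation; many get it
exactly right, which is precisely the problem -- you can't rely on it
either way):

#include <math.h>
#include <stdio.h>

int main(void)
{
    double r = pow(2.0, 3.0);    /* may or may not be exactly 8.0 */
    printf("pow(2,3)     = %.17g\n", r);
    printf("(int)r       = %d\n", (int)r);          /* truncation: 7 if r is just below 8 */
    printf("(int)(r+0.5) = %d\n", (int)(r + 0.5));  /* rounding to nearest avoids that */
    return 0;
}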

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jul 29 '06 #23
>Eric Sosman <es*****@acm-dot-org.invalid> wrote:
>> C should have an exponentiation *operator*!
>la************@ugs.com writes:
>It has one: pow(). There's nothing stopping the compiler from recognizing
it and generating inline code, and some compilers do just that. Some
even recognize the case where the second argument is an integer and
generate more efficient code.
In article <ln************@nuthaus.mib.org>
Keith Thompson <ks***@mib.org> wrote:
>Sure, but they're not required to do so.
Right. Specifically, making the operator look exactly like an
ordinary function call makes it particularly easy for "lazy"
compilers to generate an ordinary function call: the compiler-writer
does not even need to convert an expression tree node into a call
to __pow(), for instance.
>Efficiency aside, if C had an exponentiation operator overloaded for
either integer or floating-point exponents, you could be sure that
2**3 would yield 8 ...
Only if that were part of the specification. For instance, suppose
C really did have a "**" operator, but it were allowed to return
a "double", and a C compiler could turn:

a = b ** c;

into:

a = __pow((double)b, (double)c);

Then the situation would be the same as today.

In other words -- and this is a general principle -- the important
thing is not the syntax, but the required semantics.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Jul 29 '06 #24
Chris Torek <no****@torek.net> writes:
>>Eric Sosman <es*****@acm-dot-org.invalid> wrote:
C should have an exponentiation *operator*!
la************@ugs.com writes:
>>It has one: pow(). There's nothing stopping the compiler from recognizing
it and generating inline code, and some compilers do just that. Some
even recognize the case where the second argument is an integer and
generate more efficient code.

In article <ln************@nuthaus.mib.org>
Keith Thompson <ks***@mib.org> wrote:
>>Sure, but they're not required to do so.

Right. Specifically, making the operator look exactly like an
ordinary function call makes it particularly easy for "lazy"
compilers to generate an ordinary function call: the compiler-writer
does not even need to convert an expression tree node into a call
to __pow(), for instance.
>>Efficiency aside, if C had an exponentiation operator overloaded for
either integer or floating-point exponents, you could be sure that
2**3 would yield 8 ...

Only if that were part of the specification. For instance, suppose
C really did have a "**" operator, but it were allowed to return
a "double", and a C compiler could turn:

a = b ** c;

into:

a = __pow((double)b, (double)c);

Then the situation would be the same as today.

In other words -- and this is a general principle -- the important
thing is not the syntax, but the required semantics.
Agreed.

My point here is that C already has a limited form of overloading for
predefined operators (the meaning of x + y depends on the types of x
and y) , but not for functions. If we wanted to provide an overloaded
exponentation thingie, which would by definition be equivalent to
repeated multiplication for an integer right operand, the most natural
way to do it would be to make it an operator.

(I'm ignoring C99's <tgmath.h>.)

(And, of course, we couldn't use "**" without breaking existing code;
b ** c already means b * (*c). Disambiguating it based on the type of
the right operand would be possible, I suppose, but that would mean
that tokenization would be affected by expression types, which is just
too ugly to contemplate.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jul 29 '06 #25
Chris Torek wrote:
[...]
In other words -- and this is a general principle -- the important
thing is not the syntax, but the required semantics.
Syntax is important, too, because a helpful syntax can
enhance readability (and writeability).

x1 = div(sub(neg(b),sqrt(sub(pow(b,2),mul(mul(4,a),c)))),mul(2,a));

(Confession: It took me *three* tries to get this grade-school
formula right -- if, indeed, I *have* gotten it right! I submit
that programmer productivity would be severely impaired if one
were to replace "ordinary" operators with function calls.)

--
Eric Sosman
es*****@acm-dot-org.invalid
Jul 29 '06 #26
On Fri, 28 Jul 2006, Eric Sosman wrote:
Chris Torek wrote:
>[...]
In other words -- and this is a general principle -- the important
thing is not the syntax, but the required semantics.

Syntax is important, too, because a helpful syntax can
enhance readability (and writeability).

x1 = div(sub(neg(b),sqrt(sub(pow(b,2),mul(mul(4,a),c)))),mul(2,a));
[OT] LISP programmers might disagree (with this particular
example).

Tak-Shing
Jul 29 '06 #27
Tak-Shing Chan wrote:
On Fri, 28 Jul 2006, Eric Sosman wrote:
>Chris Torek wrote:
>>[...]
In other words -- and this is a general principle -- the important
thing is not the syntax, but the required semantics.


Syntax is important, too, because a helpful syntax can
enhance readability (and writeability).

x1 = div(sub(neg(b),sqrt(sub(pow(b,2),mul(mul(4,a),c)))),mul(2,a));


[OT] LISP programmers might disagree (with this particular
example).
[OT] LISP programmers go into debt trying to buy enough
parentheses to get through the day. Been there, done that
(professionally), still hearing from collection agencies. ;-)

Seriously, one just doesn't find Lisp programmers writing
single-expression forms as complex as those C and FORTRAN folks
dash off as a matter of course. The expression above, in LISP,
would most likely be split into two if not three parts, with
one or perhaps two temporary variables just to hold intermediate
results:

(setq d (- (pow b 2) (* 4 a c)))
(setq n (- (- b) (sqrt d)))
(setq x1 (/ n (mul 2 a)))

In principle, of course, this could be written

(setq x1 (/ (- (- b) (sqrt (- (pow b 2) (* 4 a c)))) (mul 2 a)))

.... but that hardly ever happens. When it does, it's as hard
to read as ... well, just *look* at it. (And note that this
example takes advantage of LISP's "variadic" functions in a way
not available to C; the C equivalent, above, is even worse.)

With practice one becomes more adept at reading this kind
of verbose linearized tree (as I said: been there), but it's
never going to be as readable as the C/FORTRAN/ALGOL syntax
that enjoys the benefit of a close resemblance to the standard
mathematical notation we've all been reading since grade school.

Real-life example: Here, in all its ugliness, is an actual
expression from one of my actual C programs:

quadratic (&t1, &t2,
           SQUARE(xvj - xvk) + SQUARE(yvj - yvk),
           2.0 * ((xcj - xck) * (xvj - xvk)
                  + (ycj - yck) * (yvj - yvk)),
           SQUARE(xcj - xck) + SQUARE(ycj - yck)
               - SQUARE(rj + ball[k].r));

Even with the benefits of C's looks-like-mathematics syntax,
this thing is sufficiently difficult to read that I went to
a little extra work chopping it into multiple lines and adding
suggestive indentation. Still, it remains comprehensible, if
just barely so. Challenge: Translate this into LISP (or another
syntax using functional forms for operators) and make it no less
readable than the above. I'll even allow you to leave out the
first two arguments, as they're artifacts of the way C does things.
I'll bet you a dollar^H^H^H^H^H^H^H denarius^H^H^H^H^H^H^H^H^H^H^H
ten electrons you can't do it.

--
Eric Sosman
es*****@acm-dot-org.invalid
Jul 29 '06 #28
On Fri, 28 Jul 2006, Eric Sosman wrote:

[snip]
[OT] LISP programmers go into debt trying to buy enough
parentheses to get through the day. Been there, done that
(professionally), still hearing from collection agencies. ;-)

Seriously, one just doesn't find Lisp programmers writing
single-expression forms as complex as those C and FORTRAN folks
dash off as a matter of course. The expression above, in LISP,
would most likely be split into two if not three parts, with
one or perhaps two temporary variables just to hold intermediate
results:

(setq d (- (pow b 2) (* 4 a c)))
(setq n (- (- b) (sqrt d)))
(setq x1 (/ n (mul 2 a)))

In principle, of course, this could be written

(setq x1 (/ (- (- b) (sqrt (- (pow b 2) (* 4 a c)))) (mul 2 a)))

... but that hardly ever happens. When it does, it's as hard
to read as ... well, just *look* at it. (And note that this
example takes advantage of LISP's "variadic" functions in a way
not available to C; the C equivalent, above, is even worse.)

With practice one becomes more adept at reading this kind
of verbose linearized tree (as I said: been there), but it's
never going to be as readable as the C/FORTRAN/ALGOL syntax
that enjoys the benefit of a close resemblance to the standard
mathematical notation we've all been reading since grade school.

Real-life example: Here, in all its ugliness, is an actual
expression from one of my actual C programs:

quadratic (&t1, &t2,
SQUARE(xvj - xvk) + SQUARE(yvj - yvk),
2.0 * ((xcj - xck) * (xvj - xvk)
+ (ycj - yck) * (yvj - yvk)),
SQUARE(xcj - xck) + SQUARE(ycj - yck)
- SQUARE(rj + ball[k].r));

Even with the benefits of C's looks-like-mathematics syntax,
this thing is sufficiently difficult to read that I went to
a little extra work chopping it into multiple lines and adding
suggestive indentation. Still, it remains comprehensible, if
just barely so. Challenge: Translate this into LISP (or another
syntax using functional forms for operators) and make it no less
readable than the above. I'll even allow you to leave out the
first two arguments, as they're artifacts of the way C does things.
I'll bet you a dollar^H^H^H^H^H^H^H denarius^H^H^H^H^H^H^H^H^H^H^H
ten electrons you can't do it.
I claim that this is more readable:

(quadratic &t1 &t2
(squared-dist xvdiff yvdiff)
(* 2.0
(+ (* xcdiff xvdiff) (* ycdiff yvdiff)))
(squared-dist-corrected xcdiff ycdiff rj ball[k].r))

Where can I get my ten electrons? ;-)

Tak-Shing
Jul 29 '06 #29
Tak-Shing Chan wrote:
On Fri, 28 Jul 2006, Eric Sosman wrote:
>[...]
Seriously, one just doesn't find Lisp programmers writing
single-expression forms as complex as those C and FORTRAN folks
dash off as a matter of course. The expression above, in LISP,
would most likely be split into two if not three parts, with
one or perhaps two temporary variables just to hold intermediate
results:
[...]

Real-life example: Here, in all its ugliness, is an actual
expression from one of my actual C programs:

quadratic (&t1, &t2,
SQUARE(xvj - xvk) + SQUARE(yvj - yvk),
2.0 * ((xcj - xck) * (xvj - xvk)
+ (ycj - yck) * (yvj - yvk)),
SQUARE(xcj - xck) + SQUARE(ycj - yck)
- SQUARE(rj + ball[k].r));

Even with the benefits of C's looks-like-mathematics syntax,
this thing is sufficiently difficult to read that I went to
a little extra work chopping it into multiple lines and adding
suggestive indentation. Still, it remains comprehensible, if
just barely so. Challenge: Translate this into LISP (or another
syntax using functional forms for operators) and make it no less
readable than the above. I'll even allow you to leave out the
first two arguments, as they're artifacts of the way C does things.
I'll bet you a dollar^H^H^H^H^H^H^H denarius^H^H^H^H^H^H^H^H^H^H^H
ten electrons you can't do it.

I claim that this is more readable:

(quadratic &t1 &t2
(squared-dist xvdiff yvdiff)
(* 2.0
(+ (* xcdiff xvdiff) (* ycdiff yvdiff)))
(squared-dist-corrected xcdiff ycdiff rj ball[k].r))

Where can I get my ten electrons? ;-)
Go to the FedEX web site and use the tracking number
I sent you earlier. (You didn't get the tracking number?
Use the trackingtracking number to find it. You didn't
get a trackingtracking number either? Time to look up
the trackingtrackingtracking number ...)

As to the readability, I agree that your formula is more
readable than mine, but I contend that's not due to the use
of a functional rather than an infix notation: it's because of
the introduction of temporary variables to hold intermediate
results. That's exactly the practice I mentioned earlier; it
seems to be the LISP programmer's reflexive response to the
inherent unreadability of his language. (At least, when I was
writing and reading a lot of LISP, that was my reflex ...)

Back to C: Its lack of an exponentiation operator is rather
conspicuous (Lawrence Jones' claim that pow() *is* an operator
notwithstanding). However, it's a lack we've learned to live
with, and despite my own grumblings it doesn't really hurt all
*that* badly. Without an exponentiator, C is without perfumed
breezes but not without oxygen.

--
Eric Sosman
es*****@acm-dot-org.invalid
Jul 29 '06 #30
Keith Thompson <ks***@mib.org> wrote:
>
(And, of course, we couldn't use "**" without breaking existing code;
b ** c already means b * (*c).
But it would be the best kind of breakage -- for b**c to be valid, c
must have a pointer type, which wouldn't be valid for exponentiation
and thus wouldn't compile. Hopefully, most people wouldn't write that
without any whitespace between the *'s, so there wouldn't be very much
broken code.

The bigger problem would be with unary usages (**p), since writing
the *'s without any intervening whitespace is typical and C's greedy
tokenization would interpret it as exponentiation instead of two
dereferences.

-Larry Jones

I don't see why some people even HAVE cars. -- Calvin
Jul 29 '06 #31

la************@ugs.com wrote:
Keith Thompson <ks***@mib.org> wrote:

(And, of course, we couldn't use "**" without breaking existing code;
b ** c already means b * (*c).

But it would be the best kind of breakage -- for b**c to be valid, c
must have a pointer type, which wouldn't be valid for exponentiation
and thus wouldn't compile. Hopefully, most people wouldn't write that
without any whitespace between the *'s, so there wouldn't be very much
broken code.

The bigger problem would be with unary usages (**p), since writing
the *'s without any intervening whitespace is typical and C's greedy
tokenization would interpret it as exponentiation instead of two
dereferences.
No reason that ** has to be made a separate token. In the type
pass, whenever there is a**b and b isn't a pointer, the compiler
could adjust the parse tree so that exponentiation is done for
the two adjacent * operators. Alternatively, unary ** with a
pointer operand could mean "double dereference".

My fear is that someone will take these suggestions seriously
and propose one or the other in earnest to the ISO committee.

Jul 31 '06 #32
In article <11**********************@75g2000cwc.googlegroups.com>
<en******@yahoo.com> wrote:
>No reason that ** has to be made a separate token. In the type
pass, whenever there is a**b and b isn't a pointer, the compiler
could adjust the parse tree so that exponentiation is done for
the two adjacent * operators.
(This alternative scares me :-) )
>Alternatively, unary ** with a
pointer operand could mean "double dereference".
This one, I like. Only one problem...

x = 2**p;

is valid today if p is a pointer; it parses as:

x = 2 * (*p);

You would have to make the rule be even more complicated: a ** b
means "a to the b power if b is arithmetic, but means a multiplied
by *b if b is a pointer". (The unary operator could always mean
double-indirection.) Note this still works with:

T **pp;
...
x = 2***pp;

which parses as:

x = 2 ** (*pp);

and (*pp) is still a pointer, so the ** operator does the right
thing. Similarly:

T ***q;
...
x = 2****q;

parses as:

x = 2 ** (**q);

and the unary "**" operator means double-indirection, leaving a
pointer, so that 2**(that) is still 2 * *(that).
>My fear is that someone will take these suggestions seriously
and propose one or the other in earnest to the ISO committee.
Hey, it is no worse than:

char arr[4] = "ugh!"; /* no \0 */

:-)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Jul 31 '06 #33
