When to use std::pow(x,n) instead of times x for n times?

Hi,

I'm wondering if there is any general guideline on when to use
something like
std::pow(x, n)
rather than
x * x * x * ... * x (n x's).

Thanks,
Peng
Sep 10 '08 #1
On Tue, 9 Sep 2008 17:30:04 -0700 (PDT), Peng Yu <Pe*******@gmail.com>
wrote in comp.lang.c++:
Hi,

I'm wondering if there is any general guideline on when to using
something like
std::pow(x, n)
rather than
x * x * x * ... * x (n x's).

Thanks,
Peng
In addition to what Alf said, here's my advice:

ALWAYS use std::pow(x, n) instead of trying to multiply 'x' by
itself multiple times when 'n' is not integral.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html
Sep 10 '08 #2
On Tue, 09 Sep 2008 17:30:04 -0700, Peng Yu wrote:
Hi,

I'm wondering if there is any general guideline on when to using
something like
std::pow(x, n)
rather than
x * x * x * ... * x (n x's).

Thanks,
Peng
It can be hard to represent fractional powers in x*x notation.

sf
Sep 10 '08 #3
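To make the point about non-integral exponents concrete, here is a minimal sketch. The helper name `fractional_power` is made up for illustration; the only library call it relies on is `std::pow` from `<cmath>`.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: x raised to a non-integral exponent.
// No finite product x * x * ... * x equals x^0.5 or x^2.5, so a library
// routine such as std::pow (or std::sqrt for the 0.5 case) is needed.
double fractional_power(double x, double exponent) {
    // A negative base with a non-integral exponent is a domain error
    // for std::pow (typically NaN), so this sketch rules it out.
    assert(x >= 0.0);
    return std::pow(x, exponent);
}
```

Calling `fractional_power(9.0, 0.5)` gives the square root of 9, and `fractional_power(4.0, 2.5)` gives 4 squared times the square root of 4.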
Peng Yu wrote:
I'm wondering if there is any general guideline on when to using
something like
std::pow(x, n)
rather than
x * x * x * ... * x (n x's).
If you need to calculate that function millions of times per second,
then using the latter form can be considerably faster up to a certain n
(after which std::pow() becomes faster). The maximum n for which the
latter form is faster than std::pow() can be surprisingly large,
depending on the code (eg. something like n=8 might not be far-fetched).

Of course this is heavily system-dependent so there's no rule.
Sep 10 '08 #4
On Sep 10, 4:24 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
Peng Yu wrote:
I'm wondering if there is any general guideline on when to using
something like
std::pow(x, n)
rather than
x * x * x * ... * x (n x's).

If you need to calculate that function millions of times per second,
then using the latter form can be considerably faster up to a certain n
(after which std::pow() becomes faster).
Why? There is no reason for the compiler not to transform pow(x,
<integral-constant>) to the latter form if it were actually faster
(and in fact some compilers do).

--
gpd
Sep 10 '08 #5
Juha Nieminen wrote:
Peng Yu wrote:
>I'm wondering if there is any general guideline on when to using
something like
std::pow(x, n)
rather than
x * x * x * ... * x (n x's).

If you need to calculate that function millions of times per second,
then using the latter form can be considerably faster up to a certain n
(after which std::pow() becomes faster). The maximum n for which the
latter form is faster than std::pow() can be surprisingly large,
depending on the code (eg. something like n=8 might not be far-fetched).

Of course this is heavily system-dependent so there's no rule.
It's always best to profile prior to optimization attempts. Rules of
thumb and "common sense" are frequently wrong; measurements seldom are.
Sep 11 '08 #6
gpderetta wrote:
Why? There is no reason for the compiler not to transform pow(x,
<integral-constant>) to the latter form if it were actually faster
(and in fact some compilers do).
Some compilers might be able to perform that optimization; others can't.
And if n is a variable, then it cannot optimize it. (At most the pow()
function itself might have optimizations in it, but in my experience it
doesn't: With most compilers it just generates the FPU opcodes necessary
to calculate the result.)

But we don't have to speculate about this as it's trivially easy to
test in practice. Go ahead and try it.
Sep 11 '08 #7
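For anyone who wants to run that test, a rough timing harness could look like the sketch below. It is only an assumption about how one might measure this, not anything posted in the thread: the function names are invented, the exponent 4 is an arbitrary choice, and the numbers will vary with compiler, flags, and hardware.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <vector>

// Hypothetical kernels under test: both sum x^4 over the inputs.
double sum_pow4_with_pow(const std::vector<double>& xs) {
    double sum = 0.0;
    for (double x : xs) sum += std::pow(x, 4.0);
    return sum;
}

double sum_pow4_with_mul(const std::vector<double>& xs) {
    double sum = 0.0;
    for (double x : xs) sum += x * x * x * x;
    return sum;
}

// Runs `kernel` once and returns the elapsed time in seconds. The result is
// stored in `out` so the compiler cannot discard the computation as unused.
template <typename Kernel>
double time_seconds(Kernel kernel, const std::vector<double>& xs, double& out) {
    const auto start = std::chrono::steady_clock::now();
    out = kernel(xs);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Fill a vector with a few million values, call `time_seconds` with each kernel, and compare the two timings. Repeating the measurement several times and building with the same optimization flags as the real program makes the comparison more trustworthy.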
On Sep 11, 6:39 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
gpderetta wrote:
Why? There is no reason for the compiler not to transform pow(x,
<integral-constant>) to the latter form if it were actually faster
(and in fact some compilers do).

Some compilers might be able to do that optimizations, others aren't.
And if n is a variable, then it cannot optimize it.
And if it is a variable, you can't write an explicit expression either.
You could use a for loop.
(At most the pow()
function itself might have optimizations in it, but in my experience it
doesn't: With most compilers it just generates the FPU opcodes necessary
to calculate the result.)
Today's hand optimizations are tomorrow's pessimizations. Let the compiler
do its job.

The usual rule applies: use pow, and only if the profiler tells you it is a
bottleneck, try to optimize it by hand.
>
But we don't have to speculate about this as it's trivially easy to
test in practice. Go ahead and try it.
I had already tried. 'pow(x, 16)' is inlined exactly as four
multiplies, at least with a recent gcc.

--
gpd
Sep 11 '08 #8
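The "four multiplies" figure comes from repeated squaring: x^16 = (((x^2)^2)^2)^2. The sketch below writes that out by hand; the function names are hypothetical, and whether a given compiler actually emits this for `std::pow(x, 16)` depends on its version and flags.

```cpp
#include <cassert>
#include <cmath>

// x^16 by repeated squaring: four multiplications instead of fifteen.
double pow16_by_squaring(double x) {
    const double x2 = x * x;    // x^2
    const double x4 = x2 * x2;  // x^4
    const double x8 = x4 * x4;  // x^8
    return x8 * x8;             // x^16
}

// Sanity check: the hand-written form agrees with std::pow up to rounding.
void check_pow16(double x) {
    const double expected = std::pow(x, 16);
    assert(std::fabs(pow16_by_squaring(x) - expected) <= 1e-12 * std::fabs(expected));
}
```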
gpderetta wrote:
> Some compilers might be able to do that optimizations, others aren't.
And if n is a variable, then it cannot optimize it.

and if it is variable, you can't write an explicit expression either.
You could use a for loop
In my experience even performing a set of multiplications in a loop
while interpreting bytecode can be faster than a single std::pow() call,
up to a certain exponent.

I have made a function parser/interpreter, and in practice eg.
interpreting the function "x*x*x*x" (which it bytecompiles to three
multiplications) is faster than "x^4" (which it bytecompiles to one
std::pow() call). std::pow() can be incredibly slow.
Sep 12 '08 #9
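When n is only known at run time, the two hand-written options discussed here are a plain loop and square-and-multiply. Both are sketched below under the assumption that n is a non-negative integer; the function names are illustrative and not taken from any poster's code.

```cpp
#include <cassert>

// Plain loop: n multiplications.
double power_by_loop(double x, int n) {
    assert(n >= 0);  // negative exponents would need 1/x; left out of this sketch
    double result = 1.0;
    for (int i = 0; i < n; ++i) result *= x;
    return result;
}

// Square-and-multiply: roughly 2 * log2(n) multiplications at most.
double power_by_squaring(double x, int n) {
    assert(n >= 0);
    double result = 1.0;
    while (n != 0) {
        if (n & 1) result *= x;  // this bit of n contributes the current power of x
        x *= x;                  // x, x^2, x^4, x^8, ...
        n >>= 1;
    }
    return result;
}
```

For small n the plain loop may well be just as fast, and the two can differ in the last bits of the result because they round in a different order, so measuring both against `std::pow` on the target system is still the deciding step.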
On Sep 11, 4:56 pm, gpderetta <gpdere...@gmail.com> wrote:
On Sep 11, 6:39 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
gpderetta wrote:
Why? There is no reason for the compiler not to transform pow(x,
<integral-constant>) to the latter form if it were actually faster
(and in fact some compilers do).
Some compilers might be able to do that optimizations, others aren't.
And if n is a variable, then it cannot optimize it.

and if it is variable, you can't write an explicit expression either.
You could use a for loop
(At most the pow()
function itself might have optimizations in it, but in my experience it
doesn't: With most compilers it just generates the FPU opcodes necessary
to calculate the result.)

Today hand optimizations are tomorrow pessimizations. Let the compiler
do its job.

The usual rule apply: use pow, and only if the profiler tells it is a
bottleneck, try to optimize it by hand.
But we don't have to speculate about this as it's trivially easy to
test in practice. Go ahead and try it.

I had already tried. 'pow(x, 16)' is inlined exactly as four
multiplies, at least with a recent gcc.
Would you please let me know the details of the procedure you used to
figure this out? Sometimes I want to know what the compiler compiles
the code to.

Thanks,
Peng
Sep 13 '08 #10
On Sep 13, 4:09 pm, "Alf P. Steinbach" <al...@start.no> wrote:
* Peng Yu:
Sometimes I want to know what the compiler compile
the code to.

e.g.

g++ -S -masm=intel x.cpp
It is pretty hard to figure out which part of the assembly code is
associated with a given portion of the source code. Take, for example,
the following C++ and assembly code. How do I figure out where the pow
functions are in the assembly?

Thanks,
Peng

$ cat main.cc main.s
#include <cmath>
#include <iostream>

int main() {
    double x = 1;
    std::cout << "pox(x, 1) = " << std::pow(x, 1) << std::endl;
    std::cout << "pox(x, 2) = " << std::pow(x, 2) << std::endl;
    std::cout << "pox(x, 3) = " << std::pow(x, 3) << std::endl;
    std::cout << "pox(x, 4) = " << std::pow(x, 4) << std::endl;
    std::cout << "pox(x, 5) = " << std::pow(x, 5) << std::endl;
    std::cout << "pox(x, 6) = " << std::pow(x, 6) << std::endl;
    std::cout << "pox(x, 7) = " << std::pow(x, 7) << std::endl;
    std::cout << "pox(x, 8) = " << std::pow(x, 8) << std::endl;
    std::cout << "pox(x, 9) = " << std::pow(x, 9) << std::endl;
    std::cout << "pox(x, 10) = " << std::pow(x, 10) << std::endl;
    std::cout << "pox(x, 11) = " << std::pow(x, 11) << std::endl;
    std::cout << "pox(x, 12) = " << std::pow(x, 12) << std::endl;
    std::cout << "pox(x, 13) = " << std::pow(x, 13) << std::endl;
    std::cout << "pox(x, 14) = " << std::pow(x, 14) << std::endl;
    std::cout << "pox(x, 15) = " << std::pow(x, 15) << std::endl;
    std::cout << "pox(x, 16) = " << std::pow(x, 16) << std::endl;
}
.file "main.cc"
.intel_syntax
.section .ctors,"aw",@progbits
.align 8
.quad _GLOBAL__I_main
.text
.align 2
.type _Z41__static_initialization_and_destruction_0ii,
@function
_Z41__static_initialization_and_destruction_0ii:
..LFB1504:
push %rbp
..LCFI0:
mov %rbp, %rsp
..LCFI1:
sub %rsp, 16
..LCFI2:
mov DWORD PTR [%rbp-4], %edi
mov DWORD PTR [%rbp-8], %esi
cmp DWORD PTR [%rbp-4], 1
jne .L5
cmp DWORD PTR [%rbp-8], 65535
jne .L5
mov %edi, OFFSET FLAT:_ZSt8__ioinit
call _ZNSt8ios_base4InitC1Ev
mov %edx, OFFSET FLAT:__dso_handle
mov %esi, 0
mov %edi, OFFSET FLAT:__tcf_0
call __cxa_atexit
..L5:
leave
ret
..LFE1504:
.size _Z41__static_initialization_and_destruction_0ii, .-
_Z41__static_initialization_and_destruction_0ii
..globl __gxx_personality_v0
.align 2
.type _GLOBAL__I_main, @function
_GLOBAL__I_main:
..LFB1506:
push %rbp
..LCFI3:
mov %rbp, %rsp
..LCFI4:
mov %esi, 65535
mov %edi, 1
call _Z41__static_initialization_and_destruction_0ii
leave
ret
..LFE1506:
.size _GLOBAL__I_main, .-_GLOBAL__I_main
.align 2
.type __tcf_0, @function
__tcf_0:
..LFB1505:
push %rbp
..LCFI5:
mov %rbp, %rsp
..LCFI6:
sub %rsp, 16
..LCFI7:
mov QWORD PTR [%rbp-8], %rdi
mov %edi, OFFSET FLAT:_ZSt8__ioinit
call _ZNSt8ios_base4InitD1Ev
leave
ret
..LFE1505:
.size __tcf_0, .-__tcf_0
..globl __powidf2
.section .text._ZSt3powdi,"axG",@progbits,_ZSt3powdi,comdat
.align 2
.weak _ZSt3powdi
.type _ZSt3powdi, @function
_ZSt3powdi:
..LFB54:
push %rbp
..LCFI8:
mov %rbp, %rsp
..LCFI9:
sub %rsp, 32
..LCFI10:
movsd QWORD PTR [%rbp-8], %xmm0
mov DWORD PTR [%rbp-12], %edi
mov %edi, DWORD PTR [%rbp-12]
movlpd %xmm0, QWORD PTR [%rbp-8]
call __powidf2
movsd QWORD PTR [%rbp-24], %xmm0
mov %rax, QWORD PTR [%rbp-24]
mov QWORD PTR [%rbp-24], %rax
movlpd %xmm0, QWORD PTR [%rbp-24]
leave
ret
..LFE54:
.size _ZSt3powdi, .-_ZSt3powdi
.section .rodata
..LC1:
.string "pox(x, 1) = "
..LC2:
.string "pox(x, 2) = "
..LC3:
.string "pox(x, 3) = "
..LC4:
.string "pox(x, 4) = "
..LC5:
.string "pox(x, 5) = "
..LC6:
.string "pox(x, 6) = "
..LC7:
.string "pox(x, 7) = "
..LC8:
.string "pox(x, 8) = "
..LC9:
.string "pox(x, 9) = "
..LC10:
.string "pox(x, 10) = "
..LC11:
.string "pox(x, 11) = "
..LC12:
.string "pox(x, 12) = "
..LC13:
.string "pox(x, 13) = "
..LC14:
.string "pox(x, 14) = "
..LC15:
.string "pox(x, 15) = "
..LC16:
.string "pox(x, 16) = "
.text
.align 2
..globl main
.type main, @function
main:
..LFB1496:
push %rbp
..LCFI11:
mov %rbp, %rsp
..LCFI12:
push %rbx
..LCFI13:
sub %rsp, 24
..LCFI14:
movabs %rax, 4607182418800017408
mov QWORD PTR [%rbp-16], %rax
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 1
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC1
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 2
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC2
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 3
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC3
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 4
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC4
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 5
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC5
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 6
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC6
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 7
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC7
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 8
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC8
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 9
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC9
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 10
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC10
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 11
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC11
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 12
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC12
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 13
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC13
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 14
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC14
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 15
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC15
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %rax, QWORD PTR [%rbp-16]
mov %edi, 16
mov QWORD PTR [%rbp-32], %rax
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZSt3powdi
movsd QWORD PTR [%rbp-32], %xmm0
mov %rbx, QWORD PTR [%rbp-32]
mov %esi, OFFSET FLAT:.LC16
mov %edi, OFFSET FLAT:_ZSt4cout
call
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES 5_PKc
mov %rdi, %rax
mov QWORD PTR [%rbp-32], %rbx
movlpd %xmm0, QWORD PTR [%rbp-32]
call _ZNSolsEd
mov %rdi, %rax
mov %esi, OFFSET
FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostr eamIT_T0_ES6_
call _ZNSolsEPFRSoS_E
mov %eax, 0
add %rsp, 24
pop %rbx
leave
ret
..LFE1496:
.size main, .-main
.local _ZSt8__ioinit
.comm _ZSt8__ioinit,1,1
.weakref _Z20__gthrw_pthread_oncePiPFvvE,pthread_once
.weakref
_Z27__gthrw_pthread_getspecificj,pthread_getspecif ic
.weakref
_Z27__gthrw_pthread_setspecificjPKv,pthread_setspe cific
.weakref
_Z22__gthrw_pthread_createPmPK14pthread_attr_tPFPv S3_ES3_,pthread_create
.weakref _Z22__gthrw_pthread_cancelm,pthread_cancel
.weakref
_Z26__gthrw_pthread_mutex_lockP15pthread_mutex_t,p thread_mutex_lock
.weakref
_Z29__gthrw_pthread_mutex_trylockP15pthread_mutex_ t,pthread_mutex_trylock
.weakref
_Z28__gthrw_pthread_mutex_unlockP15pthread_mutex_t ,pthread_mutex_unlock
.weakref
_Z26__gthrw_pthread_mutex_initP15pthread_mutex_tPK 19pthread_mutexattr_t,pthread_mutex_init
.weakref
_Z26__gthrw_pthread_key_createPjPFvPvE,pthread_key _create
.weakref
_Z26__gthrw_pthread_key_deletej,pthread_key_delete
.weakref
_Z30__gthrw_pthread_mutexattr_initP19pthread_mutex attr_t,pthread_mutexattr_init
.weakref
_Z33__gthrw_pthread_mutexattr_settypeP19pthread_mu texattr_ti,pthread_mutexattr_settype
.weakref
_Z33__gthrw_pthread_mutexattr_destroyP19pthread_mu texattr_t,pthread_mutexattr_destroy
.section .eh_frame,"a",@progbits
..Lframe1:
.long .LECIE1-.LSCIE1
..LSCIE1:
.long 0x0
.byte 0x1
.string "zPR"
.uleb128 0x1
.sleb128 -8
.byte 0x10
.uleb128 0x6
.byte 0x3
.long __gxx_personality_v0
.byte 0x3
.byte 0xc
.uleb128 0x7
.uleb128 0x8
.byte 0x90
.uleb128 0x1
.align 8
..LECIE1:
..LSFDE1:
.long .LEFDE1-.LASFDE1
..LASFDE1:
.long .LASFDE1-.Lframe1
.long .LFB1504
.long .LFE1504-.LFB1504
.uleb128 0x0
.byte 0x4
.long .LCFI0-.LFB1504
.byte 0xe
.uleb128 0x10
.byte 0x86
.uleb128 0x2
.byte 0x4
.long .LCFI1-.LCFI0
.byte 0xd
.uleb128 0x6
.align 8
..LEFDE1:
..LSFDE3:
.long .LEFDE3-.LASFDE3
..LASFDE3:
.long .LASFDE3-.Lframe1
.long .LFB1506
.long .LFE1506-.LFB1506
.uleb128 0x0
.byte 0x4
.long .LCFI3-.LFB1506
.byte 0xe
.uleb128 0x10
.byte 0x86
.uleb128 0x2
.byte 0x4
.long .LCFI4-.LCFI3
.byte 0xd
.uleb128 0x6
.align 8
..LEFDE3:
..LSFDE5:
.long .LEFDE5-.LASFDE5
..LASFDE5:
.long .LASFDE5-.Lframe1
.long .LFB1505
.long .LFE1505-.LFB1505
.uleb128 0x0
.byte 0x4
.long .LCFI5-.LFB1505
.byte 0xe
.uleb128 0x10
.byte 0x86
.uleb128 0x2
.byte 0x4
.long .LCFI6-.LCFI5
.byte 0xd
.uleb128 0x6
.align 8
..LEFDE5:
..LSFDE7:
.long .LEFDE7-.LASFDE7
..LASFDE7:
.long .LASFDE7-.Lframe1
.long .LFB54
.long .LFE54-.LFB54
.uleb128 0x0
.byte 0x4
.long .LCFI8-.LFB54
.byte 0xe
.uleb128 0x10
.byte 0x86
.uleb128 0x2
.byte 0x4
.long .LCFI9-.LCFI8
.byte 0xd
.uleb128 0x6
.align 8
..LEFDE7:
..LSFDE9:
.long .LEFDE9-.LASFDE9
..LASFDE9:
.long .LASFDE9-.Lframe1
.long .LFB1496
.long .LFE1496-.LFB1496
.uleb128 0x0
.byte 0x4
.long .LCFI11-.LFB1496
.byte 0xe
.uleb128 0x10
.byte 0x86
.uleb128 0x2
.byte 0x4
.long .LCFI12-.LCFI11
.byte 0xd
.uleb128 0x6
.byte 0x4
.long .LCFI14-.LCFI12
.byte 0x83
.uleb128 0x3
.align 8
..LEFDE9:
.ident "GCC: (GNU) 4.1.2 20061115 (prerelease) (Debian
4.1.1-21)"
.section .note.GNU-stack,"",@progbits
Sep 14 '08 #11
Peng Yu wrote:
On Sep 13, 4:09 pm, "Alf P. Steinbach" <al...@start.no> wrote:
>* Peng Yu:
Sometimes I want to know what the compiler compile
the code to.

e.g.

g++ -S -masm=intel x.cpp

It is pretty hard to figure out what part of assembly code is
associated with a give portion of source code.
Nobody said it would be easy.

For example, the
following C++ and assembly code. How do I figure out where the pow
functions are at in the code?
[snip]

Did you try modifying one line of code and seeing which portion of the
assembly changed?
Best

Kai-Uwe Bux
Sep 14 '08 #12
On Sep 13, 7:50 pm, Kai-Uwe Bux <jkherci...@gmx.net> wrote:
Peng Yu wrote:
On Sep 13, 4:09 pm, "Alf P. Steinbach" <al...@start.no> wrote:
* Peng Yu:
Sometimes I want to know what the compiler compile
the code to.
e.g.
g++ -S -masm=intel x.cpp
It is pretty hard to figure out what part of assembly code is
associated with a give portion of source code.

Nobody said, it would be easy.
For example, the
following C++ and assembly code. How do I figure out where the pow
functions are at in the code?

[snip]

Did you try modifying one line of code and seeing which portion of the
assembly changed?
g++ -O3 -S -masm=intel main.cc

I compiled the code and a variant of it, and got the difference with
diff. But I still have difficulty understanding what it does. Is there
a way to annotate the assembly code with the C++ source?

Thanks,
Peng
Sep 14 '08 #13
On 2008-09-13 23:06, Peng Yu wrote:
On Sep 11, 4:56 pm, gpderetta <gpdere...@gmail.com> wrote:
>On Sep 11, 6:39 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
gpderetta wrote:
Why? There is no reason for the compiler not to transform pow(x,
<integral-constant>) to the latter form if it were actually faster
(and in fact some compilers do).
Some compilers might be able to do that optimizations, others aren't.
And if n is a variable, then it cannot optimize it.

and if it is variable, you can't write an explicit expression either.
You could use a for loop
(At most the pow()
function itself might have optimizations in it, but in my experience it
doesn't: With most compilers it just generates the FPU opcodes necessary
to calculate the result.)

Today hand optimizations are tomorrow pessimizations. Let the compiler
do its job.

The usual rule apply: use pow, and only if the profiler tells it is a
bottleneck, try to optimize it by hand.
But we don't have to speculate about this as it's trivially easy to
test in practice. Go ahead and try it.

I had already tried. 'pow(x, 16)' is inlined exactly as four
multiplies, at least with a recent gcc.

Would you please let me know the details the procedure on how you
figure this out? Sometimes I want to know what the compiler compile
the code to.
In Visual Studio you can run the program in the debugger and then bring
up the assembly code; you can step through it and switch back and forth
between the source code and the assembly code. I would imagine you
can do similar things in other IDEs and in gdb.

--
Erik Wikström
Sep 14 '08 #14
Peng Yu wrote:
On Sep 13, 7:50 pm, Kai-Uwe Bux <jkherci...@gmx.net> wrote:
>Peng Yu wrote:
>>On Sep 13, 4:09 pm, "Alf P. Steinbach" <al...@start.no> wrote:
* Peng Yu:
Sometimes I want to know what the compiler compile
the code to.
e.g.
g++ -S -masm=intel x.cpp
It is pretty hard to figure out what part of assembly code is
associated with a give portion of source code.
Nobody said, it would be easy.
>>For example, the
following C++ and assembly code. How do I figure out where the pow
functions are at in the code?
[snip]

Did you try modifying one line of code and seeing which portion of the
assembly changed?

g++ -O3 -S -masm=intel main.cc

I tried to compile the code and the variant of it. And I got the
difference by diff. But I still have difficulty to understand what it
does. Is there a way to annotate the C++ code in the assembly code?
Some compilers (Sun CC for example) do so by default. If you are
working on Solaris or Linux, give it a try.

--
Ian Collins.
Sep 14 '08 #15
On Sep 14, 4:37 am, Erik Wikström <Erik-wikst...@telia.com> wrote:
On 2008-09-13 23:06, Peng Yu wrote:
On Sep 11, 4:56 pm, gpderetta <gpdere...@gmail.com> wrote:
On Sep 11, 6:39 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
gpderetta wrote:
Why? There is no reason for the compiler not to transform pow(x,
<integral-constant>) to the latter form if it were actually faster
(and in fact some compilers do).
Some compilers might be able to do that optimizations, others aren't.
And if n is a variable, then it cannot optimize it.
and if it is variable, you can't write an explicit expression either.
You could use a for loop
(At most the pow()
function itself might have optimizations in it, but in my experience it
doesn't: With most compilers it just generates the FPU opcodes necessary
to calculate the result.)
Today hand optimizations are tomorrow pessimizations. Let the compiler
do its job.
The usual rule apply: use pow, and only if the profiler tells it is a
bottleneck, try to optimize it by hand.
But we don't have to speculate about this as it's trivially easy to
test in practice. Go ahead and try it.
I had already tried. 'pow(x, 16)' is inlined exactly as four
multiplies, at least with a recent gcc.
Would you please let me know the details the procedure on how you
figure this out? Sometimes I want to know what the compiler compile
the code to.

In Visual Studio you can run the program in the debugger and then bring
up the assembly code and it will show you can step through it and switch
back and forth between the code and assembly code. I would imagine you
can do similar things in other IDEs and in gdb.
Will there be a problem if I use the -O3 option? The source code and the
assembly code might not have a one-to-one relationship.

Thanks,
Peng
Sep 14 '08 #16
On 2008-09-14 15:47, Peng Yu wrote:
On Sep 14, 4:37 am, Erik Wikström <Erik-wikst...@telia.com> wrote:
>On 2008-09-13 23:06, Peng Yu wrote:
On Sep 11, 4:56 pm, gpderetta <gpdere...@gmail.com> wrote:
On Sep 11, 6:39 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
>Today hand optimizations are tomorrow pessimizations. Let the compiler
do its job.
>The usual rule apply: use pow, and only if the profiler tells it is a
bottleneck, try to optimize it by hand.
But we don't have to speculate about this as it's trivially easy to
test in practice. Go ahead and try it.
>I had already tried. 'pow(x, 16)' is inlined exactly as four
multiplies, at least with a recent gcc.
Would you please let me know the details the procedure on how you
figure this out? Sometimes I want to know what the compiler compile
the code to.

In Visual Studio you can run the program in the debugger and then bring
up the assembly code and it will show you can step through it and switch
back and forth between the code and assembly code. I would imagine you
can do similar things in other IDEs and in gdb.

Shall there be a problem if I use -O3 option? The source code and the
assembly code might have one-one relationship.
There might be a problem if the compiler decides to remove some code
completely, otherwise no.

--
Erik Wikström
Sep 14 '08 #17
Peng Yu <Pe*******@gmail.com> writes:
g++ -O3 -S -masm=intel main.cc

I tried to compile the code and the variant of it. And I got the
difference by diff. But I still have difficulty to understand what it
does. Is there a way to annotate the C++ code in the assembly code?
Yes, and I'd also disable optimization, which can make the generated
asm code more difficult to follow.

g++ -S -masm=intel -fverbose-asm main.cc

sherm--

--
My blog: http://shermspace.blogspot.com
Cocoa programming in Perl: http://camelbones.sourceforge.net
Sep 14 '08 #18
On Sep 14, 10:49 am, Sherm Pendley <spamt...@dot-app.org> wrote:
Peng Yu <PengYu...@gmail.com> writes:
g++ -O3 -S -masm=intel main.cc
I tried to compile the code and the variant of it. And I got the
difference by diff. But I still have difficulty to understand what it
does. Is there a way to annotate the C++ code in the assembly code?

Yes, and I'd also disable optimization, which can make the generated
asm code more difficult to follow.

g++ -S -masm=intel -fverbose-asm main.cc
But I have to enable optimization, because I want to know how the
compiler optimizes std::pow(x,n).

Thanks,
Peng
Sep 14 '08 #19
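One way to keep optimization on and still find the interesting code is to put each `std::pow` call in its own small function, so that it gets its own label in the generated assembly. The sketch below is a suggestion rather than something a poster wrote: the wrapper names `pow_2` and `pow_16` are made up, and what the compiler emits for them depends on its version and flags.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical wrappers, one per exponent of interest. extern "C" keeps the
// symbol names unmangled, so the labels pow_2 and pow_16 can be searched for
// directly in main.s. Because x arrives as a run-time argument, the optimizer
// cannot fold the whole call into a constant, as it may with `double x = 1;`
// inside main().
//
// Compile with the flags already mentioned in this thread, for example:
//   g++ -O3 -S -masm=intel -fverbose-asm main.cc
extern "C" double pow_2(double x)  { return std::pow(x, 2); }
extern "C" double pow_16(double x) { return std::pow(x, 16); }
```

In the unoptimized listing posted earlier, every call shows up as `call _ZSt3powdi`, which in turn calls `__powidf2`. Comparing that with the body of `pow_16` in an optimized build shows whether the call was replaced by multiplications.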
