Chris Mantoulidis wrote:
There must be some tips to make a program quicker.
I guess more than 50% of ppl here will say "avoid the if-s". Yeah I
know this makes a program quicker but some times an "if" is
inevitable, isn't it?
What are some tips to make a program quicker?
TIA,
cmad
First - find your hotspots by using a profiler (valgrind etc.).
Next - check if you could use a better algorithm there. Then check if you
copy big objects (like strings!) unnecessarily. Change a
A::method(string ahalfbook)
to
A::method(const string &ahalfbook)
and so on. If you call small methods often, inline them. Other tips:
1. Use compile-time binding (templates instead of virtual calls) where
applicable:
Not
class A{
public:
  A(FilterBase *filter,int N) : myfilter(filter) { myfilter->setBlockSize(N); }
  ~A() {}
  // virtual call through the base-class pointer on every transform()
  void transform(float *x, int size) { myfilter->apply(x,size); }
protected:
  FilterBase *myfilter;
};
But
template<class T,template<class> class Filter> class A{
public:
  A(int N) : MyFilter(N) {}
  ~A() {}
  // the filter type is known at compile time, so the call can be inlined
  void transform(float *x, size_t size) { assert(size==MyFilter.BlockSize());
    MyFilter.transform(x); }
protected:
  Filter<T> MyFilter;
};
This example also shows how to initialize member objects with constructor
arguments without using new or .resize() methods.
2. Use valarrays or Blitz++ arrays for numerical stuff. Valarrays can be
faster than C arrays since the compiler is allowed to assume the absence of
aliases and can thus optimize more aggressively.
3. For the same reason, prefer index access to C arrays over pointer
arithmetic. If there is a pointer into an array, the compiler cannot be sure
that you are not also accessing the array through some other pointer. In
that case it (the compiler) cannot keep the values in registers and has to
reload them from memory.
4. Avoid floating-point divisions. This
float b,c;
float a=b/c;
float e=a/c;
takes much more time than this
float b,c;
float tmp=1.0f/c;   // one division, then only multiplications
float a=b*tmp;
float e=a*tmp;
5. Prefer float to double, since floats can be used in SIMD instructions
(SSE, 3DNow!).
6. Prefer signed ints over unsigned ints when performing
int-to-float conversions.
7. Avoid type conversions where you can!
8. Move everything you can out of loops.
9. Use unsigned int datatypes for loop counters.
10. Make use of superscalar cores / SIMD instructions; transform
for(size_t i=0;i<100;i++) sum += data[i];
into
float sum1,sum2,sum3,sum4;
sum1=sum2=sum3=sum4=0.0;
for(size_t i=0;i<100;i+=4)
{
sum1 += data[i];
sum2 += data[i+1];
sum3 += data[i+2];
sum4 += data[i+3];
}
float sum = sum1 + sum2 + sum3 + sum4;
11. Use a decent compiler with aggressive optimization flags (e.g. g++ >
3.2)
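
To make the const-reference advice and tips 4 and 8 a bit more concrete, here is a minimal sketch. The names count_char and normalize are made up for illustration and are not from any library; whether either change is measurable depends on your compiler and your data, so profile before and after.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: counts how often c occurs in text.
// Taking the string by const reference avoids copying it on every call.
std::size_t count_char(const std::string &text, char c) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < text.size(); ++i)
        if (text[i] == c) ++n;
    return n;
}

// Hypothetical helper: scales every sample by 1/sqrt(gain).
// The factor does not depend on i, so the division and the sqrt are
// done once before the loop; the loop body is a single multiplication.
void normalize(std::vector<float> &data, float gain) {
    const float factor = 1.0f / std::sqrt(gain);   // hoisted loop invariant
    for (std::size_t i = 0; i < data.size(); ++i)
        data[i] *= factor;
}
```

Calling count_char("banana", 'a') returns 3, and normalize(v, 4.0f) multiplies each element of v by 0.5. A good optimizer may hoist the invariant on its own, but doing it by hand does not hurt and makes the intent obvious.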