The pow() function is typically implemented in the math library, possibly using special instructions of the target processor; for x86 see How to: pow(real, real) in x86. However, instructions such as fyl2x and f2xm1 aren't fast, so the whole thing could take on the order of 100 CPU cycles. For performance reasons a compiler like gcc provides "built-in" functions that apply strength reduction to perform the computation faster in special cases. When the power N is a small integer (as in your case), it is faster to perform a few multiplications than to call the library function.
To detect cases where the power is an integer, the math library provides overloaded functions, for example double pow(double, int). You will find that gcc converts
double x = std::pow(y,4);
internally into two multiplications, which is much faster than a library call and gives the exact integer result you expect when both operands are integers:
double tmp = y * y;
double x = tmp * tmp;
To get this type of strength reduction you should:
- #include <cmath>
- compile with optimization (-O2)
- call the pow function in the library explicitly as std::pow(), to make sure that's the version you get and not the one from math.h
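Put together, the steps above might look like the following minimal sketch. The function names fourth_power and fourth_power_by_hand and the file name example.cpp are made up for illustration. Whether the call is actually expanded into multiplications depends on the compiler version, the language standard and the flags in use (in C++11 and later an int exponent is promoted to double rather than selecting a separate overload), so it is worth inspecting the generated assembly, for example with g++ -O2 -S example.cpp.

```cpp
// Build with optimization so the compiler has a chance to expand the call, e.g.:
//   g++ -O2 example.cpp
#include <cmath>

// Hypothetical helper: calls std::pow explicitly with a small integer exponent,
// which makes it a candidate for expansion into multiplications.
double fourth_power(double y) {
    return std::pow(y, 4);
}

// Hand-written form of the same computation, using two multiplications.
double fourth_power_by_hand(double y) {
    double tmp = y * y;  // y^2
    return tmp * tmp;    // (y^2)^2 = y^4
}
```

For small integer inputs such as 3.0 both functions should return the same exactly representable value, which makes them easy to compare in a quick test.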
You will then match the overloaded pow function in <cmath>, which looks like this:
inline double pow(double __x, int __i) { return __builtin_powi(__x, __i); }
Notice that this function is implemented with __builtin_powi, which knows how to reduce pow() to multiplications when the power is a small integer.
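To give an idea of what such a reduction amounts to for a general integer power, here is a rough sketch using square-and-multiply. This is not gcc's actual implementation of __builtin_powi; powi_sketch is a hypothetical name, and the handling of negative exponents via a reciprocal is my own assumption for the example.

```cpp
// Illustrative only: raises x to an integer power n using repeated squaring,
// so the number of multiplications grows with log2(|n|) rather than |n|.
double powi_sketch(double x, int n) {
    // Work with the unsigned magnitude of n so that INT_MIN is handled safely.
    unsigned int e = n < 0 ? 0u - static_cast<unsigned int>(n)
                           : static_cast<unsigned int>(n);
    double result = 1.0;
    double base = x;
    while (e != 0) {
        if (e & 1u) {
            result *= base;  // this bit of the exponent is set
        }
        base *= base;        // x, x^2, x^4, x^8, ...
        e >>= 1;
    }
    // Assumption for this sketch: a negative power is the reciprocal.
    return n < 0 ? 1.0 / result : result;
}
```

For n = 4 the only multiplications that affect the result are the two squarings, which matches the tmp * tmp form shown earlier.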