Consider a typical finite difference application:
// assuming T_size > 2
void process_T(double *T0, double *T, const int &T_size, bool periodic) {
  for (int i = 0; i < T_size; ++i) {
    double sum = 0;
    double base = T0[i];
    if (i > 0) sum += (T0[i-1]-base);
    if (i < T_size-1) sum += (T0[i+1]-base);
    if (periodic) {
      if (i == 0) sum += (T0[T_size-1]-base);
      if (i == T_size-1) sum += (T0[0]-base);
    } else {
      if (i == 1 || i == T_size-1) sum += 0.5*(T0[i-1]-base);
      if (i == 0 || i == T_size-2) sum += 0.5*(T0[i+1]-base);
    }
    T[i] = T0[i] + sum * 0.08; // where 0.08 is some magic number
  }
}
The check on periodic is loop-invariant, but since its value is only known at run time, the cost of the conditional is incurred on every iteration. I could write a specialized function that assumes one of the cases, but maintaining the common base would be cumbersome, especially for a three-dimensional problem where it would grow to 8 functions (periodicity: none, x, y, z, xy, xz, yz, xyz) to cover all combinations.
Is it possible to solve this problem via metaprogramming?
P.S.: can the branch predictor optimize this accordingly?