To compare an if-statement against selective multiplication, I tried the code below. Instead of skipping the update when the condition is false, I always compute it and multiply the result by 0 (condition false) or by 1 (condition true). With only 3-4 double-precision multiplications per element, the if-statement version is slower and always computing is faster.
Question: this multiplication trick is faster even on a CPU, so how would it perform on a GPU (OpenCL/CUDA)? My guess is that it would be a clear speedup. What about precision loss for single-precision multiplication? My worry is that the factor may not always be exactly 1.00000 and might be something like 0.999999 instead. Let's say I don't mind single-precision loss at the 5th digit.
This is more suitable for integers, but could it be meaningful for floats at least? If float/half multiply faster than double, then this would be even faster.
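For the precision worry, a quick CPU-side check along these lines might help. It is only a sketch: `ExactFactorCheck` and its method names are made up for illustration, and it says nothing about how a GPU compiler with relaxed math options would behave. It converts the byte flag to a float and compares the product with the original value bit for bit.

```java
import java.util.Random;

// Hypothetical helper (not part of the benchmark): checks whether a 0/1 byte
// flag used as a float multiplier changes the value it is multiplied with.
final class ExactFactorCheck {

    // True if x * (float) 1 has exactly the same bit pattern as x.
    static boolean timesOneIsExact(float x) {
        byte flag = 1;
        float factor = (float) flag;
        return Float.floatToIntBits(x * factor) == Float.floatToIntBits(x);
    }

    // True if x * (float) 0 compares equal to zero.
    // This fails for infinity and NaN, where the product is NaN.
    static boolean timesZeroIsZero(float x) {
        byte flag = 0;
        float factor = (float) flag;
        return x * factor == 0.0f;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        boolean allExact = true;
        for (int i = 0; i < 1_000_000; i++) {
            float x = rng.nextFloat();
            allExact &= timesOneIsExact(x) && timesZeroIsZero(x);
        }
        System.out.println("(float) (byte) 1 == 1.0f: " + ((float) (byte) 1 == 1.0f));
        System.out.println("all products exact: " + allExact);
    }
}
```

One thing this makes visible: the multiply-by-zero version only matches the if-version when the skipped product is finite, because infinity or NaN times zero is NaN rather than zero.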
Result:
 no if: 0.058515741 seconds
 if(){}: 0.073415743 seconds
Can anyone reproduce a similar result? The if(){} version is the second test, so the JIT couldn't be cheating in its favor, could it?
Code:
 public static void main(String[] args)
{
       boolean[]ifBool=new boolean[10000000];
       byte[]ifThen=new byte[10000000];
       double []data=new double[10000000];
       double []data1=new double[10000000];
       double []data2=new double[10000000];
       for(int i=0;i<ifThen.length;i++)
       {
          ifThen[i]=(byte)(0.43+Math.random()); // 1 = add the result, 0 = do not add it
          ifBool[i]=(ifThen[i]==1);
          data[i]=Math.random();
          data1[i]=Math.random();
          data2[i]=Math.random();
      }
         long ref=0,end=0;
         ref=System.nanoTime();
         for(int i=0;i<data.length;i++)
         {
                // check == 0: the added term is zero, so the data is unchanged
                // check == 1: the product is added, so the data changes
            double check=(double)ifThen[i]; // can this be inexact, e.g. 0.99999?
            data2[i]+=(data[i]*data1[i])*check; // I double-checked that each line
            data[i]+=(data2[i]*data1[i])*check; // either adds the result
            data1[i]+=(data[i]*data2[i])*check; // or adds a zero
         }
         end=System.nanoTime();
         System.out.println("no if: "+(end-ref)/1000000000.0+" seconds");
         ref=System.nanoTime();
         for(int i=0;i<data.length;i++)
         {
            if(ifBool[i]) // conventional approach, easy to read
            {
               data2[i]+=data[i]*data1[i];
               data[i]+=data2[i]*data1[i];
               data1[i]+=data[i]*data2[i];
            }
         }
         end=System.nanoTime();
         System.out.println("if(){}: "+(end-ref)/1000000000.0+" seconds");
}
CPU: AMD FX-8150 @ 4 GHz
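To rule out JIT warm-up and the fact that the second loop above runs on data already modified by the first, the test could be repeated on fresh copies of the same input. The following is a rough sketch of that idea, not a rigorous benchmark: the class name `BranchVsMultiplyBench`, the method names, the array size and the round count are all arbitrary choices for illustration.

```java
import java.util.Random;

// Hypothetical re-run harness: times both variants over several rounds,
// each round starting from identical copies of the input arrays.
final class BranchVsMultiplyBench {
    static final int N = 1_000_000;

    static double[] randomArray(Random rng, int n) {
        double[] out = new double[n];
        for (int i = 0; i < n; i++) out[i] = rng.nextDouble();
        return out;
    }

    // Branch-free variant: always compute, scale the added term by 0 or 1.
    static void noIf(double[] a, double[] b, double[] c, byte[] flag) {
        for (int i = 0; i < a.length; i++) {
            double check = flag[i];
            c[i] += (a[i] * b[i]) * check;
            a[i] += (c[i] * b[i]) * check;
            b[i] += (a[i] * c[i]) * check;
        }
    }

    // Conventional variant: skip the update when the flag is false.
    static void withIf(double[] a, double[] b, double[] c, boolean[] flag) {
        for (int i = 0; i < a.length; i++) {
            if (flag[i]) {
                c[i] += a[i] * b[i];
                a[i] += c[i] * b[i];
                b[i] += a[i] * c[i];
            }
        }
    }

    public static void main(String[] args) {
        Random rng = new Random(1);
        byte[] flags = new byte[N];
        boolean[] bools = new boolean[N];
        for (int i = 0; i < N; i++) {
            flags[i] = (byte) (0.43 + rng.nextDouble()); // 0 or 1
            bools[i] = flags[i] == 1;
        }
        double[] a0 = randomArray(rng, N);
        double[] b0 = randomArray(rng, N);
        double[] c0 = randomArray(rng, N);

        // Early rounds include JIT compilation; later rounds are more telling.
        for (int round = 0; round < 10; round++) {
            double[] a = a0.clone(), b = b0.clone(), c = c0.clone();
            long t0 = System.nanoTime();
            noIf(a, b, c, flags);
            long t1 = System.nanoTime();

            a = a0.clone(); b = b0.clone(); c = c0.clone();
            long t2 = System.nanoTime();
            withIf(a, b, c, bools);
            long t3 = System.nanoTime();

            System.out.printf("round %d  no if: %.6f s  if(){}: %.6f s%n",
                    round, (t1 - t0) / 1e9, (t3 - t2) / 1e9);
        }
    }
}
```

If the gap between the two variants persists in the later rounds, warm-up is unlikely to be the explanation. A dedicated harness such as JMH would give more trustworthy numbers than hand-rolled timing.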
 
    