I am trying to measure the performance of the following code, but every time the sequential version performs better than the fork/join version.
The problem: I want to find the maximum integer in an array:
public class GetMaxIntegerProblem {
    private final int[] intArray;
    private int start;
    private int end;
    private int size;
    public GetMaxIntegerProblem(int[] intArray, int start, int end) {
        super();
        this.intArray = intArray;
        this.start = start;
        this.end = end;
        size = end - start;
    }
    public int getMaxSequentially() {
        int max = Integer.MIN_VALUE;
        for (int i = start; i < end; i++) {
            int n = intArray[i];
            max = n > max ? n : max;
        }
        return max;
    }
    public int getSize() {
        return size;
    }
    public GetMaxIntegerProblem getMaxIntegerSubProblem(int subStart, int subEnd) {
        return new GetMaxIntegerProblem(this.intArray, start + subStart, start + subEnd);
    }
}
My action for fork join:
import java.util.concurrent.RecursiveAction;
public class GetMaxIntegerAction extends RecursiveAction {
    private final int threshold;
    private final GetMaxIntegerProblem problem;
    private int result;
    public GetMaxIntegerAction(int threshold, GetMaxIntegerProblem problem) {
        super();
        this.threshold = threshold;
        this.problem = problem;
    }
    @Override
    protected void compute() {
        if (problem.getSize() < threshold) {
            result = problem.getMaxSequentially();
        } else {
            int midPoint = problem.getSize() / 2;
            GetMaxIntegerProblem leftProblem = problem.getMaxIntegerSubProblem(0, midPoint);
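            // The sub-range end is exclusive, so the right half starts at midPoint;
            // starting at midPoint + 1 would skip the element at midPoint.
            GetMaxIntegerProblem rightProblem = problem.getMaxIntegerSubProblem(midPoint, problem.getSize());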
            GetMaxIntegerAction left = new GetMaxIntegerAction(threshold, leftProblem);
            GetMaxIntegerAction right = new GetMaxIntegerAction(threshold, rightProblem);
            invokeAll(left, right);
            result = Math.max(left.result, right.result);
        }
    }
}
My Main program for testing:
import java.util.Random;
import java.util.concurrent.ForkJoinPool;
public class GetMaxIntegerMain {
    public GetMaxIntegerMain() {
        // TODO Auto-generated constructor stub
    }
    private Random random = new Random();
    private void fillRandomArray(int[] randomArray) {
        for (int i = 0; i < randomArray.length; i++) {
            randomArray[i] = random.nextInt(10000);
        }
    }
    /**
     * @param args
     */
    public static void main(String[] args) {
        GetMaxIntegerMain mainexcution=new GetMaxIntegerMain();
        int arrayLength = 1_000_000;
        int array[] = new int[arrayLength];
        mainexcution.fillRandomArray(array);
        GetMaxIntegerProblem problem=new GetMaxIntegerProblem(array, 0, array.length);
        // Number of times the sequential and parallel operations are repeated
        // to warm up the HotSpot JVM
        final int iterations = 10;
        long start = System.nanoTime();
        int maxSeq=0;
        for (int i = 0; i < iterations; i++) {
            maxSeq=problem.getMaxSequentially();
         }
        long endSeq=System.nanoTime();
        long totalSeq=endSeq-start;
        System.out.println(" time for seq "+(endSeq-start));
        System.out.println("Number of processor available: " + Runtime.getRuntime().availableProcessors());
        //Default parallelism level = Runtime.getRuntime().availableProcessors()
        int threads=Runtime.getRuntime().availableProcessors();
        ForkJoinPool fjpool = new ForkJoinPool(64);
        long startParallel = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            GetMaxIntegerAction action=new GetMaxIntegerAction(5000, problem);  
            fjpool.invoke(action);
        }
        long endParallel = System.nanoTime();
        long totalP=endParallel-startParallel;
        System.out.println(" time for parallel "+totalP);
        // Cast before dividing; (double)(totalSeq/totalP) would truncate to a whole number first.
        double speedup = (double) totalSeq / totalP;
        System.out.println(" speedup "+speedup);
        System.out.println("Number of steals: " + fjpool.getStealCount() + "\n");
    }
}
Every time I run this code, the fork/join version takes about 400% more time than the sequential one. I tried various threshold values, but without success.
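The threshold experiments mentioned above can be sketched as a small harness. This is only a minimal sketch under assumptions, not the code measured in this question: the class `ThresholdSweep`, the nested `MaxTask`, and the helpers `sequentialMax`, `parallelMax`, and `averageNanos` are hypothetical names, it uses `RecursiveTask<Integer>` instead of the `RecursiveAction` shown earlier, and the threshold values and repetition count are arbitrary. It runs an untimed warm-up pass before each timed pass and reports every threshold separately:

```java
import java.util.Random;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.function.IntSupplier;

public class ThresholdSweep {

    // Written on every run so the computed result stays observable.
    static volatile int sink;

    // Hypothetical task (not from the question): max of data[lo, hi), hi exclusive.
    static final class MaxTask extends RecursiveTask<Integer> {
        private final int[] data;
        private final int lo;
        private final int hi;
        private final int threshold;

        MaxTask(int[] data, int lo, int hi, int threshold) {
            this.data = data;
            this.lo = lo;
            this.hi = hi;
            this.threshold = Math.max(1, threshold); // guard against endless splitting
        }

        @Override
        protected Integer compute() {
            if (hi - lo <= threshold) {
                int max = Integer.MIN_VALUE;
                for (int i = lo; i < hi; i++) {
                    max = Math.max(max, data[i]);
                }
                return max;
            }
            int mid = (lo + hi) >>> 1;
            MaxTask left = new MaxTask(data, lo, mid, threshold);
            MaxTask right = new MaxTask(data, mid, hi, threshold);
            left.fork();                    // schedule the left half asynchronously
            int rightMax = right.compute(); // compute the right half in this thread
            return Math.max(left.join(), rightMax);
        }
    }

    static int sequentialMax(int[] data) {
        int max = Integer.MIN_VALUE;
        for (int value : data) {
            max = Math.max(max, value);
        }
        return max;
    }

    static int parallelMax(ForkJoinPool pool, int[] data, int threshold) {
        return pool.invoke(new MaxTask(data, 0, data.length, threshold));
    }

    // Average wall-clock nanoseconds per call over reps calls.
    static long averageNanos(IntSupplier work, int reps) {
        long t0 = System.nanoTime();
        for (int i = 0; i < reps; i++) {
            sink = work.getAsInt();
        }
        return (System.nanoTime() - t0) / reps;
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(1_000_000, 0, 10_000).toArray();
        ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors());
        int[] thresholds = {1_000, 10_000, 100_000, 500_000}; // arbitrary sample values
        int reps = 50;                                        // arbitrary repetition count

        averageNanos(() -> sequentialMax(data), reps);        // warm-up, result discarded
        long seq = averageNanos(() -> sequentialMax(data), reps);
        System.out.println("sequential: " + seq + " ns per run");

        for (int threshold : thresholds) {
            averageNanos(() -> parallelMax(pool, data, threshold), reps); // warm-up
            long par = averageNanos(() -> parallelMax(pool, data, threshold), reps);
            System.out.println("threshold " + threshold + ": " + par
                    + " ns per run, speedup " + (double) seq / par);
        }
        pool.shutdown();
    }
}
```

Timings from a hand-rolled loop like this are still only indicative, since JIT compilation and garbage collection can distort them.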
I am running this code on a 64-bit Intel Core i3 processor at 3.3 GHz on Windows 10.
It would be a great help if someone could provide some pointers on this problem.
 
     
    