I have collection of elements that I want to process in parallel. When I use a List, parallelism works. However, when I use a Set, it does not run in parallel.
I wrote a code sample that shows the problem:
public static void main(String[] args) {
ParallelTest test = new ParallelTest();
List<Integer> list = Arrays.asList(1,2);
Set<Integer> set = new HashSet<>(list);
ForkJoinPool forkJoinPool = new ForkJoinPool(4);
System.out.println("set print");
try {
forkJoinPool.submit(() ->
set.parallelStream().forEach(test::print)
).get();
} catch (Exception e) {
return;
}
System.out.println("\n\nlist print");
try {
forkJoinPool.submit(() ->
list.parallelStream().forEach(test::print)
).get();
} catch (Exception e) {
return;
}
}
private void print(int i){
System.out.println("start: " + i);
try {
TimeUnit.SECONDS.sleep(1);
} catch (InterruptedException e) {
}
System.out.println("end: " + i);
}
This is the output that I get on windows 7
set print
start: 1
end: 1
start: 2
end: 2
list print
start: 2
start: 1
end: 1
end: 2
We can see that the first element from the Set had to finish before the second element is processed. For the List, the second element starts before the first element finishes.
Can you tell me what causes this issue, and how to avoid it using a Set collection?