- The taskloop construct is a task-generating construct, not a worksharing-loop construct. The OpenMP specification says:
The taskloop construct is a task generating construct. When a thread
encounters a taskloop construct, the construct partitions the
iterations of the associated loops into explicit tasks for parallel
execution.
In your case, 6 threads encounter the taskloop construct and each of them generates tasks for the entire loop, so the loop is executed 6 times and you get a bigger result than expected.
To correct this, you have to either i) make sure that only a single thread encounters the task-generating construct, or ii) use a worksharing-loop construct.
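To make the over-counting concrete, here is a minimal sketch (not the code from the question) that counts how many loop iterations actually run when every thread of the team encounters the taskloop. The names count_taskloop_iterations and Counts are hypothetical, and the team size of 6 is taken from the description above. It assumes the compiler has OpenMP enabled (e.g. -fopenmp with GCC or Clang); without it the pragmas are ignored and the loop simply runs once.

```cpp
#include <cassert>

// Hypothetical helper type: how many threads were in the team and how many
// loop iterations were executed in total.
struct Counts {
    int threads;
    long iterations;
};

Counts count_taskloop_iterations(int size) {
    assert(size >= 0);
    int threads = 0;
    long iterations = 0;

    #pragma omp parallel num_threads(6) shared(threads, iterations)
    {
        // Every thread of the team passes through here exactly once.
        #pragma omp atomic
        threads++;

        // Every thread also encounters the taskloop, so each one generates
        // tasks covering ALL iterations of the loop.
        #pragma omp taskloop shared(iterations)
        for (int i = 0; i < size; i++) {
            #pragma omp atomic
            iterations++;
        }
    }
    return {threads, iterations};
}
```

Calling count_taskloop_iterations(100) should report iterations equal to threads * 100, i.e. the whole loop is repeated once per thread in the team.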
There is a race condition at the line sum += n[i];, because different threads update sum simultaneously. The best way to correct it is to use a reduction clause (reduction(+:sum)).
In the second case, the data-sharing attribute of sum is private, so sum is privatized: a new local variable is created and only that local copy is modified. The value of sum in the enclosing context does not change, which is why you got 0.
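The effect of privatization can be sketched as follows. This is an illustration only: the exact directive used in the question is not reproduced here, so the private clause on a parallel region, the function name sum_after_private_region, and the pointer-plus-size interface are all assumptions.

```cpp
#include <cassert>

// Returns the value of the outer `sum` after a region in which `sum` is private.
int sum_after_private_region(const int* n, int size) {
    assert(n != nullptr && size >= 0);
    int sum = 0;  // the original variable in the enclosing context

    #pragma omp parallel private(sum)
    {
        sum = 0;  // a private copy starts uninitialized, so set it explicitly
        #pragma omp for
        for (int i = 0; i < size; i++) {
            sum += n[i];  // updates only this thread's private copy
        }
    }  // the private copies are discarded here

    return sum;  // with OpenMP enabled, this is still the original 0
}
```

When compiled with OpenMP enabled, the function returns 0 regardless of the array contents. If the pragmas are ignored (OpenMP disabled), there is no privatization and it returns the ordinary serial sum.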
To sum up, here are the 2 alternatives mentioned above to correct your code:
i) using single construct and taskloop:
#pragma omp parallel num_threads(NUMTHREAD)
#pragma omp single nowait
{
    #pragma omp taskloop reduction(+:sum)
    for (int i = 0; i < SIZE; i++) {
        sum += n[i];
    }
}
ii) using a worksharing-loop construct:
#pragma omp parallel num_threads(NUMTHREAD)
{
    #pragma omp for reduction(+:sum)
    for (int i = 0; i < SIZE; i++) {
        sum += n[i];
    }
}
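For reference, the two alternatives can be wrapped into self-contained functions and checked against a serial sum. This is a sketch under assumptions: the function names, the use of std::vector as input, and NUMTHREAD = 6 are illustrative choices, not taken from the question. Note that the reduction clause on taskloop requires a compiler supporting OpenMP 5.0.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

constexpr int NUMTHREAD = 6;  // assumed team size

// Alternative i): a single thread generates the tasks, the team executes them.
int sum_taskloop(const std::vector<int>& n) {
    const int size = static_cast<int>(n.size());
    int sum = 0;
    #pragma omp parallel num_threads(NUMTHREAD)
    #pragma omp single nowait
    {
        #pragma omp taskloop reduction(+:sum)
        for (int i = 0; i < size; i++) {
            sum += n[i];
        }
    }
    return sum;
}

// Alternative ii): the iterations are divided among the threads of the team.
int sum_worksharing(const std::vector<int>& n) {
    const int size = static_cast<int>(n.size());
    int sum = 0;
    #pragma omp parallel num_threads(NUMTHREAD)
    {
        #pragma omp for reduction(+:sum)
        for (int i = 0; i < size; i++) {
            sum += n[i];
        }
    }
    return sum;
}

// Compares both alternatives with a serial reference sum.
void check_alternatives(const std::vector<int>& n) {
    const int expected = std::accumulate(n.begin(), n.end(), 0);
    assert(sum_taskloop(n) == expected);
    assert(sum_worksharing(n) == expected);
}
```

Both functions should return the same value as the serial sum, independent of the number of threads actually used.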
Some minor, off-topic comments:
- The array (allocated by the new operator) is never freed; you should use delete[] to deallocate it. As pointed out by @VictorEijkhout, it is even better not to use new at all. Use std::vector or std::array from the standard library.
- You should prefer const/constexpr int SIZE=100; over #define SIZE 100. More details can be found e.g. here.
- You should consider reading Why is "using namespace std;" considered bad practice?
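The comments above can be sketched together as follows. The helper name make_input and the array contents 0, 1, ..., SIZE-1 are assumptions for illustration; the question's actual initialization may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// A typed, scoped constant instead of #define SIZE 100.
constexpr int SIZE = 100;

// Builds the input as a std::vector, which releases its storage automatically,
// so no new/delete[] pair is needed. Names are qualified with std:: rather
// than relying on "using namespace std;".
std::vector<int> make_input() {
    std::vector<int> n(SIZE);
    std::iota(n.begin(), n.end(), 0);  // fills with 0, 1, ..., SIZE-1
    assert(n.size() == static_cast<std::size_t>(SIZE));
    return n;
}
```

Inside an OpenMP loop the vector can be indexed with n[i] exactly like the raw array.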