I have n local copies of a matrix, say 'local', one in each of n threads. I want to update a global shared matrix 's' so that each of its elements is the sum of the corresponding elements of all the local matrices. For example, s[0][0] = local_1[0][0] + local_2[0][0] + ... + local_n[0][0].
I wrote the following loop to achieve this:
    #pragma omp parallel for
    for (int i = 0; i < rows; i++)
    {
        for (int j = 0; j < cols; j++)
            s[i][j] = s[i][j] + local[i][j];
    }
This doesn't seem to work. Could someone kindly point out where I am going wrong?
Update with an example:
Suppose there are 3 threads, with the following local matrices:
thread 1
    local = 1 2
            3 4
thread 2
    local = 5 6
            7 8
thread 3
    local = 1 0
            0 1
The shared matrix would then be
    s =  7  8
        10 13
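To make the intended result concrete, here is a minimal sketch of the behaviour I am after, using the three matrices above. The names add_local_to_shared, sum_example and the locals array are made up for illustration, and the loop over t only stands in for the threads that each hold their own 'local'. Guarding the update with a critical section is an assumption about how the writes to 's' might be serialized, not something taken from my actual code.

```c
#define ROWS 2
#define COLS 2
#define NTHREADS 3  /* assumed thread count, matching the example above */

/* Hypothetical helper: adds one thread's local matrix into the shared
   matrix s. The critical section lets only one thread update s at a time. */
static void add_local_to_shared(int s[ROWS][COLS], int local[ROWS][COLS])
{
    #pragma omp critical
    {
        for (int i = 0; i < ROWS; i++)
            for (int j = 0; j < COLS; j++)
                s[i][j] += local[i][j];
    }
}

/* Fills s with the element-wise sum of the three example matrices. */
void sum_example(int s[ROWS][COLS])
{
    /* Illustrative stand-in for the per-thread local matrices. */
    int locals[NTHREADS][ROWS][COLS] = {
        {{1, 2}, {3, 4}},
        {{5, 6}, {7, 8}},
        {{1, 0}, {0, 1}}
    };

    /* Start from an all-zero shared matrix. */
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            s[i][j] = 0;

    /* Each iteration plays the role of one thread adding its own local. */
    #pragma omp parallel for
    for (int t = 0; t < NTHREADS; t++)
        add_local_to_shared(s, locals[t]);
}
```

Calling sum_example on a 2x2 int array should leave it holding the shared matrix from the example. If the compiler ignores the OpenMP pragmas, the loop runs serially and produces the same sums.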