Is there a simple way to do element-wise NumPy array operations in Python in parallel?
I am trying to compute a = b * c + d (all 3D arrays of shape [9, 200, 200]). I can't get this to run in parallel: htop (on Linux) shows only one thread in use when I run this step for thousands of iterations. I also ran it on a cluster with multiple nodes, with the same result.
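For reference, here is a minimal sketch of the kind of loop I mean. The random inputs and the iteration count are placeholders I made up for illustration, not my real data; only the shapes and the a = b * c + d step match what I described.

```python
import numpy as np

# Placeholder inputs: random data with the shape from the question.
shape = (9, 200, 200)
b = np.random.rand(*shape)
c = np.random.rand(*shape)
d = np.random.rand(*shape)

# Placeholder count; the real run goes through thousands of iterations.
n_iterations = 1000

for _ in range(n_iterations):
    # Element-wise multiply, then element-wise add.
    # This is the step where htop shows only one busy thread.
    a = b * c + d
```

Running this and watching htop should show the same single-threaded behaviour.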
I tried both Python 2 and 3.
There are no independent for-loops, so I assume Cython won't help much?
All I need is simple addition and multiplication of NumPy arrays in parallel.
Thanks