I have to multiply very large 2D arrays in Python about 100 times. Each matrix has 32000x32000 elements.
I'm using np.dot(X, Y), but each multiplication takes a very long time. Below is an example of my code:
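import numpy as np

X = None
for i in range(100):
    if X is None:
        # First iteration: only create the initial matrix, nothing to multiply yet.
        X = generate_large_2darray()
    else:
        # Later iterations: multiply the running product by a newly generated matrix.
        Y = generate_large_2darray()
        X = np.dot(X, Y)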
Is there a much faster way to do this?
Update
Here is a screenshot showing the htop interface. My Python script is using only one core. Also, after 3h25m, only 4 multiplications have completed.
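
For scale, here is a rough back-of-the-envelope estimate of what those numbers imply. It is only a sketch under assumptions that are not stated above: dense float64 matrices and the standard algorithm costing about 2*n^3 floating-point operations per product. The helper name `matmul_cost` is made up for illustration.

```python
# Rough cost of one 32000x32000 product.
# Assumptions: dense float64 data (8 bytes per element) and the
# standard O(n^3) algorithm, i.e. about 2*n^3 floating-point operations.

def matmul_cost(n, bytes_per_element=8):
    """Return (flops, bytes_per_matrix) for one dense n x n product."""
    flops = 2 * n ** 3
    bytes_per_matrix = n * n * bytes_per_element
    return flops, bytes_per_matrix

n = 32000
flops, bytes_per_matrix = matmul_cost(n)

# Observed above: 4 multiplications finished after 3h25m.
elapsed_s = 3 * 3600 + 25 * 60
seconds_per_product = elapsed_s / 4.0
gflops_rate = flops / seconds_per_product / 1e9

print("FLOPs per product:   %.3e" % flops)
print("Memory per matrix:   %.2f GB" % (bytes_per_matrix / 1e9))
print("Seconds per product: %.0f" % seconds_per_product)
print("Achieved rate:       %.1f GFLOP/s" % gflops_rate)
```

Under the same assumptions, holding X, Y and the result at once needs about three times the per-matrix memory, so memory pressure may matter as much as raw compute here.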

Update 2
I've tried to execute:
import numpy.distutils.system_info as info
info.get_info('atlas')
but I've received:
/home/francescof/.local/lib/python2.7/site-packages/numpy/distutils/system_info.py:564: UserWarning: Specified path /home/apy/atlas/lib is invalid.
  warnings.warn('Specified path %s is invalid.' % d)
{}
So I think ATLAS is not configured correctly.
For blas, on the other hand, I just get {}, with no warnings or errors.
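
Another way to check the same thing might be to ask NumPy directly what it was built against and to time a small product. This is only a sketch: `np.show_config()` is a real NumPy function, but the helper name `measure_dot_gflops`, the matrix size of 500, and the 2*n^3 operation count are illustrative assumptions.

```python
import time

import numpy as np

# Print the BLAS/LAPACK libraries this NumPy build was linked against.
np.show_config()


def measure_dot_gflops(n=500, seed=0):
    """Time one n x n float64 np.dot and return the rate in GFLOP/s.

    Assumes the product costs about 2*n^3 floating-point operations.
    """
    rng = np.random.RandomState(seed)
    a = rng.rand(n, n)
    b = rng.rand(n, n)

    start = time.time()
    np.dot(a, b)
    elapsed = time.time() - start

    # Guard against a zero reading from a coarse timer.
    return 2.0 * n ** 3 / max(elapsed, 1e-9) / 1e9


print("np.dot rate: %.1f GFLOP/s" % measure_dot_gflops())
```

A single small product is a noisy measurement, so the printed rate is only a rough indication of whether an optimized BLAS is in use.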