With shapes (2,3,4), I get:
In [52]: mu.shape
Out[52]: (2, 3, 4, 2, 3, 4)
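(For reference, a minimal sketch of a setup that reproduces these shapes; the real prev, other and the loop that builds mu come from the question, so the values here are placeholders only.)

import numpy as np

m, n, o = 2, 3, 4
rng = np.random.default_rng(0)
prev = rng.random((m, n, o, m, n, o))   # placeholder values; only the shapes are from the question
other = rng.random((m, n, o))

# assumed form of the question's loop: only the (c,i,j,c,i,j) "diagonal"
# entries of mu are nonzero
mu = np.zeros((m, n, o, m, n, o))
for c in range(m):
    for i in range(n):
        for j in range(o):
            mu[c, i, j, c, i, j] = other[c, i, j] * prev[c, i, j, c, i, j]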
This einsum expression complains that a subscript is repeated in the output:
In [53]: np.einsum('cijcij,cij->cijcij', prev, other).shape
Traceback (most recent call last):
File "<ipython-input-53-92862a0865a2>", line 1, in <module>
np.einsum('cijcij,cij->cijcij', prev, other).shape
File "<__array_function__ internals>", line 180, in einsum
File "/usr/local/lib/python3.8/dist-packages/numpy/core/einsumfunc.py", line 1359, in einsum
return c_einsum(*operands, **kwargs)
ValueError: einstein sum subscripts string includes output subscript 'c' multiple times
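The restriction is general: a repeated subscript is allowed on the input side (it extracts a diagonal), but not on the output side, since that would mean scattering values onto a diagonal. A minimal 2d sketch:

a = np.arange(9.).reshape(3, 3)
np.einsum('ii->i', a)                 # fine: the main diagonal of a
# np.einsum('i->ii', np.arange(3.))   # raises the same ValueError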
Without the repeat:
In [55]: x=np.einsum('cijcij,cij->cij', prev, other)
In [56]: x.shape
Out[56]: (2, 3, 4)
Nonzero values match:
In [57]: np.allclose(mu[np.nonzero(mu)].ravel(), x.ravel())
Out[57]: True
Or by extracting the diagonals from mu:
In [59]: I,J,K = np.ix_(np.arange(2),np.arange(3),np.arange(4))
In [60]: mu[I,J,K,I,J,K].shape
Out[60]: (2, 3, 4)
In [61]: np.allclose(mu[I,J,K,I,J,K],x)
Out[61]: True
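An equivalent extraction, without building I,J,K, reshapes mu to 2d and takes its main diagonal (a sketch, assuming mu is C-contiguous so the reshape is a view):

d = mu.reshape(m*n*o, m*n*o).diagonal().reshape(m, n, o)
np.allclose(d, x)        # True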
Your einsum satisfies the same 'diagonals' test:
In [68]: y=np.einsum('cijklm,cij->cijklm', prev, other)
In [69]: y.shape
Out[69]: (2, 3, 4, 2, 3, 4)
In [70]: np.allclose(y[I,J,K,I,J,K],x)
Out[70]: True
So the mu values are also present in y, just distributed differently, though the arrays are too big to view and compare by eye.
On inspection, each y[i,j,k] block is the same and equal to x, while in mu most of those values are 0, with only selected diagonals nonzero.
So einsum can generate the same nonzero values, but on its own it cannot place them along the 3d diagonals the way your loop does.
Changing your mu calculation to produce a 3d array:
In [76]: nu = np.zeros((m,n,o))
    ...: for c in range(m):
    ...:     for i in range(n):
    ...:         for j in range(o):
    ...:             nu[c,i,j] = other[c,i,j]*prev[c,i,j,c,i,j]
    ...:
In [77]: np.allclose(nu,x)
Out[77]: True
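The same 3d result also comes from plain advanced indexing, pulling the prev diagonal with the I,J,K arrays from above and scaling it by other:

nu2 = prev[I, J, K, I, J, K] * other    # prev[c,i,j,c,i,j] * other[c,i,j]
np.allclose(nu2, x)                     # True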
Edit:
We can assign the einsum result to the diagonals with:
In [134]: out = np.zeros((2,3,4,2,3,4))
In [135]: out[I,J,K,I,J,K] = x
In [136]: np.allclose(out, mu)
Out[136]: True
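Another way to fill those diagonals, without I,J,K at all, uses the writable-view behaviour documented for einsum (NumPy >= 1.10): 'ii->i' applied to a 2d reshape of out returns a view of its diagonal that we can assign to. A sketch:

out = np.zeros((m, n, o, m, n, o))
out2d = out.reshape(m*n*o, m*n*o)            # a view, since out is contiguous
np.einsum('ii->i', out2d)[:] = x.ravel()     # write into the diagonal view
np.allclose(out, mu)                         # True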
Conceptually the indexed assignment may be simpler than the as_strided solution, and it may be just as fast; as_strided, while it makes a view, is not as fast as a reshape kind of view.
In [143]: %%timeit
     ...: out = np.zeros((m, n, o, m, n, o))
     ...: mu_view = np.lib.stride_tricks.as_strided(out,
     ...:     shape=(m, n, o),
     ...:     strides=[sum(mu.strides[i::3]) for i in range(3)])
     ...: np.einsum('cijcij,cij->cij', prev, other, out=mu_view)
     ...:
31.6 µs ± 69.1 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [144]: %%timeit
     ...: out = np.zeros((2,3,4,2,3,4))
     ...: out[I,J,K,I,J,K] = np.einsum('cijcij,cij->cij', prev, other)
     ...:
18.5 µs ± 178 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Or including the I,J,K generation in the timed loop:
In [146]: %%timeit
     ...: I,J,K = np.ix_(np.arange(2),np.arange(3),np.arange(4))
     ...: out = np.zeros((2,3,4,2,3,4))
     ...: out[I,J,K,I,J,K] = np.einsum('cijcij,cij->cij', prev, other)
40.4 µs ± 1.45 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Or creating I,J,K directly:
In [151]: %%timeit
     ...: I,J,K = np.arange(2)[:,None,None],np.arange(3)[:,None],np.arange(4)
     ...: out = np.zeros((2,3,4,2,3,4))
     ...: out[I,J,K,I,J,K] = np.einsum('cijcij,cij->cij', prev, other)
25.1 µs ± 38.3 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
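np.ogrid builds the same open-mesh index arrays more compactly, if that reads better:

I, J, K = np.ogrid[:m, :n, :o]    # shapes (2,1,1), (1,3,1), (1,1,4)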