I need to perform a summation of the kind i<j on symmetric matrices. This is equivalent to summing over the upper triangular elements of a matrix, diagonal excluded.
Given a symmetric N x N array A, the simplest solution is np.triu(A,1).sum(). However, I was wondering whether faster methods exist that require less memory.
It seems that (A.sum() - np.diag(A).sum())/2 is faster on large arrays, but how can I avoid creating even the N x 1 array from np.diag?
A doubly nested for loop would require no additional memory, but it is clearly not the way to go in Python.
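For concreteness, here is a minimal sketch of the three approaches mentioned above, checked against each other on a small symmetric matrix. The function names and the test matrix are only illustrative (they are not part of NumPy), and the sketch says nothing about timing or memory use.

```python
import numpy as np

def sum_upper_triu(A):
    # Builds a full N x N copy with the lower triangle and diagonal zeroed.
    return np.triu(A, 1).sum()

def sum_upper_halved(A):
    # Relies on A being symmetric: the off-diagonal total counts each pair twice.
    return (A.sum() - np.diag(A).sum()) / 2

def sum_upper_loop(A):
    # Doubly nested loop over i < j: no temporaries, but slow in pure Python.
    n = A.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += A[i, j]
    return total

# Small symmetric example: the entries above the diagonal are 2, 3 and 5.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 5.0],
              [3.0, 5.0, 6.0]])

print(sum_upper_triu(A), sum_upper_halved(A), sum_upper_loop(A))
```

All three should agree on any symmetric input; the second one gives a wrong result if A is not symmetric.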