I am planning to use getrf and getrs from the cuSolver package to solve AX=B with B=I, so that X is the inverse of A.
Is this the best way to solve this problem?
If so, what is the best way to create the column-major identity matrix B in device memory? It can be done trivially with a for loop, but this would 1. take up a lot of memory and 2. be quite slow. Is there a faster way?
Note that cuSolver unfortunately does not provide getri, so I have to use getrs.
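
To make the question concrete, here is a minimal sketch of the approach I have in mind. It assumes double precision, a dense square matrix, and the dense cuSolver API (cusolverDnDgetrf / cusolverDnDgetrs). The kernel name set_diagonal, the CHECK macro, and the 2x2 test matrix are my own placeholders, and the identity is built by zero-filling with cudaMemset and then writing the diagonal from a small kernel, which is only one possible way to do it.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>
#include <cusolverDn.h>

// Hypothetical helper: abort if a CUDA runtime call fails.
#define CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(e)); exit(1); } } while (0)

// Writes 1.0 on the diagonal of an n-by-n column-major matrix (leading dimension ld).
__global__ void set_diagonal(double *B, int n, int ld) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) B[(size_t)i * ld + i] = 1.0;
}

int main() {
    const int n = 2;
    // Placeholder test matrix [[4, 3], [6, 3]] stored column by column.
    std::vector<double> hA = {4.0, 6.0, 3.0, 3.0};
    std::vector<double> hX(n * n);

    double *dA, *dB, *dWork;
    int *dPiv, *dInfo, lwork = 0, info = 0;
    CHECK(cudaMalloc(&dA, sizeof(double) * n * n));
    CHECK(cudaMalloc(&dB, sizeof(double) * n * n));
    CHECK(cudaMalloc(&dPiv, sizeof(int) * n));
    CHECK(cudaMalloc(&dInfo, sizeof(int)));
    CHECK(cudaMemcpy(dA, hA.data(), sizeof(double) * n * n, cudaMemcpyHostToDevice));

    // Build B = I on the device: all-zero bytes represent 0.0, then set the diagonal.
    CHECK(cudaMemset(dB, 0, sizeof(double) * n * n));
    set_diagonal<<<(n + 255) / 256, 256>>>(dB, n, n);
    CHECK(cudaGetLastError());

    cusolverDnHandle_t handle;
    if (cusolverDnCreate(&handle) != CUSOLVER_STATUS_SUCCESS) return 1;

    // LU factorization with partial pivoting; A is overwritten by its L and U factors.
    cusolverDnDgetrf_bufferSize(handle, n, n, dA, n, &lwork);
    CHECK(cudaMalloc(&dWork, sizeof(double) * lwork));
    cusolverDnDgetrf(handle, n, n, dA, n, dWork, dPiv, dInfo);
    CHECK(cudaMemcpy(&info, dInfo, sizeof(int), cudaMemcpyDeviceToHost));
    if (info != 0) { fprintf(stderr, "getrf failed, info = %d\n", info); return 1; }

    // Solve A X = I with n right-hand sides; getrs overwrites B with X,
    // so the n*n buffer for B also serves as storage for the inverse.
    cusolverDnDgetrs(handle, CUBLAS_OP_N, n, n, dA, n, dPiv, dB, n, dInfo);

    CHECK(cudaMemcpy(hX.data(), dB, sizeof(double) * n * n, cudaMemcpyDeviceToHost));
    for (int r = 0; r < n; ++r) {
        for (int c = 0; c < n; ++c) printf("%10.6f ", hX[(size_t)c * n + r]);
        printf("\n");
    }

    cusolverDnDestroy(handle);
    cudaFree(dA); cudaFree(dB); cudaFree(dWork); cudaFree(dPiv); cudaFree(dInfo);
    return 0;
}
```

This should build with something like `nvcc inverse.cu -lcusolver` (the file name is arbitrary).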