scikit-learn's DBSCAN accepts sparse matrices and converts them to csr_matrix format if they aren't already in it. I'd like to pass in a csr_matrix directly, but when I do I get this warning:
EfficiencyWarning: Precomputed sparse input was not sorted by data.
How do I create a data-sorted csr_matrix? If I pass the entries in data-sorted order, the constructor sorts them by index anyway:
>>> from scipy.sparse import csr_matrix
>>> x = csr_matrix(([1,2,3],[[3,2,1],[5,2,1]]))
>>> print(x)
  (1, 1)    3
  (2, 2)    2
  (3, 5)    1
I know csr_matrix has a has_sorted_indices flag, but I'm not sure how to use it. Even if I set it to False, the matrix is still sorted by indices.
Edit: I tried sort_indices(), but it doesn't seem to change anything. Maybe I'm misunderstanding what it does: is it supposed to sort the data from low to high within each row?
>>> from scipy.sparse import csr_matrix
>>> x = csr_matrix(([7,3,5,1,6,2], [[0,1,2,0,1,2],[0,0,0,1,1,1]]), shape=(3, 2))
>>> print(x)
  (0, 0)    7
  (0, 1)    1
  (1, 0)    3
  (1, 1)    6
  (2, 0)    5
  (2, 1)    2
>>> x.has_sorted_indices = False
>>> x.sort_indices()
>>> print(x)
  (0, 0)    7
  (0, 1)    1
  (1, 0)    3
  (1, 1)    6
  (2, 0)    5
  (2, 1)    2
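As far as I can tell, sort_indices() orders the column indices within each row and carries the values along; it does not order the values themselves. Here is a minimal sketch of that behaviour, assuming the matrix is built straight from a (data, indices, indptr) triple, which the constructor stores as given. The 3x2 values and variable names are made up for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical 3x2 example given as a raw CSR triple; the columns are
# deliberately out of order in rows 0 and 2.
data = np.array([1, 7, 3, 6, 2, 5])
indices = np.array([1, 0, 0, 1, 1, 0])  # column of each stored value
indptr = np.array([0, 2, 4, 6])         # row i is data[indptr[i]:indptr[i + 1]]

x = csr_matrix((data, indices, indptr), shape=(3, 2), copy=True)
sorted_before = x.has_sorted_indices    # checked against the stored columns

x.sort_indices()                        # in place: columns ascend, values follow
sorted_after = x.has_sorted_indices

print(sorted_before, sorted_after)
print(x.indices, x.data)
```

If that reading is right, calling sort_indices() on a matrix whose columns already ascend in every row is a no-op, which would explain why the output above does not change.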
What I want (is this even possible?):
  (0, 1)    1
  (0, 0)    7
  (1, 0)    3
  (1, 1)    6
  (2, 1)    2
  (2, 0)    5
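Basically, I need this check to return True (wrapped in a function here so it runs on its own; graph is the csr_matrix):
import numpy as np
def is_sorted_by_data(graph):
    # True wherever a stored value is larger than the one right after it
    out_of_order = graph.data[:-1] > graph.data[1:]
    # positions of the last entry of each row; a drop is allowed there
    line_change = np.unique(graph.indptr[1:-1] - 1)
    line_change = line_change[line_change < out_of_order.shape[0]]
    return out_of_order.sum() == out_of_order[line_change].sum()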
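For reference, here is a rough sketch of one way to get the layout described above: reorder each row's slice of data and indices by value, working on the CSR arrays directly. The helper name sort_rows_by_data is hypothetical, and this is only an illustration under the assumption that nothing later re-sorts the indices (a subsequent sort_indices() call would undo it). Newer scikit-learn releases also appear to ship sklearn.neighbors.sort_graph_by_row_values for this purpose, which may be the better choice if it is available.

```python
import numpy as np
from scipy.sparse import csr_matrix

def sort_rows_by_data(graph):
    """Return a CSR copy whose stored values ascend within each row (hypothetical helper)."""
    graph = csr_matrix(graph, copy=True)
    for start, stop in zip(graph.indptr[:-1], graph.indptr[1:]):
        # Stable argsort of this row's values; apply the same order to its columns.
        order = np.argsort(graph.data[start:stop], kind="stable")
        graph.data[start:stop] = graph.data[start:stop][order]
        graph.indices[start:stop] = graph.indices[start:stop][order]
    # Columns may no longer ascend within a row, so mark the flag conservatively.
    graph.has_sorted_indices = False
    return graph

# The 3x2 matrix from the question, built from (data, (row, col)).
x = csr_matrix(([7, 3, 5, 1, 6, 2], ([0, 1, 2, 0, 1, 2], [0, 0, 0, 1, 1, 1])), shape=(3, 2))
y = sort_rows_by_data(x)
print(y.data)     # stored values, row by row
print(y.indices)  # matching column indices
```

The dense contents are unchanged; only the storage order within each row differs, which is what the data-sorted check looks at.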
 
    