I have a graph whose adjacency matrix has shape (4000, 4000), i.e. adj_mat.shape = (4000, 4000). I need the list of path lengths (the sequence of nodes is not important) over all paths that run from the source (row = 0) to the target (col = adj_mat.shape[0] - 1).
I am not interested in the path sequences themselves, only in propagating the path lengths. This makes the problem different from finding all simple paths, which would be too slow (i.e. find all paths from source to target, then score each path). Is there a performant way to do this?
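To make the desired output concrete, here is a minimal sketch of that slow baseline on a toy 4-node graph. The matrix `toy_adj` and the helper `brute_force_path_lengths` are made up for illustration, and I am assuming a zero entry means "no edge" and that the graph is acyclic:

```python
import networkx as nx
import numpy as np

# Hypothetical 4-node DAG: entry [i, j] is the weight of edge i -> j, 0 means no edge.
toy_adj = np.array([
    [0, 1, 2, 0],
    [0, 0, 1, 5],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
])

def brute_force_path_lengths(adj, source, target):
    """Enumerate every simple path, then sum its edge weights (the slow approach)."""
    G = nx.from_numpy_array(adj, create_using=nx.DiGraph)
    lengths = []
    for path in nx.all_simple_paths(G, source, target):
        # Score the path by adding up the weights of its consecutive edges.
        lengths.append(sum(float(G[u][v]["weight"]) for u, v in zip(path, path[1:])))
    return sorted(lengths)

toy_lengths = brute_force_path_lengths(toy_adj, 0, toy_adj.shape[0] - 1)
print(toy_lengths)
```

The three paths here are 0-1-3, 0-1-2-3 and 0-2-3, so what I want back is the three lengths 6, 5 and 5, without ever materialising the paths themselves.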
DFS is suggested as one possible strategy (noted here). My current implementation (below) is simply not optimal:
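import networkx as nx
import numpy as np

# create graph (from_numpy_matrix was removed in networkx 3.0; from_numpy_array replaces it)
G = nx.from_numpy_array(adj_mat, create_using=nx.DiGraph())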
# initialize nodes
for node in G.nodes:
    G.nodes[node]['cprob'] = []
# set starting node value
G.nodes[0]['cprob'] = [0]
def propagate_prob(G, node):
    # find incoming edges to node
    predecessors = list(G.predecessors(node))
    curr_node_arr = []        
    for prev_node in predecessors:
        # get incoming edge weight
        edge_weight = G.get_edge_data(prev_node, node)['weight']
        # get predecessor node value
        if len(G.nodes[prev_node]['cprob']) == 0:                
            G.nodes[prev_node]['cprob'] = propagate_prob(G, prev_node)            
        prev_node_arr = G.nodes[prev_node]['cprob']   
        # add incoming edge weight to prev_node arr
        curr_node_arr = np.concatenate([curr_node_arr, np.array(edge_weight) + np.array(prev_node_arr)])
    # update current node array
    G.nodes[node]['cprob'] = curr_node_arr
    return G.nodes[node]['cprob']
# calculate all path lengths from source to sink
# (nodes are numbered 0..3999, so the sink is adj_mat.shape[0] - 1, not 4000)
part_func = propagate_prob(G, adj_mat.shape[0] - 1)
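
One direction that might avoid the recursion is to push lengths forward in topological order, so every node is visited exactly once after all of its predecessors. The following is only a sketch under the assumption that the graph is a DAG and that zero entries mean "no edge"; the function name `path_lengths_topological` and the toy matrix are hypothetical, not part of my real code:

```python
import networkx as nx
import numpy as np

def path_lengths_topological(adj, source, target):
    """Collect the length of every source -> target path in a DAG, iteratively."""
    G = nx.from_numpy_array(adj, create_using=nx.DiGraph)
    # Each node holds the lengths of all paths from the source that end there.
    lengths = {node: np.empty(0) for node in G.nodes}
    lengths[source] = np.zeros(1)
    # Topological order guarantees a node's predecessors are processed first.
    # Iterating raises nx.NetworkXUnfeasible if the graph contains a cycle.
    for node in nx.topological_sort(G):
        if lengths[node].size == 0:
            continue  # not reachable from the source, nothing to push forward
        for _, succ, weight in G.out_edges(node, data="weight"):
            lengths[succ] = np.concatenate([lengths[succ], lengths[node] + weight])
    return lengths[target]

# Hypothetical toy DAG with paths 0-1-3, 0-1-2-3 and 0-2-3.
toy = np.array([
    [0, 1, 2, 0],
    [0, 0, 1, 5],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
])
result = path_lengths_topological(toy, 0, toy.shape[0] - 1)
print(sorted(result.tolist()))
```

This removes the recursion depth problem and the repeated recomputation of unreachable nodes, but the number of path lengths can still grow exponentially with the graph size, so any approach that lists every length is bounded by that.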
 
    