When building a Python gensim word2vec model, is there a way to see a doc-to-word matrix?
With input of sentences = [['first', 'sentence'], ['second', 'sentence']] I'd see something like*:
      first  second  sentence
doc0    1       0        1
doc1    0       1        1
*I've illustrated 'human readable', but I'm looking for a scipy (or other) matrix, indexed to model.wv.index2word.
And can that be transformed into a word-to-word matrix (to see co-occurrences)? Something like:
          first  second  sentence
first       1       0        1
second      0       1        1
sentence    1       1        2
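To pin down exactly what the two tables above contain, here is a minimal plain-Python sketch of the computation. It is only an illustration of the expected output, not a gensim solution: the column order is assumed to be alphabetical (in my case it should follow model.wv.index2word instead), and the names vocab, index, doc_word and word_word are made up for this example. The word-to-word table is just the doc-to-word matrix multiplied by its own transpose.

```python
sentences = [['first', 'sentence'], ['second', 'sentence']]

# Assumed column order: alphabetical. In a real pipeline this would be
# replaced by the model's own vocabulary order.
vocab = sorted({word for doc in sentences for word in doc})
index = {word: i for i, word in enumerate(vocab)}

# Doc-to-word matrix: one row per document, one column per word,
# each entry is how often the word appears in that document.
doc_word = [[0] * len(vocab) for _ in sentences]
for row, doc in zip(doc_word, sentences):
    for word in doc:
        row[index[word]] += 1

# Word-to-word matrix: transpose(doc_word) times doc_word.
# Entry (i, j) sums, over documents, count of word i times count of word j.
word_word = [
    [sum(row[i] * row[j] for row in doc_word) for j in range(len(vocab))]
    for i in range(len(vocab))
]

for word, row in zip(vocab, word_word):
    print(word, row)
```

With a scipy sparse matrix X in place of the nested lists, the same product would be X.T * X, which is the form I'm hoping to get cheaply.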
I've already implemented something like a word-word co-occurrence matrix using CountVectorizer, and it works well. However, I'm already using gensim in my pipeline, and both speed and code simplicity matter for my use case.