I'm trying to do data enhancement with a FAQ dataset. I change words, specifically nouns, by most similar words with Wordnet checking the similarity with Spacy. I use multiple for loop to go through my dataset.
import spacy
import nltk
from nltk.corpus import wordnet as wn
import pandas as pd
nlp = spacy.load('en_core_web_md')
nltk.download('wordnet')
questions = pd.read_csv("FAQ.csv")
list_questions = []
for question in questions.values:
    list_questions.append(nlp(question[0]))
for question in list_questions: 
    for token in question:
        treshold = 0.5
        if token.pos_ == 'NOUN':
            wordnet_syn = wn.synsets(str(token), pos=wn.NOUN)  
            for syn in wordnet_syn:
                for lemma in syn.lemmas():
                    similar_word = nlp(lemma.name())
                    if similar_word.similarity(token) != 1. and similar_word.similarity(token) > treshold:
                        good_word = similar_word
                        treshold = token.similarity(similar_word)
However, the following warning is printed several times and I don't understand why :
UserWarning: [W008] Evaluating Doc.similarity based on empty vectors.
It is my similar_word.similarity(token) which creates the problem but I don't understand why. 
The form of my list_questions is :
list_questions = [Do you have a paper or other written explanation to introduce your model's details?, Where is the BERT code come from?, How large is a sentence vector?]
I need to check token but also the similar_word in the loop, for example, I still get the error here :
tokens = nlp(u'dog cat unknownword')
similar_word = nlp(u'rabbit')
if(similar_word):
    for token in tokens:
        if (token):
            print(token.text, similar_word.similarity(token))
 
     
    