I have a .txt file containing multiple lines, each of which is a sentence; let's call it sentences.txt. I also have a dictionary, sentiment_scores, that holds a pre-defined sentiment score for about 2500 words. My goal is to return a dictionary that predicts a sentiment value for each word that is not in sentiment_scores. I do this by averaging the sentiment scores of all the sentences the word appears in.
# Body of a function; sentiment_scores and get_sentiment are defined elsewhere.
with open('sentences.txt', 'r') as f:
    sentences = [line.strip() for line in f]  # the with block closes the file on exit
new_term_sent = {}
for line in sentences:
    for word in line.split():  # iterate through the words in the sentence
        if word not in sentiment_scores:
            new_term_sent[word] = 0  # assign the word a sentiment value of 0 initially
for key in new_term_sent:
    score = 0
    num_sentences = 0
    for sentence in sentences:
        if key in sentence.split():
            num_sentences += 1
            val = get_sentiment(sentence)  # returns the sentiment of a sentence
            score += val
    if num_sentences != 0:
        average = round(score / num_sentences, 1)
        new_term_sent[key] = average
return new_term_sent
Please note: this method works, but it is too slow; it takes about 80 seconds to run on my laptop.
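Most of that time probably goes to the second loop, which re-splits every sentence and calls get_sentiment again for each new word. A minimal sketch of a single-pass alternative follows. It assumes get_sentiment depends only on the sentence passed to it; the function name predict_new_term_sentiment and its three parameters are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict


def predict_new_term_sentiment(sentences, sentiment_scores, get_sentiment):
    """Average the sentiment of every sentence each unknown word appears in."""
    totals = defaultdict(float)  # word -> sum of the scores of its sentences
    counts = defaultdict(int)    # word -> number of sentences containing it

    for sentence in sentences:
        # A set, so a word repeated inside one sentence is counted once,
        # matching the `key in sentence.split()` test in the loops above.
        new_words = {w for w in sentence.split() if w not in sentiment_scores}
        if not new_words:
            continue
        val = get_sentiment(sentence)  # scored once per sentence, not once per word
        for word in new_words:
            totals[word] += val
            counts[word] += 1

    return {word: round(totals[word] / counts[word], 1) for word in totals}
```

Each sentence is split and scored once, so the work grows with the total number of words in the file rather than with the number of new words times the number of sentences.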
My question is therefore: how can I do this more efficiently? I have tried just using .readlines() on sentences.txt, but that did not work (I can't figure out why, but I know it has to do with iterating through the text file multiple times; maybe a pointer is disappearing somehow). Thank you in advance!
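One possible explanation for the .readlines() problem, offered as a guess since the failing code is not shown: a file object is a one-shot iterator, so a second loop over the same open file yields nothing until the read position is moved back to the start. The sketch below uses io.StringIO as a stand-in for the opened file, an assumption made only to keep the example self-contained.

```python
import io

# io.StringIO stands in for open('sentences.txt'); a real text file
# behaves the same way for the operations used here.
f = io.StringIO("first sentence\nsecond sentence\n")

first_pass = [line.strip() for line in f]   # reads to the end of the stream
second_pass = [line.strip() for line in f]  # already at the end: empty list

f.seek(0)              # move the read position back to the start
lines = f.readlines()  # a plain list, safe to loop over any number of times
third_pass = [line.strip() for line in lines]
fourth_pass = [line.strip() for line in lines]
```

Reading the lines into a list once, as the list comprehension in the code above already does, avoids the issue because a list can be iterated repeatedly.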
 
     
    