I've been stuck on this problem for days and would appreciate some help. I have a dataframe with the following columns:
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----   
0   PhraseId    93636 non-null  int64   
1   SentenceId  93636 non-null  int64   
2   Phrase      93636 non-null  object  
3   Sentiment   93636 non-null  int64 
The sentiment ranges from 0 to 4 and rates the Phrase from good to bad. I added two columns that might help: the number of words in each phrase (NumOfWords), and each phrase split into a list of its words (SplitPhrase).
What I want to do is create one bar graph per sentiment, showing the 15 most repeated words for that sentiment. The x-axis would show those 15 words.
Below is the code I wrote to count how many times each word is repeated for each sentiment. It will probably be needed for the bar graphs.
Sample data:
       PhraseId  SentenceId  Phrase                Sentiment  SplitPhrase               NumOfWords
44723  75358     3866        Build some robots...  0          [Build, some, robots...]  52
To count how many times a word is repeated for each sentiment:
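from collections import Counter

# One Counter of word frequencies per sentiment value
counters = {}
for sentiment in train_data['Sentiment'].unique():
    counters[sentiment] = Counter()
    # Boolean mask selecting the rows with this sentiment
    mask = train_data['Sentiment'] == sentiment
    for words in train_data.loc[mask, 'SplitPhrase']:
        counters[sentiment].update(words)

print(counters)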
Sample output:
{2: Counter({'the': 28041, ',': 25046, 'a': 19962, 'of': 19376, 'and': 19052, 'to': 13470, '.': 10505, "'s": 10290, 'in': 8108, 'is': 8012, 'that': 7276, 'it': 6176, 'as': 5027, 'with': 4474, 'for': 4362, 'its': 4159, 'film': 3933......}),
 3: Counter({'the': 28041, ',': 25046, 'a': 19962,.....
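A minimal sketch of one possible way to turn counters like these into the bar graphs described above, assuming matplotlib is installed. The helper names `top_words` and `plot_top_words` are hypothetical, and the toy `example_counters` only stand in for the real `counters` dict.

```python
from collections import Counter


def top_words(counter, n=15):
    """Return the n most frequent words and their counts as two lists."""
    pairs = counter.most_common(n)
    words = [word for word, _ in pairs]
    counts = [count for _, count in pairs]
    return words, counts


def plot_top_words(counters, n=15):
    """Draw one bar chart per sentiment showing its n most frequent words."""
    # Imported here so top_words() stays usable without matplotlib.
    import matplotlib.pyplot as plt

    sentiments = sorted(counters)
    fig, axes = plt.subplots(len(sentiments), 1,
                             figsize=(10, 4 * len(sentiments)),
                             squeeze=False)
    for ax, sentiment in zip(axes[:, 0], sentiments):
        words, counts = top_words(counters[sentiment], n)
        ax.bar(words, counts)  # words on the x-axis, counts on the y-axis
        ax.set_title(f"Top {n} words for sentiment {sentiment}")
        ax.set_xlabel("Word")
        ax.set_ylabel("Count")
        ax.tick_params(axis="x", rotation=45)
    fig.tight_layout()
    return fig


# Toy counters (made-up numbers) standing in for the real `counters` dict.
example_counters = {
    0: Counter({"bad": 5, "the": 9, "film": 3}),
    4: Counter({"great": 7, "the": 8, "film": 4}),
}
words, counts = top_words(example_counters[0], n=2)
```

With the real data, this would be called as `fig = plot_top_words(counters)`, and the figure could then be saved with `fig.savefig(...)`. Since words like 'the' and punctuation such as ',' dominate the counts in the sample output, the graphs may look very similar across sentiments unless those tokens are filtered out first.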
 
    
