I have a DataFrame in which one column contains movie descriptions, as strings, in Danish:
df.Description.tail()
24756    Der er nye kendisser i rundkredsen, nemlig Ski...
24757    Hvad får man, hvis man blander en gruppe af k...
24758    Hvordan vælter man en minister? Hvordan ødel...
24759    Der er dømt mandehygge i hulen hos ZULUs tera...
24760    Kender du de dage på arbejdet, hvor alt bare ...
I first check that all the values of the column Description are strings:
df.applymap(type).eq(str).all()
Video.ID.v26    False
Title            True
Category        False
Description      True
dtype: bool
What I want is to create another column that contains the list of words found in each description, like this:
24756   [Der, er, nye, kendisser, i, rundkredsen, ...
In my loop, I also use Rake() to remove Danish stop words. Here is the loop:
# initializing the new column
df['Key_words'] = ""
for index, row in df.iterrows():
    plot = row['Description']
    # instantiating Rake; by default it uses English stopwords from NLTK, but we want Danish
    # and to discard all punctuation characters
    r = Rake('da')
    # extracting the words by passing the text
    r.extract_keywords_from_text(plot)
    # getting the dictionary with key words and their scores
    key_words_dict_scores = r.get_word_degrees()
    # assigning the key words to the new column
    row['Key_words'] = list(key_words_dict_scores.keys())
The problem is that the new column Key_words is empty...
df.Key_words.tail()
24756    
24757    
24758    
24759    
24760    
Name: Key_words, dtype: object
Any help appreciated.
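A minimal sketch of one likely cause and a workaround, under stated assumptions: the `row` yielded by `iterrows()` is usually a copy, so `row['Key_words'] = ...` never reaches `df`. The toy data and the `extract_words` helper below are hypothetical; the helper is a plain whitespace splitter standing in for the Rake step and does no stop-word removal.

```python
import pandas as pd

# Toy stand-in for the real data (hypothetical values).
df = pd.DataFrame({
    "Description": [
        "Der er nye kendisser i rundkredsen",
        "Hvad får man, hvis man blander",
    ]
})


def extract_words(text):
    # Hypothetical placeholder for the Rake step:
    # replaces commas with spaces, lowercases, and splits on whitespace.
    return text.replace(",", " ").lower().split()


# Build the new column from the existing one instead of writing to `row`
# inside iterrows(), which only changes a temporary copy of each row.
df["Key_words"] = df["Description"].apply(extract_words)

print(df["Key_words"].tolist())
```

With Rake, the body of `extract_words` would presumably hold the three Rake lines from the loop and return `list(r.get_word_degrees().keys())`. If `Rake` comes from rake_nltk, Danish stop words are probably selected with `Rake(language='danish')` rather than `Rake('da')`, but that is an assumption worth checking against the library's documentation.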