I have a dataset with a title column, and I want to extract the words from it. I used the count() function to get the total number of occurrences of each word, and then plotted the ten most frequent ones. Here is the code:
install.packages("remotes")
remotes::install_github("tweed1e/werfriends")
library(werfriends)
friends_raw <- werfriends::friends_episodes
library(tidytext)
library(tidyverse)
custom_stop_words <- bind_rows(tibble(word = c("1","2", "one"), 
                                      lexicon = c("custom", "custom", "custom")), 
                               stop_words)
friends_raw %>%
  unnest_tokens(word, title) %>%                   # one row per word in each title
  mutate(word = str_remove(word, "'s")) %>%        # strip possessive 's
  anti_join(custom_stop_words, by = "word") %>%    # drop stop words
  count(word) %>%
  top_n(10, n) %>%                                 # keep the 10 most frequent words
  mutate(word = fct_reorder(word, n)) %>%          # order bars by count
  ggplot(aes(x = word, y = n)) +
  geom_col() +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 30, 5))
The friends_raw dataset also has a season column for each title, and I would like the plot to show the season in which each occurrence happens, using fill. The problem is that with this approach I don't know how to keep the season column through the count while still getting the bars ordered by total count.
Any clues on how to perform this?
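To make the goal concrete, here is a minimal sketch of one direction that might work. It is an assumption rather than a confirmed solution. The toy tibble `episodes` is a hypothetical stand-in for friends_raw with only the `title` and `season` columns, and `tokens`, `top_words` and `total` are names introduced just for this illustration. The idea is to compute the overall totals first, use them to pick and order the top words, and then count per word and season.

```r
library(tidyverse)
library(tidytext)

# Hypothetical stand-in for friends_raw; only `title` and `season` are assumed.
episodes <- tibble(
  season = c(1, 1, 2, 2, 3),
  title  = c("The One with the Cat", "The One with the Dog",
             "The One with the Cat's Toy", "The One Where the Dog Runs",
             "The One with the Cat and the Dog")
)

# Same idea as custom_stop_words in the question, reduced to a single extra word.
custom_stop_words <- bind_rows(tibble(word = "one", lexicon = "custom"),
                               stop_words)

# One row per (episode, word); `season` is carried along with every token.
tokens <- episodes %>%
  unnest_tokens(word, title) %>%
  mutate(word = str_remove(word, "'s")) %>%
  anti_join(custom_stop_words, by = "word")

# Overall totals decide which words are kept and how the bars are ordered.
top_words <- tokens %>%
  count(word, name = "total") %>%
  slice_max(total, n = 10)

# Per-season counts for the top words only, stacked and filled by season.
p <- tokens %>%
  count(word, season) %>%
  inner_join(top_words, by = "word") %>%
  mutate(word = fct_reorder(word, total),
         season = factor(season)) %>%
  ggplot(aes(x = word, y = n, fill = season)) +
  geom_col() +
  coord_flip()

p
```

Because geom_col() stacks by default, each bar's full length would still equal the word's total count, while the fill shows how that total splits across seasons. Note that slice_max() keeps ties, so more than ten words can come back if several share the tenth-highest total.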
 
    
