I have the dataframe below, where each row represents a change to a piece of text. I use the adist() function to classify each character-level operation as a match (M), insertion (I), substitution (S), or deletion (D).
I need to find the indices of every I in the change column (illustrated here in the insertion_idx column). Using those indices, I need to extract the corresponding characters from current_text (illustrated here in insertion_chars).
library(tibble)
df <- tibble(
  current_text    = c("A", "AB", "ABCD", "ABZ"),
  previous_text   = c("", "A", "AB", "ABCD"),
  change          = c("I", "MI", "MMII", "MMSD"),
  # list column: one integer vector of insertion positions per row
  insertion_idx   = list(1L, 2L, 3:4, integer(0)),
  insertion_chars = c("A", "B", "CD", "")
)
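For context, this is roughly how a change column like the one above can be derived. It is a minimal sketch that assumes the edit strings come from the "trafos" attribute of base R's adist() called with counts = TRUE, applied pairwise from previous_text to current_text. The mapply() wrapper is my own illustration, and the exact string adist() returns for a row with several equally cheap alignments may differ from the hand-written example.

```r
# Sketch: derive one edit string per (previous, current) pair with adist().
previous_text <- c("", "A", "AB", "ABCD")
current_text  <- c("A", "AB", "ABCD", "ABZ")

change <- mapply(
  function(prev, cur) {
    # counts = TRUE attaches a "trafos" attribute: a matrix of edit strings
    # made of M (match), I (insertion), D (deletion), S (substitution).
    d <- adist(prev, cur, counts = TRUE)
    attr(d, "trafos")[1, 1]
  },
  previous_text, current_text,
  USE.NAMES = FALSE
)

print(change)
```

Because adist(x, y) describes how to turn x into y, each I here marks a character that exists in current_text but not in previous_text.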
I have tried splitting up strings and comparing string differences, but this gets very messy very fast with real-world data. How do I accomplish the above task?