I want to find the value in a dataframe column that is most similar to a specified string, e.g. a = 'book'. Let's say the dataframe df looks like this:
col1
wijk 00 book
Wijk a 
test
Now I want to return 'wijk 00 book', since this is the value most similar to a. I am trying to do this with the fuzzywuzzy package.
I have a second dataframe A whose col1 holds the strings I want to find the most similar value in df for. I use:
from fuzzywuzzy import process
A['similar_value'] = A.col1.apply(lambda x: process.extract(x, df.col1, limit=1)[0][0])
But when comparing a lot of strings, this takes too much time. Does anyone know how to do this more quickly?
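To make the expected result concrete, here is a minimal sketch of the task that runs without any third-party packages. It makes two assumptions that are not part of my real setup: plain lists stand in for df.col1 and A.col1, and difflib from the standard library stands in for fuzzywuzzy, so the similarity scores will not be the same as fuzzywuzzy's. The names candidates, queries and most_similar are made up for this illustration.

```python
import difflib

# Assumption: plain lists stand in for df.col1 and A.col1 from the question.
candidates = ["wijk 00 book", "Wijk a ", "test"]
queries = ["book"]


def most_similar(query, choices):
    """Return the choice with the highest difflib similarity ratio to query."""
    # Assumption: difflib is used here instead of fuzzywuzzy.
    # cutoff=0.0 keeps every choice in play, so the best one is always returned.
    matches = difflib.get_close_matches(query, choices, n=1, cutoff=0.0)
    return matches[0] if matches else None


# Every query is compared against every candidate, so the amount of work
# grows with len(queries) * len(candidates).
similar_values = [most_similar(q, candidates) for q in queries]
print(similar_values)  # ['wijk 00 book']
```

The all-pairs comparison in the last step is the same shape as my apply call, which is where the time goes once both columns get large.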
 
     
    