Looking out to find the highest accuracy percentage between 2 column values by using Fuzzy string matching.
I have 2 dataframes where i am trying to use fuzzy match between an specific columns values from both the dataframes.
Lets say df1 has 5 rows and df2 has 4 rows, i want to pick the values of each row of df1 and match with every row of df2 and find the highest accuracy. Suppose Row1 from df1 has been compared with all the rows from df2 so whichever the row from df2 has highest accuracy we will consider it as output. same to be considered for each row from df1.
Input Data :
Dataframe1
id_number  company_name        match_acc
IN2231D    AXN pvt Ltd
UK654IN    Aviva Intl Ltd
SL1432H    Ship Incorporations
LK0678G    Oppo Mobiles pvt ltd
NG5678J    Nokia Inc
Dataframe2
identity_no   Pincode   company_name
 IN2231        110030    AXN pvt Ltd
 UK654IN       897653    Aviva Intl Ltd
 SL1432        07658     Ship Incorporations
 LK0678G       120988    Oppo Mobiles Pvt Ltd
Want to find the highest accuracy percentage and submit the values in match_acc column.
Code i have been using currently:
df1 = pd.read_excel(open(r'input.xlsx', 'rb'), sheet_name='sheet1')
df2 = pd.read_excel(open(r'input.xlsx', 'rb'), sheet_name='sheet2')
from fuzzywuzzy import fuzz 
for index, row in df1.iterrows():
   df1['match_acc']= fuzz.partial_ratio(df1['id_number'], df2['identity_no'])
print(df1['match_acc'])
I have been using Fuzzywuzzy, If there is any other method as well pls suggest.
Any Suggestion.