I have two dataframes, named codes and phrases.
codes:
| code | keywords |
|---|---|
| bg | burger |
| bg | burgers |
| cbg | chicken burger |
| cbg | burger chicken |
| cbg | chicken burgers |
| -- | -- |
| -- | -- |
phrases:
| text |
|---|
| burgers near me |
| chicken burgers around NYC |
| -- |
| -- |
Using Python, I want to build a dataframe like this:
| text | code |
|---|---|
| burgers near me | bg |
| chicken burgers around NYC | cbg |
| -- | -- |
| -- | -- |
I am trying to identify which keyword from codes best matches each record in phrases, and then assign that keyword's code to the record.
If I simply use a string-contains check, "burgers" would match both of the phrases above, so the second phrase would match bg as well as cbg. Is there a better way to accomplish this?
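To make the problem concrete, here is a minimal sketch of one possible direction: among all keywords that occur in a phrase as whole words, keep the longest one. The "longest keyword wins" rule and the helper name `best_code` are assumptions for illustration only, and the frames below contain just the sample rows shown above.

```python
import re
import pandas as pd

# Sample rows from the tables above (the real frames have more rows).
codes = pd.DataFrame({
    "code": ["bg", "bg", "cbg", "cbg", "cbg"],
    "keywords": ["burger", "burgers", "chicken burger",
                 "burger chicken", "chicken burgers"],
})
phrases = pd.DataFrame({
    "text": ["burgers near me", "chicken burgers around NYC"],
})

def best_code(text, codes):
    """Hypothetical helper: return the code of the longest keyword that
    appears in `text` as whole words, or None if no keyword matches."""
    best_kw, best = "", None
    for code, kw in zip(codes["code"], codes["keywords"]):
        # \b anchors keep "burger" from matching inside "burgers".
        pattern = r"\b" + re.escape(kw) + r"\b"
        if len(kw) > len(best_kw) and re.search(pattern, text, flags=re.IGNORECASE):
            best_kw, best = kw, code
    return best

# Assumption: the longest matching keyword is the "best" match.
phrases["code"] = [best_code(t, codes) for t in phrases["text"]]
print(phrases)
```

This loops over every keyword for every phrase, so it is only meant to show the intended result, not to scale to large frames.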
Thanks in advance!