I have two Excel sheets with insurance claims data from two different insurance providers. I need to find individuals who have filed claims under both providers.
I would like something that pairs names when they are likely to be the same person, but does nothing when it doesn't find a similar enough name in the other sheet. From what I have read, I think I need fuzzy string matching for this (and maybe the Damerau-Levenshtein distance). I know R has a string distance function, adist, but I am struggling to learn to use it properly.
For an example:
Provider 1:
Ms. Smith        35        F        Portland,OR             Cardiac
Adam Jacobs      27        M        San Francisco, CA       Gynecology
Emily Lo         19        F        Portland,OR             Ortho
Frances Wu       33        F        Dallas, TX              ENT
Provider 2: 
Clara Smith      35        F        Portland,OR              Cardiac
Bill White       29        M        San Francisco, CA        Ortho
Emily S. Lo      19        F        Portland,OR              Ortho
Dev Patel        22        M        Dallas, TX               Neuro
So here it should recognize that Emily S. Lo is the same person as Emily Lo, and that Clara Smith is the same person as Ms. Smith, and give me a list with their names and information. How do I do this?
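To make the goal concrete, here is a minimal sketch of the kind of matching I have in mind, using base R's adist on the names only. The vectors names1 and names2, the 0.3 cutoff, and scaling by the longer name's length are all assumptions for illustration, not a known-good solution. Note that adist computes a generalized Levenshtein distance, not the DL distance.

```r
# Names from the example sheets (assumed to be already read into character vectors)
names1 <- c("Ms. Smith", "Adam Jacobs", "Emily Lo", "Frances Wu")
names2 <- c("Clara Smith", "Bill White", "Emily S. Lo", "Dev Patel")

# Pairwise generalized Levenshtein distance (rows: names1, columns: names2)
d <- adist(names1, names2, ignore.case = TRUE)

# Scale by the longer name in each pair so every distance falls in [0, 1]
max_len <- outer(nchar(names1), nchar(names2), pmax)
d_norm <- d / max_len

# Closest candidate in sheet 2 for each name in sheet 1
best <- apply(d_norm, 1, which.min)
best_dist <- d_norm[cbind(seq_along(names1), best)]

# Keep a pair only when it is close enough; the cutoff is an arbitrary assumption
cutoff <- 0.3
keep <- best_dist <= cutoff
matches <- data.frame(
  provider1 = names1[keep],
  provider2 = names2[best[keep]],
  distance = best_dist[keep],
  stringsAsFactors = FALSE
)
print(matches)
```

With this cutoff only Emily Lo / Emily S. Lo is paired (3 edits over 11 characters). Ms. Smith / Clara Smith needs 5 edits over 11 characters, about 0.45, so name distance alone does not catch it; that pair would presumably need the other columns (age, sex, city) or a surname-only comparison.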
I tried copying what this person did: http://bigdata-doctor.com/fuzzy-string-matching-survival-skill-tackle-unstructured-information-r/. Even with their data and their code copy/pasted as-is, I keep getting a 0x0 result.
