Finding this one difficult and would like to see if you can help, pls:
I have two df's, dfA and dfB. In each there is a telnum, file, and datetime column, but only dfA contains a name column and only dfA has complete telephone number string values unlike dfB which sometimes has partially complete telephone numbers up to x of n digits, e.g. sometimes the int'l calling code is missing, other times a '0' is present instead of the int'l calling code, and other times neither int'l calling code or the leading '0' is present.
dfA also contains more data than dfB (<10 rows), but between the two there are complete timestamp, file, and telnum columns always with a date-time and string value respectively (even though dfB['telnum'] is not fully complete as stated above).
What I want to do is extract the rows from dfA where dfB['telnum'] matches dfA['telnum'] but because dfB['telnum'] is not always complete then I need to check for matches as a substring of dfA['telnum'] too.
I would like the result to be dfResult but where the returned results have dfA values on the left and dfB on the right so that I can see the different file and datetime values.
Any ideas?
EDIT:
I think I need an inner merge, e.g.
pandas.merge(dfA, dfB, on='telnum', how='inner')
However, because dfB['telnum'] isn't always complete telnum string, the results are not complete. How can I get those that match between the two by checking if dfB['telnum'] is a substring of dfA['telnum'] as well?