My Problem is that I have a column of data (characters & numbers combined) and I want to check for each field if it partially matches with one of the fields in a column of a different df, and if it does write the corresponding value from the second df into a new column of the first variable.
So more practically:
df1 (2x60k)=
QST_ID    QST_CODING
1         M79.68
2         A01.1
3         K07
4         Z09
df2 (2x2451)=
icd    name
A01    intestinal diseases
...
K07    Diseases of oral cavity
And the aim would be to have df1 =
QST_ID    QST_CODING    QST_CODING_RC_NAME
1         M79.68        diseaseX
2         A01.1         Intestinal diseases
3         K07           Diseases of oral cavity
4         Z09           diseaseY
The easiest fix would be to ignore everything after the . and basically treat the 'icd' as icd* But I don't get this to work for the whole list.
I tried doing this with the match, pmatch and grep functions and tried prefixing all icd with ^ (regex wildcat) but it didn't work at all:
df1$QST_CODING_RC_name <- df2$name[pmatch(as.character(df1$QST_CODING), df2$icd, duplicates.ok = T)]
df1$QST_CODING_RC_name <- df2$name[grep(df2$icd, as.character(df1$QST_CODING), ignore.case = T)]
I would be very happy if anyone has a solution for this. Thanks a lot for your help.
Edit: QST_CODING contains values of a different form(g.523, etc.), where the information after the point is necessary to work with later and cannot simply be dropped. Furthermore, the length of bothe the part in front of the point as well as the part after it differs in QST_CODING.
 
    