I have a join problem I'm struggling with: the IDs I want to join two data frames on are spread across three possible ID columns, and I'd like two rows to join if at least one of those IDs matches. I know the *_join functions and merge accept a vector of column names, but is it possible to make the match conditional, so that any one of the columns is enough?
For example, if I have the following two data frames:
df_A <- data.frame(dta = c("FOO", "BAR", "GOO"),
                   id1 = c("abc", "", "bcd"),
                   id2 = c("", "", "xyz"),
                   id3 = c("def", "fgh", ""), stringsAsFactors = FALSE)
df_B <- data.frame(dta = c("FUU", "PAR", "KOO"),
                   id1 = c("abc", "", ""),
                   id2 = c("", "xyz", "zzz"),
                   id3 = c("", "", ""), stringsAsFactors = FALSE)
> df_A
  dta id1 id2 id3
1 FOO abc     def
2 BAR         fgh
3 GOO bcd xyz   
> df_B
  dta id1 id2 id3
1 FUU abc        
2 PAR     xyz    
3 KOO     zzz  
I hope to end up with something like this:
  dta.x dta.y id1 id2 id3
1 FOO   FUU   abc ""  def  [matched on id1]
2 BAR   ""    ""  ""  fgh  [unmatched]
3 GOO   PAR   bcd xyz ""   [matched on id2]
4 KOO   ""    ""  zzz ""   [unmatched]
That is, unmatched dta values from either data frame are retained, but where there is a match (rows 1 and 3 above) the dta values from both data frames end up in the same row of the new table. I have a sense that none of *_join, merge, or match will work as is and that I'd need to write a function, but I'm not sure where to start. Any help or ideas appreciated. Thank you.
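One possible starting point, sketched in base R and not a definitive solution: compare every row of df_A with every row of df_B, one ID column at a time, and treat a pair as matched if any column holds the same non-empty value. The helper name join_any_id and its arguments are hypothetical. The sketch assumes empty strings mean "no ID", that the value column is called dta in both data frames, and that when both sides carry an ID in the same column the value from the first data frame is kept. Rows found only in df_B keep their value in dta.y, with dta.x left empty.

```r
df_A <- data.frame(dta = c("FOO", "BAR", "GOO"),
                   id1 = c("abc", "", "bcd"),
                   id2 = c("", "", "xyz"),
                   id3 = c("def", "fgh", ""), stringsAsFactors = FALSE)
df_B <- data.frame(dta = c("FUU", "PAR", "KOO"),
                   id1 = c("abc", "", ""),
                   id2 = c("", "xyz", "zzz"),
                   id3 = c("", "", ""), stringsAsFactors = FALSE)

# Hypothetical helper: full join where a pair of rows matches if ANY of the
# id columns holds the same non-empty value in both data frames.
join_any_id <- function(a, b, ids = c("id1", "id2", "id3")) {
  # Logical matrix: hit[i, j] is TRUE if row i of a and row j of b share an id
  hit <- Reduce(`|`, lapply(ids, function(col) {
    outer(a[[col]], b[[col]], function(x, y) x != "" & x == y)
  }))
  pairs <- which(hit, arr.ind = TRUE)   # columns "row" (a) and "col" (b)

  # Rows that found no partner on either side
  only_a <- setdiff(seq_len(nrow(a)), pairs[, "row"])
  only_b <- setdiff(seq_len(nrow(b)), pairs[, "col"])

  # Row indices into a and b for every output row; NA marks a missing side
  i <- c(pairs[, "row"], only_a, rep(NA, length(only_b)))
  j <- c(pairs[, "col"], rep(NA, length(only_a)), only_b)
  o <- order(i, j)                      # a's row order first, b-only rows last
  i <- i[o]
  j <- j[o]

  # Look up a column by row index, turning a missing side into ""
  pick <- function(df, col, idx) {
    v <- df[[col]][idx]
    v[is.na(v)] <- ""
    v
  }

  out <- data.frame(dta.x = pick(a, "dta", i),
                    dta.y = pick(b, "dta", j),
                    stringsAsFactors = FALSE)
  for (col in ids) {
    va <- pick(a, col, i)
    vb <- pick(b, col, j)
    out[[col]] <- ifelse(va != "", va, vb)  # prefer a's id, fall back to b's
  }
  out
}

res <- join_any_id(df_A, df_B)
print(res)
```

Because every row is compared with every other row, this only suits data frames small enough for an nrow(df_A) by nrow(df_B) matrix to fit in memory. A row that matches several partners appears once per partner.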
 
     
    