I want to do a full join of two data frames on two columns, where the value in one of the join columns is a substring of the value in the other. Below are my two data frames:
date<-as.Date(c('2010-11-1','2008-3-25','2007-3-14'))
site<-c("abcejams.com", "reitimes.com", "posehbc")
desc1<-c("alpha", "beta", "gamma")
df1<-data.frame(date, site, desc1)
df1
        date         site    desc1
1 2010-11-01 abcejams.com    alpha
2 2008-03-25 reitimes.com     beta
3 2007-03-14      posehbc    gamma
date2<-as.Date(c('2010-11-1','2008-3-25','2007-3-14', '2018-2-9'))
site2<-c("jams", "time", "pose", "abce")
metric2<-c(1,2,3,4)
metric3<-c(10,20,30,40)
df2<-data.frame(date2, site2, metric2, metric3)
df2
       date2 site2 metric2 metric3
1 2010-11-01  jams       1      10
2 2008-03-25  time       2      20
3 2007-03-14  pose       3      30
4 2018-02-09  abce       4      40
I want to join these by date AND site, where two rows match when the dates are equal and site2 is contained in site. Without the grep portion, this is how you would normally do it with dplyr:
library(dplyr)
finaldf<-full_join(df1, df2, by = c("date"="date2", "site" = "site2"))
I can get the substring match with sqldf, but that only gives me a left join, not a full join:
test<-sqldf("select df1.*, df2.metric2,
df2.metric3 
        from df1 
        left join df2 
        on 
        instr(df1.site, df2.site2) > 0
        and 
        df1.date=df2.date2")
The goal is to have the final output look like this:
        date         site desc1 metric2 metric3
1 2010-11-01 abcejams.com alpha       1      10
2 2008-03-25 reitimes.com  beta       2      20
3 2007-03-14      posehbc gamma       3      30
4 2018-02-09         abce    NA       4      40
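To make the matching rule concrete, here is a rough base R sketch of the logic I am after. It is only an illustration under my own assumptions: the helper columns id1 and id2 and the intermediate names (pairs, hit, matched, left_only, right_only) are made up for this example, the substring test is a literal grepl(..., fixed = TRUE), and I have only reasoned about it on the sample data above.

```r
# Sample data from above, rebuilt so the sketch runs on its own
df1 <- data.frame(
  date  = as.Date(c("2010-11-01", "2008-03-25", "2007-03-14")),
  site  = c("abcejams.com", "reitimes.com", "posehbc"),
  desc1 = c("alpha", "beta", "gamma"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  date2   = as.Date(c("2010-11-01", "2008-03-25", "2007-03-14", "2018-02-09")),
  site2   = c("jams", "time", "pose", "abce"),
  metric2 = c(1, 2, 3, 4),
  metric3 = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Hypothetical row ids, used only to find unmatched rows afterwards
df1$id1 <- seq_len(nrow(df1))
df2$id2 <- seq_len(nrow(df2))

# 1. Pair up rows that share a date (plain equi-join on the date columns)
pairs <- merge(df1, df2, by.x = "date", by.y = "date2")

# 2. Keep only the pairs where site2 occurs literally inside site
hit <- vapply(
  seq_len(nrow(pairs)),
  function(i) grepl(pairs$site2[i], pairs$site[i], fixed = TRUE),
  logical(1)
)
matched <- pairs[hit, c("id1", "date", "site", "desc1", "metric2", "metric3")]

# 3. Rows of df1 without a partner: the metrics become NA
l <- df1[!(df1$id1 %in% pairs$id1[hit]), ]
left_only <- data.frame(
  id1 = l$id1, date = l$date, site = l$site, desc1 = l$desc1,
  metric2 = rep(NA_real_, nrow(l)), metric3 = rep(NA_real_, nrow(l)),
  stringsAsFactors = FALSE
)

# 4. Rows of df2 without a partner: site2 fills site, desc1 becomes NA
r <- df2[!(df2$id2 %in% pairs$id2[hit]), ]
right_only <- data.frame(
  id1 = rep(NA_integer_, nrow(r)), date = r$date2, site = r$site2,
  desc1 = rep(NA_character_, nrow(r)),
  metric2 = r$metric2, metric3 = r$metric3,
  stringsAsFactors = FALSE
)

# 5. Stack the three pieces, restore df1's row order, drop the helper id
finaldf <- rbind(matched, left_only, right_only)
finaldf <- finaldf[order(finaldf$id1),
                   c("date", "site", "desc1", "metric2", "metric3")]
rownames(finaldf) <- NULL
finaldf
```

This first pairs every row that shares a date and only then filters on the substring, so I suspect it gets slow on large tables. I would still prefer a proper join-style solution if one exists.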
Does anyone have experience with this kind of join?