I need to join several data frames on inexact (interval-overlap) matches, which can be done with the fuzzyjoin package (its interval joins rely on the IRanges package):
Data:
df1 <- data.frame(
  line = 1:4,
  start = c(75,100,170,240),
  end = c(100,150,190,300)
)
df2 <- data.frame(
  v2 = c("A","B","C","D","E","F","G","H","I","J","K","F"),
  start = c(0,10,30,90,120,130,154,161,175,199,205,300),
  end = c(10,20,50,110,130,140,160,165,180,250,300,305)
)
df3 <- data.frame(
  v3 = c("a","b","c","d","e","f"),
  start = c(5,90,200,333,1000,1500),
  end = c(75,171,210,400,1001,1600)
)
Here I want to join df2 and df3 to df1 based on overlap between the intervals defined by start and end. What I can do is join them in steps, i.e., one join at a time:
library(fuzzyjoin)
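library(dplyr)  # provides %>%, select() and rename() used below
# install package "IRanges" (needed by fuzzyjoin's interval joins; it only has to be installed):
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

BiocManager::install("IRanges")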
# First join:
df12 <- interval_left_join(x = df1,
                            y = df2,
                            by = c("start", "end")) %>%
  select(-c(start.y, end.y)) %>%
  rename(start = start.x, end = end.x)
# Second join:
df123 <- interval_left_join(x = df12,
                             y = df3,
                             by = c("start", "end")) %>%
  select(-c(start.y, end.y)) %>%
  rename(start = start.x, end = end.x)
Result:
df123  
  line start end v2   v3
1    1    75 100  D    a
2    1    75 100  D    b
3    2   100 150  D    b
4    2   100 150  E    b
5    2   100 150  F    b
6    3   170 190  I    b
7    4   240 300  J <NA>
8    4   240 300  K <NA>
9    4   240 300  F <NA>
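To make the matching rule explicit: judging from the output above, two intervals count as a match when they share at least one point, endpoints included. The base-R sketch below illustrates that rule only; `overlaps` is a hypothetical helper name, not a function from fuzzyjoin or IRanges.

```r
# Closed-interval overlap: each interval starts no later than the other one ends.
# `overlaps` is a hypothetical helper for illustration only.
overlaps <- function(s1, e1, s2, e2) s1 <= e2 & s2 <= e1

overlaps(75, 100, 5, 75)      # line 1 vs. "a": TRUE, they share the endpoint 75
overlaps(240, 300, 200, 210)  # line 4 vs. "c": FALSE, hence the NA in v3
```

This is why line 1 (75 to 100) is matched with both "a" (5 to 75) and "b" (90 to 171).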
This all works well, but my actual data contains many more data frames to join, so doing it join by join is impractical and error-prone. How can the join be performed for all data frames in one go?
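One possible direction, sketched here under assumptions rather than as a settled solution: wrap a single join step in a function and fold it over a list of data frames with base R's `Reduce()`. The helper name `join_step` and the result name `df_all` are hypothetical, and the sketch assumes every data frame has `start` and `end` columns and no other column names in common.

```r
library(fuzzyjoin)  # interval_left_join(); requires IRanges to be installed
library(dplyr)      # %>%, select(), rename()

# Example data from the question, repeated so the sketch runs on its own
df1 <- data.frame(line = 1:4,
                  start = c(75, 100, 170, 240),
                  end = c(100, 150, 190, 300))
df2 <- data.frame(v2 = c("A","B","C","D","E","F","G","H","I","J","K","F"),
                  start = c(0,10,30,90,120,130,154,161,175,199,205,300),
                  end = c(10,20,50,110,130,140,160,165,180,250,300,305))
df3 <- data.frame(v3 = c("a","b","c","d","e","f"),
                  start = c(5,90,200,333,1000,1500),
                  end = c(75,171,210,400,1001,1600))

# Hypothetical helper: one interval left join that keeps only x's start/end
join_step <- function(x, y) {
  interval_left_join(x, y, by = c("start", "end")) %>%
    select(-c(start.y, end.y)) %>%
    rename(start = start.x, end = end.x)
}

# Fold the helper over the data frames to attach, using df1 as the starting value
df_all <- Reduce(join_step, list(df2, df3), df1)
df_all
```

For the example data this applies the same two joins in the same order as the stepwise code, so it should give the same rows as df123. Further data frames would simply be added to the list.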
 
    