I have two large data frames that look like this:
library(tibble)

df1 <- tibble(chrom=c(1,1,1,2,2,2),
              start=c(100,200,300,100,200,300),
              end=c(150,250,350,120,220,320))
df2 <- tibble(chrom=c(1,1,1,2,2,2),
              start2=c(100,50,280,100,10,200),
              end2=c(125,100,320,115,15,350))
df1
#> # A tibble: 6 × 3
#>   chrom start   end
#>   <dbl> <dbl> <dbl>
#> 1     1   100   150
#> 2     1   200   250
#> 3     1   300   350
#> 4     2   100   120
#> 5     2   200   220
#> 6     2   300   320
df2
#> # A tibble: 6 × 3
#>   chrom start2  end2
#>   <dbl>  <dbl> <dbl>
#> 1     1    100   125
#> 2     1     50   100
#> 3     1    280   320
#> 4     2    100   115
#> 5     2     10    15
#> 6     2    200   350
Created on 2023-01-09 with reprex v2.0.2
I want to find which ranges [start2, end2] in df2 overlap the ranges [start, end] in df1. Ideally the output would look something like the table below, but the exact layout is not essential. What I mainly need are the coordinates of the overlapping ranges.
#> # A tibble: 6 × 8
#>   chrom start   end start2  end2 overlap overlap_start overlap_end
#>   <dbl> <dbl> <dbl>  <dbl> <dbl> <chr>   <chr>         <chr>      
#> 1     1   100   150    100   125 yes     100           125        
#> 2     1   200   250     50   100 no      <NA>          <NA>       
#> 3     1   300   350    280   320 yes     300           320        
#> 4     2   100   120    100   115 yes     100           115        
#> 5     2   200   220     10    15 no      <NA>          <NA>       
#> 6     2   300   320    200   350 yes     200,220       300,320
Note that in the last row, the df2 range 200-350 overlaps two ranges from df1: 200-220 and 300-320.
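To make the goal concrete, here is a minimal base R sketch of one possible reading of it: pair every df1 range with every df2 range on the same chromosome and keep the pairs that intersect. It uses plain data.frames instead of tibbles so that it runs without extra packages, and it assumes closed intervals, so ranges that only touch at a single coordinate count as overlapping. The names `pairs` and `overlaps` are just illustrative, and the result is one row per overlapping pair rather than the row-by-row layout shown above.

```r
# Toy copies of the data from the question, as base data.frames.
df1 <- data.frame(chrom = c(1, 1, 1, 2, 2, 2),
                  start = c(100, 200, 300, 100, 200, 300),
                  end   = c(150, 250, 350, 120, 220, 320))
df2 <- data.frame(chrom  = c(1, 1, 1, 2, 2, 2),
                  start2 = c(100, 50, 280, 100, 10, 200),
                  end2   = c(125, 100, 320, 115, 15, 350))

# Pair every df1 range with every df2 range on the same chromosome.
pairs <- merge(df1, df2, by = "chrom")

# The intersection of two ranges runs from the larger start to the smaller end.
pairs$overlap_start <- pmax(pairs$start, pairs$start2)
pairs$overlap_end   <- pmin(pairs$end, pairs$end2)

# Assumption: closed intervals, so touching ranges (e.g. 100-150 and 50-100)
# are kept. Use `<` instead of `<=` to drop them.
overlaps <- pairs[pairs$overlap_start <= pairs$overlap_end, ]
overlaps <- overlaps[order(overlaps$chrom, overlaps$start2, overlaps$start), ]
rownames(overlaps) <- NULL
overlaps
```

Because `merge()` builds every within-chromosome pair before filtering, this is only practical for small inputs. For data frames as large as mine, a join that prunes non-overlapping pairs up front would presumably be needed.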