I have 2 datasets that looks like the following (with about 600 total entries for both) Dataframe1
| name | A_time | measurement_A |
|---|---|---|
| A | 6/10/22 7:22 | 3.4 |
| A | 6/10/22 16:15 | 4.4 |
| A | 6/11/22 6:15 | 5.5 |
| A | 6/11/22 15:44 | 6.6 |
| B | 5/20/22 6:30 | 7.6 |
| B | 5/20/22 13:21 | 1.2 |
| B | 5/21/22 8:43 | 2.4 |
| B | 5/21/22 14:35 | 4.4 |
Dataframe2
| name | B_time | measurement_B |
|---|---|---|
| A | 6/10/22 7:24 | 50 |
| A | 6/10/22 16:55 | 50.1 |
| A | 6/11/22 6:13 | 52 |
| A | 6/11/22 15:21 | 61 |
| B | 5/20/22 6:30 | 54 |
| B | 5/20/22 13:22 | 53 |
| B | 5/21/22 8:00 | 54.6 |
| B | 5/21/22 14:12 | 76.1 |
I want to combine these two dataframes by name and date, selecting for rows only where Dataframe1 and Dataframe2 are within 30 minutes of each other.
I'm hoping to get something like this: Dataframe3
| name | A_time | measurement_A | measurement_B |
|---|---|---|---|
| A | 6/10/22 7:22 | 3.4 | 50 |
| A | 6/11/22 6:15 | 5.5 | 52 |
| A | 6/11/22 15:44 | 6.6 | 61 |
| B | 5/20/22 6:30 | 7.6 | 54 |
| B | 5/20/22 13:21 | 1.2 | 53 |
| B | 5/21/22 14:35 | 4.4 | 76.1 |
The code I am using:
library(dplyr)
dataframe3 <- dataframe1 %>%
inner_join(dataframe2, by = "name") %>%
filter(abs(A_time - B_time) <= as.difftime(30, units = "mins"))
But I am getting an error because the names in dataframe1 correlated to multiple rows in dataframe2. How can I get the Dataframe3 that I want?