I need to join two datasets together:
- time series data taken every second with a heart rate measure (with 1.4 million records)
>allsecsHR
timestamp               HRbpm
2023-03-22 09:04:53     101
2023-03-22 09:04:54     124
2023-03-22 09:04:55     103
2023-03-22 09:04:56     111
2023-03-22 09:04:57     112
2023-03-22 09:04:58     143
2023-03-22 09:04:59     109
2023-03-22 09:05:00     129
2023-03-22 09:05:01     122
2023-03-22 09:05:02     125
2023-03-22 09:05:03     110
- behavioural records with a start and end timestamp
>bhr
Behaviour  BhrTimeStart         BhrTimeEnd
Forage     2023-03-22 09:04:53  2023-03-22 09:04:58
Vigilance  2023-03-22 09:04:58  2023-03-22 09:05:03
Forage     2023-03-22 09:05:03  2023-03-22 09:05:10
At the end I would like a dataset that has each row representing one second, with the behaviour that was being performed per second
I have tried to do this in mySQL and in R using sqldf and powerjoin but it keeps losing connection to the sql server after running for 2 hours or in R it does not complete and crashes or returns Error: vector memory exhausted (limit reached?). I would really appreciate some help to find an efficient way to do this!
I think a data.table solution might be the fastest but I am not sure how to do it, a tidy solution would also be great!
So far I have tried:
mySQL/sqldf
library(sqldf)
sqldf("select * from allsecsHR
              left join bhr
              on allsecsHR.timestamp between bhr.BhrTimeStart and bhr.BhrTimeEnd")
powerjoin
library(powerjoin)
power_left_join(
  allsecsHR, bhr, 
  by = ~.x$timestamp > .y$BhrTimeStart & 
    (.x$timestamp < .y$BhrTimeEnd | is.na(.y$BhrTimeEnd)),
  keep = "left")
The output table I would like is this:
timestamp           HRbpm   Bhr         BhrTimeStart        BhrTimeEnd
2023-03-22 09:04:53 101     Forage      2023-03-22 09:04:53   2023-03-22 09:04:58
2023-03-22 09:04:54 124     Forage      2023-03-22 09:04:53   2023-03-22 09:04:58
2023-03-22 09:04:55 103     Forage      2023-03-22 09:04:53   2023-03-22 09:04:58
2023-03-22 09:04:56 111     Forage      2023-03-22 09:04:53   2023-03-22 09:04:58
2023-03-22 09:04:57 112     Forage      2023-03-22 09:04:53   2023-03-22 09:04:58
2023-03-22 09:04:58 143     Vigilance   2023-03-22 09:04:58   2023-03-22 09:05:03
2023-03-22 09:04:59 109     Vigilance   2023-03-22 09:04:58   2023-03-22 09:05:03
2023-03-22 09:05:00 129     Vigilance   2023-03-22 09:04:58   2023-03-22 09:05:03
2023-03-22 09:05:01 122     Vigilance   2023-03-22 09:04:58   2023-03-22 09:05:03
2023-03-22 09:05:02 125     Vigilance   2023-03-22 09:04:58   2023-03-22 09:05:03
2023-03-22 09:05:03 110     Forage      2023-03-22 09:05:03   2023-03-22 09:05:10
 
     
     
    