I am looking for the best way to create a new variable, numWithin365, defined as follows:
Given a column of dates, dates, count for each date the number of other dates in the column that fall within the preceding 365 days. The problem could be generalized beyond a vector of dates.
Here is one implementation; I am looking for any advice that could help it scale better.
library(dplyr)
# set seed for reproducibility
set.seed(42)
# function to calculate number of dates in prior year:
# for each date x, count the dates in the window (x - 365, x - 1]
within365 <- function(col) {
  sapply(col, function(x) {
    sum(x - 365 < col & col <= x - 1)
  })
}
# fake data sorted chronologically
df <- data.frame(
  dates = sample(seq(as.Date('2015/01/01'), as.Date('2020/12/31'), by = "day"), 10)
) %>%
  arrange(dates)
# applying the function
df %>% mutate(numWithin365 = within365(dates))
        dates numWithin365
1  2015-12-22            0
2  2016-09-25            1
3  2018-01-02            0
4  2018-02-25            1
5  2018-03-22            2
6  2018-06-05            3
7  2018-08-19            4
8  2019-06-13            1
9  2020-09-02            0
10 2020-09-27            1
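One direction that might scale better, as a rough sketch rather than a definitive answer: the sapply version compares every date against every other date, so the work grows quadratically. Assuming the column has no NA values, the same window count can be expressed as a difference of two lookups into a sorted copy of the column using base R's findInterval(). The name within365_fast is hypothetical and only used here for illustration.

```r
# Hypothetical helper (not from the original post): same window as within365(),
# i.e. dates in (x - 365, x - 1], counted via lookups into a sorted vector.
within365_fast <- function(col) {
  d <- as.numeric(col)   # Date -> days since 1970-01-01
  s <- sort(d)           # findInterval() needs a non-decreasing vector without NAs
  # findInterval(v, s) returns how many elements of s are <= v, so the
  # difference is the number of dates with x - 365 < date <= x - 1.
  findInterval(d - 1, s) - findInterval(d - 365, s)
}

# The ten dates from the output above
dates <- as.Date(c("2015-12-22", "2016-09-25", "2018-01-02", "2018-02-25",
                   "2018-03-22", "2018-06-05", "2018-08-19", "2019-06-13",
                   "2020-09-02", "2020-09-27"))

within365_fast(dates)
# [1] 0 1 0 1 2 3 4 1 0 1
```

Because the function sorts internally, the input does not need to be ordered, and it should drop into the existing pipeline as df %>% mutate(numWithin365 = within365_fast(dates)). The cost is roughly one sort plus two lookups per date, instead of a full pass over the column for every date.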
 
     
     
    