I think examples are easier to understand, so here is how to generate a small fake data set as an example:
library(tidyr)
day_event<- as.Date("2017-03-01") + 0:6
a<-rep(1,7)
b<-c(NA, rep(1,6))
c<-c(NA, NA, rep(1,5))
df_1<-data.frame(day_event,a,b,c)
names(df_1)[2]<-"2017-03-08"
names(df_1)[3]<-"2017-03-09"
names(df_1)[4]<-"2017-03-10"
> df_1
  day_event 2017-03-08 2017-03-09 2017-03-10
1  2017-03-01          1         NA         NA
2  2017-03-02          1          1         NA
3  2017-03-03          1          1          1
4  2017-03-04          1          1          1
5  2017-03-05          1          1          1
6  2017-03-06          1          1          1
7  2017-03-07          1          1          1
My data arrives in the df_2 (long) format, but with tidyr I can convert from one format to the other:
df_2<-gather(df_1, day_measure, measure, -day_event)
> df_2
 day_event  day_measure measure
1   2017-03-01 2017-03-08       1
2   2017-03-02 2017-03-08       1
3   2017-03-03 2017-03-08       1
4   2017-03-04 2017-03-08       1
5   2017-03-05 2017-03-08       1
6   2017-03-06 2017-03-08       1
7   2017-03-07 2017-03-08       1
8   2017-03-01 2017-03-09      NA
9   2017-03-02 2017-03-09       1
10  2017-03-03 2017-03-09       1
11  2017-03-04 2017-03-09       1
12  2017-03-05 2017-03-09       1
13  2017-03-06 2017-03-09       1
14  2017-03-07 2017-03-09       1
15  2017-03-01 2017-03-10      NA
16  2017-03-02 2017-03-10      NA
17  2017-03-03 2017-03-10       1
18  2017-03-04 2017-03-10       1
19  2017-03-05 2017-03-10       1
20  2017-03-06 2017-03-10       1
21  2017-03-07 2017-03-10       1
For context, the data represents measures of an event that happened on day_event. Depending on the day the measure is taken (day_measure), the measure of the same event can be different!
My problem is that I only measure events up to seven days back: that's why the measure on day_measure = '2017-03-09' for day_event = '2017-03-01' is NA.
I would like to replace this NA with the last measure performed (7 days after day_event): in this case, with the measure made on '2017-03-08'.
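To make the rule concrete, here is a small sketch that lists the rows to fill and the measurement day each one should copy from. It rebuilds the long-format example from scratch so it runs on its own. Storing day_measure as a Date (rather than the character column that gather() produces) and the names to_fill and copy_from are my own assumptions for illustration.

```r
# Rebuild the long-format example so this block runs on its own
day_event   <- as.Date("2017-03-01") + 0:6
day_measure <- as.Date(c("2017-03-08", "2017-03-09", "2017-03-10"))
df_2 <- data.frame(
  day_event   = rep(day_event, times = 3),
  day_measure = rep(day_measure, each = 7),
  measure     = 1
)

# Measures taken more than 7 days after the event are missing
df_2$measure[df_2$day_measure > df_2$day_event + 7] <- NA

# Rows that need filling, with the measurement day to copy from
# (to_fill and copy_from are illustrative names, not part of my real data)
to_fill <- df_2[is.na(df_2$measure), ]
to_fill$copy_from <- to_fill$day_event + 7
to_fill
```

On the example this picks out rows 8, 15 and 16, which are exactly the NA rows of df_2 above.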
I tried:
for (i in seq_len(nrow(df_2))) {
    row <- df_2[i, ]
    # measure taken exactly 7 days after the event, if there is one
    last_measure <- df_2$measure[df_2$day_event == row$day_event & df_2$day_measure == row$day_event + 7]
    if (row$day_event + 7 < row$day_measure && length(last_measure) > 0) {
      row$measure <- last_measure
      df_2[i, ] <- row
    }
}
It worked :) But on my real data set, which is much larger, it takes forever :(
I think R doesn't like such loops! Can you think of another method?
Thanks for your help!
 
    