I have data that looks similar to the following except with hundreds of IDs and thousands of observations:
ID     date          measles
1      2008-09-12    1
1      2008-10-25    NA
1      2009-01-12    1
1      2009-03-12    NA
1      2009-05-12    1
2      2010-05-17    NA
2      2010-06-12    NA
2      2010-07-02    1
2      2010-08-13    NA
I want to create a variable that will store the previous date for each pid like the following:
ID     date          measles  previous_date
1      2008-09-12    1        NA
1      2008-10-25    NA       2008-09-12      
1      2009-01-12    1        2008-10-25
1      2009-03-12    NA       2009-01-12
1      2009-05-12    1        2009-03-12 
2      2010-05-17    NA       NA
2      2010-06-12    NA       2010-05-17
2      2010-07-02    1        2010-06-12
2      2010-08-13    NA       2010-07-02
This should be an extremely easy task, but I have been unsuccessful at getting a lag variable to work properly. I have tried a few methods, such as the following:
dt[, previous_date:=c(NA, current_date[-.N]), by=c("ID")]
dt[,previous_date:=current_date-shift(current_date,1,type="lag"),by=ID]
The code samples above either produce sporadic numbers in the previous_date variable or produce all NAs. I'm not sure why this is? Is it because I'm using a date variable as opposed to an integer?
Is there a better way to accomplish this task that would work for a date variable?
 
    