I am struggling with a particular data manipulation issue. I posted recently on this, and was suggested the na.locf function in the zoo library, but this is a homework question and the TA specified not to load other libraries to solve this, so I need to use a more traditional data manipulation approach. We are given the following dataframe, which I show the first 10 rows of:
my_df[1:10,1:2]
     DATE FLIGHT
1  May 26     NA
2  Jun 10     NA
3  Jun 21     NA
4              1
5              2
6              3
7  Jun 23     NA
8              1
9              2
10 Jun 25     NA
I need to drag values down in the DATE column where there are blanks (rows 4-6 all Jun 21, rows 8-9 both Jun 23, etc.). I understand how na.locf makes this a one liner, a shame I cannot use it.
My idea is as follows:
- grab the dates that need to be dragged down using: - repeat_dates = my_df$DATE[which(my_df$FLIGHT == 1)-1]
- create a vector with the number of times to repeat each date. here 3 corresponds to having to drag Jun 21 down 3 rows, 2 for dragging Jun 23 down 2 rows, etc. - repeat_count = [3, 2, ...]
- use rep.int(repeat_dates, repeat_count) to get the vector: - ["Jun 21", "Jun 21", "Jun 21", "Jun 23", "Jun 23", ...]
- and then update the values 
I am not sure how to do part 2, if I can get this then I can get the problem. My TA suggested using the cumsum() function, possibly on a vector of booleans (1s / 0s where the DATE is blank?). I'm struggling with this, any help is appreciated!
EDIT - reproducable to help
dput(my_df[1:20,1:2])
structure(list(DATE = structure(c(-23961, -23946, -23935, NA, 
NA, NA, -23933, NA, NA, -23931, -23911, -23893, NA, NA, -23891, 
NA, NA, -23890, NA, NA), class = "Date"), FLIGHT = c(NA, NA, 
NA, 1L, 2L, 3L, NA, 1L, 2L, NA, NA, NA, 1L, 2L, NA, 1L, 2L, NA, 
1L, 2L)), .Names = c("DATE", "FLIGHT"), row.names = c(NA, 20L
), class = "data.frame") 
 
    