I have a simple data frame as follows:
 hai_dispense_number hai_atc date_of_claim hai_age
1                  tom    A10A    2011-11-01      42
2                  tom    A10B    2011-11-01      42
3                  tom    G10R    2011-11-01      42
4                  tom    A10A    2012-02-02      42
5                  tom    A10A    2012-04-03      42
6                 mary    A10A    2012-02-02      36
7                 mary    A10A    2012-03-02      36
8                 mary    G123    2012-03-02      36
9                 mary    E123    2012-05-02      36
10                mary    T123    2012-07-02      36
11                mary    A10A    2012-08-02      43
12                 sue   GR123    2012-03-02      43
13                 sue   GR123    2012-03-08      43
14                 sue   GR123    2012-09-03      43
15                 sue   GR123    2012-10-01      43
16                 sue   GR123    2012-03-02      43
17                 pat    A10A    2012-01-02      52
18                 pat   GR123    2011-11-12      52
19                 pat    A10A    2012-03-03      52
20                 pat   GR123    2012-01-01      52
21                 pat    A10A    2012-05-06      52
I want to isolate people who had the code A10A only from 2012-01-01 onwards, and not in 2011. There are some people in the above dataframe who did not have the code A10A at any stage and so I'd like to get rid of them too. So in the end, I want to have data only for pat and mary. Here is how I do it:
Step 1: Assigns the "index date". In other words the first date in 2012 that someone got the code A10A
dt2<- data.table(dff,key=c('hai_dispense_number','date_of_claim'))
dt2[,date_of_claim := as.Date(date_of_claim)]
dt2[grepl('^A10A*?', as.character(dt2$hai_atc))& between(date_of_claim,as.Date("2012-01-01"),as.Date("2012-12-31")),
    index := as.character(min(date_of_claim))
    ,   by=c('hai_dispense_number','hai_atc')]  #
dt2$index<-as.Date(dt2$index, origin='1970-01-01')
dt2$hai_atc<-as.character(dt2$hai_atc)
Step 2: Creating the index date creates some NA values for rows that were not A10A or occurred before 2012-01. I need to fill in this NA value with each person's unique index date. This code usually works, but for reason with these data, I keep getting an error
dt2[, index := index[!is.na(index)][[1]], by=hai_dispense_number]  ##gives the index date for each person to each of their individual rows of data
Error in index[!is.na(index)][[1]] : subscript out of bounds
Steps 3, 4 and 5: I usually can follow on from here, and find out who had a code A10A in 2011 and then exclude them. But my process is being interrupted by the above error.
I looked at this page: Subscript out of bounds - general definition and solution? and tried running the tests they recommend: but it doesn't look like i have extra rows for the variable index. Perhaps I'm not testing properly.
If anyone can shed any light on why the above is happening I would be really grateful. Thank you!
 
     
    