I am using R to analyse a time series. My goal is to count from "response" the consecutive sequences. I want to add a column which classifies my data according to consecutive sequences in column response. Example: row 1 is group 1 for id "A", row 3 is group 2 for id "A", row 6 to 9 is group 3 for id "A". The result what I want is shown in "want_group". The data has the following structure:
"row"   "date"  "id"    "response"  "want_group"
1   2021-10-06  "A" 1   1
2   2021-10-07  "A" 0   0
3   2021-10-08  "A" 1   2
4   2021-10-09  "A" 0   0
5   2021-10-10  "A" 0   0
6   2021-10-11  "A" 1   3
7   2021-10-12  "A" 1   3
8   2021-10-13  "A" 1   3
9   2021-10-14  "A" 1   3
10  2021-10-15  "A" 0   0
11  2021-10-16  "A" 1   4
12  2021-10-17  "A" 0   0
13  2021-10-18  "A" 0   0
14  2021-10-06  "B" 0   0
15  2021-10-07  "B" 0   0
16  2021-10-08  "B" 0   0
17  2021-10-09  "B" 1   1
18  2021-10-10  "B" 1   1
19  2021-10-11  "B" 0   0
20  2021-10-12  "B" 0   0
21  2021-10-13  "B" 0   0
22  2021-10-14  "B" 0   0
23  2021-10-15  "B" 0   0
24  2021-10-16  "B" 1   2
25  2021-10-17  "B" 1   2
26  2021-10-18  "B" 1   2
My Idea was to group the dataframe and calculate the cumsum of variable response to have a similiar structure like in length of longest consecutive elements of sequence, so that I have for row 3 in cs_res=1 and for row 6 to 9 in cs_res=1,2,3,4. But the cumsum was calculated for the hole id. I hope you have a hint for me to find a function in R or how I can find a solution.
df1 <- data.frame(row = c(1:13),
                  date = seq.Date(as.Date("2021-10-06"), as.Date("2021-10-18"), "day"),
                  id = rep("A", times = 13),
                  response = c(1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0),
                  want_group = c(1, 0, 2, 0, 0, 3, 3, 3, 3, 0, 4, 0, 0) )
df2 <- data.frame(row = c(14:26),
                  date = seq.Date(as.Date("2021-10-06"), as.Date("2021-10-18"), "day"),
                  id = rep("B", times = 13),
                  response = c(0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1),
                  want_group = c(0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 2, 2, 2) ) 
df <- rbind(df1, df2)
df %>% 
  group_by(id, response) %>% 
  mutate(
    cs_res = if_else(response ==  1L, sequence(rle(response)$lengths), 0L) 
    )
"row"   "id"    "response"  "cs_res"
1   "A" 1   1
2   "A" 0   0
3   "A" 1   2
4   "A" 0   0
5   "A" 0   0
6   "A" 1   3
7   "A" 1   4
8   "A" 1   5
9   "A" 1   6
10  "A" 0   0
11  "A" 1   7
12  "A" 0   0
13  "A" 0   0
14  "B" 0   0
15  "B" 0   0
.
.
.
 
    