I am using dplyr::filter(StrawsReleased >= 1000) in my code.
The filter keeps only the rows where the cumulative StrawsReleased within each group is greater than or equal to 1000, so a ShortName that never reaches 1000 is dropped entirely. I would like to keep all ShortName values. I tried removing the filter from the pipeline, but that did not solve the problem.
Could someone help me?
ordered_data <- read.table(text= "ShortName  DiasColeta  StrawsReleased  Idade
BAUL  3  0  5
BAUL  6  0  5
BAUL  9  380  5
BAUL  25  90  5
BAUL  34  900  5
BAUL  68  1500  5
BAUL  90  900  5
BAUL  107  1500  5
JOUL  3  0  4
JOUL  9  0  4
JOUL  15  0  4
JOUL  29  1000  4
JOUL  35  1000  4
JOUL  45  2000  4
JOUL  67  0  4
JOUL  89  1000  4
JOUL  109  50  4", header = TRUE)
library(dplyr)
a2 <- ordered_data %>%
  mutate(StrawsReleased=cumsum(StrawsReleased), 
         Doses=(StrawsReleased >= 1000) + (StrawsReleased >= 3000) + 
           (StrawsReleased >= 5000), .by=ShortName) %>%
  filter(StrawsReleased >= 1000) %>%
  slice_head(by=c(ShortName, Doses)) %>%
  mutate(Doses=paste('Doses', c('1000', '3000', '5000')[Doses])) %>%
  select(ShortName, Doses, DiasColeta, Idade)
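One possible way to keep every ShortName, sketched under assumptions: instead of filtering, pair each row with each threshold and take the first DiasColeta at which the cumulative total reaches it, so a group that never gets there yields NA rather than disappearing. The data frame `toy`, the group names AAAA and ZZZZ, and the columns CumStraws and Threshold are made up for illustration. The sketch also assumes that Idade is constant within a ShortName and that dplyr 1.1.0 or later is installed (for `.by` and `cross_join()`).

```r
library(dplyr)

# Hypothetical toy data: ZZZZ never reaches a cumulative total of 1000
toy <- data.frame(
  ShortName      = c("AAAA", "AAAA", "AAAA", "ZZZZ", "ZZZZ"),
  DiasColeta     = c(3, 10, 20, 5, 15),
  StrawsReleased = c(500, 600, 2500, 100, 200),
  Idade          = c(5, 5, 5, 4, 4)
)

thresholds <- c(1000, 3000, 5000)

res <- toy %>%
  # running total of straws within each ShortName
  mutate(CumStraws = cumsum(StrawsReleased), .by = ShortName) %>%
  # pair every row with every threshold
  cross_join(data.frame(Threshold = thresholds)) %>%
  # first collection day at which the running total reaches the threshold;
  # subsetting an empty vector with [1] gives NA, so no group is lost
  summarise(
    DiasColeta = DiasColeta[CumStraws >= Threshold][1],
    .by = c(ShortName, Idade, Threshold)
  ) %>%
  mutate(Doses = paste("Doses", Threshold)) %>%
  select(ShortName, Doses, Idade, DiasColeta)

print(res)
```

With this approach each ShortName always contributes one row per threshold, and the rows with NA mark thresholds that were never reached. Whether NA is the right placeholder depends on how the later mean should treat those groups.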
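FYI: I am creating three "Doses" groups (1,000 / 3,000 / 5,000) from the cumulative StrawsReleased within each ShortName, keeping the DiasColeta and Idade at which each threshold is first reached.
After that, I would like to calculate the overall mean DiasColeta for each Doses group, based on the ShortName and Idade values.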
I am using:
a2 %>% 
  group_by(Doses, Idade) %>% 
  summarise(n=n(), 
            TempoParaProd=mean(DiasColeta))
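For illustration, a minimal sketch of this summary step on its own. The input `a2_expected` is a hypothetical data frame built by hand from the desired output listed in this question, and `resumo` is a made-up name. `na.rm = TRUE` is an assumption about how groups that never reach a threshold (NA in DiasColeta) should be treated.

```r
library(dplyr)

# Hypothetical input: values copied by hand from the desired output
a2_expected <- data.frame(
  ShortName  = rep(c("BAUL", "JOUL"), each = 3),
  Doses      = rep(c("Doses 1000", "Doses 3000", "Doses 5000"), times = 2),
  Idade      = rep(c(5, 4), each = 3),
  DiasColeta = c(34, 90, 107, 29, 45, 89)
)

resumo <- a2_expected %>%
  summarise(
    n             = sum(!is.na(DiasColeta)),      # animals that reached the threshold
    TempoParaProd = mean(DiasColeta, na.rm = TRUE), # mean days, ignoring NA
    .by = c(Doses, Idade)
  )

print(resumo)
```

Because the two ShortName values here have different Idade values, every Doses/Idade combination contains a single animal, so each mean simply equals that animal's DiasColeta.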
Desired output:
ShortName  Doses       Idade  DiasColeta
BAUL       Doses 1000  5      34
BAUL       Doses 3000  5      90
BAUL       Doses 5000  5      107
JOUL       Doses 1000  4      29
JOUL       Doses 3000  4      45
JOUL       Doses 5000  4      89
 
    