I have a column member_casual that can take up to 3 values. I want to display a histogram in percent for each of those values so that I can compare them. It's important that, in the percentage calculation, each count is divided by the number of rows with member_casual == value, and not by the total number of rows. I was able to do so with the following code:
library(dplyr)
library(ggplot2)

# One filtered data frame per member_casual value
dataCustomer <- tripDataFiles %>% 
  filter(member_casual == "Customer")
dataSubscriber <- tripDataFiles %>% 
  filter(member_casual == "Subscriber")
dataDependent <- tripDataFiles %>% 
  filter(member_casual == "Dependent")
# Each layer has its own data, so sum(count) is that layer's row count.
# after_stat() replaces the deprecated stat() (ggplot2 >= 3.3.0).
ggplot(dataCustomer, aes(x = tripduration, y = after_stat(count / sum(count)))) +
  geom_histogram(aes(fill = 'customer'), alpha = 0.5) +
  geom_histogram(data = dataSubscriber, aes(fill = 'subscriber'), alpha = 0.5) +
  geom_histogram(data = dataDependent, aes(fill = 'dependent'), alpha = 0.5) +
  scale_y_continuous(labels = scales::percent)
That gave me the following graph:
But I'm not satisfied with this code, since I have to add a line for each value of member_casual. If the set of member_casual values changes, I will have to rework this code.
Do you know a way to achieve the same result with code that doesn't rely on the specific member_casual values?
Thanks
EDIT:
The data comes from https://divvy-tripdata.s3.amazonaws.com/index.html 
It covers the years 2015 to 2017, which I reformatted to match the 2023 format:
tripDataFiles17 <- dataFileNames %>%
  grep(x = ., pattern = '2017', value = TRUE) %>% #Keep only the 2017 files
  grep(pattern = 'station',  x = ., ignore.case = TRUE, invert = TRUE, value = TRUE) %>% #Remove files on stations
  lapply(fread) %>% #Read data from the selected file
  rbindlist() %>%  #Merge data from selected file
  rename(
    started_at = start_time,
    ended_at = end_time
  ) %>% 
  mutate( #Convert datetime strings to datetimes
    started_at = parse_date_time(started_at, dateTimeFormat),
    ended_at = parse_date_time(ended_at, dateTimeFormat)
  )
tripDataFiles <- rbindlist( list(tripDataFiles15_16, tripDataFiles17)) %>%
  rename(
    ride_id = trip_id,
    start_station_id = from_station_id,
    start_station_name = from_station_name,
    end_station_id = to_station_id,
    end_station_name = to_station_name,
    member_casual = usertype
  )
dput(tripDataFiles[1:20, c("member_casual", "tripduration")])
structure(list(member_casual = c("Subscriber", "Customer", "Subscriber", 
"Customer", "Subscriber", "Subscriber", "Subscriber", "Subscriber", 
"Subscriber", "Customer", "Customer", "Customer", "Customer", 
"Subscriber", "Subscriber", "Subscriber", "Subscriber", "Subscriber", 
"Subscriber", "Subscriber"), tripduration = c(299L, 940L, 751L, 
1240L, 1292L, 175L, 930L, 383L, 260L, 1123L, 1167L, 231L, 1092L, 
585L, 401L, 177L, 653L, 303L, 223L, 353L)), row.names = c(NA, 
-20L), class = c("data.table", "data.frame"), ...)
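To make the normalisation I'm after concrete, here is a minimal base R sketch on the 20 sample rows above. It is only an illustration of the arithmetic, not the plot itself: the names sampleTrips, binEdges, binCounts and binShare and the 500-second bin width are made up for this example. Each bin count is divided by the number of rows in its own member_casual group, not by the overall row count.

```r
# Sample rows copied from the dput() output above
sampleTrips <- data.frame(
  member_casual = c("Subscriber", "Customer", "Subscriber", "Customer",
                    "Subscriber", "Subscriber", "Subscriber", "Subscriber",
                    "Subscriber", "Customer", "Customer", "Customer",
                    "Customer", "Subscriber", "Subscriber", "Subscriber",
                    "Subscriber", "Subscriber", "Subscriber", "Subscriber"),
  tripduration = c(299L, 940L, 751L, 1240L, 1292L, 175L, 930L, 383L, 260L,
                   1123L, 1167L, 231L, 1092L, 585L, 401L, 177L, 653L, 303L,
                   223L, 353L)
)

# Hypothetical bin edges, chosen only for this illustration
binEdges <- seq(0, 1500, by = 500)
sampleTrips$bin <- cut(sampleTrips$tripduration, breaks = binEdges)

# Rows = member_casual values, columns = duration bins
binCounts <- table(sampleTrips$member_casual, sampleTrips$bin)

# margin = 1 divides every row by its own total, i.e. by the number of
# rows having that member_casual value
binShare <- prop.table(binCounts, margin = 1)

print(round(100 * binShare, 1))
```

Nothing here names a specific member_casual value, which is the property I would like the ggplot version to have as well.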
 
     
    