df %>%
  group_by(variable1) %>%
  summarise(length = length(levels(df$variable2)))
group_by does not seem to work: I get the same result for every level of variable1.
We need to remove the df$ prefix. levels(df$variable2) returns the levels of the full column and ignores the grouping, so every group gets the same count. In addition, a factor keeps its unused levels within each group unless we drop them with droplevels.
df %>%
group_by(variable1)%>%
summarise(length=length(levels(droplevels(variable2))))
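To see why droplevels is needed, here is a minimal base R sketch. The vector f and the subset grp are hypothetical toy objects, not the asker's data; grp stands in for the values of variable2 that fall into one group.

```r
# Hypothetical toy factor with three levels
f <- factor(c("a", "b", "c", "a"))

# Subsetting keeps every level attached, even the ones no longer present
grp <- f[f == "a"]

length(levels(grp))              # 3: unused levels "b" and "c" are still there
length(levels(droplevels(grp)))  # 1: only the level that actually occurs
length(unique(grp))              # 1: counting distinct values sidesteps levels
```

Counting distinct values, as in the last line, does not depend on the levels attribute at all.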
Alternatively, instead of going through levels, we can use n_distinct, which counts the distinct values of variable2 within each group directly:
df %>%
group_by(variable1) %>%
summarise(length=n_distinct(variable2))
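set.seed(24)
# stringsAsFactors = TRUE keeps both columns as factors on R >= 4.0.0,
# where the data.frame default changed to FALSE (levels() needs a factor)
df <- data.frame(variable1 = sample(letters[1:3], 10, replace = TRUE),
                 variable2 = sample(letters[1:5], 10, replace = TRUE),
                 stringsAsFactors = TRUE)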