Suppose that we have the following data frame:
ID <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6, 6)
age <- c(25, 25, 25, 22, 22, 56, 56, 56, 80, 33, 33, 90, 90, 90, 5, 5, 5)
gender <- c("m", "m", NA, "f", "f", "m", NA, "m", "m", "m", NA, NA, NA, "m", NA, NA, NA)
company <- c("c1", "c2", "c2", "c3", "c3", "c1", "c1", "c1", "c1", "c5", "c5", "c3", "c4", "c5", "c3", "c1", "c1")
income <- c(1000, 1000, 1000, 500, 1700, 200, 200, 250, 500, 700, 700, 300, 350, 300, 500, 1700, 200)
df <- data.frame(ID, age, gender, company, income)
In this data we have 6 unique IDs, and if you look at the gender variable, sometimes in includes NA
I want to replace the NAs with the correct gender category. Also, in case an ID has all NA's for gender, then leave it as is.
The expected outcome would be:
