Hello, I have a data set like this:
Age  Sallary  
24   >50k  
17   <=50k  
31   >50k  
24   >50k  
I need to find the age that has the most >50k values in Sallary.
 
    
Going with akrun's table comment:
names(which.max(table(df)[, ">50k"]))
[1] "24"
table computes the cross-tabulation of these two columns. [, ">50k"] subsets to the column for the salary level you are looking for, and which.max then returns the position of the first element of that column that holds the maximum count. Finally, since each of these steps returns a named vector, names extracts the age (as a character string).
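To make the steps above easier to follow, here is the same one-liner split into its intermediate results. This is only a sketch on the example data; the names tab, high and idx are introduced here for illustration and are not part of the original answer.

```r
# Example data from the question, rebuilt with data.frame() for readability.
df <- data.frame(
  Age = c(24L, 17L, 31L, 24L),
  Sallary = factor(c(">50k", "<=50k", ">50k", ">50k"),
                   levels = c("<=50k", ">50k"))
)

tab  <- table(df)        # cross-tab: one row per age, one column per salary level
high <- tab[, ">50k"]    # named vector: count of ">50k" rows for each age
idx  <- which.max(high)  # position of the first maximum, still named by age

names(idx)               # the age as a character string
# [1] "24"
as.integer(names(idx))   # the same age as an integer, if a number is needed
# [1] 24
```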
If the data.frame has additional columns, you can replace table(df) with table(df$Age, df$Sallary) to select just these two variables.
So
names(which.max(table(df$Age, df$Sallary)[, ">50k"]))
[1] "24"
also works for the example dataset.
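One caveat: which.max returns only the first maximum, so if two ages are tied for the most >50k rows, only the smaller age is reported. The sketch below shows one way to return every tied age instead. The data frame df_tie is hypothetical: it is the example data with one extra 31 / >50k row added here purely to create a tie.

```r
# Hypothetical data: the example rows plus one extra (31, ">50k") row,
# added only to produce a tie between ages 24 and 31.
df_tie <- data.frame(
  Age = c(24L, 17L, 31L, 24L, 31L),
  Sallary = factor(c(">50k", "<=50k", ">50k", ">50k", ">50k"),
                   levels = c("<=50k", ">50k"))
)

high <- table(df_tie$Age, df_tie$Sallary)[, ">50k"]

names(which.max(high))           # first maximum only
# [1] "24"
names(high)[high == max(high)]   # every age sharing the maximum count
# [1] "24" "31"
```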
data
df <- 
structure(list(Age = c(24L, 17L, 31L, 24L), Sallary = structure(c(2L, 
1L, 2L, 2L), .Label = c("<=50k", ">50k"), class = "factor")), .Names = c("Age", 
"Sallary"), class = "data.frame", row.names = c(NA, -4L))
