I have a data.frame with several columns, all of which hold integer values. For example:
set.seed(1)
df <- data.frame(s1 = as.integer(runif(10, 0, 10)),
                 s2 = as.integer(runif(10, 0, 10)),
                 s3 = as.integer(runif(10, 0, 10)))
My question is how to efficiently add a column to this data.frame that holds, for each row, the name of the column with the maximum value. If two or more columns tie for the maximum in a row, the label should be NA.
My current approach works but is slow:
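df$max <- sapply(seq_len(nrow(df)), function(r) {
  # Positions of the columns whose value equals the maximum of row r
  max.idx <- which(df[r, ] == max(df[r, ]))
  if (length(max.idx) == 1) {
    max.label <- colnames(df)[max.idx]
  } else {
    # Two or more columns tie for the maximum
    max.label <- NA_character_
  }
  max.label
})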
> df
   s1 s2 s3  max
1   2  2  9   s3
2   3  1  2   s1
3   5  6  6 <NA>
4   9  3  1   s1
5   2  7  2   s2
6   8  4  3   s1
7   9  7  0   s1
8   6  9  3   s2
9   6  3  8   s3
10  0  7  3   s2
I'm looking for something faster that scales to a much larger data.frame.
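For reference, the direction I would expect a faster version to take is to compare the whole matrix at once instead of looping over rows. The following is only a sketch in base R, not a benchmarked solution. It assumes every compared column is numeric with no missing values, and the names cols, m, row.max, is.max, unique.max, hits and label are purely illustrative:

```r
set.seed(1)
df <- data.frame(s1 = as.integer(runif(10, 0, 10)),
                 s2 = as.integer(runif(10, 0, 10)),
                 s3 = as.integer(runif(10, 0, 10)))

# Columns to compare (hypothetical helper variable, so that an existing
# label column is never included in the comparison)
cols <- c("s1", "s2", "s3")
m <- as.matrix(df[cols])

# Per-row maximum, computed across columns in one vectorized call
row.max <- do.call(pmax, df[cols])

# TRUE where a cell equals its row's maximum; row.max is recycled down
# each column, so element i is always compared against row i
is.max <- m == row.max

# Rows where exactly one column reaches the maximum (no tie)
unique.max <- rowSums(is.max) == 1

# (row, col) positions of the maxima, restricted to rows without ties
hits <- which(is.max & unique.max, arr.ind = TRUE)

# Start with NA everywhere and fill in labels only for the untied rows
label <- rep(NA_character_, nrow(m))
label[hits[, "row"]] <- colnames(m)[hits[, "col"]]

df$max <- label
```

On the example data this should reproduce the labels from the slow version above, with NA for row 3, but I have not checked how it behaves on data with NA values or non-numeric columns.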