In the example below, the indices returned by the order function are used to sort the entries in each group by the sum of their two scores:
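set.seed(123)
ex.df <- data.frame(
  group  = sample(LETTERS[1:4], 20, replace = TRUE),
  score1 = sample(1:10),  # 10 values, recycled across the 20 rows
  score2 = sample(1:10)   # 10 values, recycled across the 20 rows
)
# For each group, order the rows of that group's subset by score1 + score2
sortedOrderings <- by(ex.df, ex.df$group, function(df) order(df$score1 + df$score2))
# Keep the first (lowest-sum) index from each group's ordering
bestIndices <- lapply(sortedOrderings, FUN = function(lst) lst[1])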
The trouble is that order only sees the subset of the data frame that by passes to it, so the indices it returns are positions within each group rather than row numbers in ex.df. Using them to extract rows from ex.df therefore picks the wrong rows:
> print(sortedOrderings)
ex.df$group: A
[1] 2 3 4 1
---------------------------------------------------------------
ex.df$group: B
[1] 5 3 2 4 1
---------------------------------------------------------------
ex.df$group: C
[1] 2 1 3 4
---------------------------------------------------------------
ex.df$group: D
[1] 3 7 4 6 1 2 5
> print(ex.df[bestIndices,])
group score1 score2
2 D 7 9
5 D 4 1
2.1 D 7 9
3 B 6 6
Is there a way to pull out the "best" row from each group in ex.df, or at least have the indices reference ex.df?
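For illustration, here is a minimal sketch of one way the indices could be made to refer to ex.df. It assumes "best" means the lowest score1 + score2, matching the use of order(...)[1] above. The names rowsByGroup, bestRows and best.df are hypothetical and not part of the original code. The idea is to split the row numbers of ex.df by group instead of splitting the rows themselves, so each local position can be translated back to a row number:

```r
set.seed(123)
ex.df <- data.frame(
  group  = sample(LETTERS[1:4], 20, replace = TRUE),
  score1 = sample(1:10),
  score2 = sample(1:10)
)

# Split the row numbers of ex.df (not the rows) by group.
rowsByGroup <- split(seq_len(nrow(ex.df)), ex.df$group)

# order() still returns positions local to each group, but indexing
# `rows` with them converts those positions back to row numbers of ex.df.
bestRows <- sapply(rowsByGroup, function(rows) {
  total <- ex.df$score1[rows] + ex.df$score2[rows]
  rows[order(total)[1]]
})

# One row per group, taken directly from ex.df.
best.df <- ex.df[bestRows, ]
print(best.df)
```

Ties are resolved in favour of whichever tied row appears first in ex.df, since order is stable by default.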