I have a data frame like this:
probe.id       gene.name   variance       database
A_23_P100002   FAM174B     0.93285966     Database1
A_23_P100013   AP3S2       0.48936044     Database1
...
A_23_P100020   RBPMS2      0.77441359     Database2
A_23_P100072   AVEN        0.36194383     Database2
...
I would like to reduce this data frame so that, for each database, only the 100 genes with the highest variance remain. It seems that aggregate could do the job, but I don't know how to write the function that I would pass to it. I would greatly appreciate any help.
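To make the goal concrete, here is a small sketch of the result I am after. It is only an illustration: the data frame below is an invented stand-in for my real data (the probe IDs, gene names, and variances are made up), the grouping column is assumed to be named database, and n_top is a hypothetical variable set to 2 so the output is easy to check (it would be 100 for the real data). It uses split and lapply rather than aggregate, so it may not be the best approach.

```r
# Toy stand-in for the real data; every value here is invented for illustration.
df <- data.frame(
  probe.id  = paste0("probe_", 1:8),
  gene.name = paste0("gene_", 1:8),
  variance  = c(0.9, 0.4, 0.7, 0.1, 0.8, 0.3, 0.6, 0.2),
  database  = rep(c("Database1", "Database2"), each = 4),
  stringsAsFactors = FALSE
)

n_top <- 2  # hypothetical; would be 100 for the real data

# Split by database, sort each piece by decreasing variance,
# and keep at most the first n_top rows of each piece.
pieces <- lapply(split(df, df$database), function(d) {
  head(d[order(d$variance, decreasing = TRUE), ], n_top)
})

# Stitch the per-database pieces back into a single data frame.
top <- do.call(rbind, pieces)
rownames(top) <- NULL
print(top)
```

With the toy data this keeps two rows per database. A database with fewer than n_top rows would simply keep all of its rows, since head returns whatever is available.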
Thank you!