I'm trying to get a sense of how to write Rcpp summarise functions that are fast when used with dplyr. My motivation is a function that dplyr does not seem to have an equivalent for; however, for the sake of simplicity, I'll use the example of just taking the last element of a vector.
In the code below, I consider three different functions that get the last element of a vector (dplyr's own last, a plain R function last_r, and an Rcpp function last_rcpp) and apply each of them using both tapply and dplyr group_by/summarise.
library(dplyr)
library(microbenchmark)
library(Rcpp)
n <- 5000
df <- data.frame(grp = factor(rep(1:n, 2)), valn = rnorm(2L*n), stringsAsFactors = FALSE)
dplyr_num_last_element <- function() df %>% group_by(grp) %>% summarise(valn = last(valn))
dplyr_num_last_element_r <- function() df %>% group_by(grp) %>% summarise(valn = last_r(valn))
dplyr_num_last_element_rcpp <- function() df %>% group_by(grp) %>% summarise(valn = last_rcpp(valn))
tapply_num_last_element <- function() tapply(df$valn, df$grp, FUN = last)
tapply_num_last_element_r <- function() tapply(df$valn, df$grp, FUN = last_r)
tapply_num_last_element_rcpp <- function() tapply(df$valn, df$grp, FUN = last_rcpp)
last_r <- function(x) {
  # index by length to return the last element (x[1] would return the first)
  x[length(x)]
}
cppFunction('double last_rcpp(NumericVector x) {
             int n = x.size();
             return x[n-1];
           }')
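As a quick sanity check before timing anything, the three approaches should agree on a small input. This is only a minimal sketch: x_test, last_r_check and last_rcpp_check are illustrative names that are not part of the benchmark, and the block assumes dplyr, Rcpp and a working C++ compiler are available.

```r
library(dplyr)
library(Rcpp)

# Hypothetical test vector, not part of the benchmark data
x_test <- c(1.5, 2.5, 3.5)

# Plain R version: index by length to get the last element
last_r_check <- function(x) x[length(x)]

# Rcpp version with the same body as last_rcpp, compiled under a separate name
# (note: it assumes a non-empty vector, since x[n - 1] is not bounds-checked)
cppFunction('double last_rcpp_check(NumericVector x) {
               int n = x.size();
               return x[n - 1];
             }')

# All three should return the final element of x_test
stopifnot(
  identical(dplyr::last(x_test), 3.5),
  identical(last_r_check(x_test), 3.5),
  identical(last_rcpp_check(x_test), 3.5)
)
```

If stopifnot() returns silently, the three functions agree on this input and the timings compare like with like.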
microbenchmark(dplyr_num_last_element(),
               dplyr_num_last_element_r(),
               dplyr_num_last_element_rcpp(),
               tapply_num_last_element(),
               tapply_num_last_element_r(),
               tapply_num_last_element_rcpp(),
               times = 10)
Unit: milliseconds
                           expr        min         lq       mean     median         uq       max neval
       dplyr_num_last_element()   6.895850   7.088472   8.264270   7.766421   9.089424  11.00775    10
     dplyr_num_last_element_r() 205.375404 214.481520 220.995218 220.107130 225.971179 238.62544    10
  dplyr_num_last_element_rcpp() 211.593443 216.000009 222.247786 221.984289 228.801007 230.50220    10
      tapply_num_last_element()  97.082102  99.528712 101.955668 101.717887 104.370319 109.26982    10
    tapply_num_last_element_r()   6.101055   6.550065   7.386442   7.069754   7.589164   9.98025    10
 tapply_num_last_element_rcpp()  14.173171  15.145711  16.102816  15.400562  16.053229  22.00147    10
My general questions are:
1) Why does dplyr_num_last_element_r take about 220 ms on average, while tapply_num_last_element_r takes about 7 ms?
2) Is there any way to write my own last function for use with dplyr that runs in something closer to 7 ms?
Thanks!