As I understand it, data.table is usually more efficient and faster than dplyr, but today at work I ran into the opposite situation. I created a small simulation to reproduce it.
library(data.table)
library(dplyr)
library(microbenchmark)
# simulated data
dt = data.table(A = sample(1:4247, 10000, replace = TRUE),
                B = sample(1:119,  10000, replace = TRUE),
                C = sample(1:6,    10000, replace = TRUE),
                D = sample(1:30,   10000, replace = TRUE))
dt[, ID := paste(A, ":::",
                 D, ":::",
                 C)]
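For context, with 4247 * 119 * 6 possible (A, B, C) combinations and only 10,000 rows, almost every row forms its own group, so the grouped call below runs over close to 10,000 tiny groups. A quick cardinality check (a diagnostic I added, not part of my original job code):

# number of distinct (A, B, C) combinations -- close to 10,000 here,
# i.e. nearly one group per row
uniqueN(dt, by = c("A", "B", "C"))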
# execution time
microbenchmark(
  DATA_TABLE = dt[, .(count = uniqueN(ID)), by = c("A", "B", "C")],
  DPLYR      = dt %>%
               group_by(A, B, C) %>%
               summarise(count = n_distinct(ID)),
  times = 10
)
Results
Unit: milliseconds
       expr         min          lq        mean      median          uq         max neval
 DATA_TABLE 14241.57361 14305.67026 15585.80472 14651.16402 16244.22477 21367.56866    10
      DPLYR    35.95123    37.63894    47.62637    48.56598    53.59919    62.63978    10
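While investigating, I also re-ran the data.table call with verbose output (a diagnostic I added) to see whether the grouped uniqueN() is handled by data.table's internal GForce optimisation or evaluated once per group:

# verbose = TRUE prints data.table's query plan, including whether
# GForce optimisation applies to the grouped uniqueN() call
dt[, .(count = uniqueN(ID)), by = c("A", "B", "C"), verbose = TRUE]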
You can see the big difference! Does anyone know the reason? Do you have any advice about when to use dplyr vs. data.table?
My full code is written in data.table syntax, and now I don't know whether I need to translate some chunks to dplyr because of this. In the meantime, one workaround I am considering is sketched below.
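A sketch of that workaround, assuming the slow part is evaluating uniqueN() once for each of the ~10,000 groups: deduplicate on (A, B, C, ID) first, then count rows per group with the optimised .N. I have not benchmarked this carefully:

# drop duplicate (A, B, C, ID) rows first, then a plain .N per group
# gives the same count as uniqueN(ID) within each (A, B, C)
unique(dt, by = c("A", "B", "C", "ID"))[, .(count = .N), by = c("A", "B", "C")]

If that is correct, it would let me keep the rest of the code in data.table syntax.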
Thanks in advance.