I want to perform several operations that mix dtplyr and data.table code. My question is whether, with dtplyr loaded, I can apply dplyr verbs directly to a data.table object and get optimized data.table code, as I would with a lazy_dt.
Below are some examples. In each case, does dtplyr translate the pipeline to data.table code, or is plain dplyr doing the work?
# Setup for all chunks:
library(dplyr)
library(data.table)
library(dtplyr)
a) setDT
dataframe # class data.frame
setDT(dataframe)
dataframe %>%
  group_by(id) %>%
  mutate(rows_per_group = n())
b) data.table object
dt <- as.data.table(dataframe) # or dt <- data.table::fread(filepath)
dt %>%
  group_by(id) %>%
  mutate(rows_per_group = n())
Also, if all of them do trigger dtplyr, which is the most efficient option among a), b), and c) using lazy_dt(dataframe)?
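For reference, here is a minimal sketch of option c), together with one way I could check which backend handled a pipeline. The toy `dataframe` and its columns are made up for illustration. The check assumes that a dtplyr-translated pipeline is a lazy object inheriting from `dtplyr_step`, which is the case for pipelines started with `lazy_dt()`. Whether a) and b) also produce such an object may depend on the installed dtplyr version, so the sketch only prints their class and claims nothing about it.

```r
library(dplyr)
library(data.table)
library(dtplyr)

# Hypothetical toy data standing in for `dataframe`
dataframe <- data.frame(id = c(1, 1, 2), x = c(10, 20, 30))

# c) explicit lazy_dt: the pipeline is recorded lazily, not executed yet
res_lazy <- lazy_dt(dataframe) %>%
  group_by(id) %>%
  mutate(rows_per_group = n())

class(res_lazy)       # expected to include "dtplyr_step"
show_query(res_lazy)  # prints the data.table call dtplyr generated

# Nothing is computed until the result is collected
out <- as.data.table(res_lazy)

# b) plain data.table: inspect the class to see what handled the verbs
dt <- as.data.table(dataframe)
res_dt <- dt %>%
  group_by(id) %>%
  mutate(rows_per_group = n())

class(res_dt)
inherits(res_dt, "dtplyr_step")  # TRUE would mean dtplyr translated it
```

If `inherits()` returns FALSE for b), the result was presumably computed by dplyr's data.frame methods rather than translated to data.table code.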