I'm trying to write a function which takes a data.table, a list of columns and a list of values and selects rows such that each column is filtered by the respective value.
So, given the following data.table:
> set.seed(1)
> dt = data.table(sample(1:5, 10, replace = TRUE),
sample(1:5, 10, replace = TRUE),
sample(1:5, 10, replace = TRUE))
> dt
V1 V2 V3
1: 1 5 5
2: 4 5 2
3: 1 2 2
4: 2 2 1
5: 5 1 4
6: 3 5 1
7: 2 5 4
8: 3 1 3
9: 3 1 2
10: 1 5 2
A call to filterDT(dt, c(V1, V3), c(1, 2)) would select the rows where V1 = 1 and V3 = 2 (rows 3 and 10 above).
My best thought was to use .SD and .SDcols to stand in for the desired columns and then do a comparison within i (from dt[i,j,by]):
> filterDT <- function(dt, columns, values) {
dt[.SD == values, , .SDcols = columns]
}
> filterDT(dt, c("V1", "V3"), c(1, 2))
Empty data.table (0 rows and 3 cols): V1,V2,V3
Unfortunately, this doesn't work, even if only filtering by one column.
I've noticed all examples of .SD I've found online use it in j, which tells me I'm probably doing something very wrong.
Any suggestions?