I can conduct a matched copy of a column in a data frame into another by something like
DF2$y <- DF1[match(DF2$id2, DF1$id1), "z"]  # DF1 and DF2 are data frames
where DF2$id2 is matched to DF1$id1. I would like to know what I can do with data tables for this kind of operation. My data tables have millions of rows and hundreds of columns. I've done setkey(DT1, id1) and setkey(DT2, id2).
This works:
DT2[, y := DT1[match(DT2$id2, DT1$id1), "z"]]  # DT1 and DT2 are data tables
but I am afraid that the match part might take long than necessary. (Or is it inevitable?)
I understand that I can also use column selection, merge, and renaming:
tmp <- DT1[, c("id1", "z")]  # column selection
DT3 <- merge(DT2, tmp, by.x = "id2", by.y = "id1", all.x = TRUE, suffixes = c("", ".y")) # merge
setnames(DT3, "z.y", "y")  # rename
(where the first two lines can be written on one line) but this seems a bit too complicated. Would there be a simpler and fast solution?
Thanks.
Example:
library(data.table)
DF1 <- data.frame(id1=2:4, x=LETTERS[1:3], z=11:13)
DF2 <- data.frame(id2=1:4, x=LETTERS[5:8], z=21:24)
DF1
#   id1 x  z
# 1   2 A 11
# 2   3 B 12
# 3   4 C 13
DF2
#   id2 x  z
# 1   1 E 21
# 2   2 F 22
# 3   3 G 23
# 4   4 H 24
DT1 <- data.table(DF1)
DT2 <- data.table(DF2)
setkey(DT1, id1)
setkey(DT2, id2)
DF2$y <- DF1[match(DF2$id2, DF1$id1), "z"]
DF2  # correct
#   id2 x  z  y
# 1   1 E 21 NA
# 2   2 F 22 11
# 3   3 G 23 12
# 4   4 H 24 13
DT2[, y := DT1[match(DT2$id2, DT1$id1), "z"]]
DT2
#    id2 x  z  y
# 1:   1 E 21 NA
# 2:   2 F 22 11
# 3:   3 G 23 12
# 4:   4 H 24 13
DT2[, y := NULL]
tmp <- DT1[, c("id1", "z")]
DT3 <- merge(DT2, tmp, by.x = "id2", by.y = "id1", all.x = TRUE, suffixes = c("", ".y"))
setnames(DT3, "z.y", "y")
DT3
#    id2 x  z  y
# 1:   1 E 21 NA
# 2:   2 F 22 11
# 3:   3 G 23 12
# 4:   4 H 24 13
## Simpler alternatives?
 
     
    