I am trying to write to a subset of rows of a data.table by reference so that I can handle the training, testing, and excluded rows of a model's data separately.
However, when I define this subset of rows and then write to it, the reference to the original table is broken without any warning. Conceptually, I know that this works:
library('data.table')
a <- data.table(a1=c(0,1), a2=c(2,3))
a
#    a1 a2
# 1:  0  2
# 2:  1  3
b <- a
b[,b1:=4]
b
#    a1 a2 b1
# 1:  0  2  4
# 2:  1  3  4
a
#    a1 a2 b1
# 1:  0  2  4
# 2:  1  3  4
But what I am trying to do is something like:
a <- data.table(a1=c(0,1), a2=c(2,3))
a
b <- a[1,]
b
#    a1 a2
# 1:  0  2
b[,b1:=4]
b
#    a1 a2 b1
# 1:  0  2  4
a
#    a1 a2
# 1:  0  2
# 2:  1  3
# What I would really like is
#>a
#    a1 a2 b1
# 1:  0  2  4
# 2:  1  3 NA
I am having a hard time reconciling this behavior with the explanation here, which suggests that data.table's := assignment shouldn't break the reference the way <- would.
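For comparison, here is a minimal sketch of the two forms side by side. It assumes that a[1,] returns a new data.table holding a copy of the selected rows, and that putting the row index in i together with := in j inside a's own brackets assigns into those rows by reference. The name train_rows is hypothetical and only stands in for whatever row index is being written to.

```r
library(data.table)

a <- data.table(a1 = c(0, 1), a2 = c(2, 3))

# Subsetting rows with a[1, ] produces a separate table, so := on the
# result adds the column to b only and never reaches back into a.
b <- a[1, ]
b[, b1 := 4]
"b1" %in% names(a)   # FALSE

# Sub-assignment inside a's own [ ]: the row index goes in i and := in j.
# The new column is added to a by reference; rows not selected by i get NA.
train_rows <- 1L     # hypothetical index of the rows to write to
a[train_rows, b1 := 4]
a$b1                 # 4 NA
```

With this form no separate subset object is needed; the row index itself carries the information about which rows are training, testing, or excluded.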
I have a key for every row, so merging the scores back is not a big deal. I'm just curious whether there is a way to pass the subset by reference. Basically, I am trying to run createDataPartition() around some excluded rows and finding the bookkeeping kind of annoying.
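The merge-back I have in mind could be sketched as an update join, assuming a unique key column. The column names id, x, and score and the chosen rows are hypothetical and used only for illustration.

```r
library(data.table)

# Hypothetical full table with a unique key column 'id'.
full <- data.table(id = 1:4, x = c(10, 20, 30, 40))

# Work on a copy of the non-excluded rows (here rows 1 and 3) and score them.
part <- full[c(1L, 3L)]
part[, score := c(0.9, 0.4)]

# Update join: match rows on id and write part's score into full by reference.
# The i. prefix refers to the column coming from part; unmatched rows get NA.
full[part, on = "id", score := i.score]
full$score   # 0.9 NA 0.4 NA
```

This keeps the bookkeeping down to one line per write-back, at the cost of carrying the key column through every partition.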
 
    