I'm trying to define a new variable in a data.table through a merge. The wrinkle is that I need `.N` to define the new variable by group, so I'd like to use `by` as well, but this causes an error.
MRE:
library(data.table)
dt1 <- data.table(pd = rep(1:2, each = 3), rnk = rep(1:3, 2),
                  var = c(3:1, 1:3), key = "pd")
dt2 <- data.table(pd = c(1, 2), chk = c(2, 2), key = "pd")
dt1[dt2, new := var[.N] > i.chk, by = pd]
As you can see, I'd like to define `new` to be TRUE whenever the value of `var` at the highest `rnk` within each `pd` exceeds `chk` (2 for both groups here). The code above seems natural enough to me, but it results in an error: object 'i.chk' not found (suggesting the merge has not been completed, as the namespace of dt2 appears unavailable).
I can get around this with a second step:
> dt1[dt2, new := var > i.chk][, new := new[.N], by = pd][]
   pd rnk var   new
1:  1   1   3 FALSE
2:  1   2   2 FALSE
3:  1   3   1 FALSE
4:  2   1   1  TRUE
5:  2   2   2  TRUE
6:  2   3   3  TRUE
However, this slows down my code substantially since I'm using `:=` to update around 6 such columns.
Is there no way to update by reference by group when merging?
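One thing that may be worth trying, assuming data.table 1.9.4 or later: `by = .EACHI` evaluates `j` once per row of `dt2`, so `.N` and `var` should refer to the matching rows of `dt1` while `i.chk` stays in scope. A minimal sketch on the MRE data (rebuilt here so the block runs on its own), not benchmarked on the real data:

```r
library(data.table)

# Same toy tables as in the MRE above.
dt1 <- data.table(pd = rep(1:2, each = 3), rnk = rep(1:3, 2),
                  var = c(3:1, 1:3), key = "pd")
dt2 <- data.table(pd = c(1, 2), chk = c(2, 2), key = "pd")

# by = .EACHI groups by each row of dt2. Within a group, var[.N] is the var of
# the last matching row of dt1 (the highest rnk, assuming rows are ordered by
# rnk within pd), and the single logical result is recycled over the group.
dt1[dt2, new := var[.N] > i.chk, by = .EACHI]
dt1[]
```

On this data it should give the same `new` column as the two-step version. Several columns could presumably be assigned in the same call with the functional form `` `:=`(new1 = ..., new2 = ...) `` so the join is not repeated, though whether that is faster in practice is untested.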