I have a question which drives me crazy and really need your help. The simplified question is this:
d1<-data.table(v1=c("a","b","c","d","d","b","a","c","a","d","b","a"),
                    v2=(seq(1:12)),V3=rep(1:4,times=3))
d2<-data.table(v1=c("a","b","c","d"),v3=c(3,2,1,4),v4=c("y","x","t","e"))
This will yield two data sets:
    D1:     
    v1 v2 V3
 1:  a  1  1
 2:  b  2  2
 3:  c  3  3
 4:  d  4  4
 5:  d  5  1
 6:  b  6  2
 7:  a  7  3
 8:  c  8  4
 9:  a  9  1
10:  d 10  2
11:  b 11  3
12:  a 12  4
> d2
   v1 v3 v4
1:  a  3  y
2:  b  2  x
3:  c  1  t
4:  d  4  e
As you could see that the elements in v1 and v3 is the same. Now I want to joint both data set by creating a new column in the D1 which return the value of V4 in d2 that match both indices v1 and v3, I hope I could get output looking like this:
>
 d3
    v1 v2 V3 V4
 1:  a  1  1 na
 2:  b  2  2  x
 3:  c  3  3 na
 4:  d  4  4  e
 5:  d  5  1 na
 6:  b  6  2  x
 7:  a  7  3  y
 8:  c  8  4 na
 9:  a  9  1 na
10:  d 10  2 na
11:  b 11  3 na
12:  a 12  4 na
The size of actual data I am using is relatively very large. It is something like joint 113MB data with 23MB. I tried to use for loop to do this problem by because the data is so long, it takes ages to finish the task. I also tried mergeand sqldf but both of them failed to finish the job. Could you please help me with this problem? Thank you very much!