I'm trying to merge two datasets:
Dataset 1 columns:
id, month, year, postal
Dataset 2 columns:
id, month, year, postal, Income, name, division
Dataset 1 (sample rows):
id  year  month  postal
1   2010  9      j0r1h0
2   2010  8      j0r1h0
....
....
7   2007  6      j3x4p2
Dataset 2 (sample rows):
id  year  month  postal  name  division
1   2010  9      j0r1h0  john  starting
2   2010  8      j0r1h0  lili  retired
I want to keep all the columns and rows of dataset 1 and add the extra columns from dataset 2, such as Income and division.
I get the wrong result, with duplicated month and year entries, when I try:
merge(a, b, by = c("postal", "month", "year"), all.x = TRUE)
This is my expected result:
id year month postal name division
1   2010 9     j0r1h0 john  starting
2   2010 8     j0r1h0 lili  retired
3   2010 7     j1v3c4 verna starting
4   2009 1     j23c5  Greg  medium
5   2007 1     j2j4d3 Greg  medium
6   2008 2     j2p4s3  na   na
7   2007 6     j3x4p2  na   starting
And this is the result I actually get:
id year month postal name division
1   2010 9     j0r1h0 john  starting
2   2010 8     j0r1h0 lili  retired
3   2010 8     j0r1h0  na   na
4   2010 7      na     na   na
5   2010 7     j1v3c4 verna starting
6   2009 1     j23c5  Greg  medium
7   2007 1     j2j4d3 Greg  medium
8   2008 2     j2p4s3  na   na
9   2007 6     j3x4p2  na   starting
9   2007 1     j3x4p2  na   starting
My real dataset is much larger: over 200000 x 16.
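For reference, here is a minimal sketch of the left join I am after. It uses only the sample rows shown above. The names b_unique, keys and res are made up for illustration. It assumes the extra rows come from dataset 2 having more than one row per (postal, month, year) combination, which is a guess and not something I have confirmed on the real data.

```r
# Toy data built only from the sample rows shown above.
# 'a' and 'b' are the data frame names used in the merge() call.
a <- data.frame(
  id     = c(1, 2, 7),
  year   = c(2010, 2010, 2007),
  month  = c(9, 8, 6),
  postal = c("j0r1h0", "j0r1h0", "j3x4p2"),
  stringsAsFactors = FALSE
)
b <- data.frame(
  id       = c(1, 2),
  year     = c(2010, 2010),
  month    = c(9, 8),
  postal   = c("j0r1h0", "j0r1h0"),
  name     = c("john", "lili"),
  division = c("starting", "retired"),
  stringsAsFactors = FALSE
)

# Assumption: the join key is (postal, month, year).
keys <- c("postal", "month", "year")

# Drop 'id' from b so it does not come back as id.x / id.y, and keep only
# the first row of b per key so one row of a cannot match several rows of b.
b_unique <- b[!duplicated(b[, keys]), setdiff(names(b), "id")]

# Left join: all.x = TRUE is an argument of merge(), not an element of 'by'.
res <- merge(a, b_unique, by = keys, all.x = TRUE)

# Restore the original row and column order.
res <- res[order(res$id), c("id", "year", "month", "postal", "name", "division")]
print(res)
```

With this toy data every row of a is kept once, and rows with no match in b get NA in name and division. Whether keeping only the first duplicate in b is acceptable depends on the real data.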
