I am running into the following issue. I want to merge two dataframes on multiple columns(11 to be exact). Surprisingly the usual methods do not work. Example dataframes are as follows: df1:
c1  c2  c3  c4  c5  c6  event_count
1   2   a   ff  0   1   5
1   2   b   fg  1   3   6
1   2   c   hg  2   4   20
1   2   d   gf  0   1   7
1   2   e   fg  1   4   1
df2:
c1  c2  c3  c4  c5  c6  event_type  event_price
1   2   a   ff  0   1   a           20
1   2   b   fg  1   3   a           20
1   2   c   hg  2   4   b           30
1   2   d   gf  0   1   b           40
1   2   e   fg  1   4   b           50
I tried
result = pd.merge(df1, df2, how='outer', on = ['c1','c2','c3','c4','c5','c6'])
and also this should technically work:
result = pd.merge(df1, df2, how='outer')
But I get this:
c1  c2  c3  c4  c5  c6  event_count event_type  event_price
1   2   a   ff  0   1   5       
1   2   b   fg  1   3   6       
1   2   c   hg  2   4   20      
1   2   d   gf  0   1   7       
1   2   e   fg  1   4   1       
1   2   a   ff  0   1               a           20
1   2   b   fg  1   3               a           20
1   2   c   hg  2   4               b           30
1   2   d   gf  0   1               b           40
1   2   e   fg  1   4               b           50
When I want to get this:
c1  c2  c3  c4  c5  c6  event_count event_type  event_price
1   2   a   ff  0   1   5            a          20
1   2   b   fg  1   3   6            a          20
1   2   c   hg  2   4   20           b          30
1   2   d   gf  0   1   7            b          40
1   2   e   fg  1   4   1            b          50
I also tried with indexes:
df1_index = df1.set_index(['c1', 'c2','c3'....]); df2_index =df2.set_index(['c1', 'c2','c3'....]); result = pd.concat([df1_index, df2_index], axis=1);
But that gave the same wrong result. I made sure the dtypes of the two tables are the same as well. I am not sure what else to try. Any advice? Thanks :)
 
     
    