Here is my dataframe:
import pandas as pd

df = pd.DataFrame(
    {"mat": ['A', 'A', 'A', 'A', 'B'],
     "ppl": ['P', 'P', 'P', '',  'P'],
     "ia1": ['',  'X', 'X', '',  'X'],
     "ia2": ['X', '',  '',  'X', 'X']},
    index=[1, 2, 3, 4, 5])
I want to select the unique combinations of values in the first two columns. I do:
df2 = df.loc[:,['mat','ppl']].drop_duplicates(subset=['mat','ppl']).sort_values(by=['mat','ppl'])
I get, as expected:
  mat ppl
4   A    
1   A   P
5   B   P
What I want now is for df3 to be:
 mat ppl ia1 ia2
   A           X
   A   P   X   X
   B   P   X   X
That is, in df3 the row for A+P should have an X in column ia1 because at least one of the A+P rows of df has an X in column ia1.
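
For illustration, here is a minimal sketch of one way the aggregation described above could be done. It assumes every cell in ia1 and ia2 is either '' or 'X', as in the sample data. The helper name any_x is made up for this example and is not part of pandas.

```python
import pandas as pd

df = pd.DataFrame(
    {"mat": ['A', 'A', 'A', 'A', 'B'],
     "ppl": ['P', 'P', 'P', '',  'P'],
     "ia1": ['',  'X', 'X', '',  'X'],
     "ia2": ['X', '',  '',  'X', 'X']},
    index=[1, 2, 3, 4, 5])


def any_x(col):
    # Hypothetical helper: 'X' if any row of the group has an 'X', else ''.
    return 'X' if (col == 'X').any() else ''


# Group on the two key columns (groupby sorts the keys by default),
# collapse each flag column with any_x, then turn the keys back into columns.
df3 = (df.groupby(['mat', 'ppl'])[['ia1', 'ia2']]
         .agg(any_x)
         .reset_index())

print(df3)
```

This produces one row per mat/ppl pair, so the separate drop_duplicates step is not needed on this path. Since '' sorts before 'X', calling .max() in place of .agg(any_x) would probably give the same result under the same assumption.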
 
    