With the following dataframe:
import pandas as pd
df = pd.DataFrame(data=[[1,5179530,'rs10799170',8.1548,'E001'], [1,5179530,'rs10799170',8.1548,'E002'], [1,5179530,'rs10799170',8.1548,'E003'], [1,455521,'rs235884',2.584,'E003'], [1,455521,'rs235884',2.584,'E007']], columns=['CHR','BP','SNP','CM','ANNOT'])
   CHR       BP         SNP      CM ANNOT
0    1  5179530  rs10799170  8.1548  E001
1    1  5179530  rs10799170  8.1548  E002
2    1  5179530  rs10799170  8.1548  E003
3    1   455521    rs235884  2.5840  E003
4    1   455521    rs235884  2.5840  E007
I would like to obtain one row per SNP, with a 0/1 indicator column for each ANNOT value:
   CHR       BP         SNP      CM  E001  E002  E003  E007
0    1  5179530  rs10799170  8.1548     1     1     1     0  
1    1   455521    rs235884  2.5840     0     0     1     1
I tried groupby() and get_dummies() separately:
df.groupby(['CHR','BP','SNP','CM']).sum()
                                      ANNOT
CHR BP      SNP        CM
1   455521  rs235884   2.5840      E003E007
    5179530 rs10799170 8.1548  E001E002E003
pd.get_dummies(df['ANNOT'])
    E001  E002  E003  E007
0     1     0     0     0
1     0     1     0     0
2     0     0     1     0
3     0     0     1     0
4     0     0     0     1
But I don't know how to combine the two, or whether there is another way to get this result.
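One possible combination, offered as a minimal sketch checked only against the sample data: one-hot encode ANNOT while keeping the other columns, then group on the four key columns and take max(), so that each indicator is 1 if the annotation occurs at least once for that SNP. The names keys, dummies and result are illustrative, and the choices of max(), sort=False and the empty prefix are assumptions about the desired output rather than anything established above.

```python
import pandas as pd

df = pd.DataFrame(
    data=[
        [1, 5179530, 'rs10799170', 8.1548, 'E001'],
        [1, 5179530, 'rs10799170', 8.1548, 'E002'],
        [1, 5179530, 'rs10799170', 8.1548, 'E003'],
        [1, 455521, 'rs235884', 2.584, 'E003'],
        [1, 455521, 'rs235884', 2.584, 'E007'],
    ],
    columns=['CHR', 'BP', 'SNP', 'CM', 'ANNOT'],
)

# Columns that identify one SNP (assumed to be the grouping keys).
keys = ['CHR', 'BP', 'SNP', 'CM']

# One-hot encode ANNOT in place; the empty prefix keeps the bare names E001, E002, ...
dummies = pd.get_dummies(df, columns=['ANNOT'], prefix='', prefix_sep='', dtype=int)

# Collapse rows that share the same keys; max() yields 1 if any row had the annotation.
# sort=False keeps the groups in order of first appearance.
result = dummies.groupby(keys, sort=False).max().reset_index()

print(result)
```

If counts per annotation were wanted instead of 0/1 flags, sum() could replace max() in the last step.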
 
     
     
    