i have this dataframe that have row of each key*id , i want to explode it to id,key1,key2 and remove duplicate rows and keep data_field , i am working with python2.7 but i would glad to a solution that will work both for python2.7 and python3.7
dataframe i have:
import  pandas as pd
d = {'id': [111, 222, 222, 333, 333], 'key': ['key1', 'key2','key1','key2','key1'], 'value':[1,1,2,3,3],'data_field':['dummy1','dummy1','dummy2','dummy3','dummy2']}
df = pd.DataFrame(data=d)
print df[['id','key','value','data_field']].to_string(index=False)
        id   key  value data_field
         111  key1      1     dummy1
         222  key2      1     dummy1
         222  key1      2     dummy2
         333  key2      3     dummy3
         333  key1      3     dummy2
dataframe i want it to be transformed to:
d = {'id': [111, 222, 333], 'key1': [1, 2, 3],'key2':[pd.np.nan,1,3] , 'data_field': ['dummy1', 'dummy2', 'dummy3']}
df = pd.DataFrame(data=d)
print df[['id', 'key1', 'key2', 'data_field']].to_string(index=False)
          
    id  key1  key2 data_field
         111     1   NaN     dummy1
         222     2   1.0     dummy2
         333     3   3.0     dummy3
tried as suggested here
df.pivot(index='id', columns='key', values='value').join(df.drop_duplicates('id')['data_field'])
and got  :
            key1  key2 data_field
id                        
111   1.0   NaN        NaN
222   2.0   1.0        NaN
333   3.0   3.0        NaN
data_field was not kept and id is now index and not column
 
    