The data looks like this:
origin_id   type   serialn     event    year    month    day
1              A       101        X1    2017        6     10
1              A       101        X2    2017        6     10
1              B       101        X3    2017        6     10
2              A       151        X1    2016        7     15
2              B       151        X3    2016        7     15
2              C       151        X4    2016        7     15
And I need it to be this way:
origin_id    serialn   X1    X2    X3   X4    year    month    day
        1        101    A     A     B null    2017        6     10
        2        151    A  null     B    C    2016        7     15
So basically what I need is to use the values of the column event as headers and put the value of the column type for each event, when there is no event for a certain origin_id put a null. Some other columns from the dataframe, like serialn and origin_id should be in the resulting one. Also there should be only one row for each origin_id
This question: How to pivot a dataframe touches on some points, altough it's geared towards performing an aggregation at some point.
This is a possible solution.
I get a df with the origin_id as the index, the events as columns and the types as their value.
stat = df.pivot(values='type', index='origin_id', columns='event')
Now I need some information from the original dataframe, so I only keep one event for each origin_id and drop the columns I'm not gonna use
df1 = df.drop_duplicates(subset='origin_id').drop(['type','event'], axis=1)
Merging the both dataframes, df1 using the values of origin_id and stat using the index.
pd.merge(df1, stat, how='inner', left_on = 'origin_id', right_index = True)
Using the first dataframe with the code above I get this result:
origin_id    serialn   X1    X2    X3   X4    year    month    day
        1        101    A     A     B null    2017        6     10
        2        151    A  null     B    C    2016        7     15
Is there another way to do this?
Thanks!
 
    