I want to build a new DataFrame in which each value in the students column appears only once, and in which the program column is spread into separate columns, one set per category.
To help you understand my problem, here is my df:
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'students': ['Salim', 'Salim', 'khaled', 'Raoues', 'Raoues', 'Rafik'],
    'program': ['MBA', 'MS', 'PHD', 'MS', 'PHD', 'MS'],
    'count': [2, 3, 4, 2, 1, 1],
    'price': [68, 59, 45, 39, 10, 63],
    'Teacher': ['Pr Yaici', 'Pr Yaici', 'Dr Zeggagh', 'Dr Zeggagh', 'Dr Zeggagh', 'Pr Yaici']
})
df
So my DataFrame looks like this:
  students program  count  price     Teacher
0    Salim     MBA      2     68    Pr Yaici
1    Salim      MS      3     59    Pr Yaici
2   khaled     PHD      4     45  Dr Zeggagh
3   Raoues      MS      2     39  Dr Zeggagh
4   Raoues     PHD      1     10  Dr Zeggagh
5    Rafik      MS      1     63    Pr Yaici
Goal:
The new_df I want to create from the df above is:
  students programMBA programMS programPHD  countMBA  countMS  countPHD  priceMBA  priceMS  pricePHD     Teacher
0    Salim        MBA        MS        NaN       2.0      3.0       NaN      68.0     59.0       NaN    Pr Yaici
1   khaled        NaN       NaN        PHD       NaN      NaN       4.0       NaN      NaN      45.0  Dr Zeggagh
2   Raoues        NaN        MS        PHD       NaN      2.0       1.0       NaN     39.0      10.0  Dr Zeggagh
3    Rafik        NaN        MS        NaN       NaN      1.0       NaN       NaN     63.0       NaN    Pr Yaici
As you can see, each category in the program column gets its own program, count, and price columns, while the Teacher column is left unchanged.
Tried methods:
First I tried some encoding methods, but they do not keep the categorical values as they are. A method like get_dummies is useful for creating new columns, but it produces indicator values rather than the original ones, so it doesn't apply in my case.
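To illustrate the point about get_dummies, here is a minimal sketch on a reduced copy of the data (only the students and program columns; the name dummies is just illustrative). It shows that the result still has one row per original row and holds indicator values instead of the program names:

```python
import pandas as pd

# Reduced copy of the question's data: only the two columns needed here.
df = pd.DataFrame({
    'students': ['Salim', 'Salim', 'khaled', 'Raoues', 'Raoues', 'Rafik'],
    'program': ['MBA', 'MS', 'PHD', 'MS', 'PHD', 'MS'],
})

# One indicator column per category; the dtype is bool or uint8
# depending on the pandas version.
dummies = pd.get_dummies(df, columns=['program'])

print(dummies)
```

The students column still contains duplicates, and count and price would not be split per program either, which is why this does not give the layout shown in the goal.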
Any suggestions would be helpful.
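For reference, one possible way to get the layout described in the goal is sketched below, using set_index followed by unstack. This is only a sketch under two assumptions that hold for the sample data but are not guaranteed in general: each student has exactly one Teacher, and each (student, program) pair appears at most once. The helper names category, wide, programs, and column_order are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'students': ['Salim', 'Salim', 'khaled', 'Raoues', 'Raoues', 'Rafik'],
    'program': ['MBA', 'MS', 'PHD', 'MS', 'PHD', 'MS'],
    'count': [2, 3, 4, 2, 1, 1],
    'price': [68, 59, 45, 39, 10, 63],
    'Teacher': ['Pr Yaici', 'Pr Yaici', 'Dr Zeggagh', 'Dr Zeggagh', 'Dr Zeggagh', 'Pr Yaici'],
})

# Duplicate program into a helper column so that program itself is
# kept as a value column while the copy is moved into the columns.
# Assumption: (students, Teacher, category) is unique per row.
wide = (
    df.assign(category=df['program'])
      .set_index(['students', 'Teacher', 'category'])
      .unstack('category')
)

# Flatten the two-level column index, e.g. ('count', 'MBA') -> 'countMBA'.
wide.columns = [f'{value}{category}' for value, category in wide.columns]
wide = wide.reset_index()

# Restore the students' order of first appearance and put Teacher last.
programs = sorted(df['program'].unique())
column_order = (
    ['students']
    + [f'{value}{p}' for value in ['program', 'count', 'price'] for p in programs]
    + ['Teacher']
)
new_df = (
    wide.set_index('students')
        .loc[df['students'].unique()]
        .reset_index()[column_order]
)

print(new_df)
```

If a student could have several teachers, or the same program twice, the unstack step would either produce extra rows or raise an error about duplicate index entries, so an aggregation step would be needed first.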