I have this dataframe:
dff = pd.DataFrame({"country":["cn","cn","cn", "cn","bn","bn","sn"], 'date':['2019-6-6','2019-6-5','2019-7-6','2019-7-9','2017-4-3','2017-3-2','2019-2-9'],'Revenue':[100,200,100,90,300,400,500], "class":["A","C", "B","B", "C","A","B"]})            
    dff
    country date    Revenue class
0   cn  2019-6-6    100       A
1   cn  2019-6-5    200       C
2   cn  2019-7-6    100       B
3   cn  2019-7-9    90        B
4   bn  2017-4-3    300       C
5   bn  2017-3-2    400       A
6   sn  2019-2-9    500       B
I want to achieve three features:
- The date 2019-6-6are modified to monthJun-2019
- The revennue are aggregated according to month
- The revenue are classified according to the class
The result table should look like this:
country date         A   B   C
0   cn  Jun-2019    100  0   200   
1   cn  Jul-2019    0    190  0
2   bn  Apl-2017    0    0   300
3   bn  Mar-2017    400  0     0
4   sn  Feb-2019    0    500   0
To achieve feature 1,
dff.date = pd.to_datetime(dff.date )
dff.date = dff.date.apply(lambda x : x.strftime("%b-%Y"))
dff
country date    Revenue class
0   cn  Jun-2019    100 A
1   cn  Jun-2019    200 C
2   cn  Jul-2019    100 B
3   cn  Jul-2019    90  B
4   bn  Apr-2017    300 C
5   bn  Mar-2017    400 A
6   sn  Feb-2019    500 B
To achieve feature 2,
dff1= dff.groupby(["date"]).sum()
dff1
    Revenue
date    
Apr-2017    300
Feb-2019    500
Jul-2019    190
Jun-2019    300
Mar-2017    400
To achieve feature 3,
dff.pivot( columns='class')['Revenue']
class   A   B   C
0   100.0   NaN NaN
1   NaN NaN 200.0
2   NaN 100.0   NaN
3   NaN 90.0    NaN
4   NaN NaN 300.0
5   400.0   NaN NaN
6   NaN 500.0   NaN
I feel I am almost there, but i can't figure out how to combine this simultaneously to achieve the final result. Any idea is appreciated!
 
     
    