I have the following data:
user_id   session_id    youtube_id 
1           1             2342 
1           1             3523
1           2             3325
2           1             3423
2           1             2352
2           1             3333 
2           2             2351
2           2             9876
2           3             2388
The goal is to group by user_id and calculate total_sessions and total_views per user, and from those the average views per session:
user_id, total_sessions, total_views, average_view_per_session
1,         2,            3,           1.5
2,         3,            6,           2    
    result_df['avg'] = df.groupby('user_id').agg({
        'session_id': lambda x: x.nunique(),
        'youtube_id': 'count'}).apply(lambda x: x['total_views'] / x['total_sessions'])
Two problems with the above:
- the resulting columns are still named session_id and youtube_id even though they are aggregations
- how do I carry out the division to get the average_view_per_session?
The above approach raises a KeyError, which could be because the aggregated columns still carry the original column names (session_id, youtube_id) rather than total_sessions and total_views.
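One way this could be approached is sketched below. It assumes pandas 0.25 or later, where `agg` accepts named aggregation, so each output column can be named while it is computed. The DataFrame is rebuilt from the sample data above. The name `result_df` comes from the attempt, and computing the division as a separate step is just one option, not the only one. The KeyError most likely arises because the aggregated frame has no total_views or total_sessions columns, and because `apply` without `axis=1` passes whole columns to the lambda rather than rows.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "user_id":    [1, 1, 1, 2, 2, 2, 2, 2, 2],
    "session_id": [1, 1, 2, 1, 1, 1, 2, 2, 3],
    "youtube_id": [2342, 3523, 3325, 3423, 2352, 3333, 2351, 9876, 2388],
})

# Named aggregation: output_name=(input_column, aggregation).
result_df = (
    df.groupby("user_id")
      .agg(
          total_sessions=("session_id", "nunique"),  # distinct sessions per user
          total_views=("youtube_id", "count"),       # rows (views) per user
      )
      .reset_index()
)

# Plain column arithmetic replaces the row-wise apply.
result_df["average_view_per_session"] = (
    result_df["total_views"] / result_df["total_sessions"]
)

print(result_df)
```

With the sample data this gives 2 sessions and 3 views for user 1 (average 1.5), and 3 sessions and 6 views for user 2 (average 2.0), matching the expected table.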
 
     
    