I have the following dataset of students taking multiple SAT exams:
df = pd.DataFrame({'student': 'A A A A A B B B'.split(),
                  'exam_date':[datetime.datetime(2013,4,1),datetime.datetime(2013,6,1),
                               datetime.datetime(2013,8,1),datetime.datetime(2013,10,2),
                               datetime.datetime(2014,1,1),datetime.datetime(2013,11,2),
                               datetime.datetime(2014,2,2),datetime.datetime(2014,5,2)]})
print(df)
  student  exam_date
0   A     2013-04-01
1   A     2013-06-01
2   A     2013-08-01
3   A     2013-10-02
4   A     2014-01-01
5   B     2013-11-02
6   B     2014-02-02
7   B     2014-05-02
I want to make a dataset of each student with their first exam date, second exam date, and so on.
I am trying groupby and min to get the 1st date, but not sure about the subsequent dates.
# Find earliest time
df.groupby('student')['exam_date'].agg('min').reset_index()
I tried rank to get the desired result, but it seems too much of work.
# Rank
df['rank'] = df.groupby('student')['exam_date'].rank(ascending=True)
print(df)
student exam_date   rank
0   A   2013-04-01  1.0
1   A   2013-06-01  2.0
2   A   2013-08-01  3.0
3   A   2013-10-02  4.0
4   A   2014-01-01  5.0
5   B   2013-11-02  1.0
6   B   2014-02-02  2.0
7   B   2014-05-02  3.0
Is there any better way of getting the desired output? Any suggestions would be appreciated. Thanks!
Desired Output:
 student  exam_01     exam_02     exam_03     exam_04
0   A   2013-04-01  2013-06-01  2013-08-01  2013-10-02
1   B   2013-11-02  2014-02-02  2013-05-02      NA
 
     
    