My df looks like this:
sid, class_start,   class_end,   graduated, grad_date,  college_name,  degree,  major
123  2010-01-01     2010-04-15   NA         NA          ABC            NA       NA    
123  2010-06-01     2010-09-15   NA         NA          ABC            NA       NA    
123  2010-10-01     2010-12-15   NA         NA          ABC            NA       NA    
123  NA             NA           Y          2010-12-15  ABC            BS     Biology 
123  2011-01-01     2011-04-15   NA         NA          WRT            NA       NA    
123  2011-06-01     2011-09-15   NA         NA          WRT            NA       NA    
123  2011-10-01     2011-12-15   NA         NA          WRT            NA       NA    
123  NA             NA           Y          2011-12-15  ABC            BS     Chem    
123  2012-01-01     2011-04-15   NA         NA          ABC            NA       NA    
123  2012-06-01     2011-09-15   NA         NA          ABC            NA       NA    
123  2012-10-01     2011-12-15   NA         NA          ABC            NA       NA    
123  NA             NA           Y          2012-12-15  ABC            MS     Biology 
765  2010-01-01     2010-04-15   NA         NA          EGF            NA       NA    
765  2010-06-01     2010-09-15   NA         NA          EGF            NA       NA    
765  2010-10-01     2010-12-15   NA         NA          EGF            NA       NA    
765  NA             NA           N          NA          EGF            BS     Physics 
765  2014-01-01     2014-04-15   NA         NA          RSE            NA       NA    
765  2014-06-01     2014-09-15   NA         NA          RSE            NA       NA    
765  2014-10-01     2014-12-15   NA         NA          RSE            NA       NA    
765  NA             NA           N          NA          RSE            BS     Physics 
what I am trying to achieve is for each sid I want to get the first graduated value from from the first college. If a student does not graduate then we get the value of the last college.
Also, note that that Desired Output has a new columns first_start and last_end
first_start= students first enrollment datelast_end= student last enrollment date
For student graduated == Y, we only take this from college they first graduated.
For student graduated == N, first_start is the first college start date, last_end is the date last_end of last college
Desired Output
sid, first_start,   last_end,   graduated, grad_date,  college_name,  degree,  major
123  2010-01-01     2010-12-15   Y         2010-12-15  ABC            BS     Biology    
765  2010-01-01     2014-12-15   N          NA         RSE            BS     Physics 
My Approach:
- the closest I got is when I group by ['sid', 'college_name'] and then aggregrate value for other columns such as 
minvalue forfirst_startandmaxforlast_endand so on.. that does not get me to the Desired Output 
Could someone please help achieve this result?