I have a df of fruit purchases sorted by date, most recent first. I want to drop duplicates by fruit, but how the duplicates are collapsed depends on the column. The solution needs to generalise to more columns, though the 3 types of operation stay the same.
For each fruit:
- price column should be the highest sold price
- date, place and colour columns should be the most recent value that isn't NaN
- qty should be the average number sold
df:
   fruit  date        price  place   colour  qty
0  Apple  25-12-2023  4      NaN     Green   5
1  Apple  22-11-2023  5      London  Red     6
2  Apple  20-10-2023  6      Paris   NaN     8
3  Pear   19-10-2023  4      Sweden  Red     8
4  Pear   18-10-2023  5      London  Green   8
5  Pear   17-10-2023  10     Paris   Purple  9
Expected Output:
   fruit  date        price  place   colour  qty
   Apple  25-12-2023  6      London  Green   6.33  ((5+6+8)/3)
   Pear   19-10-2023  10     Sweden  Red     8.33  ((8+8+9)/3)
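
For reference, here is a minimal sketch of one way the three rules might map onto a single groupby/agg call. It assumes the rows are already ordered most recent first within each fruit, as in the sample, and it relies on the groupby "first" aggregation skipping NaN, which is the pandas default. The list names max_cols, latest_cols and mean_cols are hypothetical; they are only there to show how more columns could be added to each rule.

```python
import numpy as np
import pandas as pd

# Sample data from the question, already sorted most recent first.
df = pd.DataFrame({
    "fruit": ["Apple", "Apple", "Apple", "Pear", "Pear", "Pear"],
    "date": ["25-12-2023", "22-11-2023", "20-10-2023",
             "19-10-2023", "18-10-2023", "17-10-2023"],
    "price": [4, 5, 6, 4, 5, 10],
    "place": [np.nan, "London", "Paris", "Sweden", "London", "Paris"],
    "colour": ["Green", "Red", np.nan, "Red", "Green", "Purple"],
    "qty": [5, 6, 8, 8, 8, 9],
})

# Hypothetical column lists, one per rule; extend these as columns are added.
max_cols = ["price"]                      # highest value per fruit
latest_cols = ["date", "place", "colour"] # most recent non-NaN value per fruit
mean_cols = ["qty"]                       # average per fruit

# Build one column -> aggregation mapping from the three lists.
# "first" returns the first non-NaN value in each group, which is the
# most recent one only because the rows are sorted newest first.
agg_spec = {
    **{col: "max" for col in max_cols},
    **{col: "first" for col in latest_cols},
    **{col: "mean" for col in mean_cols},
}

result = (
    df.groupby("fruit", as_index=False, sort=False)
      .agg(agg_spec)[list(df.columns)]    # restore the original column order
)
print(result)
```

If the frame were not already sorted, it would need sorting by the parsed date in descending order before grouping, otherwise "first" would not pick the most recent value.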
