I have a DataFrame df like this one:
df =
name  group   influence
A     1       2
B     1       3
C     1       0
A     2       5
D     2       1
For each distinct value of group, I want to extract the value of name that has the maximum value of influence.
The expected result is this one:
group  max_name   max_influence
1      B          3
2      A          5
I know how to get the max value, but I don't know how to get max_name:
df.groupBy("group").agg(max("influence").as("max_influence"))
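One possible approach, offered as a sketch rather than a definitive answer: aggregate over a struct whose first field is influence, so that max compares by influence first and carries name along with it. This assumes the Spark SQL functions struct, max and col, plus a local SparkSession created only for illustration. The object name MaxNamePerGroup, the function maxNamePerGroup and the intermediate column name best are hypothetical names chosen for this example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, max, struct}

object MaxNamePerGroup {
  // Hypothetical helper: for each group, keep the name with the highest influence.
  // Structs are compared field by field, so influence must be the first field.
  def maxNamePerGroup(df: DataFrame): DataFrame =
    df.groupBy("group")
      .agg(max(struct(col("influence"), col("name"))).as("best"))
      .select(
        col("group"),
        col("best.name").as("max_name"),
        col("best.influence").as("max_influence"))

  def main(args: Array[String]): Unit = {
    // Local session used only to make the sketch self-contained.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("max-name-per-group")
      .getOrCreate()
    import spark.implicits._

    // Same sample data as in the question.
    val df = Seq(
      ("A", 1, 2), ("B", 1, 3), ("C", 1, 0), ("A", 2, 5), ("D", 2, 1)
    ).toDF("name", "group", "influence")

    maxNamePerGroup(df).orderBy("group").show()
    spark.stop()
  }
}
```

On the sample data this should give B with 3 for group 1 and A with 5 for group 2. If two names tie on influence within a group, the struct comparison falls through to name, so the alphabetically larger name is kept; a window function with row_number would be an alternative if a different tie rule is needed.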
 
    