I have the following data frame:
| user_id | value | 
|---|---|
| 1 | 5 | 
| 1 | 7 | 
| 1 | 11 | 
| 1 | 15 | 
| 1 | 35 | 
| 2 | 8 | 
| 2 | 9 | 
| 2 | 14 | 
I want to drop every row that does not hold the maximum value for its user_id,
resulting in a 2-row data frame:
| user_id | value | 
|---|---|
| 1 | 35 | 
| 2 | 14 | 
How can I do that?
 
    
You can group by user_id and then take the max of the value column.
Assuming that your original dataframe is named df, try the code below:

    out = df.groupby('user_id', as_index=False)['value'].max()
    print(out)

If you want to group by more than one column (for example an extra sex column, which is not in your sample data), use this:

    out = df.groupby(['user_id', 'sex'], as_index=False, sort=False)['value'].max()
    print(out)
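
For reference, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the grouping. The second variant, based on `idxmax`, is an addition of mine rather than part of the approach above; it may be useful if your real frame has more columns and you want to keep the whole row that holds each maximum.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "user_id": [1, 1, 1, 1, 1, 2, 2, 2],
    "value": [5, 7, 11, 15, 35, 8, 9, 14],
})

# One row per user_id, holding the maximum of "value".
out = df.groupby("user_id", as_index=False)["value"].max()
print(out)

# Alternative (assumption: you want to keep every column of the winning row):
# idxmax returns the index label of the max row in each group, and .loc
# selects those rows from the original frame.
rows = df.loc[df.groupby("user_id")["value"].idxmax()].reset_index(drop=True)
print(rows)
```

On this sample both variants give the same two rows (35 for user 1, 14 for user 2). Note that `idxmax` keeps only the first row when a user has tied maxima.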