I would like to filter my DataFrame to keep only the rows that have the maximum value in column some_date.
df.filter(F.col('some_date') == F.max('some_date')) fails, because max is an aggregate function and is not being used inside an aggregation.
I also tried to get the max value first and then use it in the filter: max_date = df.groupBy().max('some_date'). That failed with an error telling me that "some_date" is not a numeric column. Aggregation function can only be applied on a numeric column.
In SQL, I would achieve this with a subquery (to the effect of where some_date = (select max(some_date) from ...)), but I thought there would be a better way to structure it in Python.