I have the following DataFrame df in Spark:
+------------+---------+-----------+
|OrderID     |     Type|        Qty|
+------------+---------+-----------+
|      571936|    62800|          1|
|      571936|    62800|          1|
|      571936|    62802|          3|
|      661455|    72800|          1|
|      661455|    72801|          1|
+------------+---------+-----------+
I need to select, for each unique OrderID, the row with the largest Qty, or the last row for that OrderID if all Qty values are the same (e.g. as for 661455). The expected result:
+------------+---------+-----------+
|OrderID     |     Type|        Qty|
+------------+---------+-----------+
|      571936|    62802|          3|
|      661455|    72801|          1|
+------------+---------+-----------+
Any ideas how to get this?
This is what I tried:
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.col

val partitionWindow = Window.partitionBy(col("OrderID")).orderBy(col("Qty").asc)
val result = df.over(partitionWindow) // fails to compile: DataFrame has no over method
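
For reference, here is a minimal sketch of the direction I was aiming for, using row_number() over a window that sorts Qty descending. The rowId column is an assumption on my part to approximate "last row", since a DataFrame has no inherent ordering to break ties with:

import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, monotonically_increasing_id, row_number}

// Capture an ordering column up front; monotonically_increasing_id only
// approximates the original row order, so this tie-breaker is an assumption.
val withId = df.withColumn("rowId", monotonically_increasing_id())

// Highest Qty first; among equal Qty values, the later row wins.
val qtyWindow = Window
  .partitionBy(col("OrderID"))
  .orderBy(col("Qty").desc, col("rowId").desc)

val result = withId
  .withColumn("rank", row_number().over(qtyWindow))
  .filter(col("rank") === 1)
  .drop("rank", "rowId")

Would something like this give the expected two rows, or is there a simpler way?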