I have an RDD[Sale] and wanted to leave only the latest sales. So what I did is created a pair RDD and then performed grouping and filtering:
val sales: RDD[(String, Sale)] = rawSales.map(sale => sale.id -> sale)
      .groupByKey()
      .mapValues(_.maxBy(_.timestamp))
But how do I return back to RDD[Sale] instead of the pair RDD in this case?
The only way I figured out is the following:
val value: RDD[Sale] = sales.map(salePaired => salePaired._2)
Is it the most proper solution?