I have a dataset of colors and counts, each with an associated date:
+-----------+----------+-----+
|      color|      Date|count|
+-----------+----------+-----+
|        red|2014-05-26|    5|
|        red|2014-05-02|    1|
|        red|2015-04-02|    1|
|        red|2015-04-26|    1|
|        red|2015-09-26|    2|
|       blue|2014-05-26|    3|
|       blue|2014-06-02|    1|
|      brown|2014-07-31|    2|
|      green|2014-08-01|    2|
+-----------+----------+-----+
I want the max count for each color, together with the date on which that count occurred. I am using Spark 2.0.2 with Java 8.
When I use the max function, the Date column is dropped, and when I add Date to the groupBy, I get back the same table as the input dataset.
df.groupBy("color").max("count").show();
+-----------+----------+
|      color|max(count)|
+-----------+----------+
|        red|         5|
|       blue|         3|
|      brown|         2|
|      green|         2|
+-----------+----------+
Expected output:
+-----------+----------+----------+
|color      |      date|max(count)|
+-----------+----------+----------+
|        red|2014-05-26|         5|
|       blue|2014-05-26|         3|
|      brown|2014-07-31|         2|
|      green|2014-08-01|         2|
+-----------+----------+----------+
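To make the requirement concrete, here is a rough sketch of the selection I am after, written in plain Java collections rather than Spark. The class `MaxPerColor`, the holder type `ColorCount`, and the method `maxPerColor` are hypothetical names made up for this illustration, and the rule that the first row wins when two rows tie on count is my assumption. It only shows the intended semantics (keep the whole row with the highest count per color); it is not a Spark solution.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

class MaxPerColor {

    // Hypothetical holder for one input row (not a Spark Row).
    static final class ColorCount {
        final String color;
        final String date;
        final int count;

        ColorCount(String color, String date, int count) {
            this.color = color;
            this.date = date;
            this.count = count;
        }
    }

    // For each color, keep the entire row that has the highest count,
    // so the date travels along with the max value.
    // Assumption: on a tie, the row encountered first is kept.
    static Map<String, ColorCount> maxPerColor(List<ColorCount> rows) {
        return rows.stream().collect(Collectors.toMap(
                r -> r.color,
                r -> r,
                BinaryOperator.maxBy(Comparator.comparingInt((ColorCount r) -> r.count)),
                LinkedHashMap::new)); // preserves first-seen color order
    }

    public static void main(String[] args) {
        List<ColorCount> rows = Arrays.asList(
                new ColorCount("red", "2014-05-26", 5),
                new ColorCount("red", "2014-05-02", 1),
                new ColorCount("red", "2015-04-02", 1),
                new ColorCount("red", "2015-04-26", 1),
                new ColorCount("red", "2015-09-26", 2),
                new ColorCount("blue", "2014-05-26", 3),
                new ColorCount("blue", "2014-06-02", 1),
                new ColorCount("brown", "2014-07-31", 2),
                new ColorCount("green", "2014-08-01", 2));

        // Prints one line per color: color, date of the max, max count.
        maxPerColor(rows).values().forEach(r ->
                System.out.println(r.color + " " + r.date + " " + r.count));
    }
}
```

Run on the sample data above, this yields the four rows of the expected output (red 2014-05-26 5, blue 2014-05-26 3, brown 2014-07-31 2, green 2014-08-01 2). I am looking for the equivalent on a Spark Dataset.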
 
     
    