How can I convert a DataFrame as shown below?
The DataFrame I have:
| GROUP | ITEM | AMOUNT |
|---|---|---|
| group1 | item1 | 100 |
| group1 | item2 | 200 |
| group1 | item3 | 300 |
| group2 | item1 | 400 |
| group2 | item2 | 500 |
Expected result:
| GROUP | ITEM1 | ITEM2 | ITEM3 |
|---|---|---|---|
| group1 | 100 | 200 | 300 |
| group2 | 400 | 500 | |
You can use `pivot`: group by `GROUP`, pivot on `ITEM`, and take the first `AMOUNT` for each group/item pair:
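import org.apache.spark.sql.functions.first
val pivotDF = df.groupBy("GROUP").pivot("ITEM").agg(first("AMOUNT")) // RelationalGroupedDataset has no first(); wrap it in agg
pivotDF.show()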
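For a version you can run end to end, here is a minimal sketch. It assumes a local Spark session and rebuilds the sample data from the question; the object name `PivotExample` and the helper `pivotItems` are made up for illustration. Passing the item values to `pivot` explicitly fixes the column order and skips the extra pass Spark would otherwise make to discover them, and `toDF` renames the columns to the upper-case headers from the expected result.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.first

object PivotExample {
  // Hypothetical helper: one row per GROUP, one column per known ITEM value.
  // Group/item pairs that are absent from the input come out as null.
  def pivotItems(df: DataFrame): DataFrame =
    df.groupBy("GROUP")
      .pivot("ITEM", Seq("item1", "item2", "item3")) // assumed fixed set of items
      .agg(first("AMOUNT"))
      .toDF("GROUP", "ITEM1", "ITEM2", "ITEM3")      // rename to match the expected headers
      .orderBy("GROUP")

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, only for trying the example out
    val spark = SparkSession.builder()
      .appName("PivotExample")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample data copied from the question
    val df = Seq(
      ("group1", "item1", 100),
      ("group1", "item2", 200),
      ("group1", "item3", 300),
      ("group2", "item1", 400),
      ("group2", "item2", 500)
    ).toDF("GROUP", "ITEM", "AMOUNT")

    pivotItems(df).show()
    spark.stop()
  }
}
```

If the set of items is not known in advance, `pivot("ITEM")` without the value list also works; the columns are then named after the raw item values (`item1`, `item2`, ...).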
You can read more about pivot here: https://databricks.com/blog/2016/02/09/reshaping-data-with-pivot-in-apache-spark.html