I have a data frame as below:
df = sqlContext.createDataFrame(
    [("count", "doc_3", 3), ("count", "doc_2", 6), ("type", "doc_1", 9),
     ("type", "doc_2", 6), ("one", "doc_2", 10)],
    ["word", "document", "occurences"])
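From this I need to create a matrix like the one below, with one row per document, one column per word, and 0 where a word does not occur in a document: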
+---------+-----+------+----+
|document |count| type |one |
+---------+-----+------+----+
|doc_1    |  0  |  9   | 0  |
|doc_2    |  6  |  6   | 10 |
|doc_3    |  3  |  0   | 0  |
+---------+-----+------+----+
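To pin down exactly what I expect, here is the same transformation sketched in plain Python 3 (no Spark involved). The `pivot` helper and the `rows` list are only illustrative names I made up for this sketch; it is not the Spark solution I am looking for, just a description of the desired output.

```python
# Plain-Python sketch of the expected result; not a Spark solution.
rows = [("count", "doc_3", 3), ("count", "doc_2", 6), ("type", "doc_1", 9),
        ("type", "doc_2", 6), ("one", "doc_2", 10)]


def pivot(rows):
    """Return (words, table) where table maps document -> {word: occurences}.

    Word/document pairs that never appear in `rows` are filled with 0.
    """
    # Collect the distinct words in first-seen order; they become the columns.
    words = []
    for word, _, _ in rows:
        if word not in words:
            words.append(word)

    table = {}
    for word, document, n in rows:
        # Start every document with a zero for each word, then add the counts.
        counts = table.setdefault(document, dict.fromkeys(words, 0))
        counts[word] += n
    return words, table


words, table = pivot(rows)
for document in sorted(table):
    print(document, [table[document][w] for w in words])
```

Each printed row lists the counts in the column order `count`, `type`, `one`, matching the matrix above.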
So I tried
print df.crosstab("document").show()
which didn't give what I wanted. Any help is appreciated.