I have a pandas dataframe filled with accounting records that I would like to groupby -> records, but I can't quite figure out which methods to use.
Here is the original dataframe:
person_id|statement_date|expense_type|amount
--------------------------------------------
1        |1/1           |coffee      |5.00
1        |1/1           |bagel       |3.00
1        |1/1           |ticket      |8.00
1        |2/1           |coffee      |5.00
1        |2/1           |donut       |4.00
1        |2/1           |ticket      |8.00
2        |1/1           |coffee      |5.00
2        |1/1           |bagel       |3.00
2        |1/1           |ticket      |8.00
2        |2/1           |coffee      |5.00
2        |2/1           |donut       |4.00
2        |2/1           |ticket      |8.00
Instead of each row containing a single expense line item, I'd like to position expenses as columns, and each record being a person_id + statment_date, like so:
person_id|statement_date|coffee|bagel|ticket|donut
--------------------------------------------------
1        |1/1           |5.00  |3.00 |8.00  |na
1        |2/1           |5.00  |na   |8.00  |3.00
2        |1/1           |5.00  |3.00 |8.00  |na
2        |2/1           |5.00  |na   |8.00  |3.00
The following groupby gets close:
df.groupby(['person_id', 'statement_date','expense_type']).sum()
But I would like the last item, 'expense_type' to become cols.
I'm sure there is some sort of apply that I could use, but I just wasn't able to figure this one out.
Thank you!
