I have a dataset of weekly sales for a few stores, which looks something like this:
| store_id | item_id | week | sales | 
|---|---|---|---|
| store1 | item1 | 2021-01 | 3 | 
| store1 | item2 | 2021-01 | 2 | 
| store2 | item1 | 2021-01 | 10 | 
| store2 | item3 | 2021-01 | 1 | 
| store1 | item1 | 2021-02 | 5 | 
| store1 | item2 | 2021-02 | 1 | 
| store2 | item1 | 2021-02 | 11 | 
| store1 | item3 | 2021-03 | 6 | 
| store1 | item1 | 2021-04 | 7 | 
| store2 | item3 | 2021-04 | 2 | 
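For reference, here is the same sample data as a small, reproducible DataFrame:

```python
import pandas as pd

# Reproducible version of the sample data shown in the table above
df = pd.DataFrame({
    "store_id": ["store1", "store1", "store2", "store2", "store1",
                 "store1", "store2", "store1", "store1", "store2"],
    "item_id":  ["item1", "item2", "item1", "item3", "item1",
                 "item2", "item1", "item3", "item1", "item3"],
    "week":     ["2021-01", "2021-01", "2021-01", "2021-01", "2021-02",
                 "2021-02", "2021-02", "2021-03", "2021-04", "2021-04"],
    "sales":    [3, 2, 10, 1, 5, 1, 11, 6, 7, 2],
})
```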
I am trying to use a pandas DataFrame to transform this into a single row per (store_id, item_id) combination, with a separate sales column for every unique value in the week column.
Specifically, for the example above, the result should look like this after the transformation:
| store_id | item_id | sales_week_2021-01 | sales_week_2021-02 | sales_week_2021-03 | sales_week_2021-04 | 
|---|---|---|---|---|---|
| store1 | item1 | 3 | 5 | 0 | 7 | 
| store1 | item2 | 2 | 1 | 0 | 0 | 
| store1 | item3 | 0 | 0 | 6 | 0 | 
| store2 | item1 | 10 | 11 | 0 | 0 | 
| store2 | item3 | 1 | 0 | 0 | 2 | 
[Note that multiple stores can share the same items].
I'm able to achieve this transformation with plain Python dictionaries, etc., but I need a pandas way to do it. The reason is that these are huge datasets, and I may later also want to do the same transformation in PySpark to take advantage of faster processing.
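To show what I mean, here is a simplified sketch of the kind of dictionary-based code I want to replace (not my exact code; the variable names are just for illustration):

```python
# Build {(store, item): {week: sales}}, then flatten into one row per
# (store, item) pair, filling missing weeks with 0.
weeks = sorted(df["week"].unique())

by_pair = {}
for row in df.itertuples(index=False):
    by_pair.setdefault((row.store_id, row.item_id), {})[row.week] = row.sales

rows = [
    {"store_id": store, "item_id": item,
     **{f"sales_week_{w}": week_sales.get(w, 0) for w in weeks}}
    for (store, item), week_sales in by_pair.items()
]
result = pd.DataFrame(rows)
```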
I'm pretty new to pandas, so please excuse me if this question sounds naive to the community :)