Let's say I've got three project categories A, B and C that can have three different statuses Started, Finished and canceled. I'm trying to find the percentage of categories that have for example Status = Finished for all categories. I think the first step should be to display it in a matrix like below:
Input:
Category Status
A Started
A Started
A Finished
A Finished
B Started
B Canceled
B Canceled
C Started
C Finished
Desired Output:
    Started Finished Canceled
A   2       2        0
B   1       0        2
C   1       1        0
But I'm struggling to identify the possible statuses that for example A can have, but does not. In this example Canceled = 0. I've been trying to do this by subsetting a pandas dataframe by unique observations for Category, and then joining them to fill non-existent combinations with nan. But I think this would be very slow for bigger datesets. Also, I'm not quite there yet. Here's the code if anyone would like to try and build on it. But I'm suspecting that there's a much more efficient solution out there somewhere...
My attempt:
import pandas as pd
import numpy as np
#df = pd.read_clipboard(sep='\\s+')
# dft = df.T
frames = {}
n = 0
status = df['Status'].unique()
# Subset and create dataframes
for category in df['Category'].unique():
    n = n + 1
    newname = 'df_' + str(n)
    print(newname)
    dfs = df[df['Category']==category]
    frames[newname] = dfs
# Join dataframes
df_main = frames['df_1']
frames.pop('df_1')
for key in frames:
    df_main = pd.merge(df_main, frames[key], on = 'Category', how = 'outer')
df_main = df_main.set_index(['Category'])
df_main.columns = status
Output df_main:
          Started  Finished  Canceled
Category                              
A          Started       NaN       NaN
A          Started       NaN       NaN
A         Finished       NaN       NaN
A         Finished       NaN       NaN
B              NaN   Started       NaN
B              NaN  Canceled       NaN
B              NaN  Canceled       NaN
C              NaN       NaN   Started
C              NaN       NaN  Finished
