I am trying to compare some columns on a numpy dataset to another set of columns that contain categorical data:
    Gender | Grade | Score 1 | Score 2 | Score 3
    M      | A     | 12      | 8       | 0
    M      | A     | 8       | 11      | 4
    M      | B     | 10      | 8       | 9
    F      | B     | 12      | 2       | 2
    F      | B     | 11      | 8       | 1
    F      | B     | 1       | 4       | 2
What I need to do is compare each score column to Gender and Grade. Then calculate The average score for each category. So for example, mean of values in Score 1 who have gender M and mean for gender F, and mean for grade A and mean for grade B. I cannot define the categories (or Gender and Grade column names) explicitly in code as my actual datasets have a variable distribution of categorical values, some of additional columns with categorical data. I am able to calculate all the categorical values with np.unique() which gives a full list. However, I am stuck as to how I would create a matrix using ufuncs, broadcasting, masking etc that would allow me to compare each of these columns to the categorical values without loops.
The output should be ideally a matrix which contains all the categories available in one column, with the associated means from other columns.
            A           B           M           F
Score1      mean(S1,A)  mean(S1,B)  mean(S1, M) mean(S1, F)
Score2      mean(S2,A)  etc         etc         etc
Score3      mean(S3,A)  etc         etc         etc
 
    