I have a pandas data frame. I want to group it by one combination of columns and count the distinct values of another combination of columns.
For example, I have the following data frame:
   a   b    c     d      e
0  1  10  100  1000  10000
1  1  10  100  1000  20000
2  1  20  100  1000  20000
3  1  20  100  2000  20000
I can group it by columns a and b and count distinct values in the column d:
df.groupby(['a','b'])['d'].nunique().reset_index()
As a result I get:
   a   b  d
0  1  10  1
1  1  20  2
However, I would like to count distinct values in a combination of columns. For example, if I use c and d, then the first group has only one unique combination, (100, 1000), while the second group has two distinct combinations: (100, 1000) and (100, 2000).
The following naive "generalization" does not work:
df.groupby(['a','b'])[['c','d']].nunique().reset_index()
because nunique() is not applicable to data frames.
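For reference, here is a minimal sketch of one possible way to get the counts described above, not necessarily the best one. It assumes the example frame from this question. The idea is to drop duplicate rows over the grouping columns plus c and d, so that each (c, d) combination is left once per group, and then count the remaining rows in each group. The output column name n_unique_cd is made up for illustration. Note that in more recent pandas versions the grouped nunique() call may run on several columns, but as far as I can tell it counts each column separately rather than counting combinations.

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({
    "a": [1, 1, 1, 1],
    "b": [10, 10, 20, 20],
    "c": [100, 100, 100, 100],
    "d": [1000, 1000, 1000, 2000],
    "e": [10000, 20000, 20000, 20000],
})

# Keep one row per distinct (a, b, c, d), so each (c, d) combination
# appears once within its (a, b) group, then count rows per group.
# "n_unique_cd" is a hypothetical column name chosen for this sketch.
result = (
    df.drop_duplicates(["a", "b", "c", "d"])
      .groupby(["a", "b"])
      .size()
      .reset_index(name="n_unique_cd")
)
print(result)
```

With the example data this should give 1 for the group (1, 10) and 2 for the group (1, 20), matching the combinations listed above.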