When grouping a Pandas DataFrame, when should I use transform and when should I use aggregate? How do
they differ in practice, and which one do you
consider more important?
        piRSquared
        
 
    
    
        Sylvi0202
        
1 Answer
            
            
Consider the DataFrame `df`:
import pandas as pd
df = pd.DataFrame(dict(A=list('aabb'), B=[1, 2, 3, 4], C=[0, 9, 0, 9]))
The standard use of `groupby` is aggregation, which returns one row per group:
df.groupby('A').mean()
If you instead want those values broadcast across each group, returning a result with the same index as the original, use `transform`:
df.groupby('A').transform('mean')
df.set_index('A').groupby(level='A').transform('mean')
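As a minimal sketch of the difference in shape, the following rebuilds the same `df` and compares the two results. The column name `B_group_mean` is made up for illustration and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(dict(A=list('aabb'), B=[1, 2, 3, 4], C=[0, 9, 0, 9]))

# Aggregation: one row per group, indexed by the group key 'A'.
agg_result = df.groupby('A').mean()

# Transformation: one row per original row, with the same index as df.
transform_result = df.groupby('A').transform('mean')

# Because the index lines up with df, a transformed column can be
# attached directly. 'B_group_mean' is a hypothetical column name.
df_with_mean = df.assign(B_group_mean=df.groupby('A')['B'].transform('mean'))

print(agg_result)
print(transform_result)
print(df_with_mean)
```

This index alignment is what makes `transform` convenient when a per-group statistic is needed next to the original rows.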
Use `agg` when you want to apply different functions to different columns, or more than one function to the same column:
df.groupby('A').agg(['mean', 'std'])
df.groupby('A').agg(dict(B='sum', C=['mean', 'prod']))
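A rough sketch of what these calls return, assuming the same `df` as above. Passing a list of functions gives two-level columns. Named aggregation (available in pandas 0.25 and later) is shown as one alternative that gives flat column names. The output names `B_sum`, `C_mean` and `C_prod` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(dict(A=list('aabb'), B=[1, 2, 3, 4], C=[0, 9, 0, 9]))

# A list of functions is applied to every non-grouping column; the
# result has two-level columns of the form (column, function).
multi = df.groupby('A').agg(['mean', 'std'])

# A dict picks the functions per column.
per_column = df.groupby('A').agg(dict(B='sum', C=['mean', 'prod']))

# Named aggregation: keyword = (column, function) gives flat names.
# The keyword names here are hypothetical.
named = df.groupby('A').agg(
    B_sum=('B', 'sum'),
    C_mean=('C', 'mean'),
    C_prod=('C', 'prod'),
)

print(multi)
print(per_column)
print(named)
```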
 
    
    
        piRSquared
        
- fabulously tremendous answer! – mathopt Jul 28 '17 at 04:24
- By using `agg` how can I return to original data-frame `df` exploding the aggregated columns? – MAC Aug 12 '21 at 12:12
- @MAC To explode columns, use `transform`. – Chris Coffee Jun 24 '22 at 07:14





