I want to be able to feed a list as parameters to generate different aggregate functions in pandas. To make this more concrete, let's say I have this as data:
import numpy as np
import pandas as pd
np.random.seed(0)
df_data = pd.DataFrame({
    'group': np.repeat(['x', 'y'], 10),
    'val': np.random.randint(0, 10, 20)
})
So the first few rows of the data look like this:
| group | val | 
|---|---|
| x | 5 | 
| x | 0 | 
| x | 3 | 
I have a list of per-group percentiles that I want to compute.
percentile_list = [10, 90]
I tried to do this with a dictionary comprehension that builds one pd.NamedAgg per percentile, each wrapping a lambda function:
df_agg = df_data.groupby('group').agg(
    **{f'p{y}_by_dict': pd.NamedAgg('val', lambda x: np.quantile(x, y / 100)) for y in percentile_list},
)
But it doesn't give the expected result. To compare, here I compute the percentiles both by hand and with the dictionary comprehension:
df_agg = df_data.groupby('group').agg(
    p10_by_hand=pd.NamedAgg('val', lambda x: np.quantile(x, 0.1)),
    p90_by_hand=pd.NamedAgg('val', lambda x: np.quantile(x, 0.9)),
    **{f'p{y}_by_dict': pd.NamedAgg('val', lambda x: np.quantile(x, y / 100)) for y in percentile_list},
)
The result looks like this. The manually specified aggregations work, but the ones generated by the dictionary comprehension return the same values for both percentiles, and those values match p90, the last entry in the list. I guess every generated aggregation ends up using the last lambda function in the dictionary.
| group | p10_by_hand | p90_by_hand | p10_by_dict | p90_by_dict |
|---|---|---|---|---|
| x | 1.8 | 7.2 | 7.2 | 7.2 |
| y | 1.0 | 8.0 | 8.0 | 8.0 |
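The repeated values are consistent with Python's late-binding closures rather than anything specific to pandas: a lambda created inside a comprehension looks up the loop variable when it is called, not when it is defined. Here is a minimal sketch without pandas that shows the same symptom (the names `funcs` and `results` are just for illustration):

```python
# Each lambda refers to the variable `y` itself, not to the value
# `y` had when the lambda was created.
funcs = [lambda x: x * y for y in [10, 90]]

# By the time the lambdas are called, the comprehension has finished
# and `y` holds its final value, 90, for every lambda.
results = [f(1) for f in funcs]
print(results)  # [90, 90]
```

If the same thing happens in the dictionary comprehension, every generated lambda would compute the quantile for the last entry of percentile_list.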
How do I fix this? I don't have to use dictionary comprehension, as long as each aggregation can be specified programmatically.
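One common way around this, assuming the cause is the late-binding behaviour of closures, is to bind the current value of the loop variable as a default argument of the lambda. Default values are evaluated when the lambda is created, so each aggregation keeps its own quantile. The sketch below reuses the data from above; the parameter name `q` and the result name `df_fixed` are illustrative choices, not anything required by pandas:

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df_data = pd.DataFrame({
    'group': np.repeat(['x', 'y'], 10),
    'val': np.random.randint(0, 10, 20),
})
percentile_list = [10, 90]

df_fixed = df_data.groupby('group').agg(
    **{
        # `q=y / 100` is evaluated once per iteration, so each lambda
        # stores its own quantile instead of looking up `y` later.
        f'p{y}': pd.NamedAgg('val', lambda x, q=y / 100: np.quantile(x, q))
        for y in percentile_list
    }
)
print(df_fixed)
```

With this change the p10 and p90 columns should differ and agree with the values computed by hand. A small factory function that takes the quantile and returns the aggregation function would be another way to get the same early binding.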
 
    