I want to use the transform method on a groupby object with built-in functions (i.e. 'mean', 'sum', etc.) but keep np.nan values. For example,
import numpy as np
import pandas as pd
np.random.seed(0)
df = pd.DataFrame({'value': np.random.randint(0, 100, 8)}, index=list('aabbccdd'))
df.iloc[[0, 6]] = np.nan
df.groupby(level=0).transform('min')
yields
value
a 43.0
a 43.0
b 4.0
b 4.0
c 44.0
c 44.0
d 89.0
d 89.0
but I want:
value
a np.nan
a np.nan
b 4.0
b 4.0
c 44.0
c 44.0
d np.nan
d np.nan
Using my own function such as lambda x: x.min(skipna=False) will work... eventually, but I have millions of small groups, on which lambda and numpy methods take an eternity. Any suggestions?
Yes, there is a similar question, but note that in that question the OP wants to include np.nan groups, whereas I want to not skip over np.nan values within the groups.
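For reference, here is a rough sketch of one direction that stays on the built-in (fast) code paths: take the ordinary NaN-skipping transform('min'), separately flag every group that contains at least one NaN, and blank those groups out. The data values and the column name min_keep_nan are made up for illustration, and I have not benchmarked this on millions of groups, so treat it as an assumption rather than a confirmed solution.

```python
import numpy as np
import pandas as pd

# Small hand-written frame (illustrative values), NaN in groups 'a' and 'd'.
df = pd.DataFrame(
    {'value': [np.nan, 47.0, 64.0, 67.0, 67.0, 9.0, np.nan, 21.0]},
    index=list('aabbccdd'),
)

# Built-in transform: per-group minimum, NaN values are skipped.
value_min = df.groupby(level=0)['value'].transform('min')

# True for every row whose group contains at least one NaN.
has_nan = df['value'].isna().groupby(level=0).transform('any')

# Keep the minimum only for groups without NaN; otherwise propagate NaN.
# 'min_keep_nan' is a hypothetical column name used for this example.
df['min_keep_nan'] = np.where(has_nan.to_numpy(), np.nan, value_min.to_numpy())

print(df)
```

Both transform calls use string names, so no Python-level lambda is invoked per group; the same pattern should carry over to 'max', 'sum', 'mean' and so on by swapping the first transform.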