While answering this question I came across a behaviour I do not understand.
I am trying to fill NaN values in two specific columns, val2 and val3, but only in the rows that contain the first occurrence of each value in id. For some reason an in-place solution with fillna doesn't appear to work, and I don't understand why.
Let's assume this input dataframe:
    id  val1  val2  val3        date
0  102     9   NaN   4.0  2002-01-01
1  102     2   3.0   NaN  2002-03-03
2  103     4   NaN   NaN  2003-04-04
3  103     7   4.0   5.0  2003-08-09
4  103     6   5.0   1.0  2005-02-03
Desired output, with a fill value of -1:
    id  val1  val2  val3        date
0  102     9  -1.0   4.0  2002-01-01
1  102     2   3.0   NaN  2002-03-03
2  103     4  -1.0  -1.0  2003-04-04
3  103     7   4.0   5.0  2003-08-09
4  103     6   5.0   1.0  2005-02-03
Below is a solution that works and the inplace variant that does not work:
# True for the first occurrence of each id, False for later repeats
mask = ~df['id'].duplicated()
val_cols = ['val2', 'val3']
df.loc[mask, val_cols] = df.loc[mask, val_cols].fillna(-1)  # WORKS
df.loc[mask, val_cols].fillna(-1, inplace=True)  # DOES NOT WORK
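For reproducibility, here is a minimal self-contained sketch of the two variants, run on copies of the input frame above. The names inplace_df, assigned_df, nans_after_inplace and nans_after_assign are only introduced here for illustration. Depending on the pandas version, the in-place line may also emit a warning.

```python
import numpy as np
import pandas as pd

# Input frame from the question
df = pd.DataFrame({
    "id": [102, 102, 103, 103, 103],
    "val1": [9, 2, 4, 7, 6],
    "val2": [np.nan, 3.0, np.nan, 4.0, 5.0],
    "val3": [4.0, np.nan, np.nan, 5.0, 1.0],
    "date": pd.to_datetime(["2002-01-01", "2002-03-03", "2003-04-04",
                            "2003-08-09", "2005-02-03"]),
})

mask = ~df["id"].duplicated()   # first occurrence of each id (rows 0 and 2)
val_cols = ["val2", "val3"]

# Variant 1: in-place fillna on the .loc selection
inplace_df = df.copy()
inplace_df.loc[mask, val_cols].fillna(-1, inplace=True)
nans_after_inplace = int(inplace_df.loc[mask, val_cols].isna().sum().sum())

# Variant 2: assign the filled selection back through .loc
assigned_df = df.copy()
assigned_df.loc[mask, val_cols] = assigned_df.loc[mask, val_cols].fillna(-1)
nans_after_assign = int(assigned_df.loc[mask, val_cols].isna().sum().sum())

# Remaining NaNs in the masked rows for each variant
print(nans_after_inplace, nans_after_assign)  # 3 0
```

The three NaNs in the masked rows survive the in-place call but not the assignment. That would be consistent with the in-place call operating on a temporary object rather than on df itself, though that is only my assumption about what is going on.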
I am using Python 3.6.5, Pandas 0.23.0, NumPy 1.14.3.
Possibly this is intended behaviour, but I haven't been able to find a duplicate. As far as I can see, there's no chained indexing involved.