I have a DataFrame consisting of dates, other columns and a numerical value, where some value combinations in "other columns" could be missing, and I want to populate them from previous dates.
Example. Say the DataFrame is like below. You can see on 2016-01-01, we have data for (LN, A), (LN, B), (NY, A) and (NY, B) on columns (location, band).
date location band value
0 2016-01-01 LN A 10.0
1 2016-01-01 LN B 5.0
2 2016-01-01 NY A 9.0
3 2016-01-01 NY B 6.0
4 2016-01-02 LN A 11.0
5 2016-01-02 NY B 7.0
6 2016-01-03 NY A 10.0
Then you notice on 2016-01-02, we only have (LN, A) and (NY, B), but (LN, B) and (NY, A) are missing. Again, on 2016-01-03, only (NY, A) is available; all other three combinations are missing.
What I want to do is to populate the missing combinations of each date from its predecessor. Say for 2016-01-02, I would like to add two more rows, "rolled over" from 2016-01-01: (LN, B, 5.0) and (NY, A, 9.0) for columns (location, band, value). Same for 2016-01-03. So as to make the whole thing like below:
date location band value
0 2016-01-01 LN A 10.0
1 2016-01-01 LN B 5.0
2 2016-01-01 NY A 9.0
3 2016-01-01 NY B 6.0
4 2016-01-02 LN A 11.0
5 2016-01-02 NY B 7.0
6 2016-01-03 NY A 10.0
7 2016-01-02 LN B 5.0
8 2016-01-02 NY A 9.0
9 2016-01-03 LN A 11.0
10 2016-01-03 LN B 5.0
11 2016-01-03 NY B 7.0
Note rows 7-11 are populated from rows 1, 2, 4, 7 and 5, respectively. The order is not really important as I can always sort afterwards if all the data I need is present.
Anyone to help? Thanks a lot!