I have a DF with data like:
| Identificator | Date | Status | 
|---|---|---|
| ID1 | 2021-05-02 19:55:43 | OK | 
| ID2 | 2021-05-02 19:48:01 | FAILED | 
| ID3 | 2021-05-02 19:47:53 | OK | 
| ID1 | 2021-05-03 19:55:43 | FAILED | 
| ID2 | 2021-05-03 20:48:01 | OK | 
| ID3 | 2021-05-03 19:47:53 | OK | 
| ID1 | 2021-05-04 19:55:43 | FAILED | 
| ID2 | 2021-05-04 19:48:01 | OK | 
| ID3 | 2021-05-04 19:47:53 | OK | 
The lines are actually thousands and the IDs are hundreds, but this is not relevant. Let's assume that they are 1 event (line) per ID per day. I want to be able to take only X amount of days back - for example last 3 days and to create grid with number of days back as X axis, IDs as Y axis. Cells are days in a row without success:
| X | 3 | 2 | 1 | 
|---|---|---|---|
| ID1 | 0 | 1 | 2 | 
| ID2 | 1 | 0 | 0 | 
| ID3 | 0 | 0 | 0 | 
ID1 on 3 days back is 0, because there is a success on that day. But the same ID has 2 failures after that, so day 1 (the most recent day) is 2 in the cell for this ID.
The idea is to have a Dataframe ready for some kind of heatmap visualization on a later stage, but even in colored xlsx to have visual sense - where the lack of success is more severe, and where is OK
If we imagine ID4 with longer history:
OK,FAIL,OK,OK,FAIL,FAIL,OK,FAIL,FAIL,FAIL,OK
the row should look like
0,1,0,0,1,2,0,1,2,3,0
It's just like a counter of failures in a row which resets to 0 with every OK.
Any help is greatly appreciated