I have a traffic data that looks like this. Here, each column have data in format meters:seconds. Like in row 1 column 2, 57:9 represents 57 meters and 9 seconds.
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| 0:0 | 57:9 | 166:34 | 178:37 | 203:44 | 328:63 | 344:65 | 436:77 | 737:108 | None |
| 0:0 | 166:34 | 178:37 | 203:43 | 328:61 | 436:74 | 596:51 | 737:106 | None | None |
| 0:0 | 57:6 | 166:30 | 178:33 | 203:40 | 328:62 | 344:64 | 436:74 | 596:91 | None |
| 0:0 | 203:43 | 328:61 | None | None | None | None | None | None | None |
| 0:0 | 57:7 | 166:20 | 178:43 | 203:10 | 328:61 | None | None | None | None |
I want to extract meters values from the dataframe and store them in a list in ascending order. Then create a new dataframe in which the the column header will be the meters value (present in the list). Then it will match the meter value in the parent dataframe and add the corresponding second value beneath. The missing meters:second pair should be replaced by NaN and the current pair at the position would move to next column within same row.
The desired outcome is:
list = [0,57,166,178,203,328,344,436,596,737]
dataframe:
| 0 | 57 | 166 | 178 | 203 | 328 | 344 | 436 | 596 | 737 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 9 | 34 | 37 | 44 | 63 | 65 | 77 | NaN | 108 |
| 0 | NaN | 34 | 37 | 43 | 61 | NaN | 74 | 51 | 106 |
| 0 | 6 | 30 | 33 | 40 | 62 | 64 | 74 | 91 | None |
| 0 | NaN | NaN | NaN | 43 | 61 | None | None | None | None |
| 0 | 7 | 20 | 43 | 10 | 61 | None | None | None | None |
I know I must use a loop to iterate over whole dataframe. I am new to python so I am unable to solve this. I tried using str.split() but it work only on 1 column. I have 98 columns and 290 rows. This is just one month data. I will be having 12 month data. So, need suggestions and help.