I am trying this problem but not getting the right solution.
So, I have a data which has City and Months mapped to them
| City | Month |
|---|---|
| A | M1 |
| A | M2 |
| B | M3 |
| B | M4 |
| C | M5 |
| C | M8 |
I have created dummy variables and have marked them as binary in this manner
| City | M1 | M2 | M3 | M4 | M5 | M8 |
|---|---|---|---|---|---|---|
| A | 1 | 0 | 0 | 0 | 0 | 0 |
| A | 0 | 1 | 0 | 0 | 0 | 0 |
| B | 0 | 0 | 1 | 0 | 0 | 0 |
| B | 0 | 0 | 0 | 0 | 0 | 1 |
| C | 0 | 0 | 0 | 1 | 0 | 0 |
| C | 0 | 0 | 0 | 0 | 1 | 0 |
Now, the main problem is, I want to mark each location to a month in a single row, like this
| City | M1 | M2 | M3 | M4 | M5 | M8 |
|---|---|---|---|---|---|---|
| A | 1 | 1 | 0 | 0 | 0 | 0 |
| B | 0 | 0 | 1 | 0 | 0 | 1 |
| C | 0 | 0 | 0 | 1 | 1 | 0 |
Can anyone suggest how to move from table 2 to table 3 structure? I do not want to hard code them as different locations might get assigned random months in subsequent data. Getting dummy variables in easy but how do I get to last format? any useful functions existing in python for this?