I have imported a JSON file and now have a data frame in which one column (code) holds a list in each row:
index year   gvkey    code
0    1998    15686    ['TAX', 'ENVR', 'HEALTH']
1    2005    15372    ['EDUC', 'TAX', 'HEALTH', 'JUST']
2    2001    27486    ['LAB', 'TAX', 'HEALTH']
3    2008    84967    ['HEALTH', 'LAB', 'JUST']
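For reproducibility, the frame above can be rebuilt roughly as follows. This is a minimal sketch that assumes the code column holds real Python lists (as json loading usually produces) and not their string representations:

```python
import pandas as pd

# Sample rows copied from the table above; "code" holds Python lists.
df = pd.DataFrame({
    "year": [1998, 2005, 2001, 2008],
    "gvkey": [15686, 15372, 27486, 84967],
    "code": [
        ["TAX", "ENVR", "HEALTH"],
        ["EDUC", "TAX", "HEALTH", "JUST"],
        ["LAB", "TAX", "HEALTH"],
        ["HEALTH", "LAB", "JUST"],
    ],
})
print(df)
```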
What I want to get is something like the following:
index year   gvkey  TAX  ENVR HEALTH EDUC JUST LAB
0    1998    15686   1     1     1    0    0    0
1    2005    15372   1     0     1    1    1    0
2    2001    27486   1     0     1    0    0    1
3    2008    84967   0     0     1    0    1    1
Following the question "Pandas convert a column of list to dummies", I tried the following code (where df is my data frame):
s = pd.Series(df["code"])
l = pd.get_dummies(s.apply(pd.Series).stack()).groupby(level=0).sum()  # sum(level=0) was removed in pandas 2.0
I get the second part of the data right (the TAX, ENVR, HEALTH, EDUC, JUST and LAB variables), but lose the first part (year and gvkey).
How can I keep the year and gvkey variables?
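One possible direction, offered as a sketch and not a definitive answer: the attempt above only ever operates on df["code"], so its result cannot contain year or gvkey. Because the dummy rows keep the original index, they can be joined back onto the remaining columns. The example below assumes pandas 0.25 or later (for Series.explode) and that code holds Python lists; the names dummies and result are illustrative only.

```python
import pandas as pd

# Sample data mirroring the question; "code" holds Python lists.
df = pd.DataFrame({
    "year": [1998, 2005, 2001, 2008],
    "gvkey": [15686, 15372, 27486, 84967],
    "code": [
        ["TAX", "ENVR", "HEALTH"],
        ["EDUC", "TAX", "HEALTH", "JUST"],
        ["LAB", "TAX", "HEALTH"],
        ["HEALTH", "LAB", "JUST"],
    ],
})

# One row per list element (the original index is repeated), then one
# indicator column per code, collapsed back to one row per original index.
# max() keeps the result 0/1 even if a code is repeated within a list.
dummies = (
    pd.get_dummies(df["code"].explode())
    .groupby(level=0)
    .max()
    .astype(int)
)

# Join on the shared index so year and gvkey are kept.
result = df.drop(columns="code").join(dummies)
print(result)
```

The dummy columns come out in alphabetical order; selecting them explicitly, for example result[["year", "gvkey", "TAX", "ENVR", "HEALTH", "EDUC", "JUST", "LAB"]], gives the column order shown in the desired output.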