I have data frame as like df:
id features
100 [{'city': 'Rio'}, {'destination': '2'}]
110 [{'city': 'Sao Paulo'}]
135 [{'city': 'Recife'}, {'destination': '45'}]
145 [{'city': 'Munich'}, {'destination': '67'}]
167 [{'city': 'Berlin'}, {'latitude':'56'}, {'longitude':'30'}]
I have to extract column name and values from features column to separate columns as like:
id city destination latitude longitude
100 'Rio' '2' NaN NaN
110 'Sao Paulo' NaN NaN NaN
135 'Recife' '45' NaN NaN
145 'Munich' '67' NaN NaN
167 'Berlin' NaN '56' '30'
I tried to do it with usage idea as like:
1st method to extract:
df = df.explode('features').reset_index(drop = True)
result = pd.concat([df.drop(columns='features'),
pd.json_normalize(df['features'])], axis=1)
result is only id column.
2nd method:
df = df.explode('features').reset_index(drop = True)
df2 = df.set_index('id')
df2 = df2['features'].astype('str')
df2 = df2.apply(lambda x: ast.literal_eval(x))
df2 = df2.apply(pd.Series)
result = df2.reset_index()
result is very closed what I need:
id city destination latitude longitude
100 'Rio' NaN NaN NaN
100 NaN '2' NaN NaN
110 'Sao Paulo' NaN NaN NaN
135 'Recife' NaN NaN NaN
135 NaN '45' NaN NaN
145 'Munich' NaN NaN NaN
145 'Munich' '67' NaN NaN
167 'Berlin' NaN NaN NaN
167 NaN NaN '56' NaN
167 NaN NaN NaN '30'
How is possible to achieve an expected result in view of:
id city destination latitude longitude
100 'Rio' '2' NaN NaN
110 'Sao Paulo' NaN NaN NaN
135 'Recife' '45' NaN NaN
145 'Munich' '67' NaN NaN
167 'Berlin' NaN '56' '30'
Thanks