I have a column of data, it mostly has only one value, but some are multi-valued data connected with commas, and some missing values. I want to split multivalued data connected with commas into multiple lines.
I found a good solution in this (Split cell into multiple rows in pandas dataframe).
But this can only extract a few lines split from multi-valued data, I will splicing it with the original data, but my data is a large file, I can't clearly know where each multi-valued data is and delete them.
eg:
In [1]:data = {'id': [106452, 233649, 547531, 707841, 457009],
                'size': (np.nan, 1, 40, 40, '12,13')}
df = pd.DataFrame(data)
then:
In [2]:df_new = (df.set_index(['id'])
   .stack()
   .str.split(',', expand=True)
   .stack()
   .unstack(-2)
   .reset_index(-1, drop=True)
   .reset_index()
)
df_new
Out[1]: id  size
0    457009  12
1    457009  13
if:
In [3]:df_new = (df.set_index(['id'])
   .stack()
   .str.split(',', expand=True)
   .stack()
   .unstack(-2)
   .reset_index(-1, drop=True)
   .reset_index()
)
df = pd.concat([df,df_new]) # I know it's a bit stupid, but I just want to express the idea of merging.
df
Out[2]:
      id    size
0   106452  NaN
1   233649  1
2   547531  40
3   707841  40
4   457009  12,13
0   457009  12
1   457009  13
I want this:
Out[2]:
      id    size
0   106452  NaN
1   233649  1
2   547531  40
3   707841  40
4   457009  12
5   457009  13
I should How to do it?
 
    