This question has been asked for Excel: how to automatically insert a blank row after each group of data. Is there a function for doing the same in a pandas DataFrame?
1 Answer
There are probably other (possibly faster) ways to do this. Here is one attempt.
First create a sample dataframe to use for this example. We assume that the index defines the groups that need to be split:
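import pandas as pd

# Sample data: user_id identifies the group each row belongs to
users = {'user_id': ['A', 'A', 'A', 'A', 'B', 'B', 'B'],
         'status': ['S1', 'S2', 'S1', 'S3', 'S1', 'S2', 'S1'],
         'value': [100, 30, 100, 20, 50, 30, 60],
        }
df1 = pd.DataFrame(users, columns=['user_id', 'status', 'value'])
# Move user_id into the index so that the index defines the groups
df1.set_index('user_id', drop=True, inplace=True)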
Here is the output:
| user_id | status | value | 
|---|---|---|
| A | S1 | 100 | 
| A | S2 | 30 | 
| A | S1 | 100 | 
| A | S3 | 20 | 
| B | S1 | 50 | 
| B | S2 | 30 | 
| B | S1 | 60 | 
We create a new dataframe with empty rows, one for each unique element of the index:
df2 = pd.DataFrame(index=df1.index.drop_duplicates(keep='first'))
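Append the new rows and sort the index. `DataFrame.append` was removed in pandas 2.0, so `pd.concat` is used here, and a stable sort (`kind='mergesort'`) keeps each empty row after the data of its group:
df_merged = pd.concat([df1, df2])
df_merged.sort_index(kind='mergesort', inplace=True)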
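A quick way to confirm that the empty rows ended up where intended is to check, per group, that there is exactly one all-`NaN` row and that it comes last. The following is only a sketch: it rebuilds the sample data so that it runs on its own, uses `pd.concat` in place of `append` (an assumption for newer pandas versions), and the names `is_blank`, `blank_per_group` and `last_is_blank` are made up for this example.

```python
import pandas as pd

# Rebuild the sample frame with user_id as the index
df1 = pd.DataFrame(
    {'status': ['S1', 'S2', 'S1', 'S3', 'S1', 'S2', 'S1'],
     'value': [100, 30, 100, 20, 50, 30, 60]},
    index=pd.Index(['A', 'A', 'A', 'A', 'B', 'B', 'B'], name='user_id'),
)
df2 = pd.DataFrame(index=df1.index.drop_duplicates(keep='first'))

# Combine and sort; mergesort is stable, so rows keep their relative order
df_merged = pd.concat([df1, df2]).sort_index(kind='mergesort')

# True for rows in which every column is NaN
is_blank = df_merged.isna().all(axis=1)

# Number of blank rows per group, and whether the last row of each group is blank
blank_per_group = is_blank.groupby(level=0).sum()
last_is_blank = is_blank.groupby(level=0).last()

print(blank_per_group)
print(last_is_blank)
```

If a group shows a blank row that is not last, the sort was probably not stable.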
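The final output looks as follows (pandas will display `value` as floats, e.g. `100.0`, because the added `NaN` entries turn the integer column into a float column):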
| user_id | status | value | 
|---|---|---|
| A | S1 | 100 | 
| A | S2 | 30 | 
| A | S1 | 100 | 
| A | S3 | 20 | 
| A | NaN | NaN | 
| B | S1 | 50 | 
| B | S2 | 30 | 
| B | S1 | 60 | 
| B | NaN | NaN | 
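As one of the other possible ways mentioned at the start, the same result can be built group by group, which avoids sorting and therefore keeps the groups in their original order of appearance. This is a minimal sketch, not a built-in pandas feature: `add_blank_row_after_groups` is a hypothetical helper name, and it assumes, as above, that the index labels define the groups.

```python
import pandas as pd


def add_blank_row_after_groups(df: pd.DataFrame) -> pd.DataFrame:
    """Return a new frame with one all-NaN row after each index group.

    Hypothetical helper; assumes the index labels define the groups.
    """
    pieces = []
    # sort=False keeps the groups in order of first appearance
    for key, group in df.groupby(level=0, sort=False):
        # One row of NaN carrying the same index label and columns
        blank = pd.DataFrame(
            float('nan'),
            index=pd.Index([key], name=df.index.name),
            columns=df.columns,
        )
        pieces.extend([group, blank])
    return pd.concat(pieces)


df = pd.DataFrame(
    {'status': ['S1', 'S2', 'S1'], 'value': [50, 30, 100]},
    index=pd.Index(['B', 'B', 'A'], name='user_id'),
)
result = add_blank_row_after_groups(df)
print(result)
```

Building one small frame per group is convenient for a handful of groups; for a very large number of groups the concat-and-sort approach above may well be faster.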
        notiv
        