I have two DataFrames, df0 and df1, with df1.shape[0] > df0.shape[0].
df0 and df1 have the exact same columns.
Most of the rows of df0 are in df1.
The indices of df0 and df1 are
df0.index = range(df0.shape[0])
df1.index = range(df1.shape[0])
I then created dft
dft = pd.concat([df0, df1], axis=0, sort=False)
and removed duplicated rows with
dft.drop_duplicates(subset='this_col_is_not_index', keep='first', inplace=True)
I now have duplicate labels in the index of dft. For example:
dft.loc[3].shape
returns
(2, 38)
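To make the situation reproducible, here is a minimal sketch with toy data. The two small frames and the `val` column are made up for illustration (the real frames have 38 columns); only `this_col_is_not_index` is taken from the code above.

```python
import pandas as pd

# Hypothetical toy frames; both use a default 0..n-1 index like df0 and df1.
df0 = pd.DataFrame({"this_col_is_not_index": ["a", "b", "c", "d"],
                    "val": [1, 2, 3, 4]})
df1 = pd.DataFrame({"this_col_is_not_index": ["a", "b", "c", "x", "y"],
                    "val": [1, 2, 3, 5, 6]})

# Stack the frames; each one keeps its own index labels.
dft = pd.concat([df0, df1], axis=0, sort=False)

# Drop rows whose key column repeats, keeping the first occurrence.
dft = dft.drop_duplicates(subset="this_col_is_not_index", keep="first")

print(dft.index.tolist())  # [0, 1, 2, 3, 3, 4]
print(dft.loc[3].shape)    # (2, 2): two rows share the label 3
```

The row "d" from df0 and the row "x" from df1 both carry the label 3, which is the same kind of collision as in my real data.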
My aim is to give the second returned row a new label so that index 3 becomes unique.
This second row should be indexed dft.index.sort_values()[-1]+1, i.e. one past the current largest index value.
I would like to apply this operation on all duplicates.
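To make the desired result concrete, here is a rough sketch of the operation I have in mind. It assumes an integer index; `reindex_duplicates` and the toy frame are hypothetical names and data used only for illustration, and this may well not be the most idiomatic way to do it.

```python
import numpy as np
import pandas as pd

def reindex_duplicates(df):
    """Return a copy in which repeated index labels get fresh labels
    starting at max(index) + 1. Assumes a non-empty integer index."""
    # True for the second and later occurrences of each label.
    dup = df.index.duplicated(keep="first")
    new_index = df.index.to_numpy().copy()
    start = df.index.max() + 1
    # Hand out consecutive new labels to the duplicated positions.
    new_index[dup] = np.arange(start, start + dup.sum())
    out = df.copy()
    out.index = new_index
    return out

# Toy frame with the label 3 appearing twice.
dft = pd.DataFrame({"val": [1, 2, 3, 4, 5, 6]}, index=[0, 1, 2, 3, 3, 4])
fixed = reindex_duplicates(dft)
print(fixed.index.tolist())  # [0, 1, 2, 3, 5, 4]
```

The first row labelled 3 keeps its label and the second one becomes 5. If the original labels did not matter, dft.reset_index(drop=True) or ignore_index=True in pd.concat would also give a unique index, but I want to keep the labels of the non-duplicated rows.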
References :
Python Pandas: Get index of rows which column matches certain value