I am trying to extract common rows from several dataframes using pd.concat:
>>> import numpy as np
>>> import pandas as pd
>>> x = np.random.random(size=(5, 3))
>>> df1 = pd.DataFrame(x)
>>> df2 = pd.DataFrame(x[1:3])
>>> df3 = pd.DataFrame(x[2:4])
>>> df1
          0         1         2
0  0.257662  0.453542  0.805230
1  0.060493  0.463148  0.715994
2  0.452379  0.470137  0.965709
3  0.447546  0.964252  0.163247
4  0.187212  0.973557  0.871090
>>> df2
          0         1         2
0  0.060493  0.463148  0.715994
1  0.452379  0.470137  0.965709
>>> df3
          0         1         2
0  0.452379  0.470137  0.965709
1  0.447546  0.964252  0.163247
As you can see, only the row 0.452379  0.470137  0.965709 is common to all three dataframes. To extract it, I tried:
>>> pd.concat([df1, df2, df3], join='inner')
          0         1         2
0  0.257662  0.453542  0.805230
1  0.060493  0.463148  0.715994
2  0.452379  0.470137  0.965709
3  0.447546  0.964252  0.163247
4  0.187212  0.973557  0.871090
0  0.060493  0.463148  0.715994
1  0.452379  0.470137  0.965709
0  0.452379  0.470137  0.965709
1  0.447546  0.964252  0.163247
Thus, join==inner doesn't seem to work! I should also point out that ignore_index=True has no effect on the behavior. In an article on Real Python, using axis=1 is suggested. However, it is wrong in my opinion:
>>> pd.concat([df1, df2, df3], join='inner', axis=1)
          0         1         2         0         1         2         0         1         2
0  0.257662  0.453542  0.805230  0.060493  0.463148  0.715994  0.452379  0.470137  0.965709
1  0.060493  0.463148  0.715994  0.452379  0.470137  0.965709  0.447546  0.964252  0.163247
What is wrong with what I am doing? Also, how would I extract common rows from several dataframes if this way doesn't work? I am using Pandas version 0.25.3.
 
     
     
     
     
    