Following this answer:
https://stackoverflow.com/a/47107164/11462274
I try to create a DataFrame that is only the lines not found in another DataFrame, however, not according to all columns, but according to only some specific columns, so I tried to do it this way:
import pandas as pd
df1 = pd.DataFrame(data = {'col1' : [1, 2, 3, 4, 5, 3], 
                           'col2' : [10, 11, 12, 13, 14, 10],
                           'col3' : [1,5,7,9,6,7]}) 
df2 = pd.DataFrame(data = {'col1' : [1, 2, 3],
                           'col2' : [10, 11, 12],
                           'col3' : [1,5,8]})
df_merge = df1.merge(df2.drop_duplicates(), on=['col1','col3'], 
                   how='left', indicator=True)
df_merge = df_merge.query("_merge == 'left_only'")[df1.columns]
print(df_merge)
But note that when not using all the columns, they change their name like col2 to col2_x:
   col1  col2_x  col3  col2_y     _merge
0     1      10     1    10.0       both
1     2      11     5    11.0       both
2     3      12     7     NaN  left_only
3     4      13     9     NaN  left_only
4     5      14     6     NaN  left_only
5     3      10     7     NaN  left_only
So when I try to create the final DataFrame without the unnecessary columns, the unused columns are not found to generate the desired filter:
KeyError(f"{not_found} not in index")
KeyError: "['col2'] not in index"
 
     
    