I have two very large CSV files. Each has a single column of integers. For every integer in dfA, I need to check whether it is also in dfB. If it is, I need to remove that item from dfA.
My first thought was to loop through dfA and check each value against dfB, but looping is far too slow.
dfA:
        0
0  9312969810
1  3045897298
2  8162414592
3  2030000000
4  7876904982
dfB:
        0
0  2030000000
1  2030156119
2  2030389149
3  2030641047
4  2030693850
output:
        0
0  2030156119
1  2030389149
2  2030641047
3  2030693850
Since 2030000000 is in dfB, it needs to be removed from dfA.
Does anyone have a better way? Thanks.
Edit: the CSV for dfB is 2 GB and the CSV for dfA is 5 MB.
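
Given those sizes, here is a minimal sketch of a non-looping approach using set lookups. It assumes the goal is as stated in the text (drop from dfA every value that also appears in dfB), that both files hold one integer per line with no header row, and that dfA fits in memory while dfB is streamed line by line. The function names and file names are hypothetical, and it uses plain Python rather than pandas.

```python
def drop_values_present_in_b(a_values, b_lines):
    """Return a_values without any value that also occurs in b_lines.

    a_values: list of ints from the small file (assumed to fit in memory).
    b_lines:  iterable of text lines from the large file, one integer per
              line. It is consumed lazily, so the large file is never
              loaded whole.
    """
    remaining = set(a_values)  # values of A not yet seen in B
    found = set()              # values of A that do occur in B
    for line in b_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        value = int(line)
        if value in remaining:  # average O(1) set lookup
            found.add(value)
            remaining.discard(value)
            if not remaining:
                break  # every value of A was found, no need to read on
    # Preserve the original order of A.
    return [v for v in a_values if v not in found]


def read_ints(path):
    """Read a one-column, header-less file of integers into a list."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]


# Usage with hypothetical file names:
# a = read_ints("dfA.csv")
# with open("dfB.csv") as fb:
#     kept = drop_values_present_in_b(a, fb)
```

Each line of dfB costs one set lookup, so the work is a single pass over the 2 GB file instead of a scan of dfB for every value in dfA. If the sample output is what is actually wanted (dfB with dfA's values removed), the same idea should apply with the roles swapped: build a set from dfA, stream dfB, and write out each line whose value is not in that set.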
 
    