I'm new to pandas and I would appreciate your help.
I have two files that I need to merge on some shared columns; one of them is really big (100 GB+). I need to skip some metadata lines at the top of the big file, so I pass an open file object (a buffer) to the read_csv method.
First, I tried pandas. However, when I tried to open the file this way, the process was killed by the OS (presumably out of memory):
with open(self.all_file, 'r') as f:
    pos = f.tell()
    line = f.readline()
    while line.startswith('##'):
        pos = f.tell()
        line = f.readline()
    # rewind to the first non-'##' line (the header row)
    f.seek(pos)
    return pd.read_csv(f, sep='\t')
Afterwards, I tried dask instead of pandas, but dask's read_csv can't take a buffer as input, so this fails:

    return dd.read_csv(f, sep='\t')
How can I open the large file as a buffer (or otherwise skip the metadata lines) and merge the two dataframes?
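To make the question concrete, here is the kind of workaround I have been considering (a sketch, not a working solution): count the leading '##' lines first, then pass a plain file path together with skiprows to read_csv, and merge the big file chunk by chunk via chunksize so it never has to fit in memory at once. The function names below are mine, and I am assuming skiprows would also let dask's read_csv accept a path instead of a buffer.

```python
import pandas as pd

def count_meta_lines(path):
    # Count the leading lines that start with '##' so they can be
    # skipped with read_csv's skiprows parameter instead of a buffer.
    n = 0
    with open(path) as f:
        for line in f:
            if line.startswith('##'):
                n += 1
            else:
                break
    return n

def merge_in_chunks(big_path, small_df, on, chunksize=1_000_000):
    # Stream the big file in chunks and merge each chunk against the
    # small dataframe, so the big file is never fully loaded in memory.
    skip = count_meta_lines(big_path)
    pieces = []
    for chunk in pd.read_csv(big_path, sep='\t', skiprows=skip,
                             chunksize=chunksize):
        pieces.append(chunk.merge(small_df, on=on))
    return pd.concat(pieces, ignore_index=True)
```

Note the caveat: this only stays memory-safe if the merge result itself fits in memory (e.g. an inner join that keeps few rows per chunk); is that a reasonable direction, or is there a cleaner way with dask?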
Thank you!