I am trying to concatenate multiple files in a directory. I am able to concatenate the files, however I am getting duplicated rows and columns. If a field doesn't exist in the input file, the output file should ideally show NaN or null.
Ideally the output would look as follows:
Merged.csv
   ItemID  Price  Discount ItemName          Desc  Barcode
0       1      4       2.0      Toy      For kids   667865
1       2     10       1.0     Game    For adults   998364
2       3      8       NaN      NaN  For everyone    43831
file1.csv
   ItemID  Price  Discount
0       1      4       2.0
1       2     10       1.0
2       3      8       NaN
file2.csv
   ItemID ItemName          Desc
0       1      Toy      For kids
1       2     Game    For adults
2       3      NaN  For everyone
file3.CSV
   ItemID  Barcode
0       1   667865
1       2   998364
2       3    43831
The following is the code I am working on:
import glob , os
import pandas as pd
path = '/location/test/'
all_files = glob.glob(os.path.join(path, "*.csv"))
pd.concat([pd.read_csv(f, sep=';').assign(file=os.path.basename(f)) for f in all_files]).to_csv('merged.csv', index=False)
 
    