I am receiving data in a file, file1.dat, with fields separated by the | character.
109|LK98765|2|18.07.2021|01|abc1|01|abc2|01|abc3
110|LK67665|2|10.10.1987|02|abc1|01|abc2|01|abc3
111|LK43465|2|23.07.2005|03|abc1|01|abc2|01|abc3
112|LK23265|2|13.02.2012|04|abc1|01|abc2|01|abc3
My requirement is to add a header to the file and convert it to .csv with , as the field separator.
To achieve this, I have written the Python code below.
To add the header:
import csv

def fn_add_header(file_name):
    with open(file_name) as f:
        r = csv.reader(f, delimiter='|')
        data = [line for line in r]
    # reopen in text mode; 'wb' would make csv.writer fail in Python 3
    with open(file_name, 'w', newline='') as f:
        w = csv.writer(f, delimiter='|')
        w.writerow(['ID','SEC_NO','SEC_CD','SEC_DATE','SEC_ID1','SEC_DESC1','SEC_ID2','SEC_DESC2','SEC_ID3','SEC_DESC3'])
        w.writerows(data)
To convert the file to .csv:
import fnmatch
import os
import shutil

def fn_replace(filename, directory):
    final_file = os.path.join(directory, "file1.csv")  # avoid the "\f" escape in "\file1.csv"
    for file in os.listdir(directory):                 # list the directory, not the file name
        if fnmatch.fnmatch(file.lower(), filename.lower()):
            shutil.copyfile(os.path.join(directory, file), final_file)
            cmd = ["sed", "-i", "-e", "s/|/,/g", final_file]
            ret2, out2, err2 = fn_run_cmd(cmd)
The above code works fine, and I get the converted file as:
ID,SEC_NO,SEC_CD,SEC_DATE,SEC_ID1,SEC_DESC1,SEC_ID2,SEC_DESC2,SEC_ID3,SEC_DESC3
109,LK98765,2,18.07.2021,01,abc1,01,abc2,01,abc3
110,LK67665,2,10.10.1987,02,abc1,01,abc2,01,abc3
111,LK43465,2,23.07.2005,03,abc1,01,abc2,01,abc3
112,LK23265,2,13.02.2012,04,abc1,01,abc2,01,abc3
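As an aside, both steps above (adding the header and swapping the delimiter) can be collapsed into a single pure-Python pass, with no sed subprocess; a minimal sketch (function and file names are placeholders, not from my framework):

```python
import csv

HEADER = ['ID', 'SEC_NO', 'SEC_CD', 'SEC_DATE', 'SEC_ID1', 'SEC_DESC1',
          'SEC_ID2', 'SEC_DESC2', 'SEC_ID3', 'SEC_DESC3']

def convert_dat_to_csv(src, dst):
    """Read the pipe-delimited .dat file and write a comma-separated
    .csv file with the header row prepended."""
    with open(src, newline='') as fin, open(dst, 'w', newline='') as fout:
        reader = csv.reader(fin, delimiter='|')
        writer = csv.writer(fout)  # default delimiter is ','
        writer.writerow(HEADER)
        writer.writerows(reader)
```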
I am facing an issue while reading the converted file1.csv through the yml. To read the file I am using the code below:
frameworkComponents:
 today_file:
   inputDirectoryPath: <path of the file>
   componentName: today_file
   componentType: inputLoader
   hadoopfileFormat: csv
   csvSep: ','
selectstmt:
 componentName: selectstmt
 componentType: executeSparkSQL
 sql: |-
      select ID,SEC_NO,
             SEC_CD,SEC_DATE,
             SEC_ID1,SEC_DESC1,
             SEC_ID2,SEC_DESC2,
             SEC_ID3,SEC_DESC3
        from today_file
write_file:
   componentName: write_file
   componentType: outputWriter
   hadoopfileFormat: avro
   numberofPartition: 1
   outputDirectoryPath: <path of the file>
precedence:
 selectstmt:
     dependsOn:
        today_file: today_file
 write_file:
     dependsOn:
        selectstmt: selectstmt
When I run the yml I get the error below:
Unable to infer schema for CSV. It must be specified manually.
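For context, Spark typically raises "Unable to infer schema for CSV" when it finds no readable data at the input path (wrong path, an empty file, or a directory with no matching files). To rule that out before running the yml, I can sanity-check the generated file with a rough sketch like this (the function name and path are placeholders):

```python
import csv
import os

def sanity_check(path):
    """Confirm the converted CSV exists, is non-empty, and every data row
    has the same number of fields as the header row."""
    assert os.path.getsize(path) > 0, "file is empty"
    with open(path, newline='') as f:
        rows = list(csv.reader(f))
    header, data = rows[0], rows[1:]
    assert all(len(r) == len(header) for r in data), "ragged rows"
    return header, len(data)
```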