I have big_file.csv containing a bunch of company information. Here's a snippet:
CompanyName, CompanyNumber,RegAddress.CareOf,...
"! # 1 AVAILABLE LOCKSMITH LTD","05905727","",...
"!NSPIRED LIMITED","06019953",""...
"CENTRE FOR COUNSELLING, PSYCHOTHERAPY AND TRAINING LTD","07981734",""...
I only need the CompanyName and CompanyNumber fields, so I did the following:
cut -d, -f 1,2 big_file.csv > big_file_names_codes_only.csv
As you can see (and I understand why), the third entry in big_file.csv gets cut after the first comma, which is actually part of CompanyName. I know how to remove the first comma with sed (but that would break the whole CSV structure), so I was wondering if any of you know how to remove the commas from the first field only (it's always in position 1), which is a quoted "string, with, commas, or not and non alphanum chars!".
So basically the intermediate output I am looking for is:
CompanyName, CompanyNumber
"! # 1 AVAILABLE LOCKSMITH LTD","05905727"
"!NSPIRED LIMITED","06019953"
"CENTRE FOR COUNSELLING PSYCHOTHERAPY AND TRAINING LTD","07981734"
But with the cut command above, this last line becomes:
"CENTRE FOR COUNSELLING, PSYCHOTHERAPY AND TRAINING LTD"
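One possible way to get that intermediate output, as a rough sketch rather than a tested solution for the full file: loop a sed substitution that deletes commas inside the first quoted field, then run cut as before. The function name strip_first_field_commas is made up for illustration, and the sketch assumes the first field is always double-quoted and never contains an embedded double quote.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): delete every comma inside the
# first double-quoted field of each line read from stdin.
# Assumes the first field never contains an embedded double quote.
strip_first_field_commas() {
  # :a / ta loop: remove one comma that sits before the closing quote of
  # field 1, then retry until the substitution no longer matches.
  sed -E -e ':a' -e 's/^("[^",]*),/\1/' -e 'ta'
}

# Demo on rows from the snippet (trailing fields shortened to a single "").
printf '%s\n' \
  'CompanyName, CompanyNumber,RegAddress.CareOf' \
  '"!NSPIRED LIMITED","06019953",""' \
  '"CENTRE FOR COUNSELLING, PSYCHOTHERAPY AND TRAINING LTD","07981734",""' |
  strip_first_field_commas | cut -d, -f1,2
```

On the real file this would presumably be run as strip_first_field_commas < big_file.csv | cut -d, -f1,2 > big_file_names_codes_only.csv. The unquoted header line is left alone because the pattern only matches lines that start with a double quote.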
Once I get this intermediate output, I need to strip all non-alphanumeric characters and leading spaces from the company name, which works very well with this:
sed -i 's/[^a-zA-Z0-9 ,]//g; s/^[ \t]*//' big_file_names_codes_only.csv
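To preview what these two expressions do without editing a file in place, they can be run as a filter on a sample line. This is only a sketch: clean_names is a made-up name, and [[:blank:]] is swapped in for [ \t] on the assumption that not every sed reads \t inside brackets as a tab.

```shell
#!/bin/sh
# Hypothetical filter (name is illustrative): same cleanup as above, on stdin.
clean_names() {
  # 1) drop everything except letters, digits, spaces and commas
  # 2) drop leading spaces/tabs ([[:blank:]] is the POSIX class for both)
  sed -e 's/[^a-zA-Z0-9 ,]//g' -e 's/^[[:blank:]]*//'
}

# Demo on one row of the intermediate output.
printf '%s\n' '"!NSPIRED LIMITED","06019953"' | clean_names
```

Note that the comma is kept by the first expression, so any comma still inside a company name at this point would survive and split the name into two fields.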
In the end my file should be:
CompanyName, CompanyNumber
1 AVAILABLE LOCKSMITH LTD,05905727
NSPIRED LIMITED,06019953
CENTRE FOR COUNSELLING PSYCHOTHERAPY AND TRAINING LTD,07981734