I have a problem processing text files in a data-processing pipeline written in shell and Python.
What is a better way to print text files to stdout so they can be fed through the pipeline (Perl in the script tokenize.sh, then Python)?
My current shell script works fine except that it does not output the last line of each .txt file. I'm not sure whether I should use cat, echo, or something else (instead of while IFS= read line ...) for better performance.
for f in path/to/dir/*.txt; do
  while IFS= read line
  do
    echo "$line"
  done < "$f" \
  | tokenize.sh \
  | python clean.py \
  >> "$f.clean.txt"
  rm "$f"
  mv "$f.clean.txt" "$f"
done
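Since I'm wondering about cat specifically, this is the cat-based version I have in mind (untested sketch, assuming cat is acceptable for these plain-text files):

for f in path/to/dir/*.txt; do
  # cat writes the whole file to stdout, including a final line
  # that has no trailing newline
  cat "$f" \
  | tokenize.sh \
  | python clean.py \
  > "$f.clean.txt"
  # mv overwrites the original, so no separate rm is needed
  mv "$f.clean.txt" "$f"
done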
I tried using awk as below and it seems to work well.
for f in path/to/dir/*.txt; do
  awk '{ print }' "$f" \
  | tokenize.sh \
  | python clean.py \
  >> "$f.clean.txt"
  rm "$f"
  mv "$f.clean.txt" "$f"
done
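I suspect the missing last line happens because some files have no trailing newline, so read fails on the final (incomplete) line. If the while/read approach has to stay, a variant like this (untested) should still emit that last line, though I don't know how it compares in performance:

for f in path/to/dir/*.txt; do
  # read returns non-zero on an unterminated final line but still fills $line,
  # so the extra test keeps the loop body running for it
  while IFS= read -r line || [ -n "$line" ]
  do
    printf '%s\n' "$line"
  done < "$f" \
  | tokenize.sh \
  | python clean.py \
  > "$f.clean.txt"
  mv "$f.clean.txt" "$f"
done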