I have a text file that is a couple of GB in size, and I am trying to shuffle it as part of a pipeline.

For example, these are the commands I am using. They are not efficient, and in fact the pipe does not seem to start until the whole file has been read, though I may be wrong about that.

shuf HUGETEXTFILE.txt|some command

cat HUGETEXTFILE.txt|sort -R |some command

I also tried to use

split -n 1/numberofchunks HUGETEXTFILE.txt|sort -R|some command 

But the piping ends when the first chunk finishes.

I am looking for an efficient way to shuffle a text file within a pipe, because I do not want to write hundreds of files every time I need a new shuffle or random distribution.

thanks

yarun can

1 Answer

You can try this approach:

shuf bigfile.txt |
  while IFS= read -r line; do
    # Each shuffled line arrives here; printf, unlike echo, expands '%s\n'.
    printf '%s\n' "$line" | grep "sample"
  done

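Setting IFS to the empty string keeps read from trimming leading and trailing whitespace, and -r stops it from interpreting backslashes. The splitting into lines is done by read itself, which consumes input up to each newline.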
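
Note that any exact shuffle has to read the whole input before it can print its first line, because the last input line may end up first in the output. If an approximate shuffle is acceptable, one option is to shuffle within fixed-size blocks, so that output starts as soon as the first block is full and no intermediate files are written. The following is only a sketch of that idea: chunk_shuffle is a hypothetical helper name, the default block size is an arbitrary assumption, and lines never move outside their own block, so the result is not a uniform shuffle of the whole file.

```shell
#!/bin/sh
# chunk_shuffle [N]: hypothetical helper (not a standard utility).
# Reads lines on stdin, shuffles them within consecutive blocks of N lines,
# and prints each block as soon as it is complete.
# Memory use is bounded by N lines; N defaults to 100000 (an arbitrary choice).
chunk_shuffle() {
  awk -v n="${1:-100000}" '
    function flush_block(   i, j, tmp) {
      # Fisher-Yates shuffle of buf[1..count], printing each picked line.
      for (i = count; i >= 1; i--) {
        j = int(rand() * i) + 1          # uniform index in 1..i
        tmp = buf[i]; buf[i] = buf[j]; buf[j] = tmp
        print buf[i]
      }
      count = 0
    }
    BEGIN { srand() }                    # seed from the current time
    { buf[++count] = $0; if (count >= n) flush_block() }
    END { flush_block() }                # emit the final, possibly short, block
  '
}
```

It could then be used as: chunk_shuffle 100000 < HUGETEXTFILE.txt | some command. A larger block size gives a result closer to a full shuffle at the cost of more memory and a longer wait before the first line appears. Because srand() seeds from the clock, two runs started within the same second may produce the same order in some awk implementations.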