1

I have the following commands:

time grep -F -f 'in2.txt' test.fastq
time zgrep -F -f 'in2.txt' test.fastq.gz

There are about 30 search terms, and the files are ~5 GB. However, I notice that on one computer, an Amazon spin-up, the search takes 3-5x longer to finish. What is limiting the speed? Should I spin up an ECS that has more memory or a faster CPU?

ahdee
  • 11
  • 2

1 Answer

2

CPU and I/O. With a small set of search terms (30 is quite small), you are most likely I/O bound, and possibly CPU bound. Keep in mind that zgrep also has to decompress the file, which costs CPU time that plain grep does not spend. You will not be memory bound.

The right answer, of course, is to test it. You can do this a few ways, including opening two terminals and running dstat in one while you run the command in question in the other. As long as the command runs for at least a couple of seconds, you should get an idea of which resources are maxed out (at 100% or at some steady-state value) and which are not.
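Another rough check uses the numbers that `time` already prints: if user + sys CPU time is close to the real (wall-clock) time, the run was mostly waiting on the CPU; if it is far below, the run was mostly waiting on something else, usually disk. Here is a minimal sketch of that comparison. The helper name `classify_bound` and the 0.8 cutoff are my own assumptions for illustration, not anything standard.

```shell
# classify_bound REAL USER SYS
# Hypothetical helper: takes the three values reported by `time`, in seconds,
# and prints a rough guess at the bottleneck.
# The 0.8 threshold is an arbitrary assumption; adjust it to taste.
classify_bound() {
  awk -v real="$1" -v user="$2" -v sys="$3" 'BEGIN {
    if (real <= 0) { print "unknown"; exit }
    ratio = (user + sys) / real        # share of wall-clock time spent on CPU
    if (ratio >= 0.8) print "cpu-bound"
    else              print "io-bound"
  }'
}

# Example with made-up timings (not measurements from the question):
classify_bound 10 9.5 0.3    # CPU time nearly equals wall time
classify_bound 60 8 4        # CPU time is a small share of wall time
```

Run the same grep on both machines, feed each machine's timings through this, and compare. Be aware that a second run on the same file may be served from the page cache and look much more CPU bound than the first.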

Dave M
  • 13,250