I have a huge data source that I'm filtering using some greps.
Here's basically what I'm doing right now:
#!/bin/bash
param1='something'
param2='another'
param3='yep'

# ratio of lines matching "$param2-" to lines not matching it,
# after the same surrounding filters on the same huge input
echo $(avro-read /log/huge_data | grep "$param1" | grep "$param2-" | grep "$param3" | wc -l) / $(avro-read /log/huge_data | grep "$param1" | grep -v "$param2-" | grep "$param3" | wc -l) | bc -l
Notice that I'm doing almost the same filtering twice (with a single difference the second time: grep -v instead of grep), taking the count of each, and dividing one count by the other. This is definitely hacky, but I'd like to speed it up a bit by performing the initial filtering only once, without using a temp file.
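For reference, here's the direction I'm imagining: a single pass where awk keeps both counts and divides at the end. This is only a sketch, and it assumes param2 contains no regex metacharacters, since awk's index() does a fixed-string match where grep does a regex match:

#!/bin/bash
param1='something'
param2='another'
param3='yep'

# One pass over the data: the shared filters run once, then awk counts
# lines containing "$param2-" (m) and lines not containing it (n), and
# prints m / n at the end.
avro-read /log/huge_data | grep "$param1" | grep "$param3" \
    | awk -v pat="$param2-" '
        index($0, pat) { m++; next }
                       { n++ }
        END            { if (n) printf "%.10f\n", m / n }'

awk only ever holds one line in memory, so this should cost about the same as the grep chain while reading the input once.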
I tried using a fifo, but I'm not sure whether it's possible to have two processes in one script read from it, or how to make a third process "wait" until both are done before computing the final result. I also looked into tee, but again I wasn't sure how to synchronize the resulting subprocesses.
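For what it's worth, here's as far as I got with the fifo + tee combination. One thing I realized: two readers on a single fifo won't each see the whole stream, since every line is consumed by exactly one reader, so tee has to be the thing that duplicates the data. This is just a sketch; the mktemp directory and the "copy" fifo name are placeholders, and the two one-line count files are tiny (the huge stream itself never touches disk):

#!/bin/bash
param1='something'
param2='another'
param3='yep'

work=$(mktemp -d)
mkfifo "$work/copy"

# Background job: count the lines that DO contain "$param2-".
grep -c "$param2-" "$work/copy" > "$work/matched" &

# Foreground pipeline: the expensive read and the shared filtering
# happen once; tee duplicates the stream into the fifo for the
# background job while the main pipe counts the non-matching lines.
avro-read /log/huge_data | grep "$param1" | grep "$param3" \
    | tee "$work/copy" | grep -vc "$param2-" > "$work/unmatched"

# wait is the synchronization point: it blocks until the background
# counter has seen EOF on the fifo and written its count.
wait
echo "$(cat "$work/matched") / $(cat "$work/unmatched")" | bc -l
rm -r "$work"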
EDIT: Solved this myself using https://superuser.com/a/561248/43649, but marked another suggestion as the answer.