I've got a couple of text files (a.txt and b.txt) containing a bunch of URLs, each on a separate line. Think of these files as blacklists. I want to sanitize my c.txt file, scrubbing it of any of the strings in a.txt and b.txt. My approach is to rename c.txt to c_old.txt, and then build a new c.txt by grepping out the strings in a.txt and b.txt.
type c_old.txt | grep -f a.txt -v | grep -f b.txt -v > c.txt
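In full, the batch file boils down to nothing more than that rename followed by the pipeline above (simplified here; grep is a separately installed Windows port):

rem Rotate the current list, then rebuild it minus anything blacklisted in a.txt or b.txt.
ren c.txt c_old.txt
type c_old.txt | grep -f a.txt -v | grep -f b.txt -v > c.txt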
For a long while, it seemed like my system was working just fine. However, lately, I've lost nearly everything that was in c.txt, and new additions are being removed despite not occurring in a.txt or b.txt. I have no idea why.
P.S. I'm on Windows 7, so grep has been installed separately. I'd appreciate solutions that don't require installing additional Linux tools.
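(If it helps, the only built-in alternative I'm aware of is findstr. Something like the line below looks like it should do the same filtering, with /V inverting the match, /L forcing literal matching, and /G: reading the search strings from a file, but I haven't tested whether it behaves identically to grep:)

rem Untested: the same idea using only the built-in findstr instead of grep.
type c_old.txt | findstr /v /l /g:a.txt | findstr /v /l /g:b.txt > c.txt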
Update: I've discovered one mistake in my batch file. I used ren c.txt c_old.txt without realising that ren refuses to overwrite the target file if it exists. Since c_old.txt was never replaced, the type c_old.txt | ... pipeline always read the same stale data, and its output then overwrote c.txt. That explains why new additions to c.txt were being wiped out, but it does not explain why so many entries that were in c.txt have gone missing.
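The fix for that first mistake seems simple enough: delete the stale copy before renaming (or use move /Y c.txt c_old.txt, which does overwrite):

rem Remove the previous snapshot first so the rename can't fail and leave stale data behind.
if exist c_old.txt del c_old.txt
ren c.txt c_old.txt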