Why is find exec grep > file an Infinite Loop?

Question

I was trying to collect all of the Message-ID: headers (lines) in a directory with 200K .eml (plain text) files. A bit naively, I said:

find -type f -exec grep -Fi "message-id:" {} \; > messageids.txt

I let it run overnight, since I figured it would take a while to grep through that many files. A bit to my surprise this morning, messageids.txt is 1.7TB and my partition is full. I realize that what must have happened is that grep's own output is being picked up as input, but I wouldn't (and still don't, intuitively) expect it to repeat endlessly. Which means that my understanding of the forces at play isn't as strong as it should be.

Can anyone provide a detailed explanation of how the command above works and why this infinite loop should (I assume) be expected? Thanks.

Update: The way I'd expect it to work is that find finds a list of files, and on each one of them grep is called. So at some point grep is called on messageids.txt. If I were to do this on, say, a sort command, messageids.txt would be created as soon as the command executes (possibly whacking it, if it already existed), but it wouldn't be populated until the command completes. In this case, for the loop to be infinite, the file must be getting populated before the output is complete, but in such a way that grep's input is perpetually keeping up with its own output. That's the bit that doesn't behave like I'd expect, and I was hoping for a detailed explanation of how this process chain is executing so I can firm up my Linux fundamentals.

score 3 · Accepted Answer · answered Feb 22 '12 at 17:36

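The shell opens messageids.txt for writing before find starts, so the file sits in the directory that find is walking, and find eventually runs grep on it. Every time grep finds a line with message-id in it, it soon writes it to messageids.txt. And every time it writes a line with message-id in it to messageids.txt, it soon finds it, so grep never reaches the end of the file. This is a trivial endless loop.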
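The first half of that cycle, the output file being visible to find at all, can be checked safely without triggering the loop. The following is a minimal sketch, assuming a scratch directory from mktemp and made-up file names (one.eml, out.txt); it only lists files and never greps the output file.

```shell
# Scratch directory so nothing in the real mail folder is touched.
tmp=$(mktemp -d)
printf 'Message-ID: <a@example.com>\n' > "$tmp/one.eml"

# The shell creates out.txt before it starts find,
# so find sees out.txt as an ordinary file in the tree.
find "$tmp" -type f > "$tmp/out.txt"

# The listing contains both one.eml and out.txt itself.
cat "$tmp/out.txt"
```

Because out.txt shows up in its own listing, a command that reads every listed file while appending matches to out.txt ends up reading its own output.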

David Schwartz

62,365

score 0 · Answer 2 · answered Feb 22 '12 at 17:42

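I just tested something like this and it worked. The list of files is built before anything is written to messageids.txt, and the output file is excluded from the list in case it already exists. Note the >> (append): a plain > inside the loop would truncate the file on every iteration and keep only the last file's matches. This assumes file names without whitespace.

for f in $(find . -type f ! -name messageids.txt); do grep -Fi "message-id:" "$f" >> messageids.txt; done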
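Another way to avoid the feedback, sketched here under assumptions rather than taken from the thread, is to keep find but tell it to skip the output file. The scratch directory and sample files are made up for illustration; -h (suppress file names) is available in GNU and BSD grep and is needed because `{} +` passes many files to a single grep.

```shell
# Scratch directory with two sample messages (hypothetical content).
tmp=$(mktemp -d)
printf 'Message-ID: <a@example.com>\nSubject: hi\n' > "$tmp/one.eml"
printf 'message-id: <b@example.com>\n' > "$tmp/two.eml"

# "! -name messageids.txt" keeps the output file out of grep's input,
# even though the shell has already created it when find starts.
find "$tmp" -type f ! -name messageids.txt \
    -exec grep -hFi "message-id:" {} + > "$tmp/messageids.txt"

cat "$tmp/messageids.txt"
```

Writing the output to a location outside the searched tree would have the same effect without needing the exclusion.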

Rob

2,422
