If you have a very large number of files in your directories, or if pipes are not an option for you
(for instance because you are concerned about the number of arguments your system allows when going through xargs), one option is to use the exit status of one -exec command as a filter for the actions that follow, something like:
echo 0 > /tmp/count ; find . -type f -exec bash -c 'echo "$(( $(cat /tmp/count) + 1 ))" > /tmp/count' \; -exec bash -c 'test "$(cat /tmp/count)" -le 5000' \; -exec echo "any command instead of echo of this file: {}" \;
The counter file is first reset to 0. The first -exec just increments the counter. The second -exec tests the count: as long as it is at most 5000, it exits with 0 and the next action is executed. The third -exec does the intended work on the file, in this case a simple echo; we could also use -print, -delete, etc. (I would use -delete instead of -exec rm {} \; for instance.)
This is all based on the fact that find evaluates its actions from left to right and only moves on to the next one if the previous one succeeded; an -exec ... \; action succeeds when its command exits with 0.
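To see this chaining in isolation, here is a minimal sketch. The demo directory and the file names keep.txt and skip.log are made up for the illustration; the -exec exits with 0 only for names ending in .txt, so -print only runs for those.

```shell
#!/bin/sh
# Hypothetical demo directory, created only for this illustration.
demo=$(mktemp -d)
touch "$demo/keep.txt" "$demo/skip.log"

# The -exec action is true only when its command exits with 0.
# Actions are joined by an implicit "and", so -print is reached
# only for the files that pass the test.
matched=$(find "$demo" -type f \
  -exec sh -c 'case "$1" in *.txt) exit 0 ;; *) exit 1 ;; esac' sh {} \; \
  -print)

echo "$matched"
rm -rf "$demo"
```

Only the path of keep.txt should be printed; skip.log is dropped because its -exec returned a non-zero status.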
When using the above example, you'd want to make sure /tmp/count is not used by a concurrent process.
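One way to avoid such a clash, as a sketch only, is to let mktemp pick a private counter file. The demo tree of 8 files and the limit of 3 are assumptions chosen to keep the run short; they stand in for the real directory and for 5000.

```shell
#!/bin/sh
# Hypothetical demo tree with 8 files.
demo=$(mktemp -d)
for i in 1 2 3 4 5 6 7 8; do touch "$demo/file$i"; done

# Private counter file, so that no concurrent process shares it.
count_file=$(mktemp)
echo 0 > "$count_file"
limit=3   # stands in for 5000

# 1st -exec: increment the counter.
# 2nd -exec: succeed only while the counter is at most the limit.
# 3rd -exec: the command of interest (echo here).
processed=$(find "$demo" -type f \
  -exec sh -c 'echo "$(( $(cat "$1") + 1 ))" > "$1"' sh "$count_file" \; \
  -exec sh -c 'test "$(cat "$1")" -le "$2"' sh "$count_file" "$limit" \; \
  -exec echo processing: {} \; | grep -c '^processing: ')

visited=$(cat "$count_file")
echo "processed=$processed visited=$visited"
rm -rf "$demo" "$count_file"
```

The counter file name is passed to sh as a positional argument rather than pasted into the script text, so the same command line works whatever name mktemp returns.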
[edits following comments from Scott]
Thanks a lot Scott for your comments.
Based on them: the number was changed to 5,000 to match the initial thread.
Also: it is absolutely correct that the /tmp/count file will still be written 42,000 times (once per file visited), so "find" will still go through all 42,000 entries, but will only execute the command of interest 5,000 times. So this command does not avoid walking the whole tree and is just presented as an alternative to the usual pipes. Hosting the /tmp/count file in a memory-backed temporary directory (a tmpfs mount such as /dev/shm on many Linux systems) would seem appropriate.
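As a possible variation, and only a sketch: if your find supports the -quit action (GNU find does; this is an assumption about your system), the walk can be stopped once the limit is passed. The counter is placed in /dev/shm when that directory exists and is writable, with a fallback to the usual temporary directory. The demo tree and the limit of 3 are made up for the illustration.

```shell
#!/bin/sh
# Hypothetical demo tree with 8 files.
demo=$(mktemp -d)
for i in 1 2 3 4 5 6 7 8; do touch "$demo/file$i"; done

# Prefer a memory-backed directory for the counter when one is available.
count_dir=${TMPDIR:-/tmp}
if [ -d /dev/shm ] && [ -w /dev/shm ]; then count_dir=/dev/shm; fi
count_file=$(mktemp "$count_dir/count.XXXXXX")
echo 0 > "$count_file"
limit=3   # stands in for 5000

# After incrementing: if the counter is above the limit, stop find with -quit;
# otherwise (-o) run the command of interest.
processed=$(find "$demo" -type f \
  -exec sh -c 'echo "$(( $(cat "$1") + 1 ))" > "$1"' sh "$count_file" \; \
  \( -exec sh -c 'test "$(cat "$1")" -gt "$2"' sh "$count_file" "$limit" \; -quit \
     -o -exec echo processing: {} \; \) | grep -c '^processing: ')

visited=$(cat "$count_file")
echo "processed=$processed visited=$visited"
rm -rf "$demo" "$count_file"
```

With this variation the counter stops at the limit plus one, since find exits on the first file past the limit instead of visiting the remaining entries.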
And besides your comments, some additional edits:
Pipes would be simpler in most typical cases.
Here are a few more cases, though, where pipes do not apply that easily:
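when file names have spaces in them: a plain find | xargs pipe splits such names into several arguments (unless -print0 and xargs -0 are used), whereas -exec passes each file name to the command as a single argument (writing "{}" with quotes is harmless, but find does not need the quotes to support this case),
when the intended command does not allow having all the file names in a row, for instance, something like: -exec somespecificprogram -i "{}" -o "{}.myoutput" \;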
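The second case can be sketched as follows. Here cp is only a stand-in for the hypothetical somespecificprogram, and the demo files are made up; one of them has a space in its name. The file name is handed to a small sh script as "$1", which avoids relying on find replacing {} inside a longer argument such as {}.myoutput (that behaviour is not guaranteed by every find).

```shell
#!/bin/sh
# Hypothetical demo files; one name contains a space.
demo=$(mktemp -d)
touch "$demo/a file.txt" "$demo/b.txt"

# cp stands in for "somespecificprogram -i IN -o OUT".
# Each file name reaches the script as a single argument, "$1".
find "$demo" -type f -name '*.txt' \
  -exec sh -c 'cp "$1" "$1.myoutput"' sh {} \;

# Count the output files that were produced.
outputs=$(ls "$demo" | grep -c '\.myoutput$')
echo "outputs=$outputs"
rm -rf "$demo"
```

Restricting the search to -name '*.txt' keeps find from picking up the freshly created .myoutput files during the same walk.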
So this example is essentially posted for those who have faced challenges with pipes and still do not want to go for a more elaborate programming option.