Analysis
In your try:
find invasive/ -type f -exec mv --backup=numbered -t invasive2/ {} + | head -1000
head gets no input at all because find does not print anything.
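A quick way to convince yourself, as a sketch in a throwaway directory (the directory and file names are made up for the demo, and true stands in for mv): count the bytes find writes to the pipe when the expression contains -exec but no -print.

```shell
# Demo only: a scratch directory with two files.
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b"

# -exec replaces the implicit -print, so find itself writes nothing to stdout.
bytes=$(( $(find "$tmp" -type f -exec true {} + | wc -c) ))
echo "bytes written by find: $bytes"

rm -rf "$tmp"
```

With nothing arriving on its stdin, head has nothing to count and just waits for end of input.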
If you did
find … -print -exec mv … {} + | head …
or
find … -exec mv … {} + -print | head …
then head would get some input and exit, and find could get SIGPIPE; but in general the signal wouldn't arrive precisely when you need it. This is because -exec … {} + substitutes {} with possibly many pathnames (a bunch of pathnames) per invocation.
In the case of "… -print -exec …", -print acts first, for many pathnames that are going to form a bunch for -exec … {} +. If SIGPIPE happens then -exec won't be performed for the bunch.
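In the case of "… -exec … {} + -print", -exec … {} + only adds the pathname to the current bunch and returns true; mv runs later, when the bunch is full or when find finishes (this is how GNU find behaves, the exact timing is up to the implementation). So -print usually prints a pathname before mv has run for it. SIGPIPE can happen only when the tool prints something, so it cannot interrupt -exec mv, it can only interrupt -print. If SIGPIPE happens, it will prevent find from printing more pathnames; whether mv has already run for the pathnames printed so far depends on where the bunch boundaries fall.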
You want to count only successful move operations. If mv tries to move multiple files and succeeds, then you know all the files were moved. If it fails then you cannot easily know how many files were moved. For this reason you need a separate mv for each file you're trying to move. So you need -exec mv … \; rather than -exec mv … {} + (besides -exec … {} + is not useful as a test in find anyway, it always returns true).
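The difference between the two forms as a test can be seen in a small sketch (scratch directory and file names are made up; false stands in for a command that always fails):

```shell
# Demo only: three files, and a command (false) that fails every time.
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b" "$tmp/c"

# With {} + the primary evaluates as true regardless, so -print still runs.
plus=$(( $(find "$tmp" -type f -exec false {} + -print | wc -l) ))

# With \; the exit status of the command decides, so -print never runs.
semi=$(( $(find "$tmp" -type f -exec false {} \; -print | wc -l) ))

echo "printed with '{} +': $plus, printed with '\\;': $semi"
rm -rf "$tmp"
```

Only the \; form lets a following primary act as "do this if the command succeeded".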
Another complication is you cannot be sure -print prints exactly one line per file (because pathnames may contain newline characters). A reliable solution is -printf '\n' (if your find supports -printf) or -exec printf '\n' \;.
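To illustrate the newline problem, here is a sketch with one file whose (made-up) name contains a newline character. Counting lines from -print overcounts, while -exec printf '\n' \; gives exactly one line per file:

```shell
# Demo only: a single file with a newline in its name.
tmp=$(mktemp -d)
touch "$tmp/two
lines"

# -print emits the pathname as is, so one file shows up as two lines.
with_print=$(( $(find "$tmp" -type f -print | wc -l) ))

# One printf per file emits exactly one newline per file.
with_printf=$(( $(find "$tmp" -type f -exec printf '\n' \; | wc -l) ))

echo "lines from -print: $with_print, lines from printf: $with_printf"
rm -rf "$tmp"
```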
This leads us to the following solution (flawed though):
# flawed
find invasive/ -type f -exec mv --backup=numbered -t invasive2/ {} \; -printf '\n' \
| head -n 999 >/dev/null
In theory it works like this:
- If and only if mv succeeds, a newline is printed.
- head terminates after 999 newlines, i.e. after 999 successful move operations. The mere disappearance of head does not cause find to immediately receive SIGPIPE. After 999 successful move operations head is gone, but find is still working.
- find will receive SIGPIPE only if it tries to print something after head terminates. This happens after the 1000th successful move operation.
In practice there is no guarantee that head reads and terminates fast enough to cause SIGPIPE exactly when we need it. This is the flaw in the above code. There's a buffer between find and head, so find may manage to print more lines (and therefore move more files) than head is instructed to read. The pipe mechanism is designed to terminate the preceding tool (here: find) eventually, not at an exact moment; we cannot rely on it when we want to interrupt our find exactly after 1000 successful move operations.
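The effect of the buffer can be exaggerated in a sketch. Everything here is a stand-in: the loop plays the role of find, appending to a log file plays the role of a successful mv, and the sleep makes the reader slow on purpose. head asks for only 5 lines, yet the producer completes all 50 "moves", because its 50 newlines fit in the pipe buffer before head even starts reading.

```shell
# Demo only: a deliberately slow reader behind a pipe.
log=$(mktemp)
{
  i=0
  while [ "$i" -lt 50 ]; do
    echo moved >>"$log"   # stands in for one successful mv
    echo                  # the newline that -printf '\n' would print
    i=$((i + 1))
  done
} | { sleep 1; head -n 5 >/dev/null; }

moves=$(( $(wc -l <"$log") ))
echo "head asked for 5 lines, the producer did $moves moves"
rm -f "$log"
```

In real life the overshoot depends on timing and may well be zero, which is exactly why it cannot be relied upon.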
Relying on the output of head is not flawed this way. Something like
find … -print | head -n 1000 | code_that_runs_mv
is a good start, but since pathnames in general may contain newline characters, you need -print0 (which is not portable), head -z (also not portable) and so on. And if you want to count successful move operations then it should rather be:
find … -print0 | code0_that_runs_mv_and_counts
It's possible to build working code0_that_runs_mv_and_counts as a shell script, at least in Bash. My attempt is below.
Solution
find invasive/ -type f -print0 | bash -c '
  counter=1000
  while [ "$counter" -gt 0 ] && IFS= read -r -d "" pathname; do
    </dev/tty mv --backup=numbered -t invasive2/ "$pathname" && ((counter--))
  done
' code0_that_runs_mv_and_counts
Note I used </dev/tty mv … to prevent mv from consuming stdin in case it prompts for confirmation or something. Well, with --backup=numbered I guess it shouldn't prompt; but in general it might and we don't want it to read anything from our find.
The code above is not portable and I don't like it very much.
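If you want to try the counting logic of the Bash solution before pointing it at real data, here is a sandbox sketch under a few assumptions: the paths are made up, the limit is 3 instead of 1000, plain mv replaces the GNU-specific options, and </dev/null stands in for </dev/tty so it also runs without a terminal. One move is made to fail on purpose (the destination already holds a directory named b) to show that only successes are counted.

```shell
# Demo only: 5 source files, one of which cannot be moved.
tmp=$(mktemp -d)
mkdir "$tmp/src" "$tmp/dst" "$tmp/dst/b"   # dst/b is a directory, so moving src/b fails
touch "$tmp/src/a" "$tmp/src/b" "$tmp/src/c" "$tmp/src/d" "$tmp/src/e"

find "$tmp/src" -type f -print0 | bash -c '
  counter=3
  while [ "$counter" -gt 0 ] && IFS= read -r -d "" pathname; do
    # Decrement only after a successful move; errors from the failing mv are hidden.
    </dev/null mv -- "$pathname" "$1/" 2>/dev/null && ((counter--))
  done
' demo-bash "$tmp/dst"

moved=$(( $(find "$tmp/dst" -type f | wc -l) ))
left=$(( $(find "$tmp/src" -type f | wc -l) ))
echo "moved: $moved, left behind: $left"
rm -rf "$tmp"
```

Whatever order find visits the files in, exactly 3 end up in the destination and 2 stay behind.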
Portable* solution
If your find does not support -print0 or you cannot use bash (or you simply enjoy more portable code) then consider the following approach:
while :; do echo; done | head -n 999 | find invasive/ -type f -exec sh -c '
  for pathname do
    </dev/tty mv --backup=numbered -t invasive2/ "$pathname" \
    && { read dummy || { kill -s PIPE "$PPID"; exit 0; } }
  done
' find-sh {} +
* AFAIK the only non-portable thing here is mv with the options you used. If you didn't use --backup=numbered, then we could rewrite this mv to a portable form. Everything I added is portable, that's why I called this solution portable.
This is how the code works:
- find starts sh and passes possibly many pathnames to it as arguments. There may be more than one sh started one after another, the number doesn't matter.
- sh attempts to mv files one by one in a loop. After a successful move operation it tries to read exactly one line from its stdin inherited from find.
- while … | head -n 999 (which could be yes | head -n 999, but yes is not portable) generates exactly 999 lines. Unless we run out of files first, exactly 999 reads will succeed. The read after the 1000th successful move operation will be the first read that fails.
- Failed read occurs exactly after the 1000th successful move operation. It causes two things:
  - find ($PPID, the parent process of sh) gets SIGPIPE, so it won't start more sh processes;
  - the current sh exits, so it won't process more pathnames.
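The mechanism described above can be tried safely in a sandbox. This is a sketch under stated assumptions, not the exact command: the paths are made up, N is 3 (so N-1 = 2 lines are fed to find), printf '\n\n' replaces the endless generator piped to head (fine for a tiny N), plain mv replaces the GNU-specific options, and </dev/null stands in for </dev/tty.

```shell
# Demo only: 5 files, of which exactly 3 should be moved.
tmp=$(mktemp -d)
mkdir "$tmp/src" "$tmp/dst"
touch "$tmp/src/a" "$tmp/src/b" "$tmp/src/c" "$tmp/src/d" "$tmp/src/e"

# Two lines on stdin: the read after the 3rd successful move is the first to fail.
printf '\n\n' | find "$tmp/src" -type f -exec sh -c '
  dst=$1; shift
  for pathname do
    </dev/null mv -- "$pathname" "$dst/" \
    && { read dummy || { kill -s PIPE "$PPID"; exit 0; } }
  done
' find-sh "$tmp/dst" {} + || :   # find is expected to be killed by SIGPIPE

moved=$(( $(find "$tmp/dst" -type f | wc -l) ))
left=$(( $(find "$tmp/src" -type f | wc -l) ))
echo "moved: $moved, left behind: $left"
rm -rf "$tmp"
```

The destination directory is passed as an extra argument before {} so the inner script does not need to hard-code it.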
Notes
All the snippets are designed to move 1000 files; some contain 1000, some contain 999 in the code. You can adjust them to move N files, but pay attention if you need N or N-1 in the code.
Counting successful move operations makes sense, but in some circumstances it may cause a problem. When moving a file between filesystems, mv creates a copy, then deletes the source. A failure in deletion causes mv to report non-zero exit status, but the copy remains. Imagine your invasive/ is read-only for you. In such a scenario our code will copy regular files to invasive2/ but no mv will count as successful, so the limit is never reached and all the regular files will be copied.
I used bash -c '…' code0_that_runs_mv_and_counts, find … -exec sh -c '…' find-sh {} +. If you are surprised by code0_that_runs_mv_and_counts and find-sh being arguments then read What is the second sh in sh -c 'some shell code' sh?
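In short, the first argument after the code becomes $0 (a name used in messages) and the remaining ones become the positional parameters that for pathname do iterates over. A minimal sketch with made-up arguments:

```shell
# Demo only: find-sh ends up in $0, the two paths in $1 and $2.
out=$(sh -c 'for pathname do printf "%s got %s\n" "$0" "$pathname"; done' find-sh some/path other/path)
echo "$out"
```

Without that extra argument the first pathname would land in $0 and be skipped by the loop.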