I am trying to write a pkill-like utility that matches processes by searching for a certain string in /proc/pid/environ and kills them. But the matched process may exit before I can kill it and another process may take its pid. In that case, I will end up killing the new process. Is there a way to avoid this race condition?
3 Answers
Reconsider what you're doing. Instead of searching and killing processes by brute force, look for a way to prevent whatever starts them from starting them. (For example, if it's an unwanted service, disable the service.)
Find out the maximum PID value and calculate the probability of PID reuse. Process IDs on Linux are allocated incrementally until they roll over, and kernel.pid_max defaults to 2^15 (32768) on older distributions and is raised to 2^22 on newer ones. Your tool performs both actions at about the same time (a few instructions apart), so for the PID to be reused in between, the system would need to start close to pid_max (up to 2^22) new processes without ever scheduling your tool's process in the meantime. That's pretty unlikely if the process runs at normal priority.
Upgrade to a recent kernel and use the pidfd functions that work with process file descriptors, such as pidfd_open() and pidfd_send_signal().
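For the suggestion about checking the maximum PID value, a minimal Python sketch follows. It assumes a Linux system with procfs mounted at /proc; the helper name read_pid_max is made up for illustration.

```python
from pathlib import Path

# Standard procfs location of the kernel.pid_max sysctl on Linux.
PID_MAX_PATH = Path("/proc/sys/kernel/pid_max")


def read_pid_max(path: Path = PID_MAX_PATH) -> int:
    """Return the PID limit; allocation wraps around before reaching it."""
    return int(path.read_text().strip())


if __name__ == "__main__":
    limit = read_pid_max()
    print(f"kernel.pid_max = {limit}")
    # A given PID number can only come around again after the allocator wraps,
    # which takes at most about this many process creations.
    print(f"at most about {limit} new processes before a PID can be reused")
```

The larger this number, the more processes have to be created between your match and your kill() for the wrong process to be hit.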
If you're running a new enough version of Linux, you can get a file descriptor that is tied to the specific process a PID refers to at the moment the system call is made, either with pidfd_open() (needs Linux 5.3 or newer) or by opening the PID's directory in /proc with the regular open() function (needs Linux 5.1 or newer). You can then use pidfd_send_signal() to send a signal to that process. By first opening the PID file descriptor and then using it to interact with the process directory (instead of opening the files under /proc by path individually), you can close the race condition you're worrying about.
But in general, that's only worth it for tools like pkill. If you are just trying to kill a process that you are managing, you should either use a proper process supervisor like systemd or runit or, if you can't do that, at least use cgroups; neither approach has this issue at all.
Is there a way to avoid this race condition?
Consider ISC DHCPd. This software places a PID file in /var/run/dhcpd.pid or some such location (user-configurable, and default location is sometimes customized by operating system distributions). To end the software, you can verify that file exists, and use: [ -r /var/run/dhcpd.pid ] && kill $( cat /var/run/dhcpd.pid )
That might not be foolproof, I guess, in some operating systems. But PID re-use and guessability have been treated as a potential security hazard for years, so recent code may use rather random PID numbers. The chances of PID re-use may be exceedingly small, especially re-use within a matter of seconds.