13

I am trying to write a pkill-like utility that matches processes by searching for a certain string in /proc/pid/environ and kills them. But the matched process may exit before I can kill it and another process may take its pid. In that case, I will end up killing the new process. Is there a way to avoid this race condition?

John Ao
  • 133

3 Answers

18
  • Reconsider what you're doing. Instead of searching and killing processes by brute force, look for a way to prevent whatever starts them from starting them. (For example, if it's an unwanted service, disable the service.)

  • Find out the maximum PID value and estimate the probability of PID reuse. Process IDs on Linux are allocated incrementally until they wrap around; kernel.pid_max defaults to 32768 (2^15) on older distributions and is raised to 2^22 on newer 64-bit ones. Your tool performs both actions at about the same time (a few instructions apart), so for the PID to be reused in between, the system would need to start on the order of pid_max new processes without ever scheduling your tool's process in the meantime. That's pretty unlikely if the tool runs at normal priority.

  • Upgrade to a recent kernel and use pidfd functions that work with process file descriptors, such as pidfd_open() and pidfd_send_signal().
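To put a number on the second point, here is a minimal Python sketch that reads kernel.pid_max and makes a rough estimate of how many new processes would have to be created before a freed PID comes around again. The function names are made up for illustration, and the estimate assumes plain sequential allocation that skips IDs still in use.

```python
from pathlib import Path


def read_pid_max(path="/proc/sys/kernel/pid_max"):
    """Return kernel.pid_max as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def forks_until_reuse(pid_max, ids_in_use):
    """Rough number of new processes needed before a freed PID is reused.

    Assumes sequential allocation that skips IDs still in use. Threads
    occupy IDs too, so ids_in_use should count them as well.
    """
    return max(pid_max - ids_in_use, 0)


if __name__ == "__main__":
    pid_max = read_pid_max()
    if pid_max is None:
        print("kernel.pid_max is not readable on this system")
    else:
        print("kernel.pid_max =", pid_max)
        # 500 is an arbitrary illustrative figure, not a measurement.
        print("rough forks until reuse:", forks_until_reuse(pid_max, 500))
```

With the traditional default of 32768 and a few hundred IDs in use, that is still tens of thousands of forks inside a window of a few instructions.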

grawity
  • 501,077
13

If you’re running on a new enough version of Linux, you can get a file descriptor that is tied to the specific process a PID refers to at the moment the system call is made. There are two ways to obtain one: call pidfd_open() (needs Linux 5.3 or newer), or open the PID’s directory in /proc with the regular open() function (this approach only needs Linux 5.1 or newer). You can then use pidfd_send_signal() to send a signal to that process. By opening the descriptor first and then inspecting the process through it (for example, opening environ relative to the /proc/PID directory descriptor with openat()) instead of looking up the files in /proc by PID each time, you can close the race condition you’re worrying about.
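As a rough sketch of the /proc directory variant, assuming Python 3.9 or newer (for signal.pidfd_send_signal) on Linux 5.1 or newer: the function name kill_if_environ_matches and the substring match are my own choices for illustration, not part of any existing tool.

```python
import os
import signal


def kill_if_environ_matches(pid, needle, sig=signal.SIGTERM):
    """Send `sig` to `pid` only if `needle` (bytes) occurs in its environ.

    Returns True if the signal was sent, False if the process was gone,
    inaccessible, or did not match.
    """
    try:
        # This directory descriptor stays bound to the process that owns
        # the PID right now, even if the number is recycled later.
        proc_fd = os.open(f"/proc/{pid}", os.O_RDONLY | os.O_DIRECTORY)
    except OSError:
        return False
    try:
        # Open environ relative to the pinned directory, not by PID path.
        env_fd = os.open("environ", os.O_RDONLY, dir_fd=proc_fd)
        with os.fdopen(env_fd, "rb") as env_file:
            environ = env_file.read()
        if needle not in environ:
            return False
        # Signal through the same descriptor; if the process has exited,
        # this raises ProcessLookupError instead of hitting a newcomer.
        signal.pidfd_send_signal(proc_fd, sig)
        return True
    except (FileNotFoundError, ProcessLookupError, PermissionError):
        return False
    finally:
        os.close(proc_fd)
```

A caller would loop over the numeric entries in /proc and call something like kill_if_environ_matches(pid, b"MY_MARKER=1") for each. If the process dies between the directory open and the signal, the worst case is a False return.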

But in general, that’s only worth it for tools like pkill. If you are just trying to kill a process that you are managing, you should either use a proper process supervisor such as systemd or runit or, if you can’t do that, at least use cgroups; neither approach has this issue at all.
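For the cgroup route, here is a minimal sketch assuming cgroup v2 on Linux 5.14 or newer, where each non-root cgroup has a cgroup.kill file. The helper name kill_cgroup and the example path are hypothetical. No PID is involved, so there is nothing to race on.

```python
import os


def kill_cgroup(cgroup_dir):
    """Ask the kernel to kill every process in a cgroup v2 group.

    Returns True if the request was written, False if cgroup.kill
    could not be opened (older kernel, cgroup v1, or wrong path).
    """
    kill_file = os.path.join(cgroup_dir, "cgroup.kill")
    try:
        # No O_CREAT: never create the file if the kernel lacks it.
        fd = os.open(kill_file, os.O_WRONLY)
    except OSError:
        return False
    try:
        # "1" is the only value the kernel accepts for this file.
        os.write(fd, b"1")
    finally:
        os.close(fd)
    return True
```

Usage would look like kill_cgroup("/sys/fs/cgroup/myjob"), where myjob is a placeholder for the group your processes were started in.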

0

Is there a way to avoid this race condition?

Consider ISC DHCPd. This software places a PID file in /var/run/dhcpd.pid or some such location (it is user-configurable, and distributions sometimes customize the default). To end the software, you can verify that the file exists and is readable, and then use: [ -r /var/run/dhcpd.pid ] && kill "$(cat /var/run/dhcpd.pid)"

That might not be foolproof on every operating system, I guess. But PID reuse and guessability have been treated as a potential security hazard for years, so some systems (OpenBSD, for example) allocate PIDs randomly rather than sequentially. Either way, the chances of PID reuse may be exceedingly small, especially reuse within a matter of seconds.

TOOGAM
  • 16,486