Yesterday I performed a release upgrade which included the upgrade from PHP 7.4 to 8.2.
At first, I thought the problem could come from the PHP version (because php8.2-fpm was the service that kept dying), so I reverted to php7.4-fpm, but this didn't solve the problem.
php7.4-fpm.service: A process of this unit has been killed by the OOM killer.
php7.4-fpm.service: Failed with result 'oom-kill'.
php7.4-fpm.service: Consumed 12min 18.367s CPU time.
I have the sites' sockets organized by pools.
All pools have this config:
pm = dynamic
pm.max_children = 5
pm.start_servers = 2
pm.min_spare_servers = 1
pm.max_spare_servers = 3
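Since every pool shares the same limits, the ceiling on concurrent workers is the number of pools times pm.max_children. A rough way to sanity-check that ceiling against the server's RAM is sketched below. The pool count (8), per-worker size (120 MiB) and available memory (4096 MiB) are placeholder assumptions, not measurements from this server, and the helper names are made up for illustration.

```python
def worst_case_mib(pools, max_children, per_worker_mib):
    """Upper bound on php-fpm worker memory if every pool is fully busy."""
    return pools * max_children * per_worker_mib


def fits(pools, max_children, per_worker_mib, available_mib):
    """True if the worst case stays within the memory left for php-fpm."""
    return worst_case_mib(pools, max_children, per_worker_mib) <= available_mib


if __name__ == "__main__":
    # Hypothetical numbers: 8 pools, pm.max_children = 5, 120 MiB per worker.
    peak = worst_case_mib(pools=8, max_children=5, per_worker_mib=120)
    print(f"worst case: {peak} MiB")  # worst case: 4800 MiB
    print("fits in 4096 MiB:", fits(8, 5, 120, available_mib=4096))
```

If the worst case exceeds what the machine can spare, the OOM killer can fire even without a leak, simply because enough pools get busy at the same time.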
The service dies after around 1 hour, so I decided to add a max_requests limit:
pm.max_requests = 1000
After that it became much more stable, but it still ended up dying after around 12 hours. So now I fear that lowering this number further will only make the server die later, but still die at some point, which is not ideal. The weird part is that before the upgrade (on Ubuntu 20.04 LTS) it was 100% stable and never went down, while after the upgrade to Ubuntu 22.04 LTS things have started to go wild.
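To see whether one pool keeps growing between restarts, resident memory can be sampled per pool from /proc. The sketch below assumes the workers use the usual "php-fpm: pool NAME" process title and reads VmRSS from /proc/PID/status; the function names are my own. Summing RSS counts shared memory (for example opcache) once per worker, so the totals overstate real usage and are mainly useful for comparing pools and watching trends.

```python
import glob
import re
from collections import defaultdict


def pool_name(cmdline):
    """Return the pool name from a php-fpm worker title, or None."""
    match = re.search(r"php-fpm: pool (\S+)", cmdline)
    return match.group(1) if match else None


def parse_vmrss_kb(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else 0


def rss_by_pool():
    """Sum resident memory (kB) of php-fpm workers, grouped by pool."""
    totals = defaultdict(int)
    for proc_dir in glob.glob("/proc/[0-9]*"):
        try:
            with open(proc_dir + "/cmdline", "rb") as f:
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
            pool = pool_name(cmdline)
            if pool is None:
                continue  # not a php-fpm worker (the master has no pool name)
            with open(proc_dir + "/status") as f:
                totals[pool] += parse_vmrss_kb(f.read())
        except OSError:
            continue  # process exited meanwhile or is not readable
    return dict(totals)


if __name__ == "__main__":
    # Largest pools first; run periodically (e.g. from cron) to spot growth.
    for pool, kb in sorted(rss_by_pool().items(), key=lambda item: -item[1]):
        print(f"{pool}: {kb / 1024:.1f} MiB")
```

Running this every few minutes and logging the output should show whether a single site's pool climbs steadily before the kill, or whether all pools are simply busy at once.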
After going through many other questions on Server Fault and Super User about the OOM killer, I'm running out of ideas to test. Can anyone suggest other possibilities for me to try?