8

While working out an answer to this question, I ran into a snag testing this MySQL Monit ruleset on an Ubuntu 12.04.5 setup:

check process mysqld with pidfile /var/run/mysqld/mysqld.pid
  group mysql
  start program = "/etc/init.d/mysql start"
  stop program = "/etc/init.d/mysql stop"
  if failed host 127.0.0.1 port 3306
    with timeout 15 seconds
  then restart
  if 5 restarts within 5 cycles
  then timeout
  alert email_address@example.com only on { timeout, nonexist }

The issue is that I was invoking the start/stop commands via /etc/init.d/, which is more of a CentOS/RedHat construct, instead of /usr/sbin/service, which would be more appropriate for an Ubuntu/Debian system.

Okay, my bad… But you see that "if 5 restarts within 5 cycles then timeout" part? That seems to have bitten me hard. With the /etc/init.d/mysql start command unable to work, the system attempted 5 restarts, failed 5 times and then timed out as a result. And the timeout condition seems to result in the MySQL service ruleset being ignored by Monit.

I’ve restarted the Monit service a few times and even rejiggered the ruleset to see if it would help, but none of that seems to have had any effect.

What can I do to get Monit to pay attention to rulesets it has “unmonitored” due to timeout conditions being met?

Giacomo1968
  • 58,727

2 Answers

7

Monit includes commands to enable and disable monitoring of all or specific services.

If a service has become unmonitored, you can re-enable monitoring with e.g. monit monitor mysqld (using the service name from the check process line of the ruleset) or monit monitor all.

Note you must have the Monit HTTP interface enabled for these commands to work.
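
As a rough sketch of how that might look in practice, the helper below re-enables monitoring for one service and then prints Monit's summary so you can confirm the state changed. The function name remonitor and the MONIT override variable are made up for illustration. It assumes the HTTP interface is already enabled through a set httpd statement in the Monit control file, and that the shell has enough privileges to talk to the daemon.

```shell
# Hypothetical helper: re-enable monitoring for a service, then show the summary.
# MONIT may be overridden (e.g. MONIT=echo for a dry run); it defaults to "monit".
remonitor() {
  # $1 is the Monit service name from the "check process" line, or "all"
  "${MONIT:-monit}" monitor "$1" && "${MONIT:-monit}" summary
}
```

From a root shell, remonitor mysqld would cover the ruleset in the question, and MONIT=echo remonitor mysqld only prints the commands it would run.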

user51928
  • 171
7

After doing some digging, it turns out Monit stores system monitoring data in a “state” file. And this “state” file keeps track of what services are being monitored/unmonitored.

So while this is a bit “brute force”-ish, it definitely works. If a service becomes “unmonitored” due to something like a timeout, then just remove the Monit state file from the system like this:

sudo rm /var/lib/monit/state

And then restart Monit like this and all should be good:

sudo service monit restart

FWIW, on other systems/setups the Monit “state” file might be saved as state or monit.state or even .monit.state (with a leading dot) in another directory. Be sure to determine exactly where that “state” file is being saved before you attempt this fix.
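
To help track the file down, here is a minimal sketch that checks a list of directories for the three file names mentioned above and prints the first match. The function name find_monit_state is invented for this example, and which directories to pass is a guess you will need to adapt; only /var/lib/monit comes from my own setup. If the Monit control file contains a set statefile line, that path is the one to trust instead.

```shell
# Hypothetical helper: print the first Monit state file found in the given directories.
find_monit_state() {
  for dir in "$@"; do
    for name in state monit.state .monit.state; do
      if [ -f "$dir/$name" ]; then
        printf '%s\n' "$dir/$name"
        return 0
      fi
    done
  done
  # Nothing found in any of the candidate directories.
  return 1
}
```

For example, find_monit_state /var/lib/monit "$HOME" checks the Ubuntu location first and then the home directory of the current user (the latter is an assumption about where other setups might keep it).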

Giacomo1968
  • 58,727