Monitoring with the Monit utility

Monit monitors most services, processes, system parameters, and the other things an administrator has to keep an eye on.

CentOS

yum install monit

Debian/Ubuntu

apt-get install monit

Remember that service paths differ between operating systems, Gentoo in particular :D
A separate file is created in /etc/monit.d/ for each kind of check.
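
A minimal sketch of the one-file-per-check layout. `MONIT_D` and `write_check` are hypothetical names used only for illustration; the directory defaults to a scratch location so the example can be tried without touching /etc/monit.d/. It assumes the main config pulls that directory in with an include line.

```shell
#!/bin/sh
# MONIT_D is a hypothetical variable; set it to /etc/monit.d on a real host.
MONIT_D="${MONIT_D:-$(mktemp -d)}"

write_check() {
    # $1 = file name inside MONIT_D, stdin = the check definition
    cat > "$MONIT_D/$1"
    chmod 600 "$MONIT_D/$1"
}

# One file per monitored service
write_check nginx <<'EOF'
check process nginx with pidfile /var/run/nginx.pid
    start program = "/etc/init.d/nginx start"
    stop program  = "/etc/init.d/nginx stop"
EOF

ls "$MONIT_D"
```

On a real host, `monit -t` checks the syntax of the resulting configuration and `monit reload` makes the daemon re-read it.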

Global settings are configured in /etc/monit.conf

1) Uncomment the log file setting and change its path:

# set logfile syslog facility log_daemon      
set logfile /var/log/monit.log  

2) Configure sending notifications over SMTP. On client servers it is better to use the local sendmail so that the mailbox password is not exposed;
for that, write set mailserver localhost instead of the block below.

set mailserver smtp.yandex.ru port 465 
    username "noreply@mail.ru" password "******" 
    using SSLAUTO

3) Access to the web interface (better left unused; the console client is just as convenient and clear)

set httpd port 2812 and
    use address localhost  # only accept connection from localhost
    allow localhost        # allow localhost to connect to the server and
    allow admin:123123123123      # require user 'admin' with this password
    allow @monit           # allow users of group 'monit' to connect (rw)
    allow @users readonly  # allow users of group 'users' to connect readonly

4) Polling interval of the daemon and the delay before the first poll after startup

set daemon  60              # check services every minute
#with start delay 240  # best left commented out so that Monit initializes all rules at startup

5) Default alert recipient (used only when no other notification address is specified)

set alert alert@mail.ru
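
The polling interval from step 4 also defines the length of one cycle, which is the unit used by rules such as `for 5 cycles`. A small worked example of that arithmetic, assuming the 60-second interval shown above; `cycles_to_seconds` is a hypothetical helper:

```shell
#!/bin/sh
# POLL_SECONDS mirrors "set daemon 60"; change it if the interval differs.
POLL_SECONDS=60

cycles_to_seconds() {
    # $1 = number of cycles; prints how long they last in seconds
    echo $(( $1 * POLL_SECONDS ))
}

echo "for 2 cycles = $(cycles_to_seconds 2) seconds"
echo "for 5 cycles = $(cycles_to_seconds 5) seconds"
```

With a longer interval the same rule reacts more slowly, so thresholds expressed in cycles should be revisited whenever `set daemon` changes.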

List of available alerts

Process monitoring

For each service, choose the events that should trigger a notification. The configurations here are only examples!
Thresholds must also be tuned to the server's configuration and load.

Monitoring NGINX

check process nginx with pidfile /var/run/nginx.pid
group nginx
start program = "/etc/init.d/nginx start" 
stop program = "/etc/init.d/nginx stop" 
if failed host 127.0.0.1 port 80 then restart
if cpu > 60% for 2 cycles then alert
if cpu > 90% for 5 cycles then restart
if 5 restarts within 5 cycles then timeout
alert alert@mail.ru on {pid, nonexist, timeout} 
        with mail-format { 
              from:     noreply@mail.ru
              subject:  NGINX $EVENT - $ACTION
              message:  This event occurred on $HOST at $DATE. 
              NGINX in SERVER_TEST seems down!,
              Your awesome monit!
      }
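
As a rough illustration of what a pidfile-based process check relies on, the sketch below reads a pidfile and asks the kernel whether that process exists. `pid_alive` is a hypothetical helper, not a Monit command, and Monit's own implementation may differ. It is handy for confirming the pidfile path before putting it into a check.

```shell
#!/bin/sh
pid_alive() {
    # $1 = path to a pidfile; succeeds if it names a running process.
    # Run as root: signalling another user's process fails otherwise.
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

if pid_alive /var/run/nginx.pid; then
    echo "nginx is running"
else
    echo "nginx pidfile is missing or the process is not running"
fi
```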

Monitoring PHP-FPM

check process php-fpm with pidfile /var/run/php-fpm/php-fpm.pid
    start program = "/etc/init.d/php-fpm start" 
    stop program  = "/etc/init.d/php-fpm stop" 
    if failed port 9000 type TCP then restart
    if cpu > 60% for 2 cycles then alert
    if cpu > 90% for 5 cycles then restart
    if 5 restarts within 5 cycles then timeout
    alert alert@mail.ru on {pid, nonexist, timeout} 
        with mail-format { 
              from:     noreply@mail.ru
              subject:  PHP-FPM $EVENT - $ACTION
              message:  This event occurred on $HOST at $DATE. 
              PHP-FPM in SERVER_TEST seems down!,
              Your awesome monit!
      }
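
Before relying on a port rule, it can help to confirm by hand that the service really listens where the check expects it. A minimal sketch, assuming `bash` and coreutils `timeout` are installed; `port_open` is a hypothetical helper:

```shell
#!/bin/sh
port_open() {
    # $1 = host, $2 = port; succeeds if a TCP connection can be opened
    # within 2 seconds (uses bash's /dev/tcp pseudo-device)
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

if port_open 127.0.0.1 9000; then
    echo "port 9000 accepts connections"
else
    echo "port 9000 is closed (PHP-FPM may be listening on a Unix socket)"
fi
```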

Monitoring MySQL (MariaDB has a separate config)

check process mysqld with pidfile /var/run/mysqld/mysqld.pid
group mysql
start program = "/etc/init.d/mysqld start" 
stop program = "/etc/init.d/mysqld stop" 
if failed host 127.0.0.1 port 3306 then restart
if cpu > 60% for 2 cycles then alert
if cpu > 90% for 5 cycles then restart
if 5 restarts within 5 cycles then timeout
alert alert@mail.ru on {pid, nonexist, timeout} 
        with mail-format { 
              from:     noreply@mail.ru
              subject:  MySQL $EVENT - $ACTION
              message:  This event occurred on $HOST at $DATE. 
              MySQL in SERVER_TEST seems down!,
              Your awesome monit!
      }

Monitoring MariaDB

Attention! If the server works over Unix sockets, specify the path to each socket in a new file!

check process mysql with pidfile /var/lib/mysql/voicefree.ru.pid
group mysql
start program = "/etc/init.d/mysql start" 
stop program = "/etc/init.d/mysql stop" 
if failed host 127.0.0.1 port 3306 then restart
if cpu > 60% for 2 cycles then alert
if cpu > 90% for 5 cycles then restart
if 5 restarts within 5 cycles then timeout
alert alert@mail.ru on {pid, nonexist, timeout} 
        with mail-format { 
              from:     noreply@mail.ru
              subject:  MariaDB $EVENT - $ACTION
              message:  This event occurred on $HOST at $DATE. 
              MariaDB in SERVER_TEST seems down!,
              Your awesome monit!
      }
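
For the socket case mentioned above, a hedged sketch that checks whether a socket file exists and prints a Monit rule line for it. The socket path is an assumption (look up the real one in the server's configuration), and `socket_rule` is a hypothetical helper; the printed line belongs inside a `check process` block.

```shell
#!/bin/sh
socket_rule() {
    # $1 = socket path; prints a Monit connection test for that socket
    printf 'if failed unixsocket %s then restart\n' "$1"
}

# Assumed path, adjust to the real socket location
for sock in /var/lib/mysql/mysql.sock; do
    if [ -S "$sock" ]; then
        echo "# $sock exists"
    else
        echo "# $sock not found"
    fi
    socket_rule "$sock"
done
```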

Monitoring HTTPD

Keep in mind that the port may differ from the one in the example.

check process httpd with pidfile /var/run/httpd/httpd.pid
group apache
start program = "/etc/init.d/httpd start" 
stop program = "/etc/init.d/httpd stop" 
if failed host 127.0.0.1 port 80 then restart
if cpu > 60% for 2 cycles then alert
if cpu > 90% for 5 cycles then restart
if 5 restarts within 5 cycles then timeout
alert alert@mail.ru on {pid, nonexist, timeout} 
        with mail-format { 
              from:     noreply@mail.ru
              subject:  HTTPD $EVENT - $ACTION
              message:  This event occurred on $HOST at $DATE. 
              HTTPD in SERVER_TEST seems down!,
              Your awesome monit!
      }

Monitoring disk space

To monitor all the space on the disk, write rootfs instead of the directory name.

check device var with path /var
    if SPACE usage > 80% then alert
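
To pick a sensible threshold, it is useful to see the current value first. A minimal sketch that prints the usage percentage of the filesystem holding a path, using the portable `df -P` output format; `usage_percent` is a hypothetical helper:

```shell
#!/bin/sh
usage_percent() {
    # $1 = path; prints the usage of its filesystem as a bare number
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

echo "/ is $(usage_percent /)% full"
```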

Monitoring RAID

Create a script in /home/scripts/

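#!/bin/bash

# Save this file as /home/scripts/monit_raid_size and make it executable.
# Run it from cron every 10 minutes (see the crontab entry below).

# The space usage limit, in percent
readonly PERCENT=90

# Current usage of /dev/md0 as a bare number, e.g. 42
usage=$(df -h /dev/md0 | grep -Eo '[[:digit:]]{1,3}%' | tr -d '%')

# Refresh the flag file only while usage stays below the limit;
# Monit raises an alert once the flag's timestamp becomes too old.
if [ "$usage" -lt "$PERCENT" ]; then
  touch /home/scripts/monit_flag_raid
fi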

Add to crontab

*/10 * * * * /home/scripts/monit_raid_size > /dev/null 2>&1

Add the check to the Monit config

check file raid_hdd with path /home/scripts/monit_flag_raid
  if timestamp > 25 minutes then alert
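
The timestamp rule fires when the flag file has not been touched for more than 25 minutes, that is, after the cron job has skipped it at least twice. A minimal sketch for checking the file's age by hand, assuming GNU `stat`; `flag_age_minutes` is a hypothetical helper:

```shell
#!/bin/sh
flag_age_minutes() {
    # $1 = file; prints whole minutes since its last modification
    now=$(date +%s)
    mtime=$(stat -c %Y "$1") || return 1
    echo $(( (now - mtime) / 60 ))
}

flag=/home/scripts/monit_flag_raid
if [ -e "$flag" ]; then
    echo "flag was last refreshed $(flag_age_minutes "$flag") minutes ago"
else
    echo "flag file does not exist yet"
fi
```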

Monitoring the host's CPU, RAM, and load average

check system voicefree.ru   # put the output of the hostname command here, or localhost
if memory usage > 30% then alert
if cpu usage (user) > 80% for 3 cycles then alert
if cpu usage (system) > 80% for 3 cycles then alert
if cpu usage (wait) > 20% then alert
if loadavg (1min) > 2 then alert
if loadavg (5min) > 1 then alert
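
Load average thresholds only make sense relative to the number of CPU cores, so it is worth looking at both before choosing values. A minimal sketch, assuming Linux (`/proc/loadavg`) and coreutils `nproc`:

```shell
#!/bin/sh
# /proc/loadavg starts with the 1, 5 and 15 minute load averages
read -r load1 load5 load15 rest < /proc/loadavg
cores=$(nproc)

echo "load average: $load1 (1 min), $load5 (5 min), $load15 (15 min)"
echo "CPU cores: $cores"
```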

Checking and monitoring mounts

check filesystem datafs with path /dev/sdb1
start program  = "/bin/mount /data" 
stop program  = "/bin/umount /data" 
if failed permission 660 then alert
if failed uid root then alert
if failed gid disk then alert
if space usage > 80% for 5 times within 15 cycles then alert
if space usage > 99% then stop
if inode usage > 30000 then alert
if inode usage > 99% then stop
group server

Checking permissions of files and directories

check directory bin with path /bin
if failed permission 755 then alert
if failed uid 0 then alert
if failed gid 0 then alert
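
The expected mode, uid and gid can be read from the current state of the path before they are written into the rule. A minimal sketch, assuming GNU `stat`; `perm_owner` is a hypothetical helper and follows symlinks, since /bin is a symlink on some distributions:

```shell
#!/bin/sh
perm_owner() {
    # $1 = path; prints "mode uid gid" of the target
    stat -L -c '%a %u %g' "$1"
}

perm_owner /bin
```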

Monitoring NTP

check process ntpd with pidfile /var/run/ntpd.pid
  start program = "/etc/init.d/ntpd start" 
  stop  program = "/etc/init.d/ntpd stop" 
  if failed host 127.0.0.1 port 123 type udp then alert
  if 5 restarts within 5 cycles then timeout

Monitoring Net-SNMP

check process snmpd with pidfile /var/run/snmpd.pid
   start program = "/etc/init.d/snmpd start" 
   stop program = "/etc/init.d/snmpd stop" 
   if failed host 192.168.1.1 port 161 type udp then restart
   if failed host 192.168.1.1 port 199 type tcp then restart
   if 5 restarts within 5 cycles then timeout

Monitoring BIND (chrooted)

check process named with pidfile /var/named/chroot/var/run/named/named.pid
   start program = "/etc/init.d/named start" 
   stop program = "/etc/init.d/named stop" 
   if failed host 127.0.0.1 port 53 type tcp protocol dns then alert
   if failed host 127.0.0.1 port 53 type udp protocol dns then alert
   if 5 restarts within 5 cycles then timeout

Monitoring FTP (Proftpd)

check process proftpd with pidfile /var/run/proftpd.pid
   start program = "/etc/init.d/proftpd start" 
   stop program  = "/etc/init.d/proftpd stop" 
   if failed port 21 protocol ftp then restart
   if 5 restarts within 5 cycles then timeout

Monitoring SSH

check process sshd with pidfile /var/run/sshd.pid
   start program  "/etc/init.d/sshd start" 
   stop program  "/etc/init.d/sshd stop" 
   if failed port 22 protocol ssh then restart
   if 5 restarts within 5 cycles then timeout

Monitoring the web server's response

check process apache with pidfile /var/run/httpd.pid
       start "/etc/init.d/httpd start" 
       stop  "/etc/init.d/httpd stop" 
       if failed 
          host www.mail.ru port 80 and
          send "GET / HTTP/1.1\r\nHost: www.mail.ru\r\n\r\n"
          expect "HTTP/[0-9\.]{3} 200.*" 
       then alert
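
To see which responses the expect pattern accepts, the same regular expression can be tried against sample status lines. A minimal sketch using `grep -E`; `status_ok` is a hypothetical helper, and the sample lines are made up for illustration:

```shell
#!/bin/sh
status_ok() {
    # $1 = first line of an HTTP response; succeeds only for status 200
    printf '%s\n' "$1" | grep -Eq '^HTTP/[0-9.]{3} 200'
}

for line in "HTTP/1.1 200 OK" "HTTP/1.1 503 Service Unavailable"; do
    if status_ok "$line"; then
        echo "match:    $line"
    else
        echo "no match: $line"
    fi
done
```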

Monitoring with custom headers

check host www.mail.ru with address www.mail.ru
    if failed
       port 80 protocol http
       with http headers [Host: www.mail.ru, Cache-Control: no-cache,
         Cookie: csrftoken=nj1bI3CnMCaiNv4beqo8ZaCfAQQvpgLH]
       and request /index.php with content = "hosting [0-9.]+" 
    then alert
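
The content pattern can be tested outside Monit in the same way. A minimal sketch that reads a response body on standard input; `body_matches` is a hypothetical helper, and the curl command in the comment assumes curl and network access are available:

```shell
#!/bin/sh
body_matches() {
    # stdin = response body; succeeds if it contains "hosting <number>"
    grep -Eq 'hosting [0-9.]+'
}

# Against a real server:
#   curl -s -H 'Cache-Control: no-cache' http://www.mail.ru/index.php | body_matches

if echo "Powered by hosting 2.5" | body_matches; then
    echo "pattern found"
else
    echo "pattern not found"
fi
```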

Monit documentation with an explanation of each function