I am currently using Monit to monitor Apache and restart it if its memory usage is too high. However, I'd also like to be able to monitor the individual apache2 subprocesses that are spawned, and kill any subprocess whose memory usage is too high over a period of a few minutes. How can I do that?

Matt White

2 Answers

According to Monit's documentation, it can natively monitor only the total memory used by Apache and its child processes together, not the memory of any individual child process.

However, Monit can check the exit status of a script with its "check program" test:

http://mmonit.com/monit/documentation/monit.html#program_status_testing

So, you can do something like this as a check script:

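#!/bin/bash
# Exit 1 if any Apache child process uses more than $threshold KB of resident memory.
threshold=10000 # in KB (ps reports RSS in kilobytes), roughly 10MB

# $(pgrep ...) is deliberately unquoted so each child PID becomes a separate argument to ps.
for childmem in $(ps h orss p $(pgrep -P "$(cat /var/run/httpd.pid)"))
do
  if [ "$childmem" -gt "$threshold" ]; then
     exit 1
  fi
done
exit 0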

If that script is /usr/local/bin/check_apache_children.sh, then you can do something like:

 check program myscript with path "/usr/local/bin/check_apache_children.sh"
       if status != 0 then exec "/usr/local/bin/kill_apache_children.sh"

The kill script will presumably look like the check script, but with a kill on the PID instead of an exit.
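
A minimal sketch of such a kill script, assuming the same PID file and threshold as the check script above. The helper name over_limit_pids and the use of a plain kill (SIGTERM) instead of kill -9 are my own illustrative choices, so treat this as a starting point and not a tested solution:

```shell
#!/bin/bash
# Hypothetical /usr/local/bin/kill_apache_children.sh (sketch only).
threshold=10000              # KB of RSS, same limit as the check script
pidfile=/var/run/httpd.pid   # assumed location, as in the check script

# Read "PID RSS" lines on stdin and print the PIDs whose RSS exceeds the limit in $1.
over_limit_pids() {
  awk -v limit="$1" '$2 + 0 > limit + 0 { print $1 }'
}

# Do nothing if Apache's PID file is missing or unreadable.
if [ -r "$pidfile" ]; then
  children=$(pgrep -P "$(cat "$pidfile")")
  if [ -n "$children" ]; then
    # $children is left unquoted so each PID becomes its own argument to ps.
    ps h -o pid,rss p $children | over_limit_pids "$threshold" |
      while read -r pid; do
        kill "$pid"          # SIGTERM; switch to kill -9 only if children ignore it
      done
  fi
fi
```

Keeping the filter in its own function makes it easy to try by hand, e.g. printf '101 5000\n102 20000\n' | over_limit_pids 10000 should print only 102.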

The scripts are, of course, illustrative, and should be modified to your environment.

cjc
  • Nice workaround. – ewwhite Jul 17 '12 at 16:57
  • Monit = nice and simple if you can work within the provided primitives. If not, then you have more and more complicated workarounds, and then you think, "why didn't I use Nagios?" – cjc Jul 17 '12 at 17:42
  • Thanks, this is exactly what I needed. I'm going to post the exact solution I used below, in case it helps anyone else. – Matt White Jul 17 '12 at 19:28
I accepted cjc's answer above, but wanted to post exactly how I used his suggestion to solve this problem. Note that you will need to use at least Monit 5.3 to use Monit's "check program". I am running Debian.

/usr/local/bin/monit_check_apache2_children:

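#!/usr/bin/env bash
# Usage: monit_check_apache2_children [kill]
# Without arguments: log every child over the limit and exit 1 once one has
# stayed over it long enough. With "kill": kill those children instead.

log_file=/path/to/monit_check_apache2_children.log
mem_limit=6           # percent of system memory, as reported by ps %mem
kill_after_minutes=5  # the logic below expects roughly one check per minute

exit_code=0
date_nice=$(date +'%Y-%m-%d %H:%M:%S')
date_seconds=$(date +'%s')
# One "pid,%mem" entry per Apache child. $(pgrep ...) is deliberately unquoted
# so that each child PID becomes a separate argument to ps.
apache_children=$(ps h -o pid,%mem p $(pgrep -P "$(cat /var/run/apache2.pid)") | awk '{ print $1 "," $2 }')

for apache_child in $apache_children; do
  pid=${apache_child%%,*}
  mem=${apache_child##*,}
  mem_rounded=$(awk -v mem="$mem" 'BEGIN { printf("%d\n", mem + 0.5) }')

  if [ "$mem_rounded" -ge "$mem_limit" ]; then
    # Over-limit entries already logged for this PID (kill records excluded).
    log_entry_count=$(grep -v 'KILLED' "$log_file" | grep -c " $pid; ")
    # Epoch timestamp (third field) of the entry $kill_after_minutes entries back.
    log_entry_time=$(grep -v 'KILLED' "$log_file" | grep " $pid; " | tail -n "$kill_after_minutes" | head -n 1 | awk '{ print $3 }')

    if [ "$1" != "kill" ]; then
      echo "$date_nice $date_seconds Process: $pid; Memory Usage: $mem" >> "$log_file"
    fi

    # Act only if the last $kill_after_minutes entries fall inside the time window (plus 30 seconds of slack).
    if [ $((date_seconds - log_entry_time)) -le $(((kill_after_minutes * 60) + 30)) ] && [ "$log_entry_count" -ge "$kill_after_minutes" ]; then
      if [ "$1" = "kill" ]; then
        kill -9 "$pid"
        echo "$date_nice $date_seconds ***** KILLED APACHE2 PROCESS: $pid; MEMORY USAGE: $mem" >> "$log_file"
      else
        exit_code=1
      fi
    fi
  fi
done

exit $exit_code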

/etc/monitrc:

...
check program apache2_children
  with path "/usr/local/bin/monit_check_apache2_children"
  if status != 0 then exec "/usr/local/bin/monit_check_apache2_children kill"
...
Matt White