Hi, I'm looking for a Bash script for CentOS 6.5 that kills processes by name once they have been running longer than 5 minutes. I asked this question before and received an answer that doesn't work on CentOS 6.5 because its killall doesn't support --older-than. I'm looking for an equivalent that will work on CentOS.

That post is here: Kill any GS process that's been running for over 5m on CentOS 6.5

Thanks!

Jonathan

2 Answers

3

I really suggest finding the root cause of this issue (or this issue or this issue).

A killall is a heavy-handed approach to process management, and your real issue is probably an application or resource problem.

Can you outline what you've tried so far? The types of things I would check are:

  • System vitals at the time these runaway Ghostscript processes occur: RAM? CPU?
  • Make sure the system this is running on has enough memory and doesn't have major contention for other resources.
  • Is this a physical or virtual server?
  • Talk to the vendor. There's a community and some level of support around PrinceXML.
  • A possible strace of the affected PIDs and Parent PIDs.
  • Are all of the requisite fonts installed?
  • Try logging the times that this happens to see if there's a correlation between the hang and other system events.
  • If you don't have historical and granular monitoring, you should. You could even try something like NewRelic to try to get a picture of what is happening or happened at a given time.
  • Check apache settings. It looks as though Ghostscript is being spawned by the apache user. Are there any limits or server settings that should be examined here?
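
For the logging and correlation points above, a small snapshot function could be a starting point. This is only a sketch: the log path, the function name `log_snapshot`, and the process name `gs` are assumptions, so adjust them to whatever `ps` actually shows on your box.

```shell
#!/bin/bash
# Sketch only: the log path and the default process name "gs" are assumptions.
LOG_FILE=${LOG_FILE:-/tmp/gs-watch.log}

# Append one timestamped snapshot of memory use and matching processes.
log_snapshot() {
    local name=${1:-gs}
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        # Overall memory and swap usage in MiB
        free -m
        # -C selects by command name; etime is elapsed time, rss is resident memory in KiB
        ps -C "$name" -o pid,ppid,user,etime,rss,args ||
            echo "(no $name processes running)"
    } >> "$LOG_FILE" 2>&1
}
```

Calling `log_snapshot gs` once a minute would give you a timeline to compare against other system events when a hang is reported.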

Based on your output from an earlier question, it looks like you've only allocated 1 Gigabyte of RAM to this system and possibly only have a single CPU - no swap either...

If all else fails, you can write a script that can clean up old or stalled processes... or just compile a version of killall that supports the --older-than flag.
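
If you do go the script route, here is a rough sketch of how the age check could work without `--older-than`. It parses the `etime` column of `ps`, which has the form `[[dd-]hh:]mm:ss`. The function names are made up for illustration, and it only lists candidates rather than killing anything.

```shell
#!/bin/bash
# Sketch only: function names are illustrative, not from any existing tool.

# Convert ps "etime" output ([[dd-]hh:]mm:ss) to seconds.
etime_to_seconds() {
    local etime=$1 days=0 h=0 m=0 s=0 f1 f2 f3
    # Split off the optional "dd-" prefix.
    if [[ $etime == *-* ]]; then
        days=${etime%%-*}
        etime=${etime#*-}
    fi
    IFS=: read -r f1 f2 f3 <<< "$etime"
    if [[ -n $f3 ]]; then
        h=$f1 m=$f2 s=$f3      # hh:mm:ss
    else
        m=$f1 s=$f2            # mm:ss
    fi
    # 10# forces base 10 so that "08" and "09" are not parsed as octal.
    echo $(( 10#$days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Print the PIDs of processes named $1 that have run longer than $2 seconds.
pids_older_than() {
    local name=$1 limit=$2 pid etime
    for pid in $(pgrep -x "$name"); do
        etime=$(ps -o etime= -p "$pid" | tr -d ' ')
        [[ -z $etime ]] && continue    # the process exited in the meantime
        if (( $(etime_to_seconds "$etime") > limit )); then
            echo "$pid"
        fi
    done
}
```

Assuming the process is really named `gs`, `pids_older_than gs 300` prints the PIDs older than five minutes. Once the list looks right, it can be piped to `xargs -r kill`, which sends SIGTERM and gives each process a chance to exit cleanly.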

ewwhite
  • I'm a programmer, not a sysadmin, so I'll be honest: I'm completely clueless about how to fix this the right way. Are you interested in taking a contract job to diagnose this for us? PM me your email if you're interested. – Jonathan Oct 24 '14 at 13:35
  • @Jonathan Well, how often does this happen? Can you increase RAM on your Rackspace instance? Is it just one CPU? – ewwhite Oct 24 '14 at 13:37
1

Will something like this be ok?

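#!/bin/bash

PROC_NAME=my_proc_name
MAX_SECONDS=300   # 5 minutes

# Get all PIDs whose process name matches exactly.
# pgrep never matches itself, unlike "ps aux | grep".
procs=($(pgrep -x "$PROC_NAME"))

# for each PID in the PIDs array
for pid in "${procs[@]}"; do
    # get elapsed time in the form [[dd-]hh:]mm:ss, without the header line
    etime=$(ps -o etime= -p "$pid" | tr -d ' ')
    [ -z "$etime" ] && continue   # the process has already exited

    # split off the optional days prefix, then the colon-separated fields
    days=0
    case "$etime" in
        *-*) days=${etime%%-*}; etime=${etime#*-} ;;
    esac
    IFS=: read -r f1 f2 f3 <<< "$etime"
    if [ -n "$f3" ]; then h=$f1 m=$f2 s=$f3; else h=0 m=$f1 s=$f2; fi

    # convert to seconds; 10# forces base 10 so "08" is not read as octal
    secs=$(( 10#$days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))

    # if the process has run longer than 5 minutes, kill it
    if [ "$secs" -gt "$MAX_SECONDS" ]; then
        kill -9 "$pid"
    fi
done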

Of course it should be executed by cron or something like that to check processes periodically.

Kamil