We have a cluster running CentOS with Sun Grid Engine for research simulations. However, some users who aren't too familiar with the system end up running jobs on the head node, which of course makes things slow for everyone else.

In the future, I'd like to put an SSH login message warning people of a maximum run time for programs run by users directly on the head node, but also allow long jobs to run (e.g. a big tar/gzip operation) if they nice it.

TL;DR: how do I limit jobs run by users to 24 hours cpu time, if the nice value <= 0?


3 Answers


You could write a script that monitors running jobs, records their CPU usage, and takes action once a job passes a certain amount. You could also try AND (Auto Nice Daemon), or ulimits/PAM limits to cap a user's CPU time (once a user has used that much CPU time, their processes/sessions are killed).
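
As a rough illustration of the ulimit route, here is a minimal sketch that runs a single command under a CPU-time cap. The function name `run_with_cpu_limit` is made up for this example, and it relies on the shell's `ulimit -t` builtin (seconds of CPU time, available in bash and dash). It caps only the command it launches, not a whole login session.

```shell
#!/bin/sh
# Sketch only: run one command with a CPU-time cap, in seconds.
# run_with_cpu_limit is a hypothetical helper name, not an existing tool.
run_with_cpu_limit() {
    limit=$1
    shift
    # The subshell keeps the limit from leaking into the calling shell.
    # ulimit -t sets the CPU-time limit; the kernel signals the process
    # once it has consumed that much CPU time.
    ( ulimit -t "$limit" && exec "$@" )
}
```

For example, `run_with_cpu_limit 86400 tar czf backup.tar.gz data/` would give the tar process at most 24 hours of CPU time (the file names are placeholders).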

  • I've written a script-based answer below myself. PAM limits would otherwise be perfect, just my fault for lack of full disclosure (interactive sessions are OK if they tunnel through to a separate cluster node, and a PAM limit would terminate their session). – sargant Aug 21 '10 at 15:44

You could use cpulimit. It is in the Debian repositories, but not in the CentOS ones. There is a howto for compiling and using it here: link. Since you have a cluster, just compile it on one spare machine, test it, and deploy it with your configuration management tool, such as cfengine.

natxo asenjo
  • well, this is obviously *not* for limiting the cpu time for the user, but the cpu capacity they may use in percentage. Sorry for the misunderstanding. – natxo asenjo Aug 20 '10 at 12:24
  • Yeah, thanks though. The problem is not that they're using too much CPU, but that they shouldn't be using the node for big jobs at all. I've since tried to cobble something together using the otherwise mysterious `ps` command. – sargant Aug 21 '10 at 10:35

Not a full solution, but I've subsequently found the command I need:

ps alr | awk 'NR > 1 && $6 <= 0 {print $2, $3, $12}'

This lists all running jobs with a nice value <= 0 in three columns: UID, PID and cumulative CPU time (the TIME column of `ps`), like so:

543 3208 11436:31
511 16491 0:00

I'm hoping it's then a relatively straightforward matter of setting up a cron job to parse this data, check UIDs and kill jobs as appropriate.
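
A minimal sketch of what that cron job might look like follows. It is not a tested production script. It assumes the procps `ps -eo uid=,pid=,ni=,cputime=` output format, where CPU time is printed as `[DD-]HH:MM:SS`, instead of the `ps alr` columns above. The names `find_offenders`, `enforce`, `LIMIT_SECONDS`, `MIN_UID` and `DO_KILL` are invented for this example, and the UID cutoff of 500 is an assumption about where regular accounts start.

```shell
#!/bin/sh
# Sketch only: flag un-niced processes that exceed a CPU-time limit.
# All names and defaults below are illustrative assumptions.
LIMIT_SECONDS=${LIMIT_SECONDS:-86400}   # 24 hours of CPU time
MIN_UID=${MIN_UID:-500}                 # assumed first regular-user UID

# Reads "UID PID NI [DD-]HH:MM:SS" lines on stdin and prints
# "UID PID SECONDS" for each process with nice <= 0 over the limit.
find_offenders() {
    awk -v limit="$LIMIT_SECONDS" -v min_uid="$MIN_UID" '
    $3 ~ /^-?[0-9]+$/ {
        days = 0; t = $4
        # Split off an optional leading "DD-" day count.
        if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
        # Fold HH:MM:SS (or MM:SS) into seconds.
        n = split(t, p, ":")
        secs = 0
        for (i = 1; i <= n; i++) secs = secs * 60 + p[i]
        secs += days * 86400
        if ($1 >= min_uid && $3 <= 0 && secs > limit) print $1, $2, secs
    }'
}

# Dry run by default: only report. Set DO_KILL=1 to send SIGTERM.
enforce() {
    ps -eo uid=,pid=,ni=,cputime= | find_offenders |
    while read -r uid pid secs; do
        echo "uid=$uid pid=$pid cpu_seconds=$secs exceeds $LIMIT_SECONDS"
        if [ "${DO_KILL:-0}" = 1 ]; then
            kill -TERM "$pid"
        fi
    done
}

# Only act when explicitly asked, e.g. from a cron entry.
if [ "${1:-}" = "--run" ]; then
    enforce
fi
```

Run it with `--run` from cron in dry-run mode first and check the report before setting `DO_KILL=1`. Niced processes (nice > 0) are never listed, so a long tar/gzip started with `nice` is left alone.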
