We're using SGE (Sun Grid Engine). We have some limitations on the total number of concurrent jobs from all users.

I would like to know if it's possible to set a temporary, voluntary limit on the number of concurrent running jobs for a specific user.

For example, user dave is about to submit 500 jobs, but he would like no more than 100 to run concurrently, e.g. because he knows the jobs do lots of I/O, which bogs down the filesystem (true story, unfortunately).

Is that possible?

Kamil Kisiel
David B

1 Answer

You can define a complex with `qconf -mc`. Call it something like `high_io` or whatever you'd like, and set the consumable field to `YES`. Then set `high_io=500` in the `complex_values`, either globally with `qconf -me global` or for a particular queue with `qconf -mq <queue name>`. Now tell your users to specify `-l high_io=1`, or however many "tokens" you'd like each job to use. The tokens requested by running jobs can never add up to more than the complex value, so with one token per job at most 500 jobs run concurrently.
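As a rough sketch of how that could fit the question's numbers (at most 100 concurrent jobs out of `high_io=500`): each of dave's jobs would request 5 tokens. The shortcut `hio`, the script name `job.sh`, and the helper `tokens_per_job` are made up for illustration; the column layout in the comment is the one shown in the editor that `qconf -mc` opens, but check it against your own installation.

```shell
#!/bin/sh
# Line to add in the editor that `qconf -mc` opens. Columns are:
#   name  shortcut  type  relop  requestable  consumable  default  urgency
# "hio" is a made-up shortcut.
#
#   high_io  hio  INT  <=  YES  YES  0  0

# Tokens each job must request so that at most $2 jobs fit into $1 tokens.
# Rounds up, so the concurrency cap is never exceeded.
tokens_per_job() {
    total=$1
    max_jobs=$2
    echo $(( (total + max_jobs - 1) / max_jobs ))
}

tokens=$(tokens_per_job 500 100)

# Print the submit command instead of running it;
# on a real cluster, drop the leading "echo".
echo qsub -l "high_io=$tokens" job.sh
```

Because the limit is voluntary, it only holds for jobs that actually carry the `-l high_io=...` request.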

The other way to do this is with resource quotas. Add a quota with `qconf -arqs` that looks something like:

 {
        name         dave_max_slots
        description  "Limit dave to 500 slots"
        enabled      true
        limit        users {dave} to slots=500
 }
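If you'd rather not paste the rule into an editor, it can also be loaded from a file. A minimal sketch, assuming `qconf -Arqs <file>` (add a rule set from a file) and `qconf -srqs` (show rule sets) are available in your SGE version; the file path is arbitrary, and the value 100 is taken from the question rather than from the rule above.

```shell
#!/bin/sh
# Write the resource quota set to a file.
# The path and the limit of 100 are example values.
rqs_file=${TMPDIR:-/tmp}/dave_max_slots.rqs

cat > "$rqs_file" <<'EOF'
{
   name         dave_max_slots
   description  "Limit dave to 100 slots"
   enabled      TRUE
   limit        users {dave} to slots=100
}
EOF

# Only talk to the scheduler if qconf is actually installed on this host.
if command -v qconf >/dev/null 2>&1; then
    qconf -Arqs "$rqs_file"      # add the rule set from the file
    qconf -srqs dave_max_slots   # show it to confirm it was stored
else
    echo "qconf not found; wrote $rqs_file only"
fi
```

Note that the quota counts slots, not jobs, so it matches a job count only when every job uses a single slot. Since the limit is meant to be temporary, the rule set can be removed again afterwards with `qconf -drqs dave_max_slots`.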
Kamil Kisiel
  • Thanks Kamil, and sorry for the late reply. A couple of follow-ups, since I'm quite new to `qconf`. Regarding your first suggestion, could you be a bit more explicit? What is "consumable"? After configuring as mentioned, do I simply tell the user to `qsub` with `-l high_io=1`? – David B Sep 28 '10 at 09:39
  • Basically a complex is a resource of value that can be requested by a job with the `-l` switch to `qsub`. By setting a complex to be consumable, it means that when a job requests that complex the number available is decreased. So if a queue has 500 of the high_io complex, and a job requests 20, there will be 480 available for other jobs. You'd request the complex just as in your example. – Kamil Kisiel Sep 28 '10 at 22:42
  • Thank you Kamil. Sorry I can't vote up (not enough reputation yet). – David B Oct 01 '10 at 09:08