We are looking to deploy a queueing system, and SGE looks like it will meet nearly all of our needs. However, we would like to support both a synchronous and an asynchronous queueing model. In other words:

  1. We would have all our worker nodes tied to a synchronous queue, so that jobs assigned to them would queue up as normal, i.e. a job runs and, when it finishes, another is accepted and run.

  2. We wanted to be able to assign "asynchronous" jobs to nodes as well. These would be tasks that could be done in parallel with other jobs, usually maintenance tasks on the machines themselves.

I see in the SGE documentation that it is possible to define multiple queues across nodes, but that isn't quite the same thing as having a queue that takes any job given to it and launches it into the background, then accepts another. I'm not completely up to speed on all the configuration options in SGE, but it seems like this might be possible. Can anyone point me towards some info on how this might be configured?

Rick Reynolds

1 Answer

You can define the number of slots per queue. So for your "synchronous" job queue you may want to set the number of slots per host to 1. This way only a single job will be accepted to a host's queue at a time, and once it finishes another one may run. For the "asynchronous" queue, just set the number of slots to some high number so that whatever number of jobs you need can run at once on the host.
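As a rough sketch of what that could look like, here are the relevant lines of two queue configurations as they might appear in the output of qconf -sq. The queue name async.q and the slot count of 64 are assumptions made for illustration; pick whatever limit suits your hosts.

```
# excerpt of: qconf -sq sync.q
qname                 sync.q
slots                 1

# excerpt of: qconf -sq async.q   (hypothetical name and slot count)
qname                 async.q
slots                 64
```

An existing queue can be edited with qconf -mq sync.q, which opens the full configuration in an editor; only the slots line needs to change.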

Now that you have two queues, you need some way to target jobs to them. A simple way to do that is to use the -q switch of qsub to explicitly select a queue, e.g. qsub -q 'sync.q@*'.

However, it's preferable to let gridengine decide which queue to place a job into. For that you can define a complex, say sync, and mark it as forced (requestable set to FORCED in the complex definition). Then you assign that complex to the queue. Now jobs submitted with qsub -l sync will only be placed into a queue with that attribute available, and jobs that do not request it will stay out of that queue. The benefit of this is some additional flexibility if you decide to reorganize your queues, as well as a slightly simpler submission procedure.
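A minimal sketch of the complex-based setup, assuming a boolean complex named sync attached to a queue called sync.q. The shortcut, default and urgency values below are illustrative guesses rather than anything from a real cluster, and job.sh is a hypothetical job script.

```
# line added to the complex list (edited with: qconf -mc)
#name   shortcut   type   relop   requestable   consumable   default   urgency
sync    sync       BOOL   ==      FORCED        NO           0         0

# line in the queue configuration (edited with: qconf -mq sync.q)
complex_values        sync=TRUE

# submitting a job that requests the complex
qsub -l sync job.sh
```

For a boolean complex, -l sync is shorthand for -l sync=true. Because the complex is FORCED, sync.q should only accept jobs that request it, so unrelated jobs do not end up in the single-slot queue by accident.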

Kamil Kisiel