Questions tagged [gridengine]

Grid Engine is a distributed resource management (DRM) system that manages the distribution of users' workloads to available compute resources.

73 questions
23
votes
3 answers

What are the differences between wall clock time, user time and cpu time

We are running computing jobs with GridEngine. Every job returns three different times: wall clock time, user time, and CPU time. What are the differences between these three? Which of these three is most suitable to compare the performance of two…
Peter Smit
14
votes
4 answers

Multiple servers acting like a single one with all the hardware?

By now I have 10 servers for HPC, oriented toward power computing. My users need to launch several processes using qmake. The users are used to working with Ubuntu 9.10, and the software from the repositories is suitable for them. I've deployed Ubuntu 9.10 to…
Marc Riera
13
votes
2 answers

Sun Grid Engine huhohshdhjha

When I type qstat -h, I get the following option: [-s {p|r|s|z|hu|ho|hs|hd|hj|ha|h|a}] show pending, running, suspended, zombie jobs, jobs with a user/operator/system/array-dependency hold, …
jm1234567890
9
votes
1 answer

How can I set the maximum number of running jobs per user on SGE?

We're using SGE (Sun Grid Engine). We have some limitations on the total number of concurrent jobs from all users. I would like to know if it's possible to set a temporary, voluntary limit on the number of concurrent running jobs for a specific…
David B
8
votes
1 answer

Track memory usage of a job on SGE

I'm looking for some guidance in how to precisely figure out how much RAM my job is using on my cluster. My job is not multi-threaded and runs on a single cpu. When I run my job and run "top" I can see that it uses this much RAM... VIRT: 45.6g RES:…
lonestar21
8
votes
1 answer

Kill an SGE job "already in deletion", as user

Is there a way that my users can kill their own jobs that are stuck in the dr state? Running qdel -f as the user returns "job is already in deletion", yet when run as root the job does get deleted.
pufferfish
6
votes
7 answers

How to reserve complete nodes on Sun Grid Engine?

How do you use SGE to reserve complete nodes on a cluster? I don't want 2 processors from one machine, 3 processors from another, and so on. I have a quad-core cluster and I want to reserve 4 complete machines, each having 4 slots. I cannot just…
artif
5
votes
4 answers

How to set up SGE for CUDA devices?

I'm currently facing the problem of integrating GPU servers into an existing SGE environment. Using Google I found some examples of clusters where this has been set up, but no information on how it was done. Is there some form of how-to or…
luxifer
5
votes
1 answer

Overlapping queues on Sun Grid Engine?

We would like to have an SGE-based compute cluster with a queue that gives access to all nodes for the computational staff, and a second cluster queue that gives access to, say, half the nodes for occasional (but heavy) use by other staff. We want…
Alex Reynolds
4
votes
2 answers

What is the difference between h_rss and h_vmem in Sun Grid Engine (SGE)?

As far as I understand, mem_free can be specified to submit a job to a host that has free memory = mem_free, whereas h_vmem is the hard limit on the memory the job can consume, and if the job reaches h_vmem, the job crashes? I…
GP92
4
votes
1 answer

What does the qstat output jclass mean?

What does the qstat output jclass mean? $ qstat -help UGE 8.1.4 $ qstat -u myusername job-ID prior name user state submit/start at queue jclass slots ja-task-ID…
Daniel
4
votes
2 answers

Are there cluster resource schedulers abstraction layers?

I'm writing an application that could potentially be run on any cluster resource scheduler (SGE, LSF or SLURM to name a few of them), using very basic functionalities. I'm wondering if a framework/abstraction layer does exist for interacting with…
nicoulaj
3
votes
1 answer

Is it a bad idea to add lots of machines as submit hosts in an SGE environment?

We're replacing a home-grown queueing system with SGE/OGE. The current work environment has engineers using their own local Linux workstation to submit jobs. So I'm wondering about adding many machines as submit hosts to an SGE/OGE cluster. In…
Rick Reynolds
3
votes
1 answer

asynchronous job queueing in sun grid engine (SGE) - possible?

We are looking to deploy a queueing system, and SGE is looking like it will meet nearly all of our wishes. However, we had the idea of supporting both a synchronous and asynchronous queueing model. In other words: We would have all our worker…
Rick Reynolds
3
votes
2 answers

Sun Grid Engine Array Job Individual Resources

Is it possible in Sun Grid Engine to have array jobs where each subtask has a unique requirement? For example, I may have an array job for which each task has a small unique requirement, but I do not want to have to launch each job separately.
Amit