Questions tagged [torque]

TORQUE is a fork of OpenPBS, a job scheduling application used to distribute workloads within a UNIX-based computing cluster.

30 questions
30
votes
6 answers

Outgrowing cron: what's the next scheduler?

We've been using cron for about as long as I can remember to handle all of our job scheduling needs. Everything from storage clones/snapshots to reports against databases to daily system reports to monitoring checks are scheduled across a few…
Cakemox
  • 24,141
  • 6
  • 41
  • 67
5
votes
0 answers

Torque PBS 4.0.1 job stays in queued ('Q') state; the scheduler seems not to receive any notification

I am using torque 4.0.1 on openSUSE 12.1 in a cluster environment. When I qsub a job (simple as "echo hello"), it remains in 'Q' state, and never gets scheduled. I can force the job to run with qrun, and it is executed on the first node without…
liding
  • 51
  • 1
  • 4
3
votes
0 answers

Email notifications per JOB ARRAY not per job in PBS torque

Is there a way to configure Torque to send email notifications on the start and end of a job array, and not per job? I'm managing job arrays of thousands of jobs, and I don't want to get flooded by mails. But indeed I want to know when the entire job array…
3
votes
2 answers

Torque jobs do not enter "E" state (unless "qrun")

Jobs I add to the queue stay there in "Queued" state with no attempt to execute them (unless I manually qrun them). /var/spool/torque/server_logs says just: 04/11/2011 12:43:27;0100;PBS_Server;Job;16.localhost;enqueuing into batch, state 1 hop…
Vi.
  • 821
  • 11
  • 19
2
votes
1 answer

Job submitted to Torque does not generate error/log file

As stated, I have just installed Torque on an Ubuntu 16.04 machine. The submitted jobs complete just fine, but the -e and -o flags do not seem to be working. No error and log files are created even though I have given the flag an absolute path to the…
user121392
  • 13
  • 1
  • 6
2
votes
0 answers

Torque pbs queue system runs queue in reverse

I have a small compute cluster set up on Redhat 7.1. It runs the PBS torque queue system with version 5.1.1. When I queue several jobs it starts to run the jobs in "backwards" priority. It starts with the job which was submitted last. Is there any…
Pe2
  • 21
  • 2
2
votes
1 answer

qsub: How can I find out what DRM middleware exactly is installed on a cluster?

I have a user account on a very big cluster. I have previous experience with Grid Engine and want to use the cluster for array jobs. The documentation tells me to use "qsub" for load balancing / submission of many jobs. Therefore I assumed this…
user116990
2
votes
1 answer

How can I set up interactive-job-only or batch-job-only partition on a SLURM cluster?

I'm managing a PBS/torque HPC cluster, and now I'm setting up another cluster with SLURM. On the PBS cluster, I can set a queue to accept only interactive jobs by qmgr -c "set queue interactive_q disallowed_types = batch" and to accept only batch…
wdg
  • 143
  • 1
  • 5
1
vote
1 answer

Running tensorflow code in torque job

I have a cluster running with torque to distribute jobs. I want to run a job with tensorflow code and I am having problems with tensorflow not being recognized. I installed tensorflow on my LDAP user using anaconda and so I can enter the tensorflow…
Oha Noch
  • 121
  • 5
1
vote
0 answers

How to set up the bash environment inside Torque PBS? Why doesn't source ~/.bashrc work?

I have successfully installed Torque PBS on my Ubuntu server. Job submission is fine. However, there is one annoying thing: I found that the bash environment is not right inside PBS. For example: echo 'echo $PATH > ~/res.txt' | qsub and the content of res.txt…
user15964
  • 121
  • 5
1
vote
0 answers

PBS Torque Limit Resource by Time of Day

I am using Torque to manage software for which I have a limited number of licenses (4 to be exact). During the day I need to keep 2 licenses free for setting up cases, and at night and over the weekend I can use all 4 licenses to solve cases. Is…
LWhitson2
  • 111
  • 3
1
vote
0 answers

Torque queue issue

I am having trouble with Torque + Maui. The problem is the following: I have 2 queues, and each queue has 10 associated nodes. If I submit 10k jobs to the first queue and 1 job to the second one, the job in the second one remains in Q…
Andrea
  • 11
  • 1
1
vote
1 answer

Why does the qdel command return 'Unknown Job Id'?

OS Version: CentOS release 4.6 (Final) Kernel \r on an \m 2.6.9-100.ELsmp Problem: When I run qdel I get the following error: qdel: Unknown Job Id 20432.scyld.localdomain Information: Output of qstat -n: head0.localdomain: …
1
vote
5 answers

I get the error qsub: Bad UID for job execution when trying to submit a job via PBS

OS Version: CentOS release 4.6 (Final) Kernel \r on an \m 2.6.9-100.ELsmp When I attempt to run a job, it gives me the following error: qsub: Bad UID for job execution. I have created a fresh user account and the same error occurs, yet other users…
1
vote
1 answer

Torque and Maui node status

I am new to Torque and Maui. I was checking node state to find out which nodes are free and which are in use. For Torque, one command is pbsnodes, which gives status and other info related to a node. When I was checking for Maui then I…
Nilesh
  • 255
  • 1
  • 6
  • 17