Questions tagged [torque]

TORQUE is a fork of OpenPBS, a job scheduling application used to distribute workloads within a UNIX-based computing cluster.

30 questions
1
vote
1 answer

Non-exclusive job scheduling in PBS/Torque

The cluster resource manager Torque typically allocates compute nodes on an exclusive basis. However, when you have a lot of small jobs (like we do) running against multi-core compute nodes, this can result in a lot of wasted resources. Is there…
ajdecon
  • 1,291
  • 4
  • 14
  • 21
1
vote
2 answers

Torque works half of the time. Fails No Permission the other half

We upgraded our OS from Debian 5 to Debian 6 and consequently upgraded Torque. Now qstat and qsub work for about 1 minute and then fail for another minute. I have torque-2.5.5 (but I tried 2.4.8 and it had the same issues). When we run qstat half of the…
Aleksandr Levchuk
  • 2,415
  • 3
  • 21
  • 41
1
vote
0 answers

Assigning CPU and MEM usage variables in PBS/TORQUE

I asked this question at meta.stackoverflow.com; they said it was the wrong forum. I am using the FALCON assembler (for genomes, by PacBio; https://github.com/PacificBiosciences/pb-assembly), and it's designed for use on GRID engines with a scheduler…
Andor Kiss
  • 111
  • 1
0
votes
1 answer

Less CPUload with torque

We have installed Torque on a dual Xeon (26 core, 52 available in hyperthreading). The node is configured with np=104. If I launch an MPI calculation from the command line, I get near 100% CPU usage: %Cpu(s): 53.9 us, 44.6 sy, 0.0 ni, 1.4 id, 0.0 wa, …
0
votes
1 answer

Solution for dynamic allocation of CPU cores for MPI program

I am running an MPI program on limited CPU resources. It involves running an application that requires 20 separate processes on a 12-thread CPU. I run it again and again with different parameters. Towards the end of the application, most of the 20…
user121392
  • 13
  • 1
  • 6
0
votes
1 answer

How to append PBS job output to log files instead of overwriting it?

I'm using Torque and I want my PBS log and error output to be appended to the previous log/error files instead of overwriting them.
user121392
  • 13
  • 1
  • 6
0
votes
1 answer

Library not found only when application is executed from PBS file.

I have a compiled file a.out that runs fine when executed directly from my terminal. However, trying to execute that file from my PBS file fails with a missing library, libmkl_intel_lp64.so. I have already tried exporting the path of the library to…
user121392
  • 13
  • 1
  • 6
0
votes
1 answer

prevent normal users from running code on a cluster outside of pbs system

In our cluster with a PBS batch system (Torque) installed, we want all users to run their jobs through qsub so that CPU resources can be well managed. However, we have found that users in our cluster can still run their programs directly…
0
votes
1 answer

Torque reports error when posting job to client nodes

The system has two machines: one (called macondo02) runs pbs_server and pbs_schedule, the other (called macondo01) runs pbs_mom. I have ensured that the host can clearly identify the existence of the guest: $ pbsnodes -a macondo01 state = free np =…
0
votes
3 answers

Parallel prologue and epilogue in Grid Engine

We have a cluster being used to run MPI jobs for a customer. Previously this cluster used Torque as the scheduler, but we are transitioning to Grid Engine 6.2u5 (for some other features). Unfortunately, we are having trouble duplicating some of…
ajdecon
  • 1,291
  • 4
  • 14
  • 21
0
votes
1 answer

Installing Torque on a single machine

Does anyone know if I can install Torque on a multicore machine instead of across a cluster? We're looking to test some software that requires Torque and exceeds the hardware capability of our LSF based cluster. We have a multicore machine that…
geoffjentry
  • 151
  • 1
  • 6
0
votes
1 answer

Only half of my pbs/Torque jobs are being scheduled

My supercomputing center recently moved from SGE to pbs/Torque. Now, when I schedule my array jobs, only half of the jobs in the array get scheduled. When they finish, the other half get scheduled. This happens despite the fact that they are largely…
vy32
  • 2,018
  • 1
  • 15
  • 20
0
votes
0 answers

torque/undelivered - collect finished jobs

I had the problem that due to an incorrectly configured ssh the pbs_mom couldn't send the finished data (.ER and .OU files) back to the server. After a look into the log, I could locate the data on the nodes in the /var/spool/torque/undelivered/…
stephanp
  • 21
  • 3
-1
votes
1 answer

How can we configure torque with multiple nodes for a workstation?

I've got a GPU workstation with a 48-core CPU and 4 NVIDIA GPUs. I am going to turn this machine into a small cluster containing: 4 nodes, 12 cores + 1 CPU/node. I've installed Torque on this machine with the command: ./configure --without-tcl…
-1
votes
1 answer

Torque: How to lock node cores to one application

I am having an issue that I do not know how to solve. Say I have 2 nodes (node1 and node2), each with 24 cores, and software that I have a license for on 32 cores. I want to be able to configure node1 to ONLY accept jobs for that software, and I…
James
  • 1