
I asked this question on meta.stackoverflow.com; they said it was the wrong forum.

I am using the FALCON assembler (for PacBio genomes; https://github.com/PacificBiosciences/pb-assembly), which is designed to run on grid engines via a job scheduler (local, sge, lsf, pbs, torque, slurm).

The software installs via conda, and I've run the test data locally on my login node; the program runs as expected. Now I am trying to run it out on the grid from my login node using the PBS/TORQUE scheduler.

My question is this: How do I properly specify the maximum number of processors and maximum memory variables?

The script should query the grid and determine the available resources (${NPROC} and ${MB}) at the beginning of the run, but I am unclear whether the variables ${NPROC} and ${MB} are appropriate (correct) for PBS/TORQUE.

For SGE this is how the config was written, and I've begun to modify it for PBS/TORQUE.

#JMout job_type=sge
#JMout pwatcher_type=blocking
job_type=pbs
JOB_QUEUE=batch
MB=32768
NPROC=6
njobs=32
submit = qsub -S /bin/bash -V  \
  -q ${JOB_QUEUE}     \
  -N ${JOB_NAME}      \
  -o "${JOB_STDOUT}"  \
  -e "${JOB_STDERR}"  \
  -pe smp ${NPROC}    \
  -l h_vmem=${MB}M    \
  "${JOB_SCRIPT}"

Specifically, these are the two lines I have concerns about:

 -pe smp ${NPROC}    \
 -l h_vmem=${MB}M    \
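For PBS/TORQUE, my current (untested) guess is to replace those two resource requests with -l nodes=1:ppn=${NPROC} and -l mem=${MB}mb. The full draft submit line would then look something like the sketch below, but confirming that this mapping is correct is exactly what I am unsure about:

 # Draft PBS/TORQUE submit string (my assumption, not yet verified):
 submit = qsub -S /bin/bash -V  \
   -q ${JOB_QUEUE}     \
   -N ${JOB_NAME}      \
   -o "${JOB_STDOUT}"  \
   -e "${JOB_STDERR}"  \
   -l nodes=1:ppn=${NPROC}  \
   -l mem=${MB}mb      \
   "${JOB_SCRIPT}"

My understanding is that -l nodes=1:ppn=${NPROC} keeps all requested cores on a single node (like -pe smp), and that -l mem=${MB}mb caps the job's total memory, whereas SGE's h_vmem is a per-slot limit, so the memory value may need adjusting.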

TIA

Andor Kiss
  • Looks as though -pe smp refers to a shared-memory parallel environment, in other words a multi-core, single-node job. The job should NOT be divided across nodes. And -l h_vmem is the hard limit on RAM. – Andor Kiss Apr 12 '21 at 20:03

0 Answers