
There is a problem with my OGE configuration: load_avg for the nodes never gets set (it remains at -NA-). Because of this, and because of the np_load_avg threshold on the queue, no jobs are being run.

[ce@node1 ce]$ qhost -F -l h=node2
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
node2                   -               -     -       -       -       -       -

No errors show up in default/spool/localhost/messages or in qmaster/messages. The queue scheduling message is 'no value for complex attribute np_load_avg'.

I do not see any indication of what could be going wrong. The following all work on the execution node:

  • gethostname
  • gethostbyname master
  • qstat -f
  • loadcheck
Adversus

1 Answer

The problem was in my /etc/hosts file, which mapped the node's hostname to the loopback address:

127.0.0.1 node2

This had to become the node's real address:

10.0.0.2    node2

After that change, qhost finally reports the load values:

[ce@node1 ce]$ qhost -F -l h=node2
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
node2                   linux-x64       8  0.00   31.3G  308.8M   11.9G     0.0

and the hostname now resolves to the real address:

[ce@node2 ce]# utilbin/linux-x64/gethostname 
Hostname: node2
Aliases:  
Host Address(es): 10.0.0.2 
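
To spot this kind of entry on other nodes, the hosts file can be scanned for lines that map the node's name to a loopback address. The following is only a minimal sketch and not part of OGE: the function name loopback_entries is hypothetical, and it assumes the usual hosts layout of an address followed by one or more names, with # starting a comment.

```python
import ipaddress


def loopback_entries(hosts_text, hostname):
    """Return the loopback addresses that hosts_text maps hostname to."""
    hits = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        addr, names = fields[0], fields[1:]
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            continue  # skip lines that do not start with a plain IP address
        if ip.is_loopback and hostname in names:
            hits.append(addr)
    return hits


# The two hosts entries from this answer, before and after the fix.
broken = "127.0.0.1 node2\n"
fixed = "10.0.0.2    node2\n"

print(loopback_entries(broken, "node2"))  # ['127.0.0.1']
print(loopback_entries(fixed, "node2"))   # []
```

On a real node, pass in the contents of /etc/hosts and the node's own hostname. A non-empty result means the name is still bound to loopback, as it was in my case.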
Adversus