
With gridengine-master 6.2u5-7.3 (Ubuntu Trusty), our /var/lib/gridengine/spool/qmaster/messages gets constantly filled with:

12/07/2016 04:11:43|worker|tools-grid-master|E|got load report of unknown exec host "tools-exec-1204.eqiad.wmflabs"

(tools-exec-1204.eqiad.wmflabs is a host that no longer exists.)

How can I convince the grid master to "move on", i.e. to "accept" that it received a load report from an unknown host, or to "delete" the load report from its inbox?

1 Answer


Apparently the problem was that the host had been shut down and removed from DNS but was still referenced in host_aliases. Removing the host's entry from host_aliases and restarting the grid master so that it rereads the file (service gridengine-master restart) stopped the errors.
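
For anyone repeating this, the cleanup can be sketched roughly as below. This is only an illustration, not the exact commands that were run: the helper name remove_stale_alias is made up, and the host_aliases path in the usage comment is an assumption (it normally lives under $SGE_ROOT/$SGE_CELL/common/, so check where your installation keeps it). The function drops every line in which the stale host appears as a whole field and keeps a .bak copy of the original file.

```shell
# Hypothetical helper: remove every host_aliases line that mentions a host.
# $1 = path to host_aliases, $2 = stale host name
remove_stale_alias() {
    file="$1"
    host="$2"
    # Keep a backup so the change can be reverted.
    cp -p "$file" "$file.bak" || return 1
    # Skip lines where any whitespace-separated field equals the host;
    # print everything else unchanged.
    awk -v h="$host" '{ for (i = 1; i <= NF; i++) if ($i == h) next; print }' \
        "$file.bak" > "$file"
}

# Usage (run as root; the path is an assumption, adjust to your setup):
#   remove_stale_alias /var/lib/gridengine/default/common/host_aliases \
#       tools-exec-1204.eqiad.wmflabs
#   service gridengine-master restart
```

The restart is still needed afterwards, since the grid master only picked up the edited host_aliases once it had been restarted.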