Has anyone had luck randomizing Slurm node allocations? We have a small cluster of 12 nodes that could be used by anywhere from 1 to 8 people at a time, with jobs of various sizes and lengths. When testing our new Slurm setup, jobs always land on the first node in the partition whenever no other jobs are running, for both interactive and batch submissions. Is there a way to randomize this scheduling?
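For reference, a minimal way to reproduce what we're seeing (the partition name "batch" is a placeholder for ours):

    # Repeated single-node interactive runs on an otherwise idle cluster
    # all land on the same first node in the partition:
    srun -p batch -N1 --exclusive hostname
    # -> prints node01 every time

Batch submissions via sbatch behave the same way.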
It seems like, depending on a user's schedule, they could consistently get the same nodes, and this could disguise hardware or configuration issues that would otherwise be visible. Our nodes are always allocated exclusively, so we're only looking at randomizing the node-level scheduling.
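One idea we've considered (untested, sketched from the slurm.conf docs): each node can carry a Weight parameter, and among otherwise equal candidates Slurm allocates the lowest-weight nodes first, so permuting the weights would at least break the fixed first-node ordering. Roughly like this, where node names and hardware figures are placeholders for our cluster:

    # slurm.conf fragment -- shuffled Weight values so an idle-cluster job
    # no longer always lands on node01. Among nodes that satisfy a request,
    # Slurm prefers the one with the lowest Weight.
    NodeName=node01 CPUs=16 RealMemory=64000 State=UNKNOWN Weight=7
    NodeName=node02 CPUs=16 RealMemory=64000 State=UNKNOWN Weight=3
    NodeName=node03 CPUs=16 RealMemory=64000 State=UNKNOWN Weight=11
    # ... one line per node, weights drawn as a random permutation ...
    PartitionName=batch Nodes=node[01-12] Default=YES State=UP

The catch is that weights are static between reconfigurations, so this shuffles the preference order once rather than per job. Is there a cleaner built-in way to get per-job randomization?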