
We are running a Beowulf cluster using the Scyld distribution from Penguin Computing, and it looks like cgroups are configured on the head node, but not the compute nodes. I'm trying to configure Slurm to use the proctrack/cgroup plugin, but it won't work on the compute nodes.

For example, I can list the cgroups on the head node, but not on a compute node:

$ bpsh -1 systemd-cgls
├─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
├─user.slice
...
$ bpsh 1 systemd-cgls
Failed to create bus connection: No such file or directory
$

If I look at the mount point for the cgroup system, it's mounted on the head node, but not the compute nodes. The compute nodes just have an empty directory at that location.

$ bpsh -1 findmnt /sys/fs/cgroup
TARGET         SOURCE FSTYPE OPTIONS
/sys/fs/cgroup tmpfs  tmpfs  ro,nosuid,nodev,noexec,mode=755
$ bpsh 1 findmnt /sys/fs/cgroup
$ bpsh 1 ls -l /sys/fs/cgroup
total 0
$

I assume I have to start some cgroup service on the compute nodes, but how? I found the RHEL documentation on cgroups, but it only describes using them, not the initial setup.

Update

The cgroups(7) man page on man7.org describes how to mount cgroup controllers, but adds this note:

Note that on many systems, the v1 controllers are automatically mounted under /sys/fs/cgroup; in particular, systemd(1) automatically creates such mount points.

That explains why I can't see any configuration for cgroups on the head node: they're just mounted automatically. Why aren't they mounted automatically on the compute nodes?
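For reference, the manual mounting that the man page describes can be sketched roughly as follows. This is only an illustration: `cgroup_mount_commands` and the `CGROUP_ROOT` variable are hypothetical names, the controller list is an arbitrary example, and the function prints the commands instead of running them, since whether manual mounts behave correctly on a Scyld compute node is untested here.

```shell
#!/bin/sh
# Print (not run) the commands that would mount each cgroup v1 controller
# on its own hierarchy, in the style of the cgroups(7) examples.
# cgroup_mount_commands is a hypothetical helper; review its output
# before running any of it as root on a node.
cgroup_mount_commands() {
    root=${CGROUP_ROOT:-/sys/fs/cgroup}
    # A tmpfs at the root gives a place to create per-controller directories.
    echo "mount -t tmpfs cgroup_root $root"
    for ctrl in "$@"; do
        echo "mkdir -p $root/$ctrl"
        echo "mount -t cgroup -o $ctrl cgroup $root/$ctrl"
    done
}

# Example controller names, as they would appear in /proc/cgroups.
cgroup_mount_commands cpuset memory freezer devices
```

The printed commands would need root on the target node, and each controller can only be attached to one v1 hierarchy at a time.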

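It looks like the controllers are enabled in the kernel on the compute node but not mounted: every controller there shows hierarchy 0, while on the head node each one has a nonzero hierarchy ID.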

$ cat /proc/cgroups
#subsys_name    hierarchy   num_cgroups enabled
cpuset  6   1   1
cpu 4   1   1
cpuacct 4   1   1
memory  2   1   1
devices 3   1   1
freezer 10  1   1
net_cls 7   1   1
blkio   5   1   1
perf_event  9   1   1
hugetlb 8   1   1
pids    11  1   1
net_prio    7   1   1
$ bpsh 0 cat /proc/cgroups
#subsys_name    hierarchy   num_cgroups enabled
cpuset  0   1   1
cpu 0   1   1
cpuacct 0   1   1
memory  0   1   1
devices 0   1   1
freezer 0   1   1
net_cls 0   1   1
blkio   0   1   1
perf_event  0   1   1
hugetlb 0   1   1
pids    0   1   1
net_prio    0   1   1
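The comparison between the two listings can be automated with a small filter. The sketch below picks out controllers that are enabled but have hierarchy ID 0; `unmounted_controllers` is a hypothetical helper name. According to cgroups(7), a hierarchy of 0 can also mean the controller is bound to the cgroup v2 unified hierarchy, so this reading assumes a v1-only setup like the one shown here.

```shell
#!/bin/sh
# List cgroup v1 controllers that are enabled but not attached to any
# mounted hierarchy (hierarchy ID 0) in a /proc/cgroups-style file.
# unmounted_controllers is a hypothetical helper name, not a system tool.
unmounted_controllers() {
    # $1: file to read; defaults to the local /proc/cgroups.
    # Columns: subsys_name hierarchy num_cgroups enabled
    awk '!/^#/ && $2 == 0 && $4 == 1 { print $1 }' "${1:-/proc/cgroups}"
}

# Run against the local node if the file is readable.
if [ -r /proc/cgroups ]; then
    unmounted_controllers
fi
```

To check a compute node, the listing could be captured first, for example with `bpsh 0 cat /proc/cgroups > node0.cgroups`, and the file name passed as the argument.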

I tried searching for "cgroup" in /var/log/messages, and I found the head node initializing the cgroup subsystems, but nothing from the compute nodes.

Don Kirkby
  • Amazon is using a Beowulf cluster? – Ward - Reinstate Monica Aug 29 '17 at 04:19
  • Nice to hear from you, @Ward. I don't work at Amazon anymore. – Don Kirkby Aug 29 '17 at 17:13
  • Any kernel appends? Look in `/proc/cmdline` on your node. Also check whether `systemd` is running as PID 1 with `pidof systemd`. – Thomas Sep 02 '17 at 11:02
  • It looks like `systemd` isn't running on the compute node, @Thomas, but it is on the head node. I guess that's why cgroups are mounted on the head but not on the compute nodes. As for the cmdline, it has `cgroup_enable=memory swapaccount=1` among other things. It looks like cgroups are loaded but not mounted. – Don Kirkby Sep 06 '17 at 16:21
  • If `systemd` is not running, that is almost certainly the reason cgroups are not mounted on the compute nodes. You could either configure `cgconfig` in `/etc/cgconfig.conf` or, better, `/etc/cgconfig.d/.conf`. But that is also started by a `systemd` unit, so it might be best to contact Penguin Computing to solve the issue. – Thomas Sep 06 '17 at 19:21
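
The two checks suggested in the comments (what runs as PID 1, and which cgroup-related options are on the kernel command line) could be scripted along these lines. This is a minimal sketch: `pid1_name` and `cgroup_cmdline_opts` are hypothetical helper names, and the option prefixes matched are just the ones mentioned in this thread plus other `cgroup_` options.

```shell
#!/bin/sh
# Report what is running as PID 1 and which cgroup-related options are on
# the kernel command line. Both helpers are hypothetical names.
pid1_name() {
    # /proc/1/comm holds the command name of PID 1.
    cat "${1:-/proc/1/comm}"
}

cgroup_cmdline_opts() {
    # Split the command line on spaces and keep cgroup-related options.
    # "|| true" keeps the exit status zero when nothing matches.
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep -E '^(cgroup_|swapaccount=)' || true
}

# Run against the local node if the files are readable.
if [ -r /proc/1/comm ] && [ -r /proc/cmdline ]; then
    echo "PID 1: $(pid1_name)"
    cgroup_cmdline_opts
fi
```

On a compute node the same files could be read through `bpsh`, for example `bpsh 1 cat /proc/1/comm`, and saved to a file that is then passed as the argument.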

0 Answers