I have a problem with server slowdowns in a very specific scenario. The facts are:
- 1) I use the computational application WRF (Weather Research and Forecasting)
- 2) I use a dual Xeon E5-2620 v3 with 128GB RAM (NUMA architecture - probably related to the problem!)
- 3) I run WRF with mpirun -n 22 wrf.exe (I have 24 logical cores available)
- 4) I use CentOS 7 with the 3.10.0-514.26.2.el7.x86_64 kernel
- 5) Everything works OK in terms of computational performance until one of these things happens:
- 5a) the Linux file cache gets some data, or
- 5b) I use tmpfs and fill it with some data
In the 5a or 5b scenario, WRF suddenly starts to slow down and sometimes gets even ~5x slower than normal.
- 6) RAM does not get swapped, and it is not even close to happening: I have around 80% of RAM free in the worst-case scenario!
- 7) vm.zone_reclaim_mode = 1 in /etc/sysctl.conf seems to help a bit by delaying the issue in the 5a scenario
- 8) echo 1 > /proc/sys/vm/drop_caches resolves the problem completely in the 5a scenario and restores WRF performance to maximum speed, but only temporarily, until the file cache fills with data again, so I run this command from cron (don't worry, it IS ok, I use the computer only for WRF and it does not need the file cache to work at full performance)
- 9) but the above command still does nothing in the 5b scenario (when I use tmpfs for temporary files)
- 10) performance is restored in the 5b scenario only if I manually empty the tmpfs
- 11) It is not a WRF or MPI problem
- 12) This happens only on this one computer type, and I administer a lot of them for the same/similar purpose (WRF). Only this one has a full NUMA architecture, so I suspect it has something to do with that
- 13) I also suspect that the RHEL kernel has something to do with this, but I am not sure; I haven't tried reinstalling with a different distro yet
- 14) numad, and invoking mpirun through numactl (e.g. "numactl -l"), did not make any difference
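For reference, here is a minimal sketch of how the per-node memory split could be inspected while the slowdown is happening. It assumes the kernel exposes the usual /sys/devices/system/node/node*/meminfo files. The helper name node_mem_summary is made up for this example. The aim is to see whether one NUMA node is filled with file cache (FilePages) or tmpfs data (Shmem) while the total free RAM still looks fine.

```shell
#!/bin/sh
# Summarise free memory, file cache and shmem (tmpfs) per NUMA node.
# Assumption: per-node files exist at /sys/devices/system/node/node*/meminfo.

node_mem_summary() {
    # Input lines look like: "Node 0 MemFree:   1234 kB"
    # Reads the given files, or stdin when no file is given.
    awk '$3 == "MemFree:" || $3 == "FilePages:" || $3 == "Shmem:" {
             printf "node%s %s %d MiB\n", $2, $3, $4 / 1024
         }' "$@"
}

for f in /sys/devices/system/node/node*/meminfo; do
    # Skip silently if the glob did not match (non-NUMA or unusual kernel).
    if [ -r "$f" ]; then
        node_mem_summary "$f"
    fi
done
```

Running this once with an empty cache/tmpfs and once during a slowdown should show whether a single node runs short of free memory.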
Let me know if you have any ideas on how to avoid those slowdowns.
One idea came to me after following some "Related" links on this question. Can Transparent Huge Pages be the source of this problem? Some articles strongly suggest that THP does not play well on NUMA systems.
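To test the THP idea, the current setting can be read from sysfs. The sketch below assumes the standard /sys/kernel/mm/transparent_hugepage directory; some older RHEL kernels use a different directory name. The helper thp_active_mode is a made-up name that just extracts the bracketed (active) value.

```shell
#!/bin/sh
# Show the active Transparent Huge Pages mode.
# Assumption: standard sysfs location; adjust THP_DIR if your kernel differs.
THP_DIR=/sys/kernel/mm/transparent_hugepage

thp_active_mode() {
    # Input looks like "[always] madvise never"; print the bracketed word.
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z+]*\)\].*/\1/p'
}

for knob in enabled defrag; do
    if [ -r "$THP_DIR/$knob" ]; then
        echo "$knob: $(thp_active_mode "$(cat "$THP_DIR/$knob")")"
    fi
done

# To disable THP until the next reboot for a test run (as root):
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled
#   echo never > /sys/kernel/mm/transparent_hugepage/defrag
```

If WRF keeps its speed with THP set to never while the file cache or tmpfs fills up, that would point to THP; if not, it can probably be ruled out.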