We are seeing rather high interrupt and context switch counts on our Nomad clients:

# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
6  0      0 317479392   4168 104093176    0    0     0     0 145232 306836 11  4 84  0  0

The averages of ~150k in/s and ~300k cs/s are not spikes; they are sustained throughout the day. We analysed the processes with pidstat: the top service type alone generates around 40k cs/s. We also looked into networking, and most interrupts seem to originate there. We therefore suspect that Docker networking is responsible for the high counts.
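For anyone who wants to check where the interrupts come from on their own node, here is a minimal sketch of one way to do it: take two snapshots of /proc/interrupts and rank the IRQ lines by rate. It is not necessarily the exact method we used, and the function names (parse_interrupts, top_irq_rates) are made up for illustration. It assumes the usual Linux /proc/interrupts layout, with a header row of CPU columns followed by one row per IRQ.

```python
import time


def parse_interrupts(text):
    """Map each IRQ label to (count summed over all CPUs, description)."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        # Rows such as ERR/MIS carry fewer counters than there are CPUs,
        # so stop at the first non-numeric field.
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        totals[label.strip()] = (sum(counts), description)
    return totals


def top_irq_rates(before, after, seconds, n=10):
    """Return the n busiest IRQ lines as (rate_per_second, label, description)."""
    first, second = parse_interrupts(before), parse_interrupts(after)
    rates = [
        ((second[irq][0] - first[irq][0]) / seconds, irq, second[irq][1])
        for irq in second
        if irq in first
    ]
    rates.sort(reverse=True)
    return rates[:n]


def snapshot(path="/proc/interrupts"):
    """Read the current interrupt counters (Linux only)."""
    with open(path) as handle:
        return handle.read()


def report(interval=5):
    """Print the busiest IRQ lines over `interval` seconds."""
    first = snapshot()
    time.sleep(interval)
    second = snapshot()
    for rate, irq, description in top_irq_rates(first, second, interval):
        print(f"{rate:10.0f}/s  {irq:>5}  {description}")
```

Calling report() on a Linux host prints the ten busiest IRQ lines over a five-second window. If NIC queue lines dominate that list, it would point at networking as the main interrupt source.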

My question: Are these values expected for a medium-to-large node? We are running a 32c/64t CPU (~20% busy) with 512GB RAM (~140GB free). Thanks for sharing your experience.
