I'm writing a script that runs badblocks against a disk shelf full of drives, and I'm trying to understand the server load that develops and at what point that load becomes critical in this use case.
I have generally held to the guideline that a server load <= #of_cores is ideal, while <= 2x #of_cores is generally not going to cause significant performance degradation unless the server is servicing realtime-sensitive workloads, but I don't think that generality applies in this use case.
In the screen capture from top below, I'm running badblocks against 8 devices and the associated load is ~8, which I understand, as there are 8 processes effectively stalled in queue (state D in top) due to the nature of badblocks. But only 2 CPU cores are shown as I/O bound by these processes. So, a couple of questions:
1.> Am I slowing down my badblocks testing by attempting this many simultaneous tests, and if so, why doesn't it use the available cores?
2.> I'm assuming this generally "non-ideal" CPU load would not impact servicing other requests, such as data being shared from other drives on the server (assuming no bottleneck at the SAS card), because 2 cores are free and available. Is that correct?
3.> If 2 cores are able to support 8 badblocks processes without impact to each other (as shown), why is it that 2 badblocks processes use one core while a 3rd causes a 2nd core to be utilized? That scheduling would imply that 8 processes should be consuming 3-4 cores, not 2, if scheduled optimally, no?
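For question 1, one way to check empirically would be to compare per-device throughput with one badblocks run against throughput with all 8 running. Below is a minimal sketch that samples /proc/diskstats twice and reports MiB/s per device. The function names (parse_diskstats, throughput_mib_s, sample) and the 5-second interval are my own choices for illustration, not part of any existing tool.

```python
import time

# /proc/diskstats counts 512-byte sectors regardless of the device's block size.
SECTOR_BYTES = 512


def parse_diskstats(text):
    """Return {device: (sectors_read, sectors_written)} from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or truncated lines
        # Field order: major minor name reads merged sectors_read ms_reading
        #              writes merged sectors_written ...
        stats[fields[2]] = (int(fields[5]), int(fields[9]))
    return stats


def throughput_mib_s(before, after, seconds):
    """Per-device (read, write) MiB/s between two parse_diskstats() samples."""
    scale = SECTOR_BYTES / (1024 * 1024) / seconds
    rates = {}
    for dev, (r1, w1) in after.items():
        if dev not in before:
            continue  # device appeared between samples
        r0, w0 = before[dev]
        rates[dev] = ((r1 - r0) * scale, (w1 - w0) * scale)
    return rates


def sample(path="/proc/diskstats"):
    """Read and parse the current disk counters."""
    with open(path) as f:
        return parse_diskstats(f.read())


if __name__ == "__main__":
    interval = 5.0  # seconds between samples (arbitrary)
    first = sample()
    time.sleep(interval)
    second = sample()
    rates = throughput_mib_s(first, second, interval)
    for dev, (rd, wr) in sorted(rates.items()):
        if rd or wr:  # only show devices with activity
            print(f"{dev}: read {rd:.1f} MiB/s, write {wr:.1f} MiB/s")
```

If the per-device numbers stay roughly the same with 1 and with 8 concurrent runs, the drives themselves would appear to be the limit rather than the CPU or the HBA. If they drop as more runs are added, something shared is likely saturating.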
The platform is CentOS 7 |--| Processor is Intel E3-1220 v2 (quad-core, no hyper-threading) |--| Disk shelf is connected to the server by way of an external SAS HBA (non-RAID)
top - 16:03:12 up 6 days, 15:21, 13 users, load average: 7.84, 7.52, 6.67
Tasks: 171 total, 2 running, 169 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 0.0 sy, 0.0 ni, 99.7 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.3 us, 6.0 sy, 0.0 ni, 0.0 id, 93.6 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 0.0 us, 0.0 sy, 0.0 ni, 95.7 id, 4.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 2.3 us, 3.0 sy, 0.0 ni, 0.0 id, 94.7 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 7978820 total, 7516404 free, 252320 used, 210096 buff/cache
KiB Swap: 4194300 total, 4194300 free, 0 used. 7459724 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
22322 root 20 0 122700 9164 832 D 3.3 0.1 18:36.77 badblocks
22394 root 20 0 122700 9164 832 D 1.3 0.1 15:52.98 badblocks
23165 root 20 0 122700 9152 820 D 1.3 0.1 0:36.94 badblocks
23186 root 20 0 122700 5792 808 D 1.3 0.1 0:02.54 badblocks
23193 root 20 0 122700 5004 768 D 1.3 0.1 0:02.17 badblocks
23166 root 20 0 122700 9152 820 D 1.0 0.1 0:36.11 badblocks
23167 root 20 0 122700 9148 820 D 1.0 0.1 0:39.74 badblocks
23194 root 20 0 122700 6584 808 D 1.0 0.1 0:01.47 badblocks
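To make the reading of the load figure above concrete: on Linux the load average counts tasks that are runnable (state R) as well as tasks in uninterruptible sleep (state D), so 8 badblocks processes sitting in D would account for a load of ~8 while using almost no CPU. The sketch below tallies process states from /proc/<pid>/stat and prints them next to the 1-minute load average. The helper names (state_from_stat, count_states, read_all_stats) are hypothetical, and it counts processes rather than individual threads, so it is only an approximation of what the kernel counts.

```python
import glob
from collections import Counter


def state_from_stat(stat_text):
    """Extract the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split after the LAST ')' rather than on whitespace.
    return stat_text[stat_text.rindex(")") + 1:].split()[0]


def count_states(stat_texts):
    """Tally states (R, S, D, ...) across a list of stat file contents."""
    return Counter(state_from_stat(t) for t in stat_texts)


def read_all_stats():
    """Read /proc/<pid>/stat for every process that is still alive."""
    texts = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                text = f.read()
        except OSError:
            continue  # process exited between listing and reading
        if text:
            texts.append(text)
    return texts


if __name__ == "__main__":
    counts = count_states(read_all_stats())
    with open("/proc/loadavg") as f:
        one_minute = f.read().split()[0]
    print(f"running (R): {counts['R']}")
    print(f"uninterruptible (D): {counts['D']}")
    print(f"R + D: {counts['R'] + counts['D']}  |  1-min load average: {one_minute}")
```

With the snapshot above I would expect D to be around 8 and R to be small, which would mean the load figure is measuring waiting on disks rather than contention for cores.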