
I'm writing a script that runs badblocks against a disk shelf full of drives, and I'm trying to understand the server load that develops and at what point that load becomes critical in this use case.

I have generally held to the guideline that a server load <= the number of cores is ideal, while <= 2x the number of cores usually won't cause significant performance degradation unless the server is handling realtime-sensitive workloads, but I don't think that rule of thumb applies to this use case.

In the screen capture from top below, you can see I'm running badblocks against 8 devices and the associated load is ~8, which I understand: there are 8 processes effectively stalled in queue due to the nature of badblocks. But only 2 CPU cores are I/O-bound by these processes. So a couple of questions:

1.> Am I slowing down my badblocks testing by attempting this many simultaneous tests, and if so, why doesn't it use the available cores?

2.> I'm assuming this generally "non-ideal" CPU load would not impact servicing other requests, say for data being shared from other drives on the server (assuming no bottleneck at the SAS card), because 2 cores are free and available, correct?

3.> If 2 cores are able to support 8 badblocks processes without impacting each other (as shown), why is it that 2 badblocks processes use one core while a 3rd causes a 2nd core to be utilized? That scheduling behavior would lead one to assume that 8 processes scheduled optimally should be consuming 3-4 cores, not 2, no?

The platform is CentOS 7. The processor is an Intel E3-1220 v2 (quad-core, no Hyper-Threading). The disk shelf is connected to the server by an external SAS HBA (non-RAID).

top - 16:03:12 up 6 days, 15:21, 13 users,  load average: 7.84, 7.52, 6.67
Tasks: 171 total,   2 running, 169 sleeping,   0 stopped,   0 zombie
%Cpu0  :  0.0 us,  0.0 sy,  0.0 ni, 99.7 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.3 us,  6.0 sy,  0.0 ni,  0.0 id, 93.6 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  0.0 us,  0.0 sy,  0.0 ni, 95.7 id, 4.3 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  2.3 us,  3.0 sy,  0.0 ni,  0.0 id, 94.7 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  7978820 total,  7516404 free,   252320 used,   210096 buff/cache
KiB Swap:  4194300 total,  4194300 free,        0 used.  7459724 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
22322 root      20   0  122700   9164    832 D   3.3  0.1   18:36.77 badblocks
22394 root      20   0  122700   9164    832 D   1.3  0.1  15:52.98 badblocks
23165 root      20   0  122700   9152    820 D   1.3  0.1   0:36.94 badblocks
23186 root      20   0  122700   5792    808 D   1.3  0.1   0:02.54 badblocks
23193 root      20   0  122700   5004    768 D   1.3  0.1   0:02.17 badblocks
23166 root      20   0  122700   9152    820 D   1.0  0.1   0:36.11 badblocks
23167 root      20   0  122700   9148    820 D   1.0  0.1   0:39.74 badblocks
23194 root      20   0  122700   6584    808 D   1.0  0.1   0:01.47 badblocks
Zenonk

1 Answer


Load average and CPU utilization

Load average is a slow-moving metric of approximately the number of runnable tasks. Early on, Linux also decided to count tasks in uninterruptible sleep in the hope of capturing I/O load. A load lower than the number of CPUs definitely leaves room to run more tasks, but the recommended maximum isn't as obvious.
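
As a quick check of what is being counted, you can list the tasks that feed into the load average; a minimal sketch (exact output will vary by system):

    # Tasks in runnable (R) or uninterruptible sleep (D) state both count
    # toward the load average; the badblocks processes show up in state D.
    ps -eo state=,comm= | awk '$1 ~ /^[RD]/' | sort | uniq -c | sort -rn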

Disk I/O in modern systems requires minimal CPU involvement, so iowait time is effectively idle time. User + system being so low indicates the CPU has little to do while waiting for very slow spindles.
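
If you want to verify that the drives rather than the CPUs are the limiting factor, per-device statistics make it obvious; a hedged sketch using iostat from the sysstat package (install it if it isn't present):

    # Extended per-device stats every 5 seconds: high %util on the disks
    # under test combined with low %user/%system confirms the CPUs are
    # mostly waiting on the spindles.
    iostat -x 5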

Parallel jobs for this workload

Limit to one badblocks per physical spindle. Multiple instances on the same disk would seek the head back and forth, resulting in terrible performance.

There may also be a bottleneck in the SAS card or another component of the storage system. When adding processes no longer increases total I/O bandwidth (watch it with iotop, for example), use fewer. Or simply pick an arbitrary batch of 8 or so at a time to run in parallel (with GNU parallel, for example), as sketched below.
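
A minimal sketch of that batching approach, assuming GNU parallel is installed and the drives under test are /dev/sdb through /dev/sdi (hypothetical names; substitute your own, and note that adding -w makes badblocks destructive):

    # Run at most 4 read-only badblocks scans at a time, one per drive.
    # Tune -j to the point where total throughput (per iotop) stops scaling.
    parallel -j4 badblocks -sv {} ::: /dev/sd{b..i}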

Task scheduling

The task scheduler is optimizing for several things. Even in a multi-CPU system, concentrating work on a few cores can keep data hot in caches, let idle cores throttle down for power savings, and still leave capacity to handle interrupts. There are also NUMA and SMT scheduling considerations, although this CPU has neither feature.

In this case, you have two almost-idle cores, so I would expect the host to remain reasonably snappy. That said, don't add much more work while this is running: limited I/O bandwidth and IOPS can leave CPUs waiting without any increase in work completed.

John Mahowald