Currently I'm running Proxmox 5.3-7 on ZFS with few idling debian virtual machines. I'm using two SSDPE2MX450G7 NVME drives in RAID 1. After 245 days of running this setup the S.M.A.R.T values are terrible.
SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning: 0x00
Temperature: 27 Celsius
Available Spare: 98%
Available Spare Threshold: 10%
Percentage Used: 21%
Data Units Read: 29,834,793 [15.2 TB]
Data Units Written: 765,829,644 [392 TB]
Host Read Commands: 341,748,298
Host Write Commands: 8,048,478,631
Controller Busy Time: 1
Power Cycles: 27
Power On Hours: 5,890
Unsafe Shutdowns: 0
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
I was trying to debug what's consuming so much write commands, but I'm failing. iotop
shows 400kB/s average writes with 4MB/s spikes.
I've tried to run zpool iostat and it doesn't look bad too.
zpool iostat rpool 60
capacity operations bandwidth
pool alloc free read write read write
rpool 342G 74.3G 0 91 10.0K 1.95M
rpool 342G 74.3G 0 90 7.80K 1.95M
rpool 342G 74.3G 0 107 7.60K 2.91M
rpool 342G 74.3G 0 85 22.1K 2.15M
rpool 342G 74.3G 0 92 8.47K 2.16M
rpool 342G 74.3G 0 90 6.67K 1.71M
I've decided to take a look into writes by echoing 1
into /proc/sys/vm/block_dump
and looking into /var/log/syslog
. Here's the result:
Jan 25 16:56:19 proxmox kernel: [505463.283056] z_wr_int_2(438): WRITE block 310505368 on nvme0n1p2 (16 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.283058] z_wr_int_0(436): WRITE block 575539312 on nvme1n1p2 (16 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.283075] z_wr_int_1(437): WRITE block 315902632 on nvme0n1p2 (32 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.283096] z_wr_int_4(562): WRITE block 460141312 on nvme0n1p2 (8 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.283108] z_wr_int_4(562): WRITE block 460141328 on nvme0n1p2 (16 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.283271] z_null_iss(418): WRITE block 440 on nvme1n1p2 (8 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.283315] z_null_iss(418): WRITE block 952 on nvme1n1p2 (8 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.283348] z_null_iss(418): WRITE block 878030264 on nvme1n1p2 (8 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.283378] z_null_iss(418): WRITE block 878030776 on nvme1n1p2 (8 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.283409] z_null_iss(418): WRITE block 440 on nvme0n1p2 (8 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.283442] z_null_iss(418): WRITE block 952 on nvme0n1p2 (8 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.283472] z_null_iss(418): WRITE block 878030264 on nvme0n1p2 (8 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.283502] z_null_iss(418): WRITE block 878030776 on nvme0n1p2 (8 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.289562] z_wr_iss(434): WRITE block 460808488 on nvme1n1p2 (24 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.289572] z_wr_iss(434): WRITE block 460808488 on nvme0n1p2 (24 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.457366] z_wr_iss(430): WRITE block 460808744 on nvme1n1p2 (24 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.457382] z_wr_iss(430): WRITE block 460808744 on nvme0n1p2 (24 sectors)
Jan 25 16:56:19 proxmox kernel: [505463.459003] z_wr_iss(431): WRITE block 460809000 on nvme1n1p2 (24 sectors)
and so on. Is there any way to limit number of writes? As you can see the data units written are outrageous and I'm stuck, because I'm out of ideas how to limit it.