
I am running a small home server. The specs are:

  • CPU: AMD Ryzen 5 2600
  • RAM: 32 GB ECC
  • System drive: 128GB NVMe SSD
  • Data drives: 3x 4 TB Seagate Barracuda HDD

The server runs applications such as Nextcloud and Gitea, and I want to run 1-2 VMs on it. So the workload is a mix of web applications, databases, and VMs.

The applications and qcow2 images are stored on a raidz1 pool:

$ sudo zpool status
  pool: tank
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0

errors: No known data errors

When I used the applications in the first weeks, I experienced no problems. But for a few weeks now I have been seeing extremely low write speeds. The Nextcloud instance is not very fast, and when I try to start a fresh Windows 10 VM it takes about 5 minutes to reach the login screen.

I did some performance testing using fio and got the following results:

Test              IOPS     Bandwidth (KiB/s)
Random read       37,800   148,000
Random write      31       127
Sequential read   72,100   282,000
Sequential write  33       134
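
For context, numbers like the random-write row above can be produced with an fio invocation along these lines (all parameters here are illustrative, not necessarily the exact ones I used):

```shell
# Hypothetical 4K random-write test against the pool.
# --fsync=1 forces a sync after every write, which exposes the
# worst-case sync-write path on a pool without a SLOG.
fio --name=randwrite --directory=/tank/fio-test \
    --rw=randwrite --bs=4k --size=1G --runtime=60 --time_based \
    --ioengine=psync --fsync=1 --numjobs=1 --group_reporting
```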

I did some research before posting here and read that I should add a SLOG to the ZFS pool for better performance with databases and VMs. But that's not an option at the moment. I need to buy Christmas gifts first :D

But even without a SLOG I don't think these figures are correct :(

Does anyone have an idea? :)

  • I already disabled `atime` on all datasets and set `recordsize` to 64K on the dataset storing the qcow2 images – Philip Szalla Nov 21 '21 at 10:41
  • You should not use RAIDZ1 under any circumstances. Whenever one device fails, there is a very high risk that a second drive fails during resilvering and you'll lose all your data. RAIDZ1 might also manifest in slowness like this. Use mirrors instead. – Tero Kilkanen Nov 21 '21 at 13:11

2 Answers


As a first-order approximation, raidz provides the random-I/O performance of a single disk, which for a 7.2K RPM HDD is about 70 IOPS. Your test shows about 50% fewer IOPS (i.e., ~30 vs ~70), which can be explained by the relatively large recordsize you selected.

Especially for random writes, any recordsize larger than 4KB is going to face a considerable read-modify-write penalty. Please note that I am not advocating using such a small recordsize on mechanical disks, as it commands very high metadata overhead, high fragmentation, and (almost) no compression. As a reference, when using HDDs I leave the default recordsize (128K) even on virtualization hosts.
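
If you want to go back to the default, something along these lines would work (the dataset name `tank/vms` is a placeholder for wherever your qcow2 images live):

```shell
# Inspect the current recordsize on the VM dataset
zfs get recordsize tank/vms

# Reset it to the default; note this only affects newly written
# blocks, so existing image data keeps its old block size until
# rewritten (e.g. by copying the files)
zfs set recordsize=128K tank/vms
```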

You can do the following to improve performance:

  • use mirrors instead of raidz where applicable (but you only have 3 disks, which prevents striped mirrors)
  • use RAW image files rather than QCOW2 (if QCOW2 files really are required, be sure to preallocate their metadata)
  • try setting sync=disabled (but be sure to understand that in case of sudden power loss your system will lose up to 5s of written data)
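
A sketch of what those last two points could look like in practice (all file names, sizes, and dataset names below are placeholders):

```shell
# Convert an existing qcow2 image to raw
qemu-img convert -p -f qcow2 -O raw win10.qcow2 win10.raw

# Or, if qcow2 really is required, preallocate its metadata
# at creation time to avoid allocation overhead during writes
qemu-img create -f qcow2 -o preallocation=metadata win10.qcow2 64G

# Trade safety for speed: all writes become async.
# On sudden power loss you can lose up to ~5s of written data.
zfs set sync=disabled tank/vms
```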
shodanshok
  • Thank you for your answer, and excuse my late reply! I removed one drive from the system and created a mirror pool. That improved the read speeds by about 30%, as expected. But the write speeds didn't change :( – Philip Szalla Dec 04 '21 at 20:43

I found the problem myself.

I saw an article mentioning CMR and SMR. I checked my drives and realized that I had accidentally bought hard drives with SMR :(
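
For anyone else who wants to check their drives: one way is to look up the exact model number and compare it against the manufacturer's published CMR/SMR list (device path and model below are just examples):

```shell
# Print the drive's identity information, including the model number
sudo smartctl -i /dev/sdb | grep -i 'device model'

# Several 4 TB Barracuda models (e.g. ST4000DM004) are
# drive-managed SMR, which collapses sustained random-write
# performance once the drive's CMR cache zone fills up.
```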

I will keep the mirror pool until I have replaced the drives with new CMR ones. Once I have the new drives, I will use a mirror pool as well.

Thank you all!