
I am running a small home server. The specs are:

  • CPU: AMD Ryzen 5 2600
  • RAM: 32 GB ECC
  • System drive: 128GB NVMe SSD
  • Data drives: 3x 4 TB Seagate Barracuda HDD

The server runs applications such as Nextcloud and Gitea, and I want to run 1-2 VMs on it. So the workload is a mix of web applications, databases, and VMs.

The applications and qcow2 images are stored on a raidz1 pool:

$ sudo zpool status
  pool: tank
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0

errors: No known data errors

When I used the applications in the first weeks, I experienced no problems. But for a few weeks now I have been seeing extremely low write speeds. The Nextcloud instance is not very fast, and when I try to start a fresh Windows 10 VM it takes about 5 minutes to reach the login screen.

I did some performance testing using fio and got the following results:

Test              IOPS     Bandwidth (KiB/s)
Random read       37,800   148,000
Random write      31       127
Sequential read   72,100   282,000
Sequential write  33       134
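
For context, numbers like the random-write row above can be produced with an fio invocation along these lines (all parameters here are illustrative, not necessarily the exact ones I used):

```shell
# Hypothetical 4K random-write test against the pool.
# --fsync=1 forces a sync after every write, which exposes the
# worst-case sync-write path on a pool without a SLOG.
fio --name=randwrite --directory=/tank/fio-test \
    --rw=randwrite --bs=4k --size=1G --runtime=60 --time_based \
    --ioengine=psync --fsync=1 --numjobs=1 --group_reporting
```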

I did some research before posting here and read that I should add a SLOG to the ZFS pool for better performance with databases and VMs. But that's not an option at the moment. I need to buy Christmas gifts first :D

But even without a SLOG I don't think these figures are correct :(

Does anyone have an idea? :)

  • I already disabled `atime` on all datasets and set `recordsize` to 64K on the dataset storing the qcow2 images – Philip Szalla Nov 21 '21 at 10:41
  • You should not use RAIDZ1 under any circumstances. Whenever one device fails, there is a very high risk that a second drive fails during resilvering and you'll lose all your data. RAIDZ1 might also manifest in slowness like this. Use mirrors instead. – Tero Kilkanen Nov 21 '21 at 13:11

2 Answers


As a first-order approximation, raidz provides the random-I/O performance of a single disk, which for a 7.2K RPM HDD is about 70 IOPS. Your test shows about 50% fewer IOPS (i.e., ~30 vs ~70), which can be explained by the relatively large recordsize you selected.

Especially for random writes, any recordsize larger than 4KB is going to face a considerable read-modify-write penalty. Please note that I am not advocating using such a small recordsize on mechanical disks, as it commands very high metadata overhead, high fragmentation, and (almost) no compression. As a reference, when using HDDs I leave the default recordsize (128K) even on virtualization hosts.
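
If you want to go back to the default, something along these lines would work (the dataset name `tank/vms` is a placeholder for wherever your qcow2 images live):

```shell
# Inspect the current recordsize on the VM dataset
zfs get recordsize tank/vms

# Reset it to the default; note this only affects newly written
# blocks, so existing image data keeps its old block size until
# rewritten (e.g. by copying the files)
zfs set recordsize=128K tank/vms
```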

You can do the following to improve performance:

  • use mirrors instead of raidz where applicable (but you only have 3 disks, which prevents striped mirrors)
  • use RAW image files rather than QCOW2 (if QCOW2 files really are required, be sure to preallocate their metadata)
  • try setting sync=disabled (but be sure to understand that in case of sudden power loss your system will lose up to 5s of written data)
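
A sketch of what those last two points could look like in practice (all file names, sizes, and dataset names below are placeholders):

```shell
# Convert an existing qcow2 image to raw
qemu-img convert -p -f qcow2 -O raw win10.qcow2 win10.raw

# Or, if qcow2 really is required, preallocate its metadata
# at creation time to avoid allocation overhead during writes
qemu-img create -f qcow2 -o preallocation=metadata win10.qcow2 64G

# Trade safety for speed: all writes become async.
# On sudden power loss you can lose up to ~5s of written data.
zfs set sync=disabled tank/vms
```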
shodanshok
  • Thank you for your answer, and excuse my late reply! I removed one drive from the system and created a mirror pool. That improved the read speeds by about 30%, as expected. But the write speeds didn't change :( – Philip Szalla Dec 04 '21 at 20:43

I found the problem myself.

I saw an article mentioning CMR and SMR. I checked my drives and realized that I had accidentally bought hard drives with SMR :(
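
For anyone else who wants to check their drives: one way is to look up the exact model number and compare it against the manufacturer's published CMR/SMR list (device path and model below are just examples):

```shell
# Print the drive's identity information, including the model number
sudo smartctl -i /dev/sdb | grep -i 'device model'

# Several 4 TB Barracuda models (e.g. ST4000DM004) are
# drive-managed SMR, which collapses sustained random-write
# performance once the drive's CMR cache zone fills up.
```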

I will keep the mirror pool until I have replaced the drives with new CMR ones. Once I have the new drives, I will use a mirror pool as well.

Thank you all!