
I've inherited a setup running Ubuntu 14, a MegaRAID SAS controller (megaraid_sas driver), and ZFS.

We're seeing some performance problems (we never manage to fully utilize the 6 Gb/s bandwidth from the RAID controller) and I'm curious whether it's related to the ZFS setup, which seems a little unusual.

The questions really are:

  1. Is this setup (see below) problematic? At best, it seems unnecessarily complex (why use ZFS at all instead of just sizing the data volumes on the RAID controller and mounting them directly?)

  2. We do not seem to be utilizing the maximum read rates from the RAID. Could this setup be why?

SETUP

The RAID controller exposes a number of RAID 5 and RAID 6 virtual disks, which present to the Linux server as local block devices, e.g.:

# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0  14,6T  0 disk
├─sda1   8:1    0  14,6T  0 part
└─sda9   8:9    0     8M  0 part
sdb      8:16   02,6G  0 disk
├─sdb1   8:17   0  94,6G  0 part /
├─sdb2   8:18   0     1K  0 part
└─sdb5   8:21   0   128G  0 part [SWAP]
sdc      8:32   0  14,6T  0 disk
├─sdc1   8:33   0  14,6T  0 part
└─sdc9   8:41   0     8M  0 part
sdd      8:48   0  10,9T  0 disk
├─sdd1   8:49   0  10,9T  0 part
└─sdd9   8:57   0     8M  0 part
sd<N>      8:64   0   7,3T  0 disk
├─sd<N>1   8:65   0   7,3T  0 part
└─sd<N>9   8:73   0     8M  0 part
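
(In case it's useful: that these are controller-backed virtual drives rather than raw disks can be confirmed from the SCSI identity strings in sysfs; sda here is just one example device from the listing above.)

# cat /sys/block/sda/device/vendor /sys/block/sda/device/model

On a MegaRAID-backed virtual drive this typically reports the controller vendor (e.g. LSI/AVAGO) and a virtual-drive model string, rather than the model of any underlying physical disk.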

These are then all striped together again into a single zpool, e.g.:

# zpool status
  pool: zpool
 state: ONLINE
  scan: scrub repaired 0 in 84h36m with 0 errors on Tue Aug 29 00:48:43 2017
config:

        NAME        STATE     READ WRITE CKSUM
        zpool       ONLINE       0     0     0
          sd<N>     ONLINE       0     0     0
          sda       ONLINE       0     0     0
          sdc       ONLINE       0     0     0
          sdd       ONLINE       0     0     0

errors: No known data errors
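
Note that zpool status lists the four devices as top-level vdevs with no mirror or raidz keyword, i.e. ZFS is simply striping across the hardware arrays. Whether reads actually spread across all four can be watched live with zpool iostat (the 5-second interval below is arbitrary):

# zpool iostat -v zpool 5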

...which is then divided up into datasets, e.g.:

# df -h
Filesystem               Size  Used Avail Use% Mounted on
udev                      63G   12K   63G   1% /dev
tmpfs                     13G   23M   13G   1% /run
/dev/sdb1                 94G   13G   76G  15% /
zpool/dataset1            13T   11T  2,3T  82% /common/share
zpool/dataset1/archive   3,9T  1,6T  2,3T  41% /common/share/archive
zpool/dataset2           6,6T  4,3T  2,3T  66% /h2t
zpool                    5,2T  2,9T  2,3T  56% /zpool
zpool/dataset3           12T  8,8T  2,3T  80% /zpool/backup
zpool/dataset4           2,4T   28G  2,3T   2% /zpool/homes
zpool/dataset4/extern    2,7T  317G  2,3T  12% /zpool/homes/externstaff
zpool/homes/students      14T   12T  2,3T  84% /zpool/homes/students
zpool/temp               2,4T   50G  2,3T   3% /common/temp
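
The dataset properties are whatever the previous admin left them at; the ones that most commonly affect streaming throughput can be listed in one go (the property selection here is just a sample):

# zfs get -r recordsize,compression,atime,primarycache zpool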
BurningKrome
    Although this setup is a catastrophe, I don't see why it should prevent you from getting 6 Gbps from the RAID array. At the same time, I don't understand why you expect only 6 Gbps - that's the bandwidth of a single SAS/SATA lane, and one cable gives you 4 lanes, i.e. 24 Gbps. – drookie Feb 06 '19 at 14:11

1 Answer


What speeds are you actually getting? From what workloads? What kind of disks?

I wouldn't generally recommend using hardware RAID beneath ZFS, but it can be done. One of the things people tend to get badly wrong about striped arrays is that their performance for most workloads trends toward that of a single disk, not toward that of the number of disks in the array. In perfectly ideal conditions, you can get the throughput of 4 disks out of a 6-disk raid6 array, but in most conditions, you'll bind on IOPS, not throughput - and the IOPS of any striped array is roughly that of a single disk, no matter how wide the array (and it gets worse the wider the array is, not better).
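
If you want to see which of the two you're actually binding on, a quick sketch with fio makes the distinction obvious: one large-block sequential read job and one small-block random read job against the same files. The directory, sizes and runtimes below are only examples; point it at a path on the pool with enough free space, and make the file size comfortably larger than RAM so the ARC doesn't hide the disks.

# fio --name=seqread --directory=/zpool/fiotest --rw=read --bs=1M --size=200G --ioengine=psync --runtime=60 --time_based
# fio --name=randread --directory=/zpool/fiotest --rw=randread --bs=4k --size=200G --ioengine=psync --runtime=60 --time_based

If the sequential number looks respectable but the random job collapses into the low hundreds of IOPS, you're binding on IOPS exactly as described above.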

6 Gbps ≈ 768 MB/sec; I would not expect to get anything vaguely like that amount of throughput out of a bunch of rust disks in striped arrays outside very specialized and very carefully controlled workloads. If you've got multiple users accessing bunches of files - let alone any kind of database or VM access patterns - you're not going to see anything like that level of performance.

Jim Salter