How to obtain read speeds of two disks using mdadm/btrfs raid1 or zfs mirror?

Question

Given that RAID1 writes two copies of the data, my understanding is that read throughput should be close to twice that of a single disk.

I have tested read performance of different technologies (mdadm, zfs, btrfs) with little success.

From my experience:

btrfs only used one disk during reads
zfs/mdadm used both disks; however, the read speed achieved was that of a single disk

I've verified the results using dd, iotop and iostat.

How can data on both disks be read at the same time in order to achieve twice a single disk's read performance?

You have provided little detail about how the devices were configured and nothing about how you measured performance. You might want to take a look at fio for benchmarking. — symcbean, Nov 12 '16 at 19:36
Yet to try bonnie++, having a go at fio now though and whilst it is complex, it's definitely a good way to test random writes which is what I am after. — Greg, Nov 13 '16 at 18:32

Waxhead · Accepted Answer · 2016-11-13T22:00:36.497

The question gives little information about how the tests were done, so I am assuming that you were just using dd to write directly to a file.

BTRFS:

In BTRFS terms RAID1 means TWO COPIES regardless of how many disks you have in the pool. So for simplicity let's assume that both data and metadata are stored in RAID1 on two disks. Each disk therefore holds a copy of the content of the other disk (in BTRFS the on-disk layout of the two may not be identical).

When BTRFS executes reads it (the last time I checked) relied on the PID of the process to determine which disk to read from first. That means that if you run a single process it will only read from one of the disks, unless there is an error and a good copy needs to be retrieved from the other disk.

The next time you run that process it may have a different PID and BTRFS will read the data from another disk first.
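As a rough illustration of the PID-based choice described above, here is a minimal sketch. It assumes the selection is simply the PID modulo the number of copies, which is my understanding of the behaviour and may not match the current kernel policy. The function name `pick_mirror` is made up for this example.

```python
import os


def pick_mirror(pid: int, num_copies: int = 2) -> int:
    """Return the index of the copy a reader with this PID would try first.

    Assumption: the choice is pid % num_copies. This is an illustrative
    model, not the actual BTRFS code.
    """
    return pid % num_copies


if __name__ == "__main__":
    pid = os.getpid()
    print(f"PID {pid} would read from copy {pick_mirror(pid)} first")
```

Under this model a single process always lands on the same copy, while two processes with consecutive PIDs land on different copies.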

MD: (MDADM)

For sequential reads you would gain little, if anything, because the same data is on both devices and both disks would have to skip (seek over) the same amount of data before starting a read. For example, disk A would have to skip over the first 10 bytes before reading the next 10, and even if disk B read the first 10 bytes while disk A was skipping, the 20 bytes could not be assembled in memory until disk A had finished.
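The argument can be put into numbers with a deliberately simple model. It assumes that skipping over data costs a rotating disk about as much time as reading it, and it ignores head seeks, caching and read-ahead. The function names and the figures are made up for illustration.

```python
def single_disk_time(size_bytes: float, rate: float) -> float:
    """Seconds to read size_bytes sequentially from one disk at rate bytes/s."""
    return size_bytes / rate


def alternating_mirror_time(size_bytes: float, rate: float, disks: int = 2) -> float:
    """Seconds for one sequential stream split evenly across mirrored disks.

    Assumption: each disk reads its share and must pass over the rest at
    the same rate. The disks work in parallel, so the elapsed time is
    that of a single disk.
    """
    read_part = size_bytes / disks
    skipped_part = size_bytes - read_part
    return read_part / rate + skipped_part / rate


if __name__ == "__main__":
    size, rate = 1000.0, 100.0  # hypothetical: 1000 bytes at 100 bytes/s
    print(single_disk_time(size, rate))         # 10.0
    print(alternating_mirror_time(size, rate))  # 10.0
```

In this model the split read finishes no sooner than the single-disk read, which is the point made above.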

From the MD manual page (man md):

"Note that the read balancing done by the driver does not make the RAID1 performance profile be the same as for RAID0; a single stream of sequential input will not be accelerated (e.g. a single dd), but multiple sequential streams or a random workload will use more than one spindle. In theory, having an N-disk RAID1 will allow N sequential threads to read from all disks."
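To try the "multiple sequential streams" case from the manual, something like the following sketch could be used. It splits a file (or device) into contiguous regions and reads each region in its own thread. The names `read_range` and `parallel_read` are my own, and the result is only meaningful if the data is not already in the page cache (use a file larger than RAM or drop the caches first).

```python
import os
import sys
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


def read_range(path: str, start: int, length: int, block_size: int = 1 << 20) -> int:
    """Sequentially read `length` bytes starting at `start`; return bytes read."""
    done = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while done < length:
            chunk = os.pread(fd, min(block_size, length - done), start + done)
            if not chunk:  # reached end of file early
                break
            done += len(chunk)
    finally:
        os.close(fd)
    return done


def parallel_read(path: str, streams: int = 2) -> Tuple[int, float]:
    """Read the whole file with `streams` concurrent sequential readers.

    Returns (total bytes read, elapsed seconds).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)  # also works for block devices
    finally:
        os.close(fd)
    region = -(-size // streams)  # ceiling division
    starts = [i * region for i in range(streams)]
    lengths = [min(region, size - s) for s in starts]
    begin = time.perf_counter()
    with ThreadPoolExecutor(max_workers=streams) as pool:
        total = sum(pool.map(read_range, [path] * streams, starts, lengths))
    return total, time.perf_counter() - begin


if __name__ == "__main__" and len(sys.argv) > 1:
    for n in (1, 2):
        total, secs = parallel_read(sys.argv[1], n)
        print(f"{n} stream(s): {total / max(secs, 1e-9) / 1e6:.1f} MB/s")
```

Run it with the path of a large file on the array as the only argument and compare the one-stream and two-stream figures.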

ZFS:

I have no knowledge of ZFS, but I expect it to work roughly the same as BTRFS/MDADM.

CONCLUSION:

For a single sequential read, like the one you are probably doing with dd, there is not much to be gained performance-wise from a RAID1 setup on either BTRFS or MDADM.

If you would like to see the improved read speed (that does exist) on both BTRFS and MDADM you would need to do multiple different reads in parallel on the array. BTRFS would likely distribute reads across the disks based on PID, and MDADM should reduce the number of seeks significantly. Remember that RAID1 is not the same as RAID0, especially not on BTRFS arrays.

The tests I've done were all with dd; however, as per @symcbean's suggestion, I have done some tests with fio (albeit not finished them yet). What I find difficult to understand is why RAID1 reads are not done in the same way as RAID0 reads. That is, why isn't half the file ignored on one drive and half the file ignored on the other, and the rest treated exactly like a RAID0 scenario? I want to use RAID1 to add more performance and resiliency for a database such as MySQL or PostgreSQL. I assume that each database request is served by a process with a different PID? What do you think about using RAID1 this way? — Greg, Nov 13 '16 at 18:28
I updated my answer with details from the MD manual. RAID0 interleaves chunks of data, so reading from 1 to 100 will essentially read odd values from disk 1 and even values from disk 2. RAID1, on the other hand, is a mirror, i.e. a duplicate. While I don't know the exact technical reason why RAID1 on MD will not perform as fast as RAID0, I can assume that it's better for one thread to get "full disk speed" for what it wants to read instead of having to share the bandwidth (and seeks) with all other threads accessing the disk. — Waxhead, Nov 13 '16 at 22:12
