15

I have a workstation system that will have two 64GB industrial SSDs, and the plan is to have both disks in a RAID1 configuration for redundancy which is set up in the kickstart. The system will be running CentOS 7. In looking into this, I discovered that the RHEL Storage Administration Guide doesn't recommend RAID1 for SSDs.

Red Hat also warns that software RAID levels 1, 4, 5, and 6 are not recommended for use on SSDs. During the initialization stage of these RAID levels, some RAID management utilities (such as mdadm) write to all of the blocks on the storage device to ensure that checksums operate properly. This will cause the performance of the SSD to degrade quickly.

Is this something I should be seriously concerned with? Are there alternatives for redundancy that I can use?

According to RHEL documentation again, LVM mirroring now leverages MD software RAID, so the RAID warning also applies to that.

More info: The SSDs are Swissbit X-200 series (SATA), and it looks like overprovisioning is at 40%.

Hardware RAID won't be an option, according to the hardware team.

mochatiger
  • 153
  • 1
  • 1
  • 7
  • Can you elaborate on what the application is? Are you using industrial SSDs because this is a harsh environment or controller system of some sort? – ewwhite Jul 15 '14 at 15:50
  • Yes, the machines will be outdoors and have to withstand rugged temperature/environmental conditions. – mochatiger Jul 15 '14 at 17:27
  • 2
    If you're really worried about it, you could use `mdadm -C --assume-clean...` to avoid the initial sync. At least with RAID-1. – derobert Jul 21 '14 at 20:27
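To make derobert's suggestion concrete, here is a minimal sketch. The device names (`/dev/sda1`, `/dev/sdb1`) are placeholders, and as far as I know the kickstart `raid` command has no equivalent flag, so this would have to go in a `%pre` script:

```shell
# Create the RAID-1 array without the initial resync, so mdadm never has to
# write every block of the SSDs. Only safe when both members are brand new
# (or already identical) -- otherwise the mirror halves will silently differ.
mdadm --create /dev/md0 --level=1 --raid-devices=2 --assume-clean \
      /dev/sda1 /dev/sdb1

# Confirm the array is active with no resync in progress
cat /proc/mdstat
```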

4 Answers

10

I wouldn't necessarily recommend Linux software RAID with SSDs, especially for boot. I'd base the decision on the potential failure scenario(s) and the impact of downtime. With industrial SSDs, I've typically used them standalone, without RAID.

If this workstation were to fail, how quickly could you 1) recover from backups, or 2) rebuild/reimage it?

What type of SSDs are these (make/model)? If they're overprovisioned, this may not be too much of an issue. If they're SATA and connected to the motherboard, you'll have some TRIM options.
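As a quick sanity check on the TRIM question (a sketch; `/dev/sda` is a placeholder for one of the SSDs):

```shell
# Does the drive advertise TRIM? (output wording varies by model)
hdparm -I /dev/sda | grep -i trim

# Does the kernel see a usable discard path through this device?
# Non-zero DISC-GRAN / DISC-MAX columns mean discards can be issued.
lsblk --discard /dev/sda
```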

You can use an entry-level LSI hardware RAID controller to ease deployment and recovery. At least the underlying RAID will be transparent to the OS.


Edit:

These are highly overprovisioned industrial SSDs. Configure the RAID 1 mirror as normal and just monitor the drives over time.
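A minimal sketch of that approach (the device names are placeholders, and SMART attribute names vary by vendor, so treat the grep pattern as an assumption):

```shell
# Plain RAID-1 mirror across the two SSDs, exactly as originally planned
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1

# Periodic monitoring: array health...
mdadm --detail /dev/md0

# ...and SSD wear indicators via SMART (needs smartmontools)
smartctl -A /dev/sda | grep -i -e wear -e life -e reallocat
```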

ewwhite
  • 194,921
  • 91
  • 434
  • 799
  • 1
    Though if you do decide to do hardware RAID, you need to make sure that the OS supports the hardware such that you can see the state of the underlying devices, or you won't know when devices start failing unless you're physically in front of the machine. +1 from me, anyway. – MadHatter Jul 15 '14 at 16:03
  • All major hardware RAID vendors provide Linux software which can monitor the individual devices behind the adapter. These can be tied into Nagios, etc. for monitoring. – Stefan Lasiewski Jul 15 '14 at 17:25
  • I've put the answers to your SSD/hardware questions as extra info in the question (hope that's okay, I'm new here). The idea is on the rare occasion that one of these drives fails in the field, be able to recover the mirrored data from the one that hasn't failed. Data-loss impact is high. Knowing that, would RAID1 still not be too much of an issue as you said? – mochatiger Jul 15 '14 at 17:37
  • 3
    @mochatiger Knowing what you've said and that the SSDs are highly overprovisioned (40%), I would configure software RAID 1 as you were planning. Red Hat's documentation is meant for general use cases and consumer hardware. Your situation is definitely different. – ewwhite Jul 15 '14 at 17:40
8

Is this something I should be seriously concerned with?

No

Are there alternatives for redundancy that I can use?

I prefer hardware RAID controllers but that's a personal thing, you're fine like this.

Chopper3
  • 100,240
  • 9
  • 106
  • 238
  • 1
    Chopper can you explain why we shouldn't be concerned about this? Shouldn't we be concerned with anything that causes 'performance of the SSD to degrade quickly.'? – Stefan Lasiewski Jul 15 '14 at 18:48
  • 2
    Sure, if you first explain how this level of work could possibly cause 'performance of the SSD to degrade quickly' given 2014-spec enterprise (OP uses the term 'industrial') SSDs. – Chopper3 Jul 15 '14 at 18:58
5

The question you should ask is when that documentation was written. Red Hat generally carries the same material forward between releases and only updates it when required, and SSD technology has changed since then.

Even though these are industrial SSDs, write and read performance are not the same. The documentation's warning is about write performance, but with a mirror setup you will also get better read performance on the /boot and / mounts.

So questioning the documentation in some respects is worthwhile.

paulcube
  • 181
  • 1
  • 9
1

You can use them without much of a problem in a software RAID 1 configuration (even if the SSDs were not so heavily overprovisioned), but only provided you TRIM them after creating the array.

You can do that in one of the following ways:

  • using a kernel new enough to support MD passing TRIM through to the SSD (at least 3.8.something IIRC, but please check), and running fstrim(8) (from the util-linux package) nightly

  • using a new enough kernel and mounting the filesystem with the "discard" option (for ext4/xfs). Note that this performs worse than the option above: TRIM is a non-queueable command, and issuing it on every delete forgoes the batching that fstrim provides

  • on older kernels, running mdtrim from a nightly cron job. Be sure to test it with the provided test script before putting important data on it!

Also note that all of this applies only to a filesystem sitting directly on the software RAID. It won't work for most hardware RAID, and it (currently) won't work if you have LVM or some other layer on top of the MD array. To survive those setups you need heavy overprovisioning (and luckily you have 40% of it, so you're fine).
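For the first option, the nightly run can be a simple cron entry (a sketch; the mount points and the 03:00 schedule are assumptions):

```shell
# /etc/cron.d/fstrim -- batch-TRIM the filesystems on the MD array nightly.
# Requires a kernel whose MD layer passes discards through (>= 3.8 or so).
0 3 * * * root /usr/sbin/fstrim -v / && /usr/sbin/fstrim -v /boot
```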

Matija Nalis
  • 2,409
  • 23
  • 37