
I plan to design a storage setup, but I need to anticipate downtime, maintenance, and rebuild time.

I have been told that rebuilding a 10-disk RAID 6 array of 10 TB SATA drives takes about a week, and some company policies require stopping all activity on the array during a rebuild.

If I use RAID 5 or 6 or 5+1 or 6+1, is there an approximate formula that gives a hint of the rebuild time depending on disk size and type (SAS/SATA/SSD)? Something like rpm x size (MB) x type-factor x number-of-disks ...

I would like to be able to anticipate every rebuild-time scenario depending on size, RAID type, and disk type.

I know it may depend on hardware quality, but let's say I am not using dedicated hardware like 3PAR / STOREWIZE / NETAPP or the like. I am using conventional servers with traditional SAS or SATA drives and software RAID.

dominix
  • Said company policy *may* make sense to *some* extent with RAID 5. But with RAID 6, stopping activity for a rebuild of *one* failed disk, when the array would still work with *two* failures, seems unnecessarily restrictive. The more disks you use, the more often there *will* be a disk failing, and you do not want the complete business to come to a halt for several hours. With hot spares, a background rebuild may even have completed or almost completed before you have the chance to communicate a company-wide shutdown of all relevant activities ... – Hagen von Eitzen May 19 '19 at 10:57
    PLEASE don't use R5. It's dangerous and no professional would use it for disks >1TB; you WILL lose data. This isn't opinion either, it's a well-documented fact. – Chopper3 May 19 '19 at 19:06

3 Answers


You can calculate the best-case rebuild time fairly simply: as a rebuild is sequential, the time needed is capacity / transfer rate. For example, rebuilding a 10 TB disk at a 200 MB/s transfer rate needs at least 10,000,000 MB / 200 MB/s = 50,000 s, or about 14 hours.
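As a quick illustration of that lower bound, here is a minimal Python sketch. The function name `best_case_rebuild_hours` is made up for this example, and it assumes decimal units (1 TB = 1,000,000 MB), which is how drive vendors quote capacity.

```python
def best_case_rebuild_hours(capacity_tb: float, rate_mb_s: float) -> float:
    """Lower bound on rebuild time: one full sequential pass over the disk.

    Assumes decimal units (1 TB = 1,000,000 MB) and a constant transfer rate.
    """
    capacity_mb = capacity_tb * 1_000_000
    seconds = capacity_mb / rate_mb_s
    return seconds / 3600


if __name__ == "__main__":
    # 10 TB at 200 MB/s -> 50,000 s
    print(f"{best_case_rebuild_hours(10, 200):.1f} h")  # prints "13.9 h"
```

Halving the sustained rate doubles the result, so the figure is only as good as the transfer rate you plug in.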

Now take this result and throw it away, as it is an overly optimistic scenario: it assumes 100% disk availability for the rebuild operation and fully sequential reads/writes. Add some non-rebuild (i.e. application) load to the mix, cap the rebuild itself at 30% of the disk bandwidth (so as not to grind other applications to a halt), and you are suddenly at 10x the rebuild time (i.e. about a week).
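To see how these factors compound, here is a rough Python sketch. It is only a toy model: the names `rebuild_share` and `sequential_efficiency` are hypothetical parameters introduced for illustration, and the 0.33 efficiency value is an assumed stand-in for the seek overhead caused by competing application I/O, not a measured number.

```python
def throttled_rebuild_days(
    capacity_tb: float,
    rate_mb_s: float,
    rebuild_share: float = 0.30,          # fraction of disk bandwidth given to the rebuild
    sequential_efficiency: float = 0.33,  # assumed loss from seeks under mixed load
) -> float:
    """Rough rebuild estimate under load (toy model, assumed parameters)."""
    if not (0 < rebuild_share <= 1 and 0 < sequential_efficiency <= 1):
        raise ValueError("factors must be in the range (0, 1]")
    effective_rate = rate_mb_s * rebuild_share * sequential_efficiency
    seconds = capacity_tb * 1_000_000 / effective_rate
    return seconds / 86_400


if __name__ == "__main__":
    idle = throttled_rebuild_days(10, 200, rebuild_share=1.0, sequential_efficiency=1.0)
    busy = throttled_rebuild_days(10, 200)
    print(f"idle: {idle:.2f} days, under load: {busy:.2f} days")
```

With these assumed factors, the 14-hour best case stretches to nearly six days, which is in the same range as the "about a week" figure.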

These long rebuild times are the reason why I avoid RAID5/6 in many systems, favoring mirroring instead. In any case, with such big drives, absolutely avoid RAID5, which is far too exposed to double failures and/or URE (unrecoverable read error) issues.
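The URE concern can be put into rough numbers with a small Python sketch. This is a simplified model resting on assumptions that are not part of the answer above: errors are treated as independent per bit, and the rate of one error per 1e14 bits read is the figure commonly quoted on consumer SATA datasheets. Real drives often do better than the datasheet value, so treat the result as a pessimistic illustration rather than a prediction.

```python
import math


def p_ure_during_rebuild(
    surviving_disks: int,
    capacity_tb: float,
    ure_per_bit: float = 1e-14,  # assumed datasheet rate, not a measured value
) -> float:
    """Probability of at least one URE while reading every surviving disk in full.

    Simplified model: independent bit errors at a constant rate.
    """
    bits_read = surviving_disks * capacity_tb * 1e12 * 8
    # 1 - (1 - p)^n, computed via log1p/expm1 for numerical stability
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    # 10-disk RAID5 of 10 TB drives: 9 surviving disks must be read completely
    print(f"1e-14 rate: {p_ure_during_rebuild(9, 10):.4f}")
    print(f"1e-15 rate: {p_ure_during_rebuild(9, 10, ure_per_bit=1e-15):.4f}")
```

Under this model a RAID5 rebuild has no redundancy left to absorb such an error, whereas a RAID6 array rebuilding a single disk still has one parity to fall back on.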

If you want to play with the numbers, have a look here

shodanshok

The theoretical absolute minimum rebuild time is the time needed to write a complete disk's worth of data: the capacity of a disk divided by the average sustained write speed the disk can maintain without cache.
(Note: that average sustained write speed will probably not come anywhere near the performance numbers quoted in the specs.)

Larger disks take longer.
Slower disks take longer.
Parity calculations take extra time.

Real-world numbers will vary but will certainly be (much) larger, and they depend on your RAID level, the number of remaining disks, the load on the system while the array rebuild takes place, the controller, etc.

Also see What are the different widely used RAID levels and when should I consider them?

HBruijn

It depends on your RAID controller (or software RAID stack). As others mentioned, first of all, don't use RAID-5 with large hard drives (it's OK for SSDs up to 1 TB and not much else).

In my experience, rebuild time varies greatly with the load on the storage. On idle systems, most controllers will need 36 to 72 hours to rebuild arrays of 8 to 12 TB drives (depending on your controller type and disk size).

When the system is under I/O load during the rebuild, however, it's not uncommon to see this duration grow to a week.

Note that helium drives have a much better reliability record than standard drives; in my experience, the failure rate of UltraStar He drives is low enough to keep RAID-6 relevant (a typical 100 TB to 1 PB system won't see more than one rebuild in a 5-year time span).

wazoox