10

I have an IBM x3650 M4 server. It is configured with RAID 5 and has four SAS hard disks of 500 GB each. Two of the hard disks are now showing as bad. If I replace the two bad drives with new ones, will the data get rebuilt automatically, or do I need to make some other changes? I do not know much about RAID configuration, so please help.

MadHatter
lakhan vasre
  • 3
    This seems apropos: http://serverfault.com/questions/2888/why-is-raid-not-a-backup – Andrew Henle Feb 22 '17 at 11:11
  • 2
    Is the array currently online? Can you access it? Also, what are your priorities? Is your backup up to date? Is downtime a problem? – David Schwartz Feb 22 '17 at 12:16
  • 2
    As a sidenote, rebuilding a RAID array is a very stressful operation for the hard disks... There is a very distinct possibility that other hard disks will die while doing it (they are all the same age, they are from the same batch, and if they have defects, they all have the same defect)... It is probably better to try copying all the data somewhere else. – xanatos Feb 22 '17 at 12:42
  • @xanatos How can I copy all the data somewhere else? Can you tell me how? – lakhan vasre Feb 22 '17 at 13:09
  • 11
    "How can I copy all the data somewhere else." That'd be what we refer to as a "backup". You're doing that already, right? And you're regularly testing that you can restore it, too? – Roger Lipscombe Feb 22 '17 at 17:04
  • @xanatos, If by "defect" you mean the actual defects in the platters, then your statement is wrong. It is very unlikely that the defects will be in exactly the same place on multiple drives. These defects are determined by the defects in the physical coatings on the platters, the physical characteristics of the heads, how well the manufacturing process optimized the parameters for the read/write channel, how well the manufacturing process mapped the defects at that time, etc. All of these vary from drive to drive, and even from platter to platter within a single drive. – Makyen Feb 23 '17 at 08:36

2 Answers

19

If you lose more than a single disk in a RAID 5, your array has been irreparably damaged in some way. In most cases the data is effectively lost unless you are an expert at recovery or are willing to ship it off to a recovery outfit. If you DO want to recover the data from this array, take it offline immediately and either recover it on your own or send the array + the card off to someone like DriveSavers.

This is one of the reasons it's generally advised to stay away from RAID 5, and use RAID 6, 10, or some level of RAID-Z or unRAID.

Now would be a great time to restore from backup. If you intend to create a new array with new disks, you might also consider giving these remaining disks the axe if they're just as old.
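The single-disk limit described above follows from how RAID 5 parity works. As a rough sketch only: the block contents are made up, and a real controller rotates parity across the disks and works on much larger blocks, but one stripe on a hypothetical 4-disk array can be modelled as three data blocks plus their XOR.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a hypothetical 4-disk RAID 5: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
stripe = data + [parity]

# One disk lost: XOR of the three survivors reproduces the missing block.
lost = 1
survivors = [blk for i, blk in enumerate(stripe) if i != lost]
rebuilt = xor_blocks(survivors)
assert rebuilt == stripe[lost]

# Two disks lost: the survivors only yield the XOR of the two missing blocks,
# which is one equation with two unknowns, so neither block can be recovered.
lost_pair = (0, 1)
survivors = [blk for i, blk in enumerate(stripe) if i not in lost_pair]
combined = xor_blocks(survivors)
assert combined == xor_blocks([stripe[0], stripe[1]])
```

With one block missing the parity equation has a single unknown; with two missing it does not, which is why a second whole-disk failure cannot be repaired by a rebuild.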

Spooler
  • 10
    I'd say that "generally advised to stay away from RAID5" is untrue. Like any tool or technology, you just need to be aware of its limitations. Two disk failures in RAID1 or RAID10 could also cause the same problem. – Mark Henderson Feb 22 '17 at 11:54
  • 6
    Ditto @MarkHenderson. RAID 6 often comes at a performance cost, and certainly a storage space cost; RAID 10 comes at a storage space cost; and RAID-Z1 is no more resilient against multi-disk failure than is RAID 5 except insofar as ZFS is more resilient than whatever else one might use, which might be not at all. I don't know about unRAID. I suspect that the OP's underlying issue is not monitoring the array for problems, but that (nor the point on staying away from RAID 5) doesn't invalidate the bulk of this answer: a RAID 5 array with two dead disks is not ever going to recover on its own. – user Feb 22 '17 at 12:04
  • 1
    The advice to avoid RAID5 is valid for new builds, especially with very large drives. The main concern with RAID5 is that during the time a rebuild is occurring, a second drive failure may occur. The longer the rebuild time, the greater the chance of this happening. Older RAID5 arrays are made of smaller drives, so the risk is less. – barbecue Feb 22 '17 at 22:35
  • @MichaelKjörling, as I understand it, unRAID is basically RAID 5 with file-level striping rather than block-level striping. Yes, a two-disk failure means you lose the array, but the different storage pattern means you can recover everything except the files that were on the failed disks. – Mark Feb 23 '17 at 00:09
  • True, the advice regarding RAID 5 is mostly targeted at new arrays with large disks. If OP is having double disk failures on what might be years old disks, it's time to get new ones - and the cost of new >=1TB drives is usually about the same as a 500G drive in most cases. It's hard to give a less generic answer when I don't know what the workload is / needs to be. – Spooler Feb 23 '17 at 03:22
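The point made in the comments above, that a longer rebuild leaves a longer window for a second drive to fail, can be put into rough numbers. This is only a back-of-the-envelope sketch: the 50 MB/s rebuild speed and the 5% annual failure rate are assumed values rather than figures from this thread, and failures are treated as independent with a constant rate, which old same-batch disks may well violate.

```python
import math

def p_second_failure(disks, disk_gb, rebuild_mb_per_s, annual_failure_rate):
    """Chance that any surviving disk fails while one replaced disk is rebuilt."""
    # Rebuild time grows linearly with disk size at a fixed rebuild speed.
    rebuild_hours = disk_gb * 1000 / rebuild_mb_per_s / 3600
    # Convert the annual failure rate into a constant hourly hazard rate.
    rate_per_hour = -math.log1p(-annual_failure_rate) / (365 * 24)
    # disks - 1 survivors are each exposed for the whole rebuild window.
    return -math.expm1(-(disks - 1) * rate_per_hour * rebuild_hours)

# Assumed values: 4 disks, 50 MB/s rebuild speed, 5% annual failure rate.
for size_gb in (500, 1000, 4000):
    p = p_second_failure(4, size_gb, 50, 0.05)
    print(f"4 x {size_gb} GB: {p:.4%}")
```

Under these assumptions the risk scales almost linearly with disk size. The absolute numbers depend entirely on the assumed rates, so only the trend should be read from it.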
8

So just to clarify, you had a 4-disk R5 array, you replaced 2 disks at once - is that right?

> will the data get rebuild automatically or do I need to do some other changes

If I've read correctly what you've done, then no, it won't get rebuilt, ever, and you've destroyed your data. And yes, you will have to make some other changes: you'll have to wipe the array and restore from your last backup.

If I've misread your question then please clarify, otherwise you played yourself.

Chopper3
  • 1
    Hi Chopper3, I have not done anything yet. I have just bought the new hard disks, but before connecting them to the server I wanted to ask you whether I should connect the new drives, try some other recovery option, or do something else. I am new to this, so I am not sure exactly what has to be done. – lakhan vasre Feb 22 '17 at 10:51
  • 5
    Ah - good news - in that case what you need to do is replace both drives ONE AT A TIME, ensuring that the array is fully rebuilt after replacing the first drive before replacing the second drive. Once this is done and your R5 array is 100% good you need to form a plan to migrate from R5 to something more stable such as R6/60 or R10 ok. – Chopper3 Feb 22 '17 at 10:55
  • Raid5 is still okay if you get a hotspare disk, in my opinion. It's all about risk management and weighing the performance hit between raid5 and raid6, or the monetary hit between raid5 and raid10, and all of those against a raid5 with a hotspare and a tested backup. – Daniel Feb 22 '17 at 11:07
  • 3
    @Chopper3 I'd think RAID-5 for a 4-disk array doesn't necessarily need to be replaced with RAID-6 or RAID-10. RAID-5 should provide adequate availability (the two-drive failure here notwithstanding...), and no version of RAID provides adequate backup anyway. – Andrew Henle Feb 22 '17 at 11:07
  • Dear all, thank you for the support. I will try this and check whether it works properly. – lakhan vasre Feb 22 '17 at 11:13
  • 2
    Daniel and Andrew - with 4 x 500GB disks then yes I can see how you'd be just about happy to carry on with R5 but as we know it's positively dangerous to use R5 with >1TB disks and has been for the best part of a decade – Chopper3 Feb 22 '17 at 11:14
  • 1
    @Daniel With a RAID 5 plus hot spare, just watch out so you don't end up with the [endless resilver](https://serverfault.com/a/523413/58408). (Unless of course it *will* fail in your case; ZFS is more like the exception...) – user Feb 22 '17 at 12:31
  • 5
    @Chopper3: That rule applies for naive RAID5 implementations and highly valuable data. A smart RAID5 controller can recover from 2 disks with Unrecoverable Read Errors, if they do not coincide. And with 1 TB disks, that's already a fairly low chance. (You're still in trouble when a whole disk dies, plus URE's on another disk, but that risk is fairly unrelated to size) – MSalters Feb 22 '17 at 12:32
  • 1
    @lakhanvasre & chopper, Recovery of as much data as possible is a non-automatic process that is significantly more complex than replacing one drive at a time. If you just replace one drive, rebuild, then replace the other drive and rebuild, you will end up with corrupted data in the sectors which are bad on the drive you replace second. You can recover all data, except for those sectors which are bad on both drives. Having identical sectors fail due to age is quite unlikely. Having identical sectors fail as a result of external causes (array physically hit while active) is not unlikely. – Makyen Feb 23 '17 at 08:53
  • Team, I have installed 2 new SAS drives. I configured one of them into the virtual drive and the other as a spare drive, so the RAID array rebuilt automatically. The rebuild completed 100%, and then I restarted the server because the MegaRAID configuration utility asked me to. After the restart I am unable to boot the server; it keeps restarting every time. Can you please suggest where I have gone wrong? Also, I do not have any backup. – lakhan vasre Mar 14 '17 at 18:15
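Makyen's point about sectors that are bad on one drive versus both can be sketched with plain set logic. This is only an illustration: the bad-sector numbers below are invented, `stripe_status` is a hypothetical helper, and it assumes the same sector number on each drive belongs to the same stripe, which glosses over the real on-disk layout.

```python
def stripe_status(bad_a, bad_b):
    """Classify sectors given the unreadable sector numbers on two failing drives."""
    # Unreadable on both drives: two blocks of the stripe are gone, parity cannot help.
    lost = bad_a & bad_b
    # Unreadable on exactly one drive: rebuildable from the other drives plus parity.
    needs_parity = (bad_a | bad_b) - lost
    return needs_parity, lost

# Hypothetical bad-sector maps for the two failing drives.
drive_a_bad = {120, 4096, 77000}
drive_b_bad = {4096, 90500}

needs_parity, lost = stripe_status(drive_a_bad, drive_b_bad)
print("rebuild from parity:", sorted(needs_parity))
print("unrecoverable:", sorted(lost))
```

Only sectors in the intersection are beyond repair in this model. A drive-by-drive rebuild discards the still-readable parts of the second failing drive before they have been used, which is why it is not the same as this kind of sector-level recovery.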