
I have been trying to restore data from an array created on a Dell MD3220 PowerVault storage unit. I have been on the phone with Dell and another support group for weeks now and keep running into brick walls. I was hoping someone here might have an idea I could try to recover the data. The storage appliance has 24 drive bays numbered from 0 (so drive 1 is called 0, and drive 24 is called 23).

[Images: MD3220 front; MD3220 back]

The unit experienced a power outage, and I suspect the cause of the issue was the storage unit going offline before the two servers accessing the data (via SAS cables) did. As a result, the DBs that contain the array config, one on each of the two controllers in the MD3220, became corrupt.

  • We tried to recover the DBs by replacing the current DB with the latest backup found on the controller itself (a common scenario). That seemed to fail.

  • We even went as far as trying to rebuild the database from the DBM files stored on the server that I use to manage the appliance. We had Dell generate a Validator key to use when rebuilding the databases. That seemed to fail as well.

The error I keep seeing and can't get around is: Exception type N3adp6Device24ExtentAllocatedExceptionE message "N3adp6Device24ExtentAllocatedExceptionE"with extent:553 of size:1106 for drive ordinal22.
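The exception name looks like a mangled C++ type name. Here is a minimal sketch for decoding it, assuming the firmware was built with a compiler that uses the Itanium C++ ABI scheme for nested names (an `N`, then length-prefixed identifiers, then an `E`). The function name `demangle_nested_name` is my own, and it only handles this simple form.

```python
def demangle_nested_name(mangled):
    """Decode a simple nested name of the form N<len><id><len><id>...E.

    Assumption: Itanium C++ ABI style mangling; templates, qualifiers and
    substitutions are not handled.
    """
    if not (mangled.startswith("N") and mangled.endswith("E")):
        raise ValueError("not a simple nested name")
    body = mangled[1:-1]
    parts = []
    pos = 0
    while pos < len(body):
        # Read the decimal length prefix.
        start = pos
        while pos < len(body) and body[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError("expected a length prefix")
        length = int(body[start:pos])
        if pos + length > len(body):
            raise ValueError("length prefix runs past the end of the name")
        # Read exactly that many characters as one identifier.
        parts.append(body[pos:pos + length])
        pos += length
    return "::".join(parts)


print(demangle_nested_name("N3adp6Device24ExtentAllocatedExceptionE"))
```

If that assumption holds, the name reads as `adp::Device::ExtentAllocatedException`, which suggests the controller is refusing to allocate an extent it thinks is already allocated on that drive, rather than reporting a failed disk.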

09/29/21-19:24:37 (tRAID): WARN:  UWManager::initializeNvsramIWLog: IWLog invalidated
09/29/21-19:24:37 (tRAID): NOTE:  UWMgr findIWLogs: Found IW log drive. Devnum 0x10001 tray=0 slot=2 ssd=0 qos=3 controller=0
09/29/21-19:24:37 (tRAID): NOTE:  UWMgr findIWLogs: Found IW log drive. Devnum 0x10002 tray=0 slot=3 ssd=0 qos=3 controller=0
09/29/21-19:24:37 (IWTask): NOTE:  UWMgr: IW logging started
09/29/21-19:24:41 (tRAID): ERROR: CrushDrive::allocateExtent - Exception type N3adp6Device24ExtentAllocatedExceptionE message "N3adp6Device24ExtentAllocatedExceptionE"with extent:553 of size:1106 for drive ordinal22
09/29/21-19:24:41 (tRAID): ERROR: CrushStripe DeSerialization - Couldn't allocate extent! CrushDrive 22 Volume 1 CrushPiece 2 Extent 553
09/29/21-19:24:41 (tRAID): ERROR: Exception during stripe allocation in vdm::CrushStripePersistenceManager::initialize(1)
09/29/21-19:24:41 (tRAID): ERROR: vdm::CrushInvalidCfgMgr DB_CORRUPT detected
09/29/21-19:24:41 (tRAID): NOTE:  lockdownPrimaryDBInvalidWorker: OBB already in pcache, not updating.
09/29/21-19:24:41 (tRAID): WARN:  BackupDatabaseManager:lockdownPrimaryDBInvalid Exception IconSendInfeasibleException Error
09/29/21-19:24:41 (tRAID): WARN:  BDBM:  Client detected Primary DB Corruption. Forcing dualControllerLockdown.
09/29/21-19:24:41 (tRAID): WARN:  Ctl Reboot:
                                Reboot CompID: 0x407
                                Reboot reason: 0x11
                                Reboot reason extra: 0x2
09/29/21-19:24:41 (tRAID): WARN:  Rebooting this Controller now

I'm guessing "ordinal22" refers to drive 23 (of 24 drives)? I'm not sure what it's complaining about, though. Is drive 23 bad? Is there a database on every drive, and is the DB on drive 23 bad? Is there a way to restore that drive's database, for example by copying it from another drive? Is it even talking about drive 23? Any help anyone can toss my way would help a bunch.
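For anyone who wants to pull the numbers out of the log, here is a rough sketch of how I am reading that error line. The regular expression is based only on the one line shown above, and the helper names `parse_extent_error` and `ordinal_to_bay_position` are my own. The second helper encodes my guess that the ordinal is 0-based like the bay labels; I have not confirmed that with Dell.

```python
import re

# Matches the tail of the CrushDrive::allocateExtent error line shown above.
EXTENT_ERROR = re.compile(
    r"extent:(?P<extent>\d+) of size:(?P<size>\d+) for drive ordinal\s*(?P<ordinal>\d+)"
)


def parse_extent_error(line):
    """Return the extent, size and drive ordinal as ints, or None if absent."""
    match = EXTENT_ERROR.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def ordinal_to_bay_position(ordinal):
    """Assumption (unconfirmed): ordinals are 0-based, so add 1 for the position."""
    return ordinal + 1


line = (
    "09/29/21-19:24:41 (tRAID): ERROR: CrushDrive::allocateExtent - Exception type "
    'N3adp6Device24ExtentAllocatedExceptionE message "N3adp6Device24ExtentAllocatedExceptionE"'
    "with extent:553 of size:1106 for drive ordinal22"
)
fields = parse_extent_error(line)
print(fields)
print(ordinal_to_bay_position(fields["ordinal"]))
```

Under that guess, ordinal 22 would be the 23rd drive, i.e. the bay labelled 22 on the chassis.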

Thanks!!

Arvo Bowen
  • That's a 10 year old model - I'm surprised it's still in support! When you say databases do you mean the array layout or your actual application databases? Can you not just wipe the array completely, rebuild the array and restore your data from backup? – Chopper3 Oct 19 '21 at 15:47
  • open a ticket on Dell if you still bought it – djdomi Oct 19 '21 at 16:19
  • @djdomi not really sure how to respond to that comment. In my question, I said I have been working with dell for weeks now. So yes I have a ticket that I opened with Dell and ... yes I bought it..? – Arvo Bowen Oct 19 '21 at 18:57
  • @Chopper3 it's not under contract and we had to pay dearly to have the "one-time support" option. By DB I mean array layout. Dell calls it a database on the RAID controllers. There are a few things on it that were not backed up. So yes I could but I would lose some data I would prefer not to. It could save me many weeks in rebuild time so it would be worth trying to recover. – Arvo Bowen Oct 19 '21 at 19:00
  • In short: pay for Dell support as long as you have this item in use. We had a similar device that broke because the controllers stopped working. The final critical point came while Dell was on site and everything was shut down: the last of the two controllers failed at that moment... A firmware update revived them. – djdomi Oct 20 '21 at 05:07
  • "The unit experienced a power outage and I guess the storage unit going offline before the two servers accessing the data (via SAS cables) did was the cause of the issue." That is something you *really* need to prevent. The box has two PSUs and at least one of them needs to be hooked up to a UPS (or both to different UPSes). A sudden power outage can corrupt the RAID setup (as in your case) or the data stored on the appliance - even silently, so you'd only notice days/weeks later or perhaps never. I had an MD3220i in heavy use for many years and remember that the original firmware caused a few – Zac67 Oct 22 '21 at 20:14

0 Answers