I really need help.

I am running an XCP-NG server, with an LSI MegaRAID 9260-8i card installed. I have a RAID 5 virtual disk with one partition on said disk that takes up all the space. That partition is now mysteriously gone. Here is what I had done before I noticed this:

Ran the following command:

sudo yum install -y tar bzip2 make automake gcc gcc-c++ pciutils elfutils-libelf-devel libglvnd-devel

I started it once, cancelled it with Ctrl+C, and then ran it again.

Next, I wanted to configure PCI passthrough for my VMs, so I rebooted into the BIOS to enable IOMMU. After enabling it, I came back to my server only to see that the partition on my virtual disk (on the RAID controller) was completely gone. There is no sign of it in the lsblk output.

For whatever reason, the partition on the RAID virtual disk has vanished without a trace. It contained days' worth of configuration work, as this is a brand new server, and I really don’t want to redo all of that. Please someone come to my rescue!

Edit: looks like the folder where I would mount the array has disappeared, along with my /etc/fstab entry.

I installed the RAID management software, it’s now seeing my RAID5 as a RAID0.

Output of lspci -v

My OS drive does have a mysterious, unrecognized 18 GB partition that I can’t mount. I don’t think XCP-NG created it.

james
  • Restore from your backups. – Michael Hampton Aug 15 '21 at 09:38
  • @MichaelHampton This was a brand new server, backups haven't been configured yet. – james Aug 15 '21 at 17:09
  • *I installed the RAID management software, it’s now seeing my RAID5 as a RAID0.* I'd be more worried about this server having hardware issues and not being suitable for use than I would care about having to redo all the work building it. First thing is run the full diagnostics available on the system (if any - IBM/Lenovo servers for example have extensive built-in diagnostics), then run a full memory test such as memtest86. – Andrew Henle Aug 15 '21 at 17:17
  • You put "days worth of configuration work" into it. Configuring backups should have come first, before _any_ of that. It's a hard lesson to learn. At this point I agree with @AndrewHenle that you should check the hardware. At the very least run `storcli /cx show all` and look for obvious problems, check your cables didn't come unplugged, boot into the RAID BIOS and look for issues there, etc. – Michael Hampton Aug 15 '21 at 19:10
  • And after you do what @MichaelHampton recommended, when you rebuild the RAID5 array, run a ***full*** init of the array - `storcli /c0/v0 start init` (assuming controller 0 and virtual disk 0), and wait for it to finish successfully. Because if what you said is true - your system converted a RAID5 array into a RAID0 array - that system now has to prove it's trustworthy. And it takes as long as it takes - you don't want this system pulling another "Presto! Change-o! RAID5 is now RAID zero!" again when you have a month's worth of production data on it and you're relying on it to be up. – Andrew Henle Aug 15 '21 at 21:27

0 Answers