Has anyone dealt with a corrupt file system on a Storage Gateway volume? One of my volumes now reports that it is corrupted or unreadable. I ran chkdsk /r on it, which took days (it's a 10TB volume), and when it completed I got the same error message. I didn't have snapshots scheduled, so I don't have a previous version of these files. I'm currently working with AWS support, and they're having me run chkdsk a couple of different ways. Has anyone been through this before?

PS: Don't run chkdsk directly against a Storage Gateway volume. It messes up your cache and runs very slowly.

blsub6
  • I have not encountered corruption on SG volumes myself yet. However, could you specify what actions preceded the failure? Did you mount/unmount your bucket right before the corruption, or have you experienced any connectivity issues? – Strepsils May 25 '17 at 10:38
  • I believe it was due to connectivity issues. I was copying a lot of data up to my SG volume while backup data was going over the same link, so the upload was starved of bandwidth. My upload buffer filled and nothing more could be uploaded. I eventually cleared the bandwidth bottleneck and everything in my buffer finished uploading, and now I have a corrupted file system – blsub6 May 25 '17 at 18:30

1 Answer

We solved this and got the files back. On the advice of AWS support, I created an EBS snapshot of the Storage Gateway volume, restored it as an EBS volume, attached it to a fairly beefy EC2 instance, and ran chkdsk from there. Because the EBS volume was attached directly to the instance, with no Storage Gateway or WAN in the path, chkdsk ran much faster than the other approaches (it still took days for 6TB of data on a 10TB volume). When chkdsk finished and we confirmed that we could access the files from the EC2 instance, we snapshotted the volume and restored it to our on-prem Storage Gateway.
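For anyone who wants to script the first half of that (snapshot, restore as EBS, attach), here is a rough sketch using boto3. This is not what AWS support gave us, just one way the steps could be automated. The function name, the `xvdf` device name, and the waiter settings are my own assumptions, and the clients are passed in so the flow can be tried against stubs before touching a real account.

```python
def restore_gateway_volume_to_ec2(sgw, ec2, volume_arn, availability_zone,
                                  instance_id, device="xvdf",
                                  waiter_config=None):
    """Snapshot a Storage Gateway volume, restore it as an EBS volume,
    and attach that volume to an EC2 instance.

    sgw and ec2 are expected to behave like boto3 "storagegateway" and
    "ec2" clients. Returns (snapshot_id, volume_id).
    """
    # Large snapshots can take hours, so the default waiter timeout is
    # likely too short. These numbers are a guess, not a recommendation.
    if waiter_config is None:
        waiter_config = {"Delay": 60, "MaxAttempts": 1440}

    # 1. Take an EBS snapshot of the gateway volume.
    snap = sgw.create_snapshot(
        VolumeARN=volume_arn,
        SnapshotDescription="Copy for offline chkdsk",
    )
    snapshot_id = snap["SnapshotId"]

    # 2. Wait until the snapshot is complete before restoring from it.
    ec2.get_waiter("snapshot_completed").wait(
        SnapshotIds=[snapshot_id], WaiterConfig=waiter_config
    )

    # 3. Restore the snapshot as an EBS volume in the instance's AZ.
    vol = ec2.create_volume(
        SnapshotId=snapshot_id, AvailabilityZone=availability_zone
    )
    volume_id = vol["VolumeId"]
    ec2.get_waiter("volume_available").wait(
        VolumeIds=[volume_id], WaiterConfig=waiter_config
    )

    # 4. Attach it to the instance that will run chkdsk.
    ec2.attach_volume(
        VolumeId=volume_id, InstanceId=instance_id, Device=device
    )
    return snapshot_id, volume_id
```

With real credentials you would pass `boto3.client("storagegateway")` and `boto3.client("ec2")` as the first two arguments. After that the disk still has to be brought online inside Windows before chkdsk can be run against it, and the instance must be in the same availability zone as the new volume.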

Moral of the story: if you're using Storage Gateway, know that file systems can still become corrupted in the cloud, and schedule snapshots on your volumes in case it happens.
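If you would rather set the snapshot schedule from a script than from the console, something along these lines should work with boto3. Treat it as a sketch: the function name, the defaults (daily at hour 2), and the description text are my own choices. As far as I know `StartAt` is interpreted in the gateway's time zone, and the client is passed in so this can be exercised with a stub.

```python
def schedule_volume_snapshots(sgw, volume_arn, start_hour=2,
                              recurrence_hours=24):
    """Set a recurring snapshot schedule on a Storage Gateway volume.

    sgw is expected to behave like a boto3 "storagegateway" client.
    Returns the schedule as reported back by the service.
    """
    # Basic range checks before calling the API.
    if not 0 <= start_hour <= 23:
        raise ValueError("start_hour must be between 0 and 23")
    if not 1 <= recurrence_hours <= 24:
        raise ValueError("recurrence_hours must be between 1 and 24")

    # Create or overwrite the schedule for this volume.
    sgw.update_snapshot_schedule(
        VolumeARN=volume_arn,
        StartAt=start_hour,
        RecurrenceInHours=recurrence_hours,
        Description="Recurring snapshot for corruption recovery",
    )

    # Read the schedule back so the caller can confirm what was applied.
    return sgw.describe_snapshot_schedule(VolumeARN=volume_arn)
```

Call it once per volume with `boto3.client("storagegateway")` and the volume's ARN.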

blsub6