We are storing files on FreeNAS 9.2 using ZFS. I love the data integrity claims made by ZFS, having randomly lost data in the past on servers running ext3, XFS, and ReiserFS that had not been mistreated (no power outages, etc.). It was rare, but disastrous: a server needed a reboot, fsck kicked in, and it found lots of errors.

We are also using this same NAS as a shared storage target for XenServer virtual machines. At first I was thinking about how nice it is to have our VMs backed by ZFS, but now I'm second-guessing whether the integrity is really that failure-proof.

If a VM's virtual disk is just a large file containing its own file system (assume the default recommended ext4), then what prevents corruption from occurring within that virtual disk? Perhaps a network cable becomes faulty and iSCSI doesn't know it received a few bad bytes, which ZFS then stores resiliently? I'm guessing there are other possible faults that can occur between the VM and the shared storage that a "trusting" file system wouldn't detect. Is the only solution to also use an error-correcting file system within the VM, such as ZFS or btrfs?

jimp

1 Answer

Sorry, nothing prevents that from happening in your guests.

Protect your environment!

  • Use uninterruptible power supplies to back your server and networking hardware.
  • Dual power supplies in everything you can.
  • Multiple storage paths (MPIO for the iSCSI in your case).
  • Backups.
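Since the guest's ext4 won't flag silently flipped bytes, one way to at least detect them (and to check that a backup is still good) is an application-level checksum manifest. Below is a minimal sketch, assuming Python 3 is available in the guest. The function names and the in-memory manifest format are made up for illustration and aren't part of any FreeNAS or XenServer tooling.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def verify_manifest(root, manifest):
    """Return sorted relative paths that are missing or whose digest changed."""
    bad = []
    for rel, expected in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or sha256_of(full) != expected:
            bad.append(rel)
    return sorted(bad)
```

This only detects a mismatch; it can't repair anything, and it can't tell corruption apart from a legitimate edit, so it suits data that shouldn't change (archives, backup sets) better than live files.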

And heck, even in-VM ZFS filesystems can have problems.

Every 3.0s: zpool status -v                    Fri Dec 27 12:49:47 2013

  pool: vol1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub in progress since Fri Dec 27 12:35:06 2013
    42.1G scanned out of 46.3G at 48.9M/s, 0h1m to go
    0 repaired, 90.80% done
config:

        NAME        STATE     READ WRITE CKSUM
        vol1        ONLINE       0     0   167
          sdb       ONLINE       0     0   448

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x67>
        <metadata>:<0x6f>
        <metadata>:<0x8e>
        vol1/ppro:/isam/IM00013.ISI
        vol1/ppro:/isam/IM00014.ISI
        vol1/ppro:/isam/IM00015.ISI
        vol1/ppro:/isam/IM00016.ISI
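
To keep an eye on counters like the ones above without staring at `watch`, the CKSUM column could be pulled out of the `zpool status` text with a few lines of Python. This is only a sketch: `checksum_errors` is a made-up helper, and it assumes the column layout shown above with plain integer counts (other ZFS versions may format the numbers differently).

```python
import re

# Matches device rows such as "  sdb       ONLINE       0     0   448".
# The header row does not match because its counter columns are not digits.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def checksum_errors(status_text):
    """Return {device_name: CKSUM count} for config rows with a nonzero count."""
    errors = {}
    in_config = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("config:"):
            in_config = True
            continue
        if stripped.startswith("errors:"):
            in_config = False
        if not in_config:
            continue
        match = ROW.match(line)
        if match and int(match.group(5)) > 0:
            errors[match.group(1)] = int(match.group(5))
    return errors
```

Feeding it the captured output of `zpool status -v` (for example from a cron job) and alerting on a non-empty result would surface the problem well before a reboot does.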
ewwhite
  • Wow! So if a guest file system is always at risk of corruption, would NFS directly to the NAS file system be immune, or at least only subject to file-by-file corruption? I'm trying to virtualize several NAS boxes for privacy (and cost) on top of a physical NAS--complete with all the redundancy you recommend. I need the best solution for data integrity, while keeping the virtual NAS boxes separate so each group of users has complete control over their own NAS. But the files don't have to live inside a virtual file system if that is more dangerous than NFS from the guest to the physical NAS. – jimp Jan 19 '14 at 22:51
  • (Thanks for the advice, btw! I definitely have a plan for backups, but I'm also trying my best to avoid known points of failure, common or obscure.) – jimp Jan 19 '14 at 23:09
  • Also make sure your full environment is using ECC RAM. The FreeNAS box, and the clients. Non-ECC RAM can silently corrupt data and you'll never know until something crashes/breaks. – Nex7 Jan 21 '14 at 19:50