I have a VMware ESXi 4.1 server to which I recently added 2 x 1 TB SATA drives. This machine runs a NexentaStor Community VM which hosts ZFS filesystems. Before adding the new drives, all of the ZFS zpools resided inside virtual disks (VMDK files) on the server's VMware datastore, which sits on a hardware RAID10.
The new SATA drives have no hardware redundancy, so my goal was to attach them directly to the NexentaStor VM and create a RAID1 zpool out of them.
I followed these instructions to create two physical RDM files for the new SATA drives using `vmkfstools -z /vmfs/devices/disks/idnumber RDM1.vmdk -a lsilogic`
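For reference, this is roughly what I ran from the ESXi console. The device IDs and the datastore path are placeholders for illustration, not my actual values:

```sh
# List the raw device identifiers ESXi sees for the new SATA drives
# (the t10.ATA... names); the IDs used below are placeholders.
ls -l /vmfs/devices/disks/

# Create one physical-compatibility RDM pointer file per drive.
# The datastore name and folder ("datastore1/NexentaStor") are assumptions.
vmkfstools -z /vmfs/devices/disks/t10.ATA_____DISK1_ID /vmfs/volumes/datastore1/NexentaStor/RDM1.vmdk -a lsilogic
vmkfstools -z /vmfs/devices/disks/t10.ATA_____DISK2_ID /vmfs/volumes/datastore1/NexentaStor/RDM2.vmdk -a lsilogic
```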
After adding the two RDM disks to the VM and creating a raidz1 zpool on them, I started copying data to the zpool. The pool was taken offline and I was informed there were thousands of checksum errors.
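Inside the NexentaStor VM it was something along these lines (the pool name and the c#t#d# device names are placeholders for whatever the VM actually enumerated):

```sh
# Create the pool on the two RDM-backed disks; device names are placeholders.
zpool create tank raidz1 c2t1d0 c2t2d0

# zpool status reports per-device read/write/checksum error counters;
# this is where the thousands of CKSUM errors showed up during the copy.
zpool status -v tank
```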
I searched the web and found a number of people complaining of the same situation. (Example) I have since given up on using RDMs and have created two datastores and two 930 GB VMDK files which I will place in a raidz1 pool. However, I want to know where I went wrong. A lot of people online said they have this configuration working.
My goals behind using RDMs as opposed to VMDKs were:
1. Give the VM the ability to monitor S.M.A.R.T. status
2. Allow ZFS access to the entire disk (since I knew these drives wouldn't be used for anything else)
3. Make the drives easy to hot-swap should one go bad
4. Allow me to remove these drives and place them in another ZFS server should I need to
I had planned to use this same setup on a brand-new ESXi 5.1 server which I will be setting up later this week. In that case goal #4 is particularly important, because I wanted to add an existing zpool to a new VM. Why did I get these checksum errors? Is it related to ESXi 4.1? Is there something I did wrong?
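To illustrate what I mean by goal #4, the plan was the usual ZFS export/import procedure (the pool name is hypothetical):

```sh
# On the old server's NexentaStor VM: cleanly detach the pool.
zpool export tank

# After moving the drives (or re-pointing the RDMs) to the new ESXi 5.1 host
# and attaching them to the new VM:
zpool import            # lists pools available for import
zpool import tank       # brings the existing pool online in the new VM
```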
Edit: I have created the same setup with VMware ESXi 5.1 and have had no issues so far. I'm going to test this thoroughly, but so far it appears to be an ESXi 4.1 issue.