3

I'm configuring a FreeBSD server hosting VirtualBox, serving half a dozen busy, mission-critical mail servers. I just learned about ZFS and I'm quite attracted to it, but I have a few questions:

  1. What is the CPU overhead of ZFS? I googled and found few (or no) benchmarks for that.

  2. From what I learned, when ZFS updates files, it keeps the old file as a snapshot and writes the updated part for the new version. However, that would mean each snapshot it keeps requires significant storage overhead. How much is this storage overhead? For example, suppose I have 2TB of usable space: how much space can actually be used for the latest version of the files one year later?

  3. Is FreeBSD with ZFS hosting VirtualBox, serving half a dozen busy, mission-critical guest mail servers, a reasonable combination? Anything in particular to be careful with? And can I still choose ZFS for the guest OSes? I ask because I may build another identical box for redundancy, and will need to do some mirroring between each pair of guest systems across the boxes.

  4. I'm trying to configure a Dell R710 for this. From what I learned, I shouldn't choose any RAID at all. Is that true? In that case, are the drives still hot swappable?

  5. This may sound a bit pathetic, but since I have no experience with ZFS at all and this is a mission-critical server, I'll ask just in case: I'm choosing twin Intel L5630 processors and 6 x 600GB 15K RPM Serial Attached SCSI drives. If I need more space in the future, I would just hot swap some drives for larger-capacity ones to expand the storage. There is no problem with these, right?

John

5 Answers

4

I'll address #3 here. I don't think VirtualBox+FreeBSD+ZFS is the best solution for what you're proposing (based on your usage of "mission critical" and "busy").

  • What do you hope to accomplish by using ZFS? It's a great filesystem and I'm definitely an advocate, but what value does it add in this case? Snapshots? Checksumming? (Personally, it took me several attempts and a lot of research before I began hosting critical applications on ZFS.)

  • I understand that you're new to ZFS, but how's your experience with FreeBSD and VirtualBox?

  • Would there be any problem with running a single instance of the mail software and hosting multiple domains within, or do you need the complete isolation afforded by virtualization?

  • If you do choose to use ZFS, you don't want the traditional PERC RAID controllers, but should specify something that passes the raw SAS drives to the operating system. See: ZFS SAS/SATA controller recommendations. (BTW, your drives would still be hot-swappable.)

ewwhite
  • 2
    The main attraction of ZFS in this setup is that it eliminates the single point of failure in the RAID controller. I'm also interested in snapshots, which may be useful for reverting guests, although virtual machine software can also do that. I'm very experienced with FreeBSD. I've never worked with VirtualBox, but have quite some experience with KVM, VMware and Xen (on CentOS). We need complete isolation of the email domains. They used to run on separate machines. We have just started to virtualize them, but will never merge them into one machine. – John Jan 28 '11 at 16:39
  • I don't think ZFS eliminates a point of failure in the RAID controller. Personally, I'd skip the VirtualBox and run VMware ESX or ESXi with separate VM guests running the mail server OS of your choice, which appears to be FreeBSD. ZFS doesn't offer much here. – ewwhite Jan 28 '11 at 17:21
  • But VMware doesn't run on FreeBSD as a host, or at least doesn't offer native I/O for guests. And running all FreeBSD guests on a Windows host makes me feel bad. If running VMware on a Linux host, the free version has inadequate performance, while we don't like the enterprise version since there are excellent free alternatives. So we moved to KVM and Xen a long time ago. But they don't provide native I/O speed for FreeBSD as a guest. That's why we are looking into VirtualBox. – John Jan 28 '11 at 20:11
  • 1
    I was referring to the free VMware ESXi to run on the hardware. http://www.vmware.com/products/vsphere-hypervisor/index.html – ewwhite Jan 28 '11 at 20:12
  • The free version has inferior performance, either I/O or some management issue. I can't remember the details now, but we used it before and decided to move on. – John Jan 28 '11 at 20:28
  • Besides, why doesn't ZFS eliminate the single point of failure in the RAID controller? Actually, that's one thing I couldn't understand in your first reply. It seems that I can't just order a bunch of drives for the server from Dell; I have to order another controller to work with ZFS. Is that right? Then indeed I don't see much point in using ZFS. – John Jan 28 '11 at 20:38
4

I have experience with ZFS running on OpenSolaris on a 50TB NFS file server for HPC, so I will answer your questions based on that.

Question 1

What is the CPU overhead of ZFS?

It's very small. It will vary depending on which checksum and compression algorithms you choose and whether you enable deduplication. I have all 3 enabled with the default options and rarely see my 16 cores go above 15% utilization each. Keep in mind that compression and dedup also reduce the amount of data that you need to write, so things actually end up happening faster at the expense of minor CPU utilization. CPUs are bloody fast nowadays.

Question 2

From what I learned, when ZFS updates files, it keeps the old file as a snapshot and writes the updated part for the new version. However, that would mean each snapshot it keeps requires significant storage overhead. How much is this storage overhead? For example, suppose I have 2TB of usable space: how much space can actually be used for the latest version of the files one year later?

The snapshots store only the difference of what has changed. You only start seeing snapshots take up space if you delete or modify existing data. For a mail server that stores mail in plain text, that would mean that only the deleted emails result in overhead in the snapshots. If you accumulate 1.5TB (after ZFS compression) of emails and 0.5TB were deleted over time, then you will be able to fit everything into your 2TB zpool no matter how many snapshots you made.

Having 1 or more snapshots means that you will not be able to free up space by deleting files, but you can free up space by deleting snapshots.
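To make the accounting above concrete, here is a toy Python model. It is only a sketch: `ToyPool` and its methods are made up for illustration and have nothing to do with real ZFS internals. It assumes every block is counted once, no matter how many snapshots reference it.

```python
class ToyPool:
    """Toy model of snapshot space accounting (not real ZFS internals)."""

    def __init__(self):
        self.live = set()        # block IDs referenced by the current filesystem
        self.snapshots = []      # each snapshot is a frozen copy of `live`
        self._next_id = 0

    def write(self, n_blocks):
        """Allocate n_blocks new blocks and return their IDs."""
        ids = set(range(self._next_id, self._next_id + n_blocks))
        self._next_id += n_blocks
        self.live |= ids
        return ids

    def delete(self, ids):
        """Drop blocks from the live filesystem; snapshots may still hold them."""
        self.live -= set(ids)

    def snapshot(self):
        self.snapshots.append(frozenset(self.live))

    def destroy_all_snapshots(self):
        self.snapshots.clear()

    def used(self):
        """A block costs space once, however many snapshots reference it."""
        referenced = set(self.live)
        for snap in self.snapshots:
            referenced |= snap
        return len(referenced)


pool = ToyPool()
kept = pool.write(1500)      # mail that stays
doomed = pool.write(500)     # mail that will be deleted later
for _ in range(30):          # many snapshots of the same data
    pool.snapshot()
pool.delete(doomed)
print(pool.used())           # 2000: deleted blocks are still held by snapshots
pool.destroy_all_snapshots()
print(pool.used())           # 1500: space comes back only now
```

Taking 30 snapshots instead of 1 does not change the total, because they all point at the same blocks.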

ZFS is a transactional filesystem, so even deleting a snapshot requires writing a small log to disk. This means that if you have 0 bytes of free space, you can't delete anything. I got stuck like that once. So take some care to set up a disk space quota (say 99% of your zpool) so that when you run out of space you will actually be able to delete things.
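A minimal sketch of the 99% quota idea in Python. The pool name `tank` and the 2 TB size are assumptions for illustration; the helper only builds the `zfs set quota=...` command and does not run it.

```python
def quota_bytes(pool_size_bytes, percent=99):
    """Return `percent` of the pool size, using integer math to avoid rounding surprises."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    return pool_size_bytes * percent // 100


def quota_command(dataset, pool_size_bytes, percent=99):
    """Build (but do not run) the command that caps `dataset` below the pool size."""
    return ["zfs", "set", f"quota={quota_bytes(pool_size_bytes, percent)}", dataset]


# Hypothetical pool named "tank" with 2 TB (decimal) of usable space.
print(" ".join(quota_command("tank", 2 * 10**12)))
```

The remaining 1% stays free for the small writes that a delete itself needs.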

Question 3

Is FreeBSD with ZFS hosting VirtualBox, serving half a dozen busy, mission-critical guest mail servers, a reasonable combination? Anything in particular to be careful with? And can I still choose ZFS for the guest OSes? I ask because I may build another identical box for redundancy, and will need to do some mirroring between each pair of guest systems across the boxes.

I don't know how well VirtualBox will work under this kind of load. You should test the performance before you deploy. Replication would be best done with zfs send.
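As a rough sketch of what replication with `zfs send` could look like, the Python helper below builds the shell pipeline as a string without running it. The dataset `tank/mail`, the snapshot names and the host `standby` are made-up placeholders; `-i` requests an incremental stream relative to an earlier snapshot.

```python
import shlex


def replication_pipeline(dataset, new_snap, remote_host, remote_dataset, prev_snap=None):
    """Build a `zfs send | ssh ... zfs receive` pipeline as a shell string."""
    send = ["zfs", "send"]
    if prev_snap is not None:
        # Incremental stream: only the changes between prev_snap and new_snap.
        send += ["-i", f"{dataset}@{prev_snap}"]
    send.append(f"{dataset}@{new_snap}")
    receive = ["ssh", remote_host, "zfs", "receive", remote_dataset]
    return " | ".join(
        " ".join(shlex.quote(arg) for arg in part) for part in (send, receive)
    )


# First a full copy, then only the changes since the previous snapshot.
print(replication_pipeline("tank/mail", "monday", "standby", "tank/mail"))
print(replication_pipeline("tank/mail", "tuesday", "standby", "tank/mail", prev_snap="monday"))
```

Test this against a scratch dataset first; the receiving side has to already hold the earlier snapshot for an incremental stream to apply.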

Question 4

I'm trying to configure a Dell R710 for this. From what I learned, I shouldn't choose any RAID at all. Is that true? In that case, are the drives still hot swappable?

If you expose the drives as JBOD, then you can use ZFS's RAID-Z. They will be hot swappable.

We have a SAN from LSI and we did not use RAID-Z. Instead we relied on the hardware RAID6. There were cases when ZFS detected data corruption and I was able to tell which files were affected. The data was restored later by the hardware, but if we had used RAID-Z there would not have been any visible data corruption at the file level.

Question 5

If I need more space in the future, I would just hot swap some drives for larger-capacity ones to expand the storage. There is no problem with these, right?

It's a good question. This would be a problem if you do hardware RAID. On the other hand, ZFS should let you expand like that with RAID-Z. I never tried that. When expanding, we just add new shelves and create new zpools. Growing an existing zpool would be just as easy as adding a new one.
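One caveat when swapping in larger drives: a RAID-Z group (what this answer calls ZRAID) is limited by its smallest member, so the extra space only appears once every disk in the group has been replaced (and the pool's `autoexpand` property is on). The Python sketch below is a back-of-the-envelope estimate that ignores metadata and padding overhead; the raidz2 parity level and the 1200GB replacement size are assumptions, not something from the question.

```python
def raidz_usable_gb(disk_sizes_gb, parity):
    """Rough usable capacity of one RAID-Z group: (n - parity) * smallest disk."""
    if parity not in (1, 2, 3):
        raise ValueError("RAID-Z parity must be 1, 2 or 3")
    if len(disk_sizes_gb) <= parity:
        raise ValueError("need more disks than parity devices")
    return (len(disk_sizes_gb) - parity) * min(disk_sizes_gb)


original = [600] * 6                      # the 6 x 600GB drives from the question
half_swapped = [1200] * 3 + [600] * 3     # hypothetical: three drives replaced
all_swapped = [1200] * 6                  # hypothetical: every drive replaced

for layout in (original, half_swapped, all_swapped):
    print(layout, raidz_usable_gb(layout, parity=2))
```

With these numbers the half-swapped pool is no bigger than the original; the capacity only doubles after the sixth drive is replaced.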

Aleksandr Levchuk
2

Can't you run your mail servers in FreeBSD jails (see the ezjail port)? You can place the individual jails on ZFS filesystems and thus have all the snapshot features.

That said, what OS and software are your vbox guests supposed to run? Now that 8.2 is (almost) out, it comes with a lot of improvements in those departments. And for 9.0, there's lots more coming.

  • +1 In the generic case, Jails would greatly simplify a server like this, remove some of the virtualization overhead, and expose the ZFS file system for direct use by the 'guest' mail servers. – Chris S Feb 22 '11 at 03:19
1

In a nutshell, ZFS is just fine for a server running VirtualBox, but you really should not be putting the hard drives on the same machine as the virtual machines. ZFS can make use of all the RAM that you throw at it, which is hard to provide on a VM server. But on a specialised storage server you can set up ZFS right, leverage snapshots for backups and so on. Ideally, use iSCSI to communicate between VM servers and storage servers.

Added explanation to answer comment... Note that snapshots are not backups, but they can be used for making backups. In other words, shut down your db server software or similar, snapshot, and restart the software. Then start the backup using the snapshot as the source. Your downtime is only the time required to shut down and restart the mission-critical process. That is why snapshots are so useful.
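The sequence above can be sketched as an ordered command plan in Python. Nothing is executed here; the dataset `tank/mail`, its mountpoint, the `postfix` service and the archive path are all hypothetical examples. The sketch relies on ZFS exposing each snapshot read-only under `.zfs/snapshot/<name>` inside the dataset's mountpoint.

```python
def snapshot_backup_plan(dataset, mountpoint, snapshot, service, archive):
    """Return the commands, in order, for a short-downtime snapshot backup."""
    frozen_view = f"{mountpoint}/.zfs/snapshot/{snapshot}"
    return [
        ["service", service, "stop"],                   # downtime starts
        ["zfs", "snapshot", f"{dataset}@{snapshot}"],   # near-instant, consistent image
        ["service", service, "start"],                  # downtime ends
        ["tar", "-cf", archive, "-C", frozen_view, "."],  # slow copy reads the snapshot
    ]


plan = snapshot_backup_plan(
    dataset="tank/mail",
    mountpoint="/tank/mail",
    snapshot="nightly",
    service="postfix",
    archive="/backup/mail-nightly.tar",
)
for step in plan:
    print(" ".join(step))
```

The slow part (the `tar` run) happens after the service is back up, which is the whole point of backing up from the snapshot rather than the live files.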

Michael Dillon
  • 1
    Snapshots cannot be used for backups. With snapshots you can restore files that were accidentally deleted or modified, but you cannot use them for recovery if something happens to the storage, if you make a mistake with something like `zfs destroy -r ...`, or if you have data corruption. Backups should be done to different, cheaper storage. Probably the best way is `zfs send`. – Aleksandr Levchuk Feb 22 '11 at 02:53
0
  1. Not very much that I know of; the main overhead is that it needs enough RAM to work well.
  2. Snapshots are not made on each update, but only when you (manually, or with a script) create them; the space needed is the bare minimum because ZFS uses a copy-on-write approach (only the blocks of the old file that were changed in the new one are kept).
  3. I would indeed use FreeBSD and ZFS, but take care that the virtual machine has enough RAM assigned to it.

I'm not sure about 4 and 5.

lapo
  • The problem is, if you google "ZFS CPU overhead" there are indeed quite a few hits. Furthermore, there is a new ZFS feature that allows you to tune CPU utilization, see "ZFS System Processes and the New System Duty Cycle Scheduling Class" in http://constantin.glez.de/blog/2010/09/oracle-solaris-10-0910-zfs-highlights . That got me worried. – John Jan 28 '11 at 16:55
  • I'm still on ZFSv15 myself (ZFSv28 has not yet landed on FreeBSD -STABLE), and I didn't know about that. I'll try to read something about it: now I'm curious too. ;) – lapo Jan 28 '11 at 20:21
  • People who want ZFS are usually going to use it in a storage server scenario and if they can get an advantage from giving it a greater percentage of the CPU time, they will, because otherwise the CPU will be idle. – Michael Dillon Feb 22 '11 at 01:53