
I have a pair of 480 GB "Datacenter" SSDs (SAMSUNG MZ7LM480HMHQ-00005) that will make up a ZFS pool in a mirrored configuration. The pool's only content will be a ZFS volume (ZVOL) for a virtual machine. However, the ZVOL will only be 400 GB in size, leaving 80 GB of unused space on each SSD.

I'd like the SSDs to make the best use of their unused space, allowing them to use it to reduce wear on the flash memory. What would be the best way to achieve this?

  1. Simply use the whole SSDs for the pool?

  2. Create a 400 GB partition on each SSD and use these for the pool, leaving 80 GB unpartitioned?

  3. Try to change the SSDs' Host Protected Area (HPA) so that only 400 GB of capacity will be visible, and then use the "shrunk" SSDs for the pool (as in option #1)? (Rough commands for options 2 and 3 are sketched below.)
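
To make options 2 and 3 a bit more concrete, this is roughly what I have in mind; the device names and pool name are placeholders, and the HPA sector count would have to be recalculated for the actual drives:

# Option 2: partition 400 GB on each SSD and leave the rest unpartitioned
sgdisk --zap-all /dev/sda
sgdisk -n 1:0:+400G -t 1:bf01 /dev/sda
sgdisk --zap-all /dev/sdb
sgdisk -n 1:0:+400G -t 1:bf01 /dev/sdb
zpool create tank mirror /dev/sda1 /dev/sdb1

# Option 3: shrink the visible capacity via the HPA instead
# (400 GB = 781,250,000 512-byte sectors; the drive may need a power
# cycle before the new size takes effect)
hdparm -N p781250000 /dev/sda
hdparm -N p781250000 /dev/sdb
zpool create tank mirror /dev/sda /dev/sdb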

Or maybe there is no difference at all, and the SSD will automatically make good use of all of its flash cells?

If it matters: I'm assuming that the virtual machine will issue TRIM commands and that they will reach the host's ZFS and, ultimately, the SSDs themselves.

David Scherfgen

3 Answers


Drives are consumable.

I suggest just using them as intended.
RAID them. Monitor their health closely.
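
For monitoring, something along these lines is enough; the device name is an example, and the exact SMART attribute names vary by vendor:

# Overall health verdict plus the vendor-specific SMART attributes
smartctl -H /dev/sda
smartctl -A /dev/sda
# On Samsung datacenter drives, attributes such as Wear_Leveling_Count
# and Total_LBAs_Written are the ones to watch over time.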

Or maybe there is no difference at all, and the SSD will automatically make good use of all of its flash cells?

These datacenter SSDs are already over-provisioned; there is likely 512 GB of physical flash behind the 480 GB capacity you've listed. The hardware already accounts for wear-leveling.

ewwhite
  • Having disassembled 1.6 TB SanDisk enterprise SSDs to ensure that all chips are destroyed, I can confirm that those drives actually have a matrix of 32 x 64 GB chips for a total of 2 TB of actual capacity. That's a 20% over-provision. – Rowan Hawkins May 02 '20 at 20:46
  • It may still be beneficial to reduce the pool size beforehand (aside from wear leveling): since pools cannot be shrunk but can easily be expanded, it is better to start with 400 GB. This enables 400, 480 and 512 GB drives to be used as replacements and/or upgrades. If one were to start with 512 GB, the only possible replacements would be 512 GB or larger (more expensive). – user121391 Dec 01 '20 at 15:18

All three approaches will yield acceptable results. Personally, I use option 1, have autotrim=on on my pools, and my VMs are backed by zvols and use the paravirtualized SCSI driver, which understands trim/discard commands. My reasoning is that I will probably never fill the disks above 80% of capacity, but it is nice to be able to if ever necessary.
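
For reference, the relevant pieces of that setup look roughly like this (pool and zvol names are just examples):

# Let ZFS discard freed blocks automatically (OpenZFS 0.8 and later)
zpool set autotrim=on tank

# 400 GB zvol backing the VM
zfs create -V 400G tank/vm

# In the VM configuration, attach the zvol via the paravirtualized SCSI
# controller with discard enabled (e.g. discard='unmap' in libvirt) so
# guest TRIM commands actually reach ZFS.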

Options 2 and 3 will yield similar results, assuming you start with a freshly blanked (secure-erased or blkdiscard-ed) SSD.
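
Blanking them beforehand can be as simple as this (it discards everything on the device, so double-check the device names):

# Throw away every block on each SSD before partitioning or pool creation
blkdiscard /dev/sda
blkdiscard /dev/sdb
# An ATA secure erase via hdparm gets you to the same starting state.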

If you use autotrim=on and never fill the SSDs above 80% of capacity, there won't be any significant difference between the three approaches.

Gordan Bobić

Overprovisioning is achieved by any method that prevents all the pages in the drive from being used. Partitions are one approach used by other file systems to prevent all the pages from being used, but ZFS has a much less convoluted way. It's called a quota.

Quotas are traditionally used to prevent specific file systems from using all of a disk, maybe because there are multiple users, each issued a private file system, and each file system is put under a quota so one of the users can't hog all the space on the drive. But quotas are also suitable for over-provisioning. Here's an example of the command that sets a 20% over-provision on a pool named tank for a top-level file system called foo on a 1000 GB SSD. The single drive was added to the pool with the default syntax (basically almost all of the drive).

zfs set quota=800G tank/foo 

Yes, it's pretty simple. The syntax of ZFS that lets you (omg!) use simple suffixes like G means you don't have to do partition math. As a system administrator (even if this is your home NAS, desktop, laptop, etc.) you will have to remember not to change the quota. But that's not much different from remembering not to format a raw partition, or from altering the firmware so the drive under-reports its size.

The upside is that there isn't a loose raw partition beckoning to every utility out there to be provisioned and used. In addition, it is easy to change, so you can adjust it based on the way the drive is used. If a drive doesn't get a lot of writes that rewrite existing pages, over-provisioning is just wasted space, so the size of the reserved set of pages is best decided by how often existing pages get changed on the drive. This approach also works with ZFS pool layouts like mirror, plain striped, and raidz, because they fill a set of drives evenly.
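
Checking and adjusting the reserve later is just as simple (the names follow the example above):

# How much of the quota is actually in use?
zfs get used,quota tank/foo

# Loosen the reserve once the drive's write pattern is clear
zfs set quota=900G tank/foo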

  • Thank you for sharing this idea. Is it correct that the ZFS quota works on data size *after* compression? Because that's what we would want, right? – David Scherfgen May 09 '21 at 15:46