
Main references

ZFS L2ARC (Brendan Gregg) (2008-07-22) and ZFS and the Hybrid Storage Concept (Anatol Studler's Blog) (2008-11-11) include the following diagram:

[diagram: a ZFS pyramid view of ARC, L2ARC, ZIL and a disk storage pool]

Question

Should I interpret the vertical white line at the SSDs layer as a preference to use separate SSDs –

  • that is, a preference not to mix L2ARC and ZIL on a single disk?

Background (response to comments)

Personally, at home I'm unlikely to use either L2ARC or ZIL with any computer that's available to me. (My everyday computer is a MacBookPro5,2 with 8 GB memory and hybrid Seagate ST750LX003-1AC154. No plans to replace the optical drive with an SSD.)

Elsewhere: at work there'll be some repurposing of kit, but I don't have a date or full details. (Xserve RAID x2 in the mix … at this time I don't imagine giving those to ZFS, but I keep an open mind.)

My curiosity about SSD best practices for both L2ARC and ZIL began whilst following performance-related discussions in the ZEVO area – in particular the topic mentioned below, where a user has both L2ARC and ZIL on a single disk.

Other references and discussions

L2ARC Screenshots (Brendan Gregg) (2009-01-30)

SLOG Screenshots (Brendan Gregg) (2009-06-26)

[zfs-discuss] ZFS root backup/"disaster" recovery, and moving root pool (2011-01-10) recommends against a mixture of three things (root pool, ZIL and L2ARC) on a single disk –

… not worth the headaches that can occur when trying to manage all 3 on the same disk. For example, if you decide to reinstall and accidentally clobber the contents of the ZIL for your data pool. Don't share disks for pool components or across pools to keep management and recovery simple. …

– I'm more interested in whether it's recommended not to mix just two of those things (ZIL and L2ARC) on a single disk.

https://superuser.com/a/238744/84988 (2011-01-28) mentions "cache (L2ARC cache) and write log (ZIL) onto SSD" (singular). However, as it relates to FUSE and Windows, I don't treat that answer as particularly relevant to more commonplace and performance-minded uses of ZFS.

@ChrisS mentioned ZIL and L2ARC in The Comms Room on 2011-08-16.

http://forums.macrumors.com/showpost.php?p=14248388 (2012-01-31) discusses multiple SSDs:

Something you need to understand about ZFS: It has two different kinds of caching, read and write (L2ARC and ZIL), that are typically housed on SSDs. The ZIL is the write cache. That's probably where this misconception comes from. The ZIL is getting hammered (assuming an active system) with every write that occurs to the zpool. The problem is that using an MLC-based SSD as a ZIL causes it to wear out and fail quite quickly. You need a (much more expensive) SLC-based SSD to be used as a ZIL drive.

Having a zpool made up entirely of SSDs is not only possible, but it works quite well. It also basically eliminates the need for separate drives for the ZIL and L2ARC. Yes, you don't have TRIM support, but based on the copy-on-write nature of ZFS, that's probably a good thing.

With that said, ZFS does NOT play well with nearly full (say, 85% or higher) zpools. Performance begins to drop off significantly – regardless of whether you're using rotational magnetic media or solid state. Lack of TRIM support would probably exacerbate that problem, but it's already a problem.

https://serverfault.com/a/397431/91969 (2012-06-11) recommends:

  • SLC type SSD (specifically not MLC) for ZIL
  • MLC type SSD for L2ARC.

https://superuser.com/a/451145/84988 (2012-07-19) mentions a singular "SSD for ZIL and L2ARC to speed up ZFS".

zevo.getgreenbytes.com • View topic - Performance issue with FW800 connection order? (2012-09-24) is concerned with the order of things on a FireWire bus with a single SSD for ZIL and L2ARC –

  • bus order aside, that ZEVO topic started me wondering whether separate SSDs might be preferable.

More specifically: I wondered about interpretations of the white line in the diagram above …

Graham Perrin
  • This looks like it might be more of a Server Fault question. But something for you to consider is the read vs. write load on your storage pool. There's some research showing how SSD RAID in general can have drastically lower write performance than a single-drive configuration. http://www.xbitlabs.com/articles/storage/display/kigston-hyperx-ssd-raid0.html – Sep 24 '12 at 20:18
  • Those most likely to be intimately familiar with ZFS are more likely to be at Server Fault than Super User. Voting to move, but an excellent question. – afrazier Oct 07 '12 at 19:51
  • I see two current votes to close; instead, can we simply move the question? Thanks @afrazier – Graham Perrin Oct 07 '12 at 21:04
  • Welcome to Server Fault. As the FAQ states, we prefer _practical, answerable questions based on specific problems that you face_. That said, you've gone over a lot of theory and discussion here, but the thing that seems to be missing is the problem you're trying to solve. Add the practical details, and this has the makings of a great question. – Michael Hampton Oct 08 '12 at 00:14
  • Just to note, VTCs *are* a way to move questions. If the majority of VTCs are to move to a site, it will be moved. And yes, practical details, please: this looks *really* well written and detailed, but without knowing the situation you're in, it's hard to actually get a specific answer. You're obviously building a kickass ZFS setup, and details would be helpful in working out the answer. – Journeyman Geek Oct 08 '12 at 01:02
  • My current setup at home is far from kickass … maybe in the future. At work there's a bunch of kit that might be repurposed, but I don't have a date. Answers here (I like the first) help me to plan. – Graham Perrin Oct 08 '12 at 19:53

2 Answers


Short answer, since I don't see what problem you're looking to solve...

If you can, use separate devices. This depends on the scale of your environment... If it's just a simple home system, or a virtualized or all-in-one ZFS solution, you can use a single device.

In larger or high-performance ZFS solutions, I use devices suited specifically for their ZIL or L2ARC roles... E.g. STEC ZeusRAM or DDRDrive for ZIL and any enterprise SLC or MLC SAS SSD for L2ARC.

  • ZIL devices should be low-capacity, low-latency devices capable of high IOPS. They are typically mirrored.
  • L2ARC devices should be high-capacity (within reason: You need to add RAM as L2ARC size increases). They scale by striping.
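
For illustration, here is a minimal sketch of attaching separate devices to an existing pool; the pool name (tank) and device names are hypothetical, and device naming differs by platform:

    # attach a mirrored pair of small, low-latency SSDs as the dedicated log (SLOG)
    zpool add tank log mirror c4t0d0 c4t1d0

    # attach larger SSDs as L2ARC; cache devices stripe and cannot be mirrored
    zpool add tank cache c5t0d0 c5t1d0

    # confirm the resulting layout
    zpool status tank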

What are you doing?

ewwhite
  • I added some background to the question. This answer seems ideal – thanks – I'll leave things open for a few days before accepting. – Graham Perrin Oct 08 '12 at 19:55
  • For link purposes: a 2011 question from @ewwhite [ZFS - how to partition SSD for ZIL or L2ARC use?](http://serverfault.com/q/238675/91969) and within the [accepted answer](http://serverfault.com/a/272799/91969), "… Dedicated ZIL and L2ARC devices per pool is the way to go". – Graham Perrin Nov 03 '12 at 11:13

There are some fundamental misconceptions from the outset about ZIL which need correcting before continuing.

Understand this: Under "normal" circumstances, ZIL/SLOG is not touched.

It's only written to when synchronous writes are commanded, or if sync=always is enabled on a particular pool/dataset ("zfs get sync pool/dataset" shows the current setting).
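
For example (pool/dataset name hypothetical), the policy can be inspected and changed per dataset:

    # show whether sync semantics are standard, always or disabled for this dataset
    zfs get sync tank/data

    # force every write through the ZIL/SLOG path, or revert to the default behaviour
    zfs set sync=always tank/data
    zfs set sync=standard tank/data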

ZIL is never read from under normal circumstances. It's a disaster recovery feature.

I.e. the ZIL is only there for when the power goes off. It's used to replay data which had been acknowledged back to the OS before that data was committed to the pool. All ZFS writes to the pool (sync or async) are from memory buffers.

Under normal circumstances, once the data hits the pool, the SLOG entry is allowed to evaporate – it's just a big circular write buffer and it doesn't need to be very large (even 1 GB is overkill in most circumstances).

Non-synchronous writes are buffered in RAM, collated and written to disk at an opportune moment. If the power goes off, that data is lost but filesystem integrity is maintained (this is why you may want to set sync=always).
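
One way to check whether your own workload actually exercises a SLOG is to watch per-vdev activity while the system is under its normal load (pool name hypothetical):

    # per-vdev IOPS and bandwidth every 5 seconds; a dedicated log device shows up
    # as its own line, so sustained writes there indicate a genuinely sync-heavy workload
    zpool iostat -v tank 5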

On the other hand, the L2ARC is heavily hammered at both the read and the write level.

There's such a thing as "too much L2ARC", because the metadata for what's in L2ARC is held in your ARC RAM (i.e. if you boost the L2ARC size you must boost RAM to suit; failure to do so can result in severe performance degradation, and eventually L2ARC usage will level off at some level well below "all the available space").

Despite the protestations of some manufacturers, you cannot make up a memory shortfall by boosting the L2ARC size (several makers of hardware RAID arrays who've branched into ZFS appliances have made this assumption).
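
As a rough illustration of that RAM cost – the figures below are assumptions, not guarantees, since the per-record header size varies between ZFS releases and the average record size depends entirely on the workload:

    # ballpark: ARC memory needed just to index an L2ARC device
    # assumed: 400 GiB of L2ARC, 8 KiB average cached record, ~180 bytes of ARC header per record
    echo "$(( (400 * 1024 * 1024 * 1024 / (8 * 1024)) * 180 / (1024 * 1024) )) MiB of ARC headers"

With those assumptions the result is roughly 9000 MiB – around 9 GB of RAM consumed before the L2ARC caches a single useful block – which is why oversizing L2ARC on a small-memory machine backfires.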

tl;dr: If your IO load is database activity, then the ZIL is likely to be slammed hard. If it's anything else, it will probably only be lightly touched; it's highly likely that in 99.9% of activity the ZIL never kicks in.

Knowing that will allow you to decide whether you need a SLOG partition for the ZIL, whether it can cohabit with the L2ARC partition, or whether it needs a standalone drive (and what performance level that standalone drive should be).
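
If you decide a shared device is acceptable for your workload, the cohabiting layout amounts to two partitions on one SSD; the commands below are a sketch only, with hypothetical pool and partition names (on a sync-heavy system a mirrored, dedicated SLOG remains the safer choice):

    # one SSD split into two partitions: a small one for the SLOG, the remainder for L2ARC
    zpool add tank log /dev/disk2s1
    zpool add tank cache /dev/disk2s2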

stoat