
We received a quote for the following server setup:

  • Intel Xeon E5-2609V2
  • 2 x WD RE 2TB SATA III
  • 34 x WD RE 4TB SAS II
  • 32 GB memory (ECC)
  • LSI MegaRAID 9271-8i bulk, 8 x SATA III/SAS II internal hardware RAID
  • LSI CacheVault for 9266-9271 series

We want to add a JBOD to that server right away, half filled with 8TB drives, which we can extend later. They suggested:

  • LSI MegaRAID 9380-8e
  • 22 x HGST Ultrastar 8TB He8 enterprise, SAS III

Now, this was based on our previous server, which we set up as a ZFS server and did not get much "pleasure" from (although the configuration was probably to blame).

I have a few questions about this setup:

  • The argument for taking 2 x 2TB is to use them as a mirror for the system, since I/O is sluggish during a rebuild when a disk has to be replaced. Speed is not our real problem, space is, and we also have an online backup that will only be used as a read platform (during problems). Would 36 x 4TB be a better choice? (36 = 3 x 12 disks in a pool)
  • Is 32 GB of memory enough? (ZFS on Linux, taking into consideration the JBOD at max capacity: 44*8 + 32*4)
  • This is a RAID controller; would a JBOD/HBA (?) be a better choice? If so, what kind of JBOD should I be looking for?
  • How would I best set up this system to be "ready" for adding the next 22 disks to the JBOD? (It is a 44-disk JBOD; 22 slots are filled.) A rough sketch of what I have in mind follows this list.
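
To make that last question concrete, here is a rough sketch of the layout I have in mind; the pool name "tank" and the /dev/disk/by-id/ paths are placeholders, not our real device names:

    # 36 x 4TB as three 12-wide raidz2 vdevs (device names are placeholders)
    zpool create tank \
        raidz2 /dev/disk/by-id/wd-4tb-{01..12} \
        raidz2 /dev/disk/by-id/wd-4tb-{13..24} \
        raidz2 /dev/disk/by-id/wd-4tb-{25..36}

    # First half of the JBOD added as whole vdevs of a fixed width; the
    # remaining 22 slots could later be added the same way, since raidz
    # vdevs cannot be widened after creation
    zpool add tank \
        raidz2 /dev/disk/by-id/hgst-8tb-{01..11} \
        raidz2 /dev/disk/by-id/hgst-8tb-{12..22}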

Some more info based on the comments:

  • Uptime/availability: we don't care if it drops out for a few minutes, as long as this is not all too common. No HA needed. In essence this will be a hot backup for our current storage server. We mainly read, and writing is not speed-limited (by far).
  • Reading speed is important but we don't want to give up space for it
  • Write speed is not that important; mostly it is streams from machines while large files are copied there, so it can run overnight.
  • What kind of uptime/availability requirements do you have? What are your performance requirements? How much disk space are you willing to give up for redundancy? – Andrew Henle Jan 14 '16 at 15:10
  • 2
    Your 9380-8e is a SPOF. I would never sleep well with such a setup. I would go with 2 * SAS HBA and multipath your JBOD. The real question is, does you JBOD have 2 SAS Interface Modules? You definitely lack some RAM and will have some performance impacts from it. I would go with 1GB of RAM per TB of Disk Space (Only if you don't plan to use Dedup). Performance is done with ARC / L2ARC. – embedded Jan 14 '16 at 15:10
  • 1
    First I would recommend taking the same capacity disks, because otherwise your performance will be unstable. Could you tell more about your reading needs? If you read most of the time the same content (MFU), you could just use a PCIe SSD cache (saves you a disk slot ;) as L2ARC. In that case you could use raidz1/2 to gain more space out of your disks. – Jeroen Jan 14 '16 at 16:20
  • @embedded You're telling me we need ~160 GB of memory? That's a bit overkill, no? – SvennD Jan 15 '16 at 12:53
  • @SvennDhert It depends... The RAM is used as an adaptive replacement cache (ARC) by ZFS. Yes, there is the option to add SSDs as L2ARC, but SSDs typically have access times around 0.05 ms versus 50.00 ns for RAM. It's basically a question of how much performance you need. A general rule of thumb is: add as much RAM (only use ECC!) as you can afford, then add a few SSD L2ARC devices. What you get is a big read cache, i.e. a lot of cache hits after some burn-in time (yes, this depends on your usage scenario). A short sketch of this follows the comments. – embedded Jan 15 '16 at 13:25
  • 1
    See some recommendations at: https://www.reddit.com/r/zfs/comments/410gsk/zfs_100tb_hardware_setup_suggestions/ – ewwhite Jan 17 '16 at 18:00
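
As a concrete illustration of the ARC/L2ARC suggestions above, a minimal sketch for ZFS on Linux; the pool name, device path and the 24 GiB figure are placeholders, not recommendations:

    # Size/cap the ARC via the zfs_arc_max module parameter, value in bytes
    # (here 24 GiB as an example figure); takes effect at module load
    echo "options zfs zfs_arc_max=25769803776" >> /etc/modprobe.d/zfs.conf

    # Attach an SSD as L2ARC to an existing pool ("tank" and the device
    # path are placeholders)
    zpool add tank cache /dev/disk/by-id/nvme-example-ssd

    # Watch ARC/L2ARC hit rates to judge whether more RAM or cache helps
    arcstat 1 10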

2 Answers


I would work with a ZFS professional or vendor who specializes in ZFS-based solutions. You're talking about 100TB of data, and at that scale, there's too much opportunity to screw this up.

ZFS is not an easy thing to get right, especially when you incorporate high availability and design for resilience.

I wouldn't plan on half-filling storage enclosures or anything like that. Expanding ZFS arrays is not something you can do easily with RAIDZ1/2/3, and expanding ZFS mirrors can leave you with unbalanced data.
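
To illustrate the point: a pool only grows by whole vdevs, and data written before the expansion stays on the old vdevs until it is rewritten, so a per-vdev view shows the imbalance. A minimal sketch, with pool name and device paths as placeholders:

    # Per-vdev capacity/allocation; after an expansion the old vdevs show
    # up nearly full while the newly added vdev is nearly empty
    zpool list -v tank

    # The kind of expansion that produces that situation:
    zpool add tank mirror /dev/disk/by-id/new-disk-a /dev/disk/by-id/new-disk-b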

  • So generally you are saying forget ZFS and run straight RAID sets? Thanks for your advice. – SvennD Jan 14 '16 at 19:49
  • 3
    No, I love ZFS... It just takes some intelligent design to make it work well. – ewwhite Jan 14 '16 at 20:48
  • That's why I asked for advice here; I know some professionals are around, including you. However, we prefer to train people rather than just buy "solutions" (as in a black box with LEDs). – SvennD Jan 15 '16 at 09:11

I am not sure I would use ZFS on Linux for such a setup, as ZoL remains somewhat of a "moving target".

Regarding your RAID card: if it can be configured in JBOD mode, there is no problem. However, if it only works in RAID mode, I would replace it with a JBOD/HBA adapter.
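
If you want to check, something along these lines should show whether the controller exposes a JBOD/pass-through mode. This is a sketch using LSI's storcli utility, assuming it is installed; the controller index and enclosure/slot values are placeholders:

    # Does the firmware expose a JBOD mode? (/c0 = first controller)
    storcli /c0 show all | grep -i jbod

    # If so, enable it controller-wide...
    storcli /c0 set jbod=on

    # ...or flag individual drives as JBOD (enclosure:slot is a placeholder)
    storcli /c0/e252/s0 set jbod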

Anyway, as suggested by ewwhite, I would consult a professional ZFS vendor/consultant.

  • Thanks for your advice. I believe this RAID card can't run in JBOD. A quick Google search did not come up with ZFS experts, and the company selling us the server is in fact selling software that does ZFS; it just makes things much more expensive through useless licenses. If I can't learn it, it will be straight RAID sets. – SvennD Jan 14 '16 at 19:51
  • In theory, you can turn your RAID card into something *similar* to a JBOD/HBA card by using multiple RAID0 sets, *one for each disk*, and disabling any controller-level cache. While I would not recommend this setup, it can be done if really necessary. – shodanshok Jan 14 '16 at 21:51
  • @SvennDhert Who is selling you the solution? – ewwhite Jan 15 '16 at 00:10
  • @ewwhite http://www.ahead-it.eu They have been our server vendor for a while, and we are rather happy with them, although I'm not sure we are getting the best solution right now. @shodanshok I know; we have the same RAID controller and ZFS setup, and we want a "live" backup of this setup so we can recreate it. RAID0 on every disk means a disk failure = RAID controller lockup (i.e. system down until I get there). That's way worse than any RAID 6 array setup. – SvennD Jan 15 '16 at 09:07
  • @SvennDhert I was not talking about a single RAID0 spanning all your disks. Rather, I was referring to a 1:1 mapping between RAID0 arrays and disks. For example, having 24 disks, you would create 24 RAID0 sets, each spanning a single, different disk (see the sketch after these comments). This is a workaround for non-JBOD/HBA RAID cards which insist on creating RAID arrays. Let me stress again that it is not best practice, only a (dirty) workaround. – shodanshok Jan 15 '16 at 11:15
  • @shodanshok So was I; it is currently set up as 34 single-disk RAID0 sets + 2 disks in RAID1 for the system. If one disk with RAID0 on it fails, the system hangs completely (no data loss of course, since ZFS has raidz2 on top). So please don't advise or mention it, because it is not a valid solution for running ZFS on this controller. – SvennD Jan 15 '16 at 11:34
  • @SvennDhert I mentioned it for reference only. However, it is very strange to have a complete system lockup from a failing array. Are you sure everything is working properly? – shodanshok Jan 15 '16 at 12:17
  • The RAID controller got replaced after the last one failed on something similar (the cache would not clear after one RAID0 vdisk had died). So we got a replacement with the latest firmware, tested by our vendor, so I do think this is "default" behavior. Hence I am not a fan of RAID0-per-disk or of this RAID controller. – SvennD Jan 15 '16 at 12:51
  • @shodanshok A failed single-disk hardware RAID0 when used with ZFS is very problematic: disk replacement becomes difficult. – ewwhite Jan 15 '16 at 13:06
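
For completeness, a rough sketch of the single-disk RAID0 workaround discussed above, using LSI's storcli utility; the controller index and enclosure:slot values are placeholders, and as stressed in the comments this is a dirty workaround, not a recommendation:

    # List enclosures/slots first to get the real EID:Slt identifiers
    storcli /c0 show

    # One RAID0 virtual drive per physical disk, with controller caching
    # disabled: write-through, no read-ahead, direct I/O
    storcli /c0 add vd type=raid0 drives=252:0 wt nora direct
    storcli /c0 add vd type=raid0 drives=252:1 wt nora direct
    # ...repeat for each remaining slot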