3

I'm looking for a higher-performance build for our 1RU Dell R320 servers, in terms of IOPS.

Right now I'm fairly settled on:

  • 4 x 600 GB 3.5" 15K RPM SAS
  • RAID 1+0 array

This should give good performance, but if possible I'd also like to add an SSD cache into the mix. I'm not sure there's enough room, though.

According to the tech-specs, there are at most 4 x 3.5" drive bays available in total.

Is there any way to fit at least a single SSD alongside the 4x3.5" drives? I was hoping there's a special spot to put the cache SSD (though from memory, I doubt there'd be room). Or am I right in thinking that the cache drives are simply drives plugged in "normally" just like any other drive, but nominated as CacheCade drives in the PERC controller?

Are there any options for having the 4x600GB RAID 10 array, and the SSD cache drive, too?

Based on the tech-specs (which list a chassis option with up to 8x2.5" drives), maybe I need to use 2.5" SAS drives instead, leaving another four bays spare and plenty of room for the SSD cache drive.

Has anyone achieved this using 3.5" drives, somehow?

Edit: Further Info on Requirements

To give a picture of the uses/requirements of this hardware:

  • We currently have several "internal" VMs running on a VMware ESXi 5.x host. We only have a handful of hosts at the moment; it's a basic setup.

  • We recently started rolling out Dell R320s as our standard hardware for shared ESXi hosts. I'd prefer to stick with the R320s to keep our hardware (and hence our need for spares, upgrades, monitoring support, etc.) as standardised as possible. Having to keep a different set of drives as spares is better than having to keep entire spare chassis on top of what we already have.

  • These VMs are primarily responsible for either our internal tools (such as monitoring, call accounting, and intranet websites) or shared infrastructure such as DNS, SMTP relay/filtering, a small set of shared websites, and a shared VoIP PBX.

  • Each of these roles is separated out into relatively small VMs as needed, almost all of which are Linux boxes. Some of these do have database loads, but I would consider them very small (enough that I've been OK with putting individual MySQL instances on each VM for isolation, where appropriate).

  • Generally, the performance is fine. The main catalyst for this new hardware is, ironically, the SMTP relay. Whenever we get hit by a decent-sized mail-out from a customer, it causes a major backlog in our filters. I've confirmed that this is due to disk IO (see the sketch after this list for the kind of check I mean).

  • Given the nature of the other VMs running on this host, despite the obvious disk IO contention, no real impact is noticed aside from the mail backlog -- VoIP is primarily all in memory; all the internal sites are so low in traffic that page loads are still reasonable; and we've had no reports of issues on the customer-facing VMs on this particular host.
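
To illustrate the kind of check I mean (a rough sketch only; exact commands depend on the MTA and the VM's disks):

    # On the SMTP relay VM during a mail-out burst:
    iostat -dxm 5
    # Sustained %util near 100 and climbing await on the queue's disk,
    # while CPU and memory stay comfortable, is what points at disk IO.
    mailq | tail -n 1    # watch the mail queue depth grow at the same time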

The Goals of this hardware

Shamefully, I really don't have any solid numbers in terms of the kind of IOPS I want to achieve. I feel it would be difficult to test, given the varying nature of the VMs I want to put on there; it's not as if I have a single application I can benchmark against for a guaranteed target.

I suppose my best bet would be to set up a test with the worst offenders (e.g. the DB-backed websites and the SMTP relays) and simulate some high load. This may be something I do in the coming week.
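
If I do, something along these lines is roughly what I have in mind; just a fio sketch (the target directory, sizes and job parameters are placeholders, not tuned values):

    # Simulate a burst of small random writes, roughly mail-queue-like,
    # from inside a test VM on the new array.
    fio --name=mailburst \
        --directory=/var/spool/test \
        --rw=randwrite --bs=4k --direct=1 \
        --ioengine=libaio --iodepth=16 --numjobs=4 \
        --size=1G --runtime=120 --time_based \
        --group_reporting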

Frankly, my motivation is simply that disk IO is pretty much always the bottleneck, so I'd prefer our infrastructure to have as much IO headroom as we can reasonably afford.

I'll try to give you a rough idea of the goals in any case:

  • To reasonably survive the performance hit during a large customer-initiated mail-out (which they're not supposed to do!). Of course, I understand this becomes "how long is a piece of string?", as you can't really predict how large any given mail-out might be. But basically, I know it's a disk IO issue, so I'm trying to throw some extra IOPS at this particular set of hosts to handle a sudden burst of mail.

  • My thinking is that a large burst of small emails would typically be mostly random IO, which would seem best suited to SSDs. Though I'm sure we could do fine in the foreseeable future without them.

  • As stated previously, the above is really the catalyst for this. I realise I could put the SMTP relays onto their own physical hardware and basically be done with it, but I'm aiming for a more general solution that gives all of our internal VMs IO headroom if they need it.

  • To isolate a set of internal VMs from some customer-facing VMs that are currently on the same host, to avoid performance issues from resource spikes on said customer VMs.

  • My plan is to have at least two hosts (for now) with the same VMs, and configure active/passive redundancy for each pair (we won't be using vCenter, but rather application-specific failover; see the sketch after this list).

  • I will potentially deploy more VMs onto this host in future. One thing I'm looking towards is a pair of shared MySQL and MS SQL VMs. If I were to do this, I'd definitely be looking at SSDs, so that we can have a central pair of DB servers that are redundant and high-performance. But this is further down the road, and I'd likely have dedicated hardware for each node in that case.
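
For the application-specific failover mentioned above, the sort of thing I have in mind is a simple VRRP floating IP per service pair; a purely illustrative keepalived sketch (the interface, router ID and addresses are placeholders):

    vrrp_instance SMTP_RELAY {
        state MASTER              # BACKUP on the passive node
        interface eth0            # placeholder interface name
        virtual_router_id 51      # placeholder ID, must match on both nodes
        priority 100              # set lower on the passive node
        advert_int 1
        virtual_ipaddress {
            192.0.2.25/24         # placeholder service IP the MX record would point at
        }
    }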

Geekman
  • 451
  • 1
  • 10
  • 21
  • If you already have existing systems (e.g. R320 servers with 4x3.5" backplanes), have you considered adding a SAN (PowerVaults or EqualLogic, in Dell-land) and its respective HBAs (I'd go SAS or iSCSI) instead of changing the servers' internal storage? – Luke404 Oct 21 '13 at 16:34
  • @Luke404 I've considered it in the past, and we're currently looking into a SAN of some kind for additional storage space (vs. a server with 24TB using iSCSI on it). However, I must say I've never really warmed to the idea of using a SAN completely in place of local storage. I know it's quite popular, particularly in our situation with things like ESXi hosts, allowing you to use vCenter migrations etc. – Geekman Oct 22 '13 at 08:58
  • @Luke404 It just seems to me like you're adding another single point of failure (though I know if you buy the right equipment, e.g. an HP Fibre Channel SAN, there's lots of redundancy per chassis). I'd really always avoid it unless I had the budget for a redundant SAN -- and unfortunately I'd struggle to get the budget to buy one. – Geekman Oct 22 '13 at 09:00
  • If you add a SAN to a server then yes, it could be considered more prone to failure than adding local storage. If you add a SAN to a couple servers I don't think it could be worse than adding local storage to each and every server (if both cases are well engineered, of course) – Luke404 Oct 22 '13 at 15:47

4 Answers

6

The Dell PowerEdge R320 is a lower-end 1U rackmount server. Your storage options within that chassis are either 8 x 2.5" small-form-factor disks or 4 x 3.5" large-form-factor drives. Due to the price point of this server, it's commonly spec'd in the 4 x 3.5" disk configuration...

Sidenote: one of the things that's happened in server storage recently is the reversal of internal disk roles.

  • For enterprise storage, 3.5" SAS disks are generally available in 300GB, 450GB and 600GB capacities and 15,000 RPM rotational speeds. They've stagnated at the 600GB capacity for several years.
  • Enterprise SAS disks in the 2.5" form-factor are available in 146GB, 300GB, 600GB, 900GB, 1200GB capacities and 10,000 RPM speeds. The 146GB and 300GB sizes are also available in 15,000 RPM.
  • For nearline SAS storage at 7,200 RPM, the 3.5" disks can reach up to 4TB in capacity, while 2.5" disks have maxed-out at 1TB.

So the combinations above influence server design (or vice-versa). The Dell R320 is usually configured with the larger 3.5" drives, since the platform isn't typically used for I/O-intensive applications or where more expandability is required. Higher-end Dells (and HP ProLiants) are typically configured with small-form-factor 2.5" disks. This is to support more I/O capability within the chassis.

For your situation:

  • You will not be able to fit an SSD into the chassis of a Dell R320 with 3.5" disks. There's no room inside the chassis, and the RAID controller you're depending on for CacheCade requires that the disks be connected to the same drive backplane. CacheCade will not be able to leverage a PCIe SSD device.
  • The R320 is also available in an 8 x 2.5" disk configuration. If you wish to use CacheCade, that combination makes more sense. There are some design considerations, though. See below:

[Image: design considerations for the 8 x 2.5" CacheCade configuration]

For CacheCade and your IOPS goals, have you measured your existing IOPS needs? Do you understand your applications' I/O requirements and read/write patterns? Ideally, your design should be able to support a non-cached workload on the spinning disks. So if you need 6 disks to get the IOPS you need, you should spec 6 disks. If 4 disks can support it, then you're fine. And since this approach requires the 2.5" disks anyway, you have more flexibility to tune for the application.
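
If you haven't measured yet, one option is to capture the host's current I/O profile with esxtop in batch mode and review it offline; a rough sketch (the interval and sample count are arbitrary):

    # Capture 30 samples at 10-second intervals for offline analysis
    esxtop -b -d 10 -n 30 > esxtop-capture.csv
    # Interactively, 'v' shows per-VM disk stats and 'u' per-device stats;
    # CMDS/s, READS/s and WRITES/s give the IOPS, DAVG/GAVG show latency.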

Also see: How effective is LSI CacheCade SSD storage tiering?

ewwhite
  • 194,921
  • 91
  • 434
  • 799
  • Appreciate the details, as well as succinctly answering my core question. I hate to admit it, but I must confess that I'm pretty much trying to just see how many IOPS I can throw into the R320 chassis at a reasonable price. I don't necessarily have any particular requirements in terms of IOPS, and I think I'd rather stick to standard hardware over pushing for more IOPS at the moment. – Geekman Oct 22 '13 at 09:08
4

I don't believe there's space for what you want if you wish to use 3.5" disks. Have you considered using 2.5" disks instead? If you did, you could easily fit what you want into the machine, and generally a 2.5" 10krpm disk will perform roughly on par with a 3.5" 15krpm disk, especially when fronted by a nice bit of SSD cache as you intend. I use this approach on HP kit (including their SmartCache, which is the same thing) and I'm very happy with it.

Chopper3
  • 100,240
  • 9
  • 106
  • 238
  • Thanks for this. Particularly appreciate the comment about 2.5" 10K being roughly equivalent to 15K 3.5", didn't think of that. Will wait and see if anyone else has bright ideas, but I suspect your conclusions are correct. – Geekman Oct 21 '13 at 08:32
  • 1
    I'd also add that while 2.5" 15k disks are definitely faster than 2.5" 10k ones, the difference is usually smaller than the price difference. Where possible, I personally prefer a greater number of 10k disks over a smaller number of 15k ones at the same price. – Luke404 Oct 22 '13 at 15:49
1

Other things I'd do:

  1. Use six disks, so there are always three of them in each RAID 1 set.

    This has the advantage that you do not immediately lose redundancy when one disk fails (with a two-way mirror, you'd be betting that the surviving disk has not a single bad sector), and, if your controller supports it, you can run the periodic consistency check on two disks and keep the third in regular operation so the I/O rate doesn't drop too significantly. (A software-RAID sketch of the idea follows below.)

  2. If it's a database load, use a dedicated disk or SSD for the indexes.

    This is by far the greatest speed boost you can have -- provided you have a DBA who knows how to use this.
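
To illustrate the three-way mirror from point 1, here's a sketch using Linux software RAID (purely illustrative, in case the hardware controller can't do triple mirrors; device names are placeholders):

    # Three-way RAID 1 mirror with md (placeholder devices)
    mdadm --create /dev/md0 --level=1 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd
    # Trigger a consistency check on the array when convenient:
    echo check > /sys/block/md0/md/sync_action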

Simon Richter
  • 3,209
  • 17
  • 17
  • It's a low-end server, though. Not likely to be used for DB workloads (single CPU, E5-2400-series). The controller doesn't support triple-mirrors. – ewwhite Oct 21 '13 at 13:48
1

As stated before, a proper SSD cache setup means using the 8x2.5" disk backplane. Your post implies that the Dell R320 server(s) are already in use. If so:

Are you ready to upgrade the R320 backplane? Is an SSD cache really needed here?

An SSD cache is useful mainly for random read/write performance.

With RAID 10 on 15K SAS and the hardware controller's NVRAM write-back cache, you already have good random write performance. And even if CacheCade can deliver the SSDs' full IOPS, what about protecting the data from an SSD failure?

For random reads, you could consider just a memory upgrade (the R320 supports up to 192GB).

Or another solution: just use RAID10 on SSDs.

Veniamin
  • 853
  • 6
  • 11
  • +1 for mentioning need for a 2.5" backplane -- yet another component we'd need to be prepared to keep spares of. – Geekman Oct 22 '13 at 09:25
  • 1
    If you have more than a single server to keep running, it's usually easier and cheaper to just buy and deploy N+1 servers than keeping spares for every part. Incidentally, this works better if you have a shared SAN (SAS, iSCSI, FC, whatever) to easily move workloads around. – Luke404 Oct 22 '13 at 15:51
  • @Luke404 I can definitely agree with this idea. Unfortunately, we have a very mixed environment (we don't host a particular app, but rather do managed dedicated/VPS hosting). So while we definitely have N+1 for all our shared/critical infrastructure, providing that redundancy for customers who are only paying for a single dedicated server or VPS is not particularly feasible. – Geekman Oct 22 '13 at 23:24
  • @Luke404 Very interested in the notion of a SAN being cheaper than spares. Going through the numbers in my head, it provokes a few thoughts. Might have to investigate this! Thanks. – Geekman Oct 22 '13 at 23:30
  • I never talked about a SAN being cheaper than spares. I said N+1 servers are usually cheaper than N servers plus spares for every part, down to backplanes and such. At work we do something similar to you: we have a mix of cheap Dell systems (old PE1425, R300, R320, etc.) all with maxed-out RAM and all running off an MD3220i iSCSI array; workloads are virtualized and we can easily move stuff off an old server to a new one if we want to replace it. – Luke404 Oct 23 '13 at 06:50
  • @Geekman If you currently run VPS infrastructure, you will likely use a SAN in the (near) future; it brings great benefits to a virtual environment. Thus, enterprise-class SSDs can be the best choice to buy now, since they can be reused in a dedicated storage system later. – Veniamin Oct 23 '13 at 15:12
  • @Luke404 Sorry Luke, got mixed up. Clearly your points N+1 vs N, and using a SAN were separate. – Geekman Oct 24 '13 at 02:23