4

I work for a small state college. We currently have 4 ESXi hosts (all made by Dell), 2 EqualLogic SANs (PS4000 and PS4100) and a bunch of old HP Procurve switches. The current setup is far from redundant or fast, so we want to improve it. I have read several threads, but they only left me more confused.


The Procurve Switches are 2824. I know they don't support Jumbo Frames and Flow Control at the same time, but we have plans to upgrade to something like Procurve 3500yl. Any suggestions? I heard Dell Powerconnects 6xxx are pretty good but I'm not sure how they compare to HPs.

There will be a 4-port Etherchannel (Link Aggregation) between the switches, and all control modules on SAN will be connected to different switches.

Is there anything that will make the setup better? Are there better switches than the Procurve 3500yl that cost less than 5k? What kind of bandwidth can I expect between the ESXi hosts (they will also be connected to the 2824s with multiple cables) and the SANs?

Basil
Sergey
  • I wasn't aware that those switches can't do JF and flow-control at the same time, if I were you I'd stick to using JF as that'll benefit you much more than flow-control. – Chopper3 Sep 25 '12 at 14:57
  • Looks pretty good as is... Neither JF nor Flow Control should have *that much* effect on your setup; I'd go with JF as Chopper said. I wouldn't upgrade the switches until you know there's a problem with your current setup. – Chris S Sep 25 '12 at 15:08
  • Jumbo frames don't help much in reality (3% or less). Actual benchmarks show this. Jumbos are not worth the configuration pain and risk. See http://www.boche.net/blog/index.php/2011/01/24/jumbo-frames-comparison-testing-with-ip-storage-and-vmotion/ – rmalayter Sep 25 '12 at 17:04
  • Two notes: we set up an iSCSI config using the MD series boxes from Dell. 1) Contact Dell; they have whitepapers from their R&D group that have designs and setup steps for high-load environments like VMs and DBs. 2) The big thing performance-wise for us was making sure we had our MPIO drivers set up and connected properly; that helped a TON – Brent Pabst Sep 26 '12 at 17:34
  • If you are concerned about support, it may be helpful to know that Dell stopped testing the 3500yl-48G switches at EQL firmware 4.1 and switch firmware K12.12 (EQL firmware is currently up to 6.0 and the 3500yl is at K15.x). They will still support the 3500yl; however, they will downgrade your support to Level 3. You can read more about it [here](http://en.community.dell.com/cfs-file.ashx/__key/communityserver-components-postattachments/00-19-85-68-62/EQL-Compatibility-Matrix_2D00_092112.pdf) and view other switches that are supported by Dell. –  Sep 26 '12 at 17:20

2 Answers

2

On the choice between jumbo frames and flow control: if you have to pick one, remember that flow control only benefits you when you are saturating an Ethernet link in the data path. Without it, iSCSI traffic's native flow control is dropped packets, and unfortunately the underlying SCSI stack doesn't handle that well: it results in multi-second read latency. So while jumbo frames will always benefit you, flow control will benefit you more when you're pushing your storage to its limit.
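To put a rough number on the jumbo-frame side of that trade-off, here is a minimal sketch of the per-frame header overhead at the two MTUs. It assumes plain IPv4 and TCP headers with no options and no VLAN tag, and it ignores iSCSI PDU headers and any CPU savings, so treat the result as a ceiling on the bandwidth gain rather than a benchmark. The function and constant names are made up for illustration.

```python
# Per-frame overhead on the wire, in bytes (standard Ethernet values).
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble+SFD, Ethernet header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20         # IPv4 + TCP headers, no options (assumption)


def wire_efficiency(mtu: int) -> float:
    """Fraction of the line rate left for TCP payload at a given MTU."""
    payload = mtu - IP_TCP_HEADERS   # bytes of application data per frame
    on_wire = mtu + ETH_OVERHEAD     # bytes the frame actually occupies on the link
    return payload / on_wire


std = wire_efficiency(1500)
jumbo = wire_efficiency(9000)
print(f"MTU 1500: {std:.1%} of line rate is payload")
print(f"MTU 9000: {jumbo:.1%} of line rate is payload")
print(f"Best-case gain from jumbo frames: {jumbo / std - 1:.1%}")
```

Under these assumptions the gain is a few percent at most, and it only shows up when the link is already close to saturated.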

Basil
1

You've made it about as redundant as you can given the current hardware on hand. Some thoughts:

Of course make sure that each ESXi host is connected to both switches.

  • You need to use "per port load balancing" on the ESXi side and "adaptive load balancing" or whatever they call it on the Equallogic side if you want to handle redundancy at the Ethernet layer. You cannot use LACP or any other form of channel-bonding, as the switches are totally independent and do not support MLAG.

  • If you do not configure iSCSI multi-pathing on both the ESXi and EqualLogic sides, you will be limited to 1 Gbps of throughput to each ESXi host. Using network-layer redundancy is simple to configure, but it comes at a price.

  • Make sure you have rapid spanning tree enabled, with one switch configured as root primary and the other as root secondary. Enable BPDU guard or similar on all ports except the trunk between the switches to avoid meltdowns.
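As a back-of-the-envelope illustration of the multi-pathing point above, the sketch below estimates the throughput ceiling for one host. It assumes 1 GbE links throughout, that MPIO spreads I/O evenly over all paths, and a payload efficiency of about 95% after Ethernet/IP/TCP headers. The function name and the port counts in the example calls are hypothetical; substitute the real numbers for your hosts and arrays.

```python
LINK_GBPS = 1.0  # every host NIC and array port is assumed to be 1 GbE


def max_throughput_mb_s(host_paths: int, array_ports: int,
                        efficiency: float = 0.95) -> float:
    """Rough upper bound on iSCSI throughput for one host, in MB/s."""
    # The narrower side (host NICs or active array ports) is the bottleneck.
    usable_links = min(host_paths, array_ports)
    # 1 Gbps = 1000 Mbit/s = 125 MB/s before protocol overhead.
    return usable_links * LINK_GBPS * 1000 / 8 * efficiency


# Without multi-pathing a host effectively uses a single path.
print(f"1 path : {max_throughput_mb_s(1, 2):.0f} MB/s")
# With MPIO over two host NICs and two active array ports (illustrative values).
print(f"2 paths: {max_throughput_mb_s(2, 2):.0f} MB/s")
```

Real numbers will be lower once disk performance and contention between several hosts sharing the same array ports come into play.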

rmalayter
  • LACP breaks MPIO, don't do it in any way, shape or form. Let MPIO do its job. Spanning tree is not necessary at all in the above setup, though it should be enabled in case he gets the setup wrong. – Chris S Sep 25 '12 at 15:09
  • @ChrisS exactly. Although LACP is now there in ESXi 5.1, and as far as I have read there is nothing preventing you from using an LACP bundle as part of an MPIO setup. But it would be a fairly stupid and pointless configuration. As far as spanning tree is concerned, it is necessary to prevent meltdowns due to cabling errors/problems, and to prevent someone from bridging two interfaces inside a VM (bridging two interfaces is way too easy to configure on both Linux and Windows Server, even though that's not what an admin actually wants to do 99.9% of the time). – rmalayter Sep 25 '12 at 17:01
  • rmalayter, do we need RSTP even though the two switches are completely isolated from the rest of the network? I'm not concerned about bridging two interfaces as only IT support have admin access to all machines. – Sergey Sep 25 '12 at 19:36
  • You should have some form of spanning tree enabled (Rapid Spanning Tree or Multiple Spanning Tree is now default on most switches), if only to protect against cabling errors that result in a loop. If you've never experienced a layer-2 loop in your career, congratulations, but you likely will some day and I assure you they are not fun. – rmalayter Sep 26 '12 at 13:45