6

We have several Dell 6248 switch stacks (three switches each) that form the backbone of our iSCSI storage network. We need to perform firmware updates on the switch stacks, but are concerned about the downtime required.

For background, our storage is exclusively Dell/Equallogic PS6000 series enclosures with 3 or 4 GigE uplinks per enclosure.

As you might already know, these switches can't be upgraded a member at a time, and the reboot required to upgrade the switches is on the order of two minutes (i.e. longer than the iSCSI initiator timeout for a volume).

Does anyone have any suggestions for how we might be able to accomplish an iSCSI SAN switch stack upgrade while minimizing downtime?

Thanks for any help or suggestions.

Joe

Ben Pilbrow
Guamaniac
  • 4
    So you have a non-redundant SAN switch stack? Sounds like problems waiting to happen; perhaps this is a good opportunity to remedy that. – Chris S Oct 28 '10 at 21:34

3 Answers

5

If your core iSCSI network has been set up correctly for Equallogic, you should have two separate stacks with normal ISLs connecting them, and every array and host should have at least one connection to each stack. If that is the case, the simplest and lowest-impact approach is to follow the standard Dell firmware update procedure for stacked PowerConnects, accepting its 2+ minutes per switch. You shouldn't experience any actual downtime if the cabling has been done properly, but performance will be significantly degraded, so you should only do this when everything is quiet. I'd double-check that all of the connections are OK first, though, because you will be relying on many single links to keep things alive while the upgrade happens.
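As a rough illustration of that pre-upgrade check, here is a minimal Python sketch that takes a cabling inventory and reports which devices would be left with no link while a given stack reboots. Everything in it is an assumption for illustration: the device names, the stack names (`stack-a`, `stack-b`), and the idea of keeping the inventory as a simple dictionary. It does not query the switches or the arrays; you would fill in the map from your own port documentation.

```python
from typing import Dict, List

# Hypothetical cabling inventory: device -> the stack each of its iSCSI ports
# is cabled to. Names and layout are made up for this example.
CABLING: Dict[str, List[str]] = {
    "eql-array-1": ["stack-a", "stack-a", "stack-b"],
    "eql-array-2": ["stack-a", "stack-b", "stack-b", "stack-a"],
    "host-1": ["stack-a", "stack-b"],
    "host-2": ["stack-a", "stack-a"],  # deliberately miscabled: no link to stack-b
}


def surviving_links(cabling: Dict[str, List[str]], stack_down: str) -> Dict[str, int]:
    """Count the links each device keeps while `stack_down` is rebooting."""
    return {
        device: sum(1 for stack in links if stack != stack_down)
        for device, links in cabling.items()
    }


def devices_at_risk(cabling: Dict[str, List[str]], stack_down: str) -> List[str]:
    """Return the devices that would have no link left while `stack_down` reboots."""
    remaining = surviving_links(cabling, stack_down)
    return sorted(device for device, count in remaining.items() if count == 0)


if __name__ == "__main__":
    for stack in ("stack-a", "stack-b"):
        at_risk = devices_at_risk(CABLING, stack)
        status = ", ".join(at_risk) if at_risk else "none"
        print(f"Rebooting {stack}: devices with no remaining path: {status}")
```

Any device the script lists for a stack needs recabling before that stack is upgraded. Devices that survive on a single link are the ones to watch for the performance hit.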

Breaking PowerConnect switches out of the stack and upgrading them individually might be possible, but you would have to go through a very intricate process to ensure that each switch upgrade happens in isolation, and you would have to be very careful reconnecting the upgraded switches because they can't be stacked until all units are at the same version. You may also have to recreate the configs for most of the switches if you take this route. In addition, all active switches need reasonably high-bandwidth connectivity to both stacks when you bring them online; that is an Equallogic requirement that seriously complicates this sort of exercise. If one switch appears active as far as the arrays are concerned but is isolated from either stack, then at best you will have serious performance problems and at worst all volumes hosted on the arrays connected to that switch may go offline. To be honest, I really wouldn't want to do it that way: there are far too many points at which it could go wrong.

Helvick
  • Thanks for all the details, Helvick. This was what I thought we would have to do as well. Also, yes, it has become painfully apparent that the switch stack itself is a single point of failure. We are probably going down the route of having a second switch stack with an ISL configured and just bop the uplinks from the EQL enclosures over to the other stack. Thanks for all the insight, all! – Guamaniac Oct 30 '10 at 16:43
2

Can you break the stack to do the upgrade?

That's what I would do, or find some way to round robin things by using spare switches to build a new stack and then leapfrog the connections over to it.

SpacemanSpiff
  • I couldn't make a comment about Chris' "avatar" without mentioning yours. Calvin and Hobbes is one of my favorite cartoons. Carry on, Spaceman Spiff... – joeqwerty Oct 28 '10 at 21:22
2

If you have several distinct switch stacks, and each of your hosts and storage is connected to more than one stack (as they should be to ensure redundancy), then it shouldn't be a problem to take one stack out of service to upgrade the firmware.

It seems unlikely that you'd be forced to upgrade the firmware on the separate stacks at the same time. Is this the case?

Your hosts/storage will automatically switch to a different active path if MPIO is appropriately configured within your environment.

Chris Thorpe