
This is similar to enabling jumbo frames on an iSCSI network after the fact, but in our case we configured the switches and then went through all the hosts, enabling jumbo frames (we may have screwed ourselves here). The SAN is a LeftHand (actually two SANs, mirrored). So the plan is to configure one of the SANs for jumbo frames, bring it back online, wait for replication, and then do the same with the other SAN. Are we in for a world of hurt with one SAN having jumbo enabled and the other not yet configured while we wait for SAN replication? Of course, all of this has to be done live. Should we go through the pain of resetting all the hosts back to 1500 MTU and then configure the SANs for jumbo first?

Some clarifying info: each SAN device is connected to two dedicated switches (each SAN has two NICs), and the switches have an EtherChannel link between them. The SANs have MPIO configured. Each host has two NICs on the SAN network, one connected to each switch (full mesh configuration).
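
For anyone attempting something similar: before and after flipping each device over, it's worth verifying that a full jumbo frame actually makes it end to end. Below is a minimal sketch of that check, assuming Linux hosts and made-up SAN node addresses (the 8972-byte payload allows for the 20-byte IP and 8-byte ICMP headers on a 9000-byte MTU); on ESXi the equivalent test is vmkping -d -s 8972, and on Windows ping -f -l 8972.

```python
import subprocess

# Hypothetical SAN node addresses on the iSCSI network; substitute your own.
SAN_NODES = ["10.0.0.11", "10.0.0.12"]

def jumbo_path_ok(target, payload=8972):
    """Ping with the don't-fragment bit set and a jumbo-sized payload.

    8972 bytes of ICMP data + 8-byte ICMP header + 20-byte IP header
    = 9000 bytes on the wire, so this only succeeds if every hop
    (host NIC, switch ports, SAN NIC) passes 9000-byte frames.
    """
    result = subprocess.run(
        ["ping", "-M", "do", "-c", "3", "-s", str(payload), target],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0

for node in SAN_NODES:
    status = "OK" if jumbo_path_ok(node) else "FAILED (fragmentation needed or frames dropped)"
    print(node, "jumbo path:", status)
```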

UPDATE: It worked. When taking the second SAN offline, all the hosts experienced problems connecting to the SAN for about a minute (very scary), but they all came back except for our OCS server, which we had to reboot.

murisonc

1 Answer


Answering my own question here in the hope that it will help others. We did this today and I have to say the pucker factor was pretty high. :) Changing the first SAN to jumbo caused no problems at all. When we broke MPIO on the second SAN, all the hosts got very angry (stopped responding) and it took about a minute for them to recover. Our Office Communications Server 2007 R2 refused to play ball and required a reboot. Other than that it worked, but I'm thinking that next time (which will never happen now that I know to configure jumbo before going live) I'd rather do the hosts last, because then we'd only be worrying about a single host at a time.
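
If you do end up running a live cutover like this anyway, it helps to have something watching iSCSI reachability instead of waiting for hosts to complain. A rough sketch, again with hypothetical node addresses, that polls the standard iSCSI target port on each SAN node so you can see exactly when connectivity drops and when it comes back:

```python
import socket
import time

# Hypothetical SAN node addresses on the iSCSI network; substitute your own.
SAN_NODES = ["10.0.0.11", "10.0.0.12"]
ISCSI_PORT = 3260  # standard iSCSI target port

def portal_reachable(address, port=ISCSI_PORT, timeout=2.0):
    """Return True if a TCP connection to the iSCSI portal succeeds."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False

# Print a status line every few seconds during the cutover so you can see
# exactly when each node drops out and when it recovers.
while True:
    stamp = time.strftime("%H:%M:%S")
    states = "  ".join(
        node + ":" + ("up" if portal_reachable(node) else "DOWN") for node in SAN_NODES
    )
    print(stamp, states)
    time.sleep(5)
```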

murisonc
  • The connection drop can be fixed in one of two ways: either install LeftHand's DSM, or add the IP addresses of at least two additional nodes per cluster as discovery gateways in your iSCSI configuration. What's happening is that the cluster IP is moving and the servers have to wait for the ARP cache to time out. By adding the additional IPs, iSCSI's timeout causes it to look for the LUN on another gateway. Or use the LeftHand DSM, because it discovers all nodes in a cluster upon initial connection. – longneck Jun 12 '12 at 15:09
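
Following up on longneck's comment: if you take the second route (extra discovery gateways rather than the LeftHand DSM), on a Windows initiator that amounts to registering additional target portals with the Microsoft iSCSI initiator. A minimal sketch of doing that via the built-in iscsicli tool, using hypothetical per-node addresses; the DSM route needs no script at all.

```python
import subprocess

# Hypothetical per-node addresses (not the cluster VIP) to register as
# extra discovery portals; use at least two real node IPs per cluster.
EXTRA_PORTALS = ["10.0.0.11", "10.0.0.12"]
ISCSI_PORT = "3260"  # standard iSCSI target port

for portal in EXTRA_PORTALS:
    # "iscsicli AddTargetPortal <address> <port>" registers a discovery
    # portal with the Microsoft iSCSI initiator. With several portals
    # known, the initiator can rediscover the LUN through another node
    # when the cluster IP moves, instead of waiting out the ARP cache.
    subprocess.run(["iscsicli", "AddTargetPortal", portal, ISCSI_PORT], check=True)
```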