5

We are replacing a Clariion CX3-20 SAN because it is old (7.5 years). It has otherwise been very reliable. Yes, we have replaced a number of disks, battery modules, and maybe even one of the controllers, but the failure rate is not very high -- maybe one issue every nine months or so.

The new SAN is an EqualLogic with similar performance characteristics (IOPS, network, etc.). Even though it is much newer, the technology really hasn't changed much (unless we had wanted 10GbE networking or SSDs, which I would have LIKED but couldn't justify at twice the price). I think the big change is the price point: we paid $40k for the CX3 back then, and the EqualLogic is half that for a similar setup.

I want to keep the CX3-20 in production, but I have to make some case that this thing can live another 4-5 years without a high cost of ownership. We would have to source parts outside of EMC since it is end-of-life (which actually isn't bad, because parts are very inexpensive on the third-party market; the only drawback is that it takes 1-2 days to get them instead of same-day within hours).

So the question is: is anyone really running these things for 12 years, and are they as rock solid as before? The failure rate is supposed to go up with time, but I am not seeing it happen. We have 3 SANs that are 7+ years old now. We had issues in years 2-5, but the last 2.5 years haven't been bad -- across the three 7+ year old SANs we have replaced a total of maybe 4 disks.
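To put a rough number on that (the per-array drive count below is an assumption for illustration, not our exact inventory), the back-of-envelope failure rate works out to something like this:

```python
# Back-of-envelope annualized failure rate (AFR) for the disks.
# ASSUMPTION: ~15 drives per array -- illustrative, not an exact inventory.
arrays = 3
drives_per_array = 15      # assumed
years_observed = 2.5       # the recent "quiet" period
failures = 4               # disks replaced in that window

drive_years = arrays * drives_per_array * years_observed
afr = failures / drive_years

print(f"{drive_years:.0f} drive-years observed, AFR = {afr:.1%} per year")
# ~112 drive-years -> roughly 3.6% per year under these assumptions,
# so "quiet" still means a few percent of drives failing each year.
```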

I know from talking to resellers/vendors that many people run a lot of these end-of-life SANs, but I never get to talk to the people behind them. Is it a constant headache they are battling because they are running cabinets of this stuff and cannot justify the cost to upgrade? Are they hosting companies that just cannot afford the downtime of migrating VMs? Or are these things really rock solid, lasting more or less indefinitely if you maintain them, with "end-of-life" just being EMC's way of nudging us to buy newer gear?

Michael Hampton
  • 237,123
  • 42
  • 477
  • 940
Ryan Hata
  • 51
  • 1
  • I personally wouldn't run a SAN for more than 5 years max. Also, I think this question is out of scope for this site. – Frederik Jan 17 '15 at 16:52

4 Answers

5

This is an exercise in planning and setting expectations (with your users, the business, etc.).

I'll use servers as an example. When I sell/buy a system, I plan for it to have a primary service life of 3-5 years. That's a pretty good metric for modern equipment since there's usually a substantial jump in technology in the interim and good justification to upgrade after that period. That's also when the failure modes of the systems present themselves.

Systems beyond that age are still usable, but the lack of parts and support relegate older servers to non-critical functions or use in clusters that can tolerate failure.

Storage has also evolved since the time your EMC was in wide use. I'd say that SAN storage has become more commoditized, with more intelligent caching and performance features. You're probably leaving a lot of performance on the table...

As far as keeping the old unit in use, you can, but why not rely on the new equipment you have? What do you expect to gain by keeping the older gear in place?

ewwhite
  • 194,921
  • 91
  • 434
  • 799
2

Things you will have problems with on 'old' kit:

  • Code updates: vendors will rarely commit to releasing updates on older kit.
  • Replacement parts: Spares will become increasingly hard to come by - sometimes you can use newer parts, but not always, because the speed/communication mode/protocol etc. gets updated. Newer SFPs stop supporting lower transmission rates, that kind of thing.
  • Failure rate of moving parts: spinning disks wear out, so you will start to see the rate of disk failures increasing.
  • Infrastructure compatibility: vendors like to change protocols as time passes. Windows domain controllers, for example, deprecate legacy encryption protocols.

You are also paying the opportunity cost of not upgrading:

  • New toys are typically Bigger, Better, Faster. Storage doesn't quite keep up with processors, but there are some very nice capabilities leveraging flash drives, bigger memory caches, etc.
  • Hiring people with experience on older kit will become difficult.
  • Steadily increasing migration overhead when you do finally make the switch, because the migration paths that 'leapfrog' a tech generation are less well trodden.
  • Some vendors offer trade-in deals, for much the same reason car retailers do.

I won't say it's a bad idea, but you need to consider why you bought a storage array in the first place. They're usually quite an expensive way of buying capacity - what you're really paying for is performance oversubscription, to get a better 'burst' with the same 'average', at both the disk layer and the cache layer.

They're also more expensive because of improved reliability - 'enterprise' components with better MTBF.

Both of these things diminish over time. The former because the goalposts move, the latter because of wear and tear and availability.
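As a minimal sketch of what I mean by oversubscription (every figure is invented for illustration, not taken from your environment):

```python
# Toy illustration of performance oversubscription on a shared array.
# Every figure is invented for illustration.
hosts = 20
avg_iops_per_host = 300      # steady-state demand per host
burst_iops_per_host = 4000   # what one host wants during a burst
array_iops = 12000           # what the array can actually deliver

steady_demand = hosts * avg_iops_per_host    # 6,000 -- fits comfortably
worst_case = hosts * burst_iops_per_host     # 80,000 -- never provisioned for
oversubscription = worst_case / array_iops

print(f"steady demand: {steady_demand} IOPS, "
      f"burst oversubscription: {oversubscription:.1f}:1")
# Each host sees a 4,000 IOPS burst when it needs one, while you only paid
# for 12,000 IOPS of array -- that 'burst over average' headroom is what
# the array is really selling, and it shrinks relative to newer kit over time.
```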

So it's really more a matter of acceptable risk than anything else. For my production kit, the data on it is FAR more important to my organisation than the cost of replacement and vendor support contracts.

For my test/dev kit, I don't care nearly as much.

I would therefore suggest you present this a bit like thin provisioning. It's not saving money; it's deferred expenditure. You are still going to need to replace it. You are going to incur additional overheads as it ages. You are going to make your eventual replacement and migration harder. You are also taking the commercial risk of hitting an unfixable fault, which would need vendor support - and the vendor will either point and laugh, or hit you with a ridiculous bill. Or maybe both.

But you might find that the money you save in the meantime offsets the cost, and that by delaying your purchase you get bigger and faster hardware for the same money.
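If you want to put made-up numbers on that trade-off (every figure below is invented; substitute your own quotes and risk estimates):

```python
# Toy comparison: buy the replacement now vs. run the old array 3 more years.
# Every figure is invented for illustration; plug in your own quotes and risks.
new_array_now = 20_000       # replacement quote today
new_array_later = 15_000     # assume prices keep drifting down
parts_per_year = 1_000       # third-party spares and a shelf spare
outage_cost = 25_000         # business cost of one bad multi-day outage
p_outage_per_year = 0.05     # guessed annual risk while waiting days for parts

years_deferred = 3
defer_total = (years_deferred * (parts_per_year + p_outage_per_year * outage_cost)
               + new_array_later)

print(f"buy now: {new_array_now}, defer {years_deferred} years: {defer_total:.0f}")
# Deferring only 'wins' if parts stay cheap AND the outage risk stays low --
# exactly the two things that deteriorate as the kit ages.
```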

The bathtub curve is applicable here: http://en.wikipedia.org/wiki/Bathtub_curve

It very much applies to storage arrays. You can compare with a car quite nicely - as a car gets older, the cost of keeping it on the road steadily increases, as do the odds of it breaking down, and the trade-in value decreases. If breaking down every few months and needing to repair it right then is acceptable, then you might well run an older car. But you wouldn't do this with an ambulance, because whilst the odds are the same, the consequences of failure and downtime are higher as well.
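If you want to picture the curve with numbers, here's a toy bathtub-shaped hazard model - the parameters are invented for illustration, not anything EMC publishes:

```python
# Toy bathtub-shaped hazard model: infant mortality + constant random
# failures + wear-out. Parameters are invented for illustration only.
def hazard(t_years: float) -> float:
    infant = 0.08 * t_years ** -0.7     # early failures, falls off quickly
    random_rate = 0.02                  # constant background failure rate
    wearout = 0.0004 * t_years ** 2.5   # moving parts wearing out
    return infant + random_rate + wearout   # expected failures per unit-year

for t in (0.5, 2, 5, 8, 12):
    print(f"year {t:>4}: hazard = {hazard(t):.3f}")
# Drops through the early years, flattens out mid-life, then climbs again --
# so a quiet 2.5 years says little about what years 8-12 will look like.
```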

Sobrique
  • 3,697
  • 2
  • 14
  • 34
0

We have much the same situation with a CX300, which is roughly the same age (a little more than 8 years). I totally agree that the thing is rock solid (we have only exchanged a couple of disks in those years, and one controller battery is now faulty), but I wouldn't stretch it too far. As our storage moved out of primary service, we decided to migrate it to a use case where reliability is not the main goal (in our case, a backup spool). Sure, there is a chance your CX3-20 will keep going for many more years without much hassle, but keep in mind that 8 years is a long time in IT in general, and especially for "moving parts".

From the economic point of view, there will be (or already is!) a point where maintaining the CX3-20 doesn't compare financially to buying new kit (disks have got much bigger, and FC disks won't be easy to come by in the future...).

My suggestion is to keep it running somewhere where reliability doesn't really matter, or where you can replace it quickly and stress-free (e.g. keeping the new storage already built in as a "cold spare").

Henrik
  • 668
  • 5
  • 19
0

Honestly, unless the 'gravity' of the data on the CX3 is such that it can't be moved easily or in a timely manner, bite the bullet and migrate to something new.

Other benefits of upgrading:

1. More storage capacity in a smaller amount of rack space, due to larger disk sizes and smaller form factors.
2. Much better performance for mixed workloads, due to SSDs and larger amounts of cache in more modern systems.
3. Sometimes things do break in ways that require more than a simple part swap. When that happens, you'll want official vendor support.