3

We are in the process of building a new secure web application for our company, and are looking at options for ensuring redundancy in our data center. We don't have a lot of money, but have relatively high bandwidth requirements, so have considered a 50/20 FiOS Business connection with a T1 backup, which would cost ~$600/month total. In comparison, a 15Mbps partial T3 costs $2500/month in our area.

If we were to go with this setup, is there a way for someone typing in our address to be redirected automatically to the other line if the FiOS is down? Since it's an SSL connection, we'd also need to ensure that the certificate works either way.

Update - to those suggesting an outside data center solution, we have relatively high internal user bandwidth requirements (~80Mbps). Purchasing an 80Mbps pipe to a colocation facility, to which we'd also have to pay high monthly fees, isn't really financially feasible. In our area, such a setup would cost around $10,000/month.

Beep beep

8 Answers

2

To expand on and combine two ideas previously mentioned: you could use a multi-WAN router with failover, and purchase a DNS hosting service with a load-balancing/failover feature that automatically switches your A records when your Verizon line or your T1 line goes down. This means you need to configure your server for both IPs and test both every time you make a significant change.

If you want to simplify it, I would recommend colocating in a datacenter which offers a good BGP mix of providers, and buy bandwidth from them, which will make everything much simpler.

gekkz
  • We'd love to do colocation, but we need servers to be local for now (our customer service team accesses the same data as our clients, and it's absolutely critical that the CSRs have 100% up-time and fast performance for data entry). – Beep beep Dec 29 '09 at 18:32
  • If your data entry includes filling in forms and clicking submit this won't be a problem even if it's hundreds of miles away. If you're actually uploading larger amounts of data at a time, distance may become a factor. – gekkz Dec 29 '09 at 19:01
  • We actually have both. Plus, our internal bandwidth requirements are ~80Mbps (partly due to hundreds of complex pdf forms being printed per minute), while we expect our client requirements to be < 15Mbps. Getting a reliable 100Mbps uplink to a colocation center would be cost prohibitive. – Beep beep Dec 29 '09 at 20:00
2

To me, using words like "data center" and "FIOS / T1" in the same sentence is nonsensical. The term "data center", in my mind, conjures images of server rooms with ample bandwidth coming in from one or more sources.

With that said, if said application is of such importance, perhaps it should be hosted in a cloud based solution -- or at least externally? In a cloud scenario, the redundancy is built in, so you don't have to worry if one or more links go down...

And depending on which provider you would choose, it may be cheaper to host the site outside rather than try to bring in the bandwidth / redundancy you desire.

Though, I can only speculate about this as we do not know more details about the implementation -- where the data lives vs. the application, etc.

Corey S.
  • I'd love to put it externally, but the app is accessed by both internal and external customers. Our internal bandwidth is ~80Mbps, while external is expected to be ~15Mbps. We can afford 15Mbps of connectivity, we cannot afford a 100Mbps pipe to an external data center (which itself would cost extra money). – Beep beep Dec 29 '09 at 20:31
1

You have two choices.

  1. Basic: set up the relevant DNS records with short TTL. Find a way to update them with the T1's IP addresses should your FIOS line go down.

  2. Advanced: get an AS and your own IP block. Install routers in front of your network, configure them to run BGP.

If you don't want to use both uplinks simultaneously, I'd say the first option is totally fine.
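As a rough illustration of the basic option, here is a minimal Python sketch of the decision step only: probe the FiOS address and choose which IP the A record should carry. The IP addresses, the 300-second TTL, and every function name are assumptions made for this example, and the step that actually publishes the record is left out because it depends on whatever API your DNS host offers.

```python
import socket

# Hypothetical addresses from the RFC 5737 documentation ranges, not real ones.
FIOS_IP = "192.0.2.10"      # assumed public IP on the FiOS line
T1_IP = "198.51.100.10"     # assumed public IP on the T1 line
TTL_SECONDS = 300           # assumed short TTL so resolvers re-query soon


def port_is_open(ip, port=443, timeout=3.0):
    """Return True if a TCP connection to ip:port succeeds within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_a_record(primary_up, primary_ip=FIOS_IP, backup_ip=T1_IP,
                    ttl=TTL_SECONDS):
    """Pick the address the A record should point at, plus its TTL."""
    value = primary_ip if primary_up else backup_ip
    return {"type": "A", "value": value, "ttl": ttl}


def desired_record(check=port_is_open):
    """Probe the primary line and return the record that should be published."""
    return choose_a_record(check(FIOS_IP))
```

The probe should run from a machine outside the office, since a check made from inside the network may not reflect what external clients see.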

Max Alginin
  • Sure, DNS way works SOMETIMES. But, it is not true redundancy. Definitely not a way to run mission critical services. – xeon Dec 29 '09 at 17:28
  • DNS won't work only for badly broken clients. Especially if we're talking about web only - BGP has little to no added value. – Max Alginin Dec 29 '09 at 17:43
  • @xeon - can you clarify? Why would it only work sometimes? All of our clients are accessing our services via domain name. – Beep beep Dec 29 '09 at 18:30
  • To failover via DNS you'd need a record with very short TTL value. Not all DNS clients/caches will use your TTL value, but may cache the record for a longer time. In other words, you'd be dependent on the client-side, which you can't control. – Martijn Heemels Dec 29 '09 at 19:21
  • Thanks Martijn, that makes it sound like a suboptimal solution – Beep beep Dec 29 '09 at 20:33
  • It is not, because your alternative is to use BGP, which has a convergence time of the same order, about 15 minutes (which is incidentally the default cache expiration of a typical browser). If 15 minutes is too much, then your only solution is to make the physical layer redundant. – Max Alginin Dec 29 '09 at 21:08
  • Martijn is right. DNS is not the right solution for load balancing/failover. Ever. Especially if your application layer doesn't compensate for the differences in Layer 3 data. – Tom O'Connor Dec 30 '09 at 00:39
  • I'm yet to hear a good alternative given the OP's constraints (the most reasonable solution I saw in this topic was external hosting for login page with subsequent redirect, but that's also a can of worms in its own right). Until then - sorry, but DNS _is_ the best solution. – Max Alginin Dec 30 '09 at 03:20
  • @ynguldyn - I agree with your statement about DNS, but why do you feel the external hosting with redirect is a can of worms? Just curious. – Beep beep Dec 30 '09 at 16:45
  • Off the top of my head: 1. You'll have to deal with users who'll bookmark the wrong pages. 2. You'll need to have multiple SSL certs (one for the gateway URL, two more for the destination URLs). 3. You'll need to figure a way to pass credentials and deal with cookies. And I'm sure this is not a complete list. – Max Alginin Dec 30 '09 at 18:00
1

The real solution, IMHO, is to put a server somewhere reliable. A previous poster is essentially correct: the only way to get true redundancy is to get your own IP block and do your own routing, which a Verizon FIOS connection will not let you do.

You can either rent a server from someone who's willing to do this for you, or buy a server and stick it somewhere that will provide this for you. The other option is to set a low Time to Live (TTL) on your DNS records (maybe 5 mins?) and then change the DNS records when one line goes down, and back when it comes up.
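To get a feel for what a 5-minute TTL means in practice, the following Python sketch estimates an upper bound on how long a client that honors the TTL could keep using the dead address. The probe interval and failure threshold are assumed values, and the function name is made up for this example. It ignores the time your DNS host takes to apply the change, and it does not cover resolvers that cache longer than the TTL.

```python
def worst_case_failover_seconds(check_interval, failures_before_switch, ttl):
    """Upper bound on how long a TTL-honoring client may keep the dead address.

    Detection takes at most check_interval * failures_before_switch seconds,
    and a resolver that cached the old record just before the switch keeps
    it for up to ttl seconds more.
    """
    detection = check_interval * failures_before_switch
    return detection + ttl


# Assumed values: probe every 60 s, switch after 2 failed probes, 300 s TTL.
print(worst_case_failover_seconds(60, 2, 300))  # 420
```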

Dan Udey
  • As I said earlier, I'd love to put it externally, but the app is accessed by both internal and external customers. Our internal bandwidth is ~80Mbps, while external is expected to be ~15Mbps. We can afford 15Mbps of connectivity, we cannot afford a 100Mbps pipe to an external data center (which itself would cost extra money) ... in our area a 100Mbps pipe costs close to $10k/month. – Beep beep Dec 29 '09 at 20:42
1

Like most people have said, the real solution is colocation. However, as an option because of your high local bandwidth requirements, in many places, you can do point to point metro ethernet. This would let you go from your office to a data center and be much cheaper than paying for external bandwidth at the data center. Most of the big players offer it (AT&T, Verizon, etc) and if nothing else it’s at least worth getting pricing on, even if you're not interested.

I’m also thinking you could pay for hosting somewhere and have extra URLs set up, e.g. www.yoursite.com, www2.yoursite.com (FiOS), www3.yoursite.com (backup T1). When external clients go to www.yoursite.com, an app on your remote hosting site would check whether www2 is up. If so, it redirects clients to www2. If not, it redirects them to www3. This would only really work for new connections and for people who don't bookmark www2 or www3, and it would be subject to latency and a million other ifs that make this a bad solution, although a potentially workable one.
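The gateway logic described above could look roughly like this in Python. The hostnames come from the example in this answer; the `/health` path, the timeout, and the function names are assumptions for illustration, and the web framework that would issue the actual HTTP redirect is left out.

```python
import urllib.error
import urllib.request

# Hostnames follow the example above; the /health path is an assumption.
PRIMARY = "https://www2.yoursite.com"   # reached over FiOS
BACKUP = "https://www3.yoursite.com"    # reached over the backup T1


def site_is_up(base_url, timeout=3.0):
    """Return True if the site answers the health URL with a 2xx status."""
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        return False


def redirect_target(path="/", is_up=site_is_up):
    """Choose where the gateway should send a new visitor."""
    base = PRIMARY if is_up(PRIMARY) else BACKUP
    return base + path
```

A gateway page would call `redirect_target("/login")` and send the result in a `Location` header. Each hostname still needs a certificate that is valid for it.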

David
  • We asked Verizon about the point-to-point metro ethernet, and it was only available to one of their co-lo facilities in our area. That was the cheapest 100Mbps option, and the $10,000/month figure I quoted earlier. However, your second point is really interesting since all clients would be going to our login page first. So we could host the login page externally, and have that redirect to our internal site either over the FiOS or T1. If the FiOS goes down and they get booted, they could login again and they'd be redirected over the T1. That's the idea, right? – Beep beep Dec 30 '09 at 02:33
  • That was my idea, yes. You’d have to pay for hosting that allows you to schedule a job that checks your servers every 1-5 minutes. The downside is that anyone who was redirected to the T1 will stay on the T1, even after the FiOS line is back, unless you write logic into your code to force people back to FiOS. You'd have to connect to that external hosting account and ask "is our FiOS up?" and, if it is, redirect them to www2 (while maintaining their login, cookies, sessions, etc). You also have to hope your clients are going to be happy having to – David Dec 30 '09 at 23:35
0

They make multi-WAN routers. Can you use one of those?

djangofan
  • Would that allow a client to type in www.fakesite.com and route over the FiOS or the T1 (depending on which is up)? – Beep beep Dec 29 '09 at 20:42
  • No, it wouldn't, unless you were using an outside DNS provider that would round-robin the DNS requests in a load-balanced fashion. – djangofan Dec 31 '09 at 00:41
  • For example: http://network-tools.com/default.asp?prog=dnsrec&host=cnn.com – djangofan Dec 31 '09 at 00:44
0

OK, so as I'm reading this, you've got an office, which houses your staff, and a server (or 2+). Your staff are used to being able to access your servers quickly, and want to keep it that way, but to eliminate your single point of failure, you need another server.

You could spend a fortune getting this working in your office by having a transit provider give you a leased line. Trust me, there's no point doing this over "FiOS" or a jumped-up ADSL line. I don't know whether your FiOS quote is contended bandwidth, but you're unlikely to get anywhere near 50Mbit.

Look closely at your application layer data. Can you optimise it for WAN transmission? Gzip everything you can, and install a reverse-proxy to cache as much data as possible. Also shift all the external media, Images, CSS and javascript type stuff to a CDN. They can worry about the bandwidth that you can't afford.
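As a quick way to see what gzip can do for text responses, this Python sketch compresses a made-up, repetitive HTML page with the standard `gzip` module and prints the before and after sizes. The payload is invented for the example. Real savings depend on your content, and formats that are already compressed (JPEG images, many PDFs) gain little.

```python
import gzip

# Hypothetical payload: repetitive HTML, like a table of form rows.
row = "<tr><td>Account</td><td>Status</td><td>Balance</td></tr>\n"
page = ("<table>\n" + row * 500 + "</table>\n").encode("utf-8")

compressed = gzip.compress(page)

print(len(page), "bytes uncompressed")
print(len(compressed), "bytes gzipped")

# gzip is lossless: decompressing returns the original bytes.
assert gzip.decompress(compressed) == page
```

In production you would normally turn on compression in the web server or reverse proxy rather than in application code.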

It also sounds like you want cheap, good and reliable. You can have any 2 of the three, but not all of them.

If you insist on hosting the servers from your office, you'll need carrier-class IP transit and routing. The connectivity itself might have a high OPEX, but the CAPEX for the routing hardware will be expensive too. You'll need a pair (for true High Availability) of good routers, firewalls, switches to begin with.

If you attempt to optimize your application, you might find that you can get similar performance over diverse WAN links, and are able to shove the entire application onto a cloud environment, such as Amazon's EC2.

The bottom line of the connectivity thing is this: If your application means that much to you, it's worth investing the money in doing it the right way. Cost cutting may seem like a bright idea now, but will come back to bite you in the ass.

Tom O'Connor
  • Thanks Tom ... I'd be more apt to use a local co-location facility than Amazon's EC2. Any EC2 site I've used always seems to have problems, and their 99.95% SLA is rather low given the cost. I've had friends who used it for their business and have said 'never again'. The reason we liked FiOS was that it was VERY cheap and very fast, and recorded uptime is >99.9%. Since there is no SLA, a T1 could temporarily service our clients in a limited fashion if FiOS went down every once in a while. While not optimal, the $ difference (800/mth vs 10,000) makes it worthwhile to investigate IMO. – Beep beep Dec 30 '09 at 02:29
0

Ecessa makes some relatively inexpensive load-balancing/failover devices. It looks like they could be a fit.

Ecessa Products

Dave M