I am currently trying to figure out a good configuration to make a Bastion host highly available. I want to meet the following targets:
- The bastion host(s) need to be able to withstand an Availability Zone failure and an EC2 instance failure. A short downtime (a few minutes) may be acceptable.
- The bastion host(s) need to be reachable via a permanent DNS entry.
- No manual intervention needed
My current setup is as follows: a bastion host in an Auto Scaling Group spanning two Availability Zones, with an ELB in front of the Auto Scaling Group.
This setup has a few advantages:
- Easy to set up using CloudFormation
- Auto Scaling Groups over two AZs can be used to guarantee availability
- The setup does not count towards the account's EIP limit
It also has some disadvantages:
- With two or more bastion hosts behind the ELB, SSH host key warnings are common, and I do not want our users to get accustomed to ignoring SSH warnings.
- The ELB costs money, as opposed to an EIP. About as much as the bastion host, actually. This is not really much of a concern; I added this point only for the sake of completeness.
The obvious other solution is to use an Elastic IP, which has - as I see it - a few drawbacks:
- I can (obviously) not attach an EIP to an Auto Scaling Group directly
- When not using Auto Scaling Groups, I have to put something in place that starts new EC2 bastion hosts if the old ones fail, e.g. using AWS Lambda. This adds additional complexity.
- When the EIP is attached manually to an instance in the Auto Scaling Group, an Availability Zone failure leaves the EIP detached, and it is not reattached to the replacement instance. This can be resolved by running a program (on the instance or in AWS Lambda) that reattaches the EIP to the new instance, but again this adds complexity.
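
To illustrate the kind of program I mean in the last point, here is a rough sketch of a re-attach script that a new instance could run at boot. It is only a sketch under assumptions: it uses boto3, the allocation ID is a placeholder, the function names are my own, the region is assumed to be configured in the environment, and the instance profile is assumed to allow `ec2:DescribeAddresses` and `ec2:AssociateAddress`.

```python
import urllib.request

IMDS = "http://169.254.169.254/latest"  # EC2 instance metadata service


def current_instance_id(timeout=2):
    """Ask the instance metadata service (IMDSv2) for this instance's ID."""
    token_req = urllib.request.Request(
        IMDS + "/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    token = urllib.request.urlopen(token_req, timeout=timeout).read().decode()
    id_req = urllib.request.Request(
        IMDS + "/meta-data/instance-id",
        headers={"X-aws-ec2-metadata-token": token},
    )
    return urllib.request.urlopen(id_req, timeout=timeout).read().decode()


def reattach_eip(ec2, allocation_id, instance_id):
    """Associate the EIP with instance_id unless it is already attached there.

    Returns True if an association call was made, False if nothing changed.
    """
    addresses = ec2.describe_addresses(AllocationIds=[allocation_id])["Addresses"]
    if addresses and addresses[0].get("InstanceId") == instance_id:
        return False
    ec2.associate_address(
        AllocationId=allocation_id,
        InstanceId=instance_id,
        AllowReassociation=True,  # take the EIP over from a dead instance
    )
    return True


def main():
    import boto3  # third-party; only needed when run on the instance

    allocation_id = "eipalloc-0123456789abcdef0"  # placeholder, not a real ID
    reattach_eip(boto3.client("ec2"), allocation_id, current_instance_id())
```

Calling `main()` from the instance's user data would cover the instance-side variant. The same `reattach_eip` function could in principle be called from a Lambda function reacting to Auto Scaling events, but either way it is one more moving part to maintain.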
What are best practices for High availability SSH instances, i.e. bastion hosts?