I'd like to start an EC2 instance on-demand, and to take it down when it is idle for some period of time (e.g. no network activity for >= 1 hour), but I can't tell what a standard way of doing this in AWS looks like, given that AWS doesn't support wake-on-LAN.
The service I intend to run will require a persistent network connection, e.g. ssh.
The user experience I am aiming for goes something like this:
- If the service is up when the user attempts to connect, the user has immediate access to the service.
- If the service is down when the user attempts to connect, the user receives a "service is starting" reply (and the connection is closed). The user retries after a few minutes and connects successfully (or receives the "starting" message again if they retry too early). The service remains up for up to an hour after the last user disconnects.
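
To make the behaviour above concrete, here is a rough sketch of the decision logic I have in mind. It only covers the decisions, not the AWS calls. The names `handle_connection_attempt`, `should_stop`, `Reply` and `IDLE_LIMIT_SECONDS` are made up for illustration. The state strings are EC2's instance state names, and I'm assuming "running" also means the service itself is ready, which may not hold right after boot.

```python
from enum import Enum
from typing import Tuple

# Assumption: "idle" means no connected users for one hour.
IDLE_LIMIT_SECONDS = 60 * 60


class Reply(Enum):
    CONNECT = "connect"
    STARTING = "service is starting"


def handle_connection_attempt(instance_state: str) -> Tuple[Reply, bool]:
    """Return (reply to the user, whether a start must be triggered)."""
    if instance_state == "running":
        # Service is up: the user gets immediate access.
        return Reply.CONNECT, False
    if instance_state == "stopped":
        # Service is down: tell the user to retry and kick off a start.
        return Reply.STARTING, True
    # Any other state (e.g. "pending", "stopping"): the user retried too
    # early, so repeat the message without triggering a second start.
    return Reply.STARTING, False


def should_stop(active_connections: int, seconds_since_last_disconnect: float) -> bool:
    """Stop only when nobody is connected and the idle limit has passed."""
    return (
        active_connections == 0
        and seconds_since_last_disconnect >= IDLE_LIMIT_SECONDS
    )
```

For example, `handle_connection_attempt("stopped")` gives the "starting" reply and asks for a start, while `should_stop(1, 7200)` stays false because a user is still connected.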
My motivation here is primarily cost savings. The demand will be highly unpredictable (so scheduled instances are not a good fit), probably less than 12 active hours/day, and the users are willing to wait a few minutes for the service to start. I also don't want to get locked into a 1+ year term with reserved instance pricing.
I also have some wild stabs at how I might accomplish this, and would appreciate feedback on how plausible/sensible they are:
- Use an auto-scaling group that "scales" the service from 0 to a maximum of 1 instance. But I don't know how I'd be able to issue the "service is starting" reply if there are no instances running.
- Run a t2.micro instance when the service is down whose sole purpose is to catch a connection attempt, issue the "starting" reply, trigger the start of the actual service instance, and then die. When the service instance goes down due to inactivity, it would need to start the t2.micro instance again.
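
For the second idea, the "catcher" could be as small as the sketch below. It is plain TCP: it accepts one connection, sends the "starting" message, closes the connection, and then calls a start hook. `serve_once`, `make_listener`, `STARTING_MESSAGE` and the `start_instance` callback are hypothetical names of mine. With boto3 the callback would presumably wrap `boto3.client("ec2").start_instances(InstanceIds=[...])`, but I have not wired that in, and I have not checked how an ssh client would present a plain-text reply like this.

```python
import socket
import threading
from typing import Callable

STARTING_MESSAGE = b"service is starting, please retry in a few minutes\n"


def make_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a listening TCP socket (port 0 lets the OS pick a free port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    return listener


def serve_once(listener: socket.socket, start_instance: Callable[[], None]) -> None:
    """Catch one connection attempt, reply, close, then trigger the start."""
    conn, _addr = listener.accept()
    with conn:
        # Tell the user to come back later; leaving the block closes the socket.
        conn.sendall(STARTING_MESSAGE)
    # Hypothetical hook: this is where the real service instance would be started.
    start_instance()


if __name__ == "__main__":
    # Local demo with a stand-in callback instead of a real AWS call.
    demo_listener = make_listener(port=2222)
    serve_once(demo_listener, lambda: print("would start the service instance here"))
    demo_listener.close()
```

After `serve_once` returns, the catcher has done its job and could shut itself down, which matches the "and then die" step above.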
Thanks!