I have an App Engine standard environment application that exposes a series of REST services. Everything works, but the response times are high in one specific case: the first request takes around 40 seconds to respond, and after that responses come back in milliseconds. If the service goes unused for a while, the slow first request happens again.
I understand that instance initialization causes the delay and that an instance can be shut down when it sits idle, so I tried changing the scaling type. The best result so far has been automatic scaling with a minimum of 1 idle instance, but the problem still persists.
I also set up a cron job that continuously calls one of the services so that the instance can't be shut down, but that didn't work either.
What's the proper way to manage instances so that the services stay available with low response times? Is this behavior expected in the standard environment? Would switching to the flexible environment solve the problem?