Is it possible to achieve 100% availability of a web application deployed in Azure App Service?

We have an ASP.NET MVC web application deployed to Azure App Service in three regions, all on the Premium P3 pricing tier. Each region has autoscaling enabled to scale the App Service Plan from 2 to 10 instances based on CPU percentage. A Traffic Manager profile with Performance routing distributes traffic across the three regions. The Traffic Manager endpoint monitoring is configured as follows:
- Probing Interval: 10 seconds
- Tolerated Number of Failures: 0 (A value of 0 means a single monitoring failure can cause that endpoint to be marked as unhealthy.)
- Probe Timeout: 5 seconds
However, when we tested the system by stopping the App Service in one region under high load (we stopped Central US, since most of our traffic is expected to hit that region), we observed that some requests failed before traffic was redirected to the other regions. That is not 100% availability. How can we ensure 100% availability of the system?
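For reference, here is a rough back-of-the-envelope estimate of how long requests can keep failing with the monitoring settings above. It is a simplified sketch, not the documented Traffic Manager algorithm: it assumes an endpoint is marked unhealthy after (tolerated failures + 1) consecutive failed probes, and that clients keep using a cached DNS answer for up to one TTL. The function names are my own, and the 60-second DNS TTL is a hypothetical value, since I have not listed our actual TTL above.

```python
def worst_case_detection_seconds(probe_interval_s, probe_timeout_s, tolerated_failures):
    """Upper bound on the time until the endpoint is marked unhealthy.

    Simplified model: the endpoint goes down right after a successful probe,
    then (tolerated_failures + 1) consecutive probes must fail, and the last
    one takes up to the full timeout to be declared a failure.
    """
    return probe_interval_s * (tolerated_failures + 1) + probe_timeout_s


def worst_case_failover_seconds(probe_interval_s, probe_timeout_s,
                                tolerated_failures, dns_ttl_s):
    """Detection time plus the time clients may keep a cached DNS answer."""
    detection = worst_case_detection_seconds(
        probe_interval_s, probe_timeout_s, tolerated_failures
    )
    return detection + dns_ttl_s


if __name__ == "__main__":
    # Values from the monitoring configuration listed above.
    detection = worst_case_detection_seconds(10, 5, 0)
    # dns_ttl_s=60 is an assumed, hypothetical TTL (not stated in the question).
    total = worst_case_failover_seconds(10, 5, 0, dns_ttl_s=60)
    print(f"detection window: up to {detection} s")
    print(f"failover window:  up to {total} s")
```

Under these assumptions, even with zero tolerated failures there is a window of several seconds in which requests routed to the stopped region can fail, which matches what we saw in the test.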
Please note: I am not asking about an Azure SLA that guarantees 100% availability; I know there is no such thing. I am looking for a design pattern, or modifications to the current design described above, that would help us achieve it.