I have an App Service plan that has scaled out to two instances. I pulled the statistics for both instances: one is at 100% CPU and the other is at 5% CPU. This is a problem because it looks like all HTTP requests are being sent to the instance at 100% CPU, so the web page loads very slowly.

I have turned off ARR Affinity as per this page.

Is there another reason why all API and HTTP requests are being sent to the same instance? What can I do to balance the load fairly between the two instances?

A_toaster

2 Answers

Is there another reason why all API and HTTP requests are being sent to the same instance?

If you have a load balancer in front of the app, session persistence can keep the same client pinned to the same virtual machine.
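One way to check whether requests really are pinned to one instance is to sample which instance answers each request. The snippet below is only a minimal sketch under some assumptions: it assumes you add a hypothetical endpoint (called `/instance` in the usage note) whose response body is the value of the `WEBSITE_INSTANCE_ID` environment variable on the instance that served the request. The function names are made up for illustration. `urllib.request.urlopen` does not keep cookies between calls, so no affinity cookie is replayed.

```python
from collections import Counter
from typing import Callable, Dict, Iterable
import urllib.request


def fetch_instance_id(url: str) -> str:
    """Return the instance ID reported by a single request.

    Assumption: the URL points at a hypothetical endpoint that returns
    the serving instance's WEBSITE_INSTANCE_ID as plain text.
    """
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8").strip()


def tally_instances(instance_ids: Iterable[str]) -> Dict[str, float]:
    """Map each instance ID to its share of the requests (0.0 to 1.0)."""
    counts = Counter(instance_ids)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {instance: n / total for instance, n in counts.items()}


def sample_distribution(
    url: str,
    requests: int = 50,
    fetch: Callable[[str], str] = fetch_instance_id,
) -> Dict[str, float]:
    """Send `requests` cookie-less requests and report the share per instance."""
    return tally_instances(fetch(url) for _ in range(requests))
```

Calling `sample_distribution("https://<your-app>.azurewebsites.net/instance")` with the placeholder replaced by your own host name gives a rough picture: if one instance ID takes nearly all of the share even without cookies, something other than client affinity is pinning the traffic.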

In the Free and Shared tiers, an app receives CPU minutes on a shared VM instance and cannot scale out. In other tiers, an app runs and scales as follows.

When you create an app in App Service, it is put into an App Service plan. When the app runs, it runs on all the VM instances configured in the App Service plan. If multiple apps are in the same App Service plan, they all share the same VM instances. If you have multiple deployment slots for an app, all deployment slots also run on the same VM instances. If you enable diagnostic logs, perform backups, or run WebJobs, they also use CPU cycles and memory on these VM instances.

In this way, the App Service plan is the scale unit of the App Service apps. If the plan is configured to run five VM instances, then all apps in the plan run on all five instances. If the plan is configured for autoscaling, then all apps in the plan are scaled out together based on the autoscale settings.

What can I do to balance the load fairly between the two instances?

You can use autoscale conditions to scale VMs based on a metric, such as 50% CPU per instance. For information on scaling out an app, see Scale instance count manually or automatically.
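To make the 50% CPU condition concrete, here is a small sketch of how a threshold rule on average CPU behaves with the numbers from the question. It is a simplified model, not Azure's actual evaluation logic: it assumes the rule compares the plain average of the per-instance CPU readings against the threshold, and `should_scale_out` is a hypothetical helper name.

```python
from statistics import mean
from typing import Sequence


def should_scale_out(cpu_percent_per_instance: Sequence[float],
                     threshold: float = 50.0) -> bool:
    """Return True when the average CPU across instances exceeds the threshold.

    Assumption: the rule looks at the simple average of the readings,
    one reading per running instance.
    """
    if not cpu_percent_per_instance:
        raise ValueError("at least one CPU reading is required")
    return mean(cpu_percent_per_instance) > threshold
```

With the readings from the question, `should_scale_out([100.0, 5.0])` averages to 52.5% and returns `True`. Note that this is a threshold on the average, so it says when to add instances, not how requests are spread between them.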

Nancy Xiong

So the problem fixed itself when I moved from P2v2 to S3. Both cost exactly the same; one is Premium tier, the other is Standard tier.

All requests are now being balanced evenly and the website response time has dropped from 10-12 seconds to 60 ms, as a good website should.

I'm still not sure why it happened in the first place.

A_toaster
  • Very interesting. I am having the same issue and have tried out your suggestion. Did you ever end up figuring out why it was happening in the first place? – Jurgen Cuschieri Jan 07 '21 at 14:00