Software engineer here. I don't have much experience managing servers, but I want to understand how auto-scale works.
Here's the background:
We have a stateless application running on Azure, which talks to an Azure SQL database behind the scenes. The database itself is geo-replicated across two regions.
We've set up auto-scale so that if server load exceeds 80%, we scale out and add an instance. When the load drops back below 50%, we scale back in. Scaling does not usually happen, but it does kick in during periods of peak usage.
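To make the rule concrete, this is roughly the decision as I understand it, written as a Python sketch. It is not our actual Azure configuration: the function name and the minimum/maximum instance limits are made up for illustration, and only the 80% and 50% thresholds come from our setup.

```python
def desired_instance_count(current, load_pct,
                           minimum=1, maximum=10,      # hypothetical limits
                           scale_out_at=80.0, scale_in_at=50.0):
    """Apply the scale rule once and return the new instance count."""
    if load_pct > scale_out_at:
        # Load is above the upper threshold: add one instance, capped at maximum.
        return min(current + 1, maximum)
    if load_pct < scale_in_at:
        # Load is below the lower threshold: remove one instance, never below minimum.
        return max(current - 1, minimum)
    # Between 50% and 80% nothing changes, which avoids flapping.
    return current


if __name__ == "__main__":
    print(desired_instance_count(1, 85.0))
    print(desired_instance_count(2, 65.0))
    print(desired_instance_count(2, 40.0))
```

Note that this only decides how many instances exist. It says nothing about how incoming requests get spread across them, which is the part I'm unsure about.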
Here's my question:
With auto-scale on, does Azure automatically handle load balancing between the instances? I understand that Azure also has dedicated load balancer products, but I'm trying to understand whether we need them.
Without a load balancer explicitly set up, is scaling our instances pointless?