AUTOSCALING IN APP SERVICE
Before we jump to autoscaling, let’s understand what types of applications we can host on Azure App Service.
App Service is used to host web applications, APIs, mobile app backends, SPAs, WebJobs, POCs and many more.
Applications that need to maintain state (for example, persistent data in memory or on local storage), that need deep operating system interaction, that are desktop based, or that do heavy background processing are not a good fit for Azure App Service, because it is stateless in nature.
There are still ways to maintain state when your application is deployed in Azure App Service: you may have to use Redis for user sessions, Blob Storage for persistent files, etc.
Note – Apps like CLI tools, Windows services, batch files, or scheduled PowerShell/Python scripts are better deployed on Azure Batch, Logic Apps, VMs, or sometimes on AKS.
We call it horizontal scaling when we add more instances of our application. The purpose of horizontal scaling is to maintain high availability, handle failover, and balance the load.
App Service gives you both options: EITHER manual scaling, where you set a fixed number of instances, OR autoscaling, where you define rules such as “if CPU is above 60%, add one more instance; if CPU drops to 10%, remove one instance.”
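Here is a rough sketch of those rules using the Azure CLI (the resource group, plan and autoscale-setting names below are hypothetical placeholders):

  # Create an autoscale setting for the App Service plan (1–5 instances, start with 2)
  az monitor autoscale create --resource-group my-rg --resource my-plan --resource-type Microsoft.Web/serverfarms --name my-autoscale --min-count 1 --max-count 5 --count 2

  # Scale out by 1 instance when average CPU stays above 60%
  az monitor autoscale rule create --resource-group my-rg --autoscale-name my-autoscale --condition "CpuPercentage > 60 avg 10m" --scale out 1

  # Scale in by 1 instance when average CPU drops below 10%
  az monitor autoscale rule create --resource-group my-rg --autoscale-name my-autoscale --condition "CpuPercentage < 10 avg 10m" --scale in 1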
We call it vertical scaling when we add more memory or compute power to our application. The purpose is to let the app handle heavier tasks.
For vertical scaling, your only option is to change the service tier you are currently on. There is no way to scale up and down based on rules.
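For example, moving the plan to a bigger SKU is a single CLI call (the plan name and SKU here are just placeholders):

  # Scale up (vertical) by moving the App Service plan to a larger tier
  az appservice plan update --resource-group my-rg --name my-plan --sku P1V2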
AUTOSCALING IN AKS
Let’s first understand what types of applications we host on AKS. Generally, banking or insurance applications with regulatory requirements like encryption, audit logs and fine-grained access control, or applications that need service discovery, load balancing or private clusters, are the ones preferred to be deployed on AKS.
In AKS, we do EITHER horizontal (pod-level) scaling OR cluster-level scaling.
What are a POD and a NODE?
A NODE in AKS is nothing but a virtual machine that runs the application PODs. Each NODE can run multiple PODs based on its CPU/RAM capacity.
A POD is nothing but a wrapper around your application container. A POD normally wraps 1 container, but sometimes there can be more than 1. These containers talk to each other via localhost.
I shared a LinkedIn post on Service Mesh & S2S Authentication; that is a perfect example of a POD with 2 containers.
Containers in the same POD share the network, storage, configuration and environment variables.
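As a small illustration (the pod and container names here are hypothetical), you can list the containers wrapped inside one pod and call one container from the other over the shared localhost network:

  # List the containers inside the pod
  kubectl get pod web-pod -o jsonpath='{.spec.containers[*].name}'

  # From the "app" container, reach the "sidecar" container over localhost
  # (assumes curl is available inside the app container)
  kubectl exec web-pod -c app -- curl -s http://localhost:9090/metrics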
What is HPA?
HPA is the Horizontal Pod Autoscaler, which can add more PODs when needed and remove PODs whenever demand goes down.
HPA keeps watching metrics like CPU, memory or any other rule you have set. As soon as a threshold is crossed, it creates a new POD, replicating everything that runs in the existing POD from the YAML file you created for the deployment. Now a replica of your application is running in the 2nd POD.
HPA only adds (and removes) PODs, not NODEs.
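A minimal way to try this, assuming a Deployment named web already exists (a hypothetical name):

  # Create an HPA: target 60% CPU, keep between 2 and 10 pods
  kubectl autoscale deployment web --cpu-percent=60 --min=2 --max=10

  # Watch the HPA react as load changes
  kubectl get hpa web --watch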
What is Cluster Autoscaler?
If a NODE (a virtual machine with a specific configuration) is FULL, meaning its CPU, memory and other resources are already running high, there is a high chance that new PODs cannot be scheduled on it. This is when the Cluster Autoscaler steps in, creates a new NODE for us, and the pending POD gets scheduled on the new NODE.
Cluster Autoscaler adds NODES.
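On AKS, the Cluster Autoscaler is enabled per node pool; here is a sketch with placeholder resource names:

  # Let AKS add/remove nodes in this pool (1 to 5) as pod demand changes
  az aks nodepool update --resource-group my-rg --cluster-name my-aks --name nodepool1 --enable-cluster-autoscaler --min-count 1 --max-count 5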
Why is Vertical Autoscaling not recommended?
Vertical autoscaling adjusts the CPU or memory for PODs. Sometimes applying the new values triggers a restart of the POD and affects availability, and then you won’t be able to check the exact performance of the POD. Moreover, it will add more memory or CPU but not remove it.
So HPA plus the Cluster Autoscaler is the best choice.