Autoscaling

Configure autoscaling in Azure Cloud Kit.

Horizontal Pod Autoscaler
Vertical Pod Autoscaler
Cluster Autoscaler
Set Resource Requests & Limits

The Azure Cloud Kit has three different types of autoscalers:

Horizontal Pod Autoscaler;
Vertical Pod Autoscaler; and
Cluster Autoscaler.

The following sections provide descriptions of each of these.

Horizontal Pod Autoscaler

The Horizontal Pod Autoscaler controls the scale of a Deployment and its ReplicaSet. It’s implemented as a Kubernetes API resource and a controller. It cannot be deployed with Terraform. It is deployed together with the application since scaling is dependent on the load requirements of the application.

The walkthrough example on Kubernetes’s site shows how to use horizontal pod autoscaling.

Vertical Pod Autoscaler

The Vertical Pod Autoscaler tries to set resource requests and limits on running containers based on past usage.

Cluster Autoscaler

The Azure Cluster Autoscaler component can watch for pods in your cluster that can’t be scheduled because of resource constraints. When issues are detected, the number of nodes in the node pool is increased to meet the application demand.

To enable autoscaling with Terraform, you need to set up these variables in the variables.tf file:

enable_auto_scaling: This should be set to true.
min_count: Set this to the minimum number of nodes in cluster.
max_count: Set this variable to the maximum number of nodes.

By default, Azure has quotas as to how many specific types of resources are permitted. For example, vCPU amounts can be limited to as low as 10 per subscription. This restricts the cluster size — unless you increase the quota. Quotas can be increased, though, in the portal Quotas section.

You can find more information about Microsoft Azure subscription limits on their tutorial page entitled, Azure Subscription and Service Limits, Quotas, and Constraints.

Set Resource Requests & Limits

Setting resource requests and limits is different for Vertical Pod Autoscaling.

For improved scaling, you should use resource requests and limits in your application deployments. You can find plenty of best practices guides: The right one for you depends on your application. Read more on Kubernetes' Resource Management for Pods and Containers page.

Here’s an example of how you might set resource requests and their limits:

Source code

yamlcontainers:
- name: prodcontainer1
  image: ubuntu
  resources:
    requests:
      memory: “64Mi”
      cpu: “300m”
    limits:
      memory: “128Mi”

More Information:

Docs