kubectl-autoscale
Manage Horizontal Pod Autoscalers
TLDR
Auto scale a deployment, applying a default autoscaling policy when no target CPU utilization is specified:
  kubectl autoscale deployment <name> --min=<min-pods> --max=<max-pods>
Auto scale a deployment with a target CPU utilization:
  kubectl autoscale deployment <name> --min=<min-pods> --max=<max-pods> --cpu-percent=<target-cpu-percent>
SYNOPSIS
kubectl autoscale (-f <filename> | deployment|rs|rc <name>) --max=<max-pods> [--min=<min-pods>] [--cpu-percent=<percent>] [options]
PARAMETERS
--cpu-percent int32
Target average CPU utilization over all pods, as a percentage of requested CPU (default -1). If omitted or negative, a default autoscaling policy is used.
--max int32
Upper limit for the number of pods; required, and must be at least --min.
--min int32
Lower limit for the number of pods. If omitted or negative, the server applies a default value.
--name string
Name of the autoscaler object (defaults to the name of the target resource).
-f, --filename strings
File identifying the resource to autoscale.
-o, --output string
Output format (e.g. json, yaml, name).
--save-config
If true, record the current object configuration in its annotation for later use with kubectl apply.
--dry-run none|client|server
Preview the object without persisting it: "client" prints it locally, "server" submits the request without storing the object.
DESCRIPTION
The kubectl autoscale command creates a HorizontalPodAutoscaler (HPA) resource that automatically adjusts the number of replicas in a Deployment, ReplicaSet, or ReplicationController based on observed metrics like CPU or memory utilization.
It enables horizontal scaling to maintain performance under varying loads by increasing or decreasing pod counts within specified min/max limits. For example, you can scale a deployment when average CPU usage exceeds 80%.
Resource metrics (CPU and memory) require the Metrics API, typically provided by metrics-server; custom and external metrics require an adapter such as the Prometheus adapter. Note that kubectl autoscale itself only sets a CPU utilization target; memory, custom, or external metrics require writing an HPA manifest (autoscaling/v2) directly. If an HPA already exists for the target, the command fails rather than updating it.
Ideal for production workloads needing elasticity without manual intervention.
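The replica count the HPA controller aims for follows the scaling rule documented in the Kubernetes autoscaling docs: desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric). A minimal Python sketch (function name illustrative; the real controller additionally applies tolerances, min/max clamping, and stabilization windows):

```python
import math

def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float) -> int:
    """Sketch of the HPA scaling rule:
    desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric).
    Tolerance bands, min/max clamping, and stabilization windows that the
    real controller applies are omitted here."""
    return math.ceil(current_replicas * current_metric / target_metric)

# 4 pods averaging 120% of their requested CPU, targeting 80%:
print(desired_replicas(4, 120, 80))  # -> 6
```

So a deployment running hot relative to its target is scaled up proportionally, then clamped to the configured --min/--max range.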
CAVEATS
Requires metrics-server or compatible provider for CPU/memory metrics.
HPA cannot scale below min or above max replicas.
Does not handle vertical scaling or node autoscaling.
Fails if an HPA already exists for the target; use kubectl edit or kubectl apply to change an existing autoscaler.
EXAMPLE
kubectl autoscale deployment nginx --min=2 --max=10 --cpu-percent=80
Creates an HPA that scales the nginx deployment between 2 and 10 pods, targeting 80% average CPU utilization.
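The example above produces an HPA object roughly equivalent to the manifest below, shown here in the autoscaling/v2 form (kubectl autoscale itself emits the older autoscaling/v1 shape; add --dry-run=client -o yaml to preview what it generates):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
```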
PREREQUISITES
Deploy metrics-server: kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Ensure workloads have CPU resource requests set; utilization is computed as a percentage of the requested value.
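Because utilization is measured against requests, each container in the autoscaled workload needs a CPU request. A minimal container spec fragment (values illustrative):

```yaml
containers:
- name: nginx
  image: nginx:1.25
  resources:
    requests:
      cpu: 100m      # with --cpu-percent=80, scaling triggers above ~80m average usage
      memory: 128Mi
```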
HISTORY
Introduced in Kubernetes v1.1 (2015) for CPU-based scaling; memory, custom, and external metrics were added through the autoscaling/v2 beta APIs, which graduated to stable (autoscaling/v2) in v1.23.
SEE ALSO
kubectl scale(1), kubectl get(1), kubectl describe(1)