VPA vs HPA in Kubernetes: Why Most Teams Choose the Wrong Autoscaler

Source: DEV Community
Most Kubernetes teams reach for HPA first. It's visible, familiar, and the CPU dashboard makes the decision feel obvious. When traffic spikes, pods scale out. Clean mental model.

The problem: HPA solves one specific failure mode, traffic-driven throughput degradation. An under-resourced pod doesn't need more replicas. It needs more CPU. More replicas of a starved pod just give you more starved pods.

The Core Distinction

HPA and VPA are not two ways to do the same thing. They scale different dimensions:

HPA (Horizontal Pod Autoscaler) scales replica count. Trigger: load (CPU, memory, custom metrics). Solves: traffic-driven saturation. Risk: cold start amplification, latency spikes during scale-out.

VPA (Vertical Pod Autoscaler) scales resource requests and limits. Trigger: resource efficiency gap. Solves: OOM kills, CPU throttling, mis-sized pods. Risk: eviction disruption, node fragmentation at scale.

HPA doesn't prevent OOM kills. VPA doesn't absorb traffic bursts. Applying the wron
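To make the two dimensions concrete, here is a minimal Python sketch under stated assumptions. `hpa_desired_replicas` follows the replica formula given in the Kubernetes HPA documentation, ceil(currentReplicas * currentMetricValue / desiredMetricValue), and leaves out the tolerance band and stabilization window the real controller applies. `PodSymptoms` and `suggest_autoscaler` are hypothetical names invented for illustration; they encode the symptom-to-autoscaler mapping described above and are not part of any Kubernetes API.

```python
import math
from dataclasses import dataclass


def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float) -> int:
    """Replica count from the documented HPA formula (simplified).

    Ignores the controller's tolerance band and stabilization window.
    """
    return math.ceil(current_replicas * current_metric / target_metric)


@dataclass
class PodSymptoms:
    """Hypothetical summary of what a workload is suffering from."""
    oom_kills: int           # OOM kills observed in the window
    cpu_throttled: bool      # sustained CPU throttling
    traffic_saturated: bool  # load-driven saturation on correctly sized pods


def suggest_autoscaler(symptoms: PodSymptoms) -> str:
    """Map symptoms to the dimension that addresses them.

    Assumption: sizing problems are checked first, because adding
    replicas of a starved pod only yields more starved pods.
    """
    if symptoms.oom_kills > 0 or symptoms.cpu_throttled:
        return "VPA"   # vertical: fix requests and limits
    if symptoms.traffic_saturated:
        return "HPA"   # horizontal: add replicas
    return "none"


if __name__ == "__main__":
    # 4 replicas at 90% average CPU against a 60% target
    print(hpa_desired_replicas(4, 90, 60))
    print(suggest_autoscaler(PodSymptoms(2, False, True)))
```

In this sketch a workload that is both OOM-killed and traffic-saturated is routed to VPA first. That ordering is a design choice drawn from the argument above, not a rule enforced by Kubernetes.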