Kubernetes Requests vs Limits: The Scheduler Guarantees One Thing. The Kernel Enforces Another.

Source: DEV Community
You set requests. You set limits. The pod still gets throttled — or killed. Not because Kubernetes is broken. Because requests and limits operate at two completely different layers of the stack — and most teams treat them as a single resource configuration. Here's what's actually happening:

The Scheduler Uses Requests Only. It Ignores Limits Entirely.

When a pod is created, the scheduler evaluates node capacity against resource requests and makes a placement decision. After that — it's done. It doesn't monitor the pod. It doesn't know what limits are set. It guaranteed placement, not performance.

The Kubelet + Kernel Enforce Limits Only. At Runtime. Under Pressure.

The kubelet continuously monitors container usage against configured limits and enforces them via cgroups. It doesn't know what the scheduler decided. It watches usage and reacts when thresholds are crossed.

These two systems share no state. A pod can be perfectly placed and still get throttled or killed at runtime — because
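The two-layer split becomes concrete when you look at what actually lands in the cgroup filesystem. Below is a rough Python sketch of the arithmetic: requests become a relative scheduling weight (no cap), while limits become a hard CFS quota and a memory ceiling. The formulas mirror the commonly documented cgroup v1 mappings; the function names are illustrative, not kubelet APIs.

```python
# Sketch of how CPU/memory requests and limits map onto cgroup knobs.
# Assumption: the classic cgroup v1-style mappings (cpu.shares, CFS quota,
# memory limit in bytes); treat this as illustrative, not kubelet source.

CFS_PERIOD_US = 100_000  # default CFS scheduling period in microseconds

def cpu_request_to_shares(milli_cpu: int) -> int:
    # Request -> cpu.shares: a relative weight that only matters under
    # CPU contention. A request never caps usage.
    return max(2, milli_cpu * 1024 // 1000)

def cpu_limit_to_quota_us(milli_cpu: int) -> int:
    # Limit -> CFS quota: a hard cap in microseconds of CPU time per period.
    # Exceeding it means kernel throttling, even if the node has idle CPU.
    return milli_cpu * CFS_PERIOD_US // 1000

def memory_limit_to_bytes(mebibytes: int) -> int:
    # Memory limit: crossing it invokes the kernel OOM killer, not Kubernetes.
    return mebibytes * 1024 * 1024

# A container requesting 500m CPU, limited to 1 CPU and 256Mi memory:
print(cpu_request_to_shares(500))   # → 512
print(cpu_limit_to_quota_us(1000))  # → 100000
print(memory_limit_to_bytes(256))   # → 268435456
```

Note that the scheduler never reads any of these values back: it consumed the request at placement time, while the kernel enforces the quota and memory ceiling continuously afterward.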