SLO-safe density & overcommit

Kernel enforcement makes tight packing safe; the optional L1 layer converts that safety into capacity. Its two components, the density-aware scheduler plugin and the overcommit webhook, are both off by default (complement mode), and each is one Helm flag away.

Complement mode (the default)

Out of the box, Temper does not touch placement. Karpenter, Cast AI, Cluster Autoscaler, or the stock scheduler decide where pods go; Temper decides who gets the CPU once they share a node. This is deliberate: the node engine needs only Kubernetes-native inputs, so it works under any placer, and your existing tool’s aggressive mode stops being scary — the blast radius of a wrong prediction becomes batch throughput, not the p99 of a revenue service.

This composition has been measured with Karpenter doing the autoscaling in both arms of the comparison: with Temper in place, the tighter packing held at equal load and SLO (deep dive: sideloading).

# complement mode is simply the default:
helm install temper deploy/helm/temper -n temper --create-namespace
# chart defaults: scheduler.enabled=false, webhook.enabled=false

The density-aware scheduler plugin

When you do want Temper placing pods, the scheduler plugin extends kube-scheduler with awareness of L0’s enforcement state. It reads the per-tier load annotations every agent publishes (pod counts and CPU millicores per tier) and scores nodes by where protection capacity actually exists: contention-aware scoring for latency-critical pods, packing-aware scoring for batch.

helm upgrade temper deploy/helm/temper -n temper --reuse-values \
  --set scheduler.enabled=true

The overcommit webhook

Kubernetes bin-packs by declared CPU requests, which are usually padded. The overcommit webhook shrinks that padding where it is safe:

Opt-in per namespace — the webhook only acts in namespaces you label temper.codes/overcommit=enabled. Everything else is untouched.
Requests only, never limits — the pod’s ceiling is unchanged; only the scheduler’s packing input shrinks by the configured factor.
Never the Critical tier — latency-critical pods keep their full requests.
Reversible and audited — every mutation records the original value in an annotation on the pod, so reverting is mechanical and the change history is inspectable.

helm upgrade temper deploy/helm/temper -n temper --reuse-values \
  --set webhook.enabled=true

kubectl label namespace batch temper.codes/overcommit=enabled
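The "requests only, never limits" rule is easiest to see with numbers. The sketch below is purely illustrative: shrink_request_millis is a hypothetical helper, not part of Temper, and it assumes the configured factor is a multiplier applied to the declared request (0.75 here, written as a percentage). The actual factor and its semantics come from your webhook configuration.

```shell
# Hypothetical helper, not part of Temper: illustrates the arithmetic only.
# Usage: shrink_request_millis <request_millicores> <factor_percent>
shrink_request_millis() {
  request_millis="$1"
  factor_percent="$2"
  # Integer math: the new request is the declared request scaled by the factor.
  echo $(( request_millis * factor_percent / 100 ))
}

# A batch pod declaring a 2000m request and a 4000m limit, assumed factor 0.75:
shrink_request_millis 2000 75   # prints 1500; the 4000m limit is not touched
```

Under these assumptions, a node with 8000m allocatable fits five such pods by request instead of four, while each pod can still burst to its unchanged limit.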

Why this is safe with Temper underneath and reckless without it: overcommitting requests means more runnable threads per node, which under CFS translates directly into tail-latency damage for latency-critical pods. With the kernel fence in place, the extra pressure lands on the Open layers, so batch absorbs it and the protected tiers do not.

Measured results

The packing, consolidation, and Karpenter-composition measurements — with their methodology and the caveats that travel with each number — are collected in deep dive: sideloading (“Turning protection into capacity”).

Undoing it

Remove the namespace label to stop new mutations. Already-admitted pods keep their reduced requests until they are replaced; each one carries its original request values in an annotation, and pods created by the next rollout come up with the original values again. The whole layer can be stood down by setting webhook.enabled=false and scheduler.enabled=false, without touching L0 enforcement.
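As a minimal sketch, the stand-down can be scripted as below. It assumes the release name, chart path, and release namespace from the install example, and the batch namespace from the labelling example; the run wrapper and the stand_down_l1 function are illustrative names, not part of Temper. By default the script only prints the commands; set APPLY=1 to execute them against the current kube context.

```shell
# Illustrative sketch; adjust NAMESPACE and RELEASE_NS to your cluster.
NAMESPACE="${NAMESPACE:-batch}"      # namespace labelled for overcommit
RELEASE_NS="${RELEASE_NS:-temper}"   # namespace the Helm release lives in

# Print the command unless APPLY=1, in which case execute it.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

stand_down_l1() {
  # 1. Stop new mutations: a trailing "-" on the key removes the label.
  run kubectl label namespace "$NAMESPACE" temper.codes/overcommit-

  # 2. Disable both optional components; L0 enforcement is left running.
  run helm upgrade temper deploy/helm/temper -n "$RELEASE_NS" --reuse-values \
    --set webhook.enabled=false --set scheduler.enabled=false
}

stand_down_l1
```

Pods admitted while the webhook was active are not changed by these commands; they are replaced with unmodified requests as their workloads roll.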