Resource Limits
Requests & Limits
spec:
  containers:
    - name: app
      image: myapp:v1
      resources:
        requests:
          cpu: "250m"       # 0.25 CPU cores
          memory: "256Mi"   # guaranteed allocation
        limits:
          cpu: "1000m"      # 1 CPU core max
          memory: "512Mi"   # hard limit; OOMKilled if exceeded
# CPU units:
# 1 CPU = 1000m (millicores)
# 0.5 CPU = 500m
# Memory units:
# Mi = mebibytes (1024^2 bytes)
# Gi = gibibytes (1024^3 bytes)
# M = megabytes (1000^2 bytes); avoid ambiguity, use Mi
# CPU throttling: container is throttled (not killed) when over limit
# Memory OOM: container is killed when it exceeds memory limit
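The unit rules above can be checked with a small conversion sketch. The helper names `cpu_to_millicores` and `memory_to_bytes` are hypothetical, not part of any Kubernetes client, and only a subset of the quantity suffixes is handled.

```python
from decimal import Decimal

# Subset of Kubernetes quantity suffixes (assumption: enough for this cheat sheet).
# Binary suffixes are powers of 1024, decimal suffixes are powers of 1000.
_MEMORY_SUFFIXES = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "250m", "1" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "256Mi" or "256M" to bytes."""
    # Try two-character suffixes first so "Mi" is not mistaken for "M".
    for suffix in sorted(_MEMORY_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            number = Decimal(quantity[: -len(suffix)])
            return int(number * _MEMORY_SUFFIXES[suffix])
    return int(quantity)  # a bare number is already in bytes
```

For example, `memory_to_bytes("512Mi")` and `memory_to_bytes("512M")` differ by roughly 24.9 million bytes, which is why mixing the two suffixes in one manifest is worth avoiding.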
QoS Classes
| QoS Class | Condition | Eviction Priority |
|---|---|---|
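| Guaranteed | Every container sets CPU and memory requests and limits, and requests == limits | Last (safest) |
| Burstable | Not Guaranteed, but at least one container sets a CPU or memory request or limit | Medium |
| BestEffort | No container sets any CPU or memory request or limit | First (evicted first) |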
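The table can be read as a decision procedure. The following is a minimal sketch of that procedure, not the kubelet's implementation: `qos_class` is a hypothetical helper, containers are plain dicts, and quantities are compared as literal strings (so "1" and "1000m" would be treated as different).

```python
def qos_class(containers):
    """Classify a pod from a list of {"requests": {...}, "limits": {...}} dicts."""
    any_set = False
    guaranteed = True
    for container in containers:
        requests = dict(container.get("requests", {}))
        limits = container.get("limits", {})
        # Kubernetes copies a limit into the request when only the limit is given.
        for resource in ("cpu", "memory"):
            if resource not in requests and resource in limits:
                requests[resource] = limits[resource]
        if requests or limits:
            any_set = True
        # Guaranteed needs both resources limited, with request equal to limit.
        for resource in ("cpu", "memory"):
            if resource not in limits or requests.get(resource) != limits[resource]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"
```

Under this sketch, the `app` container from the Requests & Limits example (250m/256Mi requested, 1000m/512Mi limit) makes its pod Burstable, while a container that sets only limits for both resources ends up Guaranteed.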
LimitRange
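apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
    - type: Container
      default:            # limits applied when a container sets none
        cpu: "500m"
        memory: "256Mi"
      defaultRequest:     # requests applied when a container sets none
        cpu: "100m"
        memory: "128Mi"
      max:                # largest value a container may set
        cpu: "2"
        memory: "2Gi"
      min:                # smallest value a container may set
        cpu: "50m"
        memory: "64Mi"
    - type: Pod           # bounds on the sum across all containers in a pod
      max:
        cpu: "4"
        memory: "4Gi"
    - type: PersistentVolumeClaim
      max:
        storage: 50Gi
      min:
        storage: 1Gi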
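How the Container entry of this LimitRange fills in and validates a container's resources can be sketched as follows. This is a simplified model, not the admission plugin itself: `apply_container_limit_range` is a hypothetical helper, and values are plain integers (millicores for CPU, MiB for memory) taken from the manifest above.

```python
# Container-level values from the LimitRange above, as millicores and MiB.
LIMIT_RANGE = {
    "default": {"cpu": 500, "memory": 256},
    "defaultRequest": {"cpu": 100, "memory": 128},
    "max": {"cpu": 2000, "memory": 2048},
    "min": {"cpu": 50, "memory": 64},
}


def apply_container_limit_range(resources, limit_range=LIMIT_RANGE):
    """Return the effective requests/limits, or raise ValueError if out of range."""
    requests = dict(resources.get("requests", {}))
    limits = dict(resources.get("limits", {}))
    for res in ("cpu", "memory"):
        had_limit = res in limits
        if not had_limit:
            limits[res] = limit_range["default"][res]
        if res not in requests:
            # An explicit limit without a request: the request copies the limit.
            requests[res] = limits[res] if had_limit else limit_range["defaultRequest"][res]
        if requests[res] < limit_range["min"][res]:
            raise ValueError(f"{res} request below LimitRange min")
        if limits[res] > limit_range["max"][res]:
            raise ValueError(f"{res} limit above LimitRange max")
        if requests[res] > limits[res]:
            raise ValueError(f"{res} request exceeds its limit")
    return {"requests": requests, "limits": limits}
```

One consequence worth noting: a container that requests 512 MiB of memory but sets no limit inherits the 256 MiB default limit, so its request exceeds its limit and the sketch rejects it.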
ResourceQuota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    # Compute
    requests.cpu: "10"
    requests.memory: "20Gi"
    limits.cpu: "20"
    limits.memory: "40Gi"
    # Objects
    pods: "50"
    services: "20"
    persistentvolumeclaims: "20"
    secrets: "50"
    configmaps: "50"
    # Storage
    requests.storage: "500Gi"
# Check quota usage
kubectl describe resourcequota production-quota -n production
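The Used/Hard columns that `kubectl describe resourcequota` prints amount to simple bookkeeping. The sketch below shows that arithmetic for a new pod. `exceeded_quota` is a hypothetical helper, the usage figures are made up for illustration, and values are integers (millicores, MiB, object counts) instead of Kubernetes quantity strings.

```python
# A few hard limits from production-quota, as millicores / MiB / counts.
HARD = {"requests.cpu": 10_000, "requests.memory": 20 * 1024, "pods": 50}


def exceeded_quota(used, new_pod, hard=HARD):
    """Return the quota keys that admitting new_pod would push past the hard limit."""
    exceeded = []
    for key, limit in hard.items():
        if used.get(key, 0) + new_pod.get(key, 0) > limit:
            exceeded.append(key)
    return exceeded


# Illustrative (made-up) namespace usage and a pod like the earlier example.
used = {"requests.cpu": 9_800, "requests.memory": 4_096, "pods": 12}
new_pod = {"requests.cpu": 250, "requests.memory": 256, "pods": 1}
blocked_by = exceeded_quota(used, new_pod)
```

Here `blocked_by` is `["requests.cpu"]`: 9800m + 250m exceeds the 10-core limit even though memory and pod count still have headroom.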
HPA (Horizontal Pod Autoscaler)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: AverageValue
          averageValue: "400Mi"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5min before scaling down
      policies:
        - type: Percent
          value: 25
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0     # scale up immediately
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
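The replica count this HPA aims for follows the documented formula desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue), evaluated per metric with the largest result winning. The sketch below models only that formula, the default 10% tolerance, and the min/max clamp. The function names are hypothetical, and the `behavior` policies and stabilization windows are deliberately left out.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Apply the HPA ratio formula for one metric."""
    ratio = current_value / target_value
    # Within the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


def hpa_recommendation(current_replicas, metrics, min_replicas=2, max_replicas=20):
    """metrics is a list of (current_value, target_value) pairs, one per metric."""
    proposals = [
        desired_replicas(current_replicas, current, target)
        for current, target in metrics
    ]
    # The largest proposal wins, then the result is clamped to the HPA bounds.
    return max(min_replicas, min(max_replicas, max(proposals)))
```

With 4 replicas at 90% average CPU (target 70) and 300Mi average memory (target 400Mi), CPU proposes ceil(4 * 90 / 70) = 6 and memory proposes 3, so `hpa_recommendation(4, [(90, 70), (300, 400)])` gives 6.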
VPA (Vertical Pod Autoscaler)
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Auto"   # Auto / Recreate / Initial / Off
  resourcePolicy:
    containerPolicies:
      - containerName: myapp
        minAllowed:
          cpu: "50m"
          memory: "64Mi"
        maxAllowed:
          cpu: "4"
          memory: "4Gi"
        controlledResources: ["cpu", "memory"]
# View VPA recommendations
kubectl describe vpa myapp-vpa