Resource Limits

Requests & Limits

spec: containers: - name: app image: myapp:v1 resources: requests: cpu: "250m" # 0.25 CPU cores memory: "256Mi" # guaranteed allocation limits: cpu: "1000m" # 1 CPU core max memory: "512Mi" # hard limit; OOMKilled if exceeded # CPU units: # 1 CPU = 1000m (millicores) # 0.5 CPU = 500m # Memory units: # Mi = mebibytes (1024^2 bytes) # Gi = gibibytes (1024^3 bytes) # M = megabytes (1000^2 bytes) — avoid ambiguity, use Mi # CPU throttling: container is throttled (not killed) when over limit # Memory OOM: container is killed when it exceeds memory limit

QoS Classes

QoS ClassConditionEviction Priority
Guaranteedrequests == limits for all containersLast (safest)
BurstableAt least one container has requests != limitsMedium
BestEffortNo requests or limits setFirst (evicted first)

LimitRange

apiVersion: v1 kind: LimitRange metadata: name: default-limits namespace: production spec: limits: - type: Container default: cpu: "500m" memory: "256Mi" defaultRequest: cpu: "100m" memory: "128Mi" max: cpu: "2" memory: "2Gi" min: cpu: "50m" memory: "64Mi" - type: Pod max: cpu: "4" memory: "4Gi" - type: PersistentVolumeClaim max: storage: 50Gi min: storage: 1Gi

ResourceQuota

apiVersion: v1 kind: ResourceQuota metadata: name: production-quota namespace: production spec: hard: # Compute requests.cpu: "10" requests.memory: "20Gi" limits.cpu: "20" limits.memory: "40Gi" # Objects pods: "50" services: "20" persistentvolumeclaims: "20" secrets: "50" configmaps: "50" # Storage requests.storage: "500Gi" # Check quota usage kubectl describe resourcequota production-quota -n production

HPA (Horizontal Pod Autoscaler)

apiVersion: autoscaling/v2 kind: HorizontalPodAutoscaler metadata: name: myapp-hpa spec: scaleTargetRef: apiVersion: apps/v1 kind: Deployment name: myapp minReplicas: 2 maxReplicas: 20 metrics: - type: Resource resource: name: cpu target: type: Utilization averageUtilization: 70 - type: Resource resource: name: memory target: type: AverageValue averageValue: "400Mi" behavior: scaleDown: stabilizationWindowSeconds: 300 # wait 5min before scaling down policies: - type: Percent value: 25 periodSeconds: 60 scaleUp: stabilizationWindowSeconds: 0 # scale up immediately policies: - type: Percent value: 100 periodSeconds: 15

VPA (Vertical Pod Autoscaler)

apiVersion: autoscaling.k8s.io/v1 kind: VerticalPodAutoscaler metadata: name: myapp-vpa spec: targetRef: apiVersion: apps/v1 kind: Deployment name: myapp updatePolicy: updateMode: "Auto" # Auto / Recreate / Initial / Off resourcePolicy: containerPolicies: - containerName: myapp minAllowed: cpu: "50m" memory: "64Mi" maxAllowed: cpu: "4" memory: "4Gi" controlledResources: ["cpu", "memory"] # View VPA recommendations kubectl describe vpa myapp-vpa