Resource Limits

Requests & Limits

spec:
  containers:
    - name: app
      image: myapp:v1
      resources:
        requests:
          cpu: "250m"       # 0.25 CPU cores
          memory: "256Mi"   # amount the scheduler reserves on the node
        limits:
          cpu: "1000m"      # 1 CPU core max
          memory: "512Mi"   # hard limit; OOMKilled if exceeded

# CPU units:
# 1 CPU = 1000m (millicores)
# 0.5 CPU = 500m

# Memory units:
# Mi = mebibytes (1024^2 bytes)
# Gi = gibibytes (1024^3 bytes)
# M  = megabytes (1000^2 bytes); prefer Mi to avoid ambiguity

# CPU throttling: container is throttled (not killed) when over limit
# Memory OOM: container is killed when it exceeds memory limit
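
To make the unit rules above concrete, here is a small Python sketch that converts CPU and memory quantity strings into millicores and bytes. The helper names `cpu_to_millicores` and `memory_to_bytes` are hypothetical, and the code covers only the suffixes listed in this section, not the full Kubernetes quantity grammar.

```python
# Hypothetical helpers; they handle only the suffixes shown in this section.
MEMORY_SUFFIXES = {
    "Ki": 1024,
    "Mi": 1024**2,
    "Gi": 1024**3,
    "k": 1000,
    "M": 1000**2,
    "G": 1000**3,
}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "250m" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "256Mi" or "512M" to bytes."""
    # Check two-letter binary suffixes before one-letter decimal ones.
    for suffix in sorted(MEMORY_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * MEMORY_SUFFIXES[suffix]
    return int(quantity)  # a plain integer is already a byte count


print(cpu_to_millicores("250m"))   # 250
print(cpu_to_millicores("0.5"))    # 500
print(memory_to_bytes("512Mi"))    # 536870912
print(memory_to_bytes("512M"))     # 512000000
```

The last two lines show why Mi is the safer choice: "512M" is roughly 4.6% smaller than "512Mi".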

QoS Classes

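QoS Class	Condition	Eviction Priority
Guaranteed	Every container sets CPU and memory requests equal to its limits	Last (safest)
Burstable	At least one container sets a CPU or memory request or limit, but the pod does not qualify as Guaranteed	Medium
BestEffort	No container sets any CPU or memory request or limit	First (evicted first)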
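
The classification can be sketched in Python as follows. This is a simplified model, not the kubelet's implementation: `qos_class` is a hypothetical helper that takes one `resources` dict per container, looks only at CPU and memory, and compares quantities as plain strings (so "1" and "1000m" would count as different).

```python
def qos_class(containers: list[dict]) -> str:
    """Classify a pod from its containers' `resources` dicts (simplified)."""
    guaranteed = True
    any_set = False
    for resources in containers:
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        if requests or limits:
            any_set = True
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # An omitted request defaults to the limit when a limit is set.
            request = requests.get(name, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


equal = {"requests": {"cpu": "500m", "memory": "256Mi"},
         "limits": {"cpu": "500m", "memory": "256Mi"}}
burst = {"requests": {"cpu": "250m", "memory": "256Mi"},
         "limits": {"cpu": "1000m", "memory": "512Mi"}}

print(qos_class([equal]))         # Guaranteed
print(qos_class([burst]))         # Burstable
print(qos_class([equal, {}]))     # Burstable: one container sets nothing
print(qos_class([{}]))            # BestEffort
```

Note the third case: a single container without requests or limits is enough to drop an otherwise Guaranteed pod to Burstable.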

LimitRange

apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
    - type: Container
      default:
        cpu: "500m"
        memory: "256Mi"
      defaultRequest:
        cpu: "100m"
        memory: "128Mi"
      max:
        cpu: "2"
        memory: "2Gi"
      min:
        cpu: "50m"
        memory: "64Mi"
    - type: Pod
      max:
        cpu: "4"
        memory: "4Gi"
    - type: PersistentVolumeClaim
      max:
        storage: 50Gi
      min:
        storage: 1Gi
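
As a rough illustration of how the Container defaults above are filled in at admission time, the Python sketch below applies `default` and `defaultRequest` to a container's `resources`. The function name `apply_limit_range_defaults` is hypothetical, and the model is deliberately simplified: it does not check `min`/`max` or the Pod and PersistentVolumeClaim entries.

```python
# Values copied from the Container entry of the LimitRange above.
DEFAULT_LIMITS = {"cpu": "500m", "memory": "256Mi"}
DEFAULT_REQUESTS = {"cpu": "100m", "memory": "128Mi"}


def apply_limit_range_defaults(resources: dict) -> dict:
    """Return a copy of a container's resources with defaults filled in."""
    requests = dict(resources.get("requests", {}))
    limits = dict(resources.get("limits", {}))
    for name in ("cpu", "memory"):
        had_limit = name in limits
        if not had_limit:
            limits[name] = DEFAULT_LIMITS[name]
        if name not in requests:
            # An explicit limit with no request makes the request equal
            # to that limit; otherwise defaultRequest applies.
            requests[name] = limits[name] if had_limit else DEFAULT_REQUESTS[name]
    return {"requests": requests, "limits": limits}


# Container with no resources at all: gets both sets of defaults.
print(apply_limit_range_defaults({}))

# Container that sets only a memory limit: the memory request follows it.
print(apply_limit_range_defaults({"limits": {"memory": "1Gi"}}))
```

A container that ends up with values outside `min`/`max` after defaulting would be rejected, which this sketch does not model.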

ResourceQuota

apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    # Compute
    requests.cpu: "10"
    requests.memory: "20Gi"
    limits.cpu: "20"
    limits.memory: "40Gi"
    # Objects
    pods: "50"
    services: "20"
    persistentvolumeclaims: "20"
    secrets: "50"
    configmaps: "50"
    # Storage
    requests.storage: "500Gi"

# Check quota usage
kubectl describe resourcequota production-quota -n production

HPA (Horizontal Pod Autoscaler)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: AverageValue
          averageValue: "400Mi"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # scale down only to the highest recommendation of the last 5min
      policies:
        - type: Percent
          value: 25
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0    # scale up immediately
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
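
The replica count the HPA aims for follows the documented rule desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The Python sketch below works through that arithmetic for the CPU target above. The function `desired_replicas` is hypothetical; it assumes the default 10% tolerance and clamps to minReplicas/maxReplicas, but it does not model the `behavior` policies or the stabilization window.

```python
import math


def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float, min_replicas: int = 2,
                     max_replicas: int = 20, tolerance: float = 0.1) -> int:
    """Replica count for one metric, before `behavior` rate limits apply."""
    ratio = current_value / target_value
    # Within the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(4, 90, 70))    # 6  (4 * 90/70 = 5.14, rounded up)
print(desired_replicas(4, 35, 70))    # 2  (halved, and at minReplicas)
print(desired_replicas(4, 72, 70))    # 4  (within tolerance, no change)
print(desired_replicas(10, 200, 70))  # 20 (29 wanted, capped at maxReplicas)
```

When several metrics are configured, as with CPU and memory here, the HPA computes a count per metric and uses the largest one.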

VPA (Vertical Pod Autoscaler)

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    # Avoid pairing with an HPA that scales the same workload on CPU or memory
    updateMode: "Auto"   # Auto / Recreate / Initial / Off
  resourcePolicy:
    containerPolicies:
      - containerName: myapp
        minAllowed:
          cpu: "50m"
          memory: "64Mi"
        maxAllowed:
          cpu: "4"
          memory: "4Gi"
        controlledResources: ["cpu", "memory"]

# View VPA recommendations
kubectl describe vpa myapp-vpa
