K8s Troubleshooting
Quick Diagnostic Commands
# Check pod status
kubectl get pods -n <namespace> -o wide
# Describe pod (events and conditions)
kubectl describe pod <pod-name> -n <namespace>
# View logs from the previous (crashed) container instance
kubectl logs <pod-name> -c <container-name> -n <namespace> --previous
# Interactive shell in running pod
kubectl exec -it <pod-name> -n <namespace> -- /bin/sh
# Check resource usage (requires metrics-server)
kubectl top pods -n <namespace>
kubectl top nodes
Common Issues & Solutions
CrashLoopBackOff
The container keeps crashing, and Kubernetes restarts it with an increasing back-off delay.
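kubectl logs <pod> --previous   # Logs from the last crashed container
kubectl describe pod <pod>      # Exit code and events
# Exit code 137 = killed by SIGKILL (128 + 9), typically OOMKilled
# Exit code 1   = application error
# Exit code 126 = command found but not executable
# Exit code 127 = command not found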
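The exit code can also be read directly from the pod status rather than scanned out of the `describe` output. The following is a minimal sketch, not a definitive tool: the helper names `explain_exit_code` and `last_exit_code` are hypothetical, and the jsonpath assumes the first container in the pod is the one of interest.

```shell
#!/bin/sh
# Hypothetical helper: map a container exit code to its likely meaning.
explain_exit_code() {
  case "$1" in
    0)   echo "completed successfully" ;;
    1)   echo "application error" ;;
    126) echo "command found but not executable" ;;
    127) echo "command not found" ;;
    137) echo "SIGKILL (128+9), often OOMKilled" ;;
    143) echo "SIGTERM (128+15)" ;;
    *)   echo "unknown, check application logs" ;;
  esac
}

# Hypothetical helper: exit code of the first container's last terminated state.
# Assumes index 0 is the container of interest; adjust for multi-container pods.
last_exit_code() {
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
}
```

Usage: `explain_exit_code "$(last_exit_code <pod> <namespace>)"`.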
ImagePullBackOff / ErrImagePull
The kubelet cannot pull the container image.
# Causes: wrong image name or tag, auth issue, registry unreachable
kubectl describe pod <pod> | grep -A 5 Events
# Fix for auth issues: create an imagePullSecret, then reference it under imagePullSecrets in the pod spec
kubectl create secret docker-registry regcred \
  --docker-server=registry.example.com \
  --docker-username=user \
  --docker-password=pass
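Creating the secret has no effect until pods reference it. One way to do that, sketched here under assumptions, is to attach the secret to a service account so that new pods using that account pull with it. The helper names `pull_secret_patch` and `attach_pull_secret` are hypothetical, and the `default` service account and namespace are assumed defaults. The secret must live in the same namespace as the pods.

```shell
#!/bin/sh
# Hypothetical helper: build the JSON patch that references a pull secret.
pull_secret_patch() {
  printf '{"imagePullSecrets": [{"name": "%s"}]}' "$1"
}

# Hypothetical helper: attach the secret to a service account.
# Pods created afterwards with that account use the secret;
# existing pods must be recreated to pick it up.
attach_pull_secret() {
  secret="$1"
  sa="${2:-default}"
  ns="${3:-default}"
  kubectl patch serviceaccount "$sa" -n "$ns" -p "$(pull_secret_patch "$secret")"
}
```

Usage: `attach_pull_secret regcred default <namespace>`, then delete the failing pod so its controller recreates it.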
Pending Pod
# Check events for scheduling failures
kubectl describe pod <pod> | grep -A 20 Events
# Common causes:
# 1. Insufficient CPU/memory:
kubectl describe nodes | grep -A 5 Allocatable
# 2. No node matches the nodeSelector, or a node taint is not tolerated:
kubectl get nodes --show-labels
kubectl describe nodes | grep -i taints
# 3. PVC not bound:
kubectl get pvc -n <namespace>
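When the affected pod is not yet known, field selectors can narrow the search before running the checks above. This is a rough sketch with hypothetical helper names (`pending_pods`, `pod_events`); the `default` namespace fallback is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: list Pending pods across all namespaces.
pending_pods() {
  kubectl get pods --all-namespaces --field-selector=status.phase=Pending
}

# Hypothetical helper: events for a single pod, sorted by last timestamp,
# so the most recent scheduling message appears at the bottom.
pod_events() {
  kubectl get events -n "${2:-default}" \
    --field-selector "involvedObject.name=$1" \
    --sort-by=.lastTimestamp
}
```

Usage: run `pending_pods`, then `pod_events <pod> <namespace>` for each result.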
OOMKilled
The container exceeded its memory limit and was killed.
# Exit code 137 = killed by OOM
kubectl describe pod <pod> | grep -i oom
# Fix: increase memory limit in deployment
resources:
  requests:
    memory: "128Mi"
  limits:
    memory: "512Mi"  # increase this
# Monitor:
kubectl top pods --sort-by=memory
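The diagnosis and the fix can both be done from the command line without editing the manifest by hand. The sketch below uses hypothetical helper names (`last_termination_reason`, `raise_memory_limit`) and assumes the first container in the pod is the one that was killed. Note that `kubectl set resources` changes the live deployment and triggers a rollout, so the manifest in version control should be updated to match.

```shell
#!/bin/sh
# Hypothetical helper: reason for the first container's last termination.
# Prints OOMKilled when the container hit its memory limit.
last_termination_reason() {
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
}

# Hypothetical helper: set a new memory limit on all containers of a deployment.
# Arguments: deployment name, new limit (e.g. 768Mi), namespace (optional).
raise_memory_limit() {
  kubectl set resources deployment "$1" -n "${3:-default}" --limits="memory=$2"
}
```

Usage: `last_termination_reason <pod> <namespace>` to confirm the cause, then `raise_memory_limit <deployment> 768Mi <namespace>`.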