When etcd's database grows past its quota, the API server rejects writes and kubectl commands fail. Fix: etcdctl compact followed by etcdctl defrag:

etcdctl compact $(etcdctl endpoint status --write-out="json" | jq '.[0].Status.header.revision')
etcdctl defrag

Deployments keep a revision history, so a bad release can be reverted with the kubectl rollout undo mechanism.

StatefulSets give each Pod a stable network identity (e.g., web-0.web.default.svc.cluster.local) and per-Pod storage via volumeClaimTemplates. When a Pod is rescheduled, it reattaches to the same PVC.

NodePort exposes a Service on every node at NodeIP:NodePort. Mostly for dev/testing or when you control external routing.

# Example Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
spec:
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /v1
        pathType: Prefix
        backend:
          service:
            name: api-v1-svc
            port:
              number: 80
      - path: /v2
        pathType: Prefix
        backend:
          service:
            name: api-v2-svc
            port:
              number: 80
An Ingress Controller (nginx, AWS ALB Ingress Controller, Traefik, Istio) must be deployed to implement the Ingress resource — the resource is just configuration; the controller is the actual proxy.

Debugging a Service that isn't routing traffic:

Step 1 — Check Endpoints: kubectl get endpoints my-service. If ENDPOINTS is <none>, the Service selector doesn't match any Pod labels. Compare the selector in kubectl get svc my-service -o yaml against kubectl get pods --show-labels.

Step 2 — Check Pod health: kubectl get pods -l app=my-app and kubectl describe pod <pod-name>. Are Pods in Running state? Are readiness probes passing? A Pod not passing readiness is removed from Endpoints automatically.

Step 3 — Bypass the Service: kubectl exec -it debug-pod -- curl http://<pod-ip>:<port> bypasses Service/kube-proxy and isolates whether the app itself is responding.
Step 4 — Verify targetPort matches the port the container is actually listening on. Always check kubectl get endpoints first — if it's empty, that's your answer.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-allow-frontend
spec:
  podSelector:
    matchLabels:
      app: api              # applies to api pods
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend     # only allow from frontend pods
    ports:
    - port: 8080
Important: NetworkPolicies are enforced by the CNI plugin — if your CNI doesn't support them (e.g., Flannel), the resources exist but have no effect.
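NetworkPolicies are additive allow-lists, so a common baseline is a namespace-wide default-deny policy that specific allow rules (like api-allow-frontend above) then punch holes in. A minimal sketch:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}            # empty selector = all Pods in the namespace
  policyTypes: ["Ingress"]   # Ingress listed with no rules = nothing allowed in
```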
CoreDNS runs as a Deployment in the kube-system namespace. Each Pod's /etc/resolv.conf points to the CoreDNS ClusterIP.

Services resolve as <service-name>.<namespace>.svc.cluster.local; within the same namespace, plain my-service works — the search domains in resolv.conf expand it automatically. Pods resolve as <pod-ip-dashes>.<namespace>.pod.cluster.local (e.g., 10-244-1-5.default.pod.cluster.local).

Debugging DNS:
kubectl exec -it debug-pod -- nslookup my-service
kubectl exec -it debug-pod -- cat /etc/resolv.conf
kubectl describe pod <pod-name> — check the Events section at the bottom; the scheduler emits a specific message:

0/3 nodes are available: 3 Insufficient memory — all nodes lack the requested CPU/memory. Fix: scale the cluster, reduce requests, or check whether requests are set unreasonably high.
3 node(s) had taint that the pod didn't tolerate — the Pod needs a toleration, or the nodes need the taint removed.
nodeSelector or nodeAffinity rules don't match any node — check node labels: kubectl get nodes --show-labels.
An unbound PVC can also block scheduling — check kubectl get pvc.

"I always start with kubectl describe pod — that message tells me exactly what constraint is failing, so I don't need to guess."

# Taint a node (mark it as GPU-only)
kubectl taint nodes gpu-node-1 hardware=gpu:NoSchedule

# Pod toleration
tolerations:
- key: "hardware"
  operator: "Equal"
  value: "gpu"
  effect: "NoSchedule"

Taint effects:
NoSchedule — new Pods without the toleration won't be scheduled here; existing Pods stay.
PreferNoSchedule — the scheduler tries to avoid the node but will use it if there's no other option.
NoExecute — evicts existing Pods that don't have the toleration, in addition to blocking new ones.
Control-plane nodes carry a default taint (node-role.kubernetes.io/master:NoSchedule).
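The nodeAffinity constraints mentioned above look like this in a Pod spec — a sketch, where the disktype=ssd node label is a made-up example:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: disktype          # hypothetical node label
          operator: In
          values: ["ssd"]
```

Unlike a taint (which repels Pods from a node), affinity attracts Pods to nodes — they solve opposite halves of the placement problem.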
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults
spec:
  limits:
  - type: Container
    default:              # applied if container sets no limits
      cpu: 500m
      memory: 256Mi
    defaultRequest:       # applied if container sets no requests
      cpu: 100m
      memory: 128Mi
    max:
      cpu: "2"
      memory: 1Gi
ResourceQuota — caps the total resources consumed by all objects in a namespace. Prevents a single team from starving the cluster.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    pods: "50"
    services.loadbalancers: "2"
Use LimitRange to set per-Pod guardrails. Use ResourceQuota to enforce per-namespace budgets in multi-tenant clusters.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  storageClassName: gp3       # references a StorageClass
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 20Gi
Access modes: ReadWriteOnce (one node, read-write), ReadOnlyMany (many nodes, read-only), ReadWriteMany (many nodes, read-write — requires NFS/EFS).
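For context, the gp3 StorageClass referenced by such a PVC might look like this — a sketch assuming the AWS EBS CSI driver; provisioner and parameters vary by platform:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com              # assumes the EBS CSI driver
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer   # provision only once a Pod is scheduled
allowVolumeExpansion: true
```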
Secret values injected as environment variables can leak through logs and child processes, and their references show up in kubectl describe pod. Prefer mounting Secrets as files in a tmpfs volume.

Volume types:
emptyDir — ephemeral directory created when the Pod starts, deleted when it ends. Lives on the node's disk (or in memory with medium: Memory — it becomes a tmpfs). Perfect for sidecar patterns: a log-collector sidecar reads from the same emptyDir that the main app writes logs to.
configMap / secret — mount cluster resources as files.
projected — combine multiple sources (configMap, secret, serviceAccountToken, downwardAPI) into one mount point.

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: deployment-manager
  namespace: default
rules:
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "create", "update"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ci-runner-deployments
  namespace: default
subjects:
- kind: ServiceAccount
  name: ci-runner
  namespace: default
roleRef:
  kind: Role
  name: deployment-manager
  apiGroup: rbac.authorization.k8s.io
Every Pod runs as a ServiceAccount — by default the namespace's default ServiceAccount, but you should create dedicated ones per workload. Each container gets /var/run/secrets/kubernetes.io/serviceaccount/token containing a short-lived JWT (bound service account token, introduced in K8s 1.21). For cloud IAM integration, set serviceAccountName: my-app-sa on the Pod and annotate the ServiceAccount (e.g., on EKS: eks.amazonaws.com/role-arn: arn:aws:iam::123:role/MyRole).
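Putting those pieces together, a dedicated ServiceAccount with the EKS IRSA annotation might look like this (account ID and role name are placeholders):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app-sa
  annotations:
    # EKS IRSA: maps this ServiceAccount to an AWS IAM role
    eks.amazonaws.com/role-arn: arn:aws:iam::123:role/MyRole
```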
Pod Security Admission is enabled with namespace labels:

labels:
  pod-security.kubernetes.io/enforce: restricted
  pod-security.kubernetes.io/warn: restricted

For advanced policy enforcement, use OPA/Gatekeeper or Kyverno: admission webhooks that intercept API requests and enforce custom policies (e.g., require that all images come from a specific registry, that resource limits are set, or that specific labels are present).
securityContext:
  runAsNonRoot: true               # K8s rejects the Pod if the image defaults to root
  runAsUser: 1000                  # run as a specific UID
  runAsGroup: 3000
  fsGroup: 2000                    # GID for volume ownership
  readOnlyRootFilesystem: true     # prevent writes to the container FS
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]                  # drop all Linux capabilities
    add: ["NET_BIND_SERVICE"]      # add back only what's needed
Most of these can be set at the Pod level (applies to all containers) and overridden at the container level; a few (capabilities, readOnlyRootFilesystem, allowPrivilegeEscalation) are container-level only, while fsGroup is Pod-level only. At a minimum, set runAsNonRoot: true.
Custom admission webhooks are registered via MutatingWebhookConfiguration / ValidatingWebhookConfiguration resources pointing to an HTTPS endpoint (your own webhook server, or a tool like Kyverno/Gatekeeper). Caveat: with failurePolicy: Fail, if the webhook is down, all matching API requests fail — which can break cluster operations. Set failurePolicy: Ignore for non-critical webhooks, or ensure the webhook is highly available.
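A minimal sketch of such a registration — names, namespace, and path are placeholder assumptions:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: require-labels
webhooks:
- name: require-labels.example.com
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Ignore        # non-critical: don't block API requests if down
  clientConfig:
    service:
      name: policy-webhook     # hypothetical in-cluster webhook Service
      namespace: policy-system
      path: /validate
  rules:
  - apiGroups: ["apps"]
    apiVersions: ["v1"]
    operations: ["CREATE", "UPDATE"]
    resources: ["deployments"]
```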
Liveness probe — if it fails repeatedly (more than failureThreshold times in a row), the kubelet restarts the container. Use it for detecting deadlocks and stuck processes that appear to be running but aren't making progress. Probe mechanisms: httpGet, tcpSocket, exec (run a command inside the container).
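A sketch of how these probe mechanisms fit on a container — the /healthz path and port 8080 are assumptions:

```yaml
livenessProbe:
  httpGet:
    path: /healthz             # assumed health endpoint
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
  failureThreshold: 3          # restart after 3 consecutive failures
readinessProbe:
  tcpSocket:
    port: 8080                 # only checks the port accepts connections
  periodSeconds: 5
```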
Step 1 — Check crash logs:
kubectl logs <pod> --previous     # logs from the last crash

Step 2 — Check events:
kubectl describe pod <pod> — look at the exit code in the container status section:
1 — application error (check app logs)
127 — command not found (check image/entrypoint)
137 — OOM killed (SIGKILL); increase the memory limit
139 — segmentation fault
143 — SIGTERM; graceful shutdown was requested but the container exited non-zero

Step 3 — Keep the container alive and poke around:
kubectl run debug --image=my-app:tag --command -- sleep 3600
kubectl exec -it debug -- /bin/sh
Check environment variables with kubectl exec <pod> -- env | grep MY_VAR.

Metrics sources:
kube-state-metrics — Deployment/Pod/Node status metrics
node-exporter — per-node CPU, memory, disk, network
kubelet /metrics and /metrics/cadvisor — container resource usage
your app's own /metrics endpoint (instrument with a Prometheus client library)

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-api-pdb
spec:
  minAvailable: 2        # at least 2 pods must always be available
  # OR
  maxUnavailable: 1      # at most 1 pod can be down at once
  selector:
    matchLabels:
      app: my-api
When you need one: any stateless service with multiple replicas that you want to remain available during node maintenance, K8s upgrades, or Cluster Autoscaler scale-downs. Without a PDB, kubectl drain can evict all Pods of a Deployment at once — causing an outage even if you have multiple replicas.
HPA scaling formula: desiredReplicas = ceil(currentReplicas × currentValue / targetValue)

kubectl autoscale deployment my-app \
  --min=2 --max=20 --cpu-percent=60

Requirements: Pods must have CPU requests set — HPA uses requests as the denominator for CPU utilization %.

Rolling-update tuning:
maxUnavailable — max Pods that can be down at once (default 25%)
maxSurge — max extra Pods above the desired count during a rollout (default 25%)

Helm lets one chart drive multiple environments (e.g., postgres-dev and postgres-prod) by overriding values with -f values.yaml or --set key=value:

helm install my-app ./charts/my-app \
  -f values-prod.yaml \
  --set image.tag=v1.2.3 \
  --namespace production

Helm also tracks release history — helm rollback my-app 1 reverts to a previous revision.
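The imperative kubectl autoscale command shown earlier can also be expressed declaratively — a sketch, assuming the same Deployment name and targets:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60   # % of the Pod's CPU request
```

The declarative form is preferable in GitOps pipelines, and autoscaling/v2 also supports memory and custom metrics.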
Anti-pattern: helm upgrade --install with image.tag=latest in a production pipeline — you can no longer tell which image a revision actually deployed, and rollbacks become meaningless.

Init containers run to completion, in order, before the app containers start. Common uses: waiting for a dependency (until nc -z db-service 5432; do sleep 2; done) or pre-populating a shared emptyDir volume.

initContainers:
- name: wait-for-db
  image: busybox
  command: ['sh', '-c',
    'until nc -z db-svc 5432; do echo waiting; sleep 2; done']
Init containers can have different images and security contexts from the main container — useful for privileged setup steps that the main app doesn't need.
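The sidecar-plus-emptyDir pattern mentioned earlier can be sketched like this — image names and mount paths are illustrative:

```yaml
volumes:
- name: logs
  emptyDir: {}                 # shared scratch space, lives as long as the Pod
containers:
- name: app
  image: my-app:1.0            # hypothetical image; writes logs to /var/log/app
  volumeMounts:
  - name: logs
    mountPath: /var/log/app
- name: log-collector
  image: fluent/fluent-bit     # hypothetical collector image
  volumeMounts:
  - name: logs
    mountPath: /var/log/app
    readOnly: true             # the sidecar only reads what the app writes
```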
For zero-downtime rollouts, set maxUnavailable: 0 (never take a Pod down before a new one is ready) and maxSurge: 1 (one extra Pod at a time). If shutdown is slow, raise terminationGracePeriodSeconds to allow time. Add a preStop lifecycle hook with a short sleep to let kube-proxy drain the endpoint before SIGTERM. The sleep (e.g., sleep 5) is critical — there's a race condition between kube-proxy updating iptables rules and the Pod being removed from Endpoints. The sleep bridges that gap and prevents 502s during a rollout.

Step 1 — Inspect the node:
kubectl get nodes
kubectl describe node <node-name>
Check
Conditions — look for MemoryPressure, DiskPressure, PIDPressure, NetworkUnavailable.

Step 2 — On the node itself:
systemctl status kubelet
journalctl -u kubelet -n 100 --no-pager
Common causes: kubelet crashed, certificate expired, containerd unhealthy, disk full, out of memory (OOM on the node itself).
Step 3 — Cordon and drain:
kubectl cordon <node>     # stop new pods scheduling here
kubectl drain <node> --ignore-daemonsets --delete-emptydir-data

Step 4 — Replace: in cloud environments, terminate the instance — the ASG or Karpenter will provision a fresh replacement node.
When a node stops reporting, it is marked NotReady and its Pods' status becomes Unknown. After 5 minutes (the default eviction timeout), eviction begins and Pods are rescheduled to healthy nodes.
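That 5-minute window is implemented as default tolerations: Pods tolerate node.kubernetes.io/not-ready and node.kubernetes.io/unreachable with tolerationSeconds: 300. A latency-sensitive workload can shorten it — a sketch:

```yaml
tolerations:
- key: node.kubernetes.io/unreachable
  operator: Exists
  effect: NoExecute
  tolerationSeconds: 30    # evict after 30s instead of the 300s default
```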
kubectl describe node <node> | grep -A5 Conditions
kubectl get events --field-selector reason=Evicted

Fixes: free disk space, prune unused images, and cap container log growth (--container-log-max-size on the kubelet).

Check etcd health:
etcdctl endpoint health
etcdctl endpoint status
etcd disk exhaustion or quorum loss causes the API server to stop accepting writes.
Check the API server process:
systemctl status kube-apiserver
# or for kubeadm clusters:
crictl ps | grep apiserver

Check API server resource usage: is the API server being OOM-killed? Check node memory on the control-plane nodes. Check request load: watch apiserver_request_total metrics for rate spikes. Check certificate expiry: kubeadm certs check-expiration.

Cost attribution: enforce team: and env: labels. Use Kubecost or OpenCost to attribute cluster cost to namespaces/teams, and feed the results into internal chargeback or showback dashboards.
"kubectl get pods showed 3 of 6 pods in CrashLoopBackOff. kubectl describe pod showed exit code 137 — OOM killed. We rolled back with kubectl rollout undo and the Pods recovered immediately. Then we temporarily increased memory limits while the dev team patched the leak."