How to Fix Docker Out of Memory (OOM) on Kubernetes Pods


This guide provides a direct approach to troubleshooting and resolving “Docker Out of Memory (OOM)” issues affecting Kubernetes pods.


1. The Root Cause

Containers in a Kubernetes pod can be assigned memory limits. When the application inside a container consumes more memory than its configured limit, the Linux kernel’s OOM killer terminates the process and the container is reported as OOMKilled (exit code 137). This means the workload’s memory demands exceed its allocated capacity, either because the limit is set too low or because the application itself is over-consuming memory.
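
To confirm that OOM is actually the cause before raising any limits, inspect the container's last terminated state. The pod name below is a placeholder for the affected pod:

# Inspect the last terminated state of the pod's containers.
# An OOM kill is reported as Reason: OOMKilled with Exit Code: 137.
kubectl describe pod <pod-name> -n <namespace> | grep -A 5 "Last State:"

# The same information via jsonpath:
kubectl get pod <pod-name> -n <namespace> \
  -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'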

2. Quick Fix (CLI)

To quickly mitigate an ongoing OOM issue, edit the Kubernetes Deployment or StatefulSet directly and increase the container’s memory limit.

# 1. Identify the problematic pod and its parent deployment or statefulset.
#    Replace <namespace> with your Kubernetes namespace if not 'default'.
kubectl get pods -n <namespace>
kubectl get deployments -n <namespace>
kubectl get statefulsets -n <namespace>

# 2. Edit the manifest of the relevant deployment (e.g., 'my-app-deployment').
#    This command opens the YAML definition in your default text editor (e.g., vi, nano).
kubectl edit deployment my-app-deployment -n <namespace>

# 3. Inside the editor, navigate to the 'resources' section for your application's container
#    and increase the 'memory' limit. For example, change '512Mi' to '1Gi' or '2Gi'.
#
#    Look for a structure similar to this:
#    spec:
#      template:
#        spec:
#          containers:
#          - name: my-container-name
#            image: your-repo/your-app:latest
#            resources:
#              limits:
#                memory: "1Gi"  # <--- INCREASE THIS VALUE
#              requests:
#                memory: "512Mi"
#
# 4. Save and exit the editor. Kubernetes will automatically roll out new pods
#    with the updated memory limits.
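
If you prefer a non-interactive change, kubectl set resources can apply the same update in one command. The Deployment and container names below are the same placeholders used above; adjust the sizes to suit your workload:

# Raise the memory limit (and request) on a single container without opening an editor.
kubectl set resources deployment my-app-deployment -n <namespace> \
  -c my-container-name --limits=memory=1Gi --requests=memory=512Mi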

3. Configuration Check

For a permanent fix that stays under version control, update the source YAML manifest (e.g., deployment.yaml or statefulset.yaml) for your application.

# In your application's deployment.yaml or statefulset.yaml:
apiVersion: apps/v1
kind: Deployment # or StatefulSet
metadata:
  name: my-app-deployment
spec:
  template:
    spec:
      containers:
      - name: my-container-name
        image: your-repo/your-app:latest
        resources:
          limits:
            memory: "1Gi"   # Increase this value (e.g., from 512Mi to 1Gi)
            cpu: "1"        # (Optional) Adjust CPU limits as necessary
          requests:
            memory: "512Mi" # Ensure requests are set to a reasonable baseline
            cpu: "500m"     # (Optional) Adjust CPU requests
        # ... other container configurations

After modifying the file, apply the changes to your Kubernetes cluster:

kubectl apply -f your-deployment.yaml -n <namespace>
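
Then watch the rollout to confirm the updated pods come up cleanly (the Deployment name is the same placeholder used above):

# Wait for the rollout of the updated Deployment to complete.
kubectl rollout status deployment/my-app-deployment -n <namespace>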

4. Verification

Confirm that the memory limits have been successfully applied and the OOM issue is resolved.

# 1. Monitor the status of the new pods. They should transition to "Running" and remain stable.
kubectl get pods -n <namespace> -w

# 2. Describe a newly created pod to verify the updated memory limits are in effect.
#    Replace <new-pod-name> with the actual name of one of the healthy pods.
kubectl describe pod <new-pod-name> -n <namespace> | grep -A 5 "Limits:"

# Expected output (confirm the increased memory limit under 'Limits:'):
#   Limits:
#     cpu:     1
#     memory:  1Gi
#   Requests:
#     cpu:     500m
#     memory:  512Mi

# 3. Review the application logs for stability and the absence of further OOM errors.
kubectl logs -f <new-pod-name> -n <namespace>
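
If the metrics-server is installed in your cluster, you can also compare actual memory usage against the new limit and confirm that the restart count stops growing; <new-pod-name> is the same placeholder as above:

# Show per-container CPU and memory usage (requires metrics-server).
kubectl top pod <new-pod-name> -n <namespace> --containers

# Confirm the container is no longer restarting after the change.
kubectl get pod <new-pod-name> -n <namespace> \
  -o jsonpath='{.status.containerStatuses[0].restartCount}'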