DevNote: AWS EKS "not enough IPs for pods"


IP Exhaustion Investigation: Pods Stuck in Pending Status

Problem Description

Development teams reported that pods on the Kubernetes cluster are stuck in Pending status due to problems with IP address assignment. Initial investigation suggests this may be related to IP address exhaustion in the VPC, preventing new pods from being scheduled.
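
A quick way to confirm the symptom cluster-wide is to filter events for sandbox-creation failures, which is where the VPC CNI surfaces IP assignment errors (the exact event reason can vary by CNI version):

# List events for pods whose sandbox creation failed, typical for IP assignment errors
kubectl get events --all-namespaces --field-selector reason=FailedCreatePodSandBox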

Investigation Steps and Findings

1. Identifying Affected Pods

First, I identified all pods stuck in Pending status to understand the scope of the issue:

# Find all pods in Pending status and save the list for later processing
kubectl get pods --all-namespaces -o json \
  | jq -r '.items[] | select(.status.phase=="Pending") | .metadata.namespace + "/" + .metadata.name' \
  | tee stuck-pods.txt

Findings:

  • Approximately XX pods were found in Pending status across multiple namespaces
  • Common error in events: “no available IP addresses” (an example of inspecting a single pod follows)
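
To see the full error for any one pod, its events can be inspected directly; pod-name and namespace-name below are placeholders:

# Inspect the events of one affected pod (substitute values from stuck-pods.txt)
kubectl describe pod pod-name -n namespace-name | grep -A 10 Events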

2. Checking VPC CNI Configuration

Examined the current AWS VPC CNI plugin configuration to understand IP allocation settings:

# Get environment variables from aws-node daemonset
kubectl describe daemonset aws-node -n kube-system | grep -A 20 Environment

# Alternative command to get all environment variables in JSON format
kubectl get daemonset aws-node -n kube-system -o jsonpath='{.spec.template.spec.containers[0].env}' | jq .

Findings:

  • ENABLE_PREFIX_DELEGATION was not set (it defaults to false), so every pod consumed a full ENI secondary IP
  • AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG was set to false (custom networking not in use)
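
Prefix Delegation requires VPC CNI v1.9.0 or later and Nitro-based instances, so the plugin version is worth checking at this stage:

# Check the VPC CNI plugin version (Prefix Delegation requires v1.9.0+)
kubectl describe daemonset aws-node -n kube-system | grep amazon-k8s-cni: | cut -d ':' -f 3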

3. Analyzing IP Address Utilization

Checked the current IP address utilization across nodes:

# Check ENI and IP address allocation on a node via the ipamd introspection endpoint
kubectl exec -n kube-system ds/aws-node -c aws-node -- curl -s http://localhost:61679/v1/enis | jq .

# Get IP usage statistics from the ipamd Prometheus metrics endpoint
kubectl exec -n kube-system ds/aws-node -c aws-node -- curl -s http://localhost:61678/metrics | grep awscni

Findings:

  • Most nodes were at or near their maximum IP allocation
  • Available IPs across the cluster: XX out of total capacity YY (per-node limits can be cross-checked as shown below)
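
In secondary-IP mode the ceiling comes from the instance type's ENI and IP-per-ENI limits, which kubelet mirrors in its allocatable pod count. A quick cross-check; node-name is a placeholder:

# Compare each node's allocatable pod slots against what is actually scheduled on it
kubectl get nodes -o custom-columns='NAME:.metadata.name,MAXPODS:.status.allocatable.pods'
kubectl get pods --all-namespaces --no-headers --field-selector spec.nodeName=node-name | wc -l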

4. Solution Implementation: Enabling Prefix Delegation

Based on findings, determined that enabling Prefix Delegation would significantly improve IP address utilization:

# Enable Prefix Delegation on the VPC CNI (custom networking is a separate feature and stays false)
kubectl set env daemonset aws-node -n kube-system ENABLE_PREFIX_DELEGATION=true

# Keep one spare /28 prefix warm per node so pod startup does not wait on prefix allocation
kubectl set env daemonset aws-node -n kube-system WARM_PREFIX_TARGET=1

Prefix Delegation assigns a /28 IPv4 prefix (16 addresses) to each ENI slot instead of a single secondary IP, dramatically increasing the IP capacity of each node without any VPC or subnet changes.
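
Prefix Delegation only works on Nitro-based instance types. A quick check from the AWS CLI; m5.large here is just an example instance type:

# Confirm the node's instance type runs on the Nitro hypervisor (required for Prefix Delegation)
aws ec2 describe-instance-types --instance-types m5.large \
  --query 'InstanceTypes[].{Type:InstanceType,Hypervisor:Hypervisor}'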

5. Cluster Node Restart

Prefixes are only assigned to ENI slots allocated after the change, so existing nodes had to be recycled for the new configuration to take effect:

# Cordon and drain nodes one by one
kubectl cordon node-name
kubectl drain node-name --ignore-daemonsets --delete-emptydir-data

# After node restarts, uncordon
kubectl uncordon node-name
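
Kubelet caps pods per node independently of the CNI, so the extra IP capacity only helps if max-pods is raised as well. AWS publishes a calculator script for this; the instance type and CNI version below are placeholders:

# Compute the new per-node pod limit with AWS's max-pods calculator
curl -O https://raw.githubusercontent.com/awslabs/amazon-eks-ami/master/files/max-pods-calculator.sh
chmod +x max-pods-calculator.sh
./max-pods-calculator.sh --instance-type m5.large --cni-version 1.9.0-eksbuild.1 --cni-prefix-delegation-enabled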

6. Verification

After implementing the solution:

# Verify new configuration is applied
kubectl describe daemonset aws-node -n kube-system | grep -A 20 Environment

# Check if pending pods are now being scheduled
kubectl get pods --all-namespaces | grep Pending
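
To confirm the CNI is really attaching prefixes, the node's ENIs can also be inspected from the AWS side; the instance ID below is a placeholder:

# Check that /28 prefixes are now assigned to a node's ENIs
aws ec2 describe-network-interfaces \
  --filters Name=attachment.instance-id,Values=i-0123456789abcdef0 \
  --query 'NetworkInterfaces[].Ipv4Prefixes'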

Conclusion

The root cause of pods being stuck in Pending status was IP address exhaustion: nodes had hit the secondary-IP limits of their ENIs, leaving the CNI no addresses to assign to new pods. Enabling Prefix Delegation in the AWS VPC CNI plugin significantly increased per-node IP capacity without requiring VPC or subnet changes.

With Prefix Delegation, each ENI slot carries a /28 prefix (16 addresses) instead of a single secondary IP, improving IP capacity per slot by up to 16x. After the nodes were recycled, the pending pods were scheduled and the cluster returned to normal operation.

Future Recommendations

  1. Monitor IP address utilization regularly (a starting point is sketched below)
  2. Implement alerting for IP address utilization thresholds
  3. Document Prefix Delegation as a standard configuration for new clusters
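
For the first recommendation, the ipamd Prometheus metrics are a convenient signal to alert on; the metric names below come from the VPC CNI's metrics endpoint and may vary slightly between plugin versions:

# Rough utilization check; scrape these into Prometheus or CloudWatch for threshold alerting
kubectl exec -n kube-system ds/aws-node -c aws-node -- curl -s http://localhost:61678/metrics \
  | grep -E 'awscni_(assigned_ip_addresses|total_ip_addresses)'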