Common issues and troubleshooting
Troubleshooting guide for PerfectScale self-hosted environment
Explore step-by-step guidance for troubleshooting common pod failures and infrastructure-related issues in a self-hosted PerfectScale deployment.
Pod failure troubleshooting
After installing all Helm charts, follow these steps to diagnose and resolve pod issues.
Check the pod status
Check the status of all pods in your namespace.
kubectl get pods -n <namespace_name>
Look for pods with status indicators such as Error, CrashLoopBackOff, ImagePullBackOff, or Pending.
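In a busy namespace it can help to narrow that list to the pods worth investigating. The following is a minimal sketch, not part of any PerfectScale tooling: a hypothetical helper named `unhealthy_pods` that reads the table printed by `kubectl get pods` from standard input and keeps the rows that are not `Running` with all containers ready. Pods in the `Completed` state are skipped on the assumption that they are finished jobs.

```shell
# Print rows of `kubectl get pods` output that look unhealthy.
# Reads the table from stdin, so it also works on saved output.
unhealthy_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")            # READY column, e.g. "1/2"
    if ($3 == "Completed") next      # assumed to be finished jobs
    if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
  }'
}

# Usage (requires cluster access):
#   kubectl get pods -n <namespace_name> | unhealthy_pods
```

A pod that is `Running` but shows, for example, `1/2` in the READY column is also listed, because at least one of its containers is not ready.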
Examine the pod logs
For pods that show errors, examine the logs for detailed error messages.
kubectl logs -n <namespace_name> <pod_name>
For multi-container pods, specify the container name.
kubectl logs -n <namespace_name> <pod_name> -c <container_name>
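When a container keeps restarting, the current instance may have little output. The `--previous` flag of `kubectl logs` prints the logs of the prior instance instead. The sketch below pairs it with a hypothetical filter, `error_lines`, that keeps only lines that look like errors. The keyword list is an assumption and may need adjusting to the application's log format.

```shell
# Keep only log lines that look like errors, prefixed with their line number.
# The keyword list is an assumption, not a PerfectScale convention.
error_lines() {
  grep -inE 'error|fatal|panic|exception'
}

# Usage (requires cluster access):
#   kubectl logs -n <namespace_name> <pod_name> --previous | error_lines
```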
Analyze the pod events and details
Get comprehensive information about problematic pods.
kubectl describe pod <pod_name> -n <namespace_name>
Focus on:
State: The container's current state, including any error reason or message.
Last state: The container's previous state, which shows why it terminated if restarts have occurred.
Ready: Indicates whether the pod passed readiness probes.
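The output of `kubectl describe pod` is long, so the fields above are easy to miss. As a rough sketch, the hypothetical helper `container_states` below keeps only the per-container status lines (State, Last State, Reason, Exit Code, Ready, Restart Count). It assumes the indented `Field: value` layout that `kubectl describe` prints.

```shell
# Keep only the per-container status fields from `kubectl describe pod` output.
# Matches indented "Field:" lines, so top-level lines such as "Name:" are skipped.
container_states() {
  grep -E '^ +(State|Last State|Reason|Exit Code|Ready|Restart Count):'
}

# Usage (requires cluster access):
#   kubectl describe pod <pod_name> -n <namespace_name> | container_states
```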
Common pod failure scenarios and solutions
ImagePullBackOff
👉🏻 Reason: Container registry access issues or the image does not exist.
💡 How to solve:
Verify container registry credentials.
Verify the image name and tag for accuracy.
Ensure network connectivity with the registry.
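One quick check on the image name is whether it carries an explicit tag or digest, since a reference without one resolves to `latest`. The sketch below uses a hypothetical helper, `image_has_tag`, that looks only at the last path segment so that a registry port (for example `:5000`) is not mistaken for a tag.

```shell
# Succeed if an image reference carries an explicit tag or digest.
image_has_tag() {
  name=${1##*/}            # drop registry host (and port) and repository path
  case "$name" in
    *:*|*@*) return 0 ;;   # "image:tag" or "image@sha256:..."
    *)       return 1 ;;
  esac
}

# Usage (requires cluster access):
#   for image in $(kubectl get pod <pod_name> -n <namespace_name> \
#       -o jsonpath='{.spec.containers[*].image}'); do
#     image_has_tag "$image" || echo "no explicit tag: $image"
#   done
```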
CrashLoopBackOff
👉🏻 Reason: The application crashes immediately after starting.
💡 How to solve:
Check application logs.
Verify the environment variables and configuration.
Ensure sufficient resource allocation.
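The exit code reported under Last State in `kubectl describe pod` often hints at which of these checks to start with. By convention, codes above 128 mean the process was terminated by a signal (code minus 128), and 137 (SIGKILL) is frequently the result of an out-of-memory kill. The helper name `explain_exit_code` and its messages are illustrative only.

```shell
# Translate a container exit code into a rough hint about the cause.
explain_exit_code() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "exited normally"
  elif [ "$code" -eq 137 ]; then
    echo "killed by SIGKILL (signal 9), often the out-of-memory killer"
  elif [ "$code" -gt 128 ]; then
    echo "terminated by signal $((code - 128))"
  else
    echo "application error, check the logs"
  fi
}

# Usage:
#   explain_exit_code 143
```

A hint pointing at SIGKILL suggests reviewing memory limits first, while a low non-zero code usually points back to the application logs and configuration.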
Resource constraints
👉🏻 Reason: Insufficient CPU and/or memory.
💡 How to solve:
Check the availability of node resources.
kubectl describe node <node_name>
Verify the pod's resource requests and limits.
kubectl get pod <pod_name> -n <namespace_name> -o yaml | grep -A 5 resources
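In the `kubectl describe node` output, the part most relevant to scheduling is the "Allocated resources" section, which shows how much of the node's CPU and memory is already requested. The following sketch uses a hypothetical helper, `allocated_resources`, to print only that section. It assumes the section is followed by the "Events" section, as in current `kubectl` output.

```shell
# Print only the "Allocated resources" section of `kubectl describe node` output.
allocated_resources() {
  awk '/^Allocated resources:/ { show = 1 }   # start of the section
       /^Events:/              { show = 0 }   # next top-level section
       show'
}

# Usage (requires cluster access):
#   kubectl describe node <node_name> | allocated_resources
```

If the requested percentages are close to 100, a `Pending` pod may simply not fit on that node.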
Network and DNS troubleshooting
General network diagnostics
Verify service connectivity.
kubectl get svc -n <namespace_name>
List the network policies that may be restricting traffic.
kubectl get networkpolicies -n <namespace_name>
Check ingress resources.
kubectl get ingress -n <namespace_name>
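A service can exist and still route nowhere if no ready pod matches its selector. As an additional check that goes beyond the commands above, `kubectl get endpoints` lists the pod addresses behind each service and prints `<none>` when there are none. The helper `services_without_endpoints` is a hypothetical name for a small filter over that output.

```shell
# List services whose ENDPOINTS column is empty in `kubectl get endpoints` output.
services_without_endpoints() {
  awk 'NR > 1 && $2 == "<none>" { print $1 }'
}

# Usage (requires cluster access):
#   kubectl get endpoints -n <namespace_name> | services_without_endpoints
```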
AWS-specific network configuration
Ensure that a valid domain name is configured in the Route53 hosted zone.
Verify the necessary DNS records: NS, CNAME, and A records.
Configure the AWS Certificate Manager (ACM) for the issuance of domain certificates.
External AWS configuration resources:
Helm chart troubleshooting
PerfectScale utilizes Helm charts for service deployment. Follow these steps to troubleshoot Helm installations.
Check Helm release status
List all Helm releases in your namespace.
helm list -n <namespace_name>
Review the STATUS column for each release. A status of deployed indicates successful deployment.
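Helm can filter the list itself with flags such as `helm list --failed` or `helm list --pending`. To see every release that is not in the deployed state at once, a small filter is enough. The sketch below uses a hypothetical helper, `non_deployed_releases`, and assumes the default table output of `helm list`, where columns are separated by tabs and STATUS is the fifth column.

```shell
# Print the name and status of releases that are not "deployed".
# Assumes the tab-separated table printed by `helm list` (STATUS is column 5).
non_deployed_releases() {
  awk -F'\t' 'NR > 1 {
    name = $1; status = $5
    gsub(/^ +| +$/, "", name)      # trim column padding
    gsub(/^ +| +$/, "", status)
    if (status != "deployed") print name, status
  }'
}

# Usage (requires cluster access):
#   helm list -a -n <namespace_name> | non_deployed_releases
```

The `-a` flag makes `helm list` include releases in every state, not only deployed and failed ones.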
Manage problematic releases
For releases in a failed or pending state:
View the release history.
helm history <release-name> -n <namespace_name>
Roll back to a previous stable version.
helm rollback <release-name> <revision-number> -n <namespace_name>
View detailed release information.
helm get all <release-name> -n <namespace_name>
Support resources
If the issue persists after following these steps, contact PerfectScale support through your preferred channel, either Slack or email.
To help us resolve the issue faster, please include the following information when reaching out:
Namespace name
Relevant pod logs and events
Cluster information
Steps attempted
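Most of this information can be gathered in one pass. The following is a minimal sketch under stated assumptions, not an official PerfectScale script: a hypothetical function, `collect_diagnostics`, that writes the namespace name, pod list, events, cluster information, and Helm releases into a temporary directory and prints its path. The file names are arbitrary.

```shell
# Collect basic diagnostics for one namespace into a temporary directory.
# Prints the directory path; file names are illustrative.
collect_diagnostics() {
  ns=$1
  out=$(mktemp -d) || return 1
  echo "$ns"                                           > "$out/namespace.txt"
  kubectl get pods -n "$ns" -o wide                    > "$out/pods.txt" 2>&1
  kubectl get events -n "$ns" --sort-by=.lastTimestamp > "$out/events.txt" 2>&1
  kubectl cluster-info                                 > "$out/cluster-info.txt" 2>&1
  kubectl version                                      > "$out/version.txt" 2>&1
  helm list -n "$ns"                                   > "$out/helm-releases.txt" 2>&1
  echo "$out"
}

# Usage (requires cluster access):
#   collect_diagnostics <namespace_name>
```

Logs for the affected pods still need to be added separately with `kubectl logs`, since which pods are relevant depends on the issue.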