# Common issues and troubleshooting

Explore step-by-step guidance for troubleshooting common pod failures and infrastructure-related issues in a self-hosted PerfectScale deployment.

## Pod failure troubleshooting

After installing all Helm charts, follow these systematic steps to diagnose and resolve pod issues.

### Check the pod status

Check the status of all pods in your namespace.

```
kubectl get pods -n <namespace>
```

Look for pods with status indicators such as *Error*, *CrashLoopBackOff*, *ImagePullBackOff*, or *Pending*.

### Examine the pod logs

For pods that show errors, examine the logs for detailed error messages.

```
kubectl logs <pod-name> -n <namespace>
```

For multi-container pods, specify the container name.

```
kubectl logs <pod-name> -n <namespace> -c <container-name>
```

### Analyze the pod events and details

Get comprehensive information about problematic pods.

```
kubectl describe pod <pod-name> -n <namespace>
```

Focus on:

* **State**: The current container state, including any error messages.
* **Last state**: The previous container state, which shows whether restarts have occurred.
* **Ready**: Indicates whether the pod passed readiness probes.

### Common pod failure scenarios and solutions

#### **ImagePullBackOff**

👉🏻 **Reason**: Container registry access issues or the image does not exist.

💡 **How to solve**:

* Verify container registry credentials.
* Verify the image name and tag for accuracy.
* Ensure network connectivity with the registry.

#### **CrashLoopBackOff**

👉🏻 **Reason**: The application crashes immediately after starting.

💡 **How to solve**:

* Check application logs.
* Verify the environment variables and configuration.
* Ensure sufficient resource allocation.
* [Detailed CrashLoopBackOff troubleshooting guide](https://www.perfectscale.io/blog/crashloopbackoff).

#### **Resource Constraints**

👉🏻 **Reason**: Insufficient CPU and/or memory.

💡 **How to solve**:

* Check the availability of node resources.

```
kubectl describe node <node-name>
```

* Verify the pod's resource requests and limits.
```
kubectl get pod <pod-name> -n <namespace> -o yaml | grep -A 5 resources
```

## Network and DNS troubleshooting

### General network diagnostics

1. Verify service connectivity.

   ```
   kubectl get svc -n <namespace>
   ```

2. Review network policies.

   ```
   kubectl get networkpolicies -n <namespace>
   ```

3. Check ingress resources.

   ```
   kubectl get ingress -n <namespace>
   ```

### AWS-specific network configuration

1. Ensure that a valid domain name is configured in the Route53 hosted zone.
2. Verify the necessary DNS records: NS, CNAME, and A records.
3. Configure AWS Certificate Manager (ACM) to issue certificates for the domain.

### **External AWS configuration resources:**

* [AWS Certificate Manager documentation](https://docs.aws.amazon.com/acm/latest/userguide/acm-overview.html)
* [Route53 documentation](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/Welcome.html)

## Helm chart troubleshooting

PerfectScale uses Helm charts for service deployment. Follow these steps to troubleshoot Helm installations.

#### Check Helm Release Status

List all Helm releases in your namespace.

```
helm list -n <namespace>
```

Review the `STATUS` column for each release. A status of `deployed` indicates successful deployment.

#### Manage Problematic Releases

For releases in a `failed` or pending state (`pending-install`, `pending-upgrade`, or `pending-rollback`):

1. View the release history.

   ```
   helm history <release-name> -n <namespace>
   ```

2. Roll back to a previous stable version.

   ```
   helm rollback <release-name> <revision> -n <namespace>
   ```

3. View detailed release information.

   ```
   helm get all <release-name> -n <namespace>
   ```

## Support resources

If the issue persists after following these steps, feel free to contact PerfectScale support through your preferred channel, either [Slack](https://join.slack.com/t/perfectscalecommunity/shared_invite/zt-1tu9teu9e-Z9tGt4LpNI8tUC3j8obcmQ) or [email](mailto:support@perfectscale.io), for further assistance.

To help us resolve the issue faster, please include the following information when reaching out:

* Namespace name
* Relevant pod logs and events
* Cluster information
* Steps attempted
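
The information requested above can be gathered in one pass. The following is a minimal sketch, not an official PerfectScale tool: the function name `collect_diagnostics`, the output file names, the 200-line log tail, and the `KUBECTL` and `HELM` override variables are all assumptions made for illustration. It only wraps standard `kubectl` and `helm` commands already used in this guide.

```shell
# Sketch: collect namespace, pod, event, log, cluster, and Helm details
# into a single directory. All names below are illustrative assumptions.
collect_diagnostics() {
  local ns="${1:-}"
  local out="${2:-perfectscale-diagnostics}"
  local kubectl="${KUBECTL:-kubectl}"   # override to use a different binary
  local helm="${HELM:-helm}"
  local pod

  if [ -z "$ns" ]; then
    echo "usage: collect_diagnostics <namespace> [output-dir]" >&2
    return 1
  fi

  mkdir -p "$out/logs" || return 1

  # Namespace name
  echo "$ns" > "$out/namespace.txt"

  # Pod status, details, and events
  "$kubectl" get pods -n "$ns" -o wide > "$out/pods.txt" 2>&1
  "$kubectl" describe pods -n "$ns" > "$out/describe-pods.txt" 2>&1
  "$kubectl" get events -n "$ns" --sort-by=.lastTimestamp > "$out/events.txt" 2>&1

  # Recent logs for every pod (all containers, last 200 lines each)
  for pod in $("$kubectl" get pods -n "$ns" -o name 2>/dev/null); do
    "$kubectl" logs -n "$ns" "$pod" --all-containers --tail=200 \
      > "$out/logs/${pod##*/}.log" 2>&1
  done

  # Cluster information
  "$kubectl" version > "$out/cluster-version.txt" 2>&1
  "$kubectl" get nodes -o wide > "$out/nodes.txt" 2>&1

  # Helm release status
  "$helm" list -n "$ns" --all > "$out/helm-releases.txt" 2>&1

  echo "Diagnostics written to $out"
}
```

For example, after sourcing the function, `collect_diagnostics <namespace> ./diagnostics` writes the files to `./diagnostics`. Review the collected logs before sharing them, and add the steps you have already attempted to your message.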