[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-09-04。"],[[["\u003cp\u003eThis documentation is for Apigee hybrid version 1.3, which is end-of-life, and users should upgrade to a newer version for support and updates.\u003c/p\u003e\n"],["\u003cp\u003eCassandra pods may become stuck in a \u003ccode\u003ePending\u003c/code\u003e state due to insufficient resources like CPU or memory, or because the persistent volume needed for the pod hasn't been created.\u003c/p\u003e\n"],["\u003cp\u003eCassandra pods in a \u003ccode\u003eCrashLoopBackoff\u003c/code\u003e state may be caused by data center discrepancies with previous data, or issues with truststore directories related to TLS connection problems.\u003c/p\u003e\n"],["\u003cp\u003eNode failures can lead to Cassandra pods remaining in the \u003ccode\u003ePending\u003c/code\u003e state, which requires removing the dead pod and associated volume claim, and then updating the volume template to include the newly added node.\u003c/p\u003e\n"],["\u003cp\u003eTroubleshooting steps involve using \u003ccode\u003ekubectl\u003c/code\u003e commands to check pod status, describe pods and persistent volume claims, and check the Cassandra error logs for specific issues and messages.\u003c/p\u003e\n"]]],[],null,["# Cassandra troubleshooting guide\n\n| You are currently viewing version 1.3 of the Apigee hybrid documentation. **This version is end of life.** You should upgrade to a newer version. For more information, see [Supported versions](/apigee/docs/hybrid/supported-platforms#supported-versions).\n\n\nThis topic discusses steps you can take to troubleshoot and fix problems with the\n[Cassandra](/apigee/docs/hybrid/v1.3/what-is-hybrid#cassandra-datastore) datastore. Cassandra is a\npersistent datastore\nthat runs in the `cassandra` component of the\n[hybrid runtime architecture](/apigee/docs/hybrid/v1.3/what-is-hybrid#about-the-runtime-plane).\nSee also\n[Runtime service configuration overview](/apigee/docs/hybrid/v1.3/service-config).\n\nCassandra pods are stuck in the Pending state\n---------------------------------------------\n\n### Symptom\n\n\nWhen starting up, the Cassandra pods remain in the **Pending** state.\n\n### Error message\n\n\nWhen you use `kubectl` to view the pod states, you see that one or more\nCassandra pods are stuck in the `Pending` state. The\n`Pending` state indicates that Kubernetes is unable to schedule the pod\non a node: the pod cannot be created. For example: \n\n kubectl get pods -n \u003cvar translate=\"no\"\u003enamespace\u003c/var\u003e\n\n NAME READY STATUS RESTARTS AGE\n adah-resources-install-4762w 0/4 Completed 0 10m\n apigee-cassandra-default-0 0/1 Pending 0 10m\n ...\n\n### Possible causes\n\n\nA pod stuck in the Pending state can have multiple causes. For example:\n\n### Diagnosis\n\nUse `kubectl`\nto describe the pod to determine the source of the error. 
### Resolution

#### Insufficient resources

Modify the Cassandra node pool so that it has sufficient CPU and memory resources.
See [Resizing a node pool](https://cloud.google.com/kubernetes-engine/docs/how-to/node-pools#resizing_a_node_pool) for details.

#### Persistent volume not created

If you determine a persistent volume issue, describe the PersistentVolumeClaim (PVC) to determine
why it is not being created:

1. List the PVCs in the cluster:

   ```
   kubectl -n namespace get pvc

   NAME                                        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
   cassandra-data-apigee-cassandra-default-0   Bound    pvc-b247faae-0a2b-11ea-867b-42010a80006e   10Gi       RWO            standard       15m
   ...
   ```
2. Describe the PVC for the pod that is failing. For example, the following command describes the PVC bound to the pod `apigee-cassandra-default-0`:

   ```
   kubectl -n apigee describe pvc cassandra-data-apigee-cassandra-default-0

   Events:
     Type     Reason              Age                From                         Message
     ----     ------              ----               ----                         -------
     Warning  ProvisioningFailed  3m (x143 over 5h)  persistentvolume-controller  storageclass.storage.k8s.io "apigee-sc" not found
   ```

   Note that in this example, the StorageClass named `apigee-sc` does not exist. To
   resolve this problem, create the missing StorageClass in the cluster, as explained in
   [Change the default StorageClass](/apigee/docs/hybrid/v1.3/cassandra-config). A minimal
   manifest is sketched after this list.
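The following is a minimal sketch of a StorageClass that supplies the missing `apigee-sc` name. It assumes a GKE cluster using the GCE persistent disk provisioner; adjust the `provisioner` and `parameters` fields for your platform:

```yaml
# Illustrative only: provides the "apigee-sc" StorageClass that the PVC expects.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: apigee-sc
provisioner: kubernetes.io/gce-pd    # assumes GKE; use your platform's provisioner
parameters:
  type: pd-ssd
volumeBindingMode: WaitForFirstConsumer  # bind volumes only once a pod is scheduled
```

Apply it with `kubectl apply -f` (for example, saved as `storageclass.yaml`); the pending PVCs should then provision and the pod can start.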
See also [Debugging Pods](https://kubernetes.io/docs/tasks/debug-application-cluster/debug-pod-replication-controller#debugging-pods).

Cassandra pods are stuck in the CrashLoopBackoff state
------------------------------------------------------

### Symptom

When starting up, the Cassandra pods remain in the **CrashLoopBackoff** state.

### Error message

When you use `kubectl` to view the pod states, you see that one or more
Cassandra pods are in the `CrashLoopBackoff` state. This state indicates that
the pod's container starts, crashes, and is restarted repeatedly. For example:

```
kubectl get pods -n namespace

NAME                           READY   STATUS             RESTARTS   AGE
adah-resources-install-4762w   0/4     Completed          0          10m
apigee-cassandra-default-0     0/1     CrashLoopBackoff   0          10m
...
```

### Possible causes

A pod stuck in the `CrashLoopBackoff` state can have multiple causes. For example:

- **Data center mismatch:** The data on an existing volume belongs to a different data center than the one the pod is starting in.
- **Truststore not found:** Cassandra cannot find the truststore it needs to establish TLS connections.

### Diagnosis

Check the [Cassandra error log](/apigee/docs/hybrid/v1.3/cassandra-logs) to determine the cause of the problem.

1. List the pods to get the ID of the Cassandra pod that is failing:

   ```
   kubectl get pods -n namespace
   ```
2. Check the failing pod's log:

   ```
   kubectl logs pod_id -n namespace
   ```

### Resolution

Look for the following clues in the pod's log:

#### Data center differs from previous data center

If you see this log message:

```
Cannot start node if snitch's data center (us-east1) differs from previous data center
```

- Check if there are any stale or old PVCs in the cluster and delete them.
- If this is a fresh install, delete all the PVCs and retry the setup. For example:

  ```
  kubectl -n namespace get pvc
  kubectl -n namespace delete pvc cassandra-data-apigee-cassandra-default-0
  ```

#### Truststore directory not found

If you see this log message:

```
Caused by: java.io.FileNotFoundException: /apigee/cassandra/ssl/truststore.p12
(No such file or directory)
```

Verify that the key and certificate files provided in your overrides file are correct and valid. For
example:

```
cassandra:
  sslRootCAPath: path_to_root_ca-file
  sslCertPath: path-to-tls-cert-file
  sslKeyPath: path-to-tls-key-file
```
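One way to sanity-check those files before re-applying the overrides is with `openssl`. The paths below are the same placeholders used above; the modulus comparison assumes an RSA key:

```
# Check the certificate's validity window:
openssl x509 -in path-to-tls-cert-file -noout -dates

# Confirm the certificate chains to the provided root CA:
openssl verify -CAfile path_to_root_ca-file path-to-tls-cert-file

# Confirm the private key matches the certificate (the two digests must match):
openssl x509 -noout -modulus -in path-to-tls-cert-file | openssl md5
openssl rsa -noout -modulus -in path-to-tls-key-file | openssl md5
```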
Node failure
------------

### Symptom

When starting up, the Cassandra pods remain in the Pending state. This
problem can indicate an underlying node failure.

### Diagnosis

1. Determine which Cassandra pods are not running:

   ```
   $ kubectl get pods -n your_namespace
   NAME                  READY   STATUS    RESTARTS   AGE
   cassandra-default-0   0/1     Pending   0          13s
   cassandra-default-1   1/1     Running   0          8d
   cassandra-default-2   1/1     Running   0          8d
   ```
2. Check the worker nodes. If one is in the **NotReady** state, then that is the node that has failed:

   ```
   kubectl get nodes
   NAME                          STATUS     ROLES    AGE   VERSION
   ip-10-30-1-190.ec2.internal   Ready      <none>   8d    v1.13.2
   ip-10-30-1-22.ec2.internal    Ready      master   8d    v1.13.2
   ip-10-30-1-36.ec2.internal    NotReady   <none>   8d    v1.13.2
   ip-10-30-2-214.ec2.internal   Ready      <none>   8d    v1.13.2
   ip-10-30-2-252.ec2.internal   Ready      <none>   8d    v1.13.2
   ip-10-30-2-47.ec2.internal    Ready      <none>   8d    v1.13.2
   ip-10-30-3-11.ec2.internal    Ready      <none>   8d    v1.13.2
   ip-10-30-3-152.ec2.internal   Ready      <none>   8d    v1.13.2
   ip-10-30-3-5.ec2.internal     Ready      <none>   8d    v1.13.2
   ```

### Resolution

1. Remove the dead Cassandra node from the cluster:

   ```
   $ kubectl exec -it apigee-cassandra-default-0 -- nodetool status
   $ kubectl exec -it apigee-cassandra-default-0 -- nodetool removenode deadnode_hostID
   ```
2. Remove the VolumeClaim from the dead node to prevent the Cassandra pod from attempting to come up on the dead node because of the affinity:

   ```
   kubectl get pvc -n your_namespace
   kubectl delete pvc volumeClaim_name -n your_namespace
   ```
3. Update the volume template and create a PersistentVolume for the newly added node. The following is an example volume template:

   ```yaml
   apiVersion: v1
   kind: PersistentVolume
   metadata:
     name: cassandra-data-3
   spec:
     capacity:
       storage: 100Gi
     accessModes:
       - ReadWriteOnce
     persistentVolumeReclaimPolicy: Retain
     storageClassName: local-storage
     local:
       path: /apigee/data
     nodeAffinity:
       required:
         nodeSelectorTerms:
           - matchExpressions:
               - key: kubernetes.io/hostname
                 operator: In
                 values: ["ip-10-30-1-36.ec2.internal"]
   ```
4. Replace the values with the new hostname/IP and apply the template:

   ```
   kubectl apply -f volume-template.yaml
   ```
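After applying the template, verify that the replacement pod is scheduled and that the node rejoins the ring. For example, run `nodetool status` from any healthy Cassandra pod and confirm that every node reports `UN` (Up/Normal):

```
kubectl get pods -n your_namespace
kubectl exec -it apigee-cassandra-default-1 -- nodetool status
```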