# Scale and autoscale runtime services

| You are currently viewing version 1.9 of the Apigee hybrid documentation. **This version is end of life.** You should upgrade to a newer version. For more information, see [Supported versions](/apigee/docs/hybrid/supported-platforms#supported-versions).

You can scale most services running in Kubernetes from the command line or in a configuration override. You can set scaling parameters for Apigee hybrid runtime services in the [`overrides.yaml` file](/apigee/docs/hybrid/v1.9/customize-services).
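For example, a minimal sketch of a root-level `runtime` stanza in `overrides.yaml` (the values are illustrative; the `replicaCountMin`/`replicaCountMax` property names are the same ones used in the environment-based example later on this page, and the root-level placement is implied by that section's comment about the "respective root component"):

```yaml
# Organization-level (root) scaling for the runtime component.
# Environment-specific settings, shown later on this page, override these values.
runtime:
  replicaCountMin: 2
  replicaCountMax: 20
```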
Advanced configurations
-----------------------

In some scenarios, you may need to use advanced scaling options. Example scenarios include:

- Setting different scaling options for each environment. For example, where env1 has a `minReplica` of 5 and env2 has a `minReplica` of 2.
- Setting different scaling options for each component within an environment. For example, where the `udca` component has a `maxReplica` of 5 and the `synchronizer` component has a `maxReplica` of 2.

The following example shows how to use the `kubectl patch` command to change the `maxReplicas` property for the `runtime` component:

1. Create environment variables to use with the command:

    ```bash
    export ENV=my-environment-name
    export NAMESPACE=apigee   # the namespace where Apigee is deployed
    export COMPONENT=runtime  # can also be udca or synchronizer
    export MAX_REPLICAS=2
    export MIN_REPLICAS=1
    ```

2. Apply the patch. Note that this example assumes that `kubectl` is in your `PATH`:

    ```bash
    kubectl patch apigeeenvironment -n $NAMESPACE \
      $(kubectl get apigeeenvironments -n $NAMESPACE -o jsonpath='{.items[?(@.spec.name == "'$ENV'" )]..metadata.name}') \
      --patch "$(echo -e "spec:\n  components:\n    $COMPONENT:\n      autoScaler:\n        maxReplicas: $MAX_REPLICAS\n        minReplicas: $MIN_REPLICAS")" \
      --type merge
    ```

3. Verify the change:

    ```bash
    kubectl get hpa -n $NAMESPACE
    ```

Environment-based scaling
-------------------------

By default, scaling is described at the organization level. You can override the default settings by specifying environment-specific scaling in the `overrides.yaml` file, as shown in the following example:

```yaml
envs:
# Apigee environment name
- name: test
  components:
    # Environment-specific scaling override.
    # Otherwise, scaling defined at the respective root component is used.
    runtime:
      replicaCountMin: 2
      replicaCountMax: 20
```

Metrics-based scaling
---------------------

With metrics-based scaling, the runtime can use CPU and application metrics to scale the `apigee-runtime` pods. The Kubernetes [Horizontal Pod Autoscaler (HPA) API](https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/horizontal-pod-autoscaler-v2/#HorizontalPodAutoscalerSpec) uses the `hpaBehavior` field to configure the scale-up and scale-down behaviors of the target service. Metrics-based scaling is not available for any other component in a hybrid deployment.

| **Note**: An internal connection between Prometheus and the Prometheus-Adapter on port 6443 must be open in order to receive metrics data and enable scaling. For more information on required ports, see [Internal connections](/apigee/docs/hybrid/v1.9/ports#internal).

Scaling can be adjusted based on the following metrics, whose targets are set under `hpaMetrics`:

- `serverMainTaskWaitTime`
- `serverNioTaskWaitTime`
- `targetCPUUtilizationPercentage`

The following example from the `runtime` stanza in the `overrides.yaml` illustrates the standard parameters (and permitted ranges) for scaling `apigee-runtime` pods in a hybrid implementation:

```yaml
hpaMetrics:
  serverMainTaskWaitTime: 400M (300M to 450M)
  serverNioTaskWaitTime: 400M (300M to 450M)
  targetCPUUtilizationPercentage: 75
  hpaBehavior:
    scaleDown:
      percent:
        periodSeconds: 60 (30 - 180)
        value: 20 (5 - 50)
      pods:
        periodSeconds: 60 (30 - 180)
        value: 2 (1 - 15)
      selectPolicy: Min
      stabilizationWindowSeconds: 120 (60 - 300)
    scaleUp:
      percent:
        periodSeconds: 60 (30 - 120)
        value: 20 (5 - 100)
      pods:
        periodSeconds: 60 (30 - 120)
        value: 4 (2 - 15)
      selectPolicy: Max
      stabilizationWindowSeconds: 30 (30 - 120)
```
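To see how these policies combine, consider an assumed deployment of 10 `apigee-runtime` replicas (an illustrative number, not a value from this page). Under the default `scaleUp` settings above, and following standard Kubernetes HPA scaling-policy semantics, the `percent` policy permits adding ceil(10 × 20%) = 2 pods per 60-second period, the `pods` policy permits adding 4, and `selectPolicy: Max` applies the more permissive of the two, so the HPA can add at most 4 pods per period; `stabilizationWindowSeconds` additionally restricts flapping by taking recent recommendations into account before acting.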
### Configure more aggressive scaling

Increasing the `percent` and `pods` values of the scale-up policy will result in a more aggressive scale-up policy. Similarly, increasing the `percent` and `pods` values in `scaleDown` will result in a more aggressive scale-down policy. For example:

```yaml
hpaMetrics:
  serverMainTaskWaitTime: 400M
  serverNioTaskWaitTime: 400M
  targetCPUUtilizationPercentage: 75
  hpaBehavior:
    scaleDown:
      percent:
        periodSeconds: 60
        value: 20
      pods:
        periodSeconds: 60
        value: 4
      selectPolicy: Min
      stabilizationWindowSeconds: 120
    scaleUp:
      percent:
        periodSeconds: 60
        value: 30
      pods:
        periodSeconds: 60
        value: 5
      selectPolicy: Max
      stabilizationWindowSeconds: 30
```

In the above example, the `scaleDown.pods.value` is increased to **4**, the `scaleUp.percent.value` is increased to **30**, and the `scaleUp.pods.value` is increased to **5**.

| **Note**: The value of `periodSeconds` should not go below 30.

### Configure less aggressive scaling

The `hpaBehavior` configuration values can also be decreased to implement less aggressive scale-up and scale-down policies. For example:

```yaml
hpaMetrics:
  serverMainTaskWaitTime: 400M
  serverNioTaskWaitTime: 400M
  targetCPUUtilizationPercentage: 75
  hpaBehavior:
    scaleDown:
      percent:
        periodSeconds: 60
        value: 10
      pods:
        periodSeconds: 60
        value: 1
      selectPolicy: Min
      stabilizationWindowSeconds: 180
    scaleUp:
      percent:
        periodSeconds: 60
        value: 20
      pods:
        periodSeconds: 60
        value: 4
      selectPolicy: Max
      stabilizationWindowSeconds: 30
```

In the above example, the `scaleDown.percent.value` is decreased to **10**, the `scaleDown.pods.value` is decreased to **1**, and the `scaleDown.stabilizationWindowSeconds` is increased to **180**.

For more information about metrics-based scaling using the `hpaBehavior` field, see [Scaling policies](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#scaling-policies).

### Disable metrics-based scaling

While metrics-based scaling is enabled by default and cannot be completely disabled, you can set the metrics thresholds to a level at which metrics-based scaling will not be triggered. The resulting scaling behavior will be the same as CPU-based scaling. For example, you can use the following configuration to prevent triggering metrics-based scaling:

```yaml
hpaMetrics:
  serverMainTaskWaitTime: 4000M
  serverNioTaskWaitTime: 4000M
  targetCPUUtilizationPercentage: 75
  hpaBehavior:
    scaleDown:
      percent:
        periodSeconds: 60
        value: 10
      pods:
        periodSeconds: 60
        value: 1
      selectPolicy: Min
      stabilizationWindowSeconds: 180
    scaleUp:
      percent:
        periodSeconds: 60
        value: 20
      pods:
        periodSeconds: 60
        value: 4
      selectPolicy: Max
      stabilizationWindowSeconds: 30
```

Troubleshooting
---------------

This section describes troubleshooting methods for common errors you may encounter while configuring scaling and autoscaling.

### HPA shows `unknown` for metrics values

If metrics-based scaling does not work and the HPA shows `unknown` for metrics values, use the following command to check the HPA output:

```bash
kubectl describe hpa HPA_NAME
```

When running the command, replace `HPA_NAME` with the name of the HPA you wish to view.

The output will show the CPU target and utilization of the service, indicating that CPU-based scaling will work in the absence of metrics-based scaling. For HPA behavior using multiple parameters, see [Scaling on multiple metrics](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#scaling-on-multiple-metrics).
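As an additional, general-purpose check (these are standard Kubernetes commands rather than steps documented for Apigee hybrid, and they assume the Prometheus adapter registers the usual `custom.metrics.k8s.io/v1beta1` API group), you can confirm that the custom metrics API is registered and serving before re-examining the HPA:

```bash
# Check that the custom metrics APIService is registered and reports Available.
kubectl get apiservice v1beta1.custom.metrics.k8s.io

# List the custom metrics currently exposed through the adapter, if any.
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | head
```

If the APIService is not available, the HPA cannot resolve the wait-time metrics and only the CPU target remains active, which matches the `unknown` metric values described above.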