[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-09-01。"],[],[],null,["# Horizontal Pod autoscaling\n\n[Autopilot](/kubernetes-engine/docs/concepts/autopilot-overview) [Standard](/kubernetes-engine/docs/concepts/choose-cluster-mode)\n\n*** ** * ** ***\n\nThis page provides an overview of horizontal Pod autoscaling and explains\nhow it works in Google Kubernetes Engine (GKE). You can also read about how to\n[configure and use horizontal Pod autoscaling](/kubernetes-engine/docs/how-to/horizontal-pod-autoscaling)\non your clusters.\n\nThe Horizontal Pod Autoscaler changes the shape of your Kubernetes workload by\nautomatically increasing or decreasing the number of Pods in response to the\nworkload's CPU or memory consumption, or in response to custom metrics reported\nfrom within Kubernetes or external metrics from sources outside of your cluster.\n\nGKE clusters with node auto provisioning automatically scale the\nnumber of nodes in the cluster based on changes in the number of Pods. For that\nreason, we recommend that you use horizontal Pod autoscaling for all clusters.\n\nWhy use horizontal Pod autoscaling\n----------------------------------\n\nWhen you first deploy your workload to a Kubernetes cluster, you may not be sure\nabout its resource requirements and how those requirements might change\ndepending on usage patterns, external dependencies, or other factors. Horizontal\nPod autoscaling helps to ensure that your workload functions consistently in\ndifferent situations, and lets you control costs by only paying for extra\ncapacity when you need it.\n\nIt's not always easy to predict the indicators that show whether your workload\nis under-resourced or under-utilized. The Horizontal Pod Autoscaler can\nautomatically scale the number of Pods in your workload based on one or more\nmetrics of the following types:\n\n- **Actual resource usage**: when a given Pod's CPU or memory usage exceeds a\n threshold. This can be expressed as a raw value or as a percentage of the\n amount the Pod requests for that resource.\n\n- **Custom metrics**: based on any metric reported by a Kubernetes object in\n a cluster, such as the rate of client requests per second or I/O writes per\n second.\n\n This can be useful if your application is prone to network bottlenecks, rather\n than CPU or memory.\n- **External metrics**: based on a metric from an application or service\n external to your cluster.\n\n For example, your workload might need more CPU when ingesting a large number\n of requests from a pipeline such as Pub/Sub. You can create an\n external metric for the size of the queue, and configure the Horizontal Pod Autoscaler to automatically\n increase the number of Pods when the queue size reaches a given threshold, and\n to reduce the number of Pods when the queue size shrinks.\n\nYou can combine a Horizontal Pod Autoscaler with a Vertical Pod Autoscaler, with some\n[limitations](#limitations).\n\nHow horizontal Pod autoscaling works\n------------------------------------\n\nEach configured Horizontal Pod Autoscaler operates using a control loop.\nA separate Horizontal Pod Autoscaler exists for each workload. 
How horizontal Pod autoscaling works
------------------------------------

Each configured Horizontal Pod Autoscaler operates using a control loop. A separate Horizontal Pod Autoscaler exists for each workload. Each Horizontal Pod Autoscaler periodically checks a given workload's metrics against the target thresholds you configure, and changes the shape of the workload automatically.

### Per-Pod resources

For resources that are allocated per-Pod, such as CPU, the controller queries the resource metrics API for each container running in the Pod.

- If you specify a raw value for CPU or memory, the value is used.
- If you specify a percentage value for CPU or memory, the Horizontal Pod Autoscaler calculates the **average** utilization value as a percentage of that Pod's CPU or memory requests.
- Custom and external metrics are expressed as raw values or average values.

**Note:** To use resource utilization percentage targets with horizontal Pod autoscaling, you must configure requests for that resource for each container running in each Pod in the workload. Otherwise, the Horizontal Pod Autoscaler cannot perform the calculations it needs to, and takes no action related to that metric.

The controller uses the average or raw value for a reported metric to produce a ratio, and uses that ratio to autoscale the workload. You can read a description of the [Horizontal Pod Autoscaler algorithm](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details) in the Kubernetes project documentation.
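The core of that algorithm is a single ratio:

`desiredReplicas = ceil(currentReplicas × (currentMetricValue / desiredMetricValue))`

As a worked example (values hypothetical): with 3 replicas, a target of 50% average CPU utilization, and a measured average of 100%, the ratio is 100 / 50 = 2, so the controller scales to ceil(3 × 2) = 6 replicas. If the measured average were instead 25%, the ratio would be 0.5 and the new scale would be ceil(3 × 0.5) = 2 replicas.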
### Responding to multiple metrics

If you configure a workload to autoscale based on multiple metrics, the Horizontal Pod Autoscaler evaluates each metric separately and uses the scaling algorithm to determine the new workload scale based on each one. The **largest** scale is selected for the autoscale action.

If one or more of the metrics are unavailable for some reason, the Horizontal Pod Autoscaler still scales **up** based on the largest size calculated, but does not scale **down**.

### Preventing thrashing

*Thrashing* refers to a situation in which the Horizontal Pod Autoscaler attempts to perform subsequent autoscaling actions before the workload finishes responding to prior autoscaling actions. To prevent thrashing, the Horizontal Pod Autoscaler chooses the largest recommendation based on the last five minutes.
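The following sketch ties the last two sections together: it configures two independently evaluated metrics (the larger resulting scale wins) and an explicit scale-down stabilization window. All names and targets are hypothetical; the 300-second window shown mirrors the five-minute behavior described above.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-multi-metric-hpa  # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment      # hypothetical workload
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second  # hypothetical custom metric
      target:
        type: AverageValue
        averageValue: "100"
  behavior:
    scaleDown:
      # Scale down using the highest recommendation from the last
      # 300 seconds, which damps thrashing.
      stabilizationWindowSeconds: 300
```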
Limitations
-----------

- Don't use the Horizontal Pod Autoscaler together with the [Vertical Pod Autoscaler](/kubernetes-engine/docs/concepts/verticalpodautoscaler) on CPU or memory. You can use the Horizontal Pod Autoscaler with the Vertical Pod Autoscaler for other metrics. You can [configure multidimensional Pod autoscaling (in beta)](/kubernetes-engine/docs/how-to/multidimensional-pod-autoscaling) to scale horizontally on CPU and vertically on memory at the same time.
- If you have a Deployment, don't configure horizontal Pod autoscaling on the ReplicaSet or Replication Controller backing it. When you perform a rolling update on the Deployment or Replication Controller, it is replaced by a new Replication Controller. Instead, configure horizontal Pod autoscaling on the Deployment itself.
- You can't use horizontal Pod autoscaling for workloads that cannot be scaled, such as DaemonSets.
- Horizontal Pod autoscaling exposes metrics as Kubernetes resources, which imposes limitations on metric names, such as no uppercase or '/' characters. Your metric adapter might allow renaming. For example, see the [`prometheus-adapter` `as` operator](https://github.com/kubernetes-sigs/prometheus-adapter/blob/master/docs/config.md#naming).
- The Horizontal Pod Autoscaler won't scale down if any of the [metrics](/kubernetes-engine/docs/how-to/horizontal-pod-autoscaling#multiple-metrics) that it's configured to monitor are unavailable. To check whether you have unavailable metrics, see [Viewing details about a Horizontal Pod Autoscaler](/kubernetes-engine/docs/how-to/horizontal-pod-autoscaling#viewing).

Scalability
-----------

While the Horizontal Pod Autoscaler doesn't have a hard limit on the number of supported HPA objects, its performance can be affected as this number grows. Specifically, the period between HPA recalculations might become longer than the standard 15 seconds.

- In **GKE minor version 1.22 or later**, the recalculation period should stay within 15 seconds with up to **300 HPA objects**.
- In **GKE minor version 1.31 or later**, if the **Performance HPA profile** is configured, the recalculation period should stay within 15 seconds with up to **1,000 HPA objects**. Learn how to [configure the Performance HPA profile](/kubernetes-engine/docs/how-to/horizontal-pod-autoscaling#hpa-profile).
- In **GKE minor version 1.33 or later**, if the **Performance HPA profile** is configured, the recalculation period should stay within 15 seconds with up to **5,000 HPA objects**. The Performance HPA profile is enabled by default on all clusters that meet the [requirements](/kubernetes-engine/docs/how-to/horizontal-pod-autoscaling#requirements_2).

The following factors can also affect performance:

- **[Scaling on multiple metrics](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#scaling-on-multiple-metrics)**: each metric adds a fetch call for recommendation calculations, which affects the recalculation period.
- **The latency of the [custom metrics](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#scaling-on-custom-metrics) stack**: response times over approximately 50 milliseconds are longer than is typical with the standard Kubernetes metrics, and can lengthen the recalculation period.

Interacting with `HorizontalPodAutoscaler` objects
--------------------------------------------------

You can configure a Horizontal Pod Autoscaler for a workload, and get information about autoscaling events and what caused them, by visiting the [Workloads](https://console.cloud.google.com/kubernetes/workload/overview) page in the Google Cloud console.

Each Horizontal Pod Autoscaler exists in the cluster as a `HorizontalPodAutoscaler` object. You can use commands like `kubectl get hpa` or `kubectl describe hpa HPA_NAME` to interact with these objects.

You can also create `HorizontalPodAutoscaler` objects using the [`kubectl autoscale` command](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#autoscale).

What's next
-----------

- Learn how to [configure horizontal Pod autoscaling](/kubernetes-engine/docs/how-to/horizontal-pod-autoscaling)
- Learn how to [manually scale an Application](/kubernetes-engine/docs/how-to/scaling-apps)
- Learn how to [scale to zero using KEDA](/kubernetes-engine/docs/tutorials/scale-to-zero-using-keda)
- Learn more about [Vertical Pod Autoscaler](/kubernetes-engine/docs/concepts/verticalpodautoscaler)
- Learn more about [Cluster Autoscaler](/kubernetes-engine/docs/concepts/cluster-autoscaler)