[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-09-02。"],[],[],null,["# Best practices for Vertex AI Feature Store (Legacy)\n\nThe following best practices will help you plan and use\nVertex AI Feature Store (Legacy) in various scenarios. This guide is not intended to\nbe exhaustive.\n\nModel features that jointly describe multiple entities\n------------------------------------------------------\n\nSome features might apply to multiple entity types. For example, you\nmight have a calculated value that records clicks per product by user. This\nfeature jointly describes product-user pairs.\n\nThe best practice, in this case, is to create a separate entity type to group\nshared features. You can create an entity type, such as `product-user`, to\ncontain shared features.\n\nFor the entity IDs, concatenate the IDs of the individual entities,\nsuch as the entity IDs of the individual product and user. The only requirement\nis that the IDs must be strings. These combined entity types are referred to as\n*composite entity types*.\n\nFor more information, see [creating an entity\ntype](/vertex-ai/docs/featurestore/managing-entity-types#create).\n\nUse IAM policies to control access across multiple teams\n--------------------------------------------------------\n\nUse IAM roles and policies to set different levels of access to\ndifferent groups of users. For example, ML researchers, data scientists, DevOps,\nand site reliability engineers all require access to the same featurestore, but\ntheir level of access can differ. For example, DevOps users might require\npermissions to manage a featurestore, but they don't require access to the\ncontents of the featurestore.\n\nYou can also restrict access to a particular featurestore or entity type by\nusing [resource-level IAM\npolicies](/vertex-ai/docs/featurestore/resource-policy).\n\nAs an example, imagine that your organization includes the following personas.\nBecause each persona requires a different level of access, each persona is\nassigned a different [predefined IAM\nrole](/vertex-ai/docs/general/access-control#predefined-roles). You can also create and use\nyour own custom roles.\n\nMonitor and tune resources accordingly to optimize batch import\n---------------------------------------------------------------\n\nBatch import jobs require workers to process and write data, which can\nincrease the CPU utilization of your featurestore and affect online serving\nperformance. If preserving online serving performance is a priority, start with\none worker for every ten online serving nodes. During import, monitor the CPU\nusage of the online storage. If CPU usage is lower than expected, increase the\nnumber of workers for future batch import jobs to increase throughput. 
Use IAM policies to control access across multiple teams
---------------------------------------------------------

Use IAM roles and policies to set different levels of access for different groups of users. For example, ML researchers, data scientists, DevOps engineers, and site reliability engineers might all require access to the same featurestore, but their level of access can differ. For example, DevOps users might require permissions to manage a featurestore, but they don't require access to the contents of the featurestore.

You can also restrict access to a particular featurestore or entity type by using [resource-level IAM policies](/vertex-ai/docs/featurestore/resource-policy).

As an example, consider the personas mentioned above. Because each persona requires a different level of access, each persona is assigned a different [predefined IAM role](/vertex-ai/docs/general/access-control#predefined-roles). You can also create and use your own custom roles.

Monitor and tune resources accordingly to optimize batch import
----------------------------------------------------------------

Batch import jobs require workers to process and write data, which can increase the CPU utilization of your featurestore and affect online serving performance. If preserving online serving performance is a priority, start with one worker for every ten online serving nodes. During import, monitor the CPU usage of the online storage. If CPU usage is lower than expected, increase the number of workers for future batch import jobs to increase throughput. If CPU usage is higher than expected, increase the number of online serving nodes to increase CPU capacity, or lower the batch import worker count, either of which can reduce CPU usage.

If you do increase the number of online serving nodes, note that Vertex AI Feature Store (Legacy) takes roughly 15 minutes to reach optimal performance after you make the update.

For more information, see [update a featurestore](/vertex-ai/docs/featurestore/managing-featurestores#update) and [batch import feature values](/vertex-ai/docs/featurestore/ingesting-batch).

For more information about featurestore monitoring, see [Cloud Monitoring metrics](/vertex-ai/docs/general/monitoring-metrics).

Use the `disableOnlineServing` field when backfilling historical data
----------------------------------------------------------------------

Backfilling is the process of importing historical feature values that don't impact the most recent feature values. In this case, you can disable online serving, which skips any changes to the [online store](/vertex-ai/docs/featurestore/managing-featurestores#storage). For more information, see [Backfill historical data](/vertex-ai/docs/featurestore/ingesting-batch#backfill).
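Putting the worker heuristic from the batch import section together with `disableOnlineServing`, here is a minimal backfill sketch. It assumes the `google-cloud-aiplatform` SDK's `EntityType.ingest_from_gcs` method with its `worker_count` and `disable_online_serving` parameters; the resource names, feature ID, and Cloud Storage URI are placeholders:

```python
from google.cloud import aiplatform

aiplatform.init(project="your-project", location="us-central1")

entity_type = aiplatform.EntityType(
    entity_type_name="product_user",
    featurestore_id="your_featurestore",
)

# Heuristic from above: start with one import worker for every
# ten online serving nodes.
online_serving_nodes = 20
worker_count = max(1, online_serving_nodes // 10)

entity_type.ingest_from_gcs(
    feature_ids=["clicks_per_product"],
    feature_time="feature_timestamp",  # timestamp column in the source data
    gcs_source_uris="gs://your-bucket/historical_values.avro",
    gcs_source_type="avro",
    entity_id_field="entity_id",
    worker_count=worker_count,
    disable_online_serving=True,  # backfill: skip writes to the online store
)
```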
Use autoscaling to reduce costs during load fluctuations
---------------------------------------------------------

If you use Vertex AI Feature Store (Legacy) extensively and encounter frequent load fluctuations in your traffic patterns, use autoscaling to optimize costs. Autoscaling lets Vertex AI Feature Store (Legacy) review traffic patterns and automatically adjust the number of nodes up or down depending on CPU utilization, instead of maintaining a high node count. This option works well for traffic patterns that encounter gradual growth and decline.

| **Caution:** Autoscaling is not effective for managing sudden bursts of traffic. For more information, see [Additional considerations for autoscaling](/vertex-ai/docs/featurestore/managing-featurestores#additional_considerations).
| If you are expecting sudden bursts of traffic, set the `minNodeCount` field to a threshold that is high enough to handle the bursts. For more information, see [Scaling](/vertex-ai/docs/reference/rest/v1/projects.locations.featurestores#scaling) in the API reference.

For more information about autoscaling, see [Scaling options](/vertex-ai/docs/featurestore/managing-featurestores#scaling_options).

Test the performance of online serving nodes for real-time serving
--------------------------------------------------------------------

You can verify the performance of your featurestore during real-time online serving by testing the performance of your online serving nodes. You can perform these tests based on various benchmarking parameters, such as QPS, latency, and the API used. Follow these guidelines to test the performance of online serving nodes:

- **Run all test clients from the same region, preferably on Compute Engine or Google Kubernetes Engine**: This prevents discrepancies due to network latency resulting from hops across regions.

- **Use the gRPC API in the SDK**: The gRPC API performs better than the REST API. If you need to use the REST API, enable the HTTP keep-alive option to reuse HTTP connections. Otherwise, each request results in the creation of a new HTTP connection, which increases latency.

- **Run longer-duration tests**: Run tests of 15 minutes or more, at a minimum of 5 QPS, to calculate more accurate metrics.

- **Add a "warm-up" period**: If you start testing after a period of inactivity, you might observe high latency while connections are reestablished. To account for the initial period of high latency, you can designate this period as a "warm-up period", during which the initial data reads are ignored. As an alternative, you can send a low but consistent rate of artificial traffic to the featurestore to keep the connection active.

- **If required, enable autoscaling**: If you anticipate gradual growth and decline in your online traffic, enable autoscaling. If you choose autoscaling, Vertex AI automatically changes the number of online serving nodes based on CPU utilization.

For more information about online serving, see [Online serving](/vertex-ai/docs/featurestore/serving-online). For more information about online serving nodes, see [Online serving nodes](/vertex-ai/docs/featurestore/managing-featurestores#onlineservingnodes).

Specify a start time to optimize offline storage costs during batch serve and batch export
--------------------------------------------------------------------------------------------

To optimize offline storage costs during batch serving and batch export, you can specify a `startTime` in your `batchReadFeatureValues` or `exportFeatureValues` request. The request then runs a query over a subset of the available feature data, based on the specified `startTime`. Otherwise, the request queries the entire available volume of feature data, resulting in high offline storage usage costs.

What's next
-----------

Learn Vertex AI Feature Store (Legacy) [best practices for implementing custom-trained ML models on Vertex AI](/architecture/ml-on-gcp-best-practices#use-vertex-feature-store-with-structured-data).
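To make the `startTime` guidance above concrete, here is a minimal sketch of a `batchReadFeatureValues` request body using the REST API's JSON schema. All resource names, URIs, and IDs are illustrative placeholders:

```python
from datetime import datetime, timezone

# Only startTime is the point of this sketch; the other fields are
# placeholder values for a typical batch serve request.
start_time = datetime(2025, 1, 1, tzinfo=timezone.utc)

request_body = {
    "csvReadInstances": {
        "gcsSource": {"uris": ["gs://your-bucket/read_instances.csv"]}
    },
    "destination": {
        "bigqueryDestination": {
            "outputUri": "bq://your-project.your_dataset.served_features"
        }
    },
    "entityTypeSpecs": [
        {
            "entityTypeId": "product_user",
            "featureSelector": {"idMatcher": {"ids": ["*"]}},
        }
    ],
    # Limits the query to feature data at or after this timestamp, so the
    # job doesn't scan the entire offline store.
    "startTime": start_time.isoformat(),
}
```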