You can reboot any persistent resource that is in the RUNNING or ERROR state.
Rebooting a persistent resource lets you recover from errors that the resource can't recover from on its own. You can also reboot a persistent resource to manually obtain more up-to-date clusters. This page shows you how to reboot a persistent resource by using the Google Cloud console and the REST API.
Required roles
To get the permission that you need to reboot a persistent resource, ask your administrator to grant you the Vertex AI Administrator (roles/aiplatform.admin) IAM role on your project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the aiplatform.persistentResources.update permission, which is required to reboot a persistent resource. You might also be able to get this permission with custom roles or other predefined roles.
Reboot a persistent resource
You can reboot a persistent resource from the Google Cloud console, with the gcloud CLI, or through the REST API. Before you reboot, make sure that no training jobs are running on the persistent resource.
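As a rough illustration of the REST path, the following Python sketch builds and sends the reboot request using only the standard library. It is a sketch under stated assumptions, not an official client: the helper names (`reboot_url`, `build_reboot_request`, `reboot`) are hypothetical, the regional host is assumed to match the resource's location, and the access token is assumed to come from an already authenticated `gcloud` CLI.

```python
import json
import subprocess
import urllib.request


def reboot_url(project_id: str, location: str, resource_id: str) -> str:
    """Build the v1 ':reboot' URL for a persistent resource.

    Assumption: the regional API host matches the resource's location.
    """
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"persistentResources/{resource_id}:reboot"
    )


def build_reboot_request(
    project_id: str, location: str, resource_id: str, access_token: str
) -> urllib.request.Request:
    """Create the POST request with an empty body, without sending it."""
    return urllib.request.Request(
        reboot_url(project_id, location, resource_id),
        data=b"",
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )


def reboot(project_id: str, location: str, resource_id: str) -> dict:
    """Send the reboot request and return the long-running operation as a dict.

    Assumption: the gcloud CLI is installed and already authenticated.
    """
    token = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout.strip()
    request = build_reboot_request(project_id, location, resource_id, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `reboot("PROJECT_ID", "LOCATION", "PERSISTENT_RESOURCE_ID")` with real values would return the operation that tracks the reboot. Keeping request construction separate from sending makes it possible to inspect the URL and headers without touching the network.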
Rebooting a persistent resource is a long-running operation, during which the persistent resource can't be deleted. The operation contains a progressMessage field that is populated with an error status if an error occurs. After the operation reports "done": true, check the state of the persistent resource. If the persistent resource is in the RUNNING state, the reboot succeeded and the resource is ready to run training jobs.
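The check described above can be sketched as a small polling loop. This is a minimal sketch, not an official client: `get_operation` and `get_resource` are hypothetical callables that you supply to fetch the operation and the persistent resource as parsed JSON, and the resource is assumed to expose its state under a `state` key.

```python
import time
from typing import Callable


def reboot_finished(operation: dict) -> bool:
    """Return True once the long-running operation reports done."""
    return bool(operation.get("done", False))


def progress_message(operation: dict) -> str:
    """Return the operation's progressMessage, or an empty string if absent."""
    return operation.get("metadata", {}).get("progressMessage", "")


def wait_for_reboot(
    get_operation: Callable[[], dict],
    get_resource: Callable[[], dict],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the operation is done, then check the resource state.

    Returns True if the resource ends up in the RUNNING state.
    Raises TimeoutError if the operation does not finish in time.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        operation = get_operation()
        if reboot_finished(operation):
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Reboot still in progress: {progress_message(operation)!r}"
            )
        sleep(poll_seconds)
    # Assumption: the resource JSON carries its state in a "state" field.
    return get_resource().get("state") == "RUNNING"
```

The polling interval and timeout defaults are arbitrary illustration values. Injecting the fetch functions and the `sleep` callable keeps the loop independent of any particular HTTP client and easy to exercise with canned responses.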
Limitations
The following limitations apply to rebooting a persistent resource:
In some cases, rebooting a persistent resource can cause it to lose capacity of scarce resources. Full resource retention is not guaranteed.
Reboot is not available for Ray on Vertex AI.
Persistent resources that contain autoscaled worker pools reboot with the minimum replica count.
Last updated 2025-08-28 (UTC).
Use one of the following methods to reboot a persistent resource. In the gcloud and REST commands, make the following replacements:

- PROJECT_ID: The project ID of the persistent resource that you want to reboot.
- LOCATION: The region of the persistent resource that you want to reboot.
- PERSISTENT_RESOURCE_ID: The ID of the persistent resource that you want to reboot.

### Console

To reboot a persistent resource in the Google Cloud console, do the following:

1. In the Google Cloud console, go to the **Persistent resources** page.

   [Go to Persistent resources](https://console.cloud.google.com/vertex-ai/training/persistent-resources)
2. Next to the name of the persistent resource that you want to reboot, click the vertical ellipsis (more_vert).
3. Click **Reboot**.
4. Click **Confirm**.

### gcloud

**Note:** Ensure you have initialized the Google Cloud CLI with authentication and a project by running either [gcloud init](/sdk/gcloud/reference/init); or [gcloud auth login](/sdk/gcloud/reference/auth/login) and [gcloud config set project](/sdk/gcloud/reference/config/set).

Execute the following command:

#### Linux, macOS, or Cloud Shell

```bash
gcloud ai persistent-resources reboot PERSISTENT_RESOURCE_ID \
 --project=PROJECT_ID \
 --region=LOCATION
```

#### Windows (PowerShell)

```powershell
gcloud ai persistent-resources reboot PERSISTENT_RESOURCE_ID `
 --project=PROJECT_ID `
 --region=LOCATION
```

#### Windows (cmd.exe)

```bat
gcloud ai persistent-resources reboot PERSISTENT_RESOURCE_ID ^
 --project=PROJECT_ID ^
 --region=LOCATION
```

You should receive a response similar to the following:

```
Using endpoint [https://us-central1-aiplatform.googleapis.com/]
Request to reboot the PersistentResource [projects/sample-project/locations/us-central1/persistentResources/test-persistent-resource] has been sent.

You may view the status of your persistent resource with the command

 $ gcloud ai persistent-resources describe projects/sample-project/locations/us-central1/persistentResources/test-persistent-resource
```

### REST

HTTP method and URL:

```
POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/persistentResources/PERSISTENT_RESOURCE_ID:reboot
```

To send your request, use one of these options:

#### curl (Linux, macOS, or Cloud Shell)

**Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login), or by using [Cloud Shell](/shell/docs), which automatically logs you into the `gcloud` CLI. You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).

Execute the following command:

```
curl -X POST \
 -H "Authorization: Bearer $(gcloud auth print-access-token)" \
 -H "Content-Type: application/json; charset=utf-8" \
 -d "" \
 "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/persistentResources/PERSISTENT_RESOURCE_ID:reboot"
```

#### PowerShell (Windows)

**Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login). You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).

Execute the following command:

```
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
 -Method POST `
 -Headers $headers `
 -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/persistentResources/PERSISTENT_RESOURCE_ID:reboot" | Select-Object -Expand Content
```

You should receive a JSON response similar to the following:

```
{
  "name": "projects/123456789012/locations/us-central1/persistentResources/test-persistent-resource/operations/1234567890123456789",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.RebootPersistentResourceOperationMetadata",
    "genericMetadata": {
      "createTime": "2024-03-18T17:31:54.955004Z",
      "updateTime": "2024-03-18T17:31:55.204817Z",
      "state": "RUNNING",
      "worksOn": [
        "projects/123456789012/locations/us-central1/persistentResources/test-persistent-resource"
      ]
    },
    "progressMessage": "Waiting for persistent resource shut down."
  }
}
```

What's next
-----------

- [Learn about persistent resources](/vertex-ai/docs/training/persistent-resource-overview).
- [Create and use a persistent resource](/vertex-ai/docs/training/persistent-resource-create).
- [Run training jobs on a persistent resource](/vertex-ai/docs/training/persistent-resource-train).
- [Get information about a persistent resource](/vertex-ai/docs/training/persistent-resource-get).
- [Delete a persistent resource](/vertex-ai/docs/training/persistent-resource-delete).