ํ…์ŠคํŠธ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์—์„œ ์˜ˆ์ธก ๊ฐ€์ ธ์˜ค๊ธฐ

์ด ํŽ˜์ด์ง€์—์„œ๋Š” Google Cloud ์ฝ˜์†” ๋˜๋Š” Vertex AI API๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ…์ŠคํŠธ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์—์„œ ์˜จ๋ผ์ธ (์‹ค์‹œ๊ฐ„) ์˜ˆ์ธก๊ณผ ์ผ๊ด„ ์˜ˆ์ธก์„ ์–ป๋Š” ๋ฐฉ๋ฒ•์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค.

์˜จ๋ผ์ธ ์˜ˆ์ธก๊ณผ ์ผ๊ด„ ์˜ˆ์ธก ๊ฐ„์˜ ์ฐจ์ด์ 

์˜จ๋ผ์ธ ์˜ˆ์ธก์€ ๋ชจ๋ธ ์—”๋“œํฌ์ธํŠธ์— ์ˆ˜ํ–‰๋˜๋Š” ๋™๊ธฐ์‹ ์š”์ฒญ์ž…๋‹ˆ๋‹ค. ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ์ž…๋ ฅ์— ๋Œ€ํ•œ ์‘๋‹ต์œผ๋กœ ์š”์ฒญํ•˜๊ฑฐ๋‚˜ ์ ์‹œ์˜ ์ถ”๋ก ์ด ํ•„์š”ํ•œ ์ƒํ™ฉ์—์„œ ์š”์ฒญํ•˜๋Š” ๊ฒฝ์šฐ์—๋Š” ์˜จ๋ผ์ธ ์˜ˆ์ธก์„ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

์ผ๊ด„ ์˜ˆ์ธก์€ ๋น„๋™๊ธฐ์‹ ์š”์ฒญ์ž…๋‹ˆ๋‹ค. ๋ชจ๋ธ์„ ์—”๋“œํฌ์ธํŠธ์— ๋ฐฐํฌํ•  ํ•„์š” ์—†์ด ๋ชจ๋ธ ๋ฆฌ์†Œ์Šค์—์„œ ์ง์ ‘ ์ผ๊ด„ ์˜ˆ์ธก์„ ์š”์ฒญํ•ฉ๋‹ˆ๋‹ค. ํ…์ŠคํŠธ ๋ฐ์ดํ„ฐ์˜ ๊ฒฝ์šฐ ์ฆ‰๊ฐ์ ์ธ ์‘๋‹ต์ด ํ•„์š”ํ•˜์ง€ ์•Š๊ณ  ๋‹จ์ผ ์š”์ฒญ์„ ์‚ฌ์šฉํ•˜์—ฌ ๋ˆ„์ ๋œ ๋ฐ์ดํ„ฐ๋ฅผ ์ฒ˜๋ฆฌํ•˜๊ณ  ์‹ถ์€ ๊ฒฝ์šฐ ์ผ๊ด„ ์˜ˆ์ธก์„ ์‚ฌ์šฉํ•˜์„ธ์š”.

์˜จ๋ผ์ธ ์˜ˆ์ธก ์ˆ˜ํ–‰

์—”๋“œํฌ์ธํŠธ์— ๋ชจ๋ธ ๋ฐฐํฌ

๋ชจ๋ธ์„ ์˜จ๋ผ์ธ ์˜ˆ์ธก์„ ์ œ๊ณตํ•˜๋Š” ๋ฐ ์‚ฌ์šฉํ•˜๋ ค๋ฉด ๋จผ์ € ๋ชจ๋ธ์„ ์—”๋“œํฌ์ธํŠธ์— ๋ฐฐํฌํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ๋ชจ๋ธ์„ ๋ฐฐํฌํ•˜๋ฉด ๋ฌผ๋ฆฌ์  ๋ฆฌ์†Œ์Šค๊ฐ€ ๋ชจ๋ธ๊ณผ ์—ฐ๊ฒฐ๋˜๋ฏ€๋กœ ์งง์€ ์ง€์—ฐ ์‹œ๊ฐ„์œผ๋กœ ์˜จ๋ผ์ธ ์˜ˆ์ธก์„ ์ œ๊ณตํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์—”๋“œํฌ์ธํŠธ 1๊ฐœ์— ๋ชจ๋ธ์„ 2๊ฐœ ์ด์ƒ ๋ฐฐํฌํ•  ์ˆ˜ ์žˆ๊ณ  2๊ฐœ ์ด์ƒ์˜ ์—”๋“œํฌ์ธํŠธ์— ๋ชจ๋ธ 1๊ฐœ๋ฅผ ๋ฐฐํฌํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋ชจ๋ธ ๋ฐฐํฌ ์˜ต์…˜ ๋ฐ ์‚ฌ์šฉ ์‚ฌ๋ก€์— ๋Œ€ํ•œ ์ž์„ธํ•œ ๋‚ด์šฉ์€ ๋ชจ๋ธ ๋ฐฐํฌ ์ •๋ณด๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

๋‹ค์Œ ๋ฐฉ๋ฒ• ์ค‘ ํ•˜๋‚˜๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ชจ๋ธ์„ ๋ฐฐํฌํ•ฉ๋‹ˆ๋‹ค.

Google Cloud ์ฝ˜์†”

  1. Google Cloud ์ฝ˜์†”์˜ Vertex AI ์„น์…˜์—์„œ ๋ชจ๋ธ ํŽ˜์ด์ง€๋กœ ์ด๋™ํ•ฉ๋‹ˆ๋‹ค.

    ๋ชจ๋ธ ํŽ˜์ด์ง€๋กœ ์ด๋™

  2. ๋ฐฐํฌํ•˜๋ ค๋Š” ๋ชจ๋ธ์˜ ์ด๋ฆ„์„ ํด๋ฆญํ•˜์—ฌ ์„ธ๋ถ€์ •๋ณด ํŽ˜์ด์ง€๋ฅผ ์—ฝ๋‹ˆ๋‹ค.

  3. ๋ฐฐํฌ ๋ฐ ํ…Œ์ŠคํŠธ ํƒญ์„ ์„ ํƒํ•ฉ๋‹ˆ๋‹ค.

    ์ด๋ฏธ ์—”๋“œํฌ์ธํŠธ์— ๋ฐฐํฌ๋œ ๋ชจ๋ธ์€ ๋ชจ๋ธ ๋ฐฐํฌ ์„น์…˜์— ๋‚˜์—ด๋ฉ๋‹ˆ๋‹ค.

  4. ์—”๋“œํฌ์ธํŠธ์— ๋ฐฐํฌ๋ฅผ ํด๋ฆญํ•ฉ๋‹ˆ๋‹ค.

  5. ๋ชจ๋ธ์„ ์ƒˆ ์—”๋“œํฌ์ธํŠธ์— ๋ฐฐํฌํ•˜๋ ค๋ฉด ์ƒˆ ์—”๋“œํฌ์ธํŠธ ๋งŒ๋“ค๊ธฐ๋ฅผ ์„ ํƒํ•˜๊ณ  ์ƒˆ ์—”๋“œํฌ์ธํŠธ์˜ ์ด๋ฆ„์„ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค. ๊ธฐ์กด ์—”๋“œํฌ์ธํŠธ์— ๋ชจ๋ธ์„ ๋ฐฐํฌํ•˜๋ ค๋ฉด ๊ธฐ์กด ์—”๋“œํฌ์ธํŠธ์— ์ถ”๊ฐ€๋ฅผ ์„ ํƒํ•˜๊ณ  ๋“œ๋กญ๋‹ค์šด ๋ชฉ๋ก์—์„œ ์—”๋“œํฌ์ธํŠธ๋ฅผ ์„ ํƒํ•ฉ๋‹ˆ๋‹ค.

    ์—”๋“œํฌ์ธํŠธ 1๊ฐœ์— ๋ชจ๋ธ์„ 2๊ฐœ ์ด์ƒ ์ถ”๊ฐ€ํ•  ์ˆ˜ ์žˆ๊ณ  2๊ฐœ ์ด์ƒ์˜ ์—”๋“œํฌ์ธํŠธ์— ๋ชจ๋ธ 1๊ฐœ๋ฅผ ์ถ”๊ฐ€ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ž์„ธํžˆ ์•Œ์•„๋ณด์„ธ์š”.

  6. ๋ชจ๋ธ ํ•˜๋‚˜ ์ด์ƒ์ด ๋ฐฐํฌ๋œ ๊ธฐ์กด ์—”๋“œํฌ์ธํŠธ์— ๋ชจ๋ธ์„ ๋ฐฐํฌํ•˜๋Š” ๊ฒฝ์šฐ ๋น„์œจ์„ ๋ชจ๋‘ ํ•ฉํ•˜๋ฉด 100%๊ฐ€ ๋˜๋„๋ก ๋ฐฐํฌ ์ค‘์ธ ๋ชจ๋ธ๊ณผ ์ด๋ฏธ ๋ฐฐํฌ๋œ ๋ชจ๋ธ์˜ ํŠธ๋ž˜ํ”ฝ ๋ถ„ํ•  ๋น„์œจ์„ ์—…๋ฐ์ดํŠธํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

  7. AutoML Text๋ฅผ ์„ ํƒํ•˜๊ณ  ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๊ตฌ์„ฑํ•ฉ๋‹ˆ๋‹ค.

    1. ๋ชจ๋ธ์„ ์ƒˆ ์—”๋“œํฌ์ธํŠธ์— ๋ฐฐํฌํ•˜๋Š” ๊ฒฝ์šฐ ํŠธ๋ž˜ํ”ฝ ๋ถ„ํ•  ๊ฐ’์œผ๋กœ 100์„ ํ—ˆ์šฉํ•ฉ๋‹ˆ๋‹ค. ์•„๋‹ˆ๋ฉด ๋ชจ๋‘ ํ•ฉํ•˜์—ฌ 100์ด ๋˜๋„๋ก ์—”๋“œํฌ์ธํŠธ์˜ ๋ชจ๋“  ๋ชจ๋ธ์— ๋Œ€ํ•œ ํŠธ๋ž˜ํ”ฝ ๋ถ„ํ•  ๊ฐ’์„ ์กฐ์ •ํ•ฉ๋‹ˆ๋‹ค.

    2. ๋ชจ๋ธ์—์„œ ์™„๋ฃŒ๋ฅผ ํด๋ฆญํ•˜๊ณ  ๋ชจ๋“  ํŠธ๋ž˜ํ”ฝ ๋ถ„ํ•  ๋น„์œจ์ด ์˜ฌ๋ฐ”๋ฅด๋ฉด ๊ณ„์†์„ ํด๋ฆญํ•ฉ๋‹ˆ๋‹ค.

      ๋ชจ๋ธ์ด ๋ฐฐํฌ๋˜๋Š” ๋ฆฌ์ „์ด ํ‘œ์‹œ๋ฉ๋‹ˆ๋‹ค. ์ด ๋ฆฌ์ „์€ ๋ชจ๋ธ์„ ๋งŒ๋“  ๋ฆฌ์ „์ด์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

    3. ๋ฐฐํฌ๋ฅผ ํด๋ฆญํ•˜์—ฌ ๋ชจ๋ธ์„ ์—”๋“œํฌ์ธํŠธ์— ๋ฐฐํฌํ•ฉ๋‹ˆ๋‹ค.

API

Vertex AI API๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ชจ๋ธ์„ ๋ฐฐํฌํ•˜๋Š” ๊ฒฝ์šฐ ๋‹ค์Œ ๋‹จ๊ณ„๋ฅผ ์™„๋ฃŒํ•ฉ๋‹ˆ๋‹ค.

  1. ํ•„์š”ํ•œ ๊ฒฝ์šฐ ์—”๋“œํฌ์ธํŠธ๋ฅผ ๋งŒ๋“ญ๋‹ˆ๋‹ค.
  2. ์—”๋“œํฌ์ธํŠธ ID๋ฅผ ๊ฐ€์ ธ์˜ต๋‹ˆ๋‹ค.
  3. ๋ชจ๋ธ์„ ์—”๋“œํฌ์ธํŠธ์— ๋ฐฐํฌํ•ฉ๋‹ˆ๋‹ค.

์—”๋“œํฌ์ธํŠธ ๋งŒ๋“ค๊ธฐ

๊ธฐ์กด ์—”๋“œํฌ์ธํŠธ์— ๋ชจ๋ธ์„ ๋ฐฐํฌํ•˜๋Š” ๊ฒฝ์šฐ ์ด ๋‹จ๊ณ„๋ฅผ ๊ฑด๋„ˆ๋›ธ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

gcloud

๋‹ค์Œ ์˜ˆ์‹œ์—์„œ๋Š” gcloud ai endpoints create ๋ช…๋ น์–ด๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

gcloud ai endpoints create \
  --region=LOCATION \
  --display-name=ENDPOINT_NAME

๋‹ค์Œ์„ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค.

  • LOCATION_ID: Vertex AI๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋ฆฌ์ „
  • ENDPOINT_NAME: ์—”๋“œํฌ์ธํŠธ์˜ ํ‘œ์‹œ ์ด๋ฆ„

Google Cloud CLI ๋„๊ตฌ๊ฐ€ ์—”๋“œํฌ์ธํŠธ๋ฅผ ๋งŒ๋“œ๋Š” ๋ฐ ๋ช‡ ์ดˆ ์ •๋„ ๊ฑธ๋ฆด ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

REST

์š”์ฒญ ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•˜๊ธฐ ์ „์— ๋‹ค์Œ์„ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค.

  • LOCATION_ID: ๋ฆฌ์ „
  • PROJECT_ID: ํ”„๋กœ์ ํŠธ ID์ž…๋‹ˆ๋‹ค.
  • ENDPOINT_NAME: ์—”๋“œํฌ์ธํŠธ์˜ ํ‘œ์‹œ ์ด๋ฆ„

HTTP ๋ฉ”์„œ๋“œ ๋ฐ URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints

JSON ์š”์ฒญ ๋ณธ๋ฌธ:

{
  "display_name": "ENDPOINT_NAME"
}

์š”์ฒญ์„ ๋ณด๋‚ด๋ ค๋ฉด ๋‹ค์Œ ์˜ต์…˜ ์ค‘ ํ•˜๋‚˜๋ฅผ ํŽผ์นฉ๋‹ˆ๋‹ค.

๋‹ค์Œ๊ณผ ๋น„์Šทํ•œ JSON ์‘๋‹ต์ด ํ‘œ์‹œ๋ฉ๋‹ˆ๋‹ค.

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateEndpointOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-11-05T17:45:42.812656Z",
      "updateTime": "2020-11-05T17:45:42.812656Z"
    }
  }
}
์‘๋‹ต์— "done": true๊ฐ€ ํฌํ•จ๋  ๋•Œ๊นŒ์ง€ ์ž‘์—… ์ƒํƒœ๋ฅผ ํด๋งํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Java

์ด ์ƒ˜ํ”Œ์„ ์‚ฌ์šฉํ•ด ๋ณด๊ธฐ ์ „์— Vertex AI ๋น ๋ฅธ ์‹œ์ž‘: ํด๋ผ์ด์–ธํŠธ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ์‚ฌ์šฉ์˜ Java ์„ค์ • ์•ˆ๋‚ด๋ฅผ ๋”ฐ๋ฅด์„ธ์š”. ์ž์„ธํ•œ ๋‚ด์šฉ์€ Vertex AI Java API ์ฐธ๊ณ  ๋ฌธ์„œ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

Vertex AI์— ์ธ์ฆํ•˜๋ ค๋ฉด ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ๊ธฐ๋ณธ ์‚ฌ์šฉ์ž ์ธ์ฆ ์ •๋ณด๋ฅผ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋‚ด์šฉ์€ ๋กœ์ปฌ ๊ฐœ๋ฐœ ํ™˜๊ฒฝ์˜ ์ธ์ฆ ์„ค์ •์„ ์ฐธ์กฐํ•˜์„ธ์š”.


import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.aiplatform.v1.CreateEndpointOperationMetadata;
import com.google.cloud.aiplatform.v1.Endpoint;
import com.google.cloud.aiplatform.v1.EndpointServiceClient;
import com.google.cloud.aiplatform.v1.EndpointServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateEndpointSample {

  public static void main(String[] args)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String endpointDisplayName = "YOUR_ENDPOINT_DISPLAY_NAME";
    createEndpointSample(project, endpointDisplayName);
  }

  static void createEndpointSample(String project, String endpointDisplayName)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    EndpointServiceSettings endpointServiceSettings =
        EndpointServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (EndpointServiceClient endpointServiceClient =
        EndpointServiceClient.create(endpointServiceSettings)) {
      String location = "us-central1";
      LocationName locationName = LocationName.of(project, location);
      Endpoint endpoint = Endpoint.newBuilder().setDisplayName(endpointDisplayName).build();

      OperationFuture<Endpoint, CreateEndpointOperationMetadata> endpointFuture =
          endpointServiceClient.createEndpointAsync(locationName, endpoint);
      System.out.format("Operation name: %s\n", endpointFuture.getInitialFuture().get().getName());
      System.out.println("Waiting for operation to finish...");
      Endpoint endpointResponse = endpointFuture.get(300, TimeUnit.SECONDS);

      System.out.println("Create Endpoint Response");
      System.out.format("Name: %s\n", endpointResponse.getName());
      System.out.format("Display Name: %s\n", endpointResponse.getDisplayName());
      System.out.format("Description: %s\n", endpointResponse.getDescription());
      System.out.format("Labels: %s\n", endpointResponse.getLabelsMap());
      System.out.format("Create Time: %s\n", endpointResponse.getCreateTime());
      System.out.format("Update Time: %s\n", endpointResponse.getUpdateTime());
    }
  }
}

Node.js

์ด ์ƒ˜ํ”Œ์„ ์‚ฌ์šฉํ•ด ๋ณด๊ธฐ ์ „์— Vertex AI ๋น ๋ฅธ ์‹œ์ž‘: ํด๋ผ์ด์–ธํŠธ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ์‚ฌ์šฉ์˜ Node.js ์„ค์ • ์•ˆ๋‚ด๋ฅผ ๋”ฐ๋ฅด์„ธ์š”. ์ž์„ธํ•œ ๋‚ด์šฉ์€ Vertex AI Node.js API ์ฐธ๊ณ  ๋ฌธ์„œ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

Vertex AI์— ์ธ์ฆํ•˜๋ ค๋ฉด ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ๊ธฐ๋ณธ ์‚ฌ์šฉ์ž ์ธ์ฆ ์ •๋ณด๋ฅผ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋‚ด์šฉ์€ ๋กœ์ปฌ ๊ฐœ๋ฐœ ํ™˜๊ฒฝ์˜ ์ธ์ฆ ์„ค์ •์„ ์ฐธ์กฐํ•˜์„ธ์š”.

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const endpointDisplayName = 'YOUR_ENDPOINT_DISPLAY_NAME';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

// Imports the Google Cloud Endpoint Service Client library
const {EndpointServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const endpointServiceClient = new EndpointServiceClient(clientOptions);

async function createEndpoint() {
  // Configure the parent resource
  const parent = `projects/${project}/locations/${location}`;
  const endpoint = {
    displayName: endpointDisplayName,
  };
  const request = {
    parent,
    endpoint,
  };

  // Get and print out a list of all the endpoints for this resource
  const [response] = await endpointServiceClient.createEndpoint(request);
  console.log(`Long running operation : ${response.name}`);

  // Wait for operation to complete
  await response.promise();
  const result = response.result;

  console.log('Create endpoint response');
  console.log(`\tName : ${result.name}`);
  console.log(`\tDisplay name : ${result.displayName}`);
  console.log(`\tDescription : ${result.description}`);
  console.log(`\tLabels : ${JSON.stringify(result.labels)}`);
  console.log(`\tCreate time : ${JSON.stringify(result.createTime)}`);
  console.log(`\tUpdate time : ${JSON.stringify(result.updateTime)}`);
}
createEndpoint();

Python

Vertex AI SDK for Python์„ ์„ค์น˜ํ•˜๊ฑฐ๋‚˜ ์—…๋ฐ์ดํŠธํ•˜๋Š” ๋ฐฉ๋ฒ•์€ Vertex AI SDK for Python ์„ค์น˜๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”. ์ž์„ธํ•œ ๋‚ด์šฉ์€ Python API ์ฐธ๊ณ  ๋ฌธ์„œ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

def create_endpoint_sample(
    project: str,
    display_name: str,
    location: str,
):
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint.create(
        display_name=display_name,
        project=project,
        location=location,
    )

    print(endpoint.display_name)
    print(endpoint.resource_name)
    return endpoint

์—”๋“œํฌ์ธํŠธ ID ๊ฒ€์ƒ‰

๋ชจ๋ธ์„ ๋ฐฐํฌํ•˜๋ ค๋ฉด ์—”๋“œํฌ์ธํŠธ ID๊ฐ€ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.

gcloud

๋‹ค์Œ ์˜ˆ์‹œ์—์„œ๋Š” gcloud ai endpoints list ๋ช…๋ น์–ด๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

gcloud ai endpoints list \
  --region=LOCATION \
  --filter=display_name=ENDPOINT_NAME

๋‹ค์Œ์„ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค.

  • LOCATION_ID: Vertex AI๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋ฆฌ์ „
  • ENDPOINT_NAME: ์—”๋“œํฌ์ธํŠธ์˜ ํ‘œ์‹œ ์ด๋ฆ„

ENDPOINT_ID ์—ด์— ํ‘œ์‹œ๋˜๋Š” ๋ฒˆํ˜ธ๋ฅผ ํ™•์ธํ•ฉ๋‹ˆ๋‹ค. ๋‹ค์Œ ๋‹จ๊ณ„์—์„œ ์ด ID๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

REST

์š”์ฒญ ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•˜๊ธฐ ์ „์— ๋‹ค์Œ์„ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค.

  • LOCATION_ID: Vertex AI๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋ฆฌ์ „
  • PROJECT_ID: .
  • ENDPOINT_NAME: ์—”๋“œํฌ์ธํŠธ์˜ ํ‘œ์‹œ ์ด๋ฆ„

HTTP ๋ฉ”์„œ๋“œ ๋ฐ URL:

GET https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints?filter=display_name=ENDPOINT_NAME

์š”์ฒญ์„ ๋ณด๋‚ด๋ ค๋ฉด ๋‹ค์Œ ์˜ต์…˜ ์ค‘ ํ•˜๋‚˜๋ฅผ ํŽผ์นฉ๋‹ˆ๋‹ค.

๋‹ค์Œ๊ณผ ๋น„์Šทํ•œ JSON ์‘๋‹ต์ด ํ‘œ์‹œ๋ฉ๋‹ˆ๋‹ค.

{
  "endpoints": [
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID",
      "displayName": "ENDPOINT_NAME",
      "etag": "AMEw9yPz5pf4PwBHbRWOGh0PcAxUdjbdX2Jm3QO_amguy3DbZGP5Oi_YUKRywIE-BtLx",
      "createTime": "2020-04-17T18:31:11.585169Z",
      "updateTime": "2020-04-17T18:35:08.568959Z"
    }
  ]
}
ENDPOINT_ID๋ฅผ ํ™•์ธํ•ฉ๋‹ˆ๋‹ค.

๋ชจ๋ธ ๋ฐฐํฌ

์•„๋ž˜์—์„œ ์–ธ์–ด ๋˜๋Š” ํ™˜๊ฒฝ์— ๋Œ€ํ•œ ํƒญ์„ ์„ ํƒํ•˜์„ธ์š”.

gcloud

๋‹ค์Œ ์˜ˆ์‹œ์—์„œ๋Š” gcloud ai endpoints deploy-model ๋ช…๋ น์–ด๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

๋‹ค์Œ ์˜ˆ์‹œ์—์„œ๋Š” ์—ฌ๋Ÿฌ DeployedModel ๋ฆฌ์†Œ์Šค ๊ฐ„์— ํŠธ๋ž˜ํ”ฝ์„ ๋ถ„ํ• ํ•˜์ง€ ์•Š๊ณ  Model์„ Endpoint์— ๋ฐฐํฌํ•ฉ๋‹ˆ๋‹ค.

์•„๋ž˜์˜ ๋ช…๋ น์–ด ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•˜๊ธฐ ์ „์— ๋‹ค์Œ์„ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค.

  • ENDPOINT_ID: ์—”๋“œํฌ์ธํŠธ์˜ ID
  • LOCATION_ID: Vertex AI๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋ฆฌ์ „
  • MODEL_ID: ๋ฐฐํฌํ•  ๋ชจ๋ธ์˜ ID
  • DEPLOYED_MODEL_NAME: DeployedModel์˜ ์ด๋ฆ„์ž…๋‹ˆ๋‹ค. DeployedModel์˜ Model ํ‘œ์‹œ ์ด๋ฆ„๋„ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • MIN_REPLICA_COUNT: ์ด ๋ฐฐํฌ์˜ ์ตœ์†Œ ๋…ธ๋“œ ์ˆ˜. ์ถ”๋ก  ๋กœ๋“œ ์‹œ ํ•„์š”์— ๋”ฐ๋ผ ๋…ธ๋“œ ์ˆ˜๋ฅผ ์ตœ๋Œ€ ๋…ธ๋“œ ์ˆ˜๊นŒ์ง€ ๋Š˜๋ฆฌ๊ฑฐ๋‚˜ ์ด ๋…ธ๋“œ ์ˆ˜๊นŒ์ง€ ์ค„์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • MAX_REPLICA_COUNT: ์ด ๋ฐฐํฌ์˜ ์ตœ๋Œ€ ๋…ธ๋“œ ์ˆ˜. ์ถ”๋ก  ๋กœ๋“œ ์‹œ ํ•„์š”์— ๋”ฐ๋ผ ์ด ๋…ธ๋“œ ์ˆ˜๋ฅผ ๋…ธ๋“œ ์ˆ˜๊นŒ์ง€ ๋Š˜๋ฆฌ๊ฑฐ๋‚˜ ์ตœ์†Œ ๋…ธ๋“œ ์ˆ˜๊นŒ์ง€ ์ค„์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. --max-replica-count ํ”Œ๋ž˜๊ทธ๋ฅผ ์ƒ๋žตํ•˜๋ฉด ์ตœ๋Œ€ ๋…ธ๋“œ ์ˆ˜๊ฐ€ --min-replica-count ๊ฐ’์œผ๋กœ ์„ค์ •๋ฉ๋‹ˆ๋‹ค.

gcloud ai endpoints deploy-model ๋ช…๋ น์–ด๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

Linux, macOS ๋˜๋Š” Cloud Shell

gcloud ai endpoints deploy-model ENDPOINT_ID\
  --region=LOCATION_ID \
  --model=MODEL_ID \
  --display-name=DEPLOYED_MODEL_NAME \
  --traffic-split=0=100

Windows(PowerShell)

gcloud ai endpoints deploy-model ENDPOINT_ID`
  --region=LOCATION_ID `
  --model=MODEL_ID `
  --display-name=DEPLOYED_MODEL_NAME `
  --traffic-split=0=100

Windows(cmd.exe)

gcloud ai endpoints deploy-model ENDPOINT_ID^
  --region=LOCATION_ID ^
  --model=MODEL_ID ^
  --display-name=DEPLOYED_MODEL_NAME ^
  --traffic-split=0=100
 

ํŠธ๋ž˜ํ”ฝ ๋ถ„ํ• 

์•ž์˜ ์˜ˆ์‹œ์—์„œ --traffic-split=0=100 ํ”Œ๋ž˜๊ทธ๋Š” Endpoint๊ฐ€ ์ˆ˜์‹ ํ•˜๋Š” ์˜ˆ์ธก ํŠธ๋ž˜ํ”ฝ์˜ 100%๋ฅผ ์ƒˆ DeployedModel๋กœ ์ „์†กํ•˜๋ฉฐ ์ž„์‹œ ID๋Š” 0์œผ๋กœ ํ‘œํ˜„๋ฉ๋‹ˆ๋‹ค. Endpoint์— ์ด๋ฏธ ๋‹ค๋ฅธ DeployedModel ๋ฆฌ์†Œ์Šค๊ฐ€ ์žˆ์œผ๋ฉด ์ƒˆ DeployedModel ๋ฐ ์ด์ „ ๋ชจ๋ธ ๊ฐ„์— ํŠธ๋ž˜ํ”ฝ์„ ๋ถ„ํ• ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด ํŠธ๋ž˜ํ”ฝ์˜ 20%๋ฅผ ์ƒˆ DeployedModel๋กœ, 80%๋ฅผ ์ด์ „ ๋ชจ๋ธ๋กœ ์ „์†กํ•˜๋ ค๋ฉด ๋‹ค์Œ ๋ช…๋ น์–ด๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

์•„๋ž˜์˜ ๋ช…๋ น์–ด ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•˜๊ธฐ ์ „์— ๋‹ค์Œ์„ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค.

  • OLD_DEPLOYED_MODEL_ID: ๊ธฐ์กด DeployedModel์˜ ID

gcloud ai endpoints deploy-model ๋ช…๋ น์–ด๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

Linux, macOS ๋˜๋Š” Cloud Shell

gcloud ai endpoints deploy-model ENDPOINT_ID\
  --region=LOCATION_ID \
  --model=MODEL_ID \
  --display-name=DEPLOYED_MODEL_NAME \ 
  --min-replica-count=MIN_REPLICA_COUNT \
  --max-replica-count=MAX_REPLICA_COUNT \
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80

Windows(PowerShell)

gcloud ai endpoints deploy-model ENDPOINT_ID`
  --region=LOCATION_ID `
  --model=MODEL_ID `
  --display-name=DEPLOYED_MODEL_NAME \ 
  --min-replica-count=MIN_REPLICA_COUNT `
  --max-replica-count=MAX_REPLICA_COUNT `
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80

Windows(cmd.exe)

gcloud ai endpoints deploy-model ENDPOINT_ID^
  --region=LOCATION_ID ^
  --model=MODEL_ID ^
  --display-name=DEPLOYED_MODEL_NAME \ 
  --min-replica-count=MIN_REPLICA_COUNT ^
  --max-replica-count=MAX_REPLICA_COUNT ^
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80
 

REST

๋ชจ๋ธ ๋ฐฐํฌ

์š”์ฒญ ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•˜๊ธฐ ์ „์— ๋‹ค์Œ์„ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค.

  • LOCATION_ID: Vertex AI๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋ฆฌ์ „
  • PROJECT_ID: .
  • ENDPOINT_ID: ์—”๋“œํฌ์ธํŠธ์˜ ID
  • MODEL_ID: ๋ฐฐํฌํ•  ๋ชจ๋ธ์˜ ID์ž…๋‹ˆ๋‹ค.
  • DEPLOYED_MODEL_NAME: DeployedModel์˜ ์ด๋ฆ„์ž…๋‹ˆ๋‹ค. DeployedModel์˜ Model ํ‘œ์‹œ ์ด๋ฆ„๋„ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • TRAFFIC_SPLIT_THIS_MODEL: ์ด ์ž‘์—…๊ณผ ํ•จ๊ป˜ ๋ฐฐํฌ๋˜๋Š” ๋ชจ๋ธ๋กœ ๋ผ์šฐํŒ…๋  ์ด ์—”๋“œํฌ์ธํŠธ์— ๋Œ€ํ•œ ์˜ˆ์ธก ํŠธ๋ž˜ํ”ฝ ๋น„์œจ์ž…๋‹ˆ๋‹ค. ๊ธฐ๋ณธ๊ฐ’์€ 100์ž…๋‹ˆ๋‹ค. ๋ชจ๋“  ํŠธ๋ž˜ํ”ฝ ๋น„์œจ์˜ ํ•ฉ์€ 100์ด ๋˜์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ํŠธ๋ž˜ํ”ฝ ๋ถ„ํ• ์— ๋Œ€ํ•ด ์ž์„ธํžˆ ์•Œ์•„๋ณด๊ธฐ
  • DEPLOYED_MODEL_ID_N: ์„ ํƒ์‚ฌํ•ญ. ๋‹ค๋ฅธ ๋ชจ๋ธ์ด ์ด ์—”๋“œํฌ์ธํŠธ์— ๋ฐฐํฌ๋œ ๊ฒฝ์šฐ ๋ชจ๋“  ๋น„์œจ์˜ ํ•ฉ์ด 100์ด ๋˜๋„๋ก ํŠธ๋ž˜ํ”ฝ ๋ถ„ํ•  ๋น„์œจ์„ ์—…๋ฐ์ดํŠธํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.
  • TRAFFIC_SPLIT_MODEL_N: ๋ฐฐํฌ๋œ ๋ชจ๋ธ ID ํ‚ค์˜ ํŠธ๋ž˜ํ”ฝ ๋ถ„ํ•  ๋น„์œจ ๊ฐ’
  • PROJECT_NUMBER: ํ”„๋กœ์ ํŠธ์˜ ์ž๋™์œผ๋กœ ์ƒ์„ฑ๋œ ํ”„๋กœ์ ํŠธ ๋ฒˆํ˜ธ

HTTP ๋ฉ”์„œ๋“œ ๋ฐ URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:deployModel

JSON ์š”์ฒญ ๋ณธ๋ฌธ:

{
  "deployedModel": {
    "model": "projects/PROJECT_ID/locations/us-central1/models/MODEL_ID",
    "displayName": "DEPLOYED_MODEL_NAME",
    "automaticResources": {
     }
  },
  "trafficSplit": {
    "0": TRAFFIC_SPLIT_THIS_MODEL,
    "DEPLOYED_MODEL_ID_1": TRAFFIC_SPLIT_MODEL_1,
    "DEPLOYED_MODEL_ID_2": TRAFFIC_SPLIT_MODEL_2
  },
}

์š”์ฒญ์„ ๋ณด๋‚ด๋ ค๋ฉด ๋‹ค์Œ ์˜ต์…˜ ์ค‘ ํ•˜๋‚˜๋ฅผ ํŽผ์นฉ๋‹ˆ๋‹ค.

๋‹ค์Œ๊ณผ ๋น„์Šทํ•œ JSON ์‘๋‹ต์ด ํ‘œ์‹œ๋ฉ๋‹ˆ๋‹ค.

{
  "name": "projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployModelOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-10-19T17:53:16.502088Z",
      "updateTime": "2020-10-19T17:53:16.502088Z"
    }
  }
}

Java

์ด ์ƒ˜ํ”Œ์„ ์‚ฌ์šฉํ•ด ๋ณด๊ธฐ ์ „์— Vertex AI ๋น ๋ฅธ ์‹œ์ž‘: ํด๋ผ์ด์–ธํŠธ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ์‚ฌ์šฉ์˜ Java ์„ค์ • ์•ˆ๋‚ด๋ฅผ ๋”ฐ๋ฅด์„ธ์š”. ์ž์„ธํ•œ ๋‚ด์šฉ์€ Vertex AI Java API ์ฐธ๊ณ  ๋ฌธ์„œ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

Vertex AI์— ์ธ์ฆํ•˜๋ ค๋ฉด ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ๊ธฐ๋ณธ ์‚ฌ์šฉ์ž ์ธ์ฆ ์ •๋ณด๋ฅผ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋‚ด์šฉ์€ ๋กœ์ปฌ ๊ฐœ๋ฐœ ํ™˜๊ฒฝ์˜ ์ธ์ฆ ์„ค์ •์„ ์ฐธ์กฐํ•˜์„ธ์š”.


import com.google.api.gax.longrunning.OperationFuture;
import com.google.api.gax.longrunning.OperationTimedPollAlgorithm;
import com.google.api.gax.retrying.RetrySettings;
import com.google.cloud.aiplatform.v1.AutomaticResources;
import com.google.cloud.aiplatform.v1.DedicatedResources;
import com.google.cloud.aiplatform.v1.DeployModelOperationMetadata;
import com.google.cloud.aiplatform.v1.DeployModelResponse;
import com.google.cloud.aiplatform.v1.DeployedModel;
import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.EndpointServiceClient;
import com.google.cloud.aiplatform.v1.EndpointServiceSettings;
import com.google.cloud.aiplatform.v1.MachineSpec;
import com.google.cloud.aiplatform.v1.ModelName;
import com.google.cloud.aiplatform.v1.stub.EndpointServiceStubSettings;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import org.threeten.bp.Duration;

public class DeployModelSample {

  public static void main(String[] args)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String deployedModelDisplayName = "YOUR_DEPLOYED_MODEL_DISPLAY_NAME";
    String endpointId = "YOUR_ENDPOINT_NAME";
    String modelId = "YOUR_MODEL_ID";
    int timeout = 900;
    deployModelSample(project, deployedModelDisplayName, endpointId, modelId, timeout);
  }

  static void deployModelSample(
      String project,
      String deployedModelDisplayName,
      String endpointId,
      String modelId,
      int timeout)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {

    // Set long-running operations (LROs) timeout
    final OperationTimedPollAlgorithm operationTimedPollAlgorithm =
        OperationTimedPollAlgorithm.create(
            RetrySettings.newBuilder()
                .setInitialRetryDelay(Duration.ofMillis(5000L))
                .setRetryDelayMultiplier(1.5)
                .setMaxRetryDelay(Duration.ofMillis(45000L))
                .setInitialRpcTimeout(Duration.ZERO)
                .setRpcTimeoutMultiplier(1.0)
                .setMaxRpcTimeout(Duration.ZERO)
                .setTotalTimeout(Duration.ofSeconds(timeout))
                .build());

    EndpointServiceStubSettings.Builder endpointServiceStubSettingsBuilder =
        EndpointServiceStubSettings.newBuilder();
    endpointServiceStubSettingsBuilder
        .deployModelOperationSettings()
        .setPollingAlgorithm(operationTimedPollAlgorithm);
    EndpointServiceStubSettings endpointStubSettings = endpointServiceStubSettingsBuilder.build();
    EndpointServiceSettings endpointServiceSettings =
        EndpointServiceSettings.create(endpointStubSettings);
    endpointServiceSettings =
        endpointServiceSettings.toBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (EndpointServiceClient endpointServiceClient =
        EndpointServiceClient.create(endpointServiceSettings)) {
      String location = "us-central1";
      EndpointName endpointName = EndpointName.of(project, location, endpointId);
      // key '0' assigns traffic for the newly deployed model
      // Traffic percentage values must add up to 100
      // Leave dictionary empty if endpoint should not accept any traffic
      Map<String, Integer> trafficSplit = new HashMap<>();
      trafficSplit.put("0", 100);
      ModelName modelName = ModelName.of(project, location, modelId);
      AutomaticResources automaticResourcesInput =
          AutomaticResources.newBuilder().setMinReplicaCount(1).setMaxReplicaCount(1).build();
      DeployedModel deployedModelInput =
          DeployedModel.newBuilder()
              .setModel(modelName.toString())
              .setDisplayName(deployedModelDisplayName)
              .setAutomaticResources(automaticResourcesInput)
              .build();

      OperationFuture<DeployModelResponse, DeployModelOperationMetadata> deployModelResponseFuture =
          endpointServiceClient.deployModelAsync(endpointName, deployedModelInput, trafficSplit);
      System.out.format(
          "Operation name: %s\n", deployModelResponseFuture.getInitialFuture().get().getName());
      System.out.println("Waiting for operation to finish...");
      DeployModelResponse deployModelResponse = deployModelResponseFuture.get(20, TimeUnit.MINUTES);

      System.out.println("Deploy Model Response");
      DeployedModel deployedModel = deployModelResponse.getDeployedModel();
      System.out.println("\tDeployed Model");
      System.out.format("\t\tid: %s\n", deployedModel.getId());
      System.out.format("\t\tmodel: %s\n", deployedModel.getModel());
      System.out.format("\t\tDisplay Name: %s\n", deployedModel.getDisplayName());
      System.out.format("\t\tCreate Time: %s\n", deployedModel.getCreateTime());

      DedicatedResources dedicatedResources = deployedModel.getDedicatedResources();
      System.out.println("\t\tDedicated Resources");
      System.out.format("\t\t\tMin Replica Count: %s\n", dedicatedResources.getMinReplicaCount());

      MachineSpec machineSpec = dedicatedResources.getMachineSpec();
      System.out.println("\t\t\tMachine Spec");
      System.out.format("\t\t\t\tMachine Type: %s\n", machineSpec.getMachineType());
      System.out.format("\t\t\t\tAccelerator Type: %s\n", machineSpec.getAcceleratorType());
      System.out.format("\t\t\t\tAccelerator Count: %s\n", machineSpec.getAcceleratorCount());

      AutomaticResources automaticResources = deployedModel.getAutomaticResources();
      System.out.println("\t\tAutomatic Resources");
      System.out.format("\t\t\tMin Replica Count: %s\n", automaticResources.getMinReplicaCount());
      System.out.format("\t\t\tMax Replica Count: %s\n", automaticResources.getMaxReplicaCount());
    }
  }
}

Node.js

์ด ์ƒ˜ํ”Œ์„ ์‚ฌ์šฉํ•ด ๋ณด๊ธฐ ์ „์— Vertex AI ๋น ๋ฅธ ์‹œ์ž‘: ํด๋ผ์ด์–ธํŠธ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ์‚ฌ์šฉ์˜ Node.js ์„ค์ • ์•ˆ๋‚ด๋ฅผ ๋”ฐ๋ฅด์„ธ์š”. ์ž์„ธํ•œ ๋‚ด์šฉ์€ Vertex AI Node.js API ์ฐธ๊ณ  ๋ฌธ์„œ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

Vertex AI์— ์ธ์ฆํ•˜๋ ค๋ฉด ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ๊ธฐ๋ณธ ์‚ฌ์šฉ์ž ์ธ์ฆ ์ •๋ณด๋ฅผ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋‚ด์šฉ์€ ๋กœ์ปฌ ๊ฐœ๋ฐœ ํ™˜๊ฒฝ์˜ ์ธ์ฆ ์„ค์ •์„ ์ฐธ์กฐํ•˜์„ธ์š”.

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const modelId = "YOUR_MODEL_ID";
// const endpointId = 'YOUR_ENDPOINT_ID';
// const deployedModelDisplayName = 'YOUR_DEPLOYED_MODEL_DISPLAY_NAME';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

const modelName = `projects/${project}/locations/${location}/models/${modelId}`;
const endpoint = `projects/${project}/locations/${location}/endpoints/${endpointId}`;
// Imports the Google Cloud Endpoint Service Client library
const {EndpointServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint:
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const endpointServiceClient = new EndpointServiceClient(clientOptions);

async function deployModel() {
  // Configure the parent resource
  // key '0' assigns traffic for the newly deployed model
  // Traffic percentage values must add up to 100
  // Leave dictionary empty if endpoint should not accept any traffic
  const trafficSplit = {0: 100};
  const deployedModel = {
    // format: 'projects/{project}/locations/{location}/models/{model}'
    model: modelName,
    displayName: deployedModelDisplayName,
    automaticResources: {minReplicaCount: 1, maxReplicaCount: 1},
  };
  const request = {
    endpoint,
    deployedModel,
    trafficSplit,
  };

  // Get and print out a list of all the endpoints for this resource
  const [response] = await endpointServiceClient.deployModel(request);
  console.log(`Long running operation : ${response.name}`);

  // Wait for operation to complete
  await response.promise();
  const result = response.result;

  console.log('Deploy model response');
  const modelDeployed = result.deployedModel;
  console.log('\tDeployed model');
  if (!modelDeployed) {
    console.log('\t\tId : {}');
    console.log('\t\tModel : {}');
    console.log('\t\tDisplay name : {}');
    console.log('\t\tCreate time : {}');

    console.log('\t\tDedicated resources');
    console.log('\t\t\tMin replica count : {}');
    console.log('\t\t\tMachine spec {}');
    console.log('\t\t\t\tMachine type : {}');
    console.log('\t\t\t\tAccelerator type : {}');
    console.log('\t\t\t\tAccelerator count : {}');

    console.log('\t\tAutomatic resources');
    console.log('\t\t\tMin replica count : {}');
    console.log('\t\t\tMax replica count : {}');
  } else {
    console.log(`\t\tId : ${modelDeployed.id}`);
    console.log(`\t\tModel : ${modelDeployed.model}`);
    console.log(`\t\tDisplay name : ${modelDeployed.displayName}`);
    console.log(`\t\tCreate time : ${modelDeployed.createTime}`);

    const dedicatedResources = modelDeployed.dedicatedResources;
    console.log('\t\tDedicated resources');
    if (!dedicatedResources) {
      console.log('\t\t\tMin replica count : {}');
      console.log('\t\t\tMachine spec {}');
      console.log('\t\t\t\tMachine type : {}');
      console.log('\t\t\t\tAccelerator type : {}');
      console.log('\t\t\t\tAccelerator count : {}');
    } else {
      console.log(
        `\t\t\tMin replica count : \
          ${dedicatedResources.minReplicaCount}`
      );
      const machineSpec = dedicatedResources.machineSpec;
      console.log('\t\t\tMachine spec');
      console.log(`\t\t\t\tMachine type : ${machineSpec.machineType}`);
      console.log(
        `\t\t\t\tAccelerator type : ${machineSpec.acceleratorType}`
      );
      console.log(
        `\t\t\t\tAccelerator count : ${machineSpec.acceleratorCount}`
      );
    }

    const automaticResources = modelDeployed.automaticResources;
    console.log('\t\tAutomatic resources');
    if (!automaticResources) {
      console.log('\t\t\tMin replica count : {}');
      console.log('\t\t\tMax replica count : {}');
    } else {
      console.log(
        `\t\t\tMin replica count : \
          ${automaticResources.minReplicaCount}`
      );
      console.log(
        `\t\t\tMax replica count : \
          ${automaticResources.maxReplicaCount}`
      );
    }
  }
}
deployModel();

Python

Vertex AI SDK for Python์„ ์„ค์น˜ํ•˜๊ฑฐ๋‚˜ ์—…๋ฐ์ดํŠธํ•˜๋Š” ๋ฐฉ๋ฒ•์€ Vertex AI SDK for Python ์„ค์น˜๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”. ์ž์„ธํ•œ ๋‚ด์šฉ์€ Python API ์ฐธ๊ณ  ๋ฌธ์„œ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

def deploy_model_with_automatic_resources_sample(
    project,
    location,
    model_name: str,
    endpoint: Optional[aiplatform.Endpoint] = None,
    deployed_model_display_name: Optional[str] = None,
    traffic_percentage: Optional[int] = 0,
    traffic_split: Optional[Dict[str, int]] = None,
    min_replica_count: int = 1,
    max_replica_count: int = 1,
    metadata: Optional[Sequence[Tuple[str, str]]] = (),
    sync: bool = True,
):
    """
    model_name: A fully-qualified model resource name or model ID.
          Example: "projects/123/locations/us-central1/models/456" or
          "456" when project and location are initialized or passed.
    """

    aiplatform.init(project=project, location=location)

    model = aiplatform.Model(model_name=model_name)

    model.deploy(
        endpoint=endpoint,
        deployed_model_display_name=deployed_model_display_name,
        traffic_percentage=traffic_percentage,
        traffic_split=traffic_split,
        min_replica_count=min_replica_count,
        max_replica_count=max_replica_count,
        metadata=metadata,
        sync=sync,
    )

    model.wait()

    print(model.display_name)
    print(model.resource_name)
    return model

์ž‘์—… ์ƒํƒœ ๊ฐ€์ ธ์˜ค๊ธฐ

์ผ๋ถ€ ์š”์ฒญ์€ ์™„๋ฃŒํ•˜๋Š” ๋ฐ ์‹œ๊ฐ„์ด ๊ฑธ๋ฆฌ๋Š” ์žฅ๊ธฐ ์‹คํ–‰ ์ž‘์—…์„ ์‹œ์ž‘ํ•ฉ๋‹ˆ๋‹ค. ์ด๋Ÿฌํ•œ ์š”์ฒญ์€ ์ž‘์—… ์ƒํƒœ๋ฅผ ๋ณด๊ฑฐ๋‚˜ ์ž‘์—…์„ ์ทจ์†Œํ•˜๋Š” ๋ฐ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋Š” ์ž‘์—… ์ด๋ฆ„์„ ๋ฐ˜ํ™˜ํ•ฉ๋‹ˆ๋‹ค. Vertex AI๋Š” ์žฅ๊ธฐ ์‹คํ–‰ ์ž‘์—…์„ ํ˜ธ์ถœํ•˜๋Š” ๋„์šฐ๋ฏธ ๋ฉ”์„œ๋“œ๋ฅผ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋‚ด์šฉ์€ ์žฅ๊ธฐ ์‹คํ–‰ ์ž‘์—… ๋‹ค๋ฃจ๊ธฐ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

๋ฐฐํฌ๋œ ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜์—ฌ ์˜จ๋ผ์ธ ์˜ˆ์ธก ์ˆ˜ํ–‰

์˜จ๋ผ์ธ ์˜ˆ์ธก์„ ์ˆ˜ํ–‰ํ•˜๋ ค๋ฉด ๋ถ„์„์„ ์œ„ํ•ด ํ•˜๋‚˜ ์ด์ƒ์˜ ํ…Œ์ŠคํŠธ ํ•ญ๋ชฉ์„ ๋ชจ๋ธ์— ์ œ์ถœํ•˜๋ฉด ๋ชจ๋ธ์ด ๋ชจ๋ธ์˜ ๋ชฉํ‘œ์— ๋”ฐ๋ฅธ ๊ฒฐ๊ณผ๋ฅผ ๋ฐ˜ํ™˜ํ•ฉ๋‹ˆ๋‹ค. ์˜ˆ์ธก ๊ฒฐ๊ณผ์— ๋Œ€ํ•œ ์ž์„ธํ•œ ๋‚ด์šฉ์€ ๊ฒฐ๊ณผ ํ•ด์„ ํŽ˜์ด์ง€๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

์ฝ˜์†”

Google Cloud ์ฝ˜์†”์„ ์‚ฌ์šฉํ•˜์—ฌ ์˜จ๋ผ์ธ ์˜ˆ์ธก์„ ์š”์ฒญํ•ฉ๋‹ˆ๋‹ค. ๋ชจ๋ธ์„ ์—”๋“œํฌ์ธํŠธ์— ๋ฐฐํฌํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

  1. Google Cloud ์ฝ˜์†”์˜ Vertex AI ์„น์…˜์—์„œ ๋ชจ๋ธ ํŽ˜์ด์ง€๋กœ ์ด๋™ํ•ฉ๋‹ˆ๋‹ค.

    ๋ชจ๋ธ ํŽ˜์ด์ง€๋กœ ์ด๋™

  2. ๋ชจ๋ธ ๋ชฉ๋ก์—์„œ ์˜ˆ์ธก์„ ์š”์ฒญํ•  ๋ชจ๋ธ์˜ ์ด๋ฆ„์„ ํด๋ฆญํ•ฉ๋‹ˆ๋‹ค.

  3. ๋ฐฐํฌ ๋ฐ ํ…Œ์ŠคํŠธ ํƒญ์„ ์„ ํƒํ•ฉ๋‹ˆ๋‹ค.

  4. ๋ชจ๋ธ ํ…Œ์ŠคํŠธ ์„น์…˜์—์„œ ํ…Œ์ŠคํŠธ ํ•ญ๋ชฉ์„ ์ถ”๊ฐ€ํ•˜์—ฌ ์˜ˆ์ธก์„ ์š”์ฒญํ•ฉ๋‹ˆ๋‹ค.

    ํ…์ŠคํŠธ ๋ชฉํ‘œ์— ๋Œ€ํ•œ AutoML ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜๋ ค๋ฉด ํ…์ŠคํŠธ ์ž…๋ ฅ๋ž€์— ์ฝ˜ํ…์ธ ๋ฅผ ์ž…๋ ฅํ•˜๊ณ  ์˜ˆ์ธก์„ ํด๋ฆญํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

    ๋กœ์ปฌ ํŠน์„ฑ ์ค‘์š”๋„์— ๋Œ€ํ•œ ์ž์„ธํ•œ ๋‚ด์šฉ์€ ์„ค๋ช… ๋ณด๊ธฐ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

    ์˜ˆ์ธก์ด ์™„๋ฃŒ๋˜๋ฉด Vertex AI๊ฐ€ ์ฝ˜์†”์— ๊ฒฐ๊ณผ๋ฅผ ๋ฐ˜ํ™˜ํ•ฉ๋‹ˆ๋‹ค.

API

Vertex AI API๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์˜จ๋ผ์ธ ์˜ˆ์ธก์„ ์š”์ฒญํ•ฉ๋‹ˆ๋‹ค. ๋ชจ๋ธ์„ ์—”๋“œํฌ์ธํŠธ์— ๋ฐฐํฌํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

gcloud

  1. ๋‹ค์Œ ์ฝ˜ํ…์ธ ๋กœ request.json๋ผ๋Š” ํŒŒ์ผ์„ ๋งŒ๋“ญ๋‹ˆ๋‹ค.

    {
      "instances": [{
        "mimeType": "text/plain",
        "content": "CONTENT"
      }]
    }
    

    ๋‹ค์Œ์„ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค.

    • CONTENT: ์˜ˆ์ธก์— ์‚ฌ์šฉ๋˜๋Š” ํ…์ŠคํŠธ ์Šค๋‹ˆํŽซ์ž…๋‹ˆ๋‹ค.
  2. ๋‹ค์Œ ๋ช…๋ น์–ด๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

    gcloud ai endpoints predict ENDPOINT_ID \
      --region=LOCATION_ID \
      --json-request=request.json

    ๋‹ค์Œ์„ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค.

    • ENDPOINT_ID: ์—”๋“œํฌ์ธํŠธ์˜ ID
    • LOCATION_ID: Vertex AI๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋ฆฌ์ „

REST

์š”์ฒญ ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•˜๊ธฐ ์ „์— ๋‹ค์Œ์„ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค.

  • LOCATION_ID: ์—”๋“œํฌ์ธํŠธ๊ฐ€ ์žˆ๋Š” ๋ฆฌ์ „. ์˜ˆ๋ฅผ ๋“ค๋ฉด us-central1์ž…๋‹ˆ๋‹ค.
  • PROJECT_ID:
  • ENDPOINT_ID: ์—”๋“œํฌ์ธํŠธ์˜ ID์ž…๋‹ˆ๋‹ค.
  • CONTENT: ์˜ˆ์ธก์— ์‚ฌ์šฉ๋˜๋Š” ํ…์ŠคํŠธ ์Šค๋‹ˆํŽซ์ž…๋‹ˆ๋‹ค.
  • DEPLOYED_MODEL_ID: ์˜ˆ์ธก์„ ์œ„ํ•ด ์‚ฌ์šฉ๋œ ๋ฐฐํฌ๋œ ๋ชจ๋ธ์˜ ID์ž…๋‹ˆ๋‹ค.

HTTP ๋ฉ”์„œ๋“œ ๋ฐ URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict

JSON ์š”์ฒญ ๋ณธ๋ฌธ:

{
  "instances": [{
    "mimeType": "text/plain",
    "content": "CONTENT"
  }]
}

์š”์ฒญ์„ ๋ณด๋‚ด๋ ค๋ฉด ๋‹ค์Œ ์˜ต์…˜ ์ค‘ ํ•˜๋‚˜๋ฅผ ์„ ํƒํ•ฉ๋‹ˆ๋‹ค.

curl

์š”์ฒญ ๋ณธ๋ฌธ์„ request.json ํŒŒ์ผ์— ์ €์žฅํ•˜๊ณ  ๋‹ค์Œ ๋ช…๋ น์–ด๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict"

PowerShell

์š”์ฒญ ๋ณธ๋ฌธ์„ request.json ํŒŒ์ผ์— ์ €์žฅํ•˜๊ณ  ๋‹ค์Œ ๋ช…๋ น์–ด๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict" | Select-Object -Expand Content

๋‹ค์Œ๊ณผ ๋น„์Šทํ•œ JSON ์‘๋‹ต์ด ํ‘œ์‹œ๋ฉ๋‹ˆ๋‹ค.

{
  "predictions": [
    {
      "ids": [
        "1234567890123456789",
        "2234567890123456789",
        "3234567890123456789"
      ],
      "displayNames": [
        "GreatService",
        "Suggestion",
        "InfoRequest"
      ],
      "confidences": [
        0.8986392080783844,
        0.81984345316886902,
        0.7722353458404541
      ]
    }
  ],
  "deployedModelId": "0123456789012345678"
}

Java

์ด ์ƒ˜ํ”Œ์„ ์‚ฌ์šฉํ•ด ๋ณด๊ธฐ ์ „์— Vertex AI ๋น ๋ฅธ ์‹œ์ž‘: ํด๋ผ์ด์–ธํŠธ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ์‚ฌ์šฉ์˜ Java ์„ค์ • ์•ˆ๋‚ด๋ฅผ ๋”ฐ๋ฅด์„ธ์š”. ์ž์„ธํ•œ ๋‚ด์šฉ์€ Vertex AI Java API ์ฐธ๊ณ  ๋ฌธ์„œ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

Vertex AI์— ์ธ์ฆํ•˜๋ ค๋ฉด ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ๊ธฐ๋ณธ ์‚ฌ์šฉ์ž ์ธ์ฆ ์ •๋ณด๋ฅผ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋‚ด์šฉ์€ ๋กœ์ปฌ ๊ฐœ๋ฐœ ํ™˜๊ฒฝ์˜ ์ธ์ฆ ์„ค์ •์„ ์ฐธ์กฐํ•˜์„ธ์š”.

import com.google.cloud.aiplatform.util.ValueConverter;
import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.cloud.aiplatform.v1.schema.predict.instance.TextClassificationPredictionInstance;
import com.google.cloud.aiplatform.v1.schema.predict.prediction.ClassificationPredictionResult;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PredictTextClassificationSingleLabelSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String content = "YOUR_TEXT_CONTENT";
    String endpointId = "YOUR_ENDPOINT_ID";

    predictTextClassificationSingleLabel(project, content, endpointId);
  }

  static void predictTextClassificationSingleLabel(
      String project, String content, String endpointId) throws IOException {
    PredictionServiceSettings predictionServiceSettings =
        PredictionServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (PredictionServiceClient predictionServiceClient =
        PredictionServiceClient.create(predictionServiceSettings)) {
      String location = "us-central1";
      EndpointName endpointName = EndpointName.of(project, location, endpointId);

      TextClassificationPredictionInstance predictionInstance =
          TextClassificationPredictionInstance.newBuilder().setContent(content).build();

      List<Value> instances = new ArrayList<>();
      instances.add(ValueConverter.toValue(predictionInstance));

      PredictResponse predictResponse =
          predictionServiceClient.predict(endpointName, instances, ValueConverter.EMPTY_VALUE);
      System.out.println("Predict Text Classification Response");
      System.out.format("\tDeployed Model Id: %s\n", predictResponse.getDeployedModelId());

      System.out.println("Predictions:\n\n");
      for (Value prediction : predictResponse.getPredictionsList()) {

        ClassificationPredictionResult.Builder resultBuilder =
            ClassificationPredictionResult.newBuilder();

        // Display names and confidences values correspond to
        // IDs in the ID list.
        ClassificationPredictionResult result =
            (ClassificationPredictionResult) ValueConverter.fromValue(resultBuilder, prediction);
        int counter = 0;
        for (Long id : result.getIdsList()) {
          System.out.printf("Label ID: %d\n", id);
          System.out.printf("Label: %s\n", result.getDisplayNames(counter));
          System.out.printf("Confidence: %.4f\n", result.getConfidences(counter));
          counter++;
        }
      }
    }
  }
}

Node.js

์ด ์ƒ˜ํ”Œ์„ ์‚ฌ์šฉํ•ด ๋ณด๊ธฐ ์ „์— Vertex AI ๋น ๋ฅธ ์‹œ์ž‘: ํด๋ผ์ด์–ธํŠธ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ์‚ฌ์šฉ์˜ Node.js ์„ค์ • ์•ˆ๋‚ด๋ฅผ ๋”ฐ๋ฅด์„ธ์š”. ์ž์„ธํ•œ ๋‚ด์šฉ์€ Vertex AI Node.js API ์ฐธ๊ณ  ๋ฌธ์„œ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

Vertex AI์— ์ธ์ฆํ•˜๋ ค๋ฉด ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ๊ธฐ๋ณธ ์‚ฌ์šฉ์ž ์ธ์ฆ ์ •๋ณด๋ฅผ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋‚ด์šฉ์€ ๋กœ์ปฌ ๊ฐœ๋ฐœ ํ™˜๊ฒฝ์˜ ์ธ์ฆ ์„ค์ •์„ ์ฐธ์กฐํ•˜์„ธ์š”.

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const text = 'YOUR_PREDICTION_TEXT';
// const endpointId = 'YOUR_ENDPOINT_ID';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';
const aiplatform = require('@google-cloud/aiplatform');
const {instance, prediction} =
  aiplatform.protos.google.cloud.aiplatform.v1.schema.predict;

// Imports the Google Cloud Model Service Client library
const {PredictionServiceClient} = aiplatform.v1;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const predictionServiceClient = new PredictionServiceClient(clientOptions);

async function predictTextClassification() {
  // Configure the resources
  const endpoint = `projects/${project}/locations/${location}/endpoints/${endpointId}`;

  const predictionInstance =
    new instance.TextClassificationPredictionInstance({
      content: text,
    });
  const instanceValue = predictionInstance.toValue();

  const instances = [instanceValue];
  const request = {
    endpoint,
    instances,
  };

  const [response] = await predictionServiceClient.predict(request);
  console.log('Predict text classification response');
  console.log(`\tDeployed model id : ${response.deployedModelId}\n\n`);

  console.log('Prediction results:');

  for (const predictionResultValue of response.predictions) {
    const predictionResult =
      prediction.ClassificationPredictionResult.fromValue(
        predictionResultValue
      );

    for (const [i, label] of predictionResult.displayNames.entries()) {
      console.log(`\tDisplay name: ${label}`);
      console.log(`\tConfidences: ${predictionResult.confidences[i]}`);
      console.log(`\tIDs: ${predictionResult.ids[i]}\n\n`);
    }
  }
}
predictTextClassification();

Python

Vertex AI SDK for Python์„ ์„ค์น˜ํ•˜๊ฑฐ๋‚˜ ์—…๋ฐ์ดํŠธํ•˜๋Š” ๋ฐฉ๋ฒ•์€ Vertex AI SDK for Python ์„ค์น˜๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”. ์ž์„ธํ•œ ๋‚ด์šฉ์€ Python API ์ฐธ๊ณ  ๋ฌธ์„œ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

def predict_text_classification_single_label_sample(
    project, location, endpoint, content
):
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint(endpoint)

    response = endpoint.predict(instances=[{"content": content}], parameters={})

    for prediction_ in response.predictions:
        print(prediction_)

์ผ๊ด„ ์˜ˆ์ธก ๊ฐ€์ ธ์˜ค๊ธฐ

์ผ๊ด„ ์˜ˆ์ธก ์š”์ฒญ์„ ์ˆ˜ํ–‰ํ•˜๋ ค๋ฉด Vertex AI๊ฐ€ ์˜ˆ์ธก ๊ฒฐ๊ณผ๋ฅผ ์ €์žฅํ•˜๋Š” ์ž…๋ ฅ ์†Œ์Šค์™€ ์ถœ๋ ฅ ํ˜•์‹์„ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค.

์ž…๋ ฅ ๋ฐ์ดํ„ฐ ์š”๊ตฌ์‚ฌํ•ญ

์ผ๊ด„ ์š”์ฒญ์˜ ์ž…๋ ฅ์€ ์˜ˆ์ธก์„ ์œ„ํ•ด ๋ชจ๋ธ์— ๋ณด๋‚ผ ํ•ญ๋ชฉ์„ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค. ํ…์ŠคํŠธ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์˜ ๊ฒฝ์šฐ JSON Lines ํŒŒ์ผ์„ ์‚ฌ์šฉํ•˜์—ฌ ์˜ˆ์ธก์„ ์ˆ˜ํ–‰ํ•  ๋ฌธ์„œ ๋ชฉ๋ก์„ ์ง€์ •ํ•œ ๋‹ค์Œ JSON Lines ํŒŒ์ผ์„ Cloud Storage ๋ฒ„ํ‚ท์— ์ €์žฅํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋‹ค์Œ ์ƒ˜ํ”Œ์—์„œ๋Š” ์ž…๋ ฅ JSON Lines ํŒŒ์ผ์˜ ๋‹จ์ผ ์ค„์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค.

{"content": "gs://sourcebucket/datasets/texts/source_text.txt", "mimeType": "text/plain"}

๋ฐฐ์น˜ ์˜ˆ์ธก ์š”์ฒญ

์ผ๊ด„ ์˜ˆ์ธก ์š”์ฒญ์˜ ๊ฒฝ์šฐ Google Cloud ์ฝ˜์†” ๋˜๋Š” Vertex AI API๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ œ์ถœํ•œ ์ž…๋ ฅ ํ•ญ๋ชฉ ์ˆ˜์— ๋”ฐ๋ผ ์ผ๊ด„ ์˜ˆ์ธก ํƒœ์Šคํฌ๋ฅผ ์™„๋ฃŒํ•˜๋Š” ๋ฐ ๋‹ค์†Œ ์‹œ๊ฐ„์ด ๊ฑธ๋ฆด ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Google Cloud ์ฝ˜์†”

Google Cloud ์ฝ˜์†”์„ ์‚ฌ์šฉํ•˜์—ฌ ์ผ๊ด„ ์˜ˆ์ธก์„ ์š”์ฒญํ•ฉ๋‹ˆ๋‹ค.

  1. Google Cloud ์ฝ˜์†”์˜ Vertex AI ์„น์…˜์—์„œ ์ผ๊ด„ ์˜ˆ์ธก ํŽ˜์ด์ง€๋กœ ์ด๋™ํ•ฉ๋‹ˆ๋‹ค.

    ์ผ๊ด„ ์˜ˆ์ธก ํŽ˜์ด์ง€๋กœ ์ด๋™

  2. ๋งŒ๋“ค๊ธฐ๋ฅผ ํด๋ฆญํ•˜์—ฌ ์ƒˆ ์ผ๊ด„ ์˜ˆ์ธก ์ฐฝ์„ ์—ด๊ณ  ๋‹ค์Œ ๋‹จ๊ณ„๋ฅผ ์™„๋ฃŒํ•ฉ๋‹ˆ๋‹ค.

    1. ์ผ๊ด„ ์˜ˆ์ธก์˜ ์ด๋ฆ„์„ ์ž…๋ ฅํ•ฉ๋‹ˆ๋‹ค.
    2. ๋ชจ๋ธ ์ด๋ฆ„์—์„œ ์ด ์ผ๊ด„ ์˜ˆ์ธก์— ์‚ฌ์šฉํ•  ๋ชจ๋ธ์˜ ์ด๋ฆ„์„ ์„ ํƒํ•ฉ๋‹ˆ๋‹ค.
    3. ์†Œ์Šค ๊ฒฝ๋กœ์—์„œ JSON Lines ์ž…๋ ฅ ํŒŒ์ผ์ด ์žˆ๋Š” Cloud Storage ์œ„์น˜๋ฅผ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค.
    4. ๋Œ€์ƒ ๊ฒฝ๋กœ์—์„œ ์ผ๊ด„ ์˜ˆ์ธก ๊ฒฐ๊ณผ๊ฐ€ ์ €์žฅ๋˜๋Š” Cloud Storage ์œ„์น˜๋ฅผ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค. ์ถœ๋ ฅ ํ˜•์‹์€ ๋ชจ๋ธ์˜ ๋ชฉํ‘œ์— ๋”ฐ๋ผ ๊ฒฐ์ •๋ฉ๋‹ˆ๋‹ค. ํ…์ŠคํŠธ ๋ชฉํ‘œ์˜ AutoML ๋ชจ๋ธ์€ JSON Lines ํŒŒ์ผ์„ ์ถœ๋ ฅํ•ฉ๋‹ˆ๋‹ค.

API

Vertex AI API๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ผ๊ด„ ์˜ˆ์ธก ์š”์ฒญ์„ ์ „์†กํ•ฉ๋‹ˆ๋‹ค.

REST

์š”์ฒญ ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•˜๊ธฐ ์ „์— ๋‹ค์Œ์„ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค.

  • LOCATION_IS: ๋ชจ๋ธ์ด ์ €์žฅ๋˜๊ณ  ์ผ๊ด„ ์˜ˆ์ธก ์ž‘์—…์ด ์‹คํ–‰๋˜๋Š” ๋ฆฌ์ „. ์˜ˆ๋ฅผ ๋“ค๋ฉด us-central1์ž…๋‹ˆ๋‹ค.
  • PROJECT_ID:
  • BATCH_JOB_NAME: ์ผ๊ด„ ์ž‘์—…์˜ ํ‘œ์‹œ ์ด๋ฆ„
  • MODEL_ID: ์˜ˆ์ธก์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๋ฐ ์‚ฌ์šฉํ•  ๋ชจ๋ธ์˜ ID
  • URI: ์ž…๋ ฅ JSON Lines ํŒŒ์ผ์ด ์žˆ๋Š” Cloud Storage URI
  • BUCKET: Cloud Storage ๋ฒ„ํ‚ท
  • PROJECT_NUMBER: ํ”„๋กœ์ ํŠธ์˜ ์ž๋™์œผ๋กœ ์ƒ์„ฑ๋œ ํ”„๋กœ์ ํŠธ ๋ฒˆํ˜ธ

HTTP ๋ฉ”์„œ๋“œ ๋ฐ URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs

JSON ์š”์ฒญ ๋ณธ๋ฌธ:

{
    "displayName": "BATCH_JOB_NAME",
    "model": "projects/PROJECT_ID/locations/LOCATION_ID/models/MODEL_ID",
    "inputConfig": {
        "instancesFormat": "jsonl",
        "gcsSource": {
            "uris": ["URI"]
        }
    },
    "outputConfig": {
        "predictionsFormat": "jsonl",
        "gcsDestination": {
            "outputUriPrefix": "OUTPUT_BUCKET"
        }
    }
}

์š”์ฒญ์„ ๋ณด๋‚ด๋ ค๋ฉด ๋‹ค์Œ ์˜ต์…˜ ์ค‘ ํ•˜๋‚˜๋ฅผ ์„ ํƒํ•ฉ๋‹ˆ๋‹ค.

curl

์š”์ฒญ ๋ณธ๋ฌธ์„ request.json ํŒŒ์ผ์— ์ €์žฅํ•˜๊ณ  ๋‹ค์Œ ๋ช…๋ น์–ด๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs"

PowerShell

์š”์ฒญ ๋ณธ๋ฌธ์„ request.json ํŒŒ์ผ์— ์ €์žฅํ•˜๊ณ  ๋‹ค์Œ ๋ช…๋ น์–ด๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs" | Select-Object -Expand Content

๋‹ค์Œ๊ณผ ๋น„์Šทํ•œ JSON ์‘๋‹ต์ด ํ‘œ์‹œ๋ฉ๋‹ˆ๋‹ค.

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/batchPredictionJobs/BATCH_JOB_ID",
  "displayName": "BATCH_JOB_NAME",
  "model": "projects/PROJECT_NUMBER/locations/LOCATION/models/MODEL_ID",
  "inputConfig": {
    "instancesFormat": "jsonl",
    "gcsSource": {
      "uris": [
        "CONTENT"
      ]
    }
  },
  "outputConfig": {
    "predictionsFormat": "jsonl",
    "gcsDestination": {
      "outputUriPrefix": "BUCKET"
    }
  },
  "state": "JOB_STATE_PENDING",
  "completionStats": {
    "incompleteCount": "-1"
  },
  "createTime": "2022-12-19T20:33:48.906074Z",
  "updateTime": "2022-12-19T20:33:48.906074Z",
  "modelVersionId": "1"
}

์ž‘์—… state๊ฐ€ JOB_STATE_SUCCEEDED๊ฐ€ ๋  ๋•Œ๊นŒ์ง€ BATCH_JOB_ID๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ผ๊ด„ ์ž‘์—…์˜ ์ƒํƒœ๋ฅผ ํด๋งํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Java

์ด ์ƒ˜ํ”Œ์„ ์‚ฌ์šฉํ•ด ๋ณด๊ธฐ ์ „์— Vertex AI ๋น ๋ฅธ ์‹œ์ž‘: ํด๋ผ์ด์–ธํŠธ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ์‚ฌ์šฉ์˜ Java ์„ค์ • ์•ˆ๋‚ด๋ฅผ ๋”ฐ๋ฅด์„ธ์š”. ์ž์„ธํ•œ ๋‚ด์šฉ์€ Vertex AI Java API ์ฐธ๊ณ  ๋ฌธ์„œ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

Vertex AI์— ์ธ์ฆํ•˜๋ ค๋ฉด ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ๊ธฐ๋ณธ ์‚ฌ์šฉ์ž ์ธ์ฆ ์ •๋ณด๋ฅผ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋‚ด์šฉ์€ ๋กœ์ปฌ ๊ฐœ๋ฐœ ํ™˜๊ฒฝ์˜ ์ธ์ฆ ์„ค์ •์„ ์ฐธ์กฐํ•˜์„ธ์š”.

import com.google.api.gax.rpc.ApiException;
import com.google.cloud.aiplatform.v1.BatchPredictionJob;
import com.google.cloud.aiplatform.v1.GcsDestination;
import com.google.cloud.aiplatform.v1.GcsSource;
import com.google.cloud.aiplatform.v1.JobServiceClient;
import com.google.cloud.aiplatform.v1.JobServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import com.google.cloud.aiplatform.v1.ModelName;
import java.io.IOException;

public class CreateBatchPredictionJobTextClassificationSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "PROJECT";
    String location = "us-central1";
    String displayName = "DISPLAY_NAME";
    String modelId = "MODEL_ID";
    String gcsSourceUri = "GCS_SOURCE_URI";
    String gcsDestinationOutputUriPrefix = "GCS_DESTINATION_OUTPUT_URI_PREFIX";
    createBatchPredictionJobTextClassificationSample(
        project, location, displayName, modelId, gcsSourceUri, gcsDestinationOutputUriPrefix);
  }

  static void createBatchPredictionJobTextClassificationSample(
      String project,
      String location,
      String displayName,
      String modelId,
      String gcsSourceUri,
      String gcsDestinationOutputUriPrefix)
      throws IOException {
    // The AI Platform services require regional API endpoints.
    JobServiceSettings settings =
        JobServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (JobServiceClient client = JobServiceClient.create(settings)) {
      try {
        String modelName = ModelName.of(project, location, modelId).toString();
        GcsSource gcsSource = GcsSource.newBuilder().addUris(gcsSourceUri).build();
        BatchPredictionJob.InputConfig inputConfig =
            BatchPredictionJob.InputConfig.newBuilder()
                .setInstancesFormat("jsonl")
                .setGcsSource(gcsSource)
                .build();
        GcsDestination gcsDestination =
            GcsDestination.newBuilder().setOutputUriPrefix(gcsDestinationOutputUriPrefix).build();
        BatchPredictionJob.OutputConfig outputConfig =
            BatchPredictionJob.OutputConfig.newBuilder()
                .setPredictionsFormat("jsonl")
                .setGcsDestination(gcsDestination)
                .build();
        BatchPredictionJob batchPredictionJob =
            BatchPredictionJob.newBuilder()
                .setDisplayName(displayName)
                .setModel(modelName)
                .setInputConfig(inputConfig)
                .setOutputConfig(outputConfig)
                .build();
        LocationName parent = LocationName.of(project, location);
        BatchPredictionJob response = client.createBatchPredictionJob(parent, batchPredictionJob);
        System.out.format("response: %s\n", response);
      } catch (ApiException ex) {
        System.out.format("Exception: %s\n", ex.getLocalizedMessage());
      }
    }
  }
}

Node.js

์ด ์ƒ˜ํ”Œ์„ ์‚ฌ์šฉํ•ด ๋ณด๊ธฐ ์ „์— Vertex AI ๋น ๋ฅธ ์‹œ์ž‘: ํด๋ผ์ด์–ธํŠธ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ์‚ฌ์šฉ์˜ Node.js ์„ค์ • ์•ˆ๋‚ด๋ฅผ ๋”ฐ๋ฅด์„ธ์š”. ์ž์„ธํ•œ ๋‚ด์šฉ์€ Vertex AI Node.js API ์ฐธ๊ณ  ๋ฌธ์„œ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

Vertex AI์— ์ธ์ฆํ•˜๋ ค๋ฉด ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ๊ธฐ๋ณธ ์‚ฌ์šฉ์ž ์ธ์ฆ ์ •๋ณด๋ฅผ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋‚ด์šฉ์€ ๋กœ์ปฌ ๊ฐœ๋ฐœ ํ™˜๊ฒฝ์˜ ์ธ์ฆ ์„ค์ •์„ ์ฐธ์กฐํ•˜์„ธ์š”.

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const batchPredictionDisplayName = 'YOUR_BATCH_PREDICTION_DISPLAY_NAME';
// const modelId = 'YOUR_MODEL_ID';
// const gcsSourceUri = 'YOUR_GCS_SOURCE_URI';
// const gcsDestinationOutputUriPrefix = 'YOUR_GCS_DEST_OUTPUT_URI_PREFIX';
//    eg. "gs://<your-gcs-bucket>/destination_path"
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

// Imports the Google Cloud Job Service Client library
const {JobServiceClient} = require('@google-cloud/aiplatform').v1;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const jobServiceClient = new JobServiceClient(clientOptions);

async function createBatchPredictionJobTextClassification() {
  // Configure the parent resource
  const parent = `projects/${project}/locations/${location}`;
  const modelName = `projects/${project}/locations/${location}/models/${modelId}`;

  const inputConfig = {
    instancesFormat: 'jsonl',
    gcsSource: {uris: [gcsSourceUri]},
  };
  const outputConfig = {
    predictionsFormat: 'jsonl',
    gcsDestination: {outputUriPrefix: gcsDestinationOutputUriPrefix},
  };
  const batchPredictionJob = {
    displayName: batchPredictionDisplayName,
    model: modelName,
    inputConfig,
    outputConfig,
  };
  const request = {
    parent,
    batchPredictionJob,
  };

  // Create batch prediction job request
  const [response] = await jobServiceClient.createBatchPredictionJob(request);

  console.log('Create batch prediction job text classification response');
  console.log(`Name : ${response.name}`);
  console.log('Raw response:');
  console.log(JSON.stringify(response, null, 2));
}
createBatchPredictionJobTextClassification();

Python

Vertex AI SDK for Python์„ ์„ค์น˜ํ•˜๊ฑฐ๋‚˜ ์—…๋ฐ์ดํŠธํ•˜๋Š” ๋ฐฉ๋ฒ•์€ Vertex AI SDK for Python ์„ค์น˜๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”. ์ž์„ธํ•œ ๋‚ด์šฉ์€ Python API ์ฐธ๊ณ  ๋ฌธ์„œ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

def create_batch_prediction_job_sample(
    project: str,
    location: str,
    model_resource_name: str,
    job_display_name: str,
    gcs_source: Union[str, Sequence[str]],
    gcs_destination: str,
    sync: bool = True,
):
    aiplatform.init(project=project, location=location)

    my_model = aiplatform.Model(model_resource_name)

    batch_prediction_job = my_model.batch_predict(
        job_display_name=job_display_name,
        gcs_source=gcs_source,
        gcs_destination_prefix=gcs_destination,
        sync=sync,
    )

    batch_prediction_job.wait()

    print(batch_prediction_job.display_name)
    print(batch_prediction_job.resource_name)
    print(batch_prediction_job.state)
    return batch_prediction_job

์ผ๊ด„ ์˜ˆ์ธก ๊ฒฐ๊ณผ ๊ฒ€์ƒ‰

์ผ๊ด„ ์˜ˆ์ธก ์ž‘์—…์ด ์™„๋ฃŒ๋˜๋ฉด ์š”์ฒญ์— ์ง€์ •ํ•œ Cloud Storage ๋ฒ„ํ‚ท์— ์˜ˆ์ธก ๊ฒฐ๊ณผ๊ฐ€ ์ €์žฅ๋ฉ๋‹ˆ๋‹ค.

์ผ๊ด„ ์˜ˆ์ธก ๊ฒฐ๊ณผ ์˜ˆ์‹œ

๋‹ค์Œ์€ ํ…์ŠคํŠธ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์˜ ์ผ๊ด„ ์˜ˆ์ธก ๊ฒฐ๊ณผ ์˜ˆ์‹œ์ž…๋‹ˆ๋‹ค.

{
  "instance": {"content": "gs://bucket/text.txt", "mimeType": "text/plain"},
  "predictions": [
    {
      "ids": [
        "1234567890123456789",
        "2234567890123456789",
        "3234567890123456789"
      ],
      "displayNames": [
        "GreatService",
        "Suggestion",
        "InfoRequest"
      ],
      "confidences": [
        0.8986392080783844,
        0.81984345316886902,
        0.7722353458404541
      ]
    }
  ]
}