Get inferences from an image classification model

This page shows you how to get online (real-time) inferences and batch inferences from your image classification models by using the Google Cloud console or the Vertex AI API.

Difference between online and batch inferences

Online inferences are synchronous requests made to a model endpoint. Use online inferences when you make requests in response to application input or in situations that require timely inference.

Batch inferences are asynchronous requests. You request batch inferences directly from the model resource without needing to deploy the model to an endpoint. For image data, use batch inferences when you don't need an immediate response and want to process accumulated data with a single request.

Get online inferences

Deploy a model to an endpoint

You must deploy a model to an endpoint before you can use the model to serve online inferences. Deploying a model associates physical resources with the model so that it can serve online inferences with low latency.

You can deploy more than one model to an endpoint, and you can deploy a model to more than one endpoint. For more information about options and use cases for deploying models, see About deploying models.

Use one of the following methods to deploy a model:

Google Cloud console

  1. In the Google Cloud console, in the Vertex AI section, go to the Models page.

  2. Click the name of the model that you want to deploy to open its details page.

  3. In the Version ID column, click the ID of the model version that you want to deploy.

  4. Click Deploy & Test.

    If your model is already deployed to any endpoints, they are listed in the Deploy your model section.

  5. Click Deploy to endpoint.

  6. To deploy your model to a new endpoint, click Create new endpoint and enter a name for the new endpoint. To deploy your model to an existing endpoint, click Add to existing endpoint and select the endpoint name.

    You can add more than one model to an endpoint, and you can add a model to more than one endpoint.

    If you're deploying to a new endpoint, choose how the endpoint is accessed:

    • Click Standard to make the endpoint available for inference through the REST API.

    • Click Private to use a private connection to the endpoint.

    If you're deploying your model to an existing endpoint that already has one or more models deployed to it, update the Traffic split percentage for the model that you are deploying and for the already deployed models so that all of the percentages add up to 100%.

  7. Select AutoML Image and configure it as follows:

    1. If you're deploying your model to a new endpoint, accept 100 for the Traffic split. Otherwise, adjust the traffic split values for all models on the endpoint so that they add up to 100.

    2. Enter the Number of compute nodes that you want to provide for your model.

      This is the number of nodes that are available to this model at all times. You are charged for the nodes even when there is no inference traffic. For details, see the pricing page.

    3. Learn how to change the default settings for inference logging.

    4. Classification models only (optional): In the Explainability options section, select Enable feature attributions for this model to enable Vertex Explainable AI. Accept the existing visualization settings or choose new values, and then click Done.

      Deploying AutoML image classification models with Vertex Explainable AI and getting inferences with explanations is optional. Enabling Vertex Explainable AI at deployment time incurs additional costs based on the number of deployed nodes and the deployment time. For details, see Pricing.

    5. Click Done for your model, and when all of the Traffic split percentages are correct, click Continue.

      The region where your model deploys is displayed. This must be the region where you created your model.

    6. Click Deploy to deploy your model to the endpoint.

API

When you deploy a model by using the Vertex AI API, you complete the following steps:

  1. Create an endpoint if needed.
  2. Get the endpoint ID.
  3. Deploy the model to the endpoint.

Create an endpoint

If you are deploying a model to an existing endpoint, you can skip this step.

gcloud

The following example uses the gcloud ai endpoints create command:

gcloud ai endpoints create \
  --region=LOCATION_ID \
  --display-name=ENDPOINT_NAME

Replace the following:

  • LOCATION_ID: The region where you are using Vertex AI.
  • ENDPOINT_NAME: The display name for the endpoint.

The Google Cloud CLI tool might take a few seconds to create the endpoint.

REST

Before using any of the request data, make the following replacements:

  • LOCATION_ID: Your region.
  • PROJECT_ID: Your project ID.
  • ENDPOINT_NAME: The display name for the endpoint.

HTTP method and URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints

Request JSON body:

{
  "display_name": "ENDPOINT_NAME"
}
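As a rough illustration of how the method, URL, and JSON body above fit together, the following Python sketch builds the request with the standard library but does not send it. The function name `build_create_endpoint_request` and the `access_token` parameter are assumptions made for this example; they are not part of the Vertex AI API, and the token is presumed to come from your usual authentication flow.

```python
import urllib.request
import json


def build_create_endpoint_request(
    project_id: str, location_id: str, endpoint_name: str, access_token: str
) -> urllib.request.Request:
    """Build (but do not send) the create-endpoint request described above.

    Hypothetical helper: only the URL shape and the JSON body come from the
    documentation; everything else is an assumption for illustration.
    """
    url = (
        f"https://{location_id}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location_id}/endpoints"
    )
    # The request body carries only the display name of the new endpoint.
    body = json.dumps({"display_name": endpoint_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    request = build_create_endpoint_request(
        "my-project", "us-central1", "my-endpoint", "ACCESS_TOKEN"
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here so the sketch can be inspected without credentials.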

After you send the request, you should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateEndpointOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-11-05T17:45:42.812656Z",
      "updateTime": "2020-11-05T17:45:42.812656Z"
    }
  }
}

You can poll the status of the operation until the response includes "done": true.
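The polling step can be sketched in Python as follows. This is a minimal illustration, not the client library's mechanism: `wait_for_operation` and its parameters are hypothetical names, and `fetch_operation` stands in for whatever call you use to retrieve the operation's JSON (for example, an authenticated GET on the operation name).

```python
import time
from typing import Any, Callable, Dict


def wait_for_operation(
    fetch_operation: Callable[[], Dict[str, Any]],
    poll_interval_seconds: float = 5.0,
    timeout_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_operation repeatedly until the result contains "done": true.

    Hypothetical helper: fetch_operation must return the operation as a dict.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        operation = fetch_operation()
        if operation.get("done"):
            # A finished long-running operation carries either a result
            # or an error; surface the error instead of returning it.
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("Operation did not finish before the timeout")
        sleep(poll_interval_seconds)
```

Injecting `fetch_operation` and `sleep` keeps the loop independent of any HTTP library, so you can exercise it with canned responses before wiring it to real requests.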

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.


import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.aiplatform.v1.CreateEndpointOperationMetadata;
import com.google.cloud.aiplatform.v1.Endpoint;
import com.google.cloud.aiplatform.v1.EndpointServiceClient;
import com.google.cloud.aiplatform.v1.EndpointServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateEndpointSample {

  public static void main(String[] args)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String endpointDisplayName = "YOUR_ENDPOINT_DISPLAY_NAME";
    createEndpointSample(project, endpointDisplayName);
  }

  static void createEndpointSample(String project, String endpointDisplayName)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    EndpointServiceSettings endpointServiceSettings =
        EndpointServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (EndpointServiceClient endpointServiceClient =
        EndpointServiceClient.create(endpointServiceSettings)) {
      String location = "us-central1";
      LocationName locationName = LocationName.of(project, location);
      Endpoint endpoint = Endpoint.newBuilder().setDisplayName(endpointDisplayName).build();

      OperationFuture<Endpoint, CreateEndpointOperationMetadata> endpointFuture =
          endpointServiceClient.createEndpointAsync(locationName, endpoint);
      System.out.format("Operation name: %s\n", endpointFuture.getInitialFuture().get().getName());
      System.out.println("Waiting for operation to finish...");
      Endpoint endpointResponse = endpointFuture.get(300, TimeUnit.SECONDS);

      System.out.println("Create Endpoint Response");
      System.out.format("Name: %s\n", endpointResponse.getName());
      System.out.format("Display Name: %s\n", endpointResponse.getDisplayName());
      System.out.format("Description: %s\n", endpointResponse.getDescription());
      System.out.format("Labels: %s\n", endpointResponse.getLabelsMap());
      System.out.format("Create Time: %s\n", endpointResponse.getCreateTime());
      System.out.format("Update Time: %s\n", endpointResponse.getUpdateTime());
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 * (Not necessary if passing values as arguments)
 */

// const endpointDisplayName = 'YOUR_ENDPOINT_DISPLAY_NAME';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

// Imports the Google Cloud Endpoint Service Client library
const {EndpointServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const endpointServiceClient = new EndpointServiceClient(clientOptions);

async function createEndpoint() {
  // Configure the parent resource
  const parent = `projects/${project}/locations/${location}`;
  const endpoint = {
    displayName: endpointDisplayName,
  };
  const request = {
    parent,
    endpoint,
  };

  // Create the endpoint. The call returns a long-running operation.
  const [response] = await endpointServiceClient.createEndpoint(request);
  console.log(`Long running operation : ${response.name}`);

  // Wait for operation to complete
  await response.promise();
  const result = response.result;

  console.log('Create endpoint response');
  console.log(`\tName : ${result.name}`);
  console.log(`\tDisplay name : ${result.displayName}`);
  console.log(`\tDescription : ${result.description}`);
  console.log(`\tLabels : ${JSON.stringify(result.labels)}`);
  console.log(`\tCreate time : ${JSON.stringify(result.createTime)}`);
  console.log(`\tUpdate time : ${JSON.stringify(result.updateTime)}`);
}
createEndpoint();

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def create_endpoint_sample(
    project: str,
    display_name: str,
    location: str,
):
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint.create(
        display_name=display_name,
        project=project,
        location=location,
    )

    print(endpoint.display_name)
    print(endpoint.resource_name)
    return endpoint

Retrieve the endpoint ID

You need the endpoint ID to deploy the model.

gcloud

The following example uses the gcloud ai endpoints list command:

gcloud ai endpoints list \
  --region=LOCATION_ID \
  --filter=display_name=ENDPOINT_NAME

Replace the following:

  • LOCATION_ID: The region where you are using Vertex AI.
  • ENDPOINT_NAME: The display name for the endpoint.

Note the number that appears in the ENDPOINT_ID column. You use this ID in the following step.

REST

Before using any of the request data, make the following replacements:

  • LOCATION_ID: The region where you are using Vertex AI.
  • PROJECT_ID: Your project ID.
  • ENDPOINT_NAME: The display name for the endpoint.

HTTP method and URL:

GET https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints?filter=display_name=ENDPOINT_NAME

After you send the request, you should receive a JSON response similar to the following:

{
  "endpoints": [
    {
      "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID",
      "displayName": "ENDPOINT_NAME",
      "etag": "AMEw9yPz5pf4PwBHbRWOGh0PcAxUdjbdX2Jm3QO_amguy3DbZGP5Oi_YUKRywIE-BtLx",
      "createTime": "2020-04-17T18:31:11.585169Z",
      "updateTime": "2020-04-17T18:35:08.568959Z"
    }
  ]
}

Note the ENDPOINT_ID.
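In the list response, the endpoint ID is the final path segment of each endpoint's name field. A small Python sketch of extracting it, assuming the response has the shape shown above (the function name `endpoint_ids` is hypothetical):

```python
import json
from typing import List


def endpoint_ids(list_response_json: str, display_name: str) -> List[str]:
    """Return the IDs of all endpoints in the response with this display name.

    Hypothetical helper: assumes each entry has "name" and "displayName"
    fields, as in the sample response.
    """
    response = json.loads(list_response_json)
    ids = []
    for endpoint in response.get("endpoints", []):
        if endpoint.get("displayName") == display_name:
            # "projects/.../locations/.../endpoints/ENDPOINT_ID" -> "ENDPOINT_ID"
            ids.append(endpoint["name"].rsplit("/", 1)[-1])
    return ids
```

The function returns a list rather than a single value because display names are not guaranteed to be unique in this sketch's assumptions; check the length before you pick an ID.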

Deploy the model

Follow the instructions below for your language or environment.

gcloud

The following examples use the gcloud ai endpoints deploy-model command.

The following example deploys a Model to an Endpoint without splitting traffic between multiple DeployedModel resources:

Before using any of the command data below, make the following replacements:

  • ENDPOINT_ID: The ID of the endpoint.
  • LOCATION_ID: The region where you are using Vertex AI.
  • MODEL_ID: The ID of the model to deploy.
  • DEPLOYED_MODEL_NAME: A name for the DeployedModel. You can also use the display name of the Model for the DeployedModel.
  • MIN_REPLICA_COUNT: The minimum number of nodes for this deployment. The node count can be increased or decreased as required by the inference load, up to the maximum number of nodes and never below this number of nodes.
  • MAX_REPLICA_COUNT: The maximum number of nodes for this deployment. The node count can be increased or decreased as required by the inference load, up to this number of nodes and never below the minimum number of nodes. If you omit the --max-replica-count flag, the maximum number of nodes is set to the value of --min-replica-count.

Execute the gcloud ai endpoints deploy-model command:

Linux, macOS, or Cloud Shell

gcloud ai endpoints deploy-model ENDPOINT_ID \
  --region=LOCATION_ID \
  --model=MODEL_ID \
  --display-name=DEPLOYED_MODEL_NAME \
  --min-replica-count=MIN_REPLICA_COUNT \
  --max-replica-count=MAX_REPLICA_COUNT \
  --traffic-split=0=100

Windows (PowerShell)

gcloud ai endpoints deploy-model ENDPOINT_ID `
  --region=LOCATION_ID `
  --model=MODEL_ID `
  --display-name=DEPLOYED_MODEL_NAME `
  --min-replica-count=MIN_REPLICA_COUNT `
  --max-replica-count=MAX_REPLICA_COUNT `
  --traffic-split=0=100

Windows (cmd.exe)

gcloud ai endpoints deploy-model ENDPOINT_ID ^
  --region=LOCATION_ID ^
  --model=MODEL_ID ^
  --display-name=DEPLOYED_MODEL_NAME ^
  --min-replica-count=MIN_REPLICA_COUNT ^
  --max-replica-count=MAX_REPLICA_COUNT ^
  --traffic-split=0=100
 

Splitting traffic

The --traffic-split=0=100 flag in the preceding examples sends 100% of the prediction traffic that the Endpoint receives to the new DeployedModel, which is represented by the temporary ID 0. If your Endpoint already has other DeployedModel resources, you can split traffic between the new DeployedModel and the old ones. For example, to send 20% of traffic to the new DeployedModel and 80% to an older one, run the following command.

Before using any of the command data below, make the following replacements:

  • OLD_DEPLOYED_MODEL_ID: The ID of the existing DeployedModel.

Execute the gcloud ai endpoints deploy-model command:

Linux, macOS, or Cloud Shell

gcloud ai endpoints deploy-model ENDPOINT_ID \
  --region=LOCATION_ID \
  --model=MODEL_ID \
  --display-name=DEPLOYED_MODEL_NAME \
  --min-replica-count=MIN_REPLICA_COUNT \
  --max-replica-count=MAX_REPLICA_COUNT \
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80

Windows (PowerShell)

gcloud ai endpoints deploy-model ENDPOINT_ID `
  --region=LOCATION_ID `
  --model=MODEL_ID `
  --display-name=DEPLOYED_MODEL_NAME `
  --min-replica-count=MIN_REPLICA_COUNT `
  --max-replica-count=MAX_REPLICA_COUNT `
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80

Windows (cmd.exe)

gcloud ai endpoints deploy-model ENDPOINT_ID ^
  --region=LOCATION_ID ^
  --model=MODEL_ID ^
  --display-name=DEPLOYED_MODEL_NAME ^
  --min-replica-count=MIN_REPLICA_COUNT ^
  --max-replica-count=MAX_REPLICA_COUNT ^
  --traffic-split=0=20,OLD_DEPLOYED_MODEL_ID=80
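Because the percentages in --traffic-split have to add up to 100, it can help to compute the flag value rather than type it by hand. The following Python sketch is one way to do that; `traffic_split_flag` is a hypothetical helper, not part of the gcloud CLI, and it only formats and checks the value described in this section.

```python
from typing import Dict


def traffic_split_flag(new_model_percent: int, existing: Dict[str, int]) -> str:
    """Format a --traffic-split flag, using the temporary ID 0 for the new model.

    Hypothetical helper: `existing` maps the IDs of already deployed models
    to the percentage of traffic each should keep.
    """
    if "0" in existing:
        raise ValueError("ID 0 is reserved for the model being deployed")
    split = {"0": new_model_percent, **existing}
    if any(percent < 0 for percent in split.values()):
        raise ValueError("Traffic percentages cannot be negative")
    total = sum(split.values())
    if total != 100:
        raise ValueError(f"Traffic percentages add up to {total}, not 100")
    pairs = ",".join(f"{model_id}={percent}" for model_id, percent in split.items())
    return f"--traffic-split={pairs}"


if __name__ == "__main__":
    # "1234567890" is a placeholder for OLD_DEPLOYED_MODEL_ID.
    print(traffic_split_flag(20, {"1234567890": 80}))
```

Failing early on a bad total is a design choice in this sketch; it avoids starting a deployment that the service would reject for an invalid split.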
 

REST

Deploy the model

Before using any of the request data, make the following replacements:

  • LOCATION_ID: The region where you are using Vertex AI.
  • PROJECT_ID: Your project ID.
  • ENDPOINT_ID: The ID of the endpoint.
  • MODEL_ID: The ID of the model to deploy.
  • DEPLOYED_MODEL_NAME: A name for the DeployedModel. You can also use the display name of the Model for the DeployedModel.
  • MIN_REPLICA_COUNT: The minimum number of nodes for this deployment. The node count can be increased or decreased as required by the inference load, up to the maximum number of nodes and never below this number of nodes.
  • MAX_REPLICA_COUNT: The maximum number of nodes for this deployment. The node count can be increased or decreased as required by the inference load, up to this number of nodes and never below the minimum number of nodes.
  • TRAFFIC_SPLIT_THIS_MODEL: The percentage of the prediction traffic to this endpoint that is routed to the model being deployed with this operation. Defaults to 100. All traffic percentages must add up to 100.
  • DEPLOYED_MODEL_ID_N: Optional. If other models are deployed to this endpoint, you must update their traffic split percentages so that all percentages add up to 100.
  • TRAFFIC_SPLIT_MODEL_N: The traffic split percentage value for the deployed model ID key.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:deployModel

Request JSON body:

{
  "deployedModel": {
    "model": "projects/PROJECT_ID/locations/LOCATION_ID/models/MODEL_ID",
    "displayName": "DEPLOYED_MODEL_NAME",
    "automaticResources": {
       "minReplicaCount": MIN_REPLICA_COUNT,
       "maxReplicaCount": MAX_REPLICA_COUNT
     }
  },
  "trafficSplit": {
    "0": TRAFFIC_SPLIT_THIS_MODEL,
    "DEPLOYED_MODEL_ID_1": TRAFFIC_SPLIT_MODEL_1,
    "DEPLOYED_MODEL_ID_2": TRAFFIC_SPLIT_MODEL_2
  }
}
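To make the placeholders in the request body concrete, the following Python sketch assembles the same structure as a dictionary that can be serialized with json.dumps. The function name `build_deploy_model_body` and its parameter names are assumptions for this example; only the field names come from the body shown above.

```python
import json
from typing import Any, Dict, Optional


def build_deploy_model_body(
    project_id: str,
    location_id: str,
    model_id: str,
    deployed_model_name: str,
    min_replica_count: int,
    max_replica_count: int,
    traffic_split_this_model: int = 100,
    other_traffic: Optional[Dict[str, int]] = None,
) -> Dict[str, Any]:
    """Assemble the deployModel request body shown above (hypothetical helper)."""
    if max_replica_count < min_replica_count:
        raise ValueError("max_replica_count must be >= min_replica_count")
    # Key "0" refers to the model being deployed by this request; any other
    # keys are the IDs of models already deployed to the endpoint.
    traffic_split = {"0": traffic_split_this_model}
    traffic_split.update(other_traffic or {})
    return {
        "deployedModel": {
            "model": f"projects/{project_id}/locations/{location_id}/models/{model_id}",
            "displayName": deployed_model_name,
            "automaticResources": {
                "minReplicaCount": min_replica_count,
                "maxReplicaCount": max_replica_count,
            },
        },
        "trafficSplit": traffic_split,
    }


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    body = build_deploy_model_body(
        "my-project", "us-central1", "1234", "my-deployed-model", 1, 2
    )
    print(json.dumps(body, indent=2))
```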

After you send the request, you should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployModelOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-10-19T17:53:16.502088Z",
      "updateTime": "2020-10-19T17:53:16.502088Z"
    }
  }
}

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.


import com.google.api.gax.longrunning.OperationFuture;
import com.google.api.gax.longrunning.OperationTimedPollAlgorithm;
import com.google.api.gax.retrying.RetrySettings;
import com.google.cloud.aiplatform.v1.AutomaticResources;
import com.google.cloud.aiplatform.v1.DedicatedResources;
import com.google.cloud.aiplatform.v1.DeployModelOperationMetadata;
import com.google.cloud.aiplatform.v1.DeployModelResponse;
import com.google.cloud.aiplatform.v1.DeployedModel;
import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.EndpointServiceClient;
import com.google.cloud.aiplatform.v1.EndpointServiceSettings;
import com.google.cloud.aiplatform.v1.MachineSpec;
import com.google.cloud.aiplatform.v1.ModelName;
import com.google.cloud.aiplatform.v1.stub.EndpointServiceStubSettings;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import org.threeten.bp.Duration;

public class DeployModelSample {

  public static void main(String[] args)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String deployedModelDisplayName = "YOUR_DEPLOYED_MODEL_DISPLAY_NAME";
    String endpointId = "YOUR_ENDPOINT_ID";
    String modelId = "YOUR_MODEL_ID";
    int timeout = 900;
    deployModelSample(project, deployedModelDisplayName, endpointId, modelId, timeout);
  }

  static void deployModelSample(
      String project,
      String deployedModelDisplayName,
      String endpointId,
      String modelId,
      int timeout)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {

    // Set long-running operations (LROs) timeout
    final OperationTimedPollAlgorithm operationTimedPollAlgorithm =
        OperationTimedPollAlgorithm.create(
            RetrySettings.newBuilder()
                .setInitialRetryDelay(Duration.ofMillis(5000L))
                .setRetryDelayMultiplier(1.5)
                .setMaxRetryDelay(Duration.ofMillis(45000L))
                .setInitialRpcTimeout(Duration.ZERO)
                .setRpcTimeoutMultiplier(1.0)
                .setMaxRpcTimeout(Duration.ZERO)
                .setTotalTimeout(Duration.ofSeconds(timeout))
                .build());

    EndpointServiceStubSettings.Builder endpointServiceStubSettingsBuilder =
        EndpointServiceStubSettings.newBuilder();
    endpointServiceStubSettingsBuilder
        .deployModelOperationSettings()
        .setPollingAlgorithm(operationTimedPollAlgorithm);
    EndpointServiceStubSettings endpointStubSettings = endpointServiceStubSettingsBuilder.build();
    EndpointServiceSettings endpointServiceSettings =
        EndpointServiceSettings.create(endpointStubSettings);
    endpointServiceSettings =
        endpointServiceSettings.toBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (EndpointServiceClient endpointServiceClient =
        EndpointServiceClient.create(endpointServiceSettings)) {
      String location = "us-central1";
      EndpointName endpointName = EndpointName.of(project, location, endpointId);
      // key '0' assigns traffic for the newly deployed model
      // Traffic percentage values must add up to 100
      // Leave dictionary empty if endpoint should not accept any traffic
      Map<String, Integer> trafficSplit = new HashMap<>();
      trafficSplit.put("0", 100);
      ModelName modelName = ModelName.of(project, location, modelId);
      AutomaticResources automaticResourcesInput =
          AutomaticResources.newBuilder().setMinReplicaCount(1).setMaxReplicaCount(1).build();
      DeployedModel deployedModelInput =
          DeployedModel.newBuilder()
              .setModel(modelName.toString())
              .setDisplayName(deployedModelDisplayName)
              .setAutomaticResources(automaticResourcesInput)
              .build();

      OperationFuture<DeployModelResponse, DeployModelOperationMetadata> deployModelResponseFuture =
          endpointServiceClient.deployModelAsync(endpointName, deployedModelInput, trafficSplit);
      System.out.format(
          "Operation name: %s\n", deployModelResponseFuture.getInitialFuture().get().getName());
      System.out.println("Waiting for operation to finish...");
      DeployModelResponse deployModelResponse = deployModelResponseFuture.get(20, TimeUnit.MINUTES);

      System.out.println("Deploy Model Response");
      DeployedModel deployedModel = deployModelResponse.getDeployedModel();
      System.out.println("\tDeployed Model");
      System.out.format("\t\tid: %s\n", deployedModel.getId());
      System.out.format("\t\tmodel: %s\n", deployedModel.getModel());
      System.out.format("\t\tDisplay Name: %s\n", deployedModel.getDisplayName());
      System.out.format("\t\tCreate Time: %s\n", deployedModel.getCreateTime());

      DedicatedResources dedicatedResources = deployedModel.getDedicatedResources();
      System.out.println("\t\tDedicated Resources");
      System.out.format("\t\t\tMin Replica Count: %s\n", dedicatedResources.getMinReplicaCount());

      MachineSpec machineSpec = dedicatedResources.getMachineSpec();
      System.out.println("\t\t\tMachine Spec");
      System.out.format("\t\t\t\tMachine Type: %s\n", machineSpec.getMachineType());
      System.out.format("\t\t\t\tAccelerator Type: %s\n", machineSpec.getAcceleratorType());
      System.out.format("\t\t\t\tAccelerator Count: %s\n", machineSpec.getAcceleratorCount());

      AutomaticResources automaticResources = deployedModel.getAutomaticResources();
      System.out.println("\t\tAutomatic Resources");
      System.out.format("\t\t\tMin Replica Count: %s\n", automaticResources.getMinReplicaCount());
      System.out.format("\t\t\tMax Replica Count: %s\n", automaticResources.getMaxReplicaCount());
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 * (Not necessary if passing values as arguments)
 */

// const modelId = "YOUR_MODEL_ID";
// const endpointId = 'YOUR_ENDPOINT_ID';
// const deployedModelDisplayName = 'YOUR_DEPLOYED_MODEL_DISPLAY_NAME';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

const modelName = `projects/${project}/locations/${location}/models/${modelId}`;
const endpoint = `projects/${project}/locations/${location}/endpoints/${endpointId}`;
// Imports the Google Cloud Endpoint Service Client library
const {EndpointServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint:
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const endpointServiceClient = new EndpointServiceClient(clientOptions);

async function deployModel() {
  // Configure the parent resource
  // key '0' assigns traffic for the newly deployed model
  // Traffic percentage values must add up to 100
  // Leave dictionary empty if endpoint should not accept any traffic
  const trafficSplit = {0: 100};
  const deployedModel = {
    // format: 'projects/{project}/locations/{location}/models/{model}'
    model: modelName,
    displayName: deployedModelDisplayName,
    automaticResources: {minReplicaCount: 1, maxReplicaCount: 1},
  };
  const request = {
    endpoint,
    deployedModel,
    trafficSplit,
  };

  // Deploy the model. The call returns a long-running operation.
  const [response] = await endpointServiceClient.deployModel(request);
  console.log(`Long running operation : ${response.name}`);

  // Wait for operation to complete
  await response.promise();
  const result = response.result;

  console.log('Deploy model response');
  const modelDeployed = result.deployedModel;
  console.log('\tDeployed model');
  if (!modelDeployed) {
    console.log('\t\tId : {}');
    console.log('\t\tModel : {}');
    console.log('\t\tDisplay name : {}');
    console.log('\t\tCreate time : {}');

    console.log('\t\tDedicated resources');
    console.log('\t\t\tMin replica count : {}');
    console.log('\t\t\tMachine spec {}');
    console.log('\t\t\t\tMachine type : {}');
    console.log('\t\t\t\tAccelerator type : {}');
    console.log('\t\t\t\tAccelerator count : {}');

    console.log('\t\tAutomatic resources');
    console.log('\t\t\tMin replica count : {}');
    console.log('\t\t\tMax replica count : {}');
  } else {
    console.log(`\t\tId : ${modelDeployed.id}`);
    console.log(`\t\tModel : ${modelDeployed.model}`);
    console.log(`\t\tDisplay name : ${modelDeployed.displayName}`);
    console.log(`\t\tCreate time : ${modelDeployed.createTime}`);

    const dedicatedResources = modelDeployed.dedicatedResources;
    console.log('\t\tDedicated resources');
    if (!dedicatedResources) {
      console.log('\t\t\tMin replica count : {}');
      console.log('\t\t\tMachine spec {}');
      console.log('\t\t\t\tMachine type : {}');
      console.log('\t\t\t\tAccelerator type : {}');
      console.log('\t\t\t\tAccelerator count : {}');
    } else {
      console.log(
        `\t\t\tMin replica count : \
          ${dedicatedResources.minReplicaCount}`
      );
      const machineSpec = dedicatedResources.machineSpec;
      console.log('\t\t\tMachine spec');
      console.log(`\t\t\t\tMachine type : ${machineSpec.machineType}`);
      console.log(
        `\t\t\t\tAccelerator type : ${machineSpec.acceleratorType}`
      );
      console.log(
        `\t\t\t\tAccelerator count : ${machineSpec.acceleratorCount}`
      );
    }

    const automaticResources = modelDeployed.automaticResources;
    console.log('\t\tAutomatic resources');
    if (!automaticResources) {
      console.log('\t\t\tMin replica count : {}');
      console.log('\t\t\tMax replica count : {}');
    } else {
      console.log(
        `\t\t\tMin replica count : \
          ${automaticResources.minReplicaCount}`
      );
      console.log(
        `\t\t\tMax replica count : \
          ${automaticResources.maxReplicaCount}`
      );
    }
  }
}
deployModel();

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from typing import Dict, Optional, Sequence, Tuple

from google.cloud import aiplatform


def deploy_model_with_automatic_resources_sample(
    project: str,
    location: str,
    model_name: str,
    endpoint: Optional[aiplatform.Endpoint] = None,
    deployed_model_display_name: Optional[str] = None,
    traffic_percentage: Optional[int] = 0,
    traffic_split: Optional[Dict[str, int]] = None,
    min_replica_count: int = 1,
    max_replica_count: int = 1,
    metadata: Optional[Sequence[Tuple[str, str]]] = (),
    sync: bool = True,
):
    """
    model_name: A fully-qualified model resource name or model ID.
          Example: "projects/123/locations/us-central1/models/456" or
          "456" when project and location are initialized or passed.
    """

    aiplatform.init(project=project, location=location)

    model = aiplatform.Model(model_name=model_name)

    model.deploy(
        endpoint=endpoint,
        deployed_model_display_name=deployed_model_display_name,
        traffic_percentage=traffic_percentage,
        traffic_split=traffic_split,
        min_replica_count=min_replica_count,
        max_replica_count=max_replica_count,
        metadata=metadata,
        sync=sync,
    )

    model.wait()

    print(model.display_name)
    print(model.resource_name)
    return model

Learn how to change the default settings for inference logging.

Get operation status

Some requests start long-running operations that take time to complete. These requests return an operation name, which you can use to view the operation's status or to cancel the operation. Vertex AI provides helper methods for making calls against long-running operations. For more information, see Working with long-running operations.

Make an online inference using your deployed model

To make an online inference, submit one or more test items to the model for analysis. The model returns results based on its objective. For more information about inference results, see the Interpret results page.

Console

Use the Google Cloud console to request an online inference. Your model must be deployed to an endpoint.

  1. In the Google Cloud console, in the Vertex AI section, go to the Models page.

    Go to the Models page

  2. From the list of models, click the name of the model to request inferences from.

  3. Select the Deploy & test tab.

  4. Under the Test your model section, add test items to request an inference.

    AutoML models for image objectives require you to upload an image to request an inference.

    For information about local feature importance, see Get explanations.

    After the inference is complete, Vertex AI returns the results in the console.

API

Use the Vertex AI API to request an online inference. Your model must be deployed to an endpoint.
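A request body for an image classification endpoint might be assembled along these lines. This is a minimal sketch, assuming the request shape of one base64-encoded `content` field per instance plus optional `confidenceThreshold` and `maxPredictions` parameters. The helper name `build_online_request` and its default values are hypothetical and not part of any SDK.

```python
import base64
import json
from typing import Any, Dict


def build_online_request(
    image_bytes: bytes,
    confidence_threshold: float = 0.5,
    max_predictions: int = 5,
) -> Dict[str, Any]:
    """Build a JSON-serializable request body for one image (hypothetical helper)."""
    # The image is sent inline, so the raw bytes are base64-encoded into text.
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "instances": [{"content": encoded}],
        "parameters": {
            # Assumed defaults for illustration; tune them for your model.
            "confidenceThreshold": confidence_threshold,
            "maxPredictions": max_predictions,
        },
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for the contents of a real image file.
    body = build_online_request(b"fake-image-bytes")
    print(json.dumps(body, indent=2))
```

With the Vertex AI SDK for Python, the `instances` and `parameters` values from this dictionary could be passed to `aiplatform.Endpoint.predict(instances=..., parameters=...)` on an endpoint that has the model deployed.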

Get batch inferences

To make a batch inference request, specify an input source and the output format in which Vertex AI stores the inference results. Batch inferences for the AutoML image model type require an input JSON Lines file and the name of a Cloud Storage bucket to store the output.

Input data requirements

The input for a batch request specifies the items to send to your model for inference. For image classification models, you can use a JSON Lines file to list the images to run inference on, and then store that JSON Lines file in a Cloud Storage bucket. The following sample shows a single line of an input JSON Lines file.

{"content": "gs://sourcebucket/datasets/images/source_image.jpg", "mimeType": "image/jpeg"}
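Lines in this format can be generated from a list of Cloud Storage URIs. The sketch below is illustrative only: the function name `to_jsonl_lines` is hypothetical, and the extension-to-MIME-type table is a small assumed subset rather than the full list of formats the service accepts.

```python
import json
import posixpath
from typing import Iterable, List

# Assumed, illustrative mapping; extend it to match the image formats you use.
MIME_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
}


def to_jsonl_lines(gcs_uris: Iterable[str]) -> List[str]:
    """Return one JSON Lines entry per image URI (hypothetical helper)."""
    lines = []
    for uri in gcs_uris:
        extension = posixpath.splitext(uri)[1].lower()
        if extension not in MIME_TYPES:
            raise ValueError(f"Unsupported image extension in {uri!r}")
        entry = {"content": uri, "mimeType": MIME_TYPES[extension]}
        lines.append(json.dumps(entry))
    return lines


if __name__ == "__main__":
    uris = ["gs://sourcebucket/datasets/images/source_image.jpg"]
    # Write the result to a local file, then upload it to Cloud Storage.
    print("\n".join(to_jsonl_lines(uris)))
```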

Request a batch inference

For batch inference requests, you can use the Google Cloud console or the Vertex AI API. Depending on the number of input items that you submit, a batch inference task can take some time to complete.

Google Cloud console

Use the Google Cloud console to request a batch inference.

  1. In the Google Cloud console, in the Vertex AI section, go to the Batch predictions page.

    Go to the Batch predictions page

  2. Click Create to open the New batch prediction window and complete the following steps:

    1. Enter a name for the batch inference.
    2. For Model name, select the name of the model to use for this batch inference.
    3. For Source path, specify the Cloud Storage location of your JSON Lines input file.
    4. For Destination path, specify the Cloud Storage location where the batch inference results are stored. The output format is determined by your model's objective. AutoML models for image objectives output JSON Lines files.

API

Use the Vertex AI API to send batch inference requests.

REST

Before using any of the request data, make the following replacements:

  • LOCATION_ID: The region where the model is stored and the batch inference job runs. For example, us-central1.
  • PROJECT_ID: Your project ID.
  • BATCH_JOB_NAME: The display name for the batch job.
  • MODEL_ID: The ID of the model to use for making inferences.
  • THRESHOLD_VALUE (optional): Vertex AI returns only inferences whose confidence score is at least this value. The default is 0.0.
  • MAX_PREDICTIONS (optional): Vertex AI returns up to this many inferences, starting with those that have the highest confidence scores. The default is 10.
  • URI: The Cloud Storage URI of your input JSON Lines file.
  • BUCKET: Your Cloud Storage bucket.
  • PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs

Request JSON body:

{
    "displayName": "BATCH_JOB_NAME",
    "model": "projects/PROJECT_ID/locations/LOCATION_ID/models/MODEL_ID",
    "modelParameters": {
        "confidenceThreshold": THRESHOLD_VALUE,
        "maxPredictions": MAX_PREDICTIONS
    },
    "inputConfig": {
        "instancesFormat": "jsonl",
        "gcsSource": {
            "uris": ["URI"]
        }
    },
    "outputConfig": {
        "predictionsFormat": "jsonl",
        "gcsDestination": {
            "outputUriPrefix": "BUCKET"
        }
    }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/batchPredictionJobs/BATCH_JOB_ID",
  "displayName": "BATCH_JOB_NAME",
  "model": "projects/PROJECT_ID/locations/LOCATION_ID/models/MODEL_ID",
  "inputConfig": {
    "instancesFormat": "jsonl",
    "gcsSource": {
      "uris": [
        "CONTENT"
      ]
    }
  },
  "outputConfig": {
    "predictionsFormat": "jsonl",
    "gcsDestination": {
      "outputUriPrefix": "BUCKET"
    }
  },
  "state": "JOB_STATE_PENDING",
  "createTime": "2020-05-30T02:58:44.341643Z",
  "updateTime": "2020-05-30T02:58:44.341643Z",
  "modelDisplayName": "MODEL_NAME",
  "modelObjective": "MODEL_OBJECTIVE"
}

You can poll for the status of the batch job using the BATCH_JOB_ID until the job state is JOB_STATE_SUCCEEDED.
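A polling loop for the job state could look like the following sketch. It deliberately takes a caller-supplied `get_state` function, for example one that issues a GET request for the batch prediction job and returns its `state` field, so no particular client call is assumed. The names `wait_for_job` and `TERMINAL_STATES`, the poll interval, and the choice of which states count as terminal are all assumptions for illustration.

```python
import time
from typing import Callable

# States treated as final in this sketch; the API defines further job states.
TERMINAL_STATES = {
    "JOB_STATE_SUCCEEDED",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
}


def wait_for_job(
    get_state: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until a terminal state is seen, then return it."""
    for _ in range(max_polls):
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        # Not finished yet: wait before asking again.
        sleep(poll_seconds)
    raise TimeoutError(f"Job not finished after {max_polls} polls")


if __name__ == "__main__":
    # Simulated sequence of states in place of real API calls.
    fake_states = iter(
        ["JOB_STATE_PENDING", "JOB_STATE_RUNNING", "JOB_STATE_SUCCEEDED"]
    )
    print(wait_for_job(lambda: next(fake_states), poll_seconds=0))
```

Returning the final state rather than raising on failure leaves it to the caller to decide how to handle a failed or cancelled job.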

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from typing import Sequence, Union

from google.cloud import aiplatform


def create_batch_prediction_job_sample(
    project: str,
    location: str,
    model_resource_name: str,
    job_display_name: str,
    gcs_source: Union[str, Sequence[str]],
    gcs_destination: str,
    sync: bool = True,
):
    aiplatform.init(project=project, location=location)

    my_model = aiplatform.Model(model_resource_name)

    batch_prediction_job = my_model.batch_predict(
        job_display_name=job_display_name,
        gcs_source=gcs_source,
        gcs_destination_prefix=gcs_destination,
        sync=sync,
    )

    batch_prediction_job.wait()

    print(batch_prediction_job.display_name)
    print(batch_prediction_job.resource_name)
    print(batch_prediction_job.state)
    return batch_prediction_job

Retrieve batch inference results

Vertex AI sends batch inference output to your specified destination.

When a batch inference task is complete, the inference output is stored in the Cloud Storage bucket that you specified in your request.

Example batch inference results

The following is an example of batch inference results from an image classification model.

{
  "instance": {"content": "gs://bucket/image.jpg", "mimeType": "image/jpeg"},
  "prediction": {
    "ids": [1, 2],
    "displayNames": ["cat", "dog"],
    "confidences": [0.7, 0.5]
  }
}
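A result record in this shape can be reduced to a ranked list of labels. The following sketch assumes each record carries parallel `displayNames` and `confidences` arrays, as in the example above. The function name `top_labels` and the `min_confidence` parameter are hypothetical.

```python
import json
from typing import List, Tuple


def top_labels(
    result_line: str, min_confidence: float = 0.0
) -> List[Tuple[str, float]]:
    """Return (label, confidence) pairs from one result record, best first."""
    record = json.loads(result_line)
    prediction = record["prediction"]
    # displayNames and confidences are parallel arrays: index i describes label i.
    pairs = zip(prediction["displayNames"], prediction["confidences"])
    kept = [(name, score) for name, score in pairs if score >= min_confidence]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    sample = json.dumps(
        {
            "instance": {
                "content": "gs://bucket/image.jpg",
                "mimeType": "image/jpeg",
            },
            "prediction": {
                "ids": [1, 2],
                "displayNames": ["cat", "dog"],
                "confidences": [0.7, 0.5],
            },
        }
    )
    print(top_labels(sample, min_confidence=0.6))
```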