Get batch predictions for Wide & Deep

This page shows you how to make a batch prediction request to a trained classification or regression model by using the Google Cloud console or the Vertex AI API.

A batch prediction request is asynchronous, unlike an online prediction request, which is synchronous. You request batch predictions directly from the model resource, without deploying the model to an endpoint. For tabular data, use batch predictions when you don't need an immediate response and want to process accumulated data with a single request.

To make a batch prediction request, you specify an input source and the output format in which Vertex AI stores the prediction results.

Before you begin

Before you can make a batch prediction request, you must first train a model.

Input data

The input data for a batch prediction request is the data that your model uses to make predictions. For classification or regression models, you can provide input data in one of two formats:

  • BigQuery tables
  • CSV objects in Cloud Storage

We recommend using the same format for your input data that you used to train the model. For example, if you trained your model on data in BigQuery, it is best to use a BigQuery table as the input for batch prediction. Because Vertex AI treats all CSV input fields as strings, mixing training and input data formats can cause errors.

Your data source must contain tabular data that includes every column used to train the model, in any order. You can also include columns that were not in the training data, or that were in the training data but excluded from training. These extra columns are included in the output but don't affect the prediction results.
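
The column requirement above can be checked before you submit a job. The following is a minimal sketch, assuming you keep your own list of training column names. The function name `missing_training_columns` and the sample columns are hypothetical and are not part of any Vertex AI API.

```python
import csv
import io


def missing_training_columns(csv_text, training_columns):
    """Return the training columns that are absent from the CSV header.

    Column order is ignored and extra columns are allowed, mirroring the
    input requirements described above.
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    present = set(header)
    return [col for col in training_columns if col not in present]


# Hypothetical data: "customer_id" is an extra column, which is allowed.
sample_csv = '"customer_id","age","income"\n"c-1","42","55000"\n'
print(missing_training_columns(sample_csv, ["income", "age"]))
print(missing_training_columns(sample_csv, ["income", "age", "region"]))
```

An empty list means that every training column is present. Any names returned would need to be added to the input before the job can use it.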

Input data requirements

BigQuery table

If you choose a BigQuery table as the input, make sure of the following:

  • BigQuery data source tables must be no larger than 100 GB.
  • If the table is in a different project, you must grant the BigQuery Data Editor role to the Vertex AI service account in that project.

CSV file

If you choose a CSV object in Cloud Storage as the input, make sure of the following:

  • The data source must begin with a header row that contains the column names.
  • Each data source object must be no larger than 10 GB. You can include multiple files, up to a total of 100 GB.
  • If the Cloud Storage bucket is in a different project, you must grant the Storage Object Creator role to the Vertex AI service account in that project.
  • You must enclose all strings in double quotation marks (").
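
The CSV requirements above can be sketched with the Python standard library. This is an illustration under stated assumptions, not an official tool: the helper names `rows_to_csv` and `within_size_limits` are hypothetical, 1 GB is assumed to mean 1024^3 bytes, and every field is quoted because Vertex AI treats all CSV input fields as strings.

```python
import csv
import io

MAX_OBJECT_BYTES = 10 * 1024**3   # 10 GB per CSV object (limit stated above)
MAX_TOTAL_BYTES = 100 * 1024**3   # 100 GB across all objects (limit stated above)


def rows_to_csv(header, rows):
    """Serialize rows to CSV text with a header row and every field quoted."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def within_size_limits(object_sizes):
    """Check the per-object and total size limits for a list of byte counts."""
    return (all(size <= MAX_OBJECT_BYTES for size in object_sizes)
            and sum(object_sizes) <= MAX_TOTAL_BYTES)


# Hypothetical rows; the header comes first, as the requirements demand.
csv_text = rows_to_csv(["city", "visits"], [["Seoul", 3], ["New York", 5]])
print(csv_text)
```

`csv.QUOTE_ALL` quotes numeric fields as well as text. If you prefer to quote only text, `csv.QUOTE_NONNUMERIC` is an alternative.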

Output format

The output format of a batch prediction request doesn't need to match the input format. For example, if you used a BigQuery table as the input, you can write the results to a CSV object in Cloud Storage.

Make a batch prediction request to your model

To make a batch prediction request, you can use the Google Cloud console or the Vertex AI API. The input data source can be CSV objects stored in a Cloud Storage bucket or a BigQuery table. Depending on the amount of data that you submit as input, a batch prediction task can take some time to complete.

Google Cloud console

Use the Google Cloud console to request a batch prediction.

  1. In the Google Cloud console, in the Vertex AI section, go to the Batch predictions page.

    Go to the Batch predictions page

  2. Click Create to open the New batch prediction window.
  3. For Define your batch prediction, complete the following steps:
    1. Enter a name for the batch prediction.
    2. For Model name, select the name of the model to use for this batch prediction.
    3. For Version, select the model version to use for this batch prediction.
    4. For Select source, select whether your source input data is a CSV file in Cloud Storage or a table in BigQuery.
      • For CSV files, specify the Cloud Storage location of the CSV input file.
      • For BigQuery tables, specify the ID of the project that contains the table, the BigQuery dataset ID, and the BigQuery table or view ID.
    5. For Output, select CSV or BigQuery.
      • For CSV, specify the Cloud Storage bucket where Vertex AI stores the output.
      • For BigQuery, you can specify a project ID or an existing dataset:
        • To specify a project ID, enter the project ID in the Google Cloud project ID field. Vertex AI automatically creates a new output dataset.
        • To specify an existing dataset, enter its BigQuery path in the Google Cloud project ID field, for example bq://projectid.datasetid.
  4. (Optional) Model Monitoring analysis for batch predictions is available in Preview. To learn how to add skew detection configuration to a batch prediction job, see Prerequisites.
    1. Click Enable model monitoring for this batch prediction to turn it on or off.
    2. Select a training data source. Enter the data path or location for the training data source that you selected.
    3. (Optional) Under Alert thresholds, specify the thresholds at which to trigger alerts.
    4. For Notification emails, enter one or more comma-separated email addresses that receive alerts when the model exceeds an alert threshold.
    5. (Optional) For Notification channels, add Cloud Monitoring channels that receive alerts when the model exceeds an alert threshold. You can select existing Cloud Monitoring channels or create a new one by clicking Manage notification channels. The console supports PagerDuty, Slack, and Pub/Sub notification channels.
  5. Click Create.

API: BigQuery

REST

Use the batchPredictionJobs.create method to request a batch prediction.

Before using any of the request data, make the following replacements:

  • LOCATION_ID: The region where the model is stored and the batch prediction job runs. For example, us-central1.
  • PROJECT_ID: Your project ID.
  • BATCH_JOB_NAME: The display name of the batch job.
  • MODEL_ID: The ID of the model to use for making predictions.
  • INPUT_URI: A reference to the BigQuery data source. Use the following form:
    bq://bqprojectId.bqDatasetId.bqTableId
    
  • OUTPUT_URI: A reference to the BigQuery destination where the predictions are written. Specify the project ID and, optionally, an existing dataset ID. If you specify only the project ID, Vertex AI creates a new output dataset. Use the following form:
    bq://bqprojectId.bqDatasetId
    
  • MACHINE_TYPE: The machine resources to use for this batch prediction job.
  • STARTING_REPLICA_COUNT: The starting number of nodes for this batch prediction job. The node count can increase or decrease as the load requires, up to the maximum number of nodes, but never falls below this number.
  • MAX_REPLICA_COUNT: The maximum number of nodes for this batch prediction job. The node count can increase or decrease as the load requires, but never exceeds this maximum. Optional; defaults to 10.

HTTP method and URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs

Request JSON body:

{
  "displayName": "BATCH_JOB_NAME",
  "model": "MODEL_ID",
  "inputConfig": {
    "instancesFormat": "bigquery",
    "bigquerySource": {
      "inputUri": "INPUT_URI"
    }
  },
  "outputConfig": {
    "predictionsFormat": "bigquery",
    "bigqueryDestination": {
      "outputUri": "OUTPUT_URI"
    }
  },
  "dedicatedResources": {
    "machineSpec": {
      "machineType": "MACHINE_TYPE",
      "acceleratorCount": "0"
    },
    "startingReplicaCount": STARTING_REPLICA_COUNT,
    "maxReplicaCount": MAX_REPLICA_COUNT
  }
}
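
If you generate the request body programmatically, a small builder keeps the placeholders in one place. The following is a minimal sketch that mirrors the JSON body above. The function name `build_bigquery_batch_request`, the example project, dataset, and table names, and the validation rules are assumptions for illustration only.

```python
import json


def build_bigquery_batch_request(display_name, model, input_uri, output_uri,
                                 machine_type, starting_replicas, max_replicas=10):
    """Build the JSON body for a BigQuery-to-BigQuery batch prediction job."""
    for uri in (input_uri, output_uri):
        if not uri.startswith("bq://"):
            raise ValueError(f"expected a bq:// URI, got {uri!r}")
    if starting_replicas > max_replicas:
        raise ValueError("starting replica count exceeds the maximum")
    return {
        "displayName": display_name,
        "model": model,
        "inputConfig": {
            "instancesFormat": "bigquery",
            "bigquerySource": {"inputUri": input_uri},
        },
        "outputConfig": {
            "predictionsFormat": "bigquery",
            "bigqueryDestination": {"outputUri": output_uri},
        },
        "dedicatedResources": {
            "machineSpec": {"machineType": machine_type, "acceleratorCount": "0"},
            "startingReplicaCount": starting_replicas,
            "maxReplicaCount": max_replicas,
        },
    }


# Hypothetical values; replace them with your own before sending the request.
body = build_bigquery_batch_request(
    "batch_job_1", "MODEL_ID", "bq://my-project.my_dataset.my_table",
    "bq://my-project", "n1-standard-32", 2, 6)
print(json.dumps(body, indent=2))
```

The printed JSON can be saved as request.json and sent with the curl or PowerShell commands in this section.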

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs/67890",
  "displayName": "batch_job_1 202005291958",
  "model": "projects/12345/locations/us-central1/models/5678",
  "state": "JOB_STATE_PENDING",
  "inputConfig": {
    "instancesFormat": "bigquery",
    "bigquerySource": {
      "inputUri": "INPUT_URI"
    }
  },
  "outputConfig": {
    "predictionsFormat": "bigquery",
    "bigqueryDestination": {
        "outputUri": "bq://12345"
    }
  },
  "dedicatedResources": {
    "machineSpec": {
      "machineType": "n1-standard-32",
      "acceleratorCount": "0"
    },
    "startingReplicaCount": 2,
    "maxReplicaCount": 6
  },
  "manualBatchTuningParameters": {
    "batchSize": 4
  },
  "generateExplanation": false,
  "outputInfo": {
    "bigqueryOutputDataset": "bq://12345.reg_model_2020_10_02_06_04"
  },
  "createTime": "2020-09-30T02:58:44.341643Z",
  "updateTime": "2020-09-30T02:58:44.341643Z"
}
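
Because the request is asynchronous, the response describes a job that is still pending. The following is a minimal sketch of reading the fields that matter from such a response. The helper name `summarize_job` is hypothetical, the sample response is abbreviated, and the set of states treated as finished is an assumption based on the JobState values SUCCEEDED, FAILED, and CANCELLED.

```python
import json

# Assumption: these JobState values mean the job has stopped running.
FINISHED_STATES = {"JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}


def summarize_job(response_text):
    """Extract the job ID, state, and BigQuery output dataset from a response."""
    job = json.loads(response_text)
    return {
        "job_id": job["name"].rsplit("/", 1)[-1],
        "state": job["state"],
        "finished": job["state"] in FINISHED_STATES,
        "output_dataset": job.get("outputInfo", {}).get("bigqueryOutputDataset"),
    }


# Abbreviated, hypothetical response in the shape shown above.
sample_response = """
{
  "name": "projects/12345/locations/us-central1/batchPredictionJobs/67890",
  "state": "JOB_STATE_PENDING",
  "outputInfo": {"bigqueryOutputDataset": "bq://12345.reg_model_2020_10_02_06_04"}
}
"""
summary = summarize_job(sample_response)
print(summary)
```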

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

In the following sample, replace INSTANCES_FORMAT and PREDICTIONS_FORMAT with `bigquery`. To learn how to replace the other placeholders, see the `REST & CMD LINE` tab of this section.
import com.google.cloud.aiplatform.v1.BatchPredictionJob;
import com.google.cloud.aiplatform.v1.BigQueryDestination;
import com.google.cloud.aiplatform.v1.BigQuerySource;
import com.google.cloud.aiplatform.v1.JobServiceClient;
import com.google.cloud.aiplatform.v1.JobServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import com.google.cloud.aiplatform.v1.ModelName;
import com.google.gson.JsonObject;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.io.IOException;

public class CreateBatchPredictionJobBigquerySample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "PROJECT";
    String displayName = "DISPLAY_NAME";
    String modelName = "MODEL_NAME";
    String instancesFormat = "INSTANCES_FORMAT";
    String bigquerySourceInputUri = "BIGQUERY_SOURCE_INPUT_URI";
    String predictionsFormat = "PREDICTIONS_FORMAT";
    String bigqueryDestinationOutputUri = "BIGQUERY_DESTINATION_OUTPUT_URI";
    createBatchPredictionJobBigquerySample(
        project,
        displayName,
        modelName,
        instancesFormat,
        bigquerySourceInputUri,
        predictionsFormat,
        bigqueryDestinationOutputUri);
  }

  static void createBatchPredictionJobBigquerySample(
      String project,
      String displayName,
      String model,
      String instancesFormat,
      String bigquerySourceInputUri,
      String predictionsFormat,
      String bigqueryDestinationOutputUri)
      throws IOException {
    JobServiceSettings settings =
        JobServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();
    String location = "us-central1";

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (JobServiceClient client = JobServiceClient.create(settings)) {
      JsonObject jsonModelParameters = new JsonObject();
      Value.Builder modelParametersBuilder = Value.newBuilder();
      JsonFormat.parser().merge(jsonModelParameters.toString(), modelParametersBuilder);
      Value modelParameters = modelParametersBuilder.build();
      BigQuerySource bigquerySource =
          BigQuerySource.newBuilder().setInputUri(bigquerySourceInputUri).build();
      BatchPredictionJob.InputConfig inputConfig =
          BatchPredictionJob.InputConfig.newBuilder()
              .setInstancesFormat(instancesFormat)
              .setBigquerySource(bigquerySource)
              .build();
      BigQueryDestination bigqueryDestination =
          BigQueryDestination.newBuilder().setOutputUri(bigqueryDestinationOutputUri).build();
      BatchPredictionJob.OutputConfig outputConfig =
          BatchPredictionJob.OutputConfig.newBuilder()
              .setPredictionsFormat(predictionsFormat)
              .setBigqueryDestination(bigqueryDestination)
              .build();
      String modelName = ModelName.of(project, location, model).toString();
      BatchPredictionJob batchPredictionJob =
          BatchPredictionJob.newBuilder()
              .setDisplayName(displayName)
              .setModel(modelName)
              .setModelParameters(modelParameters)
              .setInputConfig(inputConfig)
              .setOutputConfig(outputConfig)
              .build();
      LocationName parent = LocationName.of(project, location);
      BatchPredictionJob response = client.createBatchPredictionJob(parent, batchPredictionJob);
      System.out.format("response: %s\n", response);
      System.out.format("\tName: %s\n", response.getName());
    }
  }
}

Vertex AI SDK for Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Vertex AI SDK for Python API reference documentation.

In the following sample, set the `instances_format` and `predictions_format` parameters to `'bigquery'`. To learn how to set the other parameters, see the `REST & CMD LINE` tab of this section.
from google.cloud import aiplatform_v1beta1
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value


def create_batch_prediction_job_bigquery_sample(
    project: str,
    display_name: str,
    model_name: str,
    instances_format: str,
    bigquery_source_input_uri: str,
    predictions_format: str,
    bigquery_destination_output_uri: str,
    location: str = "us-central1",
    api_endpoint: str = "us-central1-aiplatform.googleapis.com",
):
    # The AI Platform services require regional API endpoints.
    client_options = {"api_endpoint": api_endpoint}
    # Initialize client that will be used to create and send requests.
    # This client only needs to be created once, and can be reused for multiple requests.
    client = aiplatform_v1beta1.JobServiceClient(client_options=client_options)
    model_parameters_dict = {}
    model_parameters = json_format.ParseDict(model_parameters_dict, Value())

    batch_prediction_job = {
        "display_name": display_name,
        # Format: 'projects/{project}/locations/{location}/models/{model_id}'
        "model": model_name,
        "model_parameters": model_parameters,
        "input_config": {
            "instances_format": instances_format,
            "bigquery_source": {"input_uri": bigquery_source_input_uri},
        },
        "output_config": {
            "predictions_format": predictions_format,
            "bigquery_destination": {"output_uri": bigquery_destination_output_uri},
        },
        # optional
        "generate_explanation": True,
    }
    parent = f"projects/{project}/locations/{location}"
    response = client.create_batch_prediction_job(
        parent=parent, batch_prediction_job=batch_prediction_job
    )
    print("response:", response)

API: Cloud Storage

REST

Use the batchPredictionJobs.create method to request a batch prediction.

Before using any of the request data, make the following replacements:

  • LOCATION_ID: The region where the model is stored and the batch prediction job runs. For example, us-central1.
  • PROJECT_ID: Your project ID.
  • BATCH_JOB_NAME: The display name of the batch job.
  • MODEL_ID: The ID of the model to use for making predictions.
  • URI: The paths (URIs) to the Cloud Storage objects that contain the input data. There can be more than one. Each URI has the following form:
    gs://bucketName/pathToFileName
    
  • OUTPUT_URI_PREFIX: The path to the Cloud Storage destination where the predictions are written. Vertex AI writes batch predictions to a timestamped subdirectory of this path. Set this value to a string in the following form:
    gs://bucketName/pathToOutputDirectory
    
  • MACHINE_TYPE: The machine resources to use for this batch prediction job.
  • STARTING_REPLICA_COUNT: The starting number of nodes for this batch prediction job. The node count can increase or decrease as the load requires, up to the maximum number of nodes, but never falls below this number.
  • MAX_REPLICA_COUNT: The maximum number of nodes for this batch prediction job. The node count can increase or decrease as the load requires, but never exceeds this maximum. Optional; defaults to 10.

HTTP method and URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs

Request JSON body:

{
  "displayName": "BATCH_JOB_NAME",
  "model": "MODEL_ID",
  "inputConfig": {
    "instancesFormat": "csv",
    "gcsSource": {
      "uris": [
        URI1,...
      ]
    }
  },
  "outputConfig": {
    "predictionsFormat": "csv",
    "gcsDestination": {
      "outputUriPrefix": "OUTPUT_URI_PREFIX"
    }
  },
  "dedicatedResources": {
    "machineSpec": {
      "machineType": "MACHINE_TYPE",
      "acceleratorCount": "0"
    },
    "startingReplicaCount": STARTING_REPLICA_COUNT,
    "maxReplicaCount": MAX_REPLICA_COUNT
  }
}
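
The Cloud Storage variant of the request differs mainly in that it accepts a list of input URIs and an output prefix. The following is a minimal sketch that mirrors the JSON body above. The function name `build_gcs_batch_request` and the bucket and object names are hypothetical.

```python
import json


def build_gcs_batch_request(display_name, model, input_uris, output_uri_prefix,
                            machine_type, starting_replicas, max_replicas=10):
    """Build the JSON body for a CSV-to-CSV batch prediction job."""
    if not input_uris:
        raise ValueError("at least one input URI is required")
    for uri in list(input_uris) + [output_uri_prefix]:
        if not uri.startswith("gs://"):
            raise ValueError(f"expected a gs:// URI, got {uri!r}")
    return {
        "displayName": display_name,
        "model": model,
        "inputConfig": {
            "instancesFormat": "csv",
            "gcsSource": {"uris": list(input_uris)},
        },
        "outputConfig": {
            "predictionsFormat": "csv",
            "gcsDestination": {"outputUriPrefix": output_uri_prefix},
        },
        "dedicatedResources": {
            "machineSpec": {"machineType": machine_type, "acceleratorCount": "0"},
            "startingReplicaCount": starting_replicas,
            "maxReplicaCount": max_replicas,
        },
    }


# Hypothetical bucket and object names.
gcs_body = build_gcs_batch_request(
    "batch_job_1", "MODEL_ID",
    ["gs://my-bucket/input/part-1.csv", "gs://my-bucket/input/part-2.csv"],
    "gs://my-bucket/output", "n1-standard-32", 2, 6)
print(json.dumps(gcs_body, indent=2))
```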

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_ID/locations/LOCATION_ID/batchPredictionJobs/67890",
  "displayName": "batch_job_1 202005291958",
  "model": "projects/12345/locations/us-central1/models/5678",
  "state": "JOB_STATE_PENDING",
  "inputConfig": {
    "instancesFormat": "csv",
    "gcsSource": {
      "uris": [
        "gs://bp_bucket/reg_mode_test"
      ]
    }
  },
  "outputConfig": {
    "predictionsFormat": "csv",
    "gcsDestination": {
      "outputUriPrefix": "OUTPUT_URI_PREFIX"
    }
  },
  "dedicatedResources": {
    "machineSpec": {
      "machineType": "n1-standard-32",
      "acceleratorCount": "0"
    },
    "startingReplicaCount": 2,
    "maxReplicaCount": 6
  },
  "manualBatchTuningParameters": {
    "batchSize": 4
  },
  "outputInfo": {
    "gcsOutputDirectory": "OUTPUT_URI_PREFIX/prediction-batch_job_1 202005291958-2020-09-30T02:58:44.341643Z"
  },
  "createTime": "2020-09-30T02:58:44.341643Z",
  "updateTime": "2020-09-30T02:58:44.341643Z"
}

Retrieve batch prediction results

Vertex AI sends the output of batch predictions to the destination that you specified, which can be either BigQuery or Cloud Storage.

BigQuery

Output dataset

If you use BigQuery, the output of batch prediction is stored in an output dataset. If you provided a dataset to Vertex AI, the name of the dataset (BQ_DATASET_NAME) is the name that you provided earlier. If you did not provide an output dataset, Vertex AI created one for you. You can find its name (BQ_DATASET_NAME) with the following steps:

  1. In the Google Cloud console, go to the Vertex AI Batch predictions page.

    Go to the Batch predictions page

  2. Select the prediction that you created.
  3. The output dataset is shown in Export location. The dataset name has the format prediction_MODEL_NAME_TIMESTAMP.

Output tables

The output dataset contains one or more of the following output tables:

  • Predictions table

    This table contains a row for every row in your input data where a prediction was requested (that is, where TARGET_COLUMN_NAME = null).

  • Errors table

    This table contains a row for each non-critical error encountered during batch prediction. Each non-critical error corresponds to a row in the input data for which Vertex AI could not return a prediction.

Predictions table

The name of the table (BQ_PREDICTIONS_TABLE_NAME) is formed by appending the timestamp of when the batch prediction job started to `predictions_`: predictions_TIMESTAMP

To retrieve predictions, go to the BigQuery page.

Go to BigQuery

The format of the query depends on your model type:

Classification:

SELECT predicted_TARGET_COLUMN_NAME.classes AS classes,
predicted_TARGET_COLUMN_NAME.scores AS scores
FROM BQ_DATASET_NAME.BQ_PREDICTIONS_TABLE_NAME

classes is the list of potential classes, and scores are the corresponding confidence scores.

Regression:

SELECT predicted_TARGET_COLUMN_NAME.value
FROM BQ_DATASET_NAME.BQ_PREDICTIONS_TABLE_NAME

If your model uses probabilistic inference, predicted_TARGET_COLUMN_NAME.value contains the minimizer of the optimization objective. For example, if your optimization objective is minimize-rmse, predicted_TARGET_COLUMN_NAME.value contains the mean value. If it is minimize-mae, predicted_TARGET_COLUMN_NAME.value contains the median value.

If your model uses probabilistic inference with quantiles, Vertex AI provides quantile values and predictions in addition to the minimizer of the optimization objective. Quantile values are set during model training. Quantile predictions are the prediction values associated with the quantile values.
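
The two query shapes above differ only in the selected columns, so they can be generated from the model type. The following is a minimal sketch; the function name `build_predictions_query` and the example dataset, table, and column names are hypothetical. It only builds the SQL text and does not run it.

```python
def build_predictions_query(model_type, dataset, table, target_column):
    """Return the SQL text for reading predictions, based on the model type."""
    prefix = f"predicted_{target_column}"
    if model_type == "classification":
        columns = f"{prefix}.classes AS classes,\n{prefix}.scores AS scores"
    elif model_type == "regression":
        columns = f"{prefix}.value"
    else:
        raise ValueError(f"unsupported model type: {model_type!r}")
    return f"SELECT {columns}\nFROM {dataset}.{table}"


# Hypothetical dataset, table, and target column names.
print(build_predictions_query(
    "classification", "my_dataset", "predictions_2020_10_02", "churned"))
print(build_predictions_query(
    "regression", "my_dataset", "predictions_2020_10_02", "price"))
```

The sketch interpolates names directly into the SQL text, so only pass it identifiers that you control.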

Errors table

The name of the table (BQ_ERRORS_TABLE_NAME) is formed by appending the timestamp of when the batch prediction job started to `errors_`: errors_TIMESTAMP

To retrieve the errors table, do the following:

  1. In the console, go to the BigQuery page.

    Go to BigQuery

  2. Run the following query:
    SELECT * FROM BQ_DATASET_NAME.BQ_ERRORS_TABLE_NAME
          
The errors are stored in the following columns:
  • errors_TARGET_COLUMN_NAME.code
  • errors_TARGET_COLUMN_NAME.message

Cloud Storage

If you specified Cloud Storage as your output destination, the results of your batch prediction request are returned as CSV objects in a new folder in the bucket that you specified. The folder name is the name of your model, prepended with "prediction_" and appended with the timestamp of when the batch prediction job started. You can find the Cloud Storage folder name in the Batch predictions tab for your model.

The Cloud Storage folder contains two kinds of objects:
  • Prediction objects

    The prediction objects are named `predictions_1.csv`, `predictions_2.csv`, and so on. They contain a header row with the column names and one row for every prediction returned. In the prediction objects, Vertex AI returns your prediction data and creates one or more new columns for the prediction results, depending on your model type:

    • Classification: For each potential value of your target column, a column named TARGET_COLUMN_NAME_VALUE_score is added to the results. This column contains the score, or confidence estimate, for that value.
    • Regression: The predicted value for the row is returned in a column named predicted_TARGET_COLUMN_NAME. The prediction interval is not returned for CSV output.
  • Error objects

    The error objects are named `errors_1.csv`, `errors_2.csv`, and so on. They contain a header row and one row for every row in your input data for which Vertex AI could not return a prediction, for example because a non-nullable feature was null.

Note: If the results are large, they are split into multiple objects.
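
For classification output in CSV form, the per-class score columns can be reduced to a single predicted label per row. The following is a minimal sketch using only the standard library. The function name `top_class_per_row`, the target column `churned`, and the sample rows are hypothetical, and the sketch assumes the column naming pattern TARGET_COLUMN_NAME_VALUE_score described above.

```python
import csv
import io


def top_class_per_row(csv_text, target_column):
    """Return (class, score) with the highest score for each prediction row."""
    prefix = f"{target_column}_"
    suffix = "_score"
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Collect the score columns for this row, keyed by class value.
        scores = {
            name[len(prefix):-len(suffix)]: float(value)
            for name, value in row.items()
            if name.startswith(prefix) and name.endswith(suffix)
        }
        results.append(max(scores.items(), key=lambda item: item[1]))
    return results


# Hypothetical contents of a predictions_1.csv object.
sample_predictions = (
    "age,churned_yes_score,churned_no_score\n"
    "42,0.83,0.17\n"
    "27,0.10,0.90\n"
)
print(top_class_per_row(sample_predictions, "churned"))
```

When the results are split across several objects, the same function can be applied to each object in turn and the lists concatenated.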

Interpret prediction results

Classification

Classification models return a confidence score.

The confidence score communicates how strongly your model associates each class or label with a test item. The higher the number, the higher the model's confidence that the label should be applied to that item. You decide how high the confidence score must be for you to accept the model's results.
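
Choosing a confidence threshold can be sketched as a simple filter over the classes and scores that the model returns. The function name `accept_prediction` and the threshold values below are illustrative assumptions; the right threshold depends on your use case.

```python
def accept_prediction(classes, scores, threshold):
    """Return the highest-scoring label if it meets the threshold, else None."""
    best_class, best_score = max(zip(classes, scores), key=lambda pair: pair[1])
    return best_class if best_score >= threshold else None


# Hypothetical classes and scores for one prediction row.
print(accept_prediction(["yes", "no"], [0.83, 0.17], threshold=0.8))
print(accept_prediction(["yes", "no"], [0.83, 0.17], threshold=0.9))
```

A higher threshold accepts fewer predictions but with greater confidence. Rows that return None can be routed to manual review.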

Regression

Regression models return a prediction value.

If your model uses probabilistic inference, the value field contains the minimizer of the optimization objective. For example, if your optimization objective is minimize-rmse, the value field contains the mean value. If it is minimize-mae, the value field contains the median value.
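
The link between the optimization objective and the returned value can be checked numerically: the mean of a sample minimizes the root mean squared error, and the median minimizes the mean absolute error. The following sketch illustrates that property on made-up numbers. It is not how Vertex AI computes predictions, and the sample values and helper names are assumptions.

```python
import statistics


def rmse(values, guess):
    """Root mean squared error of a single constant guess."""
    return (sum((v - guess) ** 2 for v in values) / len(values)) ** 0.5


def mae(values, guess):
    """Mean absolute error of a single constant guess."""
    return sum(abs(v - guess) for v in values) / len(values)


# Made-up sample with one outlier, so the mean and the median differ.
samples = [1.0, 2.0, 3.0, 4.0, 20.0]
mean, median = statistics.mean(samples), statistics.median(samples)

# Search a grid of candidate point predictions from 0.0 to 20.0 in steps of 0.1.
candidates = [x / 10 for x in range(0, 201)]
best_rmse = min(candidates, key=lambda g: rmse(samples, g))
best_mae = min(candidates, key=lambda g: mae(samples, g))

print("rmse minimizer:", best_rmse, "mean:", mean)
print("mae minimizer:", best_mae, "median:", median)
```

The outlier pulls the mean well above the median, which is why the two objectives can return noticeably different values for the same predictive distribution.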

If your model uses probabilistic inference with quantiles, Vertex AI provides quantile values and predictions in addition to the minimizer of the optimization objective. Quantile values are set during model training. Quantile predictions are the prediction values associated with the quantile values.

What's next