Prepare image training data for classification

This page describes how to prepare image training data for use in a Vertex AI dataset to train an image classification model.

The following objective sections cover data requirements, the input/output schema file, and the data import file formats (JSON Lines and CSV) that the schema defines.

Single-label classification

Data requirements

  • ํ•™์Šต ๋ฐ์ดํ„ฐ: ๋ชจ๋ธ์„ ํ•™์Šต์‹œํ‚ฌ ๋•Œ ๋‹ค์Œ๊ณผ ๊ฐ™์€ ์ด๋ฏธ์ง€ ํ˜•์‹์ด ์ง€์›๋ฉ๋‹ˆ๋‹ค. Vertex AI API๋Š” ์ด๋ ‡๊ฒŒ ๊ฐ€์ ธ์˜จ ์ด๋ฏธ์ง€๋ฅผ ์‚ฌ์ „ ์ฒ˜๋ฆฌํ•œ ํ›„ ๋ชจ๋ธ์„ ํ•™์Šต์‹œํ‚ค๋Š” ๋ฐ ์‚ฌ์šฉ๋˜๋Š” ๋ฐ์ดํ„ฐ๋กœ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค. ์ด๋ฏธ์ง€๋‹น ์ตœ๋Œ€ ํŒŒ์ผ ํฌ๊ธฐ๋Š” 30MB์ž…๋‹ˆ๋‹ค.
    • JPEG
    • GIF
    • PNG
    • BMP
    • ICO
  • ์˜ˆ์ธก ๋ฐ์ดํ„ฐ: ๋ชจ๋ธ์—์„œ ์˜ˆ์ธก์„ ์š”์ฒญ (์ฟผ๋ฆฌ)ํ•  ๋•Œ ์ง€์›๋˜๋Š” ์ด๋ฏธ์ง€ ํ˜•์‹์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค. ์ตœ๋Œ€ ํŒŒ์ผ ํฌ๊ธฐ๋Š” 1.5MB์ž…๋‹ˆ๋‹ค.
    • JPEG
    • GIF
    • PNG
    • WEBP
    • BMP
    • TIFF
    • ICO

    Best practices for image data used to train AutoML models

    The following best practices apply to datasets used to train models with AutoML.

  • AutoML models are optimized for photographs of real-world objects.
  • The training data should be as close as possible to the data on which predictions are made. For example, if your use case involves blurry, low-resolution images (such as security camera footage), compose your training data of blurry, low-resolution images. In general, also consider providing training images taken from multiple angles, at multiple resolutions, and against multiple backgrounds.
  • Vertex AI models generally can't predict labels that humans can't assign. If a person can't be trained to assign a label after looking at the image for 1-2 seconds, the model likely can't be trained to do so either.
  • We recommend about 1,000 training images per label. The minimum per label is 10. Models with multiple labels per image generally need more examples per label, and the resulting scores are harder to interpret.
  • The model works best when the most common label has at most 100 times more images than the least common label. We recommend removing labels with very low frequency.
  • Consider including a None_of_the_above label and images that don't match any of your defined labels. For example, for a flower dataset, include images of flowers outside of your labeled varieties and label them None_of_the_above.

YAML schema file

Use the following publicly accessible schema file to import single-label image classification annotations. This schema file dictates the format of the data input files. The file's structure follows the OpenAPI schema.

gs://google-cloud-aiplatform/schema/dataset/ioformat/image_classification_single_label_io_format_1.0.0.yaml

Full schema file

title: ImageClassificationSingleLabel
description: >
 Import and export format for importing/exporting images together with
 single-label classification annotation. Can be used in
 Dataset.import_schema_uri field.
type: object
required:
- imageGcsUri
properties:
 imageGcsUri:
   type: string
   description: >
     A Cloud Storage URI pointing to an image. Up to 30MB in size.
     Supported file mime types: `image/jpeg`, `image/gif`, `image/png`,
     `image/webp`, `image/bmp`, `image/tiff`, `image/vnd.microsoft.icon`.
 classificationAnnotation:
   type: object
   description: Single classification Annotation on the image.
   properties:
     displayName:
       type: string
       description: >
         It will be imported as/exported from AnnotationSpec's display name,
         i.e. the name of the label/class.
     annotationResourceLabels:
       description: Resource labels on the Annotation.
       type: object
       additionalProperties:
         type: string
 dataItemResourceLabels:
   description: Resource labels on the DataItem.
   type: object
   additionalProperties:
     type: string

Input files

JSON Lines

JSON on each line:


{
  "imageGcsUri": "gs://bucket/filename.ext",
  "classificationAnnotation": {
    "displayName": "LABEL",
    "annotationResourceLabels": {
      "aiplatform.googleapis.com/annotation_set_name": "displayName",
      "env": "prod"
    }
  },
  "dataItemResourceLabels": {
    "aiplatform.googleapis.com/ml_use": "training/test/validation"
  }
}

Field notes:

  • imageGcsUri - The only required field.
  • annotationResourceLabels - Can contain key-value string pairs. The only system-reserved key-value pair is the following:
    • 'aiplatform.googleapis.com/annotation_set_name': 'value'

    Where value is one of the display names of the existing annotation sets in the dataset.

  • dataItemResourceLabels - Can contain key-value string pairs. The only system-reserved key-value pair is the following, which specifies the machine learning use set of the data item:
    • 'aiplatform.googleapis.com/ml_use': 'training/test/validation'

Example JSON Lines - image_classification_single_label.jsonl:


{"imageGcsUri": "gs://bucket/filename1.jpeg",  "classificationAnnotation": {"displayName": "daisy"}, "dataItemResourceLabels": {"aiplatform.googleapis.com/ml_use": "test"}}
{"imageGcsUri": "gs://bucket/filename2.gif",  "classificationAnnotation": {"displayName": "dandelion"}, "dataItemResourceLabels": {"aiplatform.googleapis.com/ml_use": "training"}}
{"imageGcsUri": "gs://bucket/filename3.png",  "classificationAnnotation": {"displayName": "roses"}, "dataItemResourceLabels": {"aiplatform.googleapis.com/ml_use": "training"}}
{"imageGcsUri": "gs://bucket/filename4.bmp",  "classificationAnnotation": {"displayName": "sunflowers"}, "dataItemResourceLabels": {"aiplatform.googleapis.com/ml_use": "training"}}
{"imageGcsUri": "gs://bucket/filename5.tiff",  "classificationAnnotation": {"displayName": "tulips"}, "dataItemResourceLabels": {"aiplatform.googleapis.com/ml_use": "validation"}}
...
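
Lines in this format can be generated with the standard json module. The following is a minimal sketch; the helper name single_label_line and the sample rows are illustrative assumptions. It relies only on the schema rule that imageGcsUri is the sole required field, so the label and the ml_use value are emitted only when provided.

```python
import json


def single_label_line(image_uri, label=None, ml_use=None):
    """Build one JSON Lines record; only imageGcsUri is required."""
    record = {"imageGcsUri": image_uri}
    if label is not None:
        record["classificationAnnotation"] = {"displayName": label}
    if ml_use is not None:
        record["dataItemResourceLabels"] = {
            "aiplatform.googleapis.com/ml_use": ml_use
        }
    return json.dumps(record)


rows = [
    ("gs://bucket/filename1.jpeg", "daisy", "test"),
    ("gs://bucket/filename2.gif", "dandelion", "training"),
    ("gs://bucket/filename3.png", None, None),  # unlabeled, no split assigned
]
# One JSON object per line, each terminated by a newline.
jsonl_text = "\n".join(single_label_line(*row) for row in rows) + "\n"
```

Writing jsonl_text to a file and uploading it to Cloud Storage is left out of the sketch.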

CSV

CSV format:

[ML_USE],GCS_FILE_PATH,[LABEL]
List of columns
  • ML_USE (Optional) - Used for data split purposes when training a model. Use TRAINING, TEST, or VALIDATION. For more information about manual data splitting, see About data splits for AutoML models.
  • GCS_FILE_PATH - This field contains the Cloud Storage URI for the image. Cloud Storage URIs are case-sensitive.
  • LABEL (Optional) - Labels must start with a letter and contain only letters, numbers, and underscores.

Example CSV - image_classification_single_label.csv:

test,gs://bucket/filename1.jpeg,daisy
training,gs://bucket/filename2.gif,dandelion
gs://bucket/filename3.png
gs://bucket/filename4.bmp,sunflowers
validation,gs://bucket/filename5.tiff,tulips
...
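
Rows like these can be produced with the standard csv module. The sketch below is illustrative only: the function name write_single_label_csv and the input tuple layout are assumptions. Optional columns are omitted entirely instead of being written as empty fields, matching the rows shown in the example.

```python
import csv
import io


def write_single_label_csv(items):
    """items: iterable of (ml_use, gcs_uri, label); ml_use and label may be None."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for ml_use, uri, label in items:
        row = []
        if ml_use is not None:
            row.append(ml_use)   # optional leading ML_USE column
        row.append(uri)          # required Cloud Storage URI
        if label is not None:
            row.append(label)    # optional trailing LABEL column
        writer.writerow(row)
    return buf.getvalue()


csv_text = write_single_label_csv([
    ("test", "gs://bucket/filename1.jpeg", "daisy"),
    (None, "gs://bucket/filename3.png", None),
    (None, "gs://bucket/filename4.bmp", "sunflowers"),
])
```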
    

Multi-label classification

Data requirements

  • ํ•™์Šต ๋ฐ์ดํ„ฐ: ๋ชจ๋ธ์„ ํ•™์Šต์‹œํ‚ฌ ๋•Œ ๋‹ค์Œ๊ณผ ๊ฐ™์€ ์ด๋ฏธ์ง€ ํ˜•์‹์ด ์ง€์›๋ฉ๋‹ˆ๋‹ค. Vertex AI API๋Š” ์ด๋ ‡๊ฒŒ ๊ฐ€์ ธ์˜จ ์ด๋ฏธ์ง€๋ฅผ ์‚ฌ์ „ ์ฒ˜๋ฆฌํ•œ ํ›„ ๋ชจ๋ธ์„ ํ•™์Šต์‹œํ‚ค๋Š” ๋ฐ ์‚ฌ์šฉ๋˜๋Š” ๋ฐ์ดํ„ฐ๋กœ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค. ์ด๋ฏธ์ง€๋‹น ์ตœ๋Œ€ ํŒŒ์ผ ํฌ๊ธฐ๋Š” 30MB์ž…๋‹ˆ๋‹ค.
    • JPEG
    • GIF
    • PNG
    • BMP
    • ICO
  • ์˜ˆ์ธก ๋ฐ์ดํ„ฐ: ๋ชจ๋ธ์—์„œ ์˜ˆ์ธก์„ ์š”์ฒญ (์ฟผ๋ฆฌ)ํ•  ๋•Œ ์ง€์›๋˜๋Š” ์ด๋ฏธ์ง€ ํ˜•์‹์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค. ์ตœ๋Œ€ ํŒŒ์ผ ํฌ๊ธฐ๋Š” 1.5MB์ž…๋‹ˆ๋‹ค.
    • JPEG
    • GIF
    • PNG
    • WEBP
    • BMP
    • TIFF
    • ICO

    Best practices for image data used to train AutoML models

    The following best practices apply to datasets used to train models with AutoML.

  • AutoML models are optimized for photographs of real-world objects.
  • The training data should be as close as possible to the data on which predictions are made. For example, if your use case involves blurry, low-resolution images (such as security camera footage), compose your training data of blurry, low-resolution images. In general, also consider providing training images taken from multiple angles, at multiple resolutions, and against multiple backgrounds.
  • Vertex AI models generally can't predict labels that humans can't assign. If a person can't be trained to assign a label after looking at the image for 1-2 seconds, the model likely can't be trained to do so either.
  • We recommend about 1,000 training images per label. The minimum per label is 10. Models with multiple labels per image generally need more examples per label, and the resulting scores are harder to interpret.
  • The model works best when the most common label has at most 100 times more images than the least common label. We recommend removing labels with very low frequency.
  • Consider including a None_of_the_above label and images that don't match any of your defined labels. For example, for a flower dataset, include images of flowers outside of your labeled varieties and label them None_of_the_above.

YAML schema file

Use the following publicly accessible schema file to import multi-label image classification annotations. This schema file dictates the format of the data input files. The file's structure follows the OpenAPI schema.

gs://google-cloud-aiplatform/schema/dataset/ioformat/image_classification_multi_label_io_format_1.0.0.yaml

Full schema file

title: ImageClassificationMultiLabel
description: >
 Import and export format for importing/exporting images together with
 multi-label classification annotations. Can be used in
 Dataset.import_schema_uri field.
type: object
required:
- imageGcsUri
properties:
 imageGcsUri:
   type: string
   description: >
     A Cloud Storage URI pointing to an image. Up to 30MB in size.
     Supported file mime types: `image/jpeg`, `image/gif`, `image/png`,
     `image/webp`, `image/bmp`, `image/tiff`, `image/vnd.microsoft.icon`.
 classificationAnnotations:
   type: array
   description: Multiple classification Annotations on the image.
   items:
     type: object
     description: Classification annotation.
     properties:
       displayName:
         type: string
         description: >
           It will be imported as/exported from AnnotationSpec's display name,
           i.e. the name of the label/class.
       annotationResourceLabels:
         description: Resource labels on the Annotation.
         type: object
         additionalProperties:
           type: string
 dataItemResourceLabels:
   description: Resource labels on the DataItem.
   type: object
   additionalProperties:
     type: string

Input files

JSON Lines

JSON on each line:

{
  "imageGcsUri": "gs://bucket/filename.ext",
  "classificationAnnotations": [
    {
      "displayName": "LABEL1",
      "annotationResourceLabels": {
        "aiplatform.googleapis.com/annotation_set_name":"displayName",
        "label_type": "flower_type"
      }
    },
    {
      "displayName": "LABEL2",
      "annotationResourceLabels": {
        "aiplatform.googleapis.com/annotation_set_name":"displayName",
        "label_type": "image_shot_type"
      }
    }
  ],
  "dataItemResourceLabels": {
    "aiplatform.googleapis.com/ml_use": "training/test/validation"
  }
}

Field notes:

  • imageGcsUri - The only required field.
  • annotationResourceLabels - Can contain key-value string pairs. The only system-reserved key-value pair is the following:
    • 'aiplatform.googleapis.com/annotation_set_name': 'value'

    Where value is one of the display names of the existing annotation sets in the dataset.

  • dataItemResourceLabels - Can contain key-value string pairs. The only system-reserved key-value pair is the following, which specifies the machine learning use set of the data item:
    • 'aiplatform.googleapis.com/ml_use': 'training/test/validation'

Example JSON Lines - image_classification_multi_label.jsonl:


{"imageGcsUri": "gs://bucket/filename1.jpeg",  "classificationAnnotations": [{"displayName": "daisy"}, {"displayName": "full_shot"}], "dataItemResourceLabels": {"aiplatform.googleapis.com/ml_use": "test"}}
{"imageGcsUri": "gs://bucket/filename2.gif",  "classificationAnnotations": [{"displayName": "dandelion"}, {"displayName": "medium_shot"}], "dataItemResourceLabels": {"aiplatform.googleapis.com/ml_use": "training"}}
{"imageGcsUri": "gs://bucket/filename3.png",  "classificationAnnotations": [{"displayName": "roses"}, {"displayName": "extreme_closeup"}], "dataItemResourceLabels": {"aiplatform.googleapis.com/ml_use": "training"}}
{"imageGcsUri": "gs://bucket/filename4.bmp",  "classificationAnnotations": [{"displayName": "sunflowers"}, {"displayName": "closeup"}], "dataItemResourceLabels": {"aiplatform.googleapis.com/ml_use": "training"}}
{"imageGcsUri": "gs://bucket/filename5.tiff",  "classificationAnnotations": [{"displayName": "tulips"}, {"displayName": "extreme_closeup"}], "dataItemResourceLabels": {"aiplatform.googleapis.com/ml_use": "validation"}}
...
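
Because each image can carry several annotations, it can help to tally how often each label appears before importing the file. The following is a minimal sketch using only the standard library; the function name count_labels is hypothetical, and it assumes the file content is already loaded as a string.

```python
import json
from collections import Counter


def count_labels(jsonl_text):
    """Count displayName occurrences across multi-label JSON Lines records."""
    counts = Counter()
    for line_no, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        if "imageGcsUri" not in record:
            raise ValueError(f"line {line_no}: missing required field imageGcsUri")
        # classificationAnnotations is optional and holds a list of annotations.
        for annotation in record.get("classificationAnnotations", []):
            name = annotation.get("displayName")
            if name is not None:
                counts[name] += 1
    return counts
```

The resulting counts can be compared against the per-label minimums and the imbalance guidance given in the best practices.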

CSV

CSV format:

[ML_USE],GCS_FILE_PATH,[LABEL1,LABEL2,...LABELn]
List of columns
  • ML_USE (Optional) - Used for data split purposes when training a model. Use TRAINING, TEST, or VALIDATION. For more information about manual data splitting, see About data splits for AutoML models.
  • GCS_FILE_PATH - This field contains the Cloud Storage URI for the image. Cloud Storage URIs are case-sensitive.
  • LABEL (Optional) - Labels must start with a letter and contain only letters, numbers, and underscores.

Example CSV - image_classification_multi_label.csv:

test,gs://bucket/filename1.jpeg,daisy,full_shot
training,gs://bucket/filename2.gif,dandelion,medium_shot
gs://bucket/filename3.png
gs://bucket/filename4.bmp,sunflowers,closeup
validation,gs://bucket/filename5.tiff,tulips,extreme_closeup
...
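
Since both the leading ML_USE column and the trailing labels are optional, a reader of this format has to decide what the first field of each row is. The sketch below is one possible approach, not the importer's actual logic: it assumes a row without ML_USE starts directly with a gs:// URI, and the function name parse_multi_label_csv is hypothetical.

```python
import csv
import io

ML_USE_VALUES = {"training", "test", "validation"}


def parse_multi_label_csv(text):
    """Yield (ml_use, gcs_uri, labels) for each row of a multi-label CSV."""
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue  # skip empty lines
        # Assumption: a row without ML_USE starts directly with the gs:// URI.
        if fields[0].startswith("gs://"):
            ml_use, rest = None, fields
        else:
            ml_use, rest = fields[0].lower(), fields[1:]
            if ml_use not in ML_USE_VALUES:
                raise ValueError(f"unknown ML_USE value: {fields[0]!r}")
        if not rest or not rest[0].startswith("gs://"):
            raise ValueError(f"missing Cloud Storage URI in row: {fields!r}")
        # Everything after the URI is treated as a label; the list may be empty.
        yield ml_use, rest[0], rest[1:]
```

The ML_USE value is lowercased only for comparison and reporting; the URI is passed through untouched because Cloud Storage URIs are case-sensitive.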