Dataproc ์„ ํƒ์  Flink ๊ตฌ์„ฑ์š”์†Œ

์„ ํƒ์  ๊ตฌ์„ฑ์š”์†Œ ๊ธฐ๋Šฅ์„ ์‚ฌ์šฉํ•˜์—ฌ Dataproc ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ๋งŒ๋“ค ๋•Œ Flink์™€ ๊ฐ™์€ ์ถ”๊ฐ€ ๊ตฌ์„ฑ์š”์†Œ๋ฅผ ํ™œ์„ฑํ™”ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด ํŽ˜์ด์ง€์—์„œ๋Š” Apache Flink ์„ ํƒ์  ๊ตฌ์„ฑ์š”์†Œ(Flink ํด๋Ÿฌ์Šคํ„ฐ)๊ฐ€ ํ™œ์„ฑํ™”๋œ Dataproc ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ๋งŒ๋“  ํ›„ ํด๋Ÿฌ์Šคํ„ฐ์—์„œ Flink ์ž‘์—…์„ ์‹คํ–‰ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค.

Flink ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋‹ค์Œ์„ ์ˆ˜ํ–‰ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

  1. Google Cloud ์ฝ˜์†”, Google Cloud CLI ๋˜๋Š” Dataproc API์—์„œ Dataproc Jobs ๋ฆฌ์†Œ์Šค๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ Flink ์ž‘์—… ์‹คํ–‰

  2. Flink ํด๋Ÿฌ์Šคํ„ฐ ๋งˆ์Šคํ„ฐ ๋…ธ๋“œ์—์„œ ์‹คํ–‰๋˜๋Š” flink CLI๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ Flink ์ž‘์—… ์‹คํ–‰

  3. Flink์—์„œ Apache Beam ์ž‘์—… ์‹คํ–‰

  4. Kerberized ํด๋Ÿฌ์Šคํ„ฐ์—์„œ Flink ์‹คํ–‰

Google Cloud ์ฝ˜์†”, Google Cloud CLI ๋˜๋Š” Dataproc API๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํด๋Ÿฌ์Šคํ„ฐ์—์„œ Flink ๊ตฌ์„ฑ์š”์†Œ๊ฐ€ ํ™œ์„ฑํ™”๋œ Dataproc ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ๋งŒ๋“ค ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๊ถŒ์žฅ์‚ฌํ•ญ: Flink ๊ตฌ์„ฑ์š”์†Œ๊ฐ€ ํฌํ•จ๋œ ๋งˆ์Šคํ„ฐ 1๊ฐœ ํ‘œ์ค€ VM ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ์‚ฌ์šฉํ•˜์„ธ์š”. Dataproc ๊ณ ๊ฐ€์šฉ์„ฑ ๋ชจ๋“œ ํด๋Ÿฌ์Šคํ„ฐ(๋งˆ์Šคํ„ฐ VM 3๊ฐœ)๋Š” Flink ๊ณ ๊ฐ€์šฉ์„ฑ ๋ชจ๋“œ๋ฅผ ์ง€์›ํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

Google Cloud ์ฝ˜์†”, Google Cloud CLI ๋˜๋Š” Dataproc API์—์„œ Dataproc Jobs ๋ฆฌ์†Œ์Šค๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ Flink ์ž‘์—…์„ ์‹คํ–‰ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์ฝ˜์†”

์ฝ˜์†”์—์„œ ์ƒ˜ํ”Œ Flink ์›Œ๋“œ์นด์šดํŠธ ์ž‘์—…์„ ์ œ์ถœํ•˜๋ ค๋ฉด ๋‹ค์Œ ์•ˆ๋‚ด๋ฅผ ๋”ฐ๋ฅด์„ธ์š”.

  1. ๋ธŒ๋ผ์šฐ์ €์˜Google Cloud ์ฝ˜์†”์—์„œ Dataproc ์ž‘์—… ์ œ์ถœ ํŽ˜์ด์ง€๋ฅผ ์—ฝ๋‹ˆ๋‹ค.

  2. ์ž‘์—… ์ œ์ถœ ํŽ˜์ด์ง€์˜ ํ•„๋“œ๋ฅผ ์ž‘์„ฑํ•ฉ๋‹ˆ๋‹ค.

    1. ํด๋Ÿฌ์Šคํ„ฐ ๋ชฉ๋ก์—์„œ ํด๋Ÿฌ์Šคํ„ฐ ์ด๋ฆ„์„ ์„ ํƒํ•ฉ๋‹ˆ๋‹ค.
    2. ์ž‘์—… ์œ ํ˜•์„ Flink๋กœ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค.
    3. ๊ธฐ๋ณธ ํด๋ž˜์Šค ๋˜๋Š” jar์„ org.apache.flink.examples.java.wordcount.WordCount๋กœ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค.
    4. JAR ํŒŒ์ผ์„ file:///usr/lib/flink/examples/batch/WordCount.jar๋กœ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค.
      • file:///์€ ํด๋Ÿฌ์Šคํ„ฐ์— ์žˆ๋Š” ํŒŒ์ผ์„ ๋‚˜ํƒ€๋ƒ…๋‹ˆ๋‹ค. Flink ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ๋งŒ๋“ค ๋•Œ Dataproc์ด WordCount.jar์„ ์„ค์น˜ํ–ˆ์Šต๋‹ˆ๋‹ค.
      • ์ด ํ•„๋“œ๋Š” Cloud Storage ๊ฒฝ๋กœ(gs://BUCKET/JARFILE) ๋˜๋Š” Hadoop ๋ถ„์‚ฐ ํŒŒ์ผ ์‹œ์Šคํ…œ(HDFS) ๊ฒฝ๋กœ(hdfs://PATH_TO_JAR)๋„ ํ—ˆ์šฉํ•ฉ๋‹ˆ๋‹ค.
  3. ์ œ์ถœ์„ ํด๋ฆญํ•ฉ๋‹ˆ๋‹ค.

    • ์ž‘์—… ๋“œ๋ผ์ด๋ฒ„ ์ถœ๋ ฅ์€ ์ž‘์—… ์„ธ๋ถ€์ •๋ณด ํŽ˜์ด์ง€์— ํ‘œ์‹œ๋ฉ๋‹ˆ๋‹ค.
    • Flink ์ž‘์—…์€ Google Cloud ์ฝ˜์†”์˜ Dataproc ์ž‘์—… ํŽ˜์ด์ง€์— ๋‚˜์—ด๋ฉ๋‹ˆ๋‹ค.
    • ์ž‘์—… ๋˜๋Š” ์ž‘์—… ์„ธ๋ถ€์ •๋ณด ํŽ˜์ด์ง€์—์„œ ์ค‘์ง€ ๋˜๋Š” ์‚ญ์ œ๋ฅผ ํด๋ฆญํ•˜์—ฌ ์ž‘์—…์„ ์ค‘์ง€ํ•˜๊ฑฐ๋‚˜ ์‚ญ์ œํ•ฉ๋‹ˆ๋‹ค.

gcloud

Dataproc Flink ํด๋Ÿฌ์Šคํ„ฐ์— Flink ์ž‘์—…์„ ์ œ์ถœํ•˜๋ ค๋ฉด gcloud CLI gcloud dataproc jobs submit ๋ช…๋ น์–ด๋ฅผ ํ„ฐ๋ฏธ๋„ ์ฐฝ์—์„œ ๋กœ์ปฌ๋กœ ๋˜๋Š”Cloud Shell์—์„œ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

gcloud dataproc jobs submit flink \
    --cluster=CLUSTER_NAME \
    --region=REGION \
    --class=MAIN_CLASS \
    --jar=JAR_FILE \
    -- JOB_ARGS

์ฐธ๊ณ :

  • CLUSTER_NAME: ์ž‘์—…์„ ์ œ์ถœํ•  Dataproc Flink ํด๋Ÿฌ์Šคํ„ฐ์˜ ์ด๋ฆ„์„ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค.
  • REGION: ํด๋Ÿฌ์Šคํ„ฐ๊ฐ€ ์žˆ๋Š” Compute Engine ๋ฆฌ์ „์„ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค.
  • MAIN_CLASS: Flink ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์˜ main ํด๋ž˜์Šค๋ฅผ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.
    • org.apache.flink.examples.java.wordcount.WordCount
  • JAR_FILE: Flink ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ jar ํŒŒ์ผ์„ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค. ๋‹ค์Œ ํ•ญ๋ชฉ์„ ์ง€์ •ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
    • file:///` ํ”„๋ฆฌํ”ฝ์Šค๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํด๋Ÿฌ์Šคํ„ฐ์— ์„ค์น˜๋œ jar ํŒŒ์ผ:
      • file:///usr/lib/flink/examples/streaming/TopSpeedWindowing.jar
      • file:///usr/lib/flink/examples/batch/WordCount.jar
    • Cloud Storage์˜ jar ํŒŒ์ผ: gs://BUCKET/JARFILE
    • HDFS์˜ jar ํŒŒ์ผ: hdfs://PATH_TO_JAR
  • JOB_ARGS: ์„ ํƒ์ ์œผ๋กœ ์ด์ค‘ ๋Œ€์‹œ(--) ๋’ค์— ์ž‘์—… ์ธ์ˆ˜๋ฅผ ์ถ”๊ฐ€ํ•ฉ๋‹ˆ๋‹ค.

  • ์ž‘์—…์„ ์ œ์ถœํ•˜๋ฉด ์ž‘์—… ๋“œ๋ผ์ด๋ฒ„ ์ถœ๋ ฅ์ด ๋กœ์ปฌ ๋˜๋Š” Cloud Shell ํ„ฐ๋ฏธ๋„์— ํ‘œ์‹œ๋ฉ๋‹ˆ๋‹ค.

    Program execution finished
    Job with JobID 829d48df4ebef2817f4000dfba126e0f has finished.
    Job Runtime: 13610 ms
    ...
    (after,1)
    (and,12)
    (arrows,1)
    (ay,1)
    (be,4)
    (bourn,1)
    (cast,1)
    (coil,1)
    (come,1)

REST

์ด ์„น์…˜์—์„œ๋Š” Dataproc jobs.submit API๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ Dataproc Flink ํด๋Ÿฌ์Šคํ„ฐ์— Flink ์ž‘์—…์„ ์ œ์ถœํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค.

์š”์ฒญ ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•˜๊ธฐ ์ „์— ๋‹ค์Œ์„ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค.

  • PROJECT_ID: Google Cloud ํ”„๋กœ์ ํŠธ ID์ž…๋‹ˆ๋‹ค.
  • REGION: ํด๋Ÿฌ์Šคํ„ฐ ๋ฆฌ์ „์ž…๋‹ˆ๋‹ค.
  • CLUSTER_NAME: ์ž‘์—…์„ ์ œ์ถœํ•  Dataproc Flink ํด๋Ÿฌ์Šคํ„ฐ์˜ ์ด๋ฆ„์„ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค.

HTTP ๋ฉ”์„œ๋“œ ๋ฐ URL:

POST https://dataproc.googleapis.com/v1/projects/PROJECT_ID/regions/REGION/jobs:submit

JSON ์š”์ฒญ ๋ณธ๋ฌธ:

{
  "job": {
    "placement": {
      "clusterName": "CLUSTER_NAME"
    },
    "flinkJob": {
      "mainClass": "org.apache.flink.examples.java.wordcount.WordCount",
      "jarFileUris": [
        "file:///usr/lib/flink/examples/batch/WordCount.jar"
      ]
    }
  }
}

์š”์ฒญ์„ ๋ณด๋‚ด๋ ค๋ฉด ๋‹ค์Œ ์˜ต์…˜ ์ค‘ ํ•˜๋‚˜๋ฅผ ํŽผ์นฉ๋‹ˆ๋‹ค.

๋‹ค์Œ๊ณผ ๋น„์Šทํ•œ JSON ์‘๋‹ต์ด ํ‘œ์‹œ๋ฉ๋‹ˆ๋‹ค.

{
  "reference": {
    "projectId": "PROJECT_ID",
    "jobId": "JOB_ID"
  },
  "placement": {
    "clusterName": "CLUSTER_NAME",
    "clusterUuid": "CLUSTER_UUID"
  },
  "flinkJob": {
    "mainClass": "org.apache.flink.examples.java.wordcount.WordCount",
    "args": [
      "1000"
    ],
    "jarFileUris": [
      "file:///usr/lib/flink/examples/batch/WordCount.jar"
    ]
  },
  "status": {
    "state": "PENDING",
    "stateStartTime": "2020-10-07T20:16:21.759Z"
  },
  "jobUuid": "JOB_UUID"
}
  • Flink ์ž‘์—…์€ Google Cloud ์ฝ˜์†”์˜ Dataproc ์ž‘์—… ํŽ˜์ด์ง€์— ๋‚˜์—ด๋ฉ๋‹ˆ๋‹ค.
  • Google Cloud ์ฝ˜์†”์˜ ์ž‘์—… ๋˜๋Š” ์ž‘์—… ์„ธ๋ถ€์ •๋ณด ํŽ˜์ด์ง€์—์„œ ์ค‘์ง€ ๋˜๋Š” ์‚ญ์ œ๋ฅผ ํด๋ฆญํ•˜์—ฌ ์ž‘์—…์„ ์ค‘์ง€ํ•˜๊ฑฐ๋‚˜ ์‚ญ์ œํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Dataproc Jobs ๋ฆฌ์†Œ์Šค๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ Flink ์ž‘์—…์„ ์‹คํ–‰ํ•˜๋Š” ๋Œ€์‹  flink CLI๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ Flink ํด๋Ÿฌ์Šคํ„ฐ์˜ ๋งˆ์Šคํ„ฐ ๋…ธ๋“œ์—์„œ Flink ์ž‘์—…์„ ์‹คํ–‰ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๋‹ค์Œ ์„น์…˜์—์„œ๋Š” Dataproc Flink ํด๋Ÿฌ์Šคํ„ฐ์—์„œ flink CLI ์ž‘์—…์„ ์‹คํ–‰ํ•˜๋Š” ๋‹ค์–‘ํ•œ ๋ฐฉ๋ฒ•์„ ์„ค๋ช…ํ•ฉ๋‹ˆ๋‹ค.

  1. ๋งˆ์Šคํ„ฐ ๋…ธ๋“œ์— SSH๋กœ ์—ฐ๊ฒฐ: SSH ์œ ํ‹ธ๋ฆฌํ‹ฐ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํด๋Ÿฌ์Šคํ„ฐ ๋งˆ์Šคํ„ฐ VM์˜ ํ„ฐ๋ฏธ๋„ ์ฐฝ์„ ์—ฝ๋‹ˆ๋‹ค.

  2. classpath ์„ค์ •: Flink ํด๋Ÿฌ์Šคํ„ฐ ๋งˆ์Šคํ„ฐ VM์˜ SSH ํ„ฐ๋ฏธ๋„ ์ฐฝ์—์„œ Hadoop ํด๋ž˜์Šค ๊ฒฝ๋กœ๋ฅผ ์ดˆ๊ธฐํ™”ํ•ฉ๋‹ˆ๋‹ค.

    export HADOOP_CLASSPATH=$(hadoop classpath)
    
  3. Flink ์ž‘์—… ์‹คํ–‰: ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜, ์ž‘์—…๋ณ„, ์„ธ์…˜ ๋ชจ๋“œ ๋“ฑ ๋‹ค์–‘ํ•œ YARN ๋ฐฐํฌ ๋ชจ๋“œ์—์„œ Flink ์ž‘์—…์„ ์‹คํ–‰ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

    1. ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ๋ชจ๋“œ: Flink ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ๋ชจ๋“œ๋Š” Dataproc ์ด๋ฏธ์ง€ ๋ฒ„์ „ 2.0 ์ด์ƒ์—์„œ ์ง€์›๋ฉ๋‹ˆ๋‹ค. ์ด ๋ชจ๋“œ๋Š” YARN ์ž‘์—… ๊ด€๋ฆฌ์ž์—์„œ ์ž‘์—…์˜ main() ๋ฉ”์„œ๋“œ๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค. ์ž‘์—…์ด ์™„๋ฃŒ๋˜๋ฉด ํด๋Ÿฌ์Šคํ„ฐ๊ฐ€ ์ข…๋ฃŒ๋ฉ๋‹ˆ๋‹ค.

      ์ž‘์—… ์ œ์ถœ ์˜ˆ์‹œ:

      flink run-application \
          -t yarn-application \
          -Djobmanager.memory.process.size=1024m \
          -Dtaskmanager.memory.process.size=2048m \
          -Djobmanager.heap.mb=820 \
          -Dtaskmanager.heap.mb=1640 \
          -Dtaskmanager.numberOfTaskSlots=2 \
          -Dparallelism.default=4 \
          /usr/lib/flink/examples/batch/WordCount.jar
      

      ์‹คํ–‰ ์ค‘์ธ ์ž‘์—… ๋‚˜์—ด:

      ./bin/flink list -t yarn-application -Dyarn.application.id=application_XXXX_YY
      

      ์‹คํ–‰ ์ค‘์ธ ์ž‘์—… ์ทจ์†Œ:

      ./bin/flink cancel -t yarn-application -Dyarn.application.id=application_XXXX_YY <jobId>
      
    2. ์ž‘์—…๋ณ„ ๋ชจ๋“œ: ์ด Flink ๋ชจ๋“œ๋Š” ํด๋ผ์ด์–ธํŠธ ์ธก์—์„œ ์ž‘์—…์˜ main() ๋ฉ”์„œ๋“œ๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

      ์ž‘์—… ์ œ์ถœ ์˜ˆ์‹œ:

      flink run \
          -m yarn-cluster \
          -p 4 \
          -ys 2 \
          -yjm 1024m \
          -ytm 2048m \
          /usr/lib/flink/examples/batch/WordCount.jar
      
    3. ์„ธ์…˜ ๋ชจ๋“œ: ์žฅ๊ธฐ ์‹คํ–‰ Flink YARN ์„ธ์…˜์„ ์‹œ์ž‘ํ•œ ํ›„ ํ•˜๋‚˜ ์ด์ƒ์˜ ์ž‘์—…์„ ์„ธ์…˜์— ์ œ์ถœํ•ฉ๋‹ˆ๋‹ค.

      1. ์„ธ์…˜ ์‹œ์ž‘: ๋‹ค์Œ ๋ฐฉ๋ฒ• ์ค‘ ํ•˜๋‚˜๋กœ Flink ์„ธ์…˜์„ ์‹œ์ž‘ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

        1. Flink ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ๋งŒ๋“ค๊ณ  gcloud dataproc clusters create ๋ช…๋ น์–ด์— --metadata flink-start-yarn-session=true ํ”Œ๋ž˜๊ทธ๋ฅผ ์ถ”๊ฐ€ํ•ฉ๋‹ˆ๋‹ค(Dataproc Flink ํด๋Ÿฌ์Šคํ„ฐ ๋งŒ๋“ค๊ธฐ ์ฐธ์กฐ). ์ด ํ”Œ๋ž˜๊ทธ๋ฅผ ์‚ฌ์šฉ ์„ค์ •ํ•˜๋ฉด ํด๋Ÿฌ์Šคํ„ฐ๊ฐ€ ์ƒ์„ฑ๋œ ํ›„ Dataproc์ด /usr/bin/flink-yarn-daemon์„ ์‹คํ–‰ํ•˜์—ฌ ํด๋Ÿฌ์Šคํ„ฐ์—์„œ Flink ์„ธ์…˜์„ ์‹œ์ž‘ํ•ฉ๋‹ˆ๋‹ค.

          ์„ธ์…˜์˜ YARN ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ID๊ฐ€ /tmp/.yarn-properties-${USER}์— ์ €์žฅ๋ฉ๋‹ˆ๋‹ค. yarn application -list ๋ช…๋ น์–ด๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ID๋ฅผ ๋‚˜์—ดํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

        2. ์ปค์Šคํ…€ ์„ค์ •์œผ๋กœ ํด๋Ÿฌ์Šคํ„ฐ ๋งˆ์Šคํ„ฐ VM์— ์‚ฌ์ „ ์„ค์น˜๋œ yarn-session.sh ์Šคํฌ๋ฆฝํŠธ๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

          ์ปค์Šคํ…€ ์„ค์ •์˜ ์˜ˆ์‹œ:

          /usr/lib/flink/bin/yarn-session.sh \
              -s 1 \
              -jm 1024m \
              -tm 2048m \
              -nm flink-dataproc \
              --detached
          
        3. ๊ธฐ๋ณธ ์„ค์ •์œผ๋กœ Flink /usr/bin/flink-yarn-daemon ๋ž˜ํผ ์Šคํฌ๋ฆฝํŠธ๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

          . /usr/bin/flink-yarn-daemon
          
      2. ์„ธ์…˜์— ์ž‘์—… ์ œ์ถœ: ๋‹ค์Œ ๋ช…๋ น์–ด๋ฅผ ์‹คํ–‰ํ•˜์—ฌ Flink ์ž‘์—…์„ ์„ธ์…˜์— ์ œ์ถœํ•ฉ๋‹ˆ๋‹ค.

        flink run -m <var>FLINK_MASTER_URL</var>/usr/lib/flink/examples/batch/WordCount.jar
        
        • FLINK_MASTER_URL: ํ˜ธ์ŠคํŒ…๊ณผ ํฌํŒ…์„ ํฌํ•จํ•œ ์ž‘์—…์ด ์‹คํ–‰๋˜๋Š” Flink ๋งˆ์Šคํ„ฐ VM์˜ URL์ž…๋‹ˆ๋‹ค. URL์—์„œ http:// prefix๋ฅผ ์‚ญ์ œํ•ฉ๋‹ˆ๋‹ค. ์ด URL์€ Flink ์„ธ์…˜์„ ์‹œ์ž‘ํ•  ๋•Œ ๋ช…๋ น์–ด ์ถœ๋ ฅ์— ๋‚˜์—ด๋ฉ๋‹ˆ๋‹ค. ๋‹ค์Œ ๋ช…๋ น์–ด๋ฅผ ์‹คํ–‰ํ•˜์—ฌ ์ด URL์„ Tracking-URL ํ•„๋“œ์— ๋‚˜์—ดํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
        yarn application -list -appId=<yarn-app-id> | sed 's#http://##'
           ```
        
      3. ์„ธ์…˜์—์„œ ์ž‘์—… ๋‚˜์—ด: ์„ธ์…˜์—์„œ Flink ์ž‘์—…์„ ๋‚˜์—ดํ•˜๋ ค๋ฉด ๋‹ค์Œ ์ค‘ ํ•˜๋‚˜๋ฅผ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

        • ์ธ์ˆ˜ ์—†์ด flink list๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค. ์ด ๋ช…๋ น์–ด๋Š” /tmp/.yarn-properties-${USER}์—์„œ ์„ธ์…˜์˜ YARN ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ID๋ฅผ ์ฐพ์Šต๋‹ˆ๋‹ค.

        • /tmp/.yarn-properties-${USER} ๋˜๋Š” yarn application -list์˜ ์ถœ๋ ฅ์—์„œ ์„ธ์…˜์˜ YARN ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ID๋ฅผ ๊ฐ€์ ธ์˜จ ๋‹ค์Œ <code> flink ๋ชฉ๋ก -yid YARN_APPLICATION_ID๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

        • flink list -m FLINK_MASTER_URL๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

      4. ์„ธ์…˜ ์ค‘์ง€: ์„ธ์…˜์„ ์ค‘์ง€ํ•˜๋ ค๋ฉด /tmp/.yarn-properties-${USER} ๋˜๋Š” yarn application -list ์ถœ๋ ฅ์—์„œ ์„ธ์…˜์˜ YARN ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ID๋ฅผ ๊ฐ€์ ธ์˜จ ํ›„ ๋‹ค์Œ ๋ช…๋ น์–ด ์ค‘ ํ•˜๋‚˜๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

        echo "stop" | /usr/lib/flink/bin/yarn-session.sh -id YARN_APPLICATION_ID
        
        yarn application -kill YARN_APPLICATION_ID
        

Dataproc์—์„œ FlinkRunner๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ Apache Beam์„ ์‹คํ–‰ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๋‹ค์Œ๊ณผ ๊ฐ™์€ ๋ฐฉ๋ฒ•์œผ๋กœ Flink์—์„œ Beam ์ž‘์—…์„ ์‹คํ–‰ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

  1. ์ž๋ฐ” Beam ์ž‘์—…
  2. ํฌํ„ฐ๋ธ” Beam ์ž‘์—…

์ž๋ฐ” Beam ์ž‘์—…

Beam ์ž‘์—…์„ JAR ํŒŒ์ผ๋กœ ํŒจํ‚ค์ง•ํ•ฉ๋‹ˆ๋‹ค. ์ž‘์—…์„ ์‹คํ–‰ํ•˜๋Š” ๋ฐ ํ•„์š”ํ•œ ์ข…์† ํ•ญ๋ชฉ์ด ํฌํ•จ๋œ ๋ฒˆ๋“ค JAR ํŒŒ์ผ์„ ์ œ๊ณตํ•˜์„ธ์š”.

๋‹ค์Œ ์˜ˆ์‹œ์—์„œ๋Š” Dataproc ํด๋Ÿฌ์Šคํ„ฐ์˜ ๋งˆ์Šคํ„ฐ ๋…ธ๋“œ์—์„œ ์ž๋ฐ” Beam ์ž‘์—…์„ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

  1. Flink ๊ตฌ์„ฑ์š”์†Œ๋ฅผ ์‚ฌ์šฉ ์„ค์ •ํ•˜์—ฌ Dataproc ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ๋งŒ๋“ญ๋‹ˆ๋‹ค.

    gcloud dataproc clusters create CLUSTER_NAME \
        --optional-components=FLINK \
        --image-version=DATAPROC_IMAGE_VERSION \
        --region=REGION \
        --enable-component-gateway \
        --scopes=https://www.googleapis.com/auth/cloud-platform
    
    • --optional-components: Flink.
    • --image-version: ํด๋Ÿฌ์Šคํ„ฐ์— ์„ค์น˜๋œ Flink ๋ฒ„์ „์„ ๊ฒฐ์ •ํ•˜๋Š” ํด๋Ÿฌ์Šคํ„ฐ์˜ ์ด๋ฏธ์ง€ ๋ฒ„์ „. ์˜ˆ๋ฅผ ๋“ค์–ด ์ตœ์‹  ๋ฐ ์ด์ „ 2.0.x ์ด๋ฏธ์ง€ ์ถœ์‹œ ๋ฒ„์ „ 4๊ฐœ์— ๋Œ€ํ•˜์—ฌ ๋‚˜์—ด๋œ Apache Flink ๊ตฌ์„ฑ์š”์†Œ ๋ฒ„์ „์„ ์ฐธ์กฐํ•˜์„ธ์š”.
    • --region: ์ง€์›๋˜๋Š” Dataproc ๋ฆฌ์ „
    • --enable-component-gateway: Flink Job Manager UI์— ๋Œ€ํ•œ ์•ก์„ธ์Šค ์‚ฌ์šฉ ์„ค์ •
    • --scopes: ํด๋Ÿฌ์Šคํ„ฐ๋ณ„๋กœ Google Cloud API์— ๋Œ€ํ•œ ์•ก์„ธ์Šค๋ฅผ ์‚ฌ์šฉ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค(๋ฒ”์œ„ ๊ถŒ์žฅ์‚ฌํ•ญ ์ฐธ๊ณ ). Dataproc ์ด๋ฏธ์ง€ ๋ฒ„์ „ 2.1 ์ด์ƒ์„ ์‚ฌ์šฉํ•˜๋Š” ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ๋งŒ๋“ค ๋•Œ cloud-platform ๋ฒ”์œ„๊ฐ€ ๊ธฐ๋ณธ์ ์œผ๋กœ ์‚ฌ์šฉ ์„ค์ •๋ฉ๋‹ˆ๋‹ค(์ด ํ”Œ๋ž˜๊ทธ ์„ค์ •์„ ํฌํ•จํ•˜์ง€ ์•Š์•„๋„ ๋จ).
  2. SSH ์œ ํ‹ธ๋ฆฌํ‹ฐ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ Flink ํด๋Ÿฌ์Šคํ„ฐ ๋งˆ์Šคํ„ฐ ๋…ธ๋“œ์—์„œ ํ„ฐ๋ฏธ๋„ ์ฐฝ์„ ์—ฝ๋‹ˆ๋‹ค.

  3. Dataproc ํด๋Ÿฌ์Šคํ„ฐ ๋งˆ์Šคํ„ฐ ๋…ธ๋“œ์—์„œ Flink YARN ์„ธ์…˜์„ ์‹œ์ž‘ํ•ฉ๋‹ˆ๋‹ค.

    . /usr/bin/flink-yarn-daemon
    

    Dataproc ํด๋Ÿฌ์Šคํ„ฐ์˜ Flink ๋ฒ„์ „์„ ๊ธฐ๋กํ•ฉ๋‹ˆ๋‹ค.

    flink --version
    
  4. ๋กœ์ปฌ ๋จธ์‹ ์—์„œ ์ž๋ฐ”์˜ ํ‘œ์ค€ Beam ๋‹จ์–ด ์ˆ˜ ์˜ˆ์‹œ๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค.

    Dataproc ํด๋Ÿฌ์Šคํ„ฐ์˜ Flink ๋ฒ„์ „๊ณผ ํ˜ธํ™˜๋˜๋Š” Beam ๋ฒ„์ „์„ ์„ ํƒํ•ฉ๋‹ˆ๋‹ค. Beam-Flink ๋ฒ„์ „ ํ˜ธํ™˜์„ฑ์ด ๋‚˜์—ด๋œ Flink ๋ฒ„์ „ ํ˜ธํ™˜์„ฑ ํ‘œ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

    ์ƒ์„ฑ๋œ POM ํŒŒ์ผ์„ ์—ฝ๋‹ˆ๋‹ค. <flink.artifact.name> ํƒœ๊ทธ๋กœ ์ง€์ •๋œ Beam Flink ์‹คํ–‰๊ธฐ ๋ฒ„์ „์„ ํ™•์ธํ•ฉ๋‹ˆ๋‹ค. Flink ์•„ํ‹ฐํŒฉํŠธ ์ด๋ฆ„์˜ Beam Flink ์‹คํ–‰๊ธฐ ๋ฒ„์ „์ด ํด๋Ÿฌ์Šคํ„ฐ์˜ Flink ๋ฒ„์ „๊ณผ ์ผ์น˜ํ•˜์ง€ ์•Š์œผ๋ฉด ์ผ์น˜ํ•˜๋„๋ก ๋ฒ„์ „ ๋ฒˆํ˜ธ๋ฅผ ์—…๋ฐ์ดํŠธํ•ฉ๋‹ˆ๋‹ค.

    mvn archetype:generate \
        -DarchetypeGroupId=org.apache.beam \
        -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
        -DarchetypeVersion=BEAM_VERSION \
        -DgroupId=org.example \
        -DartifactId=word-count-beam \
        -Dversion="0.1" \
        -Dpackage=org.apache.beam.examples \
        -DinteractiveMode=false
    
  5. ๋‹จ์–ด ์ˆ˜ ์˜ˆ์‹œ๋ฅผ ํŒจํ‚ค์ง•ํ•ฉ๋‹ˆ๋‹ค.

    mvn package -Pflink-runner
    
  6. ํŒจํ‚ค์ง•๋œ uber JAR ํŒŒ์ผ word-count-beam-bundled-0.1.jar(135MB ์ดํ•˜)๋ฅผ Dataproc ํด๋Ÿฌ์Šคํ„ฐ์˜ ๋งˆ์Šคํ„ฐ ๋…ธ๋“œ์— ์—…๋กœ๋“œํ•ฉ๋‹ˆ๋‹ค. gcloud storage cp๋ฅผ ์‚ฌ์šฉํ•˜๋ฉด Cloud Storage์—์„œ Dataproc ํด๋Ÿฌ์Šคํ„ฐ๋กœ ํŒŒ์ผ์„ ๋” ๋น ๋ฅด๊ฒŒ ์ „์†กํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

    1. ๋กœ์ปฌ ํ„ฐ๋ฏธ๋„์—์„œ Cloud Storage ๋ฒ„ํ‚ท์„ ๋งŒ๋“ค๊ณ  uber JAR์„ ์—…๋กœ๋“œํ•ฉ๋‹ˆ๋‹ค.

      gcloud storage buckets create BUCKET_NAME
      
      gcloud storage cp target/word-count-beam-bundled-0.1.jar gs://BUCKET_NAME/
      
    2. Dataproc์˜ ๋งˆ์Šคํ„ฐ ๋…ธ๋“œ์—์„œ uber JAR์„ ๋‹ค์šด๋กœ๋“œํ•ฉ๋‹ˆ๋‹ค.

      gcloud storage cp gs://BUCKET_NAME/word-count-beam-bundled-0.1.jar .
      
  7. Dataproc ํด๋Ÿฌ์Šคํ„ฐ์˜ ๋งˆ์Šคํ„ฐ ๋…ธ๋“œ์—์„œ ์ž๋ฐ” Beam ์ž‘์—…์„ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

    flink run -c org.apache.beam.examples.WordCount word-count-beam-bundled-0.1.jar \
        --runner=FlinkRunner \
        --output=gs://BUCKET_NAME/java-wordcount-out
    
  8. ๊ฒฐ๊ณผ๊ฐ€ Cloud Storage ๋ฒ„ํ‚ท์— ๊ธฐ๋ก๋˜์—ˆ๋Š”์ง€ ํ™•์ธํ•ฉ๋‹ˆ๋‹ค.

    gcloud storage cat gs://BUCKET_NAME/java-wordcount-out-SHARD_ID
    
  9. Flink YARN ์„ธ์…˜์„ ์ค‘์ง€ํ•ฉ๋‹ˆ๋‹ค.

    yarn application -list
    
    yarn application -kill YARN_APPLICATION_ID
    

ํฌํ„ฐ๋ธ” Beam ์ž‘์—…

Python, Go, ๊ธฐํƒ€ ์ง€์›๋˜๋Š” ์–ธ์–ด๋กœ Beam ์ž‘์—…์„ ์‹คํ–‰ํ•˜๋ ค๋ฉด Beam์˜ Flink Runner์— ์„ค๋ช…๋œ ๋Œ€๋กœ FlinkRunner ๋ฐ PortableRunner๋ฅผ ์‚ฌ์šฉํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค. ์ด๋™์„ฑ ํ”„๋ ˆ์ž„์›Œํฌ ๋กœ๋“œ๋งต๋„ ์ฐธ์กฐํ•˜์„ธ์š”.

๋‹ค์Œ ์˜ˆ์‹œ์—์„œ๋Š” Dataproc ํด๋Ÿฌ์Šคํ„ฐ์˜ ๋งˆ์Šคํ„ฐ ๋…ธ๋“œ์—์„œ ์ด๋™ ๊ฐ€๋Šฅํ•œ Beam ์ž‘์—…์„ Python์œผ๋กœ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

  1. Flink ๋ฐ Docker ๊ตฌ์„ฑ์š”์†Œ๊ฐ€ ๋ชจ๋‘ ์‚ฌ์šฉ ์„ค์ •๋œ Dataproc ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ๋งŒ๋“ญ๋‹ˆ๋‹ค.

    gcloud dataproc clusters create CLUSTER_NAME \
        --optional-components=FLINK,DOCKER \
        --image-version=DATAPROC_IMAGE_VERSION \
        --region=REGION \
        --enable-component-gateway \
        --scopes=https://www.googleapis.com/auth/cloud-platform
    

    ์ฐธ๊ณ :

    • --optional-components: Flink ๋ฐ Docker
    • --image-version: ํด๋Ÿฌ์Šคํ„ฐ์— ์„ค์น˜๋œ Flink ๋ฒ„์ „์„ ๊ฒฐ์ •ํ•˜๋Š” ํด๋Ÿฌ์Šคํ„ฐ์˜ ์ด๋ฏธ์ง€ ๋ฒ„์ „. ์˜ˆ๋ฅผ ๋“ค์–ด ์ตœ์‹  ๋ฐ ์ด์ „ 2.0.x ์ด๋ฏธ์ง€ ์ถœ์‹œ ๋ฒ„์ „ 4๊ฐœ์— ๋Œ€ํ•˜์—ฌ ๋‚˜์—ด๋œ Apache Flink ๊ตฌ์„ฑ์š”์†Œ ๋ฒ„์ „์„ ์ฐธ์กฐํ•˜์„ธ์š”.
    • --region: ์‚ฌ์šฉ ๊ฐ€๋Šฅํ•œ Dataproc ๋ฆฌ์ „
    • --enable-component-gateway: Flink Job Manager UI์— ๋Œ€ํ•œ ์•ก์„ธ์Šค ์‚ฌ์šฉ ์„ค์ •
    • --scopes: ํด๋Ÿฌ์Šคํ„ฐ๋ณ„๋กœ Google Cloud API์— ๋Œ€ํ•œ ์•ก์„ธ์Šค๋ฅผ ์‚ฌ์šฉ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค(๋ฒ”์œ„ ๊ถŒ์žฅ์‚ฌํ•ญ ์ฐธ๊ณ ). Dataproc ์ด๋ฏธ์ง€ ๋ฒ„์ „ 2.1 ์ด์ƒ์„ ์‚ฌ์šฉํ•˜๋Š” ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ๋งŒ๋“ค ๋•Œ cloud-platform ๋ฒ”์œ„๊ฐ€ ๊ธฐ๋ณธ์ ์œผ๋กœ ์‚ฌ์šฉ ์„ค์ •๋ฉ๋‹ˆ๋‹ค(์ด ํ”Œ๋ž˜๊ทธ ์„ค์ •์„ ํฌํ•จํ•˜์ง€ ์•Š์•„๋„ ๋จ).
  2. ๋กœ์ปฌ ๋˜๋Š” Cloud Shell์—์„œ gcloud CLI๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ Cloud Storage ๋ฒ„ํ‚ท์„ ๋งŒ๋“ญ๋‹ˆ๋‹ค. ์ƒ˜ํ”Œ WordCount ํ”„๋กœ๊ทธ๋žจ์„ ์‹คํ–‰ํ•  ๋•Œ BUCKET_NAME์„ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค.

    gcloud storage buckets create BUCKET_NAME
    
  3. ํด๋Ÿฌ์Šคํ„ฐ VM์˜ ํ„ฐ๋ฏธ๋„ ์ฐฝ์—์„œ Flink YARN ์„ธ์…˜์„ ์‹œ์ž‘ํ•ฉ๋‹ˆ๋‹ค. ์ž‘์—…์ด ์‹คํ–‰๋˜๋Š” Flink ๋งˆ์Šคํ„ฐ์˜ ์ฃผ์†Œ์ธ Flink ๋งˆ์Šคํ„ฐ URL์„ ํ™•์ธํ•ฉ๋‹ˆ๋‹ค. ์ƒ˜ํ”Œ WordCount ํ”„๋กœ๊ทธ๋žจ์„ ์‹คํ–‰ํ•  ๋•Œ FLINK_MASTER_URL์„ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค.

    . /usr/bin/flink-yarn-daemon
    

    Dataproc ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ์‹คํ–‰ํ•˜๋Š” Flink ๋ฒ„์ „์„ ํ‘œ์‹œํ•ฉ๋‹ˆ๋‹ค. ์ƒ˜ํ”Œ WordCount ํ”„๋กœ๊ทธ๋žจ์„ ์‹คํ–‰ํ•  ๋•Œ FLINK_VERSION์„ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค.

    flink --version
    
  4. ์ž‘์—…์— ํ•„์š”ํ•œ Python ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผ ํด๋Ÿฌ์Šคํ„ฐ ๋งˆ์Šคํ„ฐ ๋…ธ๋“œ์— ์„ค์น˜ํ•ฉ๋‹ˆ๋‹ค.

  5. ํด๋Ÿฌ์Šคํ„ฐ์˜ Flink ๋ฒ„์ „๊ณผ ํ˜ธํ™˜๋˜๋Š” Beam ๋ฒ„์ „์„ ์„ค์น˜ํ•ฉ๋‹ˆ๋‹ค.

    python -m pip install apache-beam[gcp]==BEAM_VERSION
    
  6. ํด๋Ÿฌ์Šคํ„ฐ ๋งˆ์Šคํ„ฐ ๋…ธ๋“œ์—์„œ ์›Œ๋“œ ์ˆ˜ ์˜ˆ์‹œ๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

    python -m apache_beam.examples.wordcount \
        --runner=FlinkRunner \
        --flink_version=FLINK_VERSION \
        --flink_master=FLINK_MASTER_URL
        --flink_submit_uber_jar \
        --output=gs://BUCKET_NAME/python-wordcount-out
    

    ์ฐธ๊ณ :

    • --runner: FlinkRunner.
    • --flink_version: FLINK_VERSION, ์•ž์„œ ์–ธ๊ธ‰๋จ
    • --flink_master: FLINK_MASTER_URL, ์•ž์„œ ์–ธ๊ธ‰๋จ
    • --flink_submit_uber_jar: uber JAR์„ ์‚ฌ์šฉํ•˜์—ฌ Beam ์ž‘์—… ์‹คํ–‰
    • --output: BUCKET_NAME, ์•ž์„œ ์ƒ์„ฑ๋จ
  7. ๋ฒ„ํ‚ท์— ๊ฒฐ๊ณผ๊ฐ€ ๊ธฐ๋ก๋˜์—ˆ๋Š”์ง€ ํ™•์ธํ•ฉ๋‹ˆ๋‹ค.

    gcloud storage cat gs://BUCKET_NAME/python-wordcount-out-SHARD_ID
    
  8. Flink YARN ์„ธ์…˜์„ ์ค‘์ง€ํ•ฉ๋‹ˆ๋‹ค.

    1. ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ID๋ฅผ ๊ฐ€์ ธ์˜ต๋‹ˆ๋‹ค.
    yarn application -list
    
    1. Insert the <var>YARN_APPLICATION_ID</var>, then stop the session.
    
    yarn application -kill 
    

Dataproc Flink ๊ตฌ์„ฑ์š”์†Œ๋Š” Kerberized ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ์ง€์›ํ•ฉ๋‹ˆ๋‹ค. Flink ์ž‘์—…์„ ์ œ์ถœํ•˜๊ณ  ์œ ์ง€ํ•˜๊ฑฐ๋‚˜ Flink ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ์‹œ์ž‘ํ•˜๋ ค๋ฉด ์œ ํšจํ•œ Kerberos ํ‹ฐ์ผ“์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. ๊ธฐ๋ณธ์ ์œผ๋กœ Kerberos ํ‹ฐ์ผ“์€ 7์ผ ๋™์•ˆ ์œ ํšจํ•ฉ๋‹ˆ๋‹ค.

Flink ์ž‘์—… ๋˜๋Š” Flink ์„ธ์…˜ ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ์‹คํ–‰ํ•˜๋Š” ๋™์•ˆ Flink ์ž‘์—… ๊ด€๋ฆฌ์ž ์›น ์ธํ„ฐํŽ˜์ด์Šค๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์›น ์ธํ„ฐํŽ˜์ด์Šค๋ฅผ ์‚ฌ์šฉํ•˜๋ ค๋ฉด ๋‹ค์Œ ๋‹จ๊ณ„๋ฅผ ๋”ฐ๋ฅด์„ธ์š”.

  1. Dataproc Flink ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ๋งŒ๋“ญ๋‹ˆ๋‹ค.
  2. ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ๋งŒ๋“  ํ›„ Google Cloud ์ฝ˜์†”์˜ ํด๋Ÿฌ์Šคํ„ฐ ์„ธ๋ถ€์ •๋ณด ํŽ˜์ด์ง€์— ์žˆ๋Š” ์›น ์ธํ„ฐํŽ˜์ด์Šค ํƒญ์—์„œ ๊ตฌ์„ฑ์š”์†Œ ๊ฒŒ์ดํŠธ์›จ์ด YARN ResourceManager ๋งํฌ๋ฅผ ํด๋ฆญํ•ฉ๋‹ˆ๋‹ค.
  3. YARN Resource Manager UI์—์„œ Flink ํด๋Ÿฌ์Šคํ„ฐ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ํ•ญ๋ชฉ์„ ์‹๋ณ„ํ•ฉ๋‹ˆ๋‹ค. ์ž‘์—…์˜ ์™„๋ฃŒ ์ƒํƒœ์— ๋”ฐ๋ผ ApplicationMaster ๋˜๋Š” ๊ธฐ๋ก ๋งํฌ๊ฐ€ ๋‚˜์—ด๋ฉ๋‹ˆ๋‹ค.
  4. ์žฅ๊ธฐ ์‹คํ–‰ ์ŠคํŠธ๋ฆฌ๋ฐ ์ž‘์—…์˜ ๊ฒฝ์šฐ ApplicationManager ๋งํฌ๋ฅผ ํด๋ฆญํ•˜์—ฌ Flink ๋Œ€์‹œ๋ณด๋“œ๋ฅผ ์—ฝ๋‹ˆ๋‹ค. ์™„๋ฃŒ๋œ ์ž‘์—…์˜ ๊ฒฝ์šฐ ๊ธฐ๋ก ๋งํฌ๋ฅผ ํด๋ฆญํ•˜์—ฌ ์ž‘์—… ์„ธ๋ถ€์ •๋ณด๋ฅผ ํ™•์ธํ•ฉ๋‹ˆ๋‹ค.