Create a data pipeline

This quickstart shows you how to do the following:

  1. Create a Cloud Data Fusion instance.
  2. Deploy a sample pipeline that is provided with your Cloud Data Fusion instance. The pipeline does the following:
    1. Reads a JSON file containing NYT bestseller data from Cloud Storage.
    2. Runs transformations on the file to parse and clean the data.
    3. Loads into BigQuery the top-rated books that were added in the last week and cost less than $25.
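The three pipeline stages above (read, parse and clean, filter and load) can be approximated in plain Python to show what the sample pipeline computes. This is only an illustrative sketch, not the pipeline's actual implementation: the field names (title, price, rating, added), the sample records, and the reading of "top-rated" as "the highest rating in the filtered set" are all assumptions made for this example.

```python
import json
from datetime import date, timedelta

# Made-up sample records; the real file's field names and values may differ.
RAW_LINES = [
    '{"title": "Book A", "price": "19.99", "rating": 4.8, "added": "2024-05-06"}',
    '{"title": "Book B", "price": "27.50", "rating": 4.9, "added": "2024-05-07"}',
    '{"title": "Book C", "price": "12.00", "rating": 3.1, "added": "2024-05-05"}',
    '{"title": "Book D", "price": "9.99", "rating": 4.8, "added": "2024-04-01"}',
    'not valid json',
]


def parse_and_clean(lines):
    """Parse each JSON line, skip malformed ones, and normalize field types."""
    for line in lines:
        try:
            rec = json.loads(line)
            yield {
                "title": rec["title"].strip(),
                "price": float(rec["price"]),
                "rating": float(rec["rating"]),
                "added": date.fromisoformat(rec["added"]),
            }
        except (ValueError, KeyError, TypeError, AttributeError):
            continue  # drop records that cannot be parsed or are incomplete


def top_rated_inexpensive(records, today, max_price=25.0):
    """Keep books added in the last 7 days, under max_price, with the top rating."""
    week_ago = today - timedelta(days=7)
    recent_cheap = [
        r for r in records
        if week_ago <= r["added"] <= today and r["price"] < max_price
    ]
    if not recent_cheap:
        return []
    best = max(r["rating"] for r in recent_cheap)
    return [r for r in recent_cheap if r["rating"] == best]


rows = top_rated_inexpensive(parse_and_clean(RAW_LINES), today=date(2024, 5, 8))
print([r["title"] for r in rows])  # prints ['Book A']
```

In the real pipeline the last step writes the rows to a BigQuery table; here they are only printed.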

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Enable the Cloud Data Fusion API.

    Enable the API

Create a Cloud Data Fusion instance

  1. Click Create an instance.

    Go to Instances

  2. Enter an instance name.
  3. Enter a description for your instance.
  4. Enter the region in which to create the instance.
  5. Choose the Cloud Data Fusion version to use.
  6. Choose the Cloud Data Fusion edition.
  7. For Cloud Data Fusion version 6.2.3 and later, in the Authorization field, choose the Dataproc service account to use for running your Cloud Data Fusion pipeline in Dataproc. The default value, the Compute Engine account, is pre-selected.
  8. Click Create. Instance creation takes up to 30 minutes to complete. While Cloud Data Fusion creates your instance, a progress wheel appears next to the instance name on the Instances page. When creation completes, the wheel turns into a green check mark, which indicates that the instance is ready to use.
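For readers who prefer to script these steps, the following sketch shows roughly what the console form collects and how a client might wait for the instance to become ready. It sends nothing over the network. The endpoint URL, the body field names (type, description, version, dataprocServiceAccount), and the state names CREATING, ACTIVE, and FAILED are assumptions based on the public Cloud Data Fusion REST API and should be checked against the API reference before use. The helper names build_create_request and wait_until_ready are hypothetical.

```python
import time

API_ROOT = "https://datafusion.googleapis.com/v1"  # assumed REST endpoint


def build_create_request(project_id, region, instance_id, description,
                         edition="BASIC", version=None, dataproc_sa=None):
    """Return (url, body) describing an instance-creation call; nothing is sent."""
    url = (f"{API_ROOT}/projects/{project_id}/locations/{region}"
           f"/instances?instanceId={instance_id}")
    body = {"type": edition, "description": description}
    if version:
        body["version"] = version
    if dataproc_sa:
        body["dataprocServiceAccount"] = dataproc_sa
    return url, body


def wait_until_ready(get_state, timeout_s=30 * 60, poll_s=30, sleep=time.sleep):
    """Poll get_state() until it returns 'ACTIVE' or the timeout elapses.

    get_state is any callable returning the current state string. The default
    timeout mirrors the 30-minute upper bound mentioned in the steps above.
    """
    waited = 0
    while waited <= timeout_s:
        state = get_state()
        if state == "ACTIVE":
            return True
        if state == "FAILED":
            raise RuntimeError("instance creation failed")
        sleep(poll_s)
        waited += poll_s
    return False


# Demo with a fake state source instead of a real API call.
url, body = build_create_request(
    "my-project", "us-central1", "quickstart", "Quickstart instance",
    version="6.2.3",
)
fake_states = iter(["CREATING", "CREATING", "ACTIVE"])
ready = wait_until_ready(lambda: next(fake_states), sleep=lambda s: None)
print(url)
print(body)
print(ready)
```

Passing the state lookup and the sleep function as parameters keeps the polling logic testable without credentials or a live instance.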

When you use Cloud Data Fusion, you work in both the Google Cloud console and the separate Cloud Data Fusion web interface.

  • In the Google Cloud console, you can do the following:

    • Create a Google Cloud console project
    • Create and delete Cloud Data Fusion instances
    • View Cloud Data Fusion instance details
  • In the Cloud Data Fusion web interface, you can use Cloud Data Fusion functionality through pages such as Studio or Wrangler.

To navigate the Cloud Data Fusion interface, follow these steps:

  1. In the Google Cloud console, open the Instances page.

    Go to Instances

  2. In the instance's Actions column, click the View Instance link.
  3. In the Cloud Data Fusion web interface, use the left navigation panel to go to the page you need.

Deploy a sample pipeline

Sample pipelines are available through the Cloud Data Fusion Hub, which lets you share reusable Cloud Data Fusion pipelines, plugins, and solutions.

  1. In the Cloud Data Fusion web interface, click Hub.
  2. In the left panel, click Pipelines.
  3. Click the Cloud Data Fusion Quickstart pipeline.
  4. Click Create.
  5. In the Cloud Data Fusion Quickstart configuration panel, click Finish.
  6. Click Customize Pipeline.

    A visual representation of your pipeline appears on the Studio page, a graphical interface for developing data integration pipelines. The available pipeline plugins are listed on the left, and your pipeline is displayed on the main canvas area. To explore the pipeline, hold the pointer over each pipeline node and click Properties. The properties menu for each node shows the objects and operations associated with that node.

  7. In the top-right menu, click Deploy. This step submits the pipeline to Cloud Data Fusion. You run the pipeline in the next section of this quickstart.

View your pipeline

The deployed pipeline appears in the pipeline details view, where you can do the following:

  • View the pipeline's structure and configuration.
  • Run the pipeline manually, or set up a schedule or a trigger.
  • View a summary of the pipeline's previous runs, including execution times, logs, and metrics.

Execute your pipeline

In the pipeline details view, click Run to execute your pipeline.

When it executes a pipeline, Cloud Data Fusion does the following:

  1. Provisions an ephemeral Dataproc cluster.
  2. Executes the pipeline on the cluster using Apache Spark.
  3. Deletes the cluster.
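The provision, execute, delete sequence above follows a common "ephemeral resource" pattern. The sketch below models that lifecycle in plain Python with a context manager. It is a conceptual illustration only, not Cloud Data Fusion or Dataproc code: the names ephemeral_cluster and run_pipeline are hypothetical, and the cluster is represented by a string. The sketch tears the cluster down even when the job raises an error, which is a design choice of this example and not a statement about how the service handles failures.

```python
from contextlib import contextmanager


@contextmanager
def ephemeral_cluster(name, log):
    """Model a cluster that exists only for the duration of the with-block."""
    log.append(f"provision {name}")
    try:
        yield name
    finally:
        # Runs whether the job succeeded or raised, so the cluster never lingers.
        log.append(f"delete {name}")


def run_pipeline(pipeline, cluster_name, log, job=None):
    """Provision a cluster, run the pipeline on it, then delete the cluster."""
    with ephemeral_cluster(cluster_name, log) as cluster:
        log.append(f"run {pipeline} on {cluster}")
        if job is not None:
            job()  # stand-in for the actual Spark work


events = []
run_pipeline("quickstart", "tmp-cluster", events)
print(events)
# prints ['provision tmp-cluster', 'run quickstart on tmp-cluster', 'delete tmp-cluster']
```

Because each run provisions a cluster before any data is processed, part of the total run time is setup and teardown in addition to the Spark job itself.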

View the results

After a few minutes, the pipeline finishes. The pipeline status changes to Succeeded, and the number of records processed by each node is displayed.

  1. Go to the BigQuery web interface.
  2. To view a sample of the results, go to the DataFusionQuickstart dataset in your project, click the top_rated_inexpensive table, and then run a simple query. For example:

    SELECT * FROM PROJECT_ID.GCPQuickStart.top_rated_inexpensive LIMIT 10
    

    Replace PROJECT_ID with your project ID.
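The PROJECT_ID substitution can also be scripted. The helper below is a small sketch that only builds the query text; it does not connect to BigQuery. The function name sample_query is hypothetical, the dataset default is taken from the example query above, and the project ID check assumes the standard format (6 to 30 characters, lowercase letters, digits, and hyphens). The table path is wrapped in backticks because BigQuery requires quoting for paths that contain hyphens, which many project IDs do.

```python
import re


def sample_query(project_id, dataset="GCPQuickStart",
                 table="top_rated_inexpensive", limit=10):
    """Build the sample SELECT statement for the given project ID."""
    # Assumed project ID format: starts with a letter, 6-30 characters,
    # lowercase letters, digits, and hyphens, not ending with a hyphen.
    if not re.fullmatch(r"[a-z][a-z0-9-]{4,28}[a-z0-9]", project_id):
        raise ValueError(f"unexpected project ID: {project_id!r}")
    limit = int(limit)
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    # Backticks keep hyphenated project IDs valid in the table path.
    return f"SELECT * FROM `{project_id}.{dataset}.{table}` LIMIT {limit}"


print(sample_query("my-project-123"))
# prints SELECT * FROM `my-project-123.GCPQuickStart.top_rated_inexpensive` LIMIT 10
```

The resulting string can be pasted into the BigQuery query editor.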

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

  1. Delete the BigQuery dataset that your pipeline wrote to in this quickstart.
  2. Delete the Cloud Data Fusion instance.
  3. Optional: Delete the project.

    1. In the Google Cloud console, go to the Manage resources page.

      Go to Manage resources

    2. In the project list, select the project that you want to delete, and then click Delete.
    3. In the dialog, type the project ID, and then click Shut down to delete the project.

What's next