Dataplex Universal Catalog์—์„œ ๋ฐ์ดํ„ฐ ์ธ์‚ฌ์ดํŠธ ์•ก์„ธ์Šค

๋ฐ์ดํ„ฐ ํ†ต๊ณ„๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ํƒ์ƒ‰ํ•˜๊ณ  ์ดํ•ดํ•˜๊ธฐ ์œ„ํ•œ ์ž๋™ํ™”๋œ ๋ฐฉ๋ฒ•์„ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค. ๋ฐ์ดํ„ฐ ์ธ์‚ฌ์ดํŠธ๋ฅผ ํ†ตํ•ด Gemini๋Š” ๋ฉ”ํƒ€๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ…Œ์ด๋ธ”์— ๊ด€ํ•œ ์ž์—ฐ์–ด ์งˆ๋ฌธ๊ณผ ์ด์— ๋‹ต๋ณ€ํ•˜๋Š” ์ฟผ๋ฆฌ๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด ํŒจํ„ด์„ ๋ฐœ๊ฒฌํ•˜๊ณ  ๋ฐ์ดํ„ฐ ํ’ˆ์งˆ์„ ํ‰๊ฐ€ํ•˜๋ฉฐ ํ†ต๊ณ„ ๋ถ„์„์„ ์ˆ˜ํ–‰ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์ด ๋ฌธ์„œ์—์„œ๋Š” ๋ฐ์ดํ„ฐ ์ธ์‚ฌ์ดํŠธ์˜ ์ฃผ์š” ๊ธฐ๋Šฅ๊ณผ ์œ ์šฉํ•œ ๋ฐ์ดํ„ฐ ํƒ์ƒ‰์„ ์œ„ํ•ด ๋ฐ์ดํ„ฐ ์ธ์‚ฌ์ดํŠธ๋ฅผ ํ™•์ธํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์„ค๋ช…ํ•ฉ๋‹ˆ๋‹ค.

์‹œ์ž‘ํ•˜๊ธฐ ์ „์—

๋ฐ์ดํ„ฐ ์ธ์‚ฌ์ดํŠธ๋Š” BigQuery์˜ Gemini๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ƒ์„ฑ๋˜๋ฉฐ BigQuery Studio์—์„œ๋งŒ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋จผ์ € BigQuery์˜ Gemini๋ฅผ ์„ค์ •ํ•œ ๋‹ค์Œ BigQuery์—์„œ ํ†ต๊ณ„๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. Gemini๋Š” us-central1 ๋ฆฌ์ „์—์„œ๋งŒ ์ธ์‚ฌ์ดํŠธ๋ฅผ ์–ป๊ธฐ ์œ„ํ•ด ๋ฉ”ํƒ€๋ฐ์ดํ„ฐ๋ฅผ ์ฒ˜๋ฆฌํ•ฉ๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋‚ด์šฉ์€ Gemini ์ œ๊ณต ์œ„์น˜๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”. ์ธ์‚ฌ์ดํŠธ๋ฅผ ์ƒ์„ฑํ•œ ํ›„์—๋Š” Dataplex Universal Catalog์—์„œ ํ™•์ธํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

ํ•„์š”ํ•œ ์—ญํ• 

์ƒ์„ฑ๋œ ํ†ต๊ณ„์— ๋Œ€ํ•œ ์ฝ๊ธฐ ์ „์šฉ ์•ก์„ธ์Šค ๊ถŒํ•œ์„ ์–ป์œผ๋ ค๋ฉด ๊ด€๋ฆฌ์ž์—๊ฒŒ ๋‹ค์Œ IAM ์—ญํ• ์„ ๋ถ€์—ฌํ•ด ๋‹ฌ๋ผ๊ณ  ์š”์ฒญํ•˜์„ธ์š”.

  • ํ†ต๊ณ„๋ฅผ ๋ณด๋ ค๋Š” BigQuery ํ…Œ์ด๋ธ”์ด ํฌํ•จ๋œ ํ”„๋กœ์ ํŠธ์— ๋Œ€ํ•œ Dataplex DataScan DataViewer(roles/dataplex.dataScanDataViewer)

์—ญํ•  ๋ถ€์—ฌ์— ๋Œ€ํ•œ ์ƒ์„ธ ์„ค๋ช…์€ ํ”„๋กœ์ ํŠธ, ํด๋”, ์กฐ์ง์— ๋Œ€ํ•œ ์•ก์„ธ์Šค ๊ด€๋ฆฌ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

์ปค์Šคํ…€ ์—ญํ•  ๋˜๋Š” ๊ธฐํƒ€ ์‚ฌ์ „ ์ •์˜๋œ ์—ญํ• ์„ ํ†ตํ•ด ํ•„์š”ํ•œ ๊ถŒํ•œ์„ ์–ป์„ ์ˆ˜๋„ ์žˆ์Šต๋‹ˆ๋‹ค. ํ†ต๊ณ„๋ฅผ ์ƒ์„ฑํ•˜๋Š” ๋ฐ ํ•„์š”ํ•œ ์ •ํ™•ํ•œ ๊ถŒํ•œ์„ ๋ณด๋ ค๋ฉด ํ•„์ˆ˜ ๊ถŒํ•œ ์„น์…˜์„ ํŽผ์น˜์„ธ์š”.

ํ•„์ˆ˜ ๊ถŒํ•œ

  • dataplex.datascans.get
  • dataplex.datascans.getData

API ์‚ฌ์šฉ ์„ค์ •

๋ฐ์ดํ„ฐ ํ†ต๊ณ„๋ฅผ ์‚ฌ์šฉํ•˜๋ ค๋ฉด ํ”„๋กœ์ ํŠธ์—์„œ ๋‹ค์Œ API๋ฅผ ์‚ฌ์šฉ ์„ค์ •ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

Gemini for Google Cloud API ์‚ฌ์šฉ ์„ค์ •์— ๋Œ€ํ•œ ์ž์„ธํ•œ ๋‚ด์šฉ์€ Google Cloud ํ”„๋กœ์ ํŠธ์—์„œ Gemini for Google Cloud API ์‚ฌ์šฉ ์„ค์ •์„ ์ฐธ๊ณ ํ•˜์„ธ์š”.

๋ฐ์ดํ„ฐ ํ†ต๊ณ„ ์ •๋ณด

๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋Š” ์ต์ˆ™ํ•˜์ง€ ์•Š์€ ์ƒˆ ํ…Œ์ด๋ธ”์„ ํƒ์ƒ‰ํ•  ๋•Œ ์ฝœ๋“œ ์Šคํƒ€ํŠธ ๋ฌธ์ œ์— ์ง๋ฉดํ•˜๋Š” ๊ฒฝ์šฐ๊ฐ€ ๋งŽ์Šต๋‹ˆ๋‹ค. ์ด ๋ฌธ์ œ์—๋Š” ๋ฐ์ดํ„ฐ ๊ตฌ์กฐ, ์ฃผ์š” ํŒจํ„ด, ๋ฐ์ดํ„ฐ์˜ ๊ด€๋ จ ์ •๋ณด์— ๋Œ€ํ•œ ๋ถˆํ™•์‹ค์„ฑ์ด ํฌํ•จ๋˜๋Š” ๊ฒฝ์šฐ๊ฐ€ ๋งŽ์•„ ์ฟผ๋ฆฌ ์ž‘์„ฑ์„ ์‹œ์ž‘ํ•˜๊ธฐ ์–ด๋ ต์Šต๋‹ˆ๋‹ค.

๋ฐ์ดํ„ฐ ์ธ์‚ฌ์ดํŠธ๋Š” ํ…Œ์ด๋ธ”์˜ ๋ฉ”ํƒ€๋ฐ์ดํ„ฐ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ์ž์—ฐ์–ด ์ฟผ๋ฆฌ์™€ ์ด์— ์ƒ์‘ํ•˜๋Š” SQL์„ ์ž๋™์œผ๋กœ ์ƒ์„ฑํ•˜์—ฌ ์ฝœ๋“œ ์Šคํƒ€ํŠธ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•ฉ๋‹ˆ๋‹ค. ๋นˆ ์ฟผ๋ฆฌ ํŽธ์ง‘๊ธฐ๋กœ ์‹œ์ž‘ํ•˜๋Š” ๋Œ€์‹  ๊ฐ€์น˜ ์žˆ๋Š” ํ†ต๊ณ„๋ฅผ ์ œ๊ณตํ•˜๋Š” ์˜๋ฏธ ์žˆ๋Š” ์ฟผ๋ฆฌ๋กœ ๋ฐ์ดํ„ฐ ํƒ์ƒ‰์„ ๋น ๋ฅด๊ฒŒ ์‹œ์ž‘ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ž์„ธํžˆ ์กฐ์‚ฌํ•˜๋ ค๋ฉด ๋ฐ์ดํ„ฐ ์บ”๋ฒ„์Šค์—์„œ ํ›„์† ์งˆ๋ฌธ์„ ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

ํ†ต๊ณ„ ์‹คํ–‰์˜ ์˜ˆ

๋‹ค์Œ ๋ฉ”ํƒ€๋ฐ์ดํ„ฐ๊ฐ€ ํฌํ•จ๋œ telco_churn์ด๋ผ๋Š” ํ…Œ์ด๋ธ”์ด ์žˆ๋‹ค๊ณ  ๊ฐ€์ •ํ•ด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

ํ•„๋“œ ์ด๋ฆ„ ์œ ํ˜•
CustomerID STRING
์„ฑ๋ณ„ STRING
Tenure INT64
InternetService STRING
StreamingTV STRING
OnlineBackup STRING
๊ณ„์•ฝ STRING
TechSupport STRING
PaymentMethod STRING
MonthlyCharges FLOAT
Churn BOOLEAN

๋‹ค์Œ์€ ๋ฐ์ดํ„ฐ ํ†ต๊ณ„๊ฐ€ ์ด ํ…Œ์ด๋ธ”์— ๋Œ€ํ•ด ์ƒ์„ฑํ•˜๋Š” ๋ช‡ ๊ฐ€์ง€ ์ƒ˜ํ”Œ ์ฟผ๋ฆฌ์ž…๋‹ˆ๋‹ค.

  • ๋ชจ๋“  ํ”„๋ฆฌ๋ฏธ์—„ ์„œ๋น„์Šค๋ฅผ ๊ตฌ๋…ํ•˜๊ณ  50๊ฐœ์›” ๋„˜๊ฒŒ ๊ณ ๊ฐ์œผ๋กœ ๋“ฑ๋ก๋œ ์ด๋ ฅ์ด ์žˆ๋Š” ๊ณ ๊ฐ์„ ์‹๋ณ„ํ•ฉ๋‹ˆ๋‹ค.

    SELECT
      CustomerID,
      Contract,
      Tenure
    FROM
      agentville_datasets.telco_churn
    WHERE
      OnlineBackup = 'Yes'
      AND TechSupport = 'Yes'
      AND StreamingTV = 'Yes'
      AND Tenure > 50;
    
  • ๊ณ ๊ฐ ์ดํƒˆ์ด ๊ฐ€์žฅ ๋งŽ์€ ์ธํ„ฐ๋„ท ์„œ๋น„์Šค๋ฅผ ํŒŒ์•…ํ•ฉ๋‹ˆ๋‹ค.

    SELECT
      InternetService,
      COUNT(DISTINCT CustomerID) AS total_customers
    FROM
      agentville_datasets.telco_churn
    WHERE
      Churn = TRUE
    GROUP BY
      InternetService
    ORDER BY
      total_customers DESC
    LIMIT 1;
    
  • ๊ฐ€์น˜๊ฐ€ ๋†’์€ ๊ณ ๊ฐ์˜ ์„ธ๊ทธ๋จผํŠธ๋ณ„ ์ดํƒˆ๋ฅ ์„ ํŒŒ์•…ํ•ฉ๋‹ˆ๋‹ค.

    SELECT
      Contract,
      InternetService,
      Gender,
      PaymentMethod,
      COUNT(DISTINCT CustomerID) AS total_customers,
      SUM(CASE WHEN Churn = TRUE THEN 1 ELSE 0 END) AS churned_customers,
      (SUM(CASE WHEN Churn = TRUE THEN 1 ELSE 0 END) / COUNT(DISTINCT CustomerID))
      * 100 AS churn_rate
    FROM
      agentville_datasets.telco_churn
    WHERE
      MonthlyCharges > 100
    GROUP BY
      Contract,
      InternetService,
      Gender,
      PaymentMethod;
    

ํ†ต๊ณ„ ๋ณด๊ธฐ

BigQuery ํ…Œ์ด๋ธ”์˜ ์ธ์‚ฌ์ดํŠธ๋ฅผ ๋ณด๋ ค๋ฉด Dataplex Universal Catalog ๊ฒ€์ƒ‰์„ ์‚ฌ์šฉํ•˜์—ฌ Dataplex Universal Catalog์˜ ํ…Œ์ด๋ธ” ํ•ญ๋ชฉ์— ์•ก์„ธ์Šคํ•ฉ๋‹ˆ๋‹ค.

  1. Google Cloud ์ฝ˜์†”์—์„œ Dataplex Universal Catalog ๊ฒ€์ƒ‰ ํŽ˜์ด์ง€๋กœ ์ด๋™ํ•ฉ๋‹ˆ๋‹ค.

    ๊ฒ€์ƒ‰ ํŽ˜์ด์ง€๋กœ ์ด๋™

  2. Dataplex Universal Catalog์—์„œ ํ…Œ์ด๋ธ” ํ•ญ๋ชฉ์„ ๊ฒ€์ƒ‰ํ•ฉ๋‹ˆ๋‹ค.

  3. ํ†ต๊ณ„ ํƒญ์„ ํด๋ฆญํ•ฉ๋‹ˆ๋‹ค. ํƒญ์ด ๋น„์–ด ์žˆ๋Š” ๊ฒฝ์šฐ ์ด ํ…Œ์ด๋ธ”์˜ ํ†ต๊ณ„๊ฐ€ ์•„์ง ์ƒ์„ฑ๋˜์ง€ ์•Š์•˜์Œ์„ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค. BigQuery Studio์—์„œ ๋ฐ์ดํ„ฐ ํ†ต๊ณ„๋ฅผ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๊ฐ€๊ฒฉ ์ฑ…์ •

์ด ๊ธฐ๋Šฅ์˜ ๊ฐ€๊ฒฉ ์ฑ…์ •์— ๊ด€ํ•œ ์ž์„ธํ•œ ๋‚ด์šฉ์€ BigQuery์˜ Gemini ๊ฐ€๊ฒฉ ์ฑ…์ • ๊ฐœ์š”๋ฅผ ์ฐธ๊ณ ํ•˜์„ธ์š”.

ํ• ๋‹น๋Ÿ‰ ๋ฐ ํ•œ๋„

์ด ๊ธฐ๋Šฅ์˜ ํ• ๋‹น๋Ÿ‰ ๋ฐ ํ•œ๋„์— ๋Œ€ํ•œ ์ž์„ธํ•œ ๋‚ด์šฉ์€ BigQuery์˜ Gemini ํ• ๋‹น๋Ÿ‰์„ ์ฐธ์กฐํ•˜์„ธ์š”.

์œ„์น˜

Gemini๋Š” ์ „์—ญ์ ์œผ๋กœ ์ž‘๋™ํ•˜๋ฏ€๋กœ ๋ฐ์ดํ„ฐ ์ฒ˜๋ฆฌ๋ฅผ ํŠน์ • ๋ฆฌ์ „์œผ๋กœ ์ œํ•œํ•  ์ˆ˜ ์—†์Šต๋‹ˆ๋‹ค. BigQuery์˜ Gemini์—์„œ ๋ฐ์ดํ„ฐ๋ฅผ ์ฒ˜๋ฆฌํ•˜๋Š” ์œ„์น˜์— ๋Œ€ํ•œ ์ž์„ธํ•œ ๋‚ด์šฉ์€ Gemini ์ œ๊ณต ์œ„์น˜๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

์ œํ•œ์‚ฌํ•ญ

  • BigQuery ํ…Œ์ด๋ธ”, BigLake ํ…Œ์ด๋ธ”, ์™ธ๋ถ€ ํ…Œ์ด๋ธ”, ๋ทฐ์— ๋Œ€ํ•œ ๋ฐ์ดํ„ฐ ํ†ต๊ณ„๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ๋ฉ€ํ‹ฐ ํด๋ผ์šฐ๋“œ ๊ณ ๊ฐ์€ ๋‹ค๋ฅธ ํด๋ผ์šฐ๋“œ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์—†์Šต๋‹ˆ๋‹ค.
  • ๋ฐ์ดํ„ฐ ํ†ต๊ณ„๋Š” Geo ๋˜๋Š” JSON ์—ด ์œ ํ˜•์„ ์ง€์›ํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.
  • ํ†ต๊ณ„๋ฅผ ์‹คํ–‰ํ•  ๋•Œ๋งˆ๋‹ค ๋งค๋ฒˆ ์ฟผ๋ฆฌ๊ฐ€ ํ‘œ์‹œ๋˜๋Š” ๊ฒƒ์€ ์•„๋‹™๋‹ˆ๋‹ค. ๋ณด๋‹ค ํฅ๋ฏธ๋กœ์šด ์ฟผ๋ฆฌ๋ฅผ ์ƒ์„ฑํ•  ๊ฐ€๋Šฅ์„ฑ์„ ๋†’์ด๋ ค๋ฉด BigQuery Studio์—์„œ ์ธ์‚ฌ์ดํŠธ๋ฅผ ๋‹ค์‹œ ์ƒ์„ฑํ•˜์„ธ์š”.

๋‹ค์Œ ๋‹จ๊ณ„