Troubleshooting Vertex AI Workbench

This page describes troubleshooting steps that might help if you run into problems when you use Vertex AI Workbench.

For help with other components of Vertex AI, see also Troubleshooting Vertex AI.

Helpful procedures

This section describes procedures that you might find helpful.

Use SSH to connect to your user-managed notebooks instance

To connect to your instance over ssh, enter the following command in Cloud Shell or in any environment where the Google Cloud CLI is installed.

gcloud compute ssh --project PROJECT_ID \
  --zone ZONE \
  INSTANCE_NAME -- -L 8080:localhost:8080

Replace the following:

  • PROJECT_ID: your project ID
  • ZONE: the Google Cloud zone where your instance is located
  • INSTANCE_NAME: the name of your instance

You can also connect to your instance by opening its Compute Engine details page and clicking the SSH button.
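
The -L 8080:localhost:8080 part of the command above is passed through to ssh and forwards port 8080 on your local machine to port 8080 on the instance. If local port 8080 is already in use, a different local port can be forwarded instead. The following is a minimal sketch, assuming a hypothetical helper named workbench_ssh_cmd that prints the command for review rather than running it:

```shell
# Hypothetical helper: print the SSH tunnel command so it can be reviewed first.
# Arguments: PROJECT_ID ZONE INSTANCE_NAME [LOCAL_PORT]
# LOCAL_PORT defaults to 8080; the remote port stays 8080.
workbench_ssh_cmd() {
  local_port="${4:-8080}"
  printf 'gcloud compute ssh --project %s --zone %s %s -- -L %s:localhost:8080\n' \
    "$1" "$2" "$3" "$local_port"
}
```

For example, workbench_ssh_cmd my-project us-central1-a my-instance 9090 prints a command that forwards local port 9090. The project, zone, and instance names here are placeholders.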

Re-register with the Inverting Proxy server

To re-register the user-managed notebooks instance with the internal Inverting Proxy server, stop and start the VM from the User-managed notebooks page, or use ssh to connect to your user-managed notebooks instance and enter the following:

cd /opt/deeplearning/bin
sudo ./attempt-register-vm-on-proxy.sh

Verify the Docker service status

To verify the Docker service status, use ssh to connect to your user-managed notebooks instance and enter the following:

sudo service docker status

Verify that the Inverting Proxy agent is running

To verify that the notebook Inverting Proxy agent is running, use ssh to connect to your user-managed notebooks instance and enter the following:

# Confirm Inverting Proxy agent Docker container is running (proxy-agent)
sudo docker ps

# Verify State.Status is running and State.Running is true.
sudo docker inspect proxy-agent

# Grab logs
sudo docker logs proxy-agent
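
The docker inspect command prints the full container description. Its --format option can narrow the output to the two state fields mentioned above. The following is a minimal sketch, where proxy_agent_is_running is a hypothetical helper name:

```shell
# Hypothetical helper: succeed only when the proxy-agent container reports
# State.Status "running" and State.Running "true".
proxy_agent_is_running() {
  state="$(sudo docker inspect \
    --format '{{.State.Status}} {{.State.Running}}' proxy-agent 2>/dev/null)"
  [ "$state" = "running true" ]
}
```

A call such as proxy_agent_is_running || sudo docker logs proxy-agent prints the container logs only when the check fails.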

Verify the Jupyter service status and collect logs

To verify the Jupyter service status, use ssh to connect to your user-managed notebooks instance and enter the following:

sudo service jupyter status

To collect Jupyter service logs, run the following:

sudo journalctl -u jupyter.service --no-pager
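
The journal for a long-running service can be large. If a recent time window is enough for the investigation, the --since option of journalctl limits the entries, and redirecting the output saves them to a file that you can inspect or share later. The following is a sketch under that assumption. The helper name collect_jupyter_logs and the default file name are hypothetical:

```shell
# Hypothetical helper: save recent Jupyter service logs to a file.
# Arguments: [SINCE] [OUTPUT_FILE]
# SINCE is any time expression that journalctl accepts, such as "1 hour ago".
collect_jupyter_logs() {
  since="${1:-1 hour ago}"
  out="${2:-jupyter-service.log}"
  sudo journalctl -u jupyter.service --no-pager --since "$since" > "$out"
}
```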

Verify that the Jupyter internal API is active

The Jupyter API should always run on port 8080. To verify this, check the instance's syslog for an entry similar to the following:

Jupyter Server ... running at:
http://localhost:8080

You can also verify that the Jupyter internal API is active by using ssh to connect to your user-managed notebooks instance and entering the following:

curl http://127.0.0.1:8080/api/kernelspecs

If requests are taking too long, you can also measure how long the API takes to respond:

time curl -v http://127.0.0.1:8080/api/status
time curl -v http://127.0.0.1:8080/api/kernels
time curl -v http://127.0.0.1:8080/api/connections

To run these commands in your Vertex AI Workbench instance, open JupyterLab and create a new terminal.
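
As an alternative to wrapping each request in time, the --write-out option of curl can report the HTTP status code and the total request time directly. The following is a minimal sketch that queries the same three endpoints. The helper name jupyter_api_timing and the 30-second limit are assumptions, not part of the documented procedure:

```shell
# Hypothetical helper: print "<endpoint> <http_code> <seconds>s" for each
# Jupyter API endpoint, giving up on any single request after 30 seconds.
jupyter_api_timing() {
  for endpoint in status kernels connections; do
    printf '%s ' "$endpoint"
    curl --silent --output /dev/null --max-time 30 \
      --write-out '%{http_code} %{time_total}s\n' \
      "http://127.0.0.1:8080/api/$endpoint"
  done
}
```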

Restart the Docker service

To restart the Docker service, stop and start the VM from the User-managed notebooks page, or use ssh to connect to your user-managed notebooks instance and enter the following:

sudo service docker restart

Restart the Inverting Proxy agent

To restart the Inverting Proxy agent, stop and start the VM from the User-managed notebooks page, or use ssh to connect to your user-managed notebooks instance and enter the following:

sudo docker restart proxy-agent

Restart the Jupyter service

To restart the Jupyter service, stop and start the VM from the User-managed notebooks page, or use ssh to connect to your user-managed notebooks instance and enter the following:

sudo service jupyter restart

Restart the Notebooks Collection Agent

The Notebooks Collection Agent service runs a Python process in the background that verifies the status of the Vertex AI Workbench instance's core services.

To restart the Notebooks Collection Agent service, stop and start the VM from the Google Cloud console, or use ssh to connect to your Vertex AI Workbench instance and enter the following:

sudo systemctl stop notebooks-collection-agent.service

Then enter the following:

sudo systemctl start notebooks-collection-agent.service

To run these commands in your Vertex AI Workbench instance, open JupyterLab and create a new terminal.
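
The stop and start steps can also be combined, because systemctl restart stops a unit and then starts it again, and systemctl is-active reports whether the unit came back up. The following is a minimal sketch, where restart_collection_agent is a hypothetical helper name:

```shell
# Hypothetical helper: restart the agent, then print its state.
# Prints "active" and succeeds only if the service is running afterwards.
restart_collection_agent() {
  sudo systemctl restart notebooks-collection-agent.service &&
    sudo systemctl is-active notebooks-collection-agent.service
}
```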

Modify the Notebooks Collection Agent script

To access and edit the script, open a terminal in your instance, or use ssh to connect to your Vertex AI Workbench instance, and enter the following:

nano /opt/deeplearning/bin/notebooks_collection_agent.py

After you edit the file, save it.

Then restart the Notebooks Collection Agent service.

Verify that the instance can resolve the required DNS domains

To verify that the instance can resolve the required DNS domains, use ssh to connect to your user-managed notebooks instance and enter the following:

host notebooks.googleapis.com
host *.notebooks.cloud.google.com
host *.notebooks.googleusercontent.com
host *.kernels.googleusercontent.com

or:

curl --silent --output /dev/null "https://notebooks.cloud.google.com"; echo $?

If the instance has Dataproc enabled, you can verify that the instance resolves *.kernels.googleusercontent.com by running the following:

curl --verbose -H "Authorization: Bearer $(gcloud auth print-access-token)" https://${PROJECT_NUMBER}-dot-${REGION}.kernels.googleusercontent.com/api/kernelspecs | jq .

To run these commands in your Vertex AI Workbench instance, open JupyterLab and create a new terminal.
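
The host checks above can be run in a loop so that every failing domain is reported in one pass. The following is a minimal sketch, where check_dns is a hypothetical helper name. It relies only on the exit status of host:

```shell
# Hypothetical helper: report which of the given domains resolve.
# Prints one "OK" or "FAIL" line per domain and returns non-zero if any failed.
check_dns() {
  failed=0
  for domain in "$@"; do
    if host "$domain" >/dev/null 2>&1; then
      echo "OK   $domain"
    else
      echo "FAIL $domain"
      failed=1
    fi
  done
  return "$failed"
}
```

For example, check_dns notebooks.googleapis.com '*.notebooks.cloud.google.com' checks two of the domains. Quoting names that contain * keeps the shell from expanding them as file name patterns.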

Make a copy of the user data on an instance

To store a copy of an instance's user data in Cloud Storage, complete the following steps.

Create a Cloud Storage bucket (optional)

In the same project where your instance is located, create a Cloud Storage bucket where you can store your user data. If you already have a Cloud Storage bucket, skip this step.

  • Create a Cloud Storage bucket:
    gcloud storage buckets create gs://BUCKET_NAME
    Replace BUCKET_NAME with a bucket name that meets the bucket naming requirements.

Copy your user data

  1. In your instance's JupyterLab interface, select File > New > Terminal to open a terminal window. For user-managed notebooks instances, you can instead use SSH to connect to your instance's terminal.

  2. Use the gcloud CLI to copy your user data to a Cloud Storage bucket. The following example command copies all of the files from your instance's /home/jupyter/ directory to a directory in a Cloud Storage bucket.

    gcloud storage cp /home/jupyter/* gs://BUCKET_NAMEPATH --recursive
    

    Replace the following:

    • BUCKET_NAME: the name of your Cloud Storage bucket
    • PATH: the path to the directory where you want to copy your files, for example: /copy/jupyter/
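
In the command above, BUCKET_NAME and PATH are written back to back, so PATH supplies the slash that separates them. The following is a minimal sketch that joins the two values explicitly and then lists the destination to confirm what was copied. The helper name copy_user_data is hypothetical. Note that the shell pattern /home/jupyter/* does not match hidden files whose names start with a dot.

```shell
# Hypothetical helper: copy /home/jupyter to a bucket path, then list the result.
# Arguments: BUCKET_NAME PATH (PATH may start with a slash, e.g. /copy/jupyter/)
copy_user_data() {
  bucket="$1"
  path="$2"
  # Strip one leading slash from PATH so the URL has a single separator.
  dest="gs://${bucket}/${path#/}"
  gcloud storage cp /home/jupyter/* "$dest" --recursive &&
    gcloud storage ls --recursive "$dest"
}
```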

Investigate an instance stuck in provisioning by using gcpdiag

gcpdiag is an open source tool. It is not an officially supported Google Cloud product. You can use the gcpdiag tool to help you identify and fix Google Cloud project issues. For more information, see the gcpdiag project on GitHub.

This gcpdiag runbook investigates potential causes for a Vertex AI Workbench instance getting stuck in provisioning status, including the following areas:
  • Status: Checks the instance's current status to confirm that it is stuck in provisioning and is not stopped or active.
  • Instance's Compute Engine VM boot disk image: Checks whether the instance was created with a custom container, an official workbench-instances image, a Deep Learning VM image, or an unsupported image that might cause the instance to get stuck in provisioning status.
  • Custom scripts: Checks whether the instance uses custom startup or post-startup scripts that change the default Jupyter port or break dependencies, which might cause the instance to get stuck in provisioning status.
  • Environment version: Checks whether the instance uses the latest environment version by checking whether it can be upgraded. Earlier versions might cause the instance to get stuck in provisioning status.
  • Instance's Compute Engine VM performance: Checks the VM's current performance to make sure that normal operation isn't disrupted by high CPU usage, insufficient memory, or disk space issues.
  • Instance's Compute Engine serial port or system logging: Checks whether the instance has serial port logs. These logs are analyzed to confirm that Jupyter is running on 127.0.0.1:8080.
  • Instance's Compute Engine SSH and terminal access: Checks whether the instance's Compute Engine VM is running, so that you can open a terminal over SSH and verify that space usage in /home/jupyter is lower than 85%. If no free space is left, the instance might get stuck in provisioning status.
  • External IP turned off: Checks whether external IP access is turned off. An incorrect networking configuration can cause the instance to get stuck in provisioning status.

Google Cloud console

  1. Complete and then copy the following command.
    gcpdiag runbook vertex/workbench-instance-stuck-in-provisioning \
        --parameter project_id=PROJECT_ID \
        --parameter instance_name=INSTANCE_NAME \
        --parameter zone=ZONE
  2. Open the Google Cloud console and activate Cloud Shell.
  3. Paste the copied command.
  4. Run the gcpdiag command, which downloads the gcpdiag Docker image and then performs diagnostic checks. If applicable, follow the instructions in the output to fix failed checks.

Docker

You can run gcpdiag by using a wrapper that starts gcpdiag in a Docker container. Docker or Podman must be installed.

  1. Copy and run the following command on your local workstation.
    curl https://gcpdiag.dev/gcpdiag.sh >gcpdiag && chmod +x gcpdiag
  2. Run the gcpdiag command.
    ./gcpdiag runbook vertex/workbench-instance-stuck-in-provisioning \
        --parameter project_id=PROJECT_ID \
        --parameter instance_name=INSTANCE_NAME \
        --parameter zone=ZONE

See the available parameters for this runbook.

Replace the following:

  • PROJECT_ID: the ID of the project that contains the resource.
  • INSTANCE_NAME: the name of the target Vertex AI Workbench instance in your project.
  • ZONE: the zone where the target Vertex AI Workbench instance is located.

Useful flags:

For a list and description of all gcpdiag tool flags, see the gcpdiag usage instructions.

Permissions errors when using service account roles with Vertex AI

Issue

You get general permissions errors when you use service account roles with Vertex AI.

These errors can appear in Cloud Logging, in either the product component logs or the audit logs. They can also appear in any combination of the affected projects.

These issues can be caused by one or both of the following:

  • Use of the Service Account Token Creator role when the Service Account User role should have been used, or the other way around. These roles grant different permissions on a service account and aren't interchangeable. To learn about the differences between the Service Account Token Creator and Service Account User roles, see Service account roles.

  • You've granted a service account permissions across multiple projects, which isn't permitted by default.

Solution

To resolve the issue, try one or more of the following:

  • Determine whether the Service Account Token Creator or the Service Account User role is needed. To learn more, read the IAM documentation for the Vertex AI services that you're using, as well as for any other product integrations that you're using.

  • If you've granted a service account permissions across multiple projects, enable service accounts to be attached across projects by making sure that iam.disableCrossProjectServiceAccountUsage isn't enforced. To make sure that iam.disableCrossProjectServiceAccountUsage isn't enforced, run the following command:

    gcloud resource-manager org-policies disable-enforce \
      iam.disableCrossProjectServiceAccountUsage \
      --project=PROJECT_ID
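
Before and after you change the policy, it can help to check whether the constraint is currently in effect for the project. The following is a minimal sketch that uses gcloud resource-manager org-policies describe with the --effective flag. The helper name cross_project_sa_enforced is hypothetical, and the check assumes that the command output contains the text enforced: true when the constraint is enforced:

```shell
# Hypothetical helper: succeed when the effective policy for the project
# enforces iam.disableCrossProjectServiceAccountUsage.
# Assumption: the describe output contains "enforced: true" in that case.
cross_project_sa_enforced() {
  gcloud resource-manager org-policies describe \
    iam.disableCrossProjectServiceAccountUsage \
    --project="$1" --effective | grep -q 'enforced: true'
}
```

For example, cross_project_sa_enforced PROJECT_ID && echo "still enforced" prints a message only while the constraint remains enforced.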