Last updated: 2025-08-27 (UTC)

# Dataproc on GKE overview

Dataproc on GKE lets you run Big Data applications on GKE clusters using the
Dataproc `jobs` API.
Use the Google Cloud console, Google Cloud CLI, or the Dataproc API
(HTTP request or Cloud Client Libraries) to
[create a Dataproc on GKE virtual cluster](/dataproc/docs/guides/dpgke/quickstarts/dataproc-gke-quickstart-create-cluster),
then submit a Spark, PySpark, SparkR, or Spark-SQL job to the Dataproc
service.

Dataproc on GKE supports
[Spark 3.5 versions](/dataproc/docs/guides/dpgke/dataproc-gke-versions).

How Dataproc on GKE works
-------------------------

Dataproc on GKE deploys Dataproc **virtual** clusters on
a GKE cluster. Unlike
[Dataproc on Compute Engine clusters](/dataproc/docs/guides/create-cluster),
Dataproc on GKE virtual clusters do not include separate
master and worker VMs. Instead, when you create a Dataproc on GKE virtual cluster,
Dataproc on GKE creates node pools within a GKE cluster. Dataproc on GKE
jobs run as pods on these node pools. GKE manages the node pools and
the scheduling of pods on them.
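
Because jobs are submitted through the regular Dataproc `jobs` API, a submission to a virtual cluster can be sketched as an ordinary `jobs.submit` request. The following is a minimal sketch, not a definitive implementation: it only builds the request URL and JSON body and sends nothing. The function name `build_spark_submit_request` and the project, region, cluster, class, and jar values are hypothetical placeholders, and a real call would also need an authenticated HTTP client or one of the Cloud Client Libraries.

```python
import json

# Base URL of the Dataproc REST API (v1).
DATAPROC_ENDPOINT = "https://dataproc.googleapis.com/v1"


def build_spark_submit_request(project_id, region, cluster_name,
                               main_class, jar_uris, args=()):
    """Build the URL and JSON body for a Dataproc jobs.submit call.

    This helper is hypothetical and performs no network I/O. It only
    shows the shape of a Spark job request that targets a cluster
    (here, a Dataproc on GKE virtual cluster) by name.
    """
    url = (f"{DATAPROC_ENDPOINT}/projects/{project_id}"
           f"/regions/{region}/jobs:submit")
    body = {
        "job": {
            # The job is placed on the cluster by its name.
            "placement": {"clusterName": cluster_name},
            "sparkJob": {
                "mainClass": main_class,
                "jarFileUris": list(jar_uris),
                "args": list(args),
            },
        }
    }
    return url, body


if __name__ == "__main__":
    # All values below are placeholders, not real resources.
    url, body = build_spark_submit_request(
        project_id="my-project",
        region="us-central1",
        cluster_name="my-virtual-cluster",
        main_class="com.example.MySparkApp",
        jar_uris=["gs://my-bucket/my-spark-app.jar"],
        args=["1000"],
    )
    print("POST", url)
    print(json.dumps(body, indent=2))
```

The request names only the target cluster. Per the description above, the job then runs as pods on the virtual cluster's node pools, with scheduling handled by GKE.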