AI Hypercomputer is the integrated supercomputing system underneath every AI workload on Google Cloud. It is made up of hardware, software, and consumption models designed to simplify AI deployment, improve system-level efficiency, and optimize costs.
Overview
Choose from compute, storage, and networking options optimized for granular, workload-level objectives, whether that's higher throughput, lower latency, faster time-to-results, or lower TCO. Learn more about Google Cloud TPUs, Google Cloud GPUs, and the latest in storage and networking.
Get more from your hardware with industry-leading software, integrated with open frameworks, libraries, and compilers to make AI development, integration, and management more efficient.
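To make the open-frameworks point concrete, here is a minimal sketch in Python, assuming a machine with JAX installed (for example via pip install "jax[tpu]"). The same program runs on TPUs, GPUs, or CPUs because XLA compiles it for whatever devices are present; nothing here is specific to any one accelerator.

```python
# Minimal sketch, assuming JAX is installed (e.g. pip install "jax[tpu]").
import jax
import jax.numpy as jnp

# XLA enumerates whatever accelerators this machine exposes:
# TpuDevice, CudaDevice, or CpuDevice depending on the VM type.
print(jax.devices())

# The same jitted program runs unchanged on any backend.
double_sum = jax.jit(lambda v: (v * 2.0).sum())
print(double_sum(jnp.arange(8.0)))  # 56.0
```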
Flexible consumption options let you choose between fixed costs with committed use discounts and dynamic on-demand models to meet your business needs. Dynamic Workload Scheduler and Spot VMs can help you get the capacity you need without over-allocating. Plus, Google Cloud's cost optimization tools help automate resource utilization to reduce manual tasks for engineers.
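As one hedged illustration of the dynamic option, the sketch below requests Spot capacity with the Compute Engine Python client (pip install google-cloud-compute). The project, zone, machine type, and instance name are placeholders, not recommendations; substitute a GPU machine type as your workload requires.

```python
from google.cloud import compute_v1

# Placeholder values; replace with your own project, zone, and machine type.
instance = compute_v1.Instance(
    name="spot-ai-worker",
    machine_type="zones/us-central1-a/machineTypes/g2-standard-4",
    # Spot provisioning: deep discount, but capacity can be preempted.
    scheduling=compute_v1.Scheduling(
        provisioning_model="SPOT",
        instance_termination_action="STOP",
    ),
    disks=[compute_v1.AttachedDisk(
        boot=True,
        auto_delete=True,
        initialize_params=compute_v1.AttachedDiskInitializeParams(
            source_image="projects/debian-cloud/global/images/family/debian-12",
        ),
    )],
    network_interfaces=[compute_v1.NetworkInterface(network="global/networks/default")],
)

compute_v1.InstancesClient().insert(
    project="my-project", zone="us-central1-a", instance_resource=instance
)
```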
Common Uses
Training workloads need to run as highly synchronized jobs across thousands of nodes in tightly coupled clusters, where a single degraded node can disrupt an entire job and delay time-to-market.
We want to make it extremely easy for customers to deploy and scale training workloads on Google Cloud.
To create an AI cluster, get started with one of our tutorials.
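Before the tutorials, a brief sketch of why training is so sensitive to a single bad node: every step ends in a collective all-reduce that each replica on each host must join. This is an illustrative JAX example, not one of the tutorials; all names and sizes are placeholders.

```python
from functools import partial

import jax
import jax.numpy as jnp

# On a multi-host cluster, join the job's collective group first:
# jax.distributed.initialize()

@partial(jax.pmap, axis_name="i")  # one replica per local device
def train_step(w, batch):
    grads = jax.grad(lambda w: jnp.mean((batch @ w) ** 2))(w)
    # All-reduce across every replica on every host. The step cannot
    # finish until all nodes respond, which is why one degraded node
    # can stall the whole job.
    return w - 0.01 * jax.lax.pmean(grads, axis_name="i")

n = jax.local_device_count()
w = jnp.stack([jnp.ones((4, 2))] * n)  # replicated weights
batch = jnp.ones((n, 8, 4))            # one data shard per device
w = train_step(w, batch)
print(w.shape)
```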
"We need GPUs to generate responses to users' messages. And as we get more users on our platform, we need more GPUs to serve them. So on Google Cloud, we can experiment to find what is the right platform for a particular workload. It's great to have that flexibility to choose which solutions are most valuable." Myle Ott, Founding Engineer, Character.AI
Google Cloud provides images that contain common operating systems, frameworks, libraries, and drivers. AI Hypercomputer optimizes these pre-configured images to support your AI workloads.
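For instance, the pre-built Deep Learning VM images live in the public image project deeplearning-platform-release; a minimal sketch of browsing them with the Compute Engine Python client (the "pytorch" filter is just an example):

```python
from google.cloud import compute_v1

# List Google's published Deep Learning VM images, which bundle an OS,
# GPU drivers, and framework installs.
client = compute_v1.ImagesClient()
for image in client.list(project="deeplearning-platform-release"):
    if "pytorch" in image.name:
        print(image.name, image.family)
```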
"Working with Google Cloud to incorporate generative AI allows us to create a bespoke travel concierge within our chatbot. We want our customers to go beyond planning a trip and help them curate their unique travel experience." Martin Brodbeck, CTO, Priceline
Inference is quickly becoming more diverse and complex, evolving across three main areas.
"Our experimental results show that Cloud TPU v5e is the most cost-efficient accelerator on which to run large-scale inference for our model. It delivers 2.7x greater performance per dollar than G2 and 4.2x greater performance per dollar than A2 instances." Domenic Donato,
VP of Technology, AssemblyAI
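For context on how such comparisons are made: "performance per dollar" is simply throughput divided by hourly price. The sketch below shows only that arithmetic; the throughput and price figures are invented placeholders, not AssemblyAI's or Google's published numbers.

```python
# Illustrative arithmetic only; all numbers below are made up.
def perf_per_dollar(throughput_per_hour: float, price_per_hour: float) -> float:
    return throughput_per_hour / price_per_hour

tpu_v5e = perf_per_dollar(throughput_per_hour=10_000, price_per_hour=1.2)
g2 = perf_per_dollar(throughput_per_hour=6_000, price_per_hour=1.9)
print(f"relative perf/$: {tpu_v5e / g2:.1f}x")  # prints "2.6x" for these inputs
```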
FAQ
What's the easiest way to get started with AI on Google Cloud?
For most customers, a managed AI platform like Vertex AI is the easiest way to get started, because it has all of the tools, templates, and models built in, and it is powered by AI Hypercomputer under the hood in a way that is optimized on your behalf. If you prefer to configure and optimize every component of your infrastructure, you can access AI Hypercomputer's components as infrastructure and assemble them in a way that meets your needs.
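As a sketch of the managed path, the Vertex AI Python SDK (pip install google-cloud-aiplatform) deploys a model to an endpoint in a few calls. The project, region, bucket, and container values below are placeholders.

```python
from google.cloud import aiplatform

# Placeholder project and region.
aiplatform.init(project="my-project", location="us-central1")

# Upload a model artifact and deploy it; Vertex AI provisions and manages
# the underlying infrastructure on your behalf.
model = aiplatform.Model.upload(
    display_name="my-model",
    artifact_uri="gs://my-bucket/model/",  # placeholder bucket path
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest"
    ),
)
endpoint = model.deploy(machine_type="n1-standard-4")
```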
How is AI Hypercomputer different from using individual Google Cloud services?
While individual services offer specific capabilities, AI Hypercomputer provides an integrated system where hardware, software, and consumption models are designed to work optimally together. This integration delivers system-level efficiencies in performance, cost, and time-to-market that are harder to achieve by stitching together disparate services. It simplifies complexity and provides a holistic approach to AI infrastructure.
Can I use AI Hypercomputer in hybrid or multi-cloud environments?
Yes, AI Hypercomputer is designed with flexibility in mind. Technologies like Cross-Cloud Interconnect provide high-bandwidth connectivity to on-premises data centers and other clouds, facilitating hybrid and multi-cloud AI strategies. We operate with open standards and integrate popular third-party software so you can build solutions that span multiple environments and change services as you please.
How is AI Hypercomputer secured?
Security is a core aspect of AI Hypercomputer. It benefits from Google Cloud's multi-layered security model. Specific features include Titan security microcontrollers (ensuring systems boot from a trusted state), RDMA Firewall (for zero-trust networking between TPUs/GPUs during training), and integration with solutions like Model Armor for AI safety. These are complemented by robust infrastructure security policies and principles like the Secure AI Framework.
Is AI Hypercomputer only for large-scale workloads?
No. AI Hypercomputer can be used for workloads of any size. Smaller workloads still realize all the benefits of an integrated system, such as efficiency and simplified deployment, and AI Hypercomputer supports customers as their businesses scale, from small proofs of concept and experiments to large-scale production deployments.
Are there examples or reference implementations to start from?
Yes, we are building a library of recipes on GitHub. You can also use Cluster Toolkit for pre-built cluster blueprints.
AI-optimized hardware
- Compute: Access Google Cloud TPUs (Trillium), NVIDIA GPUs (Blackwell), and CPUs (Axion). This allows for optimization based on specific workload needs for throughput, latency, or TCO.
- Storage
- Networking

Leading software and open frameworks

Consumption models