

Heroku AI Expands Model Offering with OpenAI’s gpt-oss-120b

Start building with OpenAI’s new open-weight model, gpt-oss-120b, now available on Heroku Managed Inference and Agents. This gives developers a powerful, transparent, and flexible way to build and deploy AI applications on the platform they already trust. Access gpt-oss-120b with our OpenAI-compatible chat completions API, which you can drop into any OpenAI-compatible SDK or framework.
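Because the endpoint speaks the standard OpenAI chat completions protocol, calling it needs nothing beyond an HTTP client. Here is a minimal sketch in Python; the `INFERENCE_URL` and `INFERENCE_KEY` config var names are assumptions for illustration, so check the values the add-on actually sets on your app.

```python
"""Minimal sketch: calling gpt-oss-120b through an OpenAI-compatible
chat completions endpoint. INFERENCE_URL / INFERENCE_KEY are assumed
config var names -- verify them against your own add-on's config."""
import json
import os
import urllib.request


def build_request(prompt: str) -> dict:
    # Standard OpenAI chat-completions payload; any OpenAI-compatible
    # SDK or framework produces this same shape.
    return {
        "model": "gpt-oss-120b",
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str) -> str:
    url = os.environ["INFERENCE_URL"].rstrip("/") + "/v1/chat/completions"
    req = urllib.request.Request(
        url,
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Authorization": "Bearer " + os.environ["INFERENCE_KEY"],
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(chat("Summarize Mixture-of-Experts in one sentence."))
```

The same payload can be dropped into the official OpenAI SDK by pointing its `base_url` at your inference endpoint instead of api.openai.com.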

Why gpt-oss-120b matters for the AI community and enterprise teams

OpenAI has released gpt-oss-120b as part of its new family of open-weight models. This 120-billion-parameter model, built on a Mixture-of-Experts (MoE) architecture, is designed for a wide range of text generation and understanding tasks. It represents a significant step forward in making powerful AI more accessible to the developer community. Key features of gpt-oss-120b include:

  • Open Weight: As an open-weight model, developers can inspect the model’s architecture, understand its inner workings, and fine-tune it for specific use cases. The weights are released under a permissive Apache 2.0 license.
  • Architecture: The model uses a Mixture-of-Experts (MoE) architecture, which allows it to be highly performant while remaining computationally efficient. With 117 billion total parameters, it only activates 5.1 billion per token, enabling it to run on a single 80GB GPU.
  • Designed for Agentic Workflows: The model was built with tool use in mind, demonstrating strong capabilities in instruction following, function calling, and executing tasks like web searches and running Python code.
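Function calling in the OpenAI-compatible API works by declaring tools in the request and letting the model decide when to invoke one. The sketch below shows the request shape with a purely illustrative `get_weather` tool; the tool name and schema are assumptions, not part of any real API.

```python
# Sketch of an OpenAI-style function-calling request for gpt-oss-120b.
# The get_weather tool is hypothetical -- substitute your own tools.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

request = {
    "model": "gpt-oss-120b",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call a tool
}
```

When the model chooses to call the tool, the response carries a `tool_calls` entry with JSON arguments; your code executes the function and sends the result back in a follow-up message.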

Performance and benchmarks

According to OpenAI, the gpt-oss-120b model delivers performance that is competitive with, and in some cases exceeds, their proprietary o4-mini model. Early benchmark results show that gpt-oss-120b matches or surpasses o4-mini and other open-weight models such as DeepSeek and Qwen3.

Benchmark                gpt-oss-120b   OpenAI o4-mini   DeepSeek R1-0528   Qwen3-235B
MMLU                     90.0%          93.0%            85.0%              84.4%
AIME 2025 (with tools)   97.9%          99.5%            87.5%              92.3%
Codeforces (no tools)    2463 Elo       2719 Elo         1930 Elo           N/A
Total Parameters         117B           N/A              671B               235B
Active Parameters        5.1B           N/A              37B                22B

While official benchmarks are strong, early community feedback is still emerging, with some users reporting excellent results in reasoning, scientific research, and tool-assisted tasks.

gpt-oss-120b on Heroku: simplifying AI infrastructure

With Heroku Managed Inference and Agents, your team can:

  • Deploy gpt-oss-120b with zero infrastructure overhead
  • Call the model securely from your Heroku apps
  • Build agents that take action, run code, and fetch data in real time
  • Monitor and scale inference as usage grows

And because it’s built into the Heroku platform, your team avoids the cost and complexity of managing and provisioning inference.

Transparent pricing for scalable AI

The pricing for gpt-oss-120b is designed to allow you to scale your applications cost-effectively.

  • Input Tokens: $0.15 per million tokens
  • Output Tokens: $0.60 per million tokens
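At these rates, estimating a monthly bill is simple arithmetic. A quick sketch, using only the published per-million-token prices:

```python
# Cost estimate from the published gpt-oss-120b rates:
# $0.15 per million input tokens, $0.60 per million output tokens.
INPUT_RATE = 0.15 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.60 / 1_000_000  # dollars per output token


def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the total inference cost in dollars."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE


# Example: 2M input tokens and 500k output tokens
# -> $0.30 input + $0.30 output = $0.60 total
print(round(cost_usd(2_000_000, 500_000), 2))
```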

Get started today with gpt-oss-120b on Heroku

The gpt-oss-120b model is now available in the Heroku Managed Inference and Agents add-on, which can be added from the Elements Marketplace.

Ready to build? Get Started with Heroku Managed Inference and Agents today.

We look forward to seeing what you create.