Heroku AI Expands Model Offering with OpenAI's gpt-oss-120b
- Last Updated: August 20, 2025
Start building with OpenAI's new open-weight model, gpt-oss-120b, now available on Heroku Managed Inference and Agents. This gives developers a powerful, transparent, and flexible way to build and deploy AI applications on the platform they already trust. Access gpt-oss-120b with our OpenAI-compatible chat completions API, which you can drop into any OpenAI-compatible SDK or framework.
Why gpt-oss-120b matters for the AI community and enterprise teams
OpenAI has released gpt-oss-120b as part of its new family of open-weight models. This 120-billion-parameter model uses a Mixture-of-Experts (MoE) architecture and is designed for a wide range of text generation and understanding tasks. It represents a significant step forward in making powerful AI more accessible to the developer community. Key features of gpt-oss-120b include:
- Open Weight: As an open-weight model, developers can inspect the model's architecture, understand its inner workings, and fine-tune it for specific use cases. The weights are released under a permissive Apache 2.0 license.
- Architecture: The model uses a Mixture-of-Experts (MoE) architecture, which keeps it highly performant while remaining computationally efficient. Of its 117 billion total parameters, only 5.1 billion are active per token, enabling it to run on a single 80GB GPU.
- Designed for Agentic Workflows: The model was built with tool use in mind, demonstrating strong capabilities in instruction following, function calling, and executing tasks like web searches and running Python code.
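Because the API is chat-completions-compatible, tool use follows the standard function-calling shape. The sketch below builds an illustrative tool schema; the `get_weather` function and its fields are made-up examples, not part of any Heroku API.

```python
def make_tool(name: str, description: str, parameters: dict) -> dict:
    """Describe a callable function in the chat completions tools format."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,  # JSON Schema for the function's arguments
        },
    }

# A hypothetical tool the model could choose to call during an agentic run.
weather_tool = make_tool(
    "get_weather",
    "Look up current weather for a city.",
    {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)

# The tools list rides along with the normal chat payload:
payload = {
    "model": "gpt-oss-120b",
    "messages": [{"role": "user", "content": "What's the weather in Berlin?"}],
    "tools": [weather_tool],
}
```

When the model decides to use a tool, the response contains a tool call with JSON arguments matching the schema, which your application executes and feeds back as a follow-up message.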
Performance and benchmarks
According to OpenAI, the gpt-oss-120b model delivers performance that is competitive with, and in some cases exceeds, their proprietary o4-mini model. Early benchmark results show that gpt-oss-120b matches or surpasses o4-mini and other open-weight models such as DeepSeek R1 and Qwen3.
| Benchmark | gpt-oss-120b | OpenAI o4-mini | DeepSeek R1-0528 | Qwen3-235B |
| --- | --- | --- | --- | --- |
| MMLU | 90.0% | 93.0% | 85.0% | 84.4% |
| AIME 2025 (with tools) | 97.9% | 99.5% | 87.5% | 92.3% |
| Codeforces Elo (no tools) | 2463 | 2719 | 1930 | N/A |
| Total parameters | 117B | N/A | 671B | 235B |
| Active parameters | 5.1B | N/A | 37B | 22B |
Official benchmarks are strong, and early community feedback is promising, with users reporting excellent results in reasoning, scientific research, and tool-assisted tasks.
gpt-oss-120b on Heroku: simplifying AI infrastructure
With Heroku Managed Inference and Agents, your team can:
- Deploy gpt-oss-120b with zero infrastructure overhead
- Call the model securely from your Heroku apps
- Build agents that take action, run code, and fetch data in real time
- Monitor and scale inference as usage grows
And because it's built into the Heroku platform, your team avoids the cost and complexity of provisioning and managing inference infrastructure.
Transparent pricing for scalable AI
The pricing for gpt-oss-120b is designed to allow you to scale your applications cost-effectively.
- Input Tokens: $0.15 per million tokens
- Output Tokens: $0.60 per million tokens
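At those rates, estimating a bill is simple arithmetic. The sketch below uses the published per-million-token prices; the token volumes are made-up illustrative numbers.

```python
# Published per-million-token rates for gpt-oss-120b on Heroku.
INPUT_RATE = 0.15   # USD per 1M input tokens
OUTPUT_RATE = 0.60  # USD per 1M output tokens

def inference_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD for a given token volume."""
    return (input_tokens / 1_000_000) * INPUT_RATE + (output_tokens / 1_000_000) * OUTPUT_RATE

# e.g. a hypothetical month of 50M input and 10M output tokens:
monthly = inference_cost(50_000_000, 10_000_000)  # 50 * 0.15 + 10 * 0.60 = $13.50
```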
Get started today with gpt-oss-120b on Heroku
The gpt-oss-120b model is now available in the Heroku Managed Inference and Agents add-on, which can be added from the Elements Marketplace.
Ready to build? Get Started with Heroku Managed Inference and Agents today.
We look forward to seeing what you create.
- Tags: AI Model, Heroku AI, Managed Inference and Agents, openai