Heroku AI Expands Model Offering with OpenAI's gpt-oss-120b
- Last Updated: August 20, 2025
Start building with OpenAI's new open-weight model, gpt-oss-120b, now available on Heroku Managed Inference and Agents. This gives developers a powerful, transparent, and flexible way to build and deploy AI applications on the platform they already trust. Access gpt-oss-120b with our OpenAI-compatible chat completions API, which you can drop into any OpenAI-compatible SDK or framework.
Why gpt-oss-120b matters for the AI community and enterprise teams
OpenAI has released gpt-oss-120b as part of its new family of open-weight models. This 120-billion-parameter model uses a Mixture-of-Experts (MoE) architecture and is designed for a wide range of text generation and understanding tasks. It represents a significant step forward in making powerful AI more accessible to the developer community. Key features of gpt-oss-120b include:
- Open Weight: As an open-weight model, developers can inspect the model's architecture, understand its inner workings, and fine-tune it for specific use cases. The weights are released under a permissive Apache 2.0 license.
- Architecture: The model uses a Mixture-of-Experts (MoE) architecture, which keeps it highly performant while remaining computationally efficient. Of its 117 billion total parameters, only 5.1 billion are active per token, enabling it to run on a single 80GB GPU.
- Designed for Agentic Workflows: The model was built with tool use in mind, demonstrating strong capabilities in instruction following, function calling, and executing tasks like web searches and running Python code.
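Because the API is chat-completions-compatible, tool use follows the standard function-calling shape. The sketch below builds an illustrative tool schema; the `get_weather` function and its fields are made-up examples, not part of any Heroku API.

```python
def make_tool(name: str, description: str, parameters: dict) -> dict:
    """Describe a callable function in the chat completions tools format."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,  # JSON Schema for the function's arguments
        },
    }

# A hypothetical tool the model could choose to call during an agentic run.
weather_tool = make_tool(
    "get_weather",
    "Look up current weather for a city.",
    {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)

# The tools list rides along with the normal chat payload:
payload = {
    "model": "gpt-oss-120b",
    "messages": [{"role": "user", "content": "What's the weather in Berlin?"}],
    "tools": [weather_tool],
}
```

When the model decides to use a tool, the response contains a tool call with JSON arguments matching the schema, which your application executes and feeds back as a follow-up message.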
Performance and benchmarks
According to OpenAI, the gpt-oss-120b model delivers performance that is competitive with, and in some cases exceeds, their proprietary o4-mini model. Early benchmark results show that gpt-oss-120b matches or surpasses o4-mini and other open-weight models such as DeepSeek R1 and Qwen3.
| Benchmark | gpt-oss-120b | OpenAI o4-mini | DeepSeek R1-0528 | Qwen3-235B |
| --- | --- | --- | --- | --- |
| MMLU | 90.0% | 93.0% | 85.0% | 84.4% |
| AIME 2025 (with tools) | 97.9% | 99.5% | 87.5% | 92.3% |
| Codeforces Elo (no tools) | 2463 | 2719 | 1930 | N/A |
| Total parameters | 117B | N/A | 671B | 235B |
| Active parameters | 5.1B | N/A | 37B | 22B |
Official benchmarks are strong, and early community feedback is promising, with users reporting excellent results in reasoning, scientific research, and tool-assisted tasks.
gpt-oss-120b on Heroku: simplifying AI infrastructure
With Heroku Managed Inference and Agents, your team can:
- Deploy gpt-oss-120b with zero infrastructure overhead
- Call the model securely from your Heroku apps
- Build agents that take action, run code, and fetch data in real time
- Monitor and scale inference as usage grows
And because it's built into the Heroku platform, your team avoids the cost and complexity of provisioning and managing inference infrastructure.
Transparent pricing for scalable AI
The pricing for gpt-oss-120b is designed to allow you to scale your applications cost-effectively.
- Input Tokens: $0.15 per million tokens
- Output Tokens: $0.60 per million tokens
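At those rates, estimating a bill is simple arithmetic. The sketch below uses the published per-million-token prices; the token volumes are made-up illustrative numbers.

```python
# Published per-million-token rates for gpt-oss-120b on Heroku.
INPUT_RATE = 0.15   # USD per 1M input tokens
OUTPUT_RATE = 0.60  # USD per 1M output tokens

def inference_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD for a given token volume."""
    return (input_tokens / 1_000_000) * INPUT_RATE + (output_tokens / 1_000_000) * OUTPUT_RATE

# e.g. a hypothetical month of 50M input and 10M output tokens:
monthly = inference_cost(50_000_000, 10_000_000)  # 50 * 0.15 + 10 * 0.60 = $13.50
```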
Get started today with gpt-oss-120b on Heroku
The gpt-oss-120b model is now available in the Heroku Managed Inference and Agents add-on, which can be added from the Elements Marketplace.
Ready to build? Get Started with Heroku Managed Inference and Agents today.
We look forward to seeing what you create.
- Tags: AI Model, Heroku AI, Managed Inference and Agents, openai