Managed Inference and Agents API Model Cards
Last updated August 19, 2025
Table of Contents
Our model cards contain documentation for each available AI model.
Available Models
The Heroku Managed Inference and Agent add-on is hosted in two regions: us
and eu
. However, the add-on can be provisioned and accessed from apps in any Heroku region.
Each region offers slightly different models.
Region: us
Model Documentation | Type | API Endpoint | Model Source | Description |
---|---|---|---|---|
Claude 4 Sonnet | text → text |
v1/chat/completions | Anthropic | A state-of-the-art large language model that supports chat and tool-calling. |
Claude 3.7-sonnet | text → text |
v1/chat/completions | Anthropic | A state-of-the-art large language model that supports chat and tool-calling. |
Claude 3.5 Sonnet Latest | text → text |
v1/chat/completions | Anthropic | A state-of-the-art large language model that supports chat and tool-calling. |
Claude 3.5 Haiku | text → text |
v1/chat/completions | Anthropic | A faster, more affordable large language model that supports chat and tool-calling. |
Amazon Nova Lite | text → text |
v1/chat/completions | Amazon | A fast and cost-effective large language model. |
Amazon Nova Pro | text → text |
v1/chat/completions | Amazon | A high-performance large language model designed for complex tasks. |
Cohere Embed Multilingual | text → embedding |
v1/embeddings | Cohere | A state-of-the-art embedding model that supports multiple languages. This model is helpful for developing RAG (Retrieval Augmented Generation) search. |
Stable Image Ultra | text → image |
v1/images/generations | Stability AI | A state-of-the-art diffusion (image generation) model. |
Region: eu
Model Documentation | Type | API Endpoint | Model Source | Description |
---|---|---|---|---|
Claude 4 Sonnet | text → text |
v1/chat/completions | Anthropic | A state-of-the-art large language model that supports chat and tool-calling. |
Claude 3.7 Sonnet | text → text |
v1/chat/completions | Anthropic | A state-of-the-art large language model that supports chat and tool-calling. |
Claude 3 Haiku | text → text |
v1/chat/completions | Anthropic | A faster, more affordable large language model that supports chat and tool-calling. |
Amazon Nova Lite | text → text |
v1/chat/completions | Amazon | A fast and cost-effective large language model. |
Amazon Nova Pro | text → text |
v1/chat/completions | Amazon | A high-performance large language model designed for complex tasks. |
Cohere Embed Multilingual | text → embedding |
v1/embeddings | Cohere | A state-of-the-art embedding model that supports multiple languages. This model is helpful for developing RAG (Retrieval Augmented Generation) search. |