AI Models
Explore available models and their capabilities
Inception
Released February 2026
Mercury 2
Ultra-fast reasoning, diffusion LLM, 1000+ tok/s
Capabilities: Reasoning, Tool Use
About this model
Mercury 2 is the first reasoning diffusion LLM (dLLM): rather than decoding autoregressively one token at a time, it generates and refines multiple tokens in parallel, achieving over 1,000 tokens per second on standard GPUs. It is positioned as 5x faster than competing speed-optimized models at lower cost, with tunable reasoning effort levels and native tool use support.
Context Window: 128K
Input Cost: $0.25/M tokens
Output Cost: $0.75/M tokens
Input Types: Text
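The listed rates and throughput make per-request cost and latency easy to estimate. The sketch below is illustrative only; the function name and rounding are mine, and the figures are taken directly from the card above (input $0.25 and output $0.75 per million tokens, ~1,000 tokens per second).

```python
def estimate_request(input_tokens: int, output_tokens: int,
                     input_cost_per_m: float = 0.25,
                     output_cost_per_m: float = 0.75,
                     tokens_per_sec: float = 1000.0) -> dict:
    """Estimate cost (USD) and generation time (s) for one request,
    using the listed Mercury 2 rates and throughput."""
    cost = (input_tokens * input_cost_per_m
            + output_tokens * output_cost_per_m) / 1_000_000
    gen_time = output_tokens / tokens_per_sec  # ignores network/queue latency
    return {"cost_usd": round(cost, 6), "gen_seconds": round(gen_time, 3)}

# A 10K-token prompt with a 2K-token response:
print(estimate_request(10_000, 2_000))
# → {'cost_usd': 0.004, 'gen_seconds': 2.0}
```

So a fairly large request costs well under a cent and, at the advertised throughput, completes in about two seconds of pure generation time.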