Mistral Medium 3.5

A 128B model for coding, reasoning, and long tasks

Foundation Models · Paris, France · Series A · Founded 2023 · Raised $385M · 128B dense model · 256k context window · Configurable reasoning effort per request · 77.6% on SWE-Bench Verified · Open weights on HuggingFace under a modified MIT license · Self-hostable on 4 GPUs · Powers Vibe remote coding agents and Le Chat Work mode (Pro/Team/Enterprise plans) · Available on NVIDIA build.nvidia.com and as an NIM container

Our Take

Mistral AI just dropped Medium 3.5, and it makes a strong case against juggling specialized models. It is a single 128-billion-parameter dense model that handles instruction-following, reasoning, and coding in one set of weights. That's it. One model. No switching between specialized variants, no pipeline headaches.

Here's the number that should make competitors nervous: a 256,000-token context window, enough to hold a large codebase or a full documentation set in a single prompt. Reasoning effort is configurable per request, so you can dial it up when you need deep thinking and dial it down when you just need fast answers. That is sensible resource allocation for anyone watching cost and latency.

The pricing is $1.50 per million input tokens and $7.50 per million output tokens. For a 128B dense model that competes with models 3x its size? That's a steal. Arthur Mensch, Guillaume Lample, and Timothée Lacroix built this: yes, the same trio who left Google DeepMind (Mensch) and Meta (Lample and Lacroix) to prove that France exports more than bread and wine. Mistral keeps releasing models that punch way above their weight class, and Medium 3.5 might be their best knockout yet.

A 128B dense model that merges coding, reasoning, and instruction-following in one set of weights, with a 256k context window and configurable reasoning effort. Open weights are available on HuggingFace for engineers and teams running self-hosted inference.

Problem It Solves

Most frontier-class models either require massive infrastructure to self-host or lock users into proprietary APIs, limiting accessibility and cost control for agentic pipelines.
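
To put the self-hosting claim in rough numbers, the sketch below estimates how much memory the weights alone would occupy per GPU for a 128B-parameter model split across 4 GPUs. The listing gives the parameter count and GPU count but not the numeric precision or GPU type, so the precisions shown are assumptions, as is the even split of weights across devices. KV cache and activation memory are ignored, so real requirements will be higher.

```python
# Parameter count and GPU count come from the listing.
PARAMS = 128e9
NUM_GPUS = 4

# Bytes per parameter for a few common precisions (assumed; the listing
# does not say which precision the 4-GPU figure refers to).
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_gb_per_gpu(precision: str, num_gpus: int = NUM_GPUS) -> float:
    """Weights-only memory per GPU in GB (1 GB = 1e9 bytes), assuming an even split."""
    total_bytes = PARAMS * BYTES_PER_PARAM[precision]
    return total_bytes / num_gpus / 1e9


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: {weight_gb_per_gpu(precision):.0f} GB per GPU (weights only)")
```

Under these assumptions, 16-bit weights come to 64 GB per GPU before any cache or activation overhead, which suggests why lower precision or high-memory GPUs matter for a 4-GPU deployment.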

Target Customer

Backend and ML engineers evaluating open-weight alternatives to proprietary frontier models for agentic pipelines, coding tools, or self-hosted inference.

Use Cases

Agentic pipelines, Coding tasks, Long-horizon reasoning tasks, Self-hosted inference, Fine-tuning and auditing, On-prem deployment

Pricing Details

API pricing is $1.50 per million input tokens and $7.50 per million output tokens.
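
As a quick illustration of what these rates mean per request, here is a minimal cost estimator. The two prices are taken from the listing; the function name and the example workload (a 200,000-token prompt with a 4,000-token reply) are hypothetical, and the sketch does not model any discounts, caching, or how reasoning tokens are billed.

```python
# Listed API prices in USD per million tokens.
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 7.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a long prompt near the context limit, short reply.
    cost = estimate_cost_usd(input_tokens=200_000, output_tokens=4_000)
    print(f"Estimated cost: ${cost:.2f}")
```

For that example, the input side contributes $0.30 and the output side $0.03, so long-context prompts dominate the bill even though output tokens cost five times more each.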

Differentiator

First "merged" flagship model combining instruction-following, reasoning, and coding in one set of weights rather than split across specialized variants. Configurable reasoning effort allows cost-conscious depth control per call.

Traction

User Count: 114 followers · Notable Metrics: 77.6% on SWE-Bench Verified; Day Rank #10