Mistral Medium 3.5
A 128B model for coding, reasoning, and long tasks

Our Take
Mistral AI just dropped Medium 3.5, and it puts real pressure on the competition. It's a single 128-billion-parameter dense model that handles instruction-following, reasoning, and coding in one set of weights. That's it. One model. No switching between specialized models, no pipeline headaches. Just one unified model doing all three jobs.
Here's the thing that should make every AI company nervous: a 256,000-token context window. That's enough room for a large slice of your codebase and its documentation in a single context. Reasoning effort is configurable too: you can dial it up when you need deep thinking and dial it down when you just need fast answers. It's a sensible way to trade speed and cost against depth, one request at a time.
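To get a feel for what 256k tokens actually holds, here is a rough back-of-the-envelope sketch. It assumes about 4 characters per token, which is a common rule of thumb and not a property of Mistral's tokenizer; the function names and the 8,000-token output reserve are made up for illustration.

```python
CONTEXT_WINDOW = 256_000   # tokens, as stated for Medium 3.5
CHARS_PER_TOKEN = 4        # assumption: rough heuristic, not Mistral's real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count of a string, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(texts, reserve_for_output=8_000):
    """Return (fits, total_tokens) for a list of document strings.

    reserve_for_output is a hypothetical allowance for the model's reply.
    """
    total = sum(estimate_tokens(t) for t in texts)
    return total + reserve_for_output <= CONTEXT_WINDOW, total


if __name__ == "__main__":
    docs = ["x" * 400_000, "y" * 300_000]   # stand-ins for source files
    ok, total = fits_in_context(docs)
    print(f"~{total} tokens, fits: {ok}")
```

For a real budget, count tokens with the model's own tokenizer instead of the character heuristic.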
The pricing is $1.50 per million input tokens and $7.50 per million output tokens. For a 128B dense model positioned against models 3x its size? That's a steal. Mistral was founded by Arthur Mensch, Guillaume Lample, and Timothée Lacroix. Mensch left Google DeepMind, Lample and Lacroix left Meta, and together they set out to prove that France is good for more than bread and wine. Mistral keeps releasing models that punch way above their weight class, and Medium 3.5 might be their best knockout yet.
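If you want to see what those prices mean per request, the arithmetic is simple. This is a minimal sketch using only the two listed rates; the function name and the example token counts are hypothetical, and it ignores any discounts or extra fees a provider might apply.

```python
# Listed prices, in USD per million tokens.
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 7.50


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: a full 256k-token context in, 4k tokens out.
    cost = estimate_cost(256_000, 4_000)
    print(f"${cost:.3f}")  # prints $0.414
```

So even a request that fills the whole context window comes in at well under a dollar.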
A 128B dense model that merges coding, reasoning, and instruction-following in one set of weights, with a 256k context window and configurable reasoning effort. Open weights are available on Hugging Face for engineers and teams running self-hosted inference.
The people behind Mistral Medium 3.5
Rohan Chaubey (Maker)

