Mistral 3 is not a single model with a single price. It is a family. Mistral’s December 2, 2025 release introduced Mistral Large 3 plus three Ministral 3 models in 14B, 8B, and 3B sizes. That matters because teams searching for “Mistral 3 pricing” are usually trying to answer a more practical question: Which version gives me the best economics for my workload?
The answer depends on whether you are paying for API usage, using Mistral’s broader product plans, or self-hosting the open models. The API prices are clear. The real budgeting work is deciding when a cheaper model is good enough and when the step up to Large 3 saves money by producing better output in fewer retries.
The short answer
On Mistral’s public model cards, the current API prices for the main Mistral 3 family are:
| Model | Input price | Output price | Best fit |
|---|---|---|---|
| Mistral Large 3 | $0.50 / 1M tokens | $1.50 / 1M tokens | Higher-end multimodal and agentic workloads |
| Ministral 3 14B | $0.20 / 1M tokens | $0.20 / 1M tokens | Edge and local-friendly workloads that need extra headroom |
| Ministral 3 8B | $0.15 / 1M tokens | $0.15 / 1M tokens | Balanced cost and capability for many production tasks |
| Ministral 3 3B | $0.10 / 1M tokens | $0.10 / 1M tokens | Lowest-cost lightweight inference and embedded use cases |
That pricing ladder is the real story. Mistral gives teams a fairly clean path from ultra-cheap local or edge-friendly inference up to a more capable open frontier model.
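To make the ladder concrete, here is a small sketch that turns the table above into a per-request cost calculator. The prices come from the table; the model identifiers are illustrative labels, not necessarily the exact API model names.

```python
# Per-million-token prices from the table above, USD.
# Model keys are illustrative, not guaranteed API identifiers.
PRICES = {
    "mistral-large-3": {"input": 0.50, "output": 1.50},
    "ministral-3-14b": {"input": 0.20, "output": 0.20},
    "ministral-3-8b":  {"input": 0.15, "output": 0.15},
    "ministral-3-3b":  {"input": 0.10, "output": 0.10},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single request at the listed token prices."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a 2,000-token prompt with a 500-token answer.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 2_000, 500):.6f}")
```

Even at this toy scale the spread is visible: the same request costs seven times more on Large 3 than on the 3B model, which is exactly why per-step model choice matters.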
Why the pricing is more interesting than it first looks
Mistral positions all of these models as part of one open multimodal and multilingual family. In the launch announcement, the company says the models are released under the Apache 2.0 license. That creates a different budgeting discussion from many closed-model vendors.
With Mistral 3, you are not only choosing an API price. You are also choosing an operating model:
- Use the API if you want speed, low setup friction, and simple variable spend.
- Self-host if you want tighter cost control at scale, local deployment, or more infrastructure ownership.
- Mix both if you want cheap models for high-volume flows and a stronger model for harder turns.
That flexibility is why the raw token price is only step one.
How to choose between Mistral Large 3 and Ministral 3
Mistral Large 3
Mistral Large 3 is the premium tier in this family. Its API price is still relatively restrained compared with many frontier-class models, but it is clearly the expensive option inside Mistral’s own stack. The output price is especially important: at $1.50 per million output tokens, it is three times the input price. That means verbose workflows, long reports, and high-token generations can raise total spend faster than teams expect.
Large 3 makes the most sense when model quality meaningfully changes the business outcome: harder reasoning, better multimodal interpretation, stronger multilingual performance, or long-horizon agent work where retries are costly.
Ministral 3 14B
The 14B model is often the most interesting point in the lineup. It is much cheaper than Large 3, but it still offers enough headroom for teams that need real capability without paying frontier-model rates on every request. For many agent flows, especially internal tooling and bounded automation, this may be the economic sweet spot.
Ministral 3 8B
The 8B tier is the practical middle. It is cheap enough to use broadly, but not so constrained that it only fits toy workloads. If your main goal is getting useful automation into production without overspending, this is the model many teams should test first.
Ministral 3 3B
The 3B model is the budget choice. It is attractive for lightweight classification, extraction, routing, UI helpers, and other flows where you care more about latency and unit economics than deep reasoning. It can also be useful as the first stage in a cascaded system that only escalates harder tasks upward.
The budget trap: paying for the wrong model on the wrong step
The biggest mistake with Mistral 3 pricing is assuming one model should do everything. In practice, the best economics often come from routing.
For example:
- Use Ministral 3 3B or 8B for triage, tagging, extraction, and first-pass drafting.
- Use Ministral 3 14B for medium-complexity agent tasks where quality matters but frontier performance is not essential.
- Escalate only the hard turns to Mistral Large 3.
This kind of tiered architecture often lowers spend more effectively than trying to optimize prompts alone.
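A tiered router like the one described above can be sketched in a few lines. The difficulty heuristic and thresholds here are placeholders you would replace with your own signal (a classifier, task metadata, or a confidence score); nothing in this sketch is a Mistral API feature.

```python
# Illustrative tiered router: try the cheapest adequate model,
# escalate only harder turns. All thresholds are assumptions.
from dataclasses import dataclass

@dataclass
class Tier:
    model: str
    max_difficulty: float  # route here if task difficulty <= this value

LADDER = [
    Tier("ministral-3-3b", 0.3),   # triage, tagging, extraction
    Tier("ministral-3-14b", 0.7),  # medium-complexity agent tasks
    Tier("mistral-large-3", 1.0),  # hard reasoning or multimodal turns
]

def score_difficulty(task: str) -> float:
    """Toy heuristic: longer, multi-step prompts score as harder."""
    steps = task.count("then")
    return min(1.0, 0.1 * len(task.split()) / 20 + 0.2 * steps)

def pick_model(task: str) -> str:
    d = score_difficulty(task)
    for tier in LADDER:
        if d <= tier.max_difficulty:
            return tier.model
    return LADDER[-1].model

print(pick_model("tag this support ticket"))  # lands on the cheapest tier
```

The design point is that escalation is explicit and auditable: you can log which tier handled each turn and verify that only a small fraction of traffic actually reaches Large 3.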
How Le Chat and workspace plans fit into the picture
Mistral’s public pricing page adds another layer that can confuse buyers. On the product side, Mistral lists Pro at $14.99 per month and Team at $24.99 per user per month, with Enterprise as custom pricing. Those plans matter if your team is using Le Chat and the broader Mistral workspace experience, including features like search, projects, storage, and Mistral Vibe access.
But those subscription prices are not the same thing as API budgeting. If you are building products, internal tools, or programmatic agents, the token prices on the model cards are still the numbers that matter most. Teams often blur these two pricing layers and end up underestimating their real application spend.
When self-hosting changes the economics
Because Mistral 3 is open, some teams will compare API spend with self-hosting. That comparison can be smart, but only at the right scale.
Self-hosting can improve economics when:
- You have steady, predictable throughput.
- You already run GPU infrastructure or have strong cloud-inference procurement.
- You need local or edge deployment for latency, privacy, or reliability reasons.
- You want tighter control over routing, caching, and model customization.
Self-hosting is usually a mistake when:
- Your workload is still volatile or early-stage.
- You do not have internal ML platform or inference expertise.
- You are optimizing for speed to production, not maximum infrastructure efficiency.
For many teams, the right answer is to start with the API, measure demand, and only then decide whether the volume justifies a self-hosted path.
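One way to ground that decision is a break-even calculation: at what monthly volume does a fixed self-hosting cost beat per-token API spend? The $2,500/month infrastructure figure below is a made-up placeholder; substitute your own GPU or cloud-inference quote.

```python
# Rough break-even sketch between API spend and a fixed self-hosted cost.
# The infra cost is an assumed placeholder, not a real quote.
API_PRICE_PER_M = 0.15        # e.g. Ministral 3 8B, same input/output price
SELF_HOST_MONTHLY = 2_500.0   # assumed fixed infra cost, USD per month

def api_monthly_cost(tokens_per_month: float) -> float:
    """API spend in USD at a flat per-million-token price."""
    return tokens_per_month / 1_000_000 * API_PRICE_PER_M

# Volume at which the fixed cost equals the API bill.
break_even_tokens = SELF_HOST_MONTHLY / API_PRICE_PER_M * 1_000_000
print(f"Break-even at ~{break_even_tokens / 1e9:.1f}B tokens/month")
```

At these assumed numbers the break-even sits in the tens of billions of tokens per month, which is why self-hosting only pays off for genuinely high, steady throughput.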
What teams should budget in practice
If you need a simple planning framework, start here:
- Estimate monthly input and output tokens separately. Large 3 has a materially higher output price than its input price, so long responses matter.
- Map workloads by difficulty. Do not budget as if every request needs the best model.
- Plan for retries and tool usage. Real agents are not single turns.
- Decide whether subscriptions and API use will coexist. Many teams will use both.
- Review architecture before scaling. A routing change can save more than a prompt tweak.
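The framework above can be sketched as a simple spreadsheet-style estimator: input and output tokens tracked separately, workloads mapped to models, and a retry multiplier applied per workload. The request volumes and retry factors below are illustrative assumptions, not benchmarks.

```python
# Hypothetical monthly budget estimator. Prices come from the model
# cards; all volumes and retry factors are illustrative assumptions.
PRICES = {  # USD per 1M tokens: (input, output)
    "mistral-large-3": (0.50, 1.50),
    "ministral-3-8b": (0.15, 0.15),
}

workloads = [
    # (model, requests/month, input toks/req, output toks/req, retry factor)
    ("ministral-3-8b", 500_000, 1_500, 300, 1.1),
    ("mistral-large-3", 20_000, 3_000, 1_200, 1.3),
]

def monthly_spend(items) -> float:
    """Sum per-workload spend, inflating volume by the retry factor."""
    total = 0.0
    for model, n, tin, tout, retry in items:
        pin, pout = PRICES[model]
        total += n * retry * (tin * pin + tout * tout and tin * 0 or 0)  # placeholder
    return total

def monthly_spend(items) -> float:
    """Sum per-workload spend, inflating volume by the retry factor."""
    total = 0.0
    for model, n, tin, tout, retry in items:
        pin, pout = PRICES[model]
        total += n * retry * (tin * pin + tout * pout) / 1_000_000
    return total

print(f"Estimated monthly spend: ${monthly_spend(workloads):,.2f}")
```

Note how the high-volume 8B workload and the low-volume Large 3 workload end up in the same rough cost band: output-heavy Large 3 turns are expensive enough that even 4% of the request volume produces a comparable bill.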
The bottom line
Mistral 3 pricing is attractive because it gives businesses a full ladder instead of one expensive default. The low end of the family is genuinely cheap enough for high-volume production work, while Mistral Large 3 stays credible for teams that need a stronger open model without moving straight to closed frontier pricing.
The practical takeaway is simple: do not ask whether Mistral 3 is cheap or expensive in the abstract. Ask which member of the Mistral 3 family should handle each part of your workflow. That is where the real savings are.