Anthropic released Claude Opus 4.7 on April 16, 2026, and one of the first things many teams noticed was that the headline API pricing looked familiar. Base token rates did not rise above the previous Opus generation's level, which makes Opus 4.7 easy to misread as a simple “same price, better model” upgrade.
It is not that simple. The real budget story includes Anthropic’s new tokenizer, prompt-caching rules, batch discounts, US-only data residency uplift, tool-specific charges, and managed-agent runtime pricing. For some teams, Opus 4.7 will feel like a bargain. For others, the effective bill can rise even when the posted per-token price looks unchanged.
Official Claude Opus 4.7 pricing
Anthropic’s current first-party pricing for Claude Opus 4.7 is:
- Base input: $5 per million tokens
- 5-minute cache write: $6.25 per million tokens
- 1-hour cache write: $10 per million tokens
- Cache hit / refresh: $0.50 per million tokens
- Output: $25 per million tokens
Anthropic also offers Batch API pricing at half the normal input and output rate, which brings Opus 4.7 down to $2.50 per million input tokens and $12.50 per million output tokens for asynchronous workloads.
For teams that need US-only inference on the first-party Claude API, Anthropic adds a 1.1x pricing multiplier across token categories.
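Putting those rates together, a rough per-request estimator makes the modifiers concrete. This is a sketch using the figures above; the function and constant names are illustrative, not part of any official Anthropic SDK.

```python
# Rough per-request cost estimator for Claude Opus 4.7, using the
# first-party rates quoted above (USD per million tokens).
# Illustrative helper, not an official Anthropic API.

INPUT_RATE = 5.00
OUTPUT_RATE = 25.00
BATCH_DISCOUNT = 0.5      # Batch API halves input and output rates
US_ONLY_MULTIPLIER = 1.1  # US-only inference uplift

def request_cost(input_tokens, output_tokens, batch=False, us_only=False):
    """Estimated token cost in USD for a single request."""
    cost = (input_tokens / 1e6) * INPUT_RATE + (output_tokens / 1e6) * OUTPUT_RATE
    if batch:
        cost *= BATCH_DISCOUNT
    if us_only:
        cost *= US_ONLY_MULTIPLIER
    return cost

# A 200k-input / 40k-output request under each billing mode:
print(round(request_cost(200_000, 40_000), 2))                # 2.0
print(round(request_cost(200_000, 40_000, batch=True), 2))    # 1.0
print(round(request_cost(200_000, 40_000, us_only=True), 2))  # 2.2
```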
Why Opus 4.7 can cost more even at the same posted rate
The most important footnote on Anthropic’s pricing page is easy to miss: Claude Opus 4.7 uses a new tokenizer, and Anthropic says it may use up to 35% more tokens for the same fixed text.
That means “same token price” does not always equal “same request price.” If your workload is prompt-heavy, document-heavy, or built around large static instructions, more tokens per request can translate into a higher effective bill even before you account for tool use or runtime costs.
This is the core budgeting mistake to avoid: compare cost per finished task, not just cost per million tokens.
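The arithmetic behind that warning is worth spelling out. Here is a minimal sketch, assuming the worst-case 35% inflation Anthropic cites; real inflation varies by content.

```python
# Effect of tokenizer inflation on input cost for the same fixed text.
# The 1.35 factor is the worst case Anthropic quotes ("up to 35% more
# tokens"); actual inflation depends on the content.

INPUT_RATE = 5.00  # USD per million input tokens

def inflated_input_cost(old_token_count, inflation=1.35):
    """Input cost before and after the tokenizer change, in USD."""
    before = old_token_count / 1e6 * INPUT_RATE
    after = old_token_count * inflation / 1e6 * INPUT_RATE
    return before, after

before, after = inflated_input_cost(1_000_000)
print(round(before, 2), round(after, 2))  # 5.0 6.75
```

Same posted rate, same text, up to 35% more dollars on the input side.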
Prompt caching changes the economics
Opus 4.7’s prompt-caching rules can either save money or add cost depending on how you use them.
Anthropic prices cache operations relative to base input:
- 5-minute cache write: 1.25x base input price
- 1-hour cache write: 2x base input price
- Cache read: 0.1x base input price
That means caching pays off when the same large context gets reused enough times. If you are repeatedly sending the same repository instructions, policy blocks, long documents, or durable agent context, cached reads can pull costs down quickly. If you write large caches that rarely get reused, caching can add overhead instead of saving money.
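To see where the break-even sits, here is a sketch comparing N identical calls with and without caching, using the multipliers above. The helper names are ours.

```python
# Break-even sketch for prompt caching. The first cached call pays the
# 1.25x write price; subsequent calls pay the 0.1x read price.
# Illustrative helpers built on the multipliers quoted above.

BASE_INPUT = 5.00  # USD per million input tokens

def cost_uncached(context_tokens, n_calls):
    """Resend the same context at base input price on every call."""
    return n_calls * context_tokens / 1e6 * BASE_INPUT

def cost_cached(context_tokens, n_calls, write_mult=1.25, read_mult=0.1):
    """Write the cache once, then hit it on the remaining calls."""
    per_call = context_tokens / 1e6 * BASE_INPUT
    return per_call * write_mult + (n_calls - 1) * per_call * read_mult

# A 100k-token context reused across 10 calls:
print(round(cost_uncached(100_000, 10), 3))  # 5.0
print(round(cost_cached(100_000, 10), 3))    # 1.075
# A single call: caching only adds the write premium.
print(round(cost_cached(100_000, 1), 3))     # 0.625
```

With these numbers the cached path wins from the second reuse onward; at one call it simply costs 1.25x more than sending the context plain.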
Tool use can quietly add to your Opus bill
Anthropic’s tool pricing is another place where teams can under-budget. Some tools mainly add token overhead, while others also introduce usage-based fees.
Web search
Anthropic charges $10 per 1,000 searches for web search, on top of standard token costs for the search-generated content that enters the context.
Web fetch
Web fetch has no extra fee beyond normal token charges, but large pages still create token cost because the fetched content becomes part of the conversation.
Code execution
Code execution is free when used with web search or web fetch. Outside that combination, Anthropic bills code execution by container time after a monthly free allowance.
The practical lesson is simple: once agents start using tools heavily, a “token-only” budget stops being enough.
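As a sketch of what a tool-aware budget looks like, here is a token-plus-fees estimate for a run that performs web searches. The $10 per 1,000 searches figure is the one quoted above; everything else is an illustrative assumption.

```python
# Token cost plus the web-search surcharge for a single agent run.
# SEARCH_FEE_PER_1K comes from the pricing above; the helper itself
# is an illustrative assumption, not an official API.

SEARCH_FEE_PER_1K = 10.00  # USD per 1,000 web searches

def run_cost(token_cost_usd, n_searches):
    """Total run cost: tokens already priced in USD, plus search fees."""
    return token_cost_usd + n_searches / 1000 * SEARCH_FEE_PER_1K

# A run that costs $0.40 in tokens and performs 5 searches:
print(round(run_cost(0.40, 5), 3))  # 0.45
```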
Managed agents add runtime pricing too
If you are using Claude Managed Agents rather than raw model calls alone, there is another cost layer. Anthropic prices managed-agent session runtime at $0.08 per session-hour while the session is actively running.
Anthropic’s own worked example shows that a one-hour coding session using Opus 4.7 with 50,000 input tokens and 15,000 output tokens totals $0.705 when runtime is included. That is a helpful reminder that long-running agent work should be budgeted as a combination of model usage and execution time, not just prompt cost.
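That worked example is easy to reproduce from the rates above; the helper below is ours, not Anthropic's.

```python
# Reproducing Anthropic's worked example: a one-hour managed-agent
# session on Opus 4.7 with 50k input and 15k output tokens.

INPUT_RATE = 5.00    # USD per million input tokens
OUTPUT_RATE = 25.00  # USD per million output tokens
RUNTIME_RATE = 0.08  # USD per active session-hour

def session_cost(input_tokens, output_tokens, active_hours):
    token_cost = input_tokens / 1e6 * INPUT_RATE + output_tokens / 1e6 * OUTPUT_RATE
    return token_cost + active_hours * RUNTIME_RATE

# $0.25 input + $0.375 output + $0.08 runtime:
print(round(session_cost(50_000, 15_000, 1), 3))  # 0.705
```

Note that runtime is a small slice here; it grows in proportion to how long the session stays active, independent of token volume.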
What teams should budget for Claude Opus 4.7
Claude Opus 4.7 is not obviously overpriced for frontier work. In fact, on raw posted token rates it can look attractive next to some competing premium models. But teams should budget around four practical realities:
- Tokenizer impact: the same document may tokenize larger than before
- Output cost: output is still far more expensive than cache reads
- Tool usage: search and execution can add real overhead
- Runtime: managed agents introduce an additional metered layer
A simple standard API example makes the headline pricing look reasonable: 10 million input tokens and 2 million output tokens comes out to about $100 before other modifiers. But if token counts rise because of the new tokenizer, if tools generate large extra context, or if a managed workflow runs for long periods, the real spend can move up faster than that clean example suggests.
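To make that sensitivity concrete, here is the same clean example with worst-case tokenizer inflation applied to the input side; this is a sketch that assumes output volume stays unchanged.

```python
# How the clean $100 example moves if input token counts grow by the
# worst-case 35% under the new tokenizer (output assumed unchanged).

INPUT_RATE = 5.00    # USD per million input tokens
OUTPUT_RATE = 25.00  # USD per million output tokens

def total_cost(input_tokens, output_tokens, input_inflation=1.0):
    inflated_input = input_tokens * input_inflation
    return inflated_input / 1e6 * INPUT_RATE + output_tokens / 1e6 * OUTPUT_RATE

print(round(total_cost(10_000_000, 2_000_000), 2))        # 100.0
print(round(total_cost(10_000_000, 2_000_000, 1.35), 2))  # 117.5
```

A 35% token increase on a prompt-heavy workload lifts the bill by 17.5% here, before any tool fees or runtime charges are counted.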
When Opus 4.7 makes financial sense
Opus 4.7 makes the most sense when better long-horizon reasoning, stronger coding reliability, or improved multimodal understanding reduces retries and failed runs. In those cases, a higher effective per-request cost can still be the cheaper operational choice.
It makes less sense as a casual default for lightweight tasks where a smaller or cheaper model would finish the work just as well.
The practical takeaway is this: Claude Opus 4.7 is not just “same price as before.” It is a model whose real cost depends on tokenization, caching behavior, tools, and runtime. Teams that budget from the headline token rate alone may be surprised. Teams that budget from finished workflow cost will make much better decisions.