Claude Fable 5 API Cost Calculator

Last updated: 2026-07-03

Quick Answer

Claude Fable 5 API cost depends on input tokens, output tokens, cache behavior, tool use and retries. Anthropic lists Claude Fable 5 at $10 per million base input tokens and $50 per million output tokens. Check the official pricing page before scaling, because provider availability and marketplace billing can differ.

Official Anthropic Pricing Snapshot

According to Anthropic's pricing docs (last checked 2026-06-12), Claude Fable 5 uses the following rates on the Anthropic Direct API. Provider marketplace rates may vary.

Base input tokens $10.00 / MTok

Output tokens $50.00 / MTok

5-minute cache writes $12.50 / MTok

1-hour cache writes $20.00 / MTok

Cache hits & refreshes $1.00 / MTok

Batch input tokens $5.00 / MTok

Batch output tokens $25.00 / MTok

Prices shown are for Anthropic Direct API as of 2026-06-12. Provider marketplace availability and pricing should be verified in the relevant provider console. Prices can change; always check the official Anthropic pricing page before making cost decisions.

Token Cost Formula

Estimate Claude Fable 5 API cost using the base formula:

cost = (input_tokens × $10 / 1,000,000) + (output_tokens × $50 / 1,000,000)

For example:

10K input tokens, 5K output tokens: (10,000 × $10 / 1,000,000) + (5,000 × $50 / 1,000,000) = $0.10 + $0.25 = $0.35 per request
200K input tokens, 50K output tokens: (200,000 × 0.00001) + (50,000 × 0.00005) = $4.50 per request
1M input tokens, 100K output tokens: (1,000,000 × 0.00001) + (100,000 × 0.00005) = $15.00 per request
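The formula above can be sketched as a small Python helper. The function name and constants are illustrative, and the rates are the snapshot values quoted in this guide, so check them against the official pricing page before relying on the result.

```python
# Rates in USD per million tokens (MTok), copied from the pricing snapshot in this guide.
# They are assumptions for this sketch and may be out of date.
INPUT_RATE_PER_MTOK = 10.00
OUTPUT_RATE_PER_MTOK = 50.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one uncached, non-batch request."""
    input_cost = input_tokens * INPUT_RATE_PER_MTOK / 1_000_000
    output_cost = output_tokens * OUTPUT_RATE_PER_MTOK / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    # The three worked examples from the text above.
    examples = [(10_000, 5_000), (200_000, 50_000), (1_000_000, 100_000)]
    for tokens_in, tokens_out in examples:
        cost = estimate_cost(tokens_in, tokens_out)
        print(f"{tokens_in:,} in / {tokens_out:,} out: ${cost:.2f}")
```

The helper ignores caching, batch discounts and retries, so treat its result as a baseline rather than a bill.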

Context window

Claude Fable 5 supports a 1M-token context window with a 128K-token maximum output, according to Anthropic's models overview. Every token you send counts toward the context window as input, whether or not it was previously cached; caching changes the rate those tokens are billed at, not how many are counted.

Cache Pricing and Batch Pricing

Claude Fable 5 supports cached context and batch processing, which reduce cost significantly when used appropriately.

Prompt Caching

When a request reuses context stored by an earlier cache write, the following rates apply:

Cache writes are billed at $12.50/MTok (5-min TTL) or $20.00/MTok (1-hour TTL) — charged once when the cache is created
Cache hits are billed at $1.00/MTok — significantly cheaper than full input pricing
Cache refreshes are also billed at $1.00/MTok

For example, a 50K-token cache write with 1-hour TTL costs:

50,000 × $20.00 / 1,000,000 = $1.00

Each subsequent request that hits the cache pays $1.00/MTok for the 50K cached tokens (50,000 × $1.00 / 1,000,000 = $0.05), plus $10/MTok for any new uncached input and $50/MTok for the output tokens it generates.
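A minimal sketch of the cache arithmetic above, assuming the snapshot rates in this guide. The "5m" and "1h" keys are hypothetical labels chosen for this example, not API parameters.

```python
# USD per million tokens, from the pricing snapshot in this guide (assumed, may change).
# The "5m" / "1h" keys are illustrative labels for the two cache TTLs.
CACHE_WRITE_RATES = {"5m": 12.50, "1h": 20.00}
CACHE_HIT_RATE = 1.00


def cache_write_cost(cached_tokens: int, ttl: str = "5m") -> float:
    """USD cost of writing `cached_tokens` to the cache for the given TTL label."""
    return cached_tokens * CACHE_WRITE_RATES[ttl] / 1_000_000


def cache_hit_cost(cached_tokens: int) -> float:
    """USD cost of reading `cached_tokens` back from the cache on a later request."""
    return cached_tokens * CACHE_HIT_RATE / 1_000_000


if __name__ == "__main__":
    print(f"1-hour write, 50K tokens: ${cache_write_cost(50_000, '1h'):.2f}")
    print(f"5-minute write, 50K tokens: ${cache_write_cost(50_000, '5m'):.3f}")
    print(f"Cache hit, 50K tokens: ${cache_hit_cost(50_000):.2f}")
```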

Batch Processing

Batch API rates are half the standard API rates:

Batch input: $5.00/MTok (50% of base input)
Batch output: $25.00/MTok (50% of base output)

Batch is suitable for non-real-time workloads. Turnaround time is longer than synchronous API calls.
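To compare the two price lists side by side, the following sketch applies each rate table to the same token counts. The dictionary names and the function are illustrative, and the rates are the snapshot values from this guide.

```python
# USD per million tokens, from the pricing snapshot in this guide (assumed, may change).
STANDARD_RATES = {"input": 10.00, "output": 50.00}
BATCH_RATES = {"input": 5.00, "output": 25.00}


def request_cost(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """USD cost of a workload under the given rate table."""
    return (
        input_tokens * rates["input"] / 1_000_000
        + output_tokens * rates["output"] / 1_000_000
    )


if __name__ == "__main__":
    standard = request_cost(200_000, 50_000, STANDARD_RATES)
    batch = request_cost(200_000, 50_000, BATCH_RATES)
    saving = 1 - batch / standard
    print(f"standard ${standard:.2f}, batch ${batch:.2f}, saving {saving:.0%}")
```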

Context Window Impact

Claude Fable 5's 1M token context window means high-context tasks accumulate significant input costs. For coding agents, each file read, conversation turn and tool call result adds to the token count.

Context window cost factors:

Large file reads — hundreds to thousands of tokens per file
Conversation history — grows with every turn, all counted as input
Tool call results — included in context and billed as input tokens
System prompts — counted toward the input token budget

For long agentic sessions, monitor tokens_in and tokens_out from your request logs. Because each tool call resends the accumulated context, a session with 10 tool calls against a 100K-token context sends about 10 × 100K = 1M input tokens, or roughly $10 per session at standard uncached rates, and more as the context grows.
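The way input tokens accumulate over a session can be sketched as follows. This assumes no prompt caching and that every tool call resends the full context; the function names and the `tokens_added_per_call` parameter are hypothetical and exist only for this illustration.

```python
INPUT_RATE_PER_MTOK = 10.00  # USD, snapshot rate from this guide (assumed, may change)


def session_input_tokens(
    base_context_tokens: int, tool_calls: int, tokens_added_per_call: int = 0
) -> int:
    """Total input tokens sent across a session, summed over every tool call."""
    total = 0
    context = base_context_tokens
    for _ in range(tool_calls):
        total += context  # the whole context is sent as input on this call
        context += tokens_added_per_call  # the tool result grows the next call's context
    return total


def input_cost(input_tokens: int) -> float:
    """USD cost of the given number of uncached input tokens."""
    return input_tokens * INPUT_RATE_PER_MTOK / 1_000_000


if __name__ == "__main__":
    flat = session_input_tokens(100_000, 10)
    growing = session_input_tokens(100_000, 10, tokens_added_per_call=5_000)
    print(f"flat context: {flat:,} tokens, ${input_cost(flat):.2f}")
    print(f"growing context: {growing:,} tokens, ${input_cost(growing):.2f}")
```

Even a modest 5K-token tool result per call raises the session total noticeably, which is why logging per-request token counts matters.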

Tool Use and Agent Overhead

Using Claude Fable 5 with tools in an agentic workflow adds token overhead beyond a simple chat:

Tool definitions and descriptions count toward input tokens
Tool call results (file contents, command output, search results) are appended to context as input
Each tool call round involves reasoning tokens plus the tool output tokens
Multiple parallel or sequential tool calls multiply the overhead

In a coding agent scenario, a session might include:

Initial system prompt and user task (~2K tokens input)
Project context loading (~20K tokens input)
Three file reads (~15K tokens input)
Three tool call round-trips (~30K tokens input, ~5K tokens output)
Final response (~2K tokens output)

Estimated cost:

(67,000 × $10 / 1,000,000) + (7,000 × $50 / 1,000,000) = $0.67 + $0.35 = $1.02
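The same estimate can be reproduced by summing the session steps listed above. The step list mirrors the approximate token counts in this guide, and the structure and names are only an illustrative sketch.

```python
# USD per million tokens, from the pricing snapshot in this guide (assumed, may change).
INPUT_RATE_PER_MTOK = 10.00
OUTPUT_RATE_PER_MTOK = 50.00

# (description, input tokens, output tokens), using the approximate figures from the text.
SESSION_STEPS = [
    ("Initial system prompt and user task", 2_000, 0),
    ("Project context loading", 20_000, 0),
    ("Three file reads", 15_000, 0),
    ("Three tool call round-trips", 30_000, 5_000),
    ("Final response", 0, 2_000),
]


def session_cost(steps):
    """Return (total input tokens, total output tokens, USD cost) for a list of steps."""
    total_in = sum(step[1] for step in steps)
    total_out = sum(step[2] for step in steps)
    cost = (
        total_in * INPUT_RATE_PER_MTOK / 1_000_000
        + total_out * OUTPUT_RATE_PER_MTOK / 1_000_000
    )
    return total_in, total_out, cost


if __name__ == "__main__":
    tokens_in, tokens_out, cost = session_cost(SESSION_STEPS)
    print(f"{tokens_in:,} input + {tokens_out:,} output tokens: ${cost:.2f}")
```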

With prompt caching, if the project context was previously cached, the input cost for the second session drops significantly. Cache hits cost $1.00/MTok vs. $10.00/MTok for uncached input.

Failed Request / Retry Billing Notes

Failed requests and retries can affect your Claude Fable 5 bill depending on how your provider handles them:

Timeout errors — if the provider bills on request initiation, the input tokens sent before timeout may still be charged
Rate limit retries — each retry sends the full context again, compounding input token cost
Validation errors — usually not billed, but check your provider's policy
Partial output — if a request is interrupted mid-stream, output tokens generated before interruption are typically billed
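As a rough upper bound on retry overhead, the sketch below assumes the worst case in which the input tokens of every attempt are billed. Whether failed attempts are actually charged depends on the provider, so this is a planning figure and not a statement of billing policy; the function name is hypothetical.

```python
INPUT_RATE_PER_MTOK = 10.00  # USD, snapshot rate from this guide (assumed, may change)


def worst_case_retry_input_cost(input_tokens: int, failed_attempts: int) -> float:
    """Upper-bound USD input cost if every attempt, failed or not, is billed in full."""
    attempts = failed_attempts + 1  # the failed attempts plus the one that succeeds
    return attempts * input_tokens * INPUT_RATE_PER_MTOK / 1_000_000


if __name__ == "__main__":
    # A 100K-token context that succeeds on the third attempt.
    print(f"${worst_case_retry_input_cost(100_000, failed_attempts=2):.2f}")
```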

Provider policy varies

Provider policy on failed request billing can differ from Anthropic Direct API behavior. Check usage records and request IDs in your provider dashboard to confirm whether failed or interrupted requests were billed.

Small Prepaid Test Checklist

Before scaling Claude Fable 5 usage, run a small prepaid test to verify actual cost behavior:

✓ Send one minimal request and record tokens_in, tokens_out and the response time
✓ Compare estimated cost (tokens × rate) against your provider dashboard deduction
✓ Test with prompt caching to measure cache write vs. cache hit cost difference
✓ Test one agentic tool call round-trip and measure the token increment
✓ Simulate a timeout error and check whether tokens were billed before timeout
✓ Review request IDs and usage logs for at least 5 requests to establish a cost baseline
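The second checklist item, comparing an estimate against the dashboard deduction, could be scripted along these lines. The `billed_usd` value is whatever your provider dashboard reports, entered by hand; the function name and the tolerance are hypothetical choices for this sketch.

```python
# USD per million tokens, from the pricing snapshot in this guide (assumed, may change).
INPUT_RATE_PER_MTOK = 10.00
OUTPUT_RATE_PER_MTOK = 50.00


def billing_mismatch(
    tokens_in: int, tokens_out: int, billed_usd: float, tolerance_usd: float = 0.001
):
    """Return (estimated USD, billed minus estimated, True if the gap exceeds tolerance)."""
    estimated = (
        tokens_in * INPUT_RATE_PER_MTOK / 1_000_000
        + tokens_out * OUTPUT_RATE_PER_MTOK / 1_000_000
    )
    difference = billed_usd - estimated
    return estimated, difference, abs(difference) > tolerance_usd


if __name__ == "__main__":
    # tokens_in / tokens_out come from your request log; billed_usd from the dashboard.
    estimated, difference, flagged = billing_mismatch(1_000, 500, billed_usd=0.07)
    print(f"estimated ${estimated:.3f}, difference ${difference:+.3f}, flagged={flagged}")
```

A flagged request is only a prompt to look at the request ID and usage record; cache writes, retries or marketplace markups can all explain a gap.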

Example Cost Scenarios

Simple chat (no tools, no cache)

1K input tokens, 500 output tokens:

(1,000 × $0.00001) + (500 × $0.00005) = $0.035

Coding agent session with file reads

80K input tokens (context + file reads), 10K output tokens:

(80,000 × $0.00001) + (10,000 × $0.00005) = $0.80 + $0.50 = $1.30

Coding agent with cached context (1-hour TTL)

50K cache write (one-time) + 50K cache hit + 10K output:

Cache write: 50,000 × $0.00002 = $1.00 (one-time)
Cache hit: 50,000 × $0.000001 = $0.05
Output: 10,000 × $0.00005 = $0.50
Total (first request with new output): $1.55
Total (subsequent requests, same cache): $0.55
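To see when the one-time write pays for itself, the sketch below compares cumulative cost with and without caching over several requests. It mirrors this guide's accounting, in which the write is counted once and every request pays a cache hit plus output, and it assumes the cache stays valid across all requests. Function names are illustrative.

```python
# USD per million tokens, from the pricing snapshot in this guide (assumed, may change).
CACHE_WRITE_1H_RATE = 20.00
CACHE_HIT_RATE = 1.00
BASE_INPUT_RATE = 10.00
OUTPUT_RATE = 50.00


def usd(tokens: int, rate_per_mtok: float) -> float:
    return tokens * rate_per_mtok / 1_000_000


def cached_total(context_tokens: int, output_tokens: int, requests: int) -> float:
    """One cache write, then a cache hit plus output on every request."""
    per_request = usd(context_tokens, CACHE_HIT_RATE) + usd(output_tokens, OUTPUT_RATE)
    return usd(context_tokens, CACHE_WRITE_1H_RATE) + requests * per_request


def uncached_total(context_tokens: int, output_tokens: int, requests: int) -> float:
    """Full base input price plus output on every request."""
    per_request = usd(context_tokens, BASE_INPUT_RATE) + usd(output_tokens, OUTPUT_RATE)
    return requests * per_request


def break_even_requests(context_tokens: int, output_tokens: int, max_requests: int = 1000):
    """Smallest request count at which caching is cheaper, or None if not found."""
    for n in range(1, max_requests + 1):
        if cached_total(context_tokens, output_tokens, n) < uncached_total(
            context_tokens, output_tokens, n
        ):
            return n
    return None


if __name__ == "__main__":
    for n in (1, 2, 3):
        cached = cached_total(50_000, 10_000, n)
        uncached = uncached_total(50_000, 10_000, n)
        print(f"{n} request(s): cached ${cached:.2f}, uncached ${uncached:.2f}")
    print("break-even at request", break_even_requests(50_000, 10_000))
```

Under these assumptions the 1-hour cache only pays off from the third request onward, so caching context that is reused once or twice may cost more than sending it uncached.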

Batch processing (non-real-time)

200K input tokens, 50K output tokens via Batch API:

(200,000 × $0.000005) + (50,000 × $0.000025) = $1.00 + $1.25 = $2.25

vs. Standard API for the same tokens: $2.00 + $2.50 = $4.50 (batch saves 50%).

AI Summary

Claude Fable 5 API cost is driven by input tokens at $10/MTok and output tokens at $50/MTok for Anthropic Direct API. Prompt caching reduces input costs to $1/MTok for cache hits, with cache writes at $12.50-$20/MTok depending on TTL. Batch pricing is 50% of standard rates. Context window size, tool call overhead and retry behavior all multiply cost for agentic workflows. Cache writes are billed once; each request with a cache hit is cheaper than the initial context load. Failed request billing depends on provider policy; review request IDs and dashboard usage records regularly. Test with a small prepaid balance before scaling agentic or high-volume workflows.

Frequently Asked Questions

How do I estimate Claude Fable 5 API cost?

Multiply input tokens by $10/MTok and output tokens by $50/MTok. For example, 100K input and 20K output costs (0.1M × $10) + (0.02M × $50) = $1.00 + $1.00 = $2.00. Add tool call overhead by counting the tool results as additional input tokens.

Do cache hits reduce Claude Fable 5 cost?

Yes. Cache hits are billed at $1/MTok instead of $10/MTok for base input. A cache write is a one-time charge at $12.50/MTok (5-min TTL) or $20/MTok (1-hour TTL). Subsequent requests reusing that cached context cost less per request than the initial context load.

Do tool calls increase Claude Fable 5 token usage?

Yes. Tool definitions and tool call results count toward input tokens, while the model's reasoning and tool call requests are generated as output tokens and then become input on later turns. Each tool call round-trip adds the tool output to your context. For coding agents with multiple file reads and command executions, tool call overhead can significantly increase total input tokens per session.

Can retries or failed requests affect the final bill?

Depending on your provider, yes. Timeout errors may bill the input tokens already sent. Rate limit retries resend the full context each time, multiplying input token cost. Check provider policy and review request IDs and dashboard usage to confirm whether failed or interrupted requests were charged.

Should I use batch pricing for Claude Fable 5?

Batch pricing is 50% of standard API rates ($5/MTok input, $25/MTok output) but has longer turnaround time. It is suitable for non-real-time workloads such as batch analysis, report generation or offline processing. For interactive coding agents or real-time applications, standard API pricing applies.
