
docs(examples): add prompt-caching example covering 3 patterns#1480

Open
HermeticOrmus wants to merge 1 commit into anthropics:main from HermeticOrmus:feature/prompt-caching-example

Conversation

@HermeticOrmus HermeticOrmus commented Apr 30, 2026

Why

The examples/ directory has agents, batch, streaming, structured outputs, thinking, and MCP tools — but no prompt-caching example. The feature is documented at platform.claude.com, but a developer browsing the SDK won't find a runnable demonstration of where to put `cache_control={"type": "ephemeral"}`, what the response usage fields look like on a cache hit, or which patterns are worth caching.

This adds one runnable file covering the three patterns most callers actually use.

What changed

examples/prompt_caching.py — one __main__ script with three numbered sections:

  1. Cache the system prompt — chatbot/agent with a long instruction set; cache_control on the system block.
  2. Cache system + tool definitions — agent loop with tools; cache_control on the last tool entry (the bigger win, since tool defs are usually larger than the system prompt).
  3. Cache long static context — RAG / Q&A over a fixed document; cache_control on the document text block, follow-up question in a second block.
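For orientation, the three placements can be sketched as request payloads. This is a sketch, not the PR's actual file; the tool names, texts, and field contents are placeholders — only the position of `cache_control` reflects the patterns described above.

```python
# Where cache_control goes in each pattern (request payloads only; texts and
# tool names are placeholders, not the PR's actual file).
CACHE = {"type": "ephemeral"}

# 1. Cache the system prompt: mark the (long) system text block.
system_cached = [
    {"type": "text", "text": "LONG INSTRUCTION SET ...", "cache_control": CACHE},
]

# 2. Cache system + tool definitions: mark the LAST tool entry — the cache
#    prefix covers everything up to and including the marked block.
tools_cached = [
    {"name": "get_weather", "description": "...", "input_schema": {"type": "object"}},
    {"name": "get_time", "description": "...", "input_schema": {"type": "object"},
     "cache_control": CACHE},
]

# 3. Cache long static context: mark the document block; the follow-up
#    question goes in a second, uncached block.
messages_cached = [
    {"role": "user", "content": [
        {"type": "text", "text": "FIXED DOCUMENT TEXT ...", "cache_control": CACHE},
        {"type": "text", "text": "What does section 2 say?"},
    ]},
]
```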

Each section has a first call that creates the cache and a second call that hits it. A show_usage() helper prints input / cache_creation / cache_read / output token counts so the reader can confirm the hit visually.

The script padding ensures each cache target exceeds the per-model floor (~1024 tokens) so the cache actually takes effect when run.
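One way such a padding check can be done is sketched below. The 4-characters-per-token ratio is a crude heuristic (the real script could call `client.messages.count_tokens()` for an exact figure), and 1024 is the Sonnet/Opus floor; smaller Haiku models require 2048.

```python
MIN_CACHEABLE_TOKENS = 1024  # Sonnet/Opus floor; Haiku models need 2048

def rough_token_count(text: str) -> int:
    # Crude heuristic: ~4 characters per token for English prose.
    return len(text) // 4

def pad_to_floor(text: str, filler: str = " lorem ipsum") -> str:
    """Append filler until the text clears the per-model cache floor."""
    while rough_token_count(text) < MIN_CACHEABLE_TOKENS:
        text += filler
    return text
```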

How to test

```shell
ANTHROPIC_API_KEY=... ./examples/prompt_caching.py
```

Expected: section [1] first-call shows non-zero cache_creation_input_tokens; second-call shows non-zero cache_read_input_tokens. Same shape for [2] and [3].

Linted clean: `ruff check examples/prompt_caching.py` passes and `ruff format --check examples/prompt_caching.py` reports no changes. The project's 120-character line length is respected.

Notes

  • Uses the sync API since it dominates examples/. An async twin (prompt_caching_async.py) is a natural follow-up if maintainers prefer the same coverage in both.
  • Model pinned to claude-sonnet-4-5-20250929 (the most-used model across examples/).
  • This wasn't run against a live API from the contributor environment; verification against cache_creation_input_tokens / cache_read_input_tokens is recommended before merge.

@HermeticOrmus HermeticOrmus requested a review from a team as a code owner April 30, 2026 15:28
