feat: add ai-cache plugin by janiussyafiq · Pull Request #13578 · apache/apisix

janiussyafiq · 2026-06-19T09:02:54Z

Description

Adds a new ai-cache plugin that caches LLM responses and replays them for subsequent requests that resolve to the same prompt, cutting upstream token cost and latency for repetitive workloads (FAQ bots, document Q&A, translation).

This PR implements the exact (L1) cache layer:

- Cache key: a SHA-256 fingerprint of the request as received: client protocol, requested model, normalized messages, and the remaining response-determining body parameters (temperature, top_p, max_tokens, tools, …). Provider-agnostic via ai-protocols, so it works for every chat protocol ai-proxy supports (OpenAI Chat, Anthropic Messages, Bedrock Converse, OpenAI Responses). The key uses the client-requested model (the effective model from ai-proxy's options.model / multi-instance selection isn't known until after the lookup); if differently-modelled routes share one Redis + scope, isolate them via a separate Redis or cache_key.include_vars (e.g. route_id).
- Storage: Redis (single-node); connection fields are sourced from apisix.utils.redis-schema via the policy + if/then convention used by limit-count / limit-req / limit-conn.
- Scope: shared cache by default; opt-in per-consumer / per-variable isolation (cache_key.include_consumer / include_vars).
- Behavior: write-on-200 only (non-streaming); bypass_on opt-out (exact request-header match); max_cache_body_size cap; X-AI-Cache-Status / X-AI-Cache-Age response headers; fails open (proxies as a normal miss) when Redis is unreachable.
- Runs below ai-proxy (priority 1035) and depends on ai-proxy / ai-proxy-multi.
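
The cache-key idea above can be illustrated with a minimal Python sketch. This is not the plugin's Lua code: the function names, the `scope` argument, and the message normalization (keep only `role` and `content`, trim surrounding whitespace) are assumptions made for illustration. Only the overall shape, a SHA-256 digest over a canonical encoding of protocol, model, messages, and remaining body parameters, comes from the description.

```python
import hashlib
import json


def normalize_messages(messages):
    # Assumption: keep only role and content, and trim surrounding whitespace
    # from string content. The plugin's real normalization may differ.
    out = []
    for message in messages:
        content = message.get("content")
        if isinstance(content, str):
            content = content.strip()
        out.append({"role": message.get("role"), "content": content})
    return out


def fingerprint(protocol, body, scope=""):
    """Return a hex SHA-256 digest for a chat request (illustrative only)."""
    # Everything except model and messages is treated as a
    # response-determining parameter (temperature, top_p, tools, ...).
    params = {k: v for k, v in body.items() if k not in ("model", "messages")}
    payload = {
        "protocol": protocol,
        "model": body.get("model"),  # the client-requested model
        "messages": normalize_messages(body.get("messages", [])),
        "params": params,
        "scope": scope,  # hypothetical: consumer name, route_id, ...
    }
    # Canonical form: sorted keys and fixed separators, so key order and
    # whitespace in the client's JSON do not change the digest. An explicit
    # null (None) encodes cleanly as "null".
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two requests that differ only in JSON key order produce the same digest, while changing any parameter such as `temperature`, or the scope, produces a different one.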

Semantic cache, streaming support, and observability are planned as follow-up PRs. User-facing documentation will be added in a later PR once the series is further along.

Which issue(s) this PR fixes:

Related to #13290

Checklist

I have explained the need for this PR and the problem it solves
I have explained the changes or the new features added to this PR
I have added tests corresponding to this change
I have updated the documentation to reflect this change
I have verified that this change is backward compatible

Copilot

Pull request overview

Adds a new ai-cache APISIX plugin that provides an L1 exact-match cache for non-streaming LLM requests handled by ai-proxy, using Redis as the backend and exposing cache debug headers.
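
To make the request flow concrete, here is a rough Python sketch of an exact-match lookup that fails open. It uses an in-memory stand-in instead of Redis so that it runs without a server. The class and function names, the default TTL and size cap, and the choice to report age in whole seconds are assumptions; only the MISS/HIT statuses, the two header names, write-on-200, the body-size cap, and the fail-open behavior come from the PR description.

```python
import time


class StoreUnavailable(Exception):
    """Stand-in for a Redis connection or command error."""


class MemoryStore:
    """In-memory stand-in for Redis so the sketch runs without a server."""

    def __init__(self):
        self.data = {}
        self.down = False  # flip to True to simulate an unreachable store

    def get(self, key):
        if self.down:
            raise StoreUnavailable("store unreachable")
        entry = self.data.get(key)
        if entry is None or entry["expires_at"] <= time.time():
            return None
        return entry

    def set(self, key, body, ttl):
        if self.down:
            raise StoreUnavailable("store unreachable")
        now = time.time()
        self.data[key] = {"body": body, "stored_at": now,
                          "expires_at": now + ttl}


def handle(store, key, call_upstream, ttl=60, max_body_size=1024):
    """Return (status, headers, body) for one non-streaming request."""
    try:
        entry = store.get(key)
    except StoreUnavailable:
        entry = None  # fail open: behave like a normal miss
    if entry is not None:
        age = int(time.time() - entry["stored_at"])
        headers = {"X-AI-Cache-Status": "HIT", "X-AI-Cache-Age": str(age)}
        return 200, headers, entry["body"]

    status, body = call_upstream()
    # Write only successful responses that fit under the size cap.
    if status == 200 and len(body) <= max_body_size:
        try:
            store.set(key, body, ttl)
        except StoreUnavailable:
            pass  # a failed write must not fail the request
    return status, {"X-AI-Cache-Status": "MISS"}, body
```

The first call for a key reaches the upstream and reports MISS; a repeat within the TTL is served from the store and reports HIT. With `store.down = True`, every call goes to the upstream and still succeeds.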

Changes:

Introduces the ai-cache plugin implementation, schema, and keying logic (SHA-256 fingerprint + configurable scope).
Adds an end-to-end test suite covering MISS/HIT, bypassing, TTL expiry, scope isolation, and fail-open behavior.
Wires the plugin into the default plugin lists and build/install packaging.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `apisix/plugins/ai-cache.lua` | Core plugin logic: lookup on access, capture on body/log, Redis integration, cache headers. |
| `apisix/plugins/ai-cache/schema.lua` | JSON schema for plugin configuration, leveraging `apisix.utils.redis-schema` via `policy` + `if/then`. |
| `apisix/plugins/ai-cache/key.lua` | Cache key fingerprinting (protocol/model/messages/params) and scope computation. |
| `t/plugin/ai-cache.t` | New functional + unit tests for cache behavior and edge cases. |
| `t/admin/plugins.t` | Adds `ai-cache` to the admin plugin list expectation. |
| `conf/config.yaml.example` | Adds `ai-cache` to the example plugin list with priority comment. |
| `apisix/cli/config.lua` | Adds `ai-cache` to the CLI’s default plugin list. |
| `Makefile` | Installs the `apisix/plugins/ai-cache/` directory Lua modules during `make install`. |

…ss_on

Encode the request fingerprint with rapidjson (sort_keys) plus a to_rapidjson_value pass that maps the JSON null sentinel and array_mt tables, mirroring ai-transport/http.lua. core.json.stably_encode (dkjson) raised on the cjson null sentinel, so a body carrying an explicit null (e.g. OpenAI's "stop": null) would error out of the access phase.

Replace the cache_bypass var-ref opt-out with bypass_on: an array of {header, equals} rules that skip the cache when a request header exactly equals its value (per rfcs#78). Exact header == value only; any matching rule triggers BYPASS.

Tests: add a null-body fingerprint regression, migrate the bypass tests to bypass_on, and cover multiple rules where any match bypasses.
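
The bypass_on rule semantics described in this commit message can be sketched in a few lines of Python. The function name and the dict-based request headers are hypothetical; the `header` / `equals` rule fields and the "any matching rule bypasses" behavior are taken from the message. Lowercasing header names reflects the fact that HTTP header names are case-insensitive; whether the plugin normalizes names the same way is an assumption.

```python
def should_bypass(rules, headers):
    """Return True when any {header, equals} rule matches a request header."""
    # Header names are matched case-insensitively (assumption, see lead-in);
    # header values are compared for exact equality only.
    lowered = {name.lower(): value for name, value in headers.items()}
    return any(
        lowered.get(rule["header"].lower()) == rule["equals"]
        for rule in rules
    )
```

For example, with the rules `[{"header": "X-Cache-Bypass", "equals": "1"}]`, a request carrying `X-Cache-Bypass: 1` skips the cache, while `X-Cache-Bypass: true` or a missing header does not.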

… update fingerprinting logic

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

…tion

Document the ai-cache plugin: description, full attribute table (incl. all Redis policy fields), and Admin API / ADC / Ingress Controller examples covering cache MISS/HIT and bypass_on. Add the page to the en and zh plugin sidebars.

…ey configuration

…oute cache sharing scenarios

nic-6443

Thanks for the quick turnaround — all my comments are addressed: per-route scoping by default with share_across_routes opt-out, red:close() on Redis errors instead of pooling a broken connection, the dead layers knob dropped, and the canonical encoding pulled up into core.json.canonical_encode (nicely de-duped with ai-transport). LGTM.

janiussyafiq added 2 commits June 19, 2026 15:23

feat: add ai-cache plugin to installation and configuration

8cd41f8

feat: implement ai-cache plugin with Redis support and testing

1ea1aaa

dosubot Bot added the size:XXL (This PR changes 1000+ lines, ignoring generated files.), enhancement (New feature or request), and plugin labels Jun 19, 2026

Merge remote-tracking branch 'upstream/master' into feat/ai-cache-exact

5c04222

nic-6443 reviewed Jun 22, 2026

Comment thread apisix/plugins/ai-cache/key.lua Outdated

shreemaan-abhishek requested a review from Copilot June 23, 2026 01:16

Copilot started reviewing on behalf of shreemaan-abhishek June 23, 2026 01:16

Copilot AI reviewed Jun 23, 2026

Comment thread apisix/plugins/ai-cache/key.lua

Comment thread apisix/plugins/ai-cache.lua

Comment thread apisix/plugins/ai-cache.lua

Comment thread t/plugin/ai-cache.t

Comment thread apisix/plugins/ai-cache.lua

janiussyafiq added 2 commits June 23, 2026 09:59

feat(ai-cache): enhance body filter to handle oversized responses and…

d91e68a

… update fingerprinting logic

janiussyafiq requested a review from Copilot June 23, 2026 02:56

Copilot started reviewing on behalf of janiussyafiq June 23, 2026 02:56

Copilot AI reviewed Jun 23, 2026

Comment thread apisix/plugins/ai-cache.lua

Comment thread apisix/plugins/ai-cache/schema.lua

Comment thread t/plugin/ai-cache.t Outdated

Comment thread apisix/plugins/ai-cache.lua

janiussyafiq added 2 commits June 23, 2026 11:18

feat(ai-cache): optimize body caching logic and enforce header valida…

652a89f

…tion

nic-6443 reviewed Jun 23, 2026

Comment thread apisix/plugins/ai-cache/key.lua Outdated

nic-6443 reviewed Jun 23, 2026

Comment thread apisix/plugins/ai-cache/key.lua Outdated

Comment thread apisix/plugins/ai-cache.lua Outdated

Comment thread apisix/plugins/ai-cache.lua

Comment thread apisix/plugins/ai-cache/schema.lua Outdated

janiussyafiq added 2 commits June 23, 2026 15:57

feat(ai-cache): implement canonical JSON encoding and enhance cache k…

84c5ccf

…ey configuration

feat(ai-cache): update tests for exact.ttl validation and add cross-r…

9024b70

…oute cache sharing scenarios

nic-6443 approved these changes Jun 23, 2026

janiussyafiq wants to merge 9 commits into apache:master from janiussyafiq:feat/ai-cache-exact

3 participants
