Skip to content

feat: add ai-cache plugin#13578

Open
janiussyafiq wants to merge 9 commits into
apache:masterfrom
janiussyafiq:feat/ai-cache-exact
Open

feat: add ai-cache plugin#13578
janiussyafiq wants to merge 9 commits into
apache:masterfrom
janiussyafiq:feat/ai-cache-exact

Conversation

@janiussyafiq

@janiussyafiq janiussyafiq commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Description

Adds a new ai-cache plugin that caches LLM responses and replays them for subsequent requests that resolve to the same prompt, cutting upstream token cost and latency for repetitive workloads (FAQ bots, document Q&A, translation).

This PR implements the exact (L1) cache layer:

  • Cache key — a SHA-256 fingerprint of the request as received: client protocol, requested model, normalized messages, and the remaining response-determining body parameters (temperature, top_p, max_tokens, tools, …). Provider-agnostic via ai-protocols, so it works for every chat protocol ai-proxy supports (OpenAI Chat, Anthropic Messages, Bedrock Converse, OpenAI Responses). The key uses the client-requested model (the effective model from ai-proxy's options.model / multi-instance selection isn't known until after the lookup); if differently-modelled routes share one Redis + scope, isolate them via a separate Redis or cache_key.include_vars (e.g. route_id).
  • Storage — Redis (single-node); connection fields are sourced from apisix.utils.redis-schema via the policy + if/then convention used by limit-count / limit-req / limit-conn.
  • Scope — shared cache by default; opt-in per-consumer / per-variable isolation (cache_key.include_consumer / include_vars).
  • Behavior — write-on-200 only (non-streaming); bypass_on opt-out (exact request-header match); max_cache_body_size cap; X-AI-Cache-Status / X-AI-Cache-Age response headers; fails open (proxies as a normal miss) when Redis is unreachable.
  • Runs below ai-proxy (priority 1035) and depends on ai-proxy / ai-proxy-multi.

Semantic cache, streaming support, and observability are planned as follow-up PRs. User-facing documentation will be added in a later PR once the series is further along.

Which issue(s) this PR fixes:

Related to #13290

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible

@dosubot dosubot Bot added size:XXL This PR changes 1000+ lines, ignoring generated files. enhancement New feature or request plugin labels Jun 19, 2026
Comment thread apisix/plugins/ai-cache/key.lua Outdated

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new ai-cache APISIX plugin that provides an L1 exact-match cache for non-streaming LLM requests handled by ai-proxy, using Redis as the backend and exposing cache debug headers.

Changes:

  • Introduces the ai-cache plugin implementation, schema, and keying logic (SHA-256 fingerprint + configurable scope).
  • Adds an end-to-end test suite covering MISS/HIT, bypassing, TTL expiry, scope isolation, and fail-open behavior.
  • Wires the plugin into the default plugin lists and build/install packaging.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
apisix/plugins/ai-cache.lua Core plugin logic: lookup on access, capture on body/log, Redis integration, cache headers.
apisix/plugins/ai-cache/schema.lua JSON schema for plugin configuration, leveraging apisix.utils.redis-schema via policy + if/then.
apisix/plugins/ai-cache/key.lua Cache key fingerprinting (protocol/model/messages/params) and scope computation.
t/plugin/ai-cache.t New functional + unit tests for cache behavior and edge cases.
t/admin/plugins.t Adds ai-cache to the admin plugin list expectation.
conf/config.yaml.example Adds ai-cache to the example plugin list with priority comment.
apisix/cli/config.lua Adds ai-cache to the CLI’s default plugin list.
Makefile Installs the apisix/plugins/ai-cache/ directory Lua modules during make install.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread apisix/plugins/ai-cache/key.lua
Comment thread apisix/plugins/ai-cache.lua
Comment thread apisix/plugins/ai-cache.lua
Comment thread t/plugin/ai-cache.t
Comment thread apisix/plugins/ai-cache.lua
…ss_on

Encode the request fingerprint with rapidjson (sort_keys) plus a
to_rapidjson_value pass that maps the JSON null sentinel and array_mt
tables, mirroring ai-transport/http.lua. core.json.stably_encode (dkjson)
raised on the cjson null sentinel, so a body carrying an explicit null
(e.g. OpenAI's "stop": null) would error out of the access phase.

Replace the cache_bypass var-ref opt-out with bypass_on: an array of
{header, equals} rules that skip the cache when a request header exactly
equals its value (per rfcs#78). Exact header == value only; any matching
rule triggers BYPASS.

Tests: add a null-body fingerprint regression, migrate the bypass tests
to bypass_on, and cover multiple rules where any match bypasses.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Comment thread apisix/plugins/ai-cache.lua
Comment thread apisix/plugins/ai-cache/schema.lua
Comment thread t/plugin/ai-cache.t Outdated
Comment thread apisix/plugins/ai-cache.lua
Document the ai-cache plugin: description, full attribute table (incl. all
Redis policy fields), and Admin API / ADC / Ingress Controller examples
covering cache MISS/HIT and bypass_on. Add the page to the en and zh plugin
sidebars.
Comment thread apisix/plugins/ai-cache/key.lua Outdated
Comment thread apisix/plugins/ai-cache/key.lua Outdated
Comment thread apisix/plugins/ai-cache.lua Outdated
Comment thread apisix/plugins/ai-cache.lua
Comment thread apisix/plugins/ai-cache/schema.lua Outdated

@nic-6443 nic-6443 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick turnaround — all my comments are addressed: per-route scoping by default with share_across_routes opt-out, red:close() on Redis errors instead of pooling a broken connection, the dead layers knob dropped, and the canonical encoding pulled up into core.json.canonical_encode (nicely de-duped with ai-transport). LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request plugin size:XXL This PR changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants