test#878
Every find/getDocument/cascade currently hits Redis via `getDocument(METADATA, $id)` for the same collection metadata, often 20-50 times per request through relationship resolution and internal helpers. This adds a per-instance cache keyed on `(database, namespace, tenant, id)` that stores the array form, so callers can mutate the returned `Document` without poisoning the cache. The recursive `arrayifyDocuments` helper ensures nested attribute/index `Document`s are also fully isolated. Invalidation rides the existing `EVENT_DOCUMENT_PURGE` plumbing — every metadata mutation in the library already fires this event, and cloud's cross-region propagation listener (`cloud/app/controllers/general.php`) re-fires it on receiving regions, so the new in-process invalidator stays in sync without consumer code changes. Full cache wipes are added to the context-changing setters (`setNamespace`, `setDatabase`, `setSharedTables`, `setTenant`) and `delete()`, since those alter the cache key shape or invalidate the underlying data.
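As a language-agnostic sketch of the pattern described above (Python here; the library itself is PHP, and all names below are illustrative rather than the actual API): a per-instance map keyed on the `(database, namespace, tenant, id)` tuple that stores plain dicts and hands back a deep copy on every hit, so a caller mutating the result cannot poison the cache.

```python
import copy

class MetadataCache:
    """Per-instance collection-metadata cache sketch (hypothetical names)."""

    def __init__(self):
        # (database, namespace, tenant, id) -> plain-dict form of the metadata
        self._entries = {}

    def _key(self, database, namespace, tenant, doc_id):
        return (database, namespace, tenant, doc_id)

    def get(self, database, namespace, tenant, doc_id):
        entry = self._entries.get(self._key(database, namespace, tenant, doc_id))
        # Deep copy on hit: nested attribute/index structures are isolated too,
        # mirroring what the recursive arrayifyDocuments helper guarantees.
        return copy.deepcopy(entry) if entry is not None else None

    def put(self, database, namespace, tenant, doc_id, document):
        self._entries[self._key(database, namespace, tenant, doc_id)] = copy.deepcopy(document)

    def invalidate(self, database, namespace, tenant, doc_id):
        # Driven by the purge-event listener in the real design.
        self._entries.pop(self._key(database, namespace, tenant, doc_id), None)

    def clear(self):
        # Full wipe, as the context-changing setters and delete() would do.
        self._entries.clear()

cache = MetadataCache()
cache.put("db1", "ns", 42, "users", {"attributes": [{"key": "name"}]})
hit = cache.get("db1", "ns", 42, "users")
hit["attributes"].append({"key": "email"})  # mutate the returned copy
# The cached entry is unaffected: still exactly one attribute.
assert len(cache.get("db1", "ns", 42, "users")["attributes"]) == 1
```

The deep copy on both `put` and `get` is what makes the returned value safe to mutate; storing live object references instead would let one caller's edit leak into every later read.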
Greptile Summary

This PR introduces a per-instance in-process cache for
Confidence Score: 3/5

Not safe to merge as-is: two mutation paths leave stale data in the new in-process cache, which could return deleted or pre-update collection metadata within the same request. The cache wiring for attribute/index/relationship mutations is correct, but `updateCollection` and `deleteCollection` both call their internal document operations inside a fully-silent scope and omit the explicit `trigger(EVENT_DOCUMENT_PURGE)` that all the other mutation paths add after their silent wrappers. This means permissions changes from `updateCollection` and deletions from `deleteCollection` are invisible to any `getCollection()` call on the same Database instance for the remainder of the request. See `src/Database/Database.php` — specifically the `updateCollection` and `deleteCollection` methods, which need explicit post-silent `EVENT_DOCUMENT_PURGE` triggers (or direct `invalidateCollectionCache` calls) matching the pattern already used by `createAttribute`/`deleteAttribute`.
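The hazard the review describes can be sketched as follows (Python, not the library's PHP; every name here is illustrative, not the actual utopia/database API): event-driven invalidation only stays correct when every mutation path fires the purge event, so a mutation wrapped in a fully-silent scope leaves the in-process cache stale.

```python
cache = {}
backing = {"users": {"version": 1}}
listeners = [lambda doc_id: cache.pop(doc_id, None)]  # the purge listener

def fire_purge(doc_id):
    for listener in listeners:
        listener(doc_id)

def get_collection(doc_id):
    if doc_id not in cache:
        cache[doc_id] = dict(backing[doc_id])  # backend read on miss
    return cache[doc_id]

def update_collection_silent(doc_id, version):
    # Mutation inside a "silent" scope: the purge event is suppressed,
    # so nothing tells the in-process cache to drop its entry.
    backing[doc_id] = {"version": version}

get_collection("users")                     # warm the cache (version 1)
update_collection_silent("users", 2)
stale = get_collection("users")["version"]  # still 1: the reported bug
fire_purge("users")                         # the suggested fix: explicit purge
fresh = get_collection("users")["version"]  # now 2
```

The explicit `fire_purge` after the silent mutation plays the role of the post-silent `EVENT_DOCUMENT_PURGE` trigger the review asks for.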
CI failure root cause
The bug was in this PR's in-process collection cache. The invalidation strategy hooked

Fix

Moved invalidation into the lower-level
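Although the exact call site is elided above, the fix pattern it describes can be sketched in Python (illustrative names only): hook invalidation at the lowest-level write helper that every mutation funnels through, instead of at each high-level caller, so no call site can forget it.

```python
cache = {}
store = {}

def low_level_write(doc_id, data):
    store[doc_id] = data
    cache.pop(doc_id, None)  # invalidate here: no caller can forget it

def read(doc_id):
    if doc_id not in cache:
        cache[doc_id] = dict(store[doc_id])  # rebuild from backend on miss
    return cache[doc_id]

def create_attribute(doc_id, attr):
    # High-level mutation paths carry no cache code of their own.
    data = dict(store.get(doc_id, {}))
    data.setdefault("attributes", []).append(attr)
    low_level_write(doc_id, data)

low_level_write("users", {"attributes": []})
read("users")                              # warm the cache
create_attribute("users", {"key": "email"})
# The cache was invalidated inside low_level_write, so the new attribute is visible.
assert len(read("users")["attributes"]) == 1
```

Centralizing invalidation this way trades a few unnecessary purges for the guarantee that no mutation path can leave the cache stale.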
Closing this in favor of keeping the existing selective caching pattern. Captured here so the trade-off is recorded for future reference.

Why close

The existing utopia/database cache (Dragonfly via utopia/cache) is selective by design: it caches documents that are repeatedly read with the same key (

Per-request lifetime bounds the memory growth, so it isn't catastrophic. But the principle "cache only what benefits from being cached" matters more than the few-ms p99 win this PR predicted.

What we don't have evidence for

The PR's predicted impact (15-25% p99 reduction on relationship-heavy reads, 20-40% Dragonfly load reduction) was based on an estimate of ~30-50

When to revisit

If, after PR #877 lands in staging, the rows-per-SELECT ratio drops as predicted, MySQL CPU should improve materially. If at that point Dragonfly cache load is still substantial or relationship-heavy endpoint p99 remains above target, revisit with a selective L1: only cache collections accessed ≥2 times in the current request, so we match the existing pattern of "cache only what's getting re-read." For now, the EXISTS rewrite in #877 is the higher-confidence, higher-impact change to ship first.
Summary
Adds a per-instance in-process cache for collection metadata in `Database::getCollection()`. Today, every authenticated read path calls `getCollection()` many times — directly and through internal helpers like `convertQuery`, validators, and relationship resolution — and each call goes through `getDocument(METADATA, $id)`, which incurs a Redis round trip even when the metadata is unchanged.

For a relationship-heavy request that touches one collection 50 times via cascade resolution, that's currently 50 redundant Redis reads. With this change it becomes 1 read, with the remaining 49 served from in-process memory.
How invalidation works
We don't add new invalidation call sites. The library already fires `EVENT_DOCUMENT_PURGE` with `$collection = METADATA` after every metadata mutation (`createAttribute`, `deleteAttribute`, `createIndex`, `deleteRelationship`, `updateCollection`, etc.). A new listener registered in the constructor catches those events and drops the corresponding cache entry.

For Appwrite Cloud, this also stays correct across regions for free: `cloud/app/controllers/general.php` already listens on the same event and propagates purges via `RegionManager`. The receiving region's `purgeCachedDocument(METADATA, $id)` call re-fires the event on its own Database instance, which invalidates the in-process cache there too.

Full cache wipes are added to the four context-changing setters (`setNamespace`, `setDatabase`, `setSharedTables`, `setTenant`) and `delete()`, since those alter the cache key shape or invalidate the underlying data wholesale.

Memory safety
The cache stores the array form of each collection, not the `Document` instance. The recursive `arrayifyDocuments` helper unwraps nested attribute/index `Document` references too. On every cache hit, a fresh `Document` is rebuilt from the stored array — so callers can mutate the returned collection (e.g. `setAttribute('attributes', $new, SET_TYPE_APPEND)` as `createAttribute` does) without any risk of poisoning the cache.

Cache key shape

`{database}:{namespace}:{tenant}:{collectionId}` — multi-tenant safe. `withTenant()` doesn't need a clear because the tenant is part of the key.

Predicted impact
Test plan
- `getCollection('foo')` → mutate the result → call again, verify the second call returns the original (isolation test)
- `getCollection('foo')` → `createAttribute('foo', ...)` → call `getCollection('foo')`, verify the new attribute appears (invalidation test)
- `withTenant($t1)` populate cache, switch to `withTenant($t2)`, verify isolation
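The first two test-plan items above can be sketched as runnable assertions (Python for illustration; the real tests would be PHPUnit against `Database`, and the inline dict-based cache below is a hypothetical stand-in for the implementation):

```python
import copy

cache = {}
backend = {"foo": {"attributes": []}}

def get_collection(cid):
    if cid not in cache:
        cache[cid] = copy.deepcopy(backend[cid])
    return copy.deepcopy(cache[cid])  # fresh copy on every hit

def create_attribute(cid, attr):
    backend[cid]["attributes"].append(attr)
    cache.pop(cid, None)              # invalidation via the purge path

# Isolation: mutating a returned collection must not poison the cache.
first = get_collection("foo")
first["attributes"].append({"key": "rogue"})
assert get_collection("foo")["attributes"] == []

# Invalidation: a metadata mutation must be visible on the next read.
create_attribute("foo", {"key": "email"})
assert get_collection("foo")["attributes"] == [{"key": "email"}]
```

The tenant-isolation item would follow the same shape with the tenant folded into the cache key, which the `{database}:{namespace}:{tenant}:{collectionId}` key already handles.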