perf: skip content cache beyond 1000 files — 7% RSS reduction (#208)#260

Merged
justrach merged 1 commit into release/v0.2.57 from perf/208-skip-content-cache
Apr 13, 2026

@justrach
Owner

Summary

During cold indexing, commitParsedFileOwnedOutline duplicated all file contents into a HashMap. The indexes (word, trigram, sparse) consume the content parameter directly, so the cache was only needed for readContentForSearch, which already falls back to disk.

This PR skips content storage once outlines.count() > 1000. The first 1000 files stay cached for fast search; beyond that, search reads from disk. Snapshot fast-load uses OUTLINE_STATE (not CONTENT), so startup is unaffected.
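
The caching policy described above can be sketched roughly as follows. This is a minimal Python illustration, not the project's code: `ContentCache`, `commit`, `read_for_search`, and `CACHE_LIMIT` are hypothetical names, and only the 1000-file threshold and the disk fallback are taken from the summary.

```python
import os
import tempfile

CACHE_LIMIT = 1000  # threshold from the summary; the constant name is hypothetical


class ContentCache:
    """Keeps file content in memory for the first `limit` files only."""

    def __init__(self, limit=CACHE_LIMIT):
        self.limit = limit
        self.outlines = {}  # path -> outline, recorded for every file
        self.contents = {}  # path -> content, recorded only under the limit

    def commit(self, path, outline, content):
        # The outline is always stored. Indexes would consume `content`
        # directly at this point, so keeping a cached copy is optional.
        self.outlines[path] = outline
        if len(self.outlines) <= self.limit:
            self.contents[path] = content

    def read_for_search(self, path):
        # Serve from memory when cached, otherwise fall back to disk.
        cached = self.contents.get(path)
        if cached is not None:
            return cached
        with open(path, "r", encoding="utf-8") as f:
            return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        cache = ContentCache(limit=2)  # tiny limit to exercise the fallback
        paths = []
        for i in range(3):
            path = os.path.join(root, f"file{i}.txt")
            with open(path, "w", encoding="utf-8") as f:
                f.write(f"body {i}")
            cache.commit(path, outline=[], content=f"body {i}")
            paths.append(path)
        # The third file is not cached, so this read goes to disk.
        print(cache.read_for_search(paths[2]))
```

With a limit of 2, the third file is absent from the in-memory map and its content is read back from disk, which is the trade-off the summary describes for files beyond the first 1000.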

Benchmark (openclaw, 13,867 files, cold search)

| Version | Time | Peak RSS | Delta |
|---|---|---|---|
| v0.2.56 baseline | 6.16s | 3,678MB | |
| PR#258 (current) | 5.66s | 3,559MB | -3.2% |
| This PR | 6.07s | 3,415MB | -7.2% |
| Zero cache (limit=0) | 5.90s | 3,390MB | floor |
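
The Delta column can be reproduced from the peak RSS figures. The short sketch below assumes the delta is the percent change in peak RSS relative to the v0.2.56 baseline, which matches the reported -3.2% and -7.2%. The function name `rss_delta_pct` is made up for illustration.

```python
def rss_delta_pct(baseline_mb, measured_mb):
    """Percent change in peak RSS relative to the baseline (negative means less memory)."""
    return (measured_mb - baseline_mb) / baseline_mb * 100.0


BASELINE_MB = 3678  # v0.2.56 baseline from the benchmark

runs = {
    "PR#258 (current)": 3559,
    "This PR": 3415,
    "Zero cache (limit=0)": 3390,
}

for name, peak_mb in runs.items():
    print(f"{name}: {rss_delta_pct(BASELINE_MB, peak_mb):+.1f}%")
# PR#258 (current): -3.2%
# This PR: -7.2%
# Zero cache (limit=0): -7.8%
```

By the same arithmetic, the zero-cache floor is about 7.8% below baseline, so this PR captures most of what removing the content cache entirely would save.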

The content cache accounts for ~170MB. The remaining ~3.3GB comes from trigram index posting lists; reducing it further would need a flat array (#208 original scope) or compressed postings.
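
For readers unfamiliar with compressed postings, one common scheme is to delta-encode sorted document IDs and store each gap as a variable-length integer. The sketch below shows that general technique only. It is an assumption for illustration, not what this project implements or plans, and `encode_postings` and `decode_postings` are hypothetical names.

```python
def encode_postings(doc_ids):
    """Delta-encode a sorted list of doc IDs, then varint-encode each gap."""
    out = bytearray()
    prev = 0
    for doc_id in doc_ids:
        gap = doc_id - prev
        prev = doc_id
        # Emit 7 bits per byte; the high bit marks "more bytes follow".
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)
            gap >>= 7
        out.append(gap)
    return bytes(out)


def decode_postings(data):
    """Inverse of encode_postings: rebuild the sorted doc ID list."""
    doc_ids = []
    prev = 0
    gap = 0
    shift = 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            prev += gap
            doc_ids.append(prev)
            gap = 0
            shift = 0
    return doc_ids


if __name__ == "__main__":
    ids = [3, 7, 130, 13866]
    packed = encode_postings(ids)
    print(len(packed), decode_postings(packed) == ids)
```

Small gaps take one byte each, so dense posting lists shrink considerably compared with fixed-width IDs, at the cost of decoding work on every lookup.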

Test plan

  • All existing tests pass
  • Benchmarked on openclaw: 3,415MB peak RSS (-7.2% vs baseline)
  • No speed regression (within variance)
  • readContentForSearch disk fallback verified working
  • Snapshot load unaffected (uses OUTLINE_STATE fast path)

🤖 Generated with Claude Code

During cold indexing, commitParsedFileOwnedOutline duped ALL file
contents into a HashMap. On openclaw (13K files) this added ~170MB
of peak RSS for content alone. The indexes (word, trigram) consume
the content parameter directly — the cache is only needed for
readContentForSearch which already has a disk fallback.

Skip content storage when outline count > 1000. First 1000 files
stay cached for fast search; beyond that, search falls back to
disk reads. Snapshot fast-load uses OUTLINE_STATE (not CONTENT),
so startup is unaffected.

Benchmark (openclaw, 13,867 files, cold search):
  v0.2.56:        3,678MB peak RSS  6.16s
  pre-clock:      3,559MB peak RSS  5.66s
  skip-cache:     3,415MB peak RSS  6.07s  (-7.2% RSS vs baseline)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@justrach justrach merged commit 49382dd into release/v0.2.57 Apr 13, 2026
1 check passed
@github-actions

Benchmark Regression Report

Threshold: 10.00%

| Tool | Base (ns) | Head (ns) | Delta | Status |
|---|---|---|---|---|
| codedb_bundle | 661116 | 664396 | +0.50% | OK |
| codedb_changes | 110276 | 108506 | -1.61% | OK |
| codedb_deps | 29908 | 29494 | -1.38% | OK |
| codedb_edit | 24764 | 23026 | -7.02% | OK |
| codedb_find | 138477 | 141168 | +1.94% | OK |
| codedb_hot | 147832 | 148196 | +0.25% | OK |
| codedb_outline | 462079 | 461798 | -0.06% | OK |
| codedb_read | 142738 | 144358 | +1.13% | OK |
| codedb_search | 282027 | 286296 | +1.51% | OK |
| codedb_snapshot | 4517878 | 4496363 | -0.48% | OK |
| codedb_status | 263375 | 256919 | -2.45% | OK |
| codedb_symbol | 64627 | 64493 | -0.21% | OK |
| codedb_tree | 87323 | 86963 | -0.41% | OK |
| codedb_word | 90962 | 99448 | +9.33% | OK |
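
The Delta and Status columns of the report follow from the base and head timings. The sketch below is a guess at the check, not the CI job's actual code: it assumes the delta is the percent change of head over base, that a row fails only when the slowdown exceeds the 10.00% threshold, and that a failing row would be labelled `REGRESSION` (the report only ever shows `OK`).

```python
THRESHOLD_PCT = 10.00  # threshold stated in the report


def check(base_ns, head_ns, threshold_pct=THRESHOLD_PCT):
    """Return (delta percent, status) for one benchmark row."""
    delta = (head_ns - base_ns) / base_ns * 100.0
    # Assumed rule: only slowdowns above the threshold count as regressions.
    status = "OK" if delta <= threshold_pct else "REGRESSION"
    return delta, status


rows = {
    "codedb_edit": (24764, 23026),
    "codedb_word": (90962, 99448),
}

for tool, (base_ns, head_ns) in rows.items():
    delta, status = check(base_ns, head_ns)
    print(f"{tool}: {delta:+.2f}% {status}")
# codedb_edit: -7.02% OK
# codedb_word: +9.33% OK
```

Under these assumptions codedb_word, at +9.33%, is the row closest to the threshold while still passing.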
