perf: improve text checkout scalability by lodyai[bot] · Pull Request #956 · loro-dev/loro

lodyai · 2026-04-22T07:55:06Z

Summary

Add a text checkout benchmark with profile counters for frontier preparation, VV conversion, diff calculation, richtext tracker work, state apply, event emit, and future-sibling scanning.
Reduce per-change VersionVector work by passing a lightweight causal version view through diff calculation and teaching the richtext tracker to consume it directly.
Use a forward diff calculator for comparable checkout-to-latest paths so linear/import-greater updates avoid persistent CRDT checkout tracking.
Add plain-text no-style apply fast paths and conservative same-deps / same-parent fast paths for high-collaboration text checkout.
Keep FastUpdates/FastSnapshot text imports on the fast path by avoiding the state-apply rollback guard for ordinary text sync, while still forcing rollback for JSON schema imports from external input.
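The "comparable checkout-to-latest" condition above can be pictured as a partial-order comparison between two versions. The following is a minimal sketch, assuming a version vector is simply a map from peer id to the number of ops seen from that peer; `VersionVector`, `compare`, and `can_use_forward_diff` are hypothetical names for illustration and are not taken from loro-internal.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

/// Hypothetical stand-in for a version vector: peer id -> number of ops seen.
type VersionVector = HashMap<u64, u32>;

/// Partial-order comparison of two versions.
/// Returns `None` when the versions are concurrent (neither contains the other).
fn compare(a: &VersionVector, b: &VersionVector) -> Option<Ordering> {
    let mut a_ahead = false;
    let mut b_ahead = false;
    // Visit every peer mentioned by either side; a missing entry counts as 0.
    for peer in a.keys().chain(b.keys()) {
        let ca = a.get(peer).copied().unwrap_or(0);
        let cb = b.get(peer).copied().unwrap_or(0);
        if ca > cb {
            a_ahead = true;
        }
        if cb > ca {
            b_ahead = true;
        }
    }
    match (a_ahead, b_ahead) {
        (false, false) => Some(Ordering::Equal),
        (true, false) => Some(Ordering::Greater),
        (false, true) => Some(Ordering::Less),
        (true, true) => None,
    }
}

/// Assumed routing rule for this sketch: a forward-only diff is possible when
/// the target version already contains everything in the current version.
fn can_use_forward_diff(current: &VersionVector, target: &VersionVector) -> bool {
    matches!(
        compare(current, target),
        Some(Ordering::Less | Ordering::Equal)
    )
}
```

Under this assumption, a linear update or an import that is strictly greater than the current version takes the forward path, while concurrent versions fall back to the tracker-based checkout.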

Benchmarks

Text checkout improvements versus main:

1000 peer wide-causal checkout: about 5.13ms -> 1.61ms, diff_calc about 4.90ms -> 1.39ms, tracker checkout about 3.47ms -> 37.6us.
300 peer same-position checkout: about 4.93ms -> 1.78ms, frontier_prepare about 3.04ms -> 37.8us.
1000 peer same-position checkout after frontier fast path: about 16.6ms. Future-sibling scan profile after same-parent fast path: about 1.83ms -> 575us.
checkout-to-latest linear smoke: about 65us with richtext tracker checkout/diff calls at 0.

Full bench rerun notes:

Re-ran all benches against the merge-base using separate target dirs to avoid cross-branch artifact contamination.
The text checkout benchmarks are the intended wins, generally in the 3x-9x range depending on the scenario.
During the full run, three non-checkout text sync/import benches initially regressed by more than 50% because text imports were also forcing state-apply rollback. After narrowing rollback to list/tree containers for ordinary binary imports and forcing it unconditionally only for JSON schema imports, the regressions recovered:
- encode_with_sync/update: regressed to about 88ms, now about 46.0ms; base was about 43-45ms.
- refactored direct_apply/B4DirectSync: regressed to about 6.7s, now about 3.69s; base was about 3.6s.
- refactored-sync/B4Parallel: regressed to about 89.7ms, now about 54.0ms; base was about 51.6ms.
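The rollback narrowing described above can be summarized as a small predicate. This is only a sketch of the stated rule; `ImportSource`, `ContainerKind`, and `needs_rollback_guard` are hypothetical names, and the actual check in loro-internal may consider more container types or conditions.

```rust
/// Hypothetical classification of where an import came from.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ImportSource {
    /// Ordinary binary import (e.g. FastUpdates / FastSnapshot).
    Binary,
    /// JSON schema import from external input.
    JsonSchema,
}

/// Hypothetical, reduced set of container kinds touched by an import.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ContainerKind {
    Text,
    List,
    Tree,
}

/// Decide whether state apply should run under the rollback guard.
fn needs_rollback_guard(source: ImportSource, touched: &[ContainerKind]) -> bool {
    match source {
        // External JSON input is always guarded.
        ImportSource::JsonSchema => true,
        // Binary imports are guarded only when list or tree state is involved,
        // so plain text sync stays on the fast path.
        ImportSource::Binary => touched
            .iter()
            .any(|kind| matches!(kind, ContainerKind::List | ContainerKind::Tree)),
    }
}
```

With this shape, a text-only binary import skips the guard, which is consistent with the three sync benches returning to roughly their base timings.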

Validation

cargo check -p loro-internal --features test_utils,jsonpath
cargo test -p loro-internal --features test_utils,jsonpath import --lib
cargo test -p loro --test panic_test import_json_updates_with_text_insert_out_of_bounds_should_error_without_mutating_doc
pnpm check
pnpm test
CARGO_TARGET_DIR=/tmp/loro-final-bench cargo bench -p loro-internal --features test_utils,jsonpath --bench encode encode_with_sync/update -- --save-baseline final-encode
CARGO_TARGET_DIR=/tmp/loro-final-bench cargo bench -p loro-internal --features test_utils,jsonpath --bench text_r B4DirectSync -- --save-baseline final-direct
CARGO_TARGET_DIR=/tmp/loro-final-bench cargo bench -p loro-internal --features test_utils,jsonpath --bench text_r B4Parallel -- --save-baseline final-parallel

Notes

The B4Parallel filter also matches the DecodeUpdates B4Parallel bench; once the target bench reported 54.0ms, the remaining matched bench was terminated.
Long-running libFuzzer targets were not run.

lodyai · 2026-04-22T08:14:24Z

Benchmark / validation update after the TMP_PLAN cleanup.

Fixed-size checkout bench command shape:

LORO_TEXT_CHECKOUT_PROFILE=1 LORO_TEXT_CHECKOUT_PEERS=1000 LORO_TEXT_CHECKOUT_BASE_LEN=1024 LORO_TEXT_CHECKOUT_CHANGES=1000 cargo bench -p loro-internal --features test_utils --bench text_checkout -- <case> --warm-up-time 0.05 --measurement-time 0.1 --sample-size 10

Results:

case	criterion median-ish range	avg_total	main profile notes
plain/wide-causal-peer-checkout/1001	1.37-1.67 ms	1.46 ms	diff_calc 1.27 ms; richtext_tracker_checkout 33 us; future_scan 0
plain/same-position-peer-checkout/1001	12.0-16.9 ms	15.89 ms	diff_calc 15.15 ms; future_scan 0.57 ms; avg_future_scan_visited 383, max 999
code/checkout-to-latest-linear/1001	515-524 us	521 us	diff_calc 358 us; state_apply 160 us; richtext tracker 0
rich/overlap-mark-peer-checkout/subscribed/1001	11.2-17.1 ms	15.20 ms	state_apply 7.66 ms; diff_calc 6.63 ms; emit_events 72 us; future_scan 31 us

Phase 5 cost note: after the same-parent fast path, future sibling scanning is no longer the dominant cost in the measured 1000-peer cases. It is still visible in the worst same-position case (~0.57 ms, about 3-4% of total), but the remaining slow path is mostly replay/diff_calc. For subscribed rich overlap marks, future_scan is negligible (~31 us), while state apply and diff calc dominate.

Additional validation now passed:

cargo fuzz run text-update -- -max_total_time=60
cargo fuzz run all -- -max_total_time=60

Both fuzz targets completed their 60s runs without crashes. The fuzz lockfile refresh was discarded because it was generated by the local run and is unrelated to this PR.

lodyai · 2026-04-22T08:19:58Z

Follow-up phase 4 update after continuing the 1-4 plan.

Added commit 377afd4e (perf: batch rich text style event deltas):

Batch adjacent, non-overlapping retain-only rich text style event deltas before composing them.
Preserve the original compose order by flushing when a delta overlaps or is not retain-only.
This keeps the optimization local to event conversion and does not add checkout/cache state.
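The batching rule above can be sketched on a simplified model. The types below are hypothetical and much smaller than the real event deltas: a style change is reduced to a half-open range plus an attribute name, anything that is not retain-only is kept opaque, and the compose step itself is not shown. The sketch also assumes a batch is extended only when the next range starts at or after the end of the last pending range.

```rust
/// Hypothetical, simplified event delta.
#[derive(Clone, Debug, PartialEq)]
enum EventDelta {
    /// Retain-only style change over the half-open range `start..end`.
    Style { start: usize, end: usize, attr: &'static str },
    /// Anything that is not retain-only (insert/delete), kept opaque here.
    Other(&'static str),
}

/// A unit handed to the compose step (not shown).
#[derive(Clone, Debug, PartialEq)]
enum ComposeUnit {
    /// Several non-overlapping style ranges merged into one delta.
    StyleBatch(Vec<(usize, usize, &'static str)>),
    /// A delta that must be composed on its own.
    Single(EventDelta),
}

/// Move the pending ranges, if any, into the output as one batch.
fn flush(pending: &mut Vec<(usize, usize, &'static str)>, out: &mut Vec<ComposeUnit>) {
    if !pending.is_empty() {
        out.push(ComposeUnit::StyleBatch(std::mem::take(pending)));
    }
}

/// Group consecutive non-overlapping style deltas, preserving overall order.
fn batch_style_deltas(deltas: Vec<EventDelta>) -> Vec<ComposeUnit> {
    let mut out = Vec::new();
    let mut pending: Vec<(usize, usize, &'static str)> = Vec::new();

    for delta in deltas {
        match delta {
            EventDelta::Style { start, end, attr } => {
                // Flush first if the new range overlaps the last pending one.
                let overlaps = pending
                    .last()
                    .map_or(false, |&(_, last_end, _)| start < last_end);
                if overlaps {
                    flush(&mut pending, &mut out);
                }
                pending.push((start, end, attr));
            }
            other => {
                // A non-retain-only delta ends the current batch so that the
                // original compose order is preserved.
                flush(&mut pending, &mut out);
                out.push(ComposeUnit::Single(other));
            }
        }
    }
    flush(&mut pending, &mut out);
    out
}
```

Because retain-only deltas do not change the text length, the ranges inside one batch share the same coordinate space, which is why they can be merged before a single compose instead of being composed one at a time.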

Validation passed after this commit:

cargo check -p loro-internal --features test_utils
cargo test -p loro-internal richtext --features test_utils
cargo test -p loro-internal checkout --features test_utils
cargo test -p loro-internal import --features test_utils
cargo test -p fuzz random_fuzz_1s -- --nocapture
git diff --check

Affected bench rerun:

LORO_TEXT_CHECKOUT_PROFILE=1 LORO_TEXT_CHECKOUT_PEERS=1000 LORO_TEXT_CHECKOUT_BASE_LEN=1024 LORO_TEXT_CHECKOUT_CHANGES=1000 cargo bench -p loro-internal --features test_utils --bench text_checkout -- rich/overlap-mark-peer-checkout/subscribed --warm-up-time 0.05 --measurement-time 0.1 --sample-size 10

Result: Criterion range 8.7449 ms - 13.709 ms, avg_total=12.692376ms, avg_state_apply=6.503316ms, avg_diff_calc=5.456606ms, avg_emit_events=50.446us, avg_richtext_insert_future_scan=27.024us. The sample is noisy (p = 0.17, no statistically significant change), but the profile is directionally better than the previous run (avg_total=15.199131ms, avg_state_apply=7.66148ms).

zxch3n · 2026-04-22T10:10:39Z

Fuzz validation for the text checkout performance PR:

cargo fuzz run text-update -- -max_total_time=1200
- Passed: 317,498 runs in 1201s
- Final: cov: 4634, ft: 18712, corp: 1220/255Kb, rss: 507Mb
cargo fuzz run local_events -- -max_total_time=1200
- Passed: 510,962 runs in 1201s
- Final: cov: 15694, ft: 59080, corp: 2001/428Kb, rss: 566Mb
cargo fuzz run all -- -max_total_time=1200
- Passed: 52,862 runs in 1201s
- Final: cov: 26017, ft: 74208, corp: 625/22Kb, rss: 471Mb

No crashes, panics, sanitizer failures, or reproducer artifacts were reported. The workspace was clean after reverting the fuzz-generated refresh of crates/fuzz/fuzz/Cargo.lock.

github-actions · 2026-04-26T16:53:46Z

WASM Size Report

Original size: 3111.82 KB
Gzipped size: 999.85 KB
Brotli size: 696.54 KB

zxch3n added 3 commits April 22, 2026 07:54

perf: improve text checkout scalability

970e9b7

docs: add text checkout performance plan

e4a9181

docs: remove temporary text checkout plan

ed54924

perf: batch rich text style event deltas

377afd4

zxch3n added 3 commits April 22, 2026 11:15

Merge remote-tracking branch 'origin/main' into feat/scale-text-checkout-perf

7b20b7e

fix: handle fuzzed text checkout edge cases

7d27feb

fix: handle shallow root frontiers in fuzzed imports

4ce926f

zxch3n marked this pull request as ready for review April 26, 2026 16:39

Merge remote-tracking branch 'origin/main' into feat/scale-text-checkout-perf

c259b80

Conflicts: crates/loro-internal/src/loro.rs, crates/loro-internal/src/oplog.rs

zxch3n added 17 commits May 7, 2026 13:26

Merge remote-tracking branch 'origin/main' into feat/scale-text-checkout-perf

67905cb

fix: clear deleted cache on checkout

87dd333

fix: reject partial shallow root checkout

6e1b49a

fix: reexport shallow root snapshots

162ff39

fix: avoid shallow gca on reexport

dccd172

fix: reject unreachable shallow frontiers

4ce3408

fix: reject shallow root dependency frontiers

df2498c

fix: guard shallow frontier utilities

fdedbd7

fix: clamp shallow frontier conversions

5d046be

fix: clamp empty shallow version frontiers

05a5b62

fix: normalize shallow reexport frontiers

0b656b7

fix: normalize shallow snapshot targets

3e9edca

fix: normalize state-only export frontiers

20770e3

fix: normalize snapshot-at frontiers

b7763b0

fix: keep richtext style pairs in shallow roots

530b901

fix: clamp shallow diff lca frontiers

adbe9a3

fix: normalize shallow state-only targets

394134e

zxch3n and others added 25 commits May 8, 2026 16:32

fix: preserve independent shallow root frontiers

226cf27

fix: handle multi-frontier shallow snapshot checkout

9b45cd3

fix: reject malformed imported text diffs

ca54408

fix: reject empty text marks in JSON import

a38f5d5

fix: reject unpaired text marks in JSON import

37078ae

fix: canonicalize frontiers constructors

b0257ee

fix: preserve canonical state-only snapshot frontiers

3622d49

fix: ignore cyclic tree moves in one-doc fuzz

d3d84bb

fix: preserve commit options after failed change travel

466c97e

Merge branch 'main' into feat/scale-text-checkout-perf

1bf267d

fix: tighten import rollback followups

4a32d2e

Merge branch 'feat/scale-text-checkout-perf' of https://github.com/loro-dev/loro into feat/scale-text-checkout-perf

2b9a599

refactor: centralize import rollback container check

d67d534

docs: plan fast diff calc span routing

507aff6

bench: add many text checkout scenario

c350b0e

refactor: route richtext checkout through spans

5c3cd62

perf: filter richtext checkout spans by coverage

91e5ceb

docs: record fast diff calc benchmark results

b8ee18f

bench: report checkout span averages

57bfd67

test: compare filtered richtext diff

f9fb539

docs: update fast diff calc commit list

434b35c

fix: keep list diff calculator small

f8d5752

perf: reuse coverage-local richtext tracker versions

82dd1dc

fix: guard richtext tracker reuse

091e3c9

test: skip shallow peers in gc fuzzer sync

3c2b53e

lodyai[bot] wants to merge 50 commits into main from feat/scale-text-checkout-perf