perf: improve text checkout scalability #956
Benchmark / validation update after the TMP_PLAN cleanup. Fixed-size checkout bench command shape:
LORO_TEXT_CHECKOUT_PROFILE=1 LORO_TEXT_CHECKOUT_PEERS=1000 LORO_TEXT_CHECKOUT_BASE_LEN=1024 LORO_TEXT_CHECKOUT_CHANGES=1000 cargo bench -p loro-internal --features test_utils --bench text_checkout -- <case> --warm-up-time 0.05 --measurement-time 0.1 --sample-size 10
Results:
Phase 5 cost note: after the same-parent fast path, future sibling scanning is no longer the dominant cost in the measured 1000-peer cases. It is still visible in the worst same-position case (~0.57 ms, about 3-4% of total), but the remaining slow path is mostly replay/diff_calc. For subscribed rich overlap marks, future_scan is negligible (~31 us), while state apply and diff calc dominate.
Additional validation now passed:
cargo fuzz run text-update -- -max_total_time=60
cargo fuzz run all -- -max_total_time=60
Both fuzz targets completed their 60s runs without crashes. The fuzz lockfile refresh was discarded because it was generated by the local run and is unrelated to this PR.
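As a back-of-envelope reading of the phase 5 figures, the ~0.57 ms future-scan cost at a 3-4% share implies a total in the mid-teens of milliseconds for the worst same-position case. The sketch below only rearranges the two numbers quoted above. The helper name `implied_total_ms` is made up for illustration and the result is an inference, not a measured value.

```rust
/// Total time implied by one phase that costs `phase_ms`
/// and accounts for `share` (a fraction, e.g. 0.04) of the total.
/// Hypothetical helper, not part of the bench harness.
fn implied_total_ms(phase_ms: f64, share: f64) -> f64 {
    phase_ms / share
}

fn main() {
    // Figures quoted in the cost note: ~0.57 ms at about 3-4% of total.
    let future_scan_ms = 0.57;
    let low = implied_total_ms(future_scan_ms, 0.04);
    let high = implied_total_ms(future_scan_ms, 0.03);
    println!("implied total: {low:.2} ms to {high:.2} ms");
}
```

If that range is roughly right, removing future scanning entirely would save well under a millisecond in that case, which is consistent with replay/diff_calc being the remaining target.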
|
Follow-up phase 4 update after continuing the 1-4 plan. Added commit
Validation passed after this commit:
cargo check -p loro-internal --features test_utils
cargo test -p loro-internal richtext --features test_utils
cargo test -p loro-internal checkout --features test_utils
cargo test -p loro-internal import --features test_utils
cargo test -p fuzz random_fuzz_1s -- --nocapture
git diff --check
Affected bench rerun:
LORO_TEXT_CHECKOUT_PROFILE=1 LORO_TEXT_CHECKOUT_PEERS=1000 LORO_TEXT_CHECKOUT_BASE_LEN=1024 LORO_TEXT_CHECKOUT_CHANGES=1000 cargo bench -p loro-internal --features test_utils --bench text_checkout -- rich/overlap-mark-peer-checkout/subscribed --warm-up-time 0.05 --measurement-time 0.1 --sample-size 10
Result: Criterion range
|
Fuzz validation for the text checkout performance PR:
No crashes, panics, sanitizer failures, or reproducer artifacts were reported. Workspace was clean after restoring the fuzz-generated
…out-perf # Conflicts: # crates/loro-internal/src/loro.rs # crates/loro-internal/src/oplog.rs
WASM Size Report
|
…ro-dev/loro into feat/scale-text-checkout-perf
Summary
Benchmarks
Text checkout improvements versus main:
Full bench rerun notes:
encode_with_sync/update: regressed to about 88ms, now about 46.0ms; base was about 43-45ms.
refactored direct_apply/B4DirectSync: regressed to about 6.7s, now about 3.69s; base was about 3.6s.
refactored-sync/B4Parallel: regressed to about 89.7ms, now about 54.0ms; base was about 51.6ms.
Validation
cargo check -p loro-internal --features test_utils,jsonpath
cargo test -p loro-internal --features test_utils,jsonpath import --lib
cargo test -p loro --test panic_test import_json_updates_with_text_insert_out_of_bounds_should_error_without_mutating_doc
pnpm check
pnpm test
CARGO_TARGET_DIR=/tmp/loro-final-bench cargo bench -p loro-internal --features test_utils,jsonpath --bench encode encode_with_sync/update -- --save-baseline final-encode
CARGO_TARGET_DIR=/tmp/loro-final-bench cargo bench -p loro-internal --features test_utils,jsonpath --bench text_r B4DirectSync -- --save-baseline final-direct
CARGO_TARGET_DIR=/tmp/loro-final-bench cargo bench -p loro-internal --features test_utils,jsonpath --bench text_r B4Parallel -- --save-baseline final-parallel
Notes
B4Parallel filtering also matches DecodeUpdates B4Parallel; after the target bench reported 54.0ms, the remaining matched bench was terminated.
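To illustrate why the short filter picked up a second bench, here is a minimal sketch that models name filtering as a plain substring match. This is a simplification and an assumption: the real harness may treat the filter as a regular expression, but a bare `B4Parallel` would match both ids either way. The two ids are taken as written in the notes above, and `matching` is a hypothetical helper, not a Criterion API.

```rust
/// Simplified model of a bench-name filter: keep every id
/// that contains the filter text. Hypothetical helper.
fn matching<'a>(ids: &[&'a str], filter: &str) -> Vec<&'a str> {
    ids.iter().copied().filter(|id| id.contains(filter)).collect()
}

fn main() {
    // Bench ids as they appear in the notes above.
    let ids = ["refactored-sync/B4Parallel", "DecodeUpdates B4Parallel"];

    // The short filter selects both benches; the longer one selects only the target.
    for filter in ["B4Parallel", "refactored-sync/B4Parallel"] {
        println!("{filter:?} -> {:?}", matching(&ids, filter));
    }
}
```

Under this model, passing the fuller id `refactored-sync/B4Parallel` as the filter would select only the target bench, so the extra matched run would not need to be terminated by hand.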