Prefetch morsels across files in FileStream (bounded at 20) #21682
Dandandan wants to merge 4 commits into apache:main
Co-authored-by: Oleks V <comphead@users.noreply.github.com>
Adds a `FileStreamState::Prefetch` variant that drives multiple planner I/O operations concurrently, so I/O for upcoming files overlaps with CPU decoding of the current file. In-flight morsel-producing work (pending I/O + ready planners + ready morsels + active reader) is capped at 20 to bound buffering. Enabled by default via `FileStreamBuilder` (legacy single-I/O `ScanState` remains available with `with_prefetch(false)`). Stacks on top of apache#21351. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
run benchmarks

🤖 Benchmark running (GKE): comparing prefetch-morsels-on-21351 (2165d79) to 961c5fc (merge-base) using clickbench_partitioned, tpch, and tpcds. The request was issued twice.

Two benchmark requests failed.

🤖 Benchmark completed (GKE) for clickbench_partitioned and tpcds, each reporting resource usage for base (merge-base) and branch.
Which issue does this PR close?
Stacks on top of #21351.
Rationale for this change
PR #21351 enables dynamic work scheduling in FileStream but keeps the same single-outstanding-I/O-per-partition property as main. This PR implements the follow-on item @alamb listed:
It lets each partition prefetch upcoming files while the active reader decodes the current file, so planner I/O is no longer serialized within a partition.
What changes are included in this PR?
- `FileStreamState::Prefetch` variant and `PrefetchState` that drives multiple `PendingMorselPlanner` I/Os concurrently and issues planner I/O for upcoming files while the active reader is blocked.
- `MAX_PREFETCH_MORSELS = 20` in-flight morsel-producing work items (pending I/O + ready planners + ready morsels + active reader) to cap buffering.
- Enabled by default via `FileStreamBuilder`; the legacy single-I/O `ScanState` path is preserved and can be selected via `FileStreamBuilder::with_prefetch(false)`.
- `morsel_prefetch_overlaps_io_across_files`: verifies file2's planner I/O is issued while file1's I/O is still pending.
- `morsel_no_prefetch_keeps_files_sequential`: verifies `with_prefetch(false)` preserves the legacy single-I/O behavior.

The reader takes priority over prefetching (step order: poll pending I/O → poll reader → plan → promote morsel → morselize next file), so user-visible latency is not delayed by opening new files, and all existing snapshot tests pass unchanged.
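For reviewers who want to reason about the bound, here is a minimal toy model of the step order and the in-flight accounting described above. It is not the code in this PR: `ToyPrefetch`, `step`, and `run` are hypothetical names, files are plain integers, and it assumes that every I/O and every decode completes in exactly one step and that the final stage opens as many files as the cap allows.

```rust
use std::collections::VecDeque;

/// Cap on in-flight morsel-producing work, mirroring the constant named in this PR.
const MAX_PREFETCH_MORSELS: usize = 20;

/// Toy stand-in for the prefetch state (hypothetical; not the DataFusion types).
#[derive(Default)]
struct ToyPrefetch {
    files: VecDeque<u32>,          // files not yet opened
    pending_io: VecDeque<u32>,     // planner I/O issued, not yet complete
    ready_planners: VecDeque<u32>, // I/O complete, waiting to be planned
    ready_morsels: VecDeque<u32>,  // planned morsels waiting for the reader
    active_reader: Option<u32>,    // morsel currently being decoded
    emitted: Vec<u32>,             // files whose morsel was fully decoded
}

impl ToyPrefetch {
    fn new(num_files: u32) -> Self {
        Self {
            files: (0..num_files).collect(),
            ..Default::default()
        }
    }

    /// Pending I/O + ready planners + ready morsels + active reader.
    fn in_flight(&self) -> usize {
        self.pending_io.len()
            + self.ready_planners.len()
            + self.ready_morsels.len()
            + usize::from(self.active_reader.is_some())
    }

    /// One scheduling step in the order given in the PR description.
    fn step(&mut self) {
        // 1. Poll pending I/O (assumption: the oldest I/O completes each step).
        if let Some(f) = self.pending_io.pop_front() {
            self.ready_planners.push_back(f);
        }
        // 2. Poll the reader (assumption: it finishes its morsel in one step).
        if let Some(f) = self.active_reader.take() {
            self.emitted.push(f);
        }
        // 3. Plan: turn one ready planner into a ready morsel.
        if let Some(f) = self.ready_planners.pop_front() {
            self.ready_morsels.push_back(f);
        }
        // 4. Promote a ready morsel if the reader is idle.
        if self.active_reader.is_none() {
            self.active_reader = self.ready_morsels.pop_front();
        }
        // 5. Morselize upcoming files, but only while under the cap.
        while self.in_flight() < MAX_PREFETCH_MORSELS {
            match self.files.pop_front() {
                Some(f) => self.pending_io.push_back(f),
                None => break,
            }
        }
    }

    fn is_done(&self) -> bool {
        self.files.is_empty() && self.in_flight() == 0
    }
}

/// Drives the toy model to completion; returns the decode order and the
/// largest in-flight count observed after any step.
fn run(num_files: u32) -> (Vec<u32>, usize) {
    let mut state = ToyPrefetch::new(num_files);
    let mut max_in_flight = 0;
    while !state.is_done() {
        state.step();
        max_in_flight = max_in_flight.max(state.in_flight());
    }
    (state.emitted, max_in_flight)
}
```

Because the reader is polled before any new file is opened, a finished morsel frees a slot in the same step that refills it, so the in-flight count never exceeds the cap in this model.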
Are these changes tested?
Yes. 27 `file_stream` tests pass, including the two new prefetch-specific tests. Full `datafusion-datasource` and `datafusion` crate test suites pass locally. Clippy is clean on the affected crates.
Are there any user-facing changes?
Yes. Prefetching is on by default, so multi-file scans may now have multiple planner I/Os in flight per partition. Users can opt out via `FileStreamBuilder::with_prefetch(false)`.
🤖 Generated with Claude Code
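The opt-out can be pictured with a small stand-in builder. This is only a sketch of the default-on toggle: `ToyFileStreamBuilder`, `ToyScanMode`, `new`, and `build` are hypothetical, and the real `FileStreamBuilder` constructor and build signature are not shown in this description. Only the `with_prefetch(false)` call is taken from the PR text.

```rust
/// Which scan path a stream ends up on (toy stand-in for the state variants).
#[derive(Debug, PartialEq)]
enum ToyScanMode {
    Prefetch, // multiple planner I/Os in flight per partition
    SingleIo, // legacy path: one outstanding I/O per partition
}

/// Hypothetical builder illustrating a default-on prefetch flag.
struct ToyFileStreamBuilder {
    prefetch: bool,
}

impl ToyFileStreamBuilder {
    fn new() -> Self {
        // Prefetching is on unless the caller opts out.
        Self { prefetch: true }
    }

    /// Mirrors the name of the opt-out described in the PR.
    fn with_prefetch(mut self, enabled: bool) -> Self {
        self.prefetch = enabled;
        self
    }

    fn build(self) -> ToyScanMode {
        if self.prefetch {
            ToyScanMode::Prefetch
        } else {
            ToyScanMode::SingleIo
        }
    }
}
```

In this sketch, callers that never touch the flag get the prefetch path, and `with_prefetch(false)` selects the single-I/O path.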